HPC system administration

The workshop covers practical aspects of administering High Performance Computing (HPC) systems.

Two main topics are currently proposed:

  • parallel file systems: administration, uncommon configurations, troubleshooting.
  • InfiniBand: setup, monitoring, troubleshooting.

To discuss more real-life problems of HPC system maintenance, we plan to organize a panel discussion. The main purpose of the panel is to bring in more specialists to help find solutions to problems arising in Russian supercomputer centers. Propose talk topics that you consider important, suggest possible lecturers, and send topics to discuss at the panel.

Send your thoughts to cstef@parallel.ru; we will discuss them and try to find a solution.