Running jobs on clusters

Time: 13:00 - 16:00

This module explains the key features of a contemporary HPC cluster, such as those deployed within NAISS and at a number of Swedish universities. It covers the principles behind the job scheduler and how to use the scheduler to carry out your computational work efficiently. The examples use the SLURM scheduler, which is deployed on the NAISS resources and on many university resources.

The module is organised as an online event. It is aimed at users who have recently started using HPC systems and at prospective users who are considering using one in the near future.

Note

Click Home (at the top) to see the date.

Prerequisites

  • An account at an HPC centre
    • If you do not have an account, you can still listen to the presentations.
  • Knowledge of how to log in to your chosen HPC centre (as covered in the “Connecting and File transfers” module, for instance)
    • We have listed login info for several Swedish HPC centres
  • Very basic Linux knowledge
    • You can find the material and recordings from the most recent NAISS Linux introduction course here: Linux intro material and Recordings from the Linux intro course
  • Basic knowledge of the software modules system used at the NAISS centres
    • You can find the material and recordings from the most recent NAISS selecting software modules course here: Selecting software modules and Recordings from the Selecting software modules course

Topics

  • Cluster architecture
  • sbatch options for CPU job scripts
  • Writing submission scripts for:
    • I/O intensive jobs
    • OpenMP and MPI jobs
    • Job arrays and simple task farms
    • Jobs with more memory per task
  • Running jobs on GPUs
  • Monitoring jobs and their efficiency

Note

Link to the course material: Running jobs on clusters