Use of Slurm on the i-Trop cluster
| Description | Knowing how to use Slurm |
|---|---|
| Author | Ndomassi TANDO (ndomassi.tando@ird.fr) |
| Creation date | 08/11/2019 |
| Modification date | 08/03/2021 |
Summary
Objectives
Knowing how to launch different types of jobs with Slurm.
Knowing how to monitor jobs.
Launching jobs with Slurm:
Launching commands from the master
The following command allocates computing resources (nodes, memory, cores) and immediately launches the command on each allocated resource:
$ srun <command>
Example:
$ srun hostname
This returns the name of the compute node used.
Connect to a node in interactive mode and launch commands:
To connect to a node in interactive mode for X minutes, use the following command:
$ srun -p short --time=X:00 --pty bash -i
Then you can launch commands on this node without the srun prefix.
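For example, for a 30-minute interactive session (prompt and node name are illustrative):
$ srun -p short --time=30:00 --pty bash -i
login@node1:~$ hostname
node1
login@node1:~$ exit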
Connect to a node in interactive mode with x11 support:
The x11 support allows you to launch graphical software within a node.
You first have to connect to the bioinfo-master.ird.fr with the -X option:
$ ssh -X login@bioinfo-master.ird.fr
Then you can launch this command with the --x11 option:
$ srun -p short --x11 --pty bash -i
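Once connected, graphical programs started on the node are displayed on your local machine, for example (assuming the tool is installed on the node):
login@node1:~$ xclock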
Partitions available :
Depending on the type of job you want to launch, you can choose between several partitions.
Partitions can be considered as job queues, each of which has constraints such as a job size limit, a job time limit, the users permitted to use it, etc.
Priority-ordered jobs are allocated nodes within a partition until the resources (nodes, processors, memory, etc.) within that partition are exhausted.
| partition | role | nodes list | number of cores | RAM |
|---|---|---|---|---|
| short | jobs < 1 day (high priority, interactive jobs) | node0,node1,node2,node13,node14 | 12 cores | 48 to 64 GB |
| normal | jobs of 7 days max | node0,node1,node2,node13,node14,node15,node16,node17,node18,node19,node20,node22,node23,node24 | 12 to 24 cores | 64 to 96 GB |
| long | jobs between 7 and 45 days | node3,node8,node9,node10,node11,node12 | 12 to 24 cores | 48 GB |
| highmem | jobs with high memory needs | node4,node7,node17,node21 | 12 to 72 cores | 144 or 256 GB |
| highmemplus | jobs with high memory needs | node5 | 88 cores | 512 GB |
| supermem | jobs with very high memory needs | node25 | 40 cores | 1 TB |
| gpu | analyses on GPU cores | node26 | 24 CPUs and 8 GPUs | 192 GB |
Access to the gpu partition is restricted; a request can be made here: request access to gpu
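Once access is granted, you can request the partition like any other. A hedged example of an interactive session on it (the --gres=gpu:1 syntax assumes GPUs are declared as generic resources, which may be configured differently on this cluster):
$ srun -p gpu --gres=gpu:1 --pty bash -i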
Main options in Slurm:
srun or sbatch can be used with the following options:
| actions | Slurm options | SGE options |
|---|---|---|
| Choose a partition | -p [queue] | -q [queue] |
| Number of nodes to use | -N [min[-max]] | N/A |
| Number of tasks to launch | -n [count] | -pe [PE] [count] |
| Time limit | -t [min] or -t [days-hh:mm:ss] | -l h_rt=[seconds] |
| Specify an output file | -o [file_name] | -o [file_name] |
| Specify an error file | -e [file_name] | -e [file_name] |
| Combine STDOUT and STDERR files | use -o without -e | -j yes |
| Copy the environment | --export=[ALL,NONE,variables] | -V |
| Send an email | --mail-user=[address] | -M [address] |
| Notifications to send | --mail-type=[events] | -m [events] |
| Job name | --job-name=[name] | -N [name] |
| Relaunch the job | --requeue | -r [yes,no] |
| Specify the working directory | --workdir=[dir_name] | -wd [directory] |
| Set the memory size to reserve | --mem=[mem][M,G,T] or --mem-per-cpu=[mem][M,G,T] | -l mem_free=[memory][K,M,G] |
| Launch with a particular account | --account=[account] | -A [account] |
| Number of tasks per node | --ntasks-per-node=[count] | (fixed allocation_rule in PE) |
| Number of CPUs per task | --cpus-per-task=[count] | N/A |
| Dependency on another job | --dependency=[state:job_id] | -hold_jid [job_id,job_name] |
| Choose a node | --nodelist=[nodes] AND/OR --exclude=[nodes] | -q [queue]@[node] |
| Job arrays | --array=[array_spec] | -t [array_spec] |
| Launching date | --begin=YYYY-MM-DD[THH:MM[:SS]] | -a [YYMMDDhhmm] |
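As an illustration, several of these options can be combined in a single submission (file names and address are hypothetical):
$ sbatch -p normal -n 4 -t 2-00:00:00 --mem=16G -o myjob.out --job-name=myjob --mail-user=login@ird.fr --mail-type=END script.sh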
Launching jobs via a script
Batch mode launches an analysis by following the steps described in a script.
Slurm accepts different types of scripts, such as bash, perl or python.
Slurm allocates the desired computing resources and launches the analyses on these resources in the background.
To be interpreted by Slurm, the script should contain a specific header whose lines start with the #SBATCH keyword to specify the Slurm options.
Slurm script example:
#!/bin/bash
## Define a jobname
#SBATCH --job-name=test
## Define an output file
#SBATCH --output=res.txt
## Define the number of tasks
#SBATCH --ntasks=1
## Define the timelimit
#SBATCH --time=10:00
## Define 100 MB of memory per CPU
#SBATCH --mem-per-cpu=100
sleep 180 #launch a 180s sleep
To launch an analysis use the following command:
$ sbatch script.sh
with script.sh the name of the script to use.
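sbatch returns immediately with the ID assigned to the job (ID illustrative):
$ sbatch script.sh
Submitted batch job 20303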
Submit an array job
#!/bin/bash
#SBATCH --partition=short ### Partition
#SBATCH --job-name=ArrayJob ### job name
#SBATCH --time=00:10:00 ### timelimit
#SBATCH --nodes=1 ### Number of nodes
#SBATCH --ntasks=1 ### Number of tasks per job array
#SBATCH --array=0-19%4 ### Array indexes from 0 to 19, with at most 4 jobs running at a time
echo "I am Slurm job ${SLURM_JOB_ID}, array job ${SLURM_ARRAY_JOB_ID}, and array task ${SLURM_ARRAY_TASK_ID}."
You have to use the #SBATCH --array option to define the index range.
The ${SLURM_JOB_ID} variable gives the job ID.
${SLURM_ARRAY_JOB_ID} gives the ID of the job array.
${SLURM_ARRAY_TASK_ID} gives the index of the current task within the job array.
The script should give an output like:
$ sbatch array.srun
Submitted batch job 20303
$ cat slurm-20303_1.out
I am Slurm job 20305, array job 20303, and array task 1.
$ cat slurm-20303_19.out
I am Slurm job 20323, array job 20303, and array task 19.
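A common use of job arrays is to process one input file per task. A minimal sketch, assuming hypothetical input files named sample_0.txt to sample_19.txt:
#!/bin/bash
#SBATCH --partition=short ### Partition
#SBATCH --job-name=ArrayFiles ### job name
#SBATCH --ntasks=1 ### Number of tasks per array task
#SBATCH --array=0-19%4 ### Array indexes from 0 to 19, 4 jobs at a time
## pick the input file matching this task's index (file names are hypothetical)
INPUT="sample_${SLURM_ARRAY_TASK_ID}.txt"
echo "Processing ${INPUT}"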
Submit a R job
You can use the same syntax as before for Slurm; you just have to launch your R script with the Rscript command:
Rscript script.R
#!/bin/bash
## Define the job name
#SBATCH --job-name=test
## Define the output file
#SBATCH --output=res.txt
## Define the number of tasks
#SBATCH --ntasks=1
## Define the execution time limit
#SBATCH --time=10:00
## Define 100 MB of memory per CPU
#SBATCH --mem-per-cpu=100
Rscript script.R #launch the R script script.R
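If your R script takes arguments, pass them after the script name; input.csv and the value 42 are hypothetical, and would be read inside R with commandArgs(trailingOnly=TRUE):
Rscript script.R input.csv 42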
Submit a job with several commands running in parallel at the same time
Use the --ntasks and --cpus-per-task options.
Example:
#!/bin/bash
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
srun --ntasks=1 sleep 10 &
srun --ntasks=1 sleep 12 &
wait
In this example, we use 2 tasks with 2 CPUs allocated per task, that is to say 4 CPUs allocated for this job.
The two sleep commands are launched at the same time, one per task.
Notice the use of srun to launch a parallelised command and of & to launch it in the background.
wait is needed here to make the job wait for the end of each command before finishing.
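After the job has finished, you can verify that the two commands ran as separate job steps with sacct (see the monitoring section below):
$ sacct -j <job_id> --format=JobID,JobName,Elapsed,State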
Submit an OpenMP job:
An OpenMP job is a job using several CPUs on one single node; therefore the number of nodes will always be one.
This will work with a program compiled with OpenMP.
#!/bin/bash
#SBATCH --partition=short ### Partition
#SBATCH --job-name=HelloOMP ### Job Name
#SBATCH --time=00:10:00 ### WallTime
#SBATCH --nodes=1 ### Number of Nodes
#SBATCH --ntasks-per-node=1 ### Number of tasks (MPI processes)
#SBATCH --cpus-per-task=28 ### Number of threads per task (OMP threads)
#SBATCH --account=hpcrcf ### Account used for job submission
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./hello_omp
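hello_omp is assumed here to be a program already compiled with OpenMP support, for example:
$ gcc -fopenmp hello_omp.c -o hello_omp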
Environment variables:
SLURM_JOB_ID: ID of the allocated job.
SLURM_JOB_NAME: name of the job.
SLURM_JOB_NODELIST: list of the nodes allocated to the job.
SLURM_JOB_NUM_NODES: number of nodes allocated to the job.
SLURM_NTASKS: number of tasks in the job.
SLURM_SUBMIT_DIR: directory from which sbatch was invoked.
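A minimal sketch of a batch script printing these variables:
#!/bin/bash
#SBATCH --job-name=envDemo
#SBATCH --ntasks=1
echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) runs on ${SLURM_JOB_NODELIST}"
echo "${SLURM_NTASKS} task(s) on ${SLURM_JOB_NUM_NODES} node(s), submitted from ${SLURM_SUBMIT_DIR}"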
Cancelling a job
$ scancel <job_id>
with <job_id>: the ID of the job to cancel
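scancel also accepts filters, for example:
$ scancel -u <user>
to cancel all the jobs of a user, or
$ scancel --name=<job_name>
to cancel jobs by name.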
Monitoring resources:
Get info on jobs:
$ squeue
To refresh the info every 5 seconds:
$ squeue -i 5
Info on a particular job:
$ scontrol show job <job_id>
with <job_id>: the ID of the job
Info on the jobs of a particular user:
$ squeue -u <user>
with <user>: the user's login
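Typical squeue output (values illustrative):
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
20303     short     test    login  R       5:02      1 node1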
More info on jobs:
$ sacct --format=JobID,elapsed,ncpus,ntasks,state,nodelist
Info on resources used by a finished job:
$ seff <job_id>
with <job_id>: the job ID
You can add the following command at the end of your script to get the job's resource usage in your output file:
seff $SLURM_JOB_ID
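For a finished job, the output of seff looks like this (values illustrative):
$ seff 20303
Job ID: 20303
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 00:02:58
CPU Efficiency: 98.89% of 00:03:00 core-walltime
Memory Utilized: 1.20 MB
Memory Efficiency: 1.20% of 100.00 MB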
Get info on partitions
$ sinfo
It gives info on partitions and nodes.
More information:
$ scontrol show partitions
scontrol show can be used with nodes, user, account, etc.
Knowing the time limit for each partition:
$ sinfo -o "%10P %.11L %.11l"
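The three columns are the partition, its default time and its time limit; output looks like (values illustrative):
PARTITION  DEFAULTTIME   TIMELIMIT
short              n/a  1-00:00:00
normal             n/a  7-00:00:00
long               n/a 45-00:00:00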
Get info on nodes
$ sinfo -N -l
Several states are possible:
- alloc: the node is fully used
- mix: the node is partially used
- idle: no job is running on the node
- drain: the node is finishing its running jobs but does not accept new ones (e.g. when the node is about to be stopped for maintenance)
More information:
$ scontrol show nodes
Links
- Related courses: HPC Trainings