HowTos for the i-Trop cluster
Description | HowTos for i-Trop Cluster |
---|---|
Author | Ndomassi TANDO (ndomassi.tando@ird.fr) |
Creation date | 08/11/19 |
Modification date | 04/03/21 |
Summary
- Preamble: Architecture of the i-Trop cluster and software to install before connecting to the cluster
- How to: Transfer files with FileZilla (sftp) on the i-Trop cluster
- How to: Connect to the i-Trop cluster via ssh
- How to: Reserve one or several cores of a node
- How to: Transfer my data from the nas servers to the node
- How to: Use the Module Environment
- How to: Launch a job with Slurm
- How to: Choose a particular partition
- How to: See or delete your data on the /scratch partition of the nodes
- How to: Use a singularity container
- How to: Cite the i-Trop platform in your publications
- Links
- License
Preamble
Architecture of the i-Trop cluster:
The i-Trop computing cluster is made up of a set of computing servers accessible via a front-end machine. Connections to these compute servers are made via this master machine, which ensures the distribution of the different analyses between the machines available at any given moment.
The computing cluster is composed of:
- 1 master machine
- 3 NAS servers for temporary storage of project data, up to 150 TB
- 26 CPU computing nodes with a total capacity of 508 cores and 2744 GB of RAM, plus a GPU server with 8 RTX 2080 graphics cards.
Here is the architecture:
Connecting to a server via SSH from a Windows machine
System | Software | Description | URL |
---|---|---|---|
Windows | MobaXterm | An advanced terminal for Windows with an X11 server and an SSH client | Download |
Windows | PuTTY | Allows you to connect to a Linux server from a Windows machine | Download |
Transfer files from your computer to Linux servers with SFTP
Systems | Software | Description | URL |
---|---|---|---|
Windows, macOS, Linux | FileZilla | FTP and SFTP client | Download |
View and edit files locally or on a remote server
Type | Software | URL |
---|---|---|
Remote, console mode | nano | Tutorial |
Remote, console mode | vi | Tutorial |
Remote, graphic mode | Komodo Edit | Download |
Linux & Windows editor | Notepad++ | Download |
How to : Transfer files with FileZilla (sftp)
Download and install FileZilla
Open FileZilla and save the i-Trop cluster into the site manager
In the FileZilla menu, go to File > Site Manager. Then go through these 5 steps:
- Click on New Site.
- Add an explicit name.
- In the Host field, 3 choices are possible:
  - bioinfo-nas2.ird.fr (nas2) to transfer to /data/project
  - bioinfo-nas.ird.fr (nas) to transfer to /home/user, /data2/projects or /teams
  - bioinfo-nas3.ird.fr (nas3) to transfer to /data3/project
- Set the Logon Type to "Normal" and type your cluster credentials.
- Choose port 22 and press the "Connect" button.
Transferring files
- From your computer to the cluster: click and drag a file from the left (local) column to the right (remote) column.
- From the cluster to your computer: click and drag a file from the right (remote) column to the left (local) column.
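If you prefer the command line, the same transfers can be done with the sftp client. A minimal sketch, assuming the nas host for /home (use bioinfo-nas2.ird.fr or bioinfo-nas3.ird.fr for /data or /data3); login, myfile.txt and results.tar.gz are placeholders:
sftp login@bioinfo-nas.ird.fr
# inside the sftp session:
put myfile.txt /home/login/        # upload a local file to your home directory
get /home/login/results.tar.gz .   # download a remote file to the current local directory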
How to : Connect to the i-Trop cluster via ssh
From a Windows computer:
With MobaXterm:
- Click the session button and choose SSH.
- In the remote host box, type: bioinfo-master.ird.fr
- Check the specify username box and enter your login.
- In the console, enter your password when asked.
From a Mac or Linux computer:
Open the terminal application and type the following command:
ssh login@bioinfo-master.ird.fr
with login: your cluster account
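To avoid typing the full host name each time, you can optionally add an entry to your ~/.ssh/config file. A minimal sketch; the alias itrop is arbitrary and login is your cluster account:
# ~/.ssh/config
Host itrop
    HostName bioinfo-master.ird.fr
    User login
You can then simply connect with:
ssh itrop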
1st connection:
Your password has to be changed at the first connection.
At the "Mot de passe UNIX (actuel)" (current UNIX password) prompt, type the password provided in the account creation email.
Then type your new password twice.
The session will be automatically closed.
You will need to open a new session with your new password.
How to : Reserve one or several cores of a node
The cluster uses Slurm (https://slurm.schedmd.com/documentation.html) to manage users' analyses.
It monitors the available resources (CPU and RAM) and allocates them to users to launch their jobs.
When you are connected to bioinfo-master.ird.fr, you can reserve one or several cores on the available nodes.
Reserving one core
Type the following command:
srun -p short --pty bash -i
You will be randomly connected to one of the nodes of the short partition with one core reserved.
Reserving several cores at the same time
Type the following command:
srun -p short -c X --pty bash -i
With X the number of cores, between 2 and 12.
You will be randomly connected to one of the nodes of the short partition with X reserved cores.
Reserving one core of a specific node:
Type the following command:
srun -p short --nodelist=nodeX --pty bash -i
With nodeX belonging to the short partition.
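The options can be combined. For example, the following reserves 4 cores on node1, one of the nodes of the short partition:
srun -p short -c 4 --nodelist=node1 --pty bash -i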
How to : Transfer my data from the nas server to nodes
On the cluster, every node has its own local partition called /scratch.
/scratch is used to receive the data to analyse, perform the analyses and store the results temporarily.
Data on /scratch is kept for 30 days maximum, except on the nodes of the long partition where it is kept for up to 45 days.
It is mandatory to transfer your data to the /scratch of the reserved node before launching your analyses.
The /scratch volumes range from 1TB to 14TB depending on the chosen node.
When the analyses are finished, remember to retrieve your data.
The following section tells you how to choose which nas server to transfer data to.
scp command:
To transfer data between 2 remote servers, we use the command scp
scp -r source destination
There are 2 possible syntaxes:
Retrieve data from a remote server:
scp -r remote_server_name:path_to_files/file local_destination
Transfer data to a remote server:
scp -r /local_path_to_files/file remote_server_name:remote_destination
Transfer from or to /home, /data2 or /teams:
The /home, /data2 and /teams partitions are located on bioinfo-nas.ird.fr (nas).
Retrieving files from nas:
Syntaxes to use:
scp -r nas:/home/login/file local_destination
scp -r nas:/data2/project/project_name/file local_destination
scp -r nas:/teams/team_name/file local_destination
Copy files to nas:
Syntax to use:
scp -r /local_path_to_files/file nas:/home/login
scp -r /local_path_to_files/file nas:/data2/project/project_name
scp -r /local_path_to_files/file nas:/teams/team_name
Transfer to or from /data:
The /data partition is located on bioinfo-nas2.ird.fr (nas2).
Retrieve files from nas2:
Syntax to use:
scp -r nas2:/data/project/project_name/file local_destination
Copying files to nas2:
Syntax to use:
scp -r /local_path_to_files/file nas2:/data/project/project_name
Transfer from or to /data3:
The /data3 partition is located on bioinfo-nas3.ird.fr (nas3).
Retrieve files from nas3:
Syntax to use:
scp -r nas3:/data3/project/project_name/file local_destination
Copying files to nas3:
Syntax to use:
scp -r /local_path_to_files/file nas3:/data3/project/project_name
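Putting it all together, a typical session might look like this. A sketch only; login, project_name and the file names are placeholders to adapt:
# reserve one core on the short partition
srun -p short --pty bash -i
# on the node, create a working directory on the local /scratch
mkdir -p /scratch/login
# copy the input data from nas2 to the node
scp -r nas2:/data/project/project_name/inputs /scratch/login/
# ... run the analyses in /scratch/login ...
# copy the results back to nas2, then clean up
scp -r /scratch/login/results nas2:/data/project/project_name/
rm -rf /scratch/login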
How to : Use the Module Environment
The Module Environment allows you to dynamically change your environment variables (PATH, LD_LIBRARY_PATH) and thus choose your software version.
The nomenclature used for modules is package_name/package_version.
Software is divided into 2 groups:
- bioinfo: bioinformatics software
- system: system software
Displaying the available software
module avail
Displaying the description of a software
module whatis module_type/module_name/version
with module_type: bioinfo or system
with module_name: the name of the module.
For example : samtools version 1.7:
module whatis bioinfo/samtools/1.7
Load a software:
module load module_type/module_name/version
with module_type: bioinfo or system
with module_name: module name.
For example : samtools version 1.7:
module load bioinfo/samtools/1.7
Unload a software:
module unload module_type/module_name/version
with module_type: bioinfo or system
with module_name: module name.
For example : samtools version 1.7:
module unload bioinfo/samtools/1.7
Displaying the loaded modules
module list
Unloading all the modules
module purge
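As a quick example, the following session loads samtools 1.7, checks that it is available, then cleans the environment:
module avail                       # list the available modules
module load bioinfo/samtools/1.7   # add samtools 1.7 to your environment
samtools --version                 # the tool is now in your PATH
module list                        # show the loaded modules
module purge                       # unload all modules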
How to : Launch a job with Slurm
The cluster uses Slurm to manage and prioritize users' jobs.
It checks the available resources (CPU and RAM) and allocates them to users to perform their analyses.
Once connected to bioinfo-master.ird.fr, you can launch a command with srun or a script with sbatch.
Use srun with a command:
If you simply want to launch a command that will be executed on a node:
$ srun + command
Example:
$ srun hostname
will launch the hostname command on the node chosen by Slurm.
Use sbatch to launch a script:
The batch mode allows you to launch an analysis in several steps defined in a script.
Slurm allows you to use several scripting languages such as bash, Perl or Python.
Slurm allocates the desired resources and launches the analyses in the background.
To be interpreted by Slurm, a script must contain a header with the Slurm options, each line beginning with the keyword #SBATCH.
Slurm example script:
#!/bin/bash
## Define the job name
#SBATCH --job-name=test
## Define the output file
#SBATCH --output=res.txt
## Define the number of tasks
#SBATCH --ntasks=1
## Define the execution time limit
#SBATCH --time=10:00
## Define 100 MB of memory per CPU
#SBATCH --mem-per-cpu=100
sleep 180 # pause for 180 seconds
To launch an analysis via a script:
$ sbatch script.sh
With script.sh the script to use.
More Slurm options here: Slurm options
Examples of scripts:
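For instance, here is a minimal sketch of a script combining Slurm options with a module load; the samtools version comes from the Module Environment section, and input.bam is a hypothetical file name:
#!/bin/bash
## Define the job name
#SBATCH --job-name=samtools_stats
## Define the output and error files
#SBATCH --output=samtools_stats.%j.out
#SBATCH --error=samtools_stats.%j.err
## Define the number of tasks and the partition
#SBATCH --ntasks=1
#SBATCH --partition=normal
# load the required software through the Module Environment
module load bioinfo/samtools/1.7
# run the analysis (input.bam is a placeholder)
samtools stats input.bam > input_stats.txt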
How to : Choose a particular partition
Depending on the type of jobs (analyses) you want to run, you can choose between different partitions.
Partitions are analysis queues with specific priorities and constraints such as the size or time limit of a job, the users authorized to use it, etc...
Jobs are prioritized and processed using the resources (CPU and RAM) of the nodes making up these partitions.
partition | role | nodes list | Number of cores | RAM |
---|---|---|---|---|
short | short jobs < 1 day (high priority, interactive jobs) | node0,node1,node2,node13,node14 | 12 cores | 48 to 64 GB |
normal | jobs < 7 days | node0,node1,node2,node13,node14,node15,node16,node17,node18,node19,node20,node22,node23,node24 | 12 to 24 cores | 64 to 96 GB |
long | long jobs, 7 days to 45 days | node3,node8,node9,node10,node11,node12 | 12 to 24 cores | 48 GB |
highmem | jobs with memory needs | node4,node7,node17,node21 | 12 to 24 cores | 144 GB |
highmemplus | jobs with memory needs | node5 | 88 cores | 512 GB |
supermem | jobs with important memory needs | node25 | 40 cores | 1 TB |
gpu | analyses on GPU cores | node26 | 24 CPU cores and 8 GPUs | 192 GB |
Access to the gpu partition is restricted. A request can be made here: request access to gpu
The partition can be chosen following this scheme:
By default, the chosen partition is the normal partition.
Warning: highmem and highmemplus should only be used for jobs requiring at least 35-40 GB of memory.
The supermem partition should be used for large assemblies and jobs requiring more than 100 GB of memory.
You can use the htop command on a node to visualize the memory consumed by a process.
To choose a partition, use the -p option.
sbatch -p partition
srun -p partition
With partition the chosen partition.
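For example, to open an interactive session on the highmem partition, or to submit the script from the previous section to the long partition:
srun -p highmem --pty bash -i
sbatch -p long script.sh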
How to : View and delete your data contained in the /scratch partition of the nodes
The 2 scripts are located in /opt/scripts/scratch-scripts/
- To see your data contained in the /scratch of the nodes, launch the following command and follow the instructions:
sh /opt/scripts/scratch-scripts/scratch_use.sh
- To delete your data contained in the /scratch partition of the nodes, launch the following command and follow the instructions:
sh /opt/scripts/scratch-scripts/clean_scratch.sh
How to : Use a singularity container
Singularity is installed on the i-Trop cluster in 3 versions: 2.4, 3.3.0 and 3.6.0.
Containers are located in /data3/projects/containers
The 2.4 folder hosts the containers built with the 2.4 version of singularity
The 3.3.0 folder hosts the containers built with the 3.3.0 version of singularity
You first need to load the environment with the command:
module load system/singularity/2.4
or module load system/singularity/3.3.0
Get help:
Use the command:
singularity help /data3/projects/containers/singularity_version/container.simg
with container.simg the container name.
with singularity_version: 2.4 or 3.3.0
Shell connection to a container:
singularity shell /data3/projects/containers/singularity_version/container.simg
Launch a container with only one application:
singularity run /data3/projects/containers/singularity_version/container.simg + arguments
Launch a container with several applications:
singularity exec /data3/projects/containers/singularity_version/container.simg + tools + arguments
Bind a host folder to a singularity container.
Use the option --bind /host_partition:/container_partition
Example:
singularity exec --bind /toto2:/tmp /data3/projects/containers/singularity_version/container.simg + tools + arguments
The container will have access to the files of the host partition /toto2 in its /tmp partition.
By default, the partitions /home, /opt, /scratch, /data, /data2 and /data3 are already bound.
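As a complete example, running a tool from a container on data stored in /scratch could look like this; my_container.simg, my_tool and the /scratch/login paths are hypothetical placeholders:
# load the Singularity environment
module load system/singularity/3.3.0
# run a tool from a container located in the 3.3.0 folder
singularity exec /data3/projects/containers/3.3.0/my_container.simg my_tool --input /scratch/login/inputs --output /scratch/login/results
# /scratch is bound by default, so the container can read and write there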
How to : Cite the i-Trop platform in your publications
Please just copy the following sentence:
“The authors acknowledge the ISO 9001 certified IRD i-Trop HPC (member of the South Green Platform) at IRD montpellier for providing HPC resources that have contributed to the research results reported within this paper. URL: https://bioinfo.ird.fr/- http://www.southgreen.fr”
Links
- Related courses : Linux for Dummies
- Related courses : HPC
- Tutorials : Linux Command-Line Cheat Sheet
License
