bioinfo@ird.fr

Trainings 2019 – TOGGLe – Practice

TOGGLe Practice

Description: Hands-on lab exercises for TOGGLe
Related course materials: TOGGLe introduction
Authors: Sébastien RAVEL (sebastien.ravel@cirad.fr), Christine TRANCHANT (christine.tranchant@ird.fr)
Creation date: 15/03/2018
Last modified date: 16/04/2019

Creating your own workflow

Practice 1 consists of starting from one predefined configuration file and using it to build your own workflow for TOGGLe.

TOGGLe Manual Page

The SNPdiscoveryPaired.config.txt file is an example of how to customize your pipeline.

Providing an order

The order of the pipeline steps is given under the $order key. Based on the SNPdiscoveryPaired.config.txt file, build a new configuration file that runs only the steps from mapping to SNP calling.
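Before editing, it helps to look at the current $order block on its own. The helper below is a minimal sketch, not part of TOGGLe: the function name print_block is hypothetical, and it assumes that a block starts at a line equal to the key and ends at the next blank line or the next line starting with "$".

```shell
#!/bin/sh
# Hypothetical helper (not a TOGGLe command): print the body of one "$key"
# block from a TOGGLe-style configuration file.
# Assumption: a block starts at a line that is exactly the key (for example
# "$order") and ends at the next blank line or the next line starting with "$".
print_block() {
    key="$1"
    file="$2"
    awk -v key="$key" '
        $0 == key                          { inblock = 1; next }  # block header
        inblock && (/^\$/ || /^[ \t]*$/)   { inblock = 0 }        # block ends
        inblock                            { print }              # block body
    ' "$file"
}

# Usage, with the path from this exercise:
# print_block '$order' TPsnpSV/configFiles/SNPdiscoveryPaired.config.txt
```

The printed lines show which steps to keep or drop when restricting the pipeline to the mapping-to-SNP-calling range.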

Practical session on the IRD cluster

All input data:

  • Input data: /data2/formation/TPsnpSV/fastqDir/
  • Reference : /data2/formation/TPsnpSV/reference.fasta
  • Config file: /data2/formation/TPsnpSV/configFiles/SNPdiscoveryPaired.config.txt

To do:

  • Create a "formationX" directory in your account
  • Copy the reference and the input data into the "formationX" directory (scp).
  • Copy the configuration file used by TOGGLe and set the $sge key as shown below:
$sge
-q formation.q
-b Y
-cwd
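
The $sge key can be edited by hand, or rewritten with a small script. The sketch below is one possible way to do it, not a TOGGLe feature: set_sge_block is a hypothetical helper name, and it assumes the same block layout as above (a "$sge" line followed by option lines, ended by a blank line or the next "$" key).

```shell
#!/bin/sh
# Hypothetical helper (not a TOGGLe command): replace the "$sge" block of a
# configuration file with the values used in this training.
# Assumption: the block header is a line that is exactly "$sge", and the block
# ends at the next blank line or the next line starting with "$".
set_sge_block() {
    config="$1"
    tmp="$(mktemp)" || return 1
    # Copy every line except an existing $sge block.
    awk '
        $0 == "$sge"                     { skip = 1; next }
        skip && (/^\$/ || /^[ \t]*$/)    { skip = 0 }
        !skip                            { print }
    ' "$config" > "$tmp" || return 1
    # Append the block required for the formation.q queue.
    printf '\n$sge\n-q formation.q\n-b Y\n-cwd\n' >> "$tmp"
    mv "$tmp" "$config"
}

# Usage:
# set_sge_block ~/TPsnpSV/configFiles/SNPdiscoveryPaired.config.txt
```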
Connect to your account and prepare the data:
  • Connect to the cluster:
    ssh -Y formationX@bioinfo-master.ird.fr
  • Launch a QRSH command:
    qrsh -q formation.q
  • Transfer the data from nas using SCP:
    scp -r nas:/data2/formation/TPsnpSV .
  • Load TOGGLe tools:
    module load bioinfo/TOGGLE/0.3.6
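
Once the copy has finished, a quick check that the three inputs listed earlier are present can save a failed run. This is a minimal sketch: check_inputs is a hypothetical helper name, and the relative paths are taken from the input data list of this exercise.

```shell
#!/bin/sh
# Hypothetical helper: verify that the copied TPsnpSV directory contains the
# FASTQ folder, the reference and the configuration file used in this exercise.
# Returns 0 when everything is present, 1 otherwise.
check_inputs() {
    base="${1:-TPsnpSV}"
    status=0
    for path in fastqDir reference.fasta configFiles/SNPdiscoveryPaired.config.txt; do
        if [ -e "$base/$path" ]; then
            echo "ok       $base/$path"
        else
            echo "MISSING  $base/$path"
            status=1
        fi
    done
    return "$status"
}

# Usage, from the directory where scp was run:
# check_inputs TPsnpSV
```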

Launching an analysis

A single script, toggleGenerator.pl, runs the whole pipeline. Its usage is:

  toggleGenerator.pl -d|--directory DIR -c|--config FILE -o|--outputdir DIR [-r|--reference FILE] [-k|--keyfile FILE] [-g|--gff FILE] [-nocheck|--nocheckFastq] [--help|-h]
Required named arguments:
-d / --directory DIR: a folder with the raw data to be processed (FASTQ, FASTQ.GZ, SAM, BAM, VCF)
-c / --config FILE: generally the software.config.txt file, but it can be any text file structured as shown below.
-o / --outputdir DIR: the current version of TOGGLe does not modify the initial data folder; it creates an output directory containing all analyses.
Optional named arguments:
-r / --reference FILE: the reference FASTA file to be used. (1)
-g / --gff FILE: the GFF file to be used by some tools.
-k / --keyfile FILE: the keyfile used for the demultiplexing step.
-nocheck / --nocheckFastq: by default TOGGLe checks that the FASTQ format is correct in every file. This option skips that check.
-report / --report: generate a PDF report
-h / --help: show the help message and exit

(1): If no database index exists, it is created automatically when needed. If the index already exists, it is not re-created UNLESS the pipeline order (see below) explicitly requests it (e.g. to update the index).

All paths (files and folders) can be provided as absolute (/home/mylogin/data/myRef.fasta) or relative (../data/myRef.fasta).

Example of a command to run TOGGLe:

toggleGenerator.pl -d ~/toggle/fastq -c ~/toggle/SNPdiscoveryPaired.config.txt -o ~/toggle/outputRES -r ~/toggle/reference.fasta -nocheck -report
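
When the same run is repeated with different folders, the command line can be assembled from two variables instead of being retyped. The sketch below only prints the command, so it can be reviewed before being run or submitted. toggle_cmd is a hypothetical helper name; the directory layout is the TPsnpSV layout of this exercise, and only options documented above are used.

```shell
#!/bin/sh
# Hypothetical helper: print a toggleGenerator.pl command line for a working
# directory laid out like TPsnpSV (fastqDir/, reference.fasta, configFiles/).
# Assumption: paths contain no spaces, since the output is not shell-quoted.
toggle_cmd() {
    workdir="$1"   # folder holding the input data, reference and config file
    outdir="$2"    # output directory passed to -o
    printf 'toggleGenerator.pl -d %s -c %s -r %s -o %s -nocheck -report\n' \
        "$workdir/fastqDir" \
        "$workdir/configFiles/SNPdiscoveryPaired.config.txt" \
        "$workdir/reference.fasta" \
        "$outdir"
}

# Usage: print the command, check it, then run it.
# toggle_cmd /home/formationX/TPsnpSV /home/formationX/outputTOGGLe
```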

SOLUTIONS:

vim SNPdiscoveryPaired.config.txt
qsub -q formation.q -b Y -N TOGGLE "module load bioinfo/TOGGLE/0.3.6; toggleGenerator.pl -c /home/formationX/TPsnpSV/configFiles/SNPdiscoveryPaired.config.txt -d /home/formationX/TPsnpSV/fastqDir/ -r /home/formationX/TPsnpSV/reference.fasta -o /home/formationX/outputTOGGLe -nocheck -report"

License

The resource material is licensed under the Creative Commons Attribution 4.0 International License.