{"id":1071,"date":"2020-11-10T16:15:00","date_gmt":"2020-11-10T15:15:00","guid":{"rendered":"https:\/\/itrop.ird.fr\/wordpress\/?page_id=1071"},"modified":"2022-04-06T14:50:08","modified_gmt":"2022-04-06T12:50:08","slug":"trainings-2019-trinity-practice","status":"publish","type":"page","link":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/","title":{"rendered":"Trainings 2019 &#8211; Trinity &#8211; Practice"},"content":{"rendered":"<h2>Trinity Practice<\/h2>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: left;\">Description<\/th>\n<th style=\"text-align: left;\">Hands On Lab Exercises for  RNASeq assembly<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: left;\">Related-course materials<\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/itrop.ird.fr\/wordpress\/index.php\/trainings-2019-linux-for-dummies\/\">Linux for Dummies<\/a><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">Related-course materials<\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/itrop.ird.fr\/wordpress\/index.php\/trainings-2019-rnaseq\/\">RNAseq<\/a><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">Authors<\/td>\n<td style=\"text-align: left;\">Julie Orjuela-Bouniol (julie.orjuela@ird.fr) - i-Trop platform (UMR BOREA \/ DIADE \/ IPME - IRD)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">Creation Date<\/td>\n<td style=\"text-align: left;\">02\/08\/2019<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">Last Modified Date<\/td>\n<td style=\"text-align: left;\">21\/09\/2019<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">Modified by<\/td>\n<td style=\"text-align: left;\">Christine Tranchant (christine.tranchant@ird.fr)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h1>Summary<\/h1>\n<p><!-- TOC depthFrom:2 depthTo:2 withLinks:1 updateOnSave:1 orderedList:0 --><\/p>\n<ul>\n<li><a href=\"#practice-0\">Preambule :  0. 
Going to the i-Trop cluster - <code>ssh,srun,scp<\/code><\/a><\/li>\n<li><a href=\"#practice-1\">Practice 1: Check Reads Quality<\/a><\/li>\n<li><a href=\"#practice-2\">Practice 2: Performing a de novo RNA-Seq assembly with <code>trinity<\/code><\/a><\/li>\n<li><a href=\"#practice-3\">Practice 3: Assessing transcriptome assembly quality<\/a><\/li>\n<li><a href=\"#practice-4\">Practice 4: Differential Expression Analysis (DE)<\/a><\/li>\n<li><a href=\"#license\">License<\/a><\/li>\n<\/ul>\n<hr \/>\n<p><a name=\"practice-0\"><\/a><\/p>\n<h1>0. Going to the i-Trop cluster - <code>ssh,srun,scp<\/code><\/h1>\n<p>The dataset used in this practical comes from:<\/p>\n<ul>\n<li>ref: <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3488244\/\">https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3488244\/<\/a><\/li>\n<li>data: NCBI SRA database, accession number SRS307298 (S. cerevisiae)<\/li>\n<li>Genome size of S. cerevisiae: about 12 Mb (12,157,105 bp) (<a href=\"https:\/\/www.yeastgenome.org\/strain\/S288C#genome_sequence\">https:\/\/www.yeastgenome.org\/strain\/S288C#genome_sequence<\/a>)<\/li>\n<\/ul>\n<p>In this session, we will analyze RNA-seq data from one sample of S. cerevisiae (NCBI SRA<br \/>\nSRS307298). 
It comes from two different origins (CENPK and Batch), with three biological replicates for each<br \/>\norigin (rep1, rep2 and rep3).<\/p>\n<h3>Connection to the i-Trop Cluster through <code>ssh<\/code> mode<\/h3>\n<p>We will work on the i-Trop Cluster with a &quot;supermem&quot; node using the SLURM scheduler.<\/p>\n<pre><code>ssh formationX@bioinfo-master.ird.fr<\/code><\/pre>\n<h3>Opening an interactive bash session on node25 (supermem partition) - <code>srun -p partition --pty bash -i<\/code><\/h3>\n<p>Read this survival guide containing basic SLURM commands (<a href=\"https:\/\/southgreenplatform.github.io\/trainings\/slurm\/\">https:\/\/southgreenplatform.github.io\/trainings\/slurm\/<\/a>)<\/p>\n<pre><code>srun -p supermem --mem 50G --time 20:00:00 --cpus-per-task 2 --pty bash -i<\/code><\/pre>\n<h3>Prepare input files<\/h3>\n<ul>\n<li>Create your subdirectory in the scratch file system \/scratch. In the following, please replace X with your own user ID number in formationX.<\/li>\n<\/ul>\n<pre><code>cd \/scratch\nmkdir formationX\ncd formationX<\/code><\/pre>\n<ul>\n<li>Copy the exercise files from the shared location to your scratch directory (it is essential that all<br \/>\ncalculations take place here)<\/li>\n<\/ul>\n<pre><code>scp -r  nas:\/data2\/formation\/TP-trinity\/SRA_SRS307298\/RAWDATA\/ \/scratch\/formationX\/<\/code><\/pre>\n<ul>\n<li>When the file transfer is finished, check it by listing the content of the current directory and of the subdirectory RAWDATA with the<br \/>\ncommand <code>ls -al<\/code>. You should see 12 gzipped read files in the listing, the <code>samples.txt<\/code> file and the <code>run_trinity.sh<\/code> bash script. 
<\/li>\n<\/ul>\n<pre><code>[orjuela@node25 RAWDATA]$ more samples.txt \nCENPK   CENPK_rep1  PATH\/SRR453569_1.fastq.gz   PATH\/SRR453569_2.fastq.gz\nCENPK   CENPK_rep2  PATH\/SRR453570_1.fastq.gz   PATH\/SRR453570_2.fastq.gz\nCENPK   CENPK_rep3  PATH\/SRR453571_1.fastq.gz   PATH\/SRR453571_2.fastq.gz\nBatch   Batch_rep1  PATH\/SRR453566_1.fastq.gz   PATH\/SRR453566_2.fastq.gz\nBatch   Batch_rep2  PATH\/SRR453567_1.fastq.gz   PATH\/SRR453567_2.fastq.gz\nBatch   Batch_rep3  PATH\/SRR453568_1.fastq.gz   PATH\/SRR453568_2.fastq.gz<\/code><\/pre>\n<hr \/>\n<p><a name=\"practice-1\"><\/a><\/p>\n<h1>1. Check Reads Quality<\/h1>\n<p>FastQC performs simple quality control checks to ensure that the raw data looks good and that there are no problems or biases in the data which may affect how you can use it. <a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/\">http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/<\/a><\/p>\n<pre><code># make a FASTQC directory\ncd ..\/\nmkdir FASTQC; cd FASTQC\n\n# load modules \nmodule load bioinfo\/FastQC\/0.10.1\n\n# run fastqc on all samples\nfastqc -t 2 \/scratch\/formationX\/RAWDATA\/*.gz -o \/scratch\/formationX\/FASTQC\/<\/code><\/pre>\n<p>MultiQC is a modular tool that aggregates results from bioinformatics analyses across many samples into a single report. Use it to visualise the quality results. <a href=\"https:\/\/multiqc.info\/\">https:\/\/multiqc.info\/<\/a><\/p>\n<pre><code># load modules \nmodule load bioinfo\/multiqc\/1.7\n\n# launch MultiQC to create an html report containing all the information generated by FastQC\nmultiqc .\n\n# transfer results to your cluster home \nscp -r multiqc* nas:\/home\/formationX\/\n\n# transfer results to your local machine by scp or filezilla\nscp -r formationX@bioinfo-nas.ird.fr:\/home\/formationX\/multiqc* .\/\n\n# open in your favorite web browser\nfirefox multiqc_report.html<\/code><\/pre>\n<p>In this practice, read quality is fine. 
With your own data, inspect the sequences and check for biases. To remove adaptors and primers you can use Trimmomatic. Use PRINSEQ2 to detect poly-A\/T tails and low-complexity reads. Remove contamination with SortMeRNA, riboPicker or DeconSeq.<\/p>\n<hr \/>\n<p><a name=\"practice-2\"><\/a><\/p>\n<h1>2. Performing a de novo RNA-Seq assembly with <code>trinity<\/code><\/h1>\n<h2>2.1  Running Trinity with trimmomatic and reads normalisation<\/h2>\n<h4>Preparing the assembly samples file and checking the parameters of the trinity assembler<\/h4>\n<p>Look at the <code>run_trinity.sh<\/code> script and adapt it to your formation number. This script is not submitted with sbatch because that could take too long in this training. If you need to launch it on a SLURM cluster, you can use this version containing SBATCH directives <a href=\"https:\/\/github.com\/SouthGreenPlatform\/trainings\/blob\/gh-pages\/files\/AA-SG-ABiMS2019\/run_trinotate.slurm\">run_trinity.slurm<\/a> or adapt it to SGE.<\/p>\n<pre><code>[orjuela@node25 RAWDATA]$ more run_trinity.sh \n\n# loading modules\nmodule load bioinfo\/samtools\/1.9\nmodule load bioinfo\/trinityrnaseq\/2.8.5\nmodule load bioinfo\/Trimmomatic\/0.33\n\n# changing PATH to current directory in samples file\nsed -i &#039;s|PATH|&#039;$PWD&#039;|ig&#039; samples.txt \n\n# Running trinity assembly\n# Trinity --seqType fq --max_memory 50G --CPU 2 --samples_file samples.txt --output ..\/TRINITY_OUT \n\n# Running Trinity with trimmomatic and normalisation\nTrinity --seqType fq --max_memory 50G --CPU 2 --trimmomatic --quality_trimming_params &#039;ILLUMINACLIP:\/usr\/local\/Trimmomatic-0.33\/adapters\/TruSeq2-PE.fa:2:30:10 ILLUMINACLIP:\/scratch\/formationX\/RAWDATA\/adapt-125pbLib.txt:2:30:10 SLIDINGWINDOW:5:20 LEADING:5 TRAILING:5 MINLEN:25 HEADCROP:10&#039; --normalize_by_read_set --samples_file samples.txt --output ..\/TRINITY_OUT<\/code><\/pre>\n<p>Run the assembly:<\/p>\n<pre><code>bash run_trinity.sh &gt; ..\/trinity.log 2&gt;&amp;1 &amp; <\/code><\/pre>\n<p>All screen output (info 
messages and error messages, if any) will be saved in the file<br \/>\n<code>..\/trinity.log<\/code>. The script will start executing in the background (the &amp; at the end), so<br \/>\nthat the terminal will return to the prompt right after you hit \u201cEnter\u201d. <\/p>\n<p>You can use <code>jobs<\/code> to monitor your jobs, but if you log out the program stops running.<\/p>\n<p>While it is running, you can follow the steps with <code>tail -f ..\/trinity.log<\/code>.<\/p>\n<p>After Trimmomatic and read normalisation, Trinity runs a k-mer counting step (0) followed by its three stages (1 to 3):<\/p>\n<ol start=\"0\">\n<li>Jellyfish: extracts and counts K-mers (K=25) from reads<\/li>\n<li>Inchworm: assembles initial contigs by \u201cgreedily\u201d extending sequences with the most abundant K-mers<\/li>\n<li>Chrysalis: clusters overlapping Inchworm contigs, builds de Bruijn graphs for each cluster, partitions reads between clusters<\/li>\n<li>Butterfly: resolves alternatively spliced and paralogous transcripts independently for each cluster (in parallel)<\/li>\n<\/ol>\n<h4>Stopping the assembly and downloading assembly results<\/h4>\n<ul>\n<li>\n<p class=\"warning\"> WARNING !: This job is expected to run for about 12 hours. 
Kill your job using <code>fg<\/code> and <code>ctrl+c<\/code> <\/p>\n<\/li>\n<\/ul>\n<pre><code> jobs = show running jobs\n fg = bring a job to the foreground\n bg = resume a suspended job in the background\n ctrl+z = suspend the foreground job (resume it with bg or fg)\n ctrl+c = kill the foreground job<\/code><\/pre>\n<ul>\n<li>Retrieve the assembly results (only Trinity.fasta) generated by trinity from the shared location into your scratch directory: <\/li>\n<\/ul>\n<pre><code>scp -r  nas:\/data2\/formation\/TP-trinity\/TRINITY_OUT\/TRINITY_OUT\/Trinity.fasta \/scratch\/formationX\/TRINITY_OUT\/\n\ncd \/scratch\/formationX\/TRINITY_OUT<\/code><\/pre>\n<ul>\n<li>Upon successful completion of Trinity, the assembled transcriptome is written to the FASTA file called<br \/>\n<code>Trinity.fasta<\/code> located in the output directory <code>\/scratch\/formationX\/TRINITY_OUT<\/code>.<\/li>\n<\/ul>\n<hr \/>\n<p><a name=\"practice-3\"><\/a><\/p>\n<h1>3. Assessing transcriptome assembly quality<\/h1>\n<h2>3.1 Getting basic Assembly metrics with the trinity script <code>TrinityStats.pl<\/code><\/h2>\n<h5>Running trinity script<\/h5>\n<p>Trinity.fasta contains transcripts to be evaluated, annotated, and used in downstream analysis of<br \/>\nexpression. In this exercise, we only concentrate on basic statistics of the assembled transcriptome,<br \/>\nwhich can be obtained using the Trinity utility script TrinityStats.pl. <\/p>\n<pre><code>#declare the path_to_trinity variable\npath_to_trinity=\/usr\/local\/trinityrnaseq-2.8.5\/\n#check path\necho $path_to_trinity\n#launch assembly metrics script\n$path_to_trinity\/util\/TrinityStats.pl \/scratch\/formationX\/TRINITY_OUT\/Trinity.fasta &gt; trinityStats.log<\/code><\/pre>\n<h5>Parsing the results generated by the script<\/h5>\n<p>The generated output file (trinityStats.log) contains basic information about contig length distributions, based on all transcripts and on only the longest isoform per gene.   <\/p>\n<p>Besides average and median contig lengths, also given are quantities N10 through N50. 
Nx is the contig length such that x% of all assembled bases are in contigs of at least that length. Specifically, N50 is the contig length such that at least half of all assembled bases are contained in contigs of that length or longer. In whole genome assembly, N50 is often used as a measure (one of many) of assembly quality, since the longer the contigs, the better the assembly. In the case of a transcriptome, contig lengths should be correct, which does not imply \u201clarge\u201d. If it falls in the right ballpark (about 1,000-1,500 bases), N50 can still be used as a check on the overall \u201csanity\u201d of the transcriptome assembly.<\/p>\n<pre><code>[orjuela@node25 orjuela]$ more trinityStats.log \n\n################################\n## Counts of transcripts, etc.\n################################\nTotal trinity &#039;genes&#039;:    7600\nTotal trinity transcripts:  8616\nPercent GC: 39.37\n\n########################################\nStats based on ALL transcript contigs:\n########################################\n\n    Contig N10: 11180\n    Contig N20: 8394\n    Contig N30: 6793\n    Contig N40: 5553\n    Contig N50: 4736\n\n    Median contig length: 506\n    Average contig: 1909.88\n    Total assembled bases: 16455500\n\n#####################################################\n## Stats based on ONLY LONGEST ISOFORM per &#039;GENE&#039;:\n#####################################################\n\n    Contig N10: 9989\n    Contig N20: 7558\n    Contig N30: 6192\n    Contig N40: 5131\n    Contig N50: 4194\n\n    Median contig length: 423\n    Average contig: 1625.41\n    Total assembled bases: 12353135\n<\/code><\/pre>\n<h2>3.2 Reads mapping back rate and abundance estimation using the trinity script <code>align_and_estimate_abundance.pl<\/code><\/h2>\n<p>Read congruency is an important measure in determining assembly accuracy. Clusters of read pairs that align incorrectly are strong indicators of mis-assembly. 
A typical &quot;good&quot; assembly has ~80% of reads mapping to the assembly, and ~80% of them are properly paired.<\/p>\n<p>Several tools can be used to calculate the rate of reads mapping back to the <code>Trinity.fasta<\/code> assembly: bwa, bowtie2 (mapping), kallisto, salmon (pseudo-mapping). Read counts can also be quantified for each gene\/isoform. Both mapping and quantification are obtained through the --est_method argument of the <code>align_and_estimate_abundance.pl<\/code> script.<\/p>\n<p>We will perform this analysis in two successive steps with the <code>align_and_estimate_abundance.pl<\/code> script:<\/p>\n<ul>\n<li>Pseudo-mapping methods (kallisto or salmon) are faster than mapping-based methods. So we will first use salmon to pseudo-align the reads of each sample to the reference and quantify abundance. <\/li>\n<li>Then, we will use bowtie2 and RSEM to align and quantify read counts for each gene\/isoform.<\/li>\n<\/ul>\n<h5>Preparing data and environment for the following analyses<\/h5>\n<pre><code># create a directory in \/scratch\/formationX\/\ncd \/scratch\/formationX\/\nmkdir ALIGN_AND_ABUNDANCE\ncd ALIGN_AND_ABUNDANCE\n\n# Loading modules\nmodule load system\/perl\/5.24.0\nmodule load bioinfo\/trinityrnaseq\/2.8.5\nmodule load bioinfo\/bowtie2\/2.2.9\nmodule load bioinfo\/express\/1.5.1\nmodule load bioinfo\/kallisto\/0.43.1\nmodule load bioinfo\/RSEM\/1.0\nmodule load bioinfo\/salmon\/0.10.2\nmodule load bioinfo\/samtools\/1.7\n\n# update the paths in the samples file because the reads were quality-trimmed with the --trimmomatic parameter in Trinity\ncp \/scratch\/formationX\/RAWDATA\/samples.txt .\nsed -i &#039;s|RAWDATA|TRINITY_OUT|ig&#039; samples.txt\nsed -i &#039;s|.fastq.gz|.fastq.gz.P.qtrim.gz|ig&#039; samples.txt\n\n# define variables\npath_to_trinity=\/usr\/local\/trinityrnaseq-2.8.5\/\nfasta=\/scratch\/formationX\/TRINITY_OUT\/Trinity.fasta\nsamplesfile=\/scratch\/formationX\/ALIGN_AND_ABUNDANCE\/samples.txt #modified samples file<\/code><\/pre>\n<h5>Launch Salmon<\/h5>\n<pre><code># create a 
salmon_outdir and go into it\nmkdir salmon_outdir; cd salmon_outdir\n\n# salmon\nperl $path_to_trinity\/util\/align_and_estimate_abundance.pl \\\n--transcripts $fasta \\\n--seqType fq \\\n--samples_file $samplesfile \\\n--est_method salmon \\\n--trinity_mode \\\n--prep_reference &gt; salmon_align_and_estimate_abundance.log 2&gt;&amp;1 &amp;<\/code><\/pre>\n<h4>Check the percentage of reads mapped to the Trinity.fasta reference, per sample, for the salmon and bowtie2-rsem estimation methods.<\/h4>\n<pre><code>#salmon results\n[orjuela@node25 salmon_outdir]$ grep &#039;Mapping rate =&#039; *\/logs\/salmon_quant.log \nBatch_rep1\/logs\/salmon_quant.log:[2019-09-17 15:54:32.923] [jointLog] [info] Mapping rate = 96.3027%\nBatch_rep2\/logs\/salmon_quant.log:[2019-09-17 15:55:09.922] [jointLog] [info] Mapping rate = 96.3992%\nBatch_rep3\/logs\/salmon_quant.log:[2019-09-17 15:55:39.246] [jointLog] [info] Mapping rate = 96.7795%\nCENPK_rep1\/logs\/salmon_quant.log:[2019-09-17 15:52:56.609] [jointLog] [info] Mapping rate = 94.4516%\nCENPK_rep2\/logs\/salmon_quant.log:[2019-09-17 15:53:30.146] [jointLog] [info] Mapping rate = 95.0558%\nCENPK_rep3\/logs\/salmon_quant.log:[2019-09-17 15:54:00.196] [jointLog] [info] Mapping rate = 95.0639%<\/code><\/pre>\n<h5>OPTIONAL: Launch bowtie2-rsem<\/h5>\n<ul>\n<li>\n<p class=\"warning\"> WARNING !: this job can take a long time, about 1h30 per sample. 
This step will not be run in this practice.<\/p>\n<\/li>\n<\/ul>\n<pre><code># create a bowtie2-rsem_outdir and go into it\ncd \/scratch\/formationX\/ALIGN_AND_ABUNDANCE\/\nmkdir bowtie2-rsem_outdir; cd bowtie2-rsem_outdir\n\n# running align_and_estimate_abundance in bowtie2-rsem mode\nperl $path_to_trinity\/util\/align_and_estimate_abundance.pl \\\n--transcripts $fasta \\\n--seqType fq \\\n--samples_file $samplesfile \\\n--est_method RSEM --aln_method bowtie2 \\\n--trinity_mode \\\n--prep_reference \\\n--coordsort_bam &gt; bowtie-rsem_align_and_estimate_abundance.log 2&gt;&amp;1 &amp;<\/code><\/pre>\n<p>Here are the results obtained with the OPTIONAL bowtie2-rsem part:<\/p>\n<pre><code>#bowtie-rsem results\n[orjuela@node25 bowtie2-rsem_outdir]$ grep -B 6 &quot;overall alignment rate&quot; bowtie-rsem_align_and_estimate_abundance.log\nCMD: set -o pipefail &amp;&amp; bowtie2 --no-mixed --no-discordant --gbar 1000 --end-to-end -k 200  -q -X 800 -x \/scratch\/orjuela\/TRINITY_OUT\/Trinity.fasta.bowtie2 -1 \/scratch\/orjuela\/TRINITY_OUT\/SRR453569_1.fastq.gz.P.qtrim.gz -2 \/scratch\/orjuela\/TRINITY_OUT\/SRR453569_2.fastq.gz.P.qtrim.gz -p 4 | samtools view -@ 4 -F 4 -S -b | samtools sort -@ 4 -n -o bowtie2.bam \n3666948 reads; of these:\n  3666948 (100.00%) were paired; of these:\n    497221 (13.56%) aligned concordantly 0 times\n    2058938 (56.15%) aligned concordantly exactly 1 time\n    1110789 (30.29%) aligned concordantly &gt;1 times\n86.44% overall alignment rate\n--\n ...<\/code><\/pre>\n<h2>3.3. Expression matrix construction<\/h2>\n<p>Combine the read counts from all samples into a matrix, and normalize them using the TMM method. 
The command below takes the salmon output file (<code>quant.sf<\/code>) of each sample and combines them into a single matrix file.<\/p>\n<pre><code># go to the salmon results directory\ncd \/scratch\/formationX\/ALIGN_AND_ABUNDANCE\/salmon_outdir\n\n#load modules\nmodule load bioinfo\/R\/3.3.3\nmodule load system\/perl\/5.24.0\n\n#declare bash variables\npath_to_trinity=\/usr\/local\/trinityrnaseq-2.8.5\/\n\n# calculate expression matrix (if salmon use quant.sf files, if kallisto use abundance.tsv files)\n$path_to_trinity\/util\/abundance_estimates_to_matrix.pl \\\n--est_method salmon \\\n--out_prefix Trinity_trans \\\n--name_sample_by_basedir \\\n--gene_trans_map none \\\nCENPK_rep1\/quant.sf \\\nCENPK_rep2\/quant.sf \\\nCENPK_rep3\/quant.sf \\\nBatch_rep1\/quant.sf \\\nBatch_rep2\/quant.sf \\\nBatch_rep3\/quant.sf <\/code><\/pre>\n<p>You should obtain two matrices:<br \/>\nThe first one contains the estimated counts (<code>Trinity_trans.isoform.counts.matrix<\/code>) and the second one contains the TPM expression values that are cross-sample normalized using the TMM method (<code>Trinity_trans.isoform.TMM.EXPR.matrix<\/code>).<br \/>\nTMM normalization assumes that most transcripts are not differentially expressed, and linearly scales the expression values of samples to better enforce this property.<\/p>\n<p>In both files, each column represents a sample and each row represents a transcript (isoform); the values are either the estimated read counts or the TMM-normalized TPM values. The \u201ccounts\u201d file will be used to identify differentially expressed transcripts, and the \u201cTMM.EXPR\u201d file will be used for clustering analysis. 
<\/p>\n<pre><code>[orjuela@node25 salmon_outdir]$ ll Trinity_trans.*\ntotal 1,4M\n-rw-r--r-- 1 orjuela borea-equipe7 539K 29 ao\u00fbt  16:41 Trinity_trans.isoform.TMM.EXPR.matrix\n-rw-r--r-- 1 orjuela borea-equipe7 386K 29 ao\u00fbt  16:41 Trinity_trans.isoform.counts.matrix\n-rw-r--r-- 1 orjuela borea-equipe7 474K 29 ao\u00fbt  16:41 Trinity_trans.isoform.TPM.not_cross_norm\n-rw-r--r-- 1 orjuela borea-equipe7  355 29 ao\u00fbt  16:41 Trinity_trans.isoform.TPM.not_cross_norm.TMM_info.txt\n-rw-r--r-- 1 orjuela borea-equipe7  542 29 ao\u00fbt  16:41 Trinity_trans.isoform.TPM.not_cross_norm.runTMM.R\n\n[orjuela@node25 salmon_outdir]$ head *.matrix\n==&gt; Trinity_trans.isoform.counts.matrix &lt;==\n    CENPK_rep1  CENPK_rep2  CENPK_rep3  Batch_rep1  Batch_rep2  Batch_rep3\nTRINITY_DN3130_c0_g1_i1 0.000000    4.000000    0.000000    0.000000    0.000000    0.000000\nTRINITY_DN986_c0_g1_i4  190.000000  210.677452  319.000000  353.571187  420.080043  292.000000\nTRINITY_DN4062_c0_g2_i1 0.000000    13.000000   0.000000    0.000000    0.000000    0.000000\nTRINITY_DN4840_c0_g1_i1 0.000000    10.000000   0.000000    0.000000    0.000000    0.000000\nTRINITY_DN1624_c0_g1_i1 0.000000    5.000000    0.000000    0.000000    0.000000    0.000000\nTRINITY_DN3492_c0_g1_i1 0.000000    5.000000    0.000000    0.000000    0.000000    0.000000\nTRINITY_DN1386_c0_g1_i1 671.000000  724.000000  997.000000  1325.000000 1726.000000 1229.000000\nTRINITY_DN6688_c0_g1_i1 0.000000    5.000000    0.000000    0.000000    0.000000    0.000000\nTRINITY_DN1126_c0_g2_i1 29.000000   42.000000   51.000000   14.000000   7.000000    5.000000\n\n==&gt; Trinity_trans.isoform.TMM.EXPR.matrix &lt;==\n    CENPK_rep1  CENPK_rep2  CENPK_rep3  Batch_rep1  Batch_rep2  Batch_rep3\nTRINITY_DN3130_c0_g1_i1 0.000   32.382  0.000   0.000   0.000   0.000\nTRINITY_DN986_c0_g1_i4  141.255 131.008 158.103 195.483 180.432 173.943\nTRINITY_DN4062_c0_g2_i1 0.000   20.291  0.000   0.000   0.000   
0.000\nTRINITY_DN4840_c0_g1_i1 0.000   23.121  0.000   0.000   0.000   0.000\nTRINITY_DN1624_c0_g1_i1 0.000   22.187  0.000   0.000   0.000   0.000\nTRINITY_DN3492_c0_g1_i1 0.000   30.511  0.000   0.000   0.000   0.000\nTRINITY_DN1386_c0_g1_i1 96.472  86.724  94.337  140.804 141.542 140.796\nTRINITY_DN6688_c0_g1_i1 0.000   23.740  0.000   0.000   0.000   0.000\nTRINITY_DN1126_c0_g2_i1 9.819   11.863  11.412  3.511   1.357   1.351<\/code><\/pre>\n<pre><code>[orjuela@node25 salmon_outdir]$ more Trinity_trans.isoform.TPM.not_cross_norm.TMM_info.txt\ngroup   lib.size    norm.factors    eff.lib.size\nCENPK_rep1  999995  1.14115425493455    1141148.54916327\nCENPK_rep2  999993  0.905154412329735   905148.076248848\nCENPK_rep3  1000003 1.0939694255262 1093972.70743448\nBatch_rep1  999992  0.989872299588587   989864.38061019\nBatch_rep2  999990  0.950965458077021   950955.94842244\nBatch_rep3  1000015 0.940121262909323   940135.364728266<\/code><\/pre>\n<h2>3.4 Compute N50 based on the top-most highly expressed transcripts (ExN50)<\/h2>\n<pre><code>$path_to_trinity\/util\/misc\/contig_ExN50_statistic.pl Trinity_trans.isoform.TMM.EXPR.matrix \/scratch\/formationX\/TRINITY_OUT\/Trinity.fasta &gt; ExN50.stats\n\n#Plotting ExN50\n$path_to_trinity\/util\/misc\/plot_ExN50_statistic.Rscript ExN50.stats\n\n# To find out how many transcripts correspond to the Ex90 peak, you could run:\ncat Trinity_trans.isoform.TMM.EXPR.matrix.E-inputs |  egrep -v ^\\# | awk &#039;$1 &lt;= 90&#039; | wc -l\n1761\n\n# or for Ex50\ncat Trinity_trans.isoform.TMM.EXPR.matrix.E-inputs |  egrep -v ^\\# | awk &#039;$1 &lt;= 50&#039; | wc -l\n163<\/code><\/pre>\n<p>TRANSFER: Observe plots. 
Remember to copy the *.pdf files to your home directory before transferring them to your local machine.<\/p>\n<pre><code># transferring plots\ncp *.pdf \/home\/formationX\/\n# from your local machine\nscp formationX@bioinfo-nas.ird.fr:*.pdf .<\/code><\/pre>\n<p><img decoding=\"async\" width=\"100%\" class=\"img-responsive\" src=\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png\" alt=\"\" \/><\/p>\n<h2>3.5 Quantifying completeness using <code>BUSCO<\/code><\/h2>\n<p>Assessing gene space is a core aspect of knowing whether or not you have a good assembly.<\/p>\n<p>Benchmarking Universal Single-Copy Orthologs (BUSCO) sets are collections of orthologous groups with near-universally-distributed single-copy genes in each species, selected from OrthoDB root-level orthology delineations across arthropods, vertebrates, metazoans, fungi, and eukaryotes. BUSCO groups were selected from each major radiation of the species phylogeny requiring genes to be present as single-copy orthologs in at least 90% of the species; in others they may be lost or duplicated, and to ensure broad phyletic distribution they cannot all be missing from one sub-clade. The species that define each major radiation were selected to include the majority of OrthoDB species, excluding only those with unusually high numbers of missing or duplicated orthologs, while retaining representation from all major sub-clades. 
Their widespread presence means that any BUSCO can therefore be expected to be found as a single-copy ortholog in any newly-sequenced genome from the appropriate phylogenetic clade.<\/p>\n<h5>Running BUSCO on the Trinity.fasta assembly<\/h5>\n<pre><code>#make a BUSCO directory\ncd \/scratch\/formationX\/\nmkdir BUSCO; cd BUSCO\n# download the eukaryota database from https:\/\/busco.ezlab.org\/\n# Several other databases can be used for this dataset (e.g. the Fungi or Saccharomyceta gene sets, with more genes to retrieve, so longer to run)\n\nwget https:\/\/busco.ezlab.org\/datasets\/eukaryota_odb9.tar.gz\n# uncompress the archive\ntar zxvf eukaryota_odb9.tar.gz\n\n# define variables\nbusco=\/usr\/local\/BUSCO-3.0.2\/scripts\/run_BUSCO.py\npath_to_trinity=\/usr\/local\/trinityrnaseq-2.8.5\/\nfasta=\/scratch\/formationX\/TRINITY_OUT\/Trinity.fasta\nLINEAGE=\/scratch\/formationX\/BUSCO\/eukaryota_odb9\n\n# run busco\n\/usr\/local\/python-3.6.5\/bin\/python $busco -i $fasta -l $LINEAGE -m transcriptome  -c 2 -o trinity_busco &gt; busco.log &amp;\n\n# check that busco is running using \njobs\n# or \ntail -f busco.log<\/code><\/pre>\n<h5>Displaying the short summary table generated by BUSCO<\/h5>\n<pre><code>[orjuela@node25 run_trinity_busco]$ more run_trinity_busco\/short_summary_trinity_busco.txt \n# BUSCO version is: 3.0.2 \n# The lineage dataset is: eukaryota_odb9 (Creation date: 2016-11-02, number of species: 100, number of BUSCOs: 303)\n# To reproduce this run: python \/usr\/local\/BUSCO-3.0.2\/scripts\/run_BUSCO.py -i \/scratch\/formation1\/TRINITY_OUT\/Trinity.fasta -o trinity_busco_euk -l \/scratch\/formation1\/BUSCO\/eukaryota_odb9\/ -m transcriptome -c 2\n\n#\n# Summarized benchmarking in BUSCO notation for file \/scratch\/orjuela\/TRINITY_OUT\/Trinity.fasta\n# BUSCO was run in mode: transcriptome\n\n    C:93.4%[S:74.6%,D:18.8%],F:0.0%,M:6.6%,n:303\n\n    283 Complete BUSCOs (C)\n    226 Complete and single-copy BUSCOs (S)\n    57  Complete and duplicated BUSCOs 
(D)\n    0   Fragmented BUSCOs (F)\n    20  Missing BUSCOs (M)\n    303 Total BUSCO groups searched\n<\/code><\/pre>\n<h2>3.6 BLASTX comparison to known protein sequences database<\/h2>\n<h5>Performing a blastx against the swissprot database (the manually annotated and reviewed section of the UniProt Knowledgebase)<\/h5>\n<ul>\n<li>\n<p class=\"warning\"> WARNING !: This step will not be run in this practice because it is included in the annotation script. The command lines are given here so that you can launch them on your own data.<\/p>\n<\/li>\n<\/ul>\n<pre><code>#load modules\nmodule load bioinfo\/blast\/2.4.0+\n\n#define variables\npath_to_blast=\/usr\/local\/ncbi-blast-2.4.0+\/bin\/\npath_to_trinity=\/usr\/local\/trinityrnaseq-2.8.5\/\nfasta=\/scratch\/formationX\/TRINITY_OUT\/Trinity.fasta\n\n# create the BLASTX directory\nmkdir \/scratch\/formationX\/BLASTX\ncd \/scratch\/formationX\/BLASTX\n\n# First, we downloaded and indexed the database from ftp:\/\/ftp.ebi.ac.uk\/pub\/databases\/uniprot\/current_release\/knowledgebase\/complete\/uniprot_sprot.fasta.gz\n# the uniprot database is available on the cluster (with the correct index files) at\nuniprot=\/data\/projects\/banks\/Uniprot\/uniprot_sprot.fasta\n\n#Then, we ran BLASTX to get the top match hit:\n$path_to_blast\/blastx \\\n-db $uniprot \\\n-query $fasta \\\n-num_threads 2 \\\n-max_target_seqs 1 \\\n-outfmt 6 \\\n-evalue 1e-20 &gt; SwissProt_1E20_Trinity_blastx.outfmt6 &amp;\n\n#if you want to test the blastX practice, please retrieve the results from `\/data2\/formation\/TP-trinity\/TRINITY_OUT\/BLASTX\/`\n\n#Finally, we examined the percent of alignment coverage:\n$path_to_trinity\/util\/misc\/blast_outfmt6_group_segments.pl SwissProt_1E20_Trinity_blastx.outfmt6 $fasta $uniprot &gt; SwissProt_1E20_Trinity_blastx.outfmt6.grouped\n\n$path_to_trinity\/util\/misc\/blast_outfmt6_group_segments.tophit_coverage.pl SwissProt_1E20_Trinity_blastx.outfmt6.grouped \\\n&gt; SwissProt_1E20_Trinity_blastx.outfmt6.grouped.output<\/code><\/pre>\n<h5>Analyzing blast results<\/h5>\n<pre><code>[orjuela@node25 BLASTX]$ 
more SwissProt_1E20_Trinity_blastx.outfmt6.grouped.output\n#hit_pct_cov_bin    count_in_bin    &gt;bin_below\n100 242 242\n90  8   250\n80  4   254\n70  5   259\n60  4   263\n50  5   268\n40  4   272\n30  2   274\n20  2   276\n10  1   277<\/code><\/pre>\n<p>8616 transcripts from the <code>Trinity.fasta<\/code> file were blasted against the swissprot database.<\/p>\n<p>440 hits were reported in the <code>SwissProt_1E20_Trinity_blastx.outfmt6<\/code> file.<\/p>\n<p>In the <code>SwissProt_1E20_Trinity_blastx.outfmt6.grouped.output<\/code> file we can observe, for example, that 242 proteins fall in the 100% bin of hit coverage (count_in_bin), meaning that the best-matching transcript aligns over nearly the full length of the uniprot protein. This is a coverage of the protein length, not a percent identity. The bin_below column gives the cumulative number of proteins.<\/p>\n<hr \/>\n<p><a name=\"practice-4\"><\/a><\/p>\n<h1>4. Differential Expression Analysis (DE)<\/h1>\n<h2>4.1 Examine your data and your experimental replicates before DE<\/h2>\n<p>Before differential expression analysis, examine your data to determine if there are any confounding issues. Trinity comes with a 'PtR' script that we use to simplify making various charts and plots based on a matrix input file. 
First create the design file, then run the three PtR command lines below:<\/p>\n<pre><code># go to the salmon results directory\ncd \/scratch\/formationX\/ALIGN_AND_ABUNDANCE\/salmon_outdir\n\n#create design file from samples.txt\ncut -f1,2 ..\/samples.txt &gt; design.txt\n\n[orjuela@node25 salmon_outdir]$ more design.txt \nCENPK   CENPK_rep1\nCENPK   CENPK_rep2\nCENPK   CENPK_rep3\nBatch   Batch_rep1\nBatch   Batch_rep2\nBatch   Batch_rep3<\/code><\/pre>\n<p>Run PtR scripts<\/p>\n<pre><code>\n$path_to_trinity\/Analysis\/DifferentialExpression\/PtR  --matrix Trinity_trans.isoform.counts.matrix --samples design.txt --log2 --min_rowSums 10 --compare_replicates\n\n$path_to_trinity\/Analysis\/DifferentialExpression\/PtR  --matrix Trinity_trans.isoform.counts.matrix --samples design.txt --log2 --min_rowSums 10 --CPM --sample_cor_matrix\n\n$path_to_trinity\/Analysis\/DifferentialExpression\/PtR  --matrix Trinity_trans.isoform.counts.matrix --samples design.txt --log2 --min_rowSums 10  --CPM --center_rows --prin_comp <\/code><\/pre>\n<p>TRANSFER: Observe plots. Remember to copy the *.pdf files to your home directory before transferring them to your local machine. <\/p>\n<pre><code># transferring plots\ncp *.pdf \/home\/formationX\/\n# from your local machine\nscp formationX@bioinfo-nas.ird.fr:*.pdf .<\/code><\/pre>\n<h2>4.2 Identify differentially expressed genes between the two conditions.<\/h2>\n<p>The tool <code>run_DE_analysis.pl<\/code> is a Perl script that uses the <code>Bioconductor package edgeR<\/code>. <\/p>\n<pre><code>\n#run DE analysis\n$path_to_trinity\/Analysis\/DifferentialExpression\/run_DE_analysis.pl \\\n--matrix Trinity_trans.isoform.counts.matrix \\\n--method edgeR \\\n--samples_file design.txt \\\n--output edgeR_results<\/code><\/pre>\n<p>The output files are in the directory edgeR_results. Look at the file <code>Trinity_trans.isoform.counts.matrix.Batch_vs_CENPK.edgeR.DE_results<\/code>. 
It provides several values for each gene: <\/p>\n<p>1) FDR is the false discovery rate, used to decide whether a gene is differentially expressed or not <\/p>\n<p>2) logFC is the log2-transformed fold change between the two conditions<\/p>\n<p>3) logCPM is the log2-transformed normalized read count (counts per million), averaged across the samples.<\/p>\n<p>Usually you have to filter this list of genes\/isoforms to FDR &lt;0.05 or below. To be more conservative, you could also use a more stringent FDR cutoff (e.g. &lt;0.001), and only keep genes with a high absolute logFC (e.g. &lt;-2 or &gt;2) and\/or a high logCPM (e.g. &gt;1). In the edgeR_results directory there is also a \u201cvolcano plot\u201d to visualize the distribution of the DE genes.<\/p>\n<pre><code>[orjuela@node25 salmon_outdir]$ head edgeR_results\/Trinity_trans.isoform.counts.matrix.Batch_vs_CENPK.edgeR.DE_results\nsampleA sampleB logFC   logCPM  PValue  FDR\nTRINITY_DN332_c0_g1_i10 Batch   CENPK   -10.359903780639    8.33512668205486    0   0\nTRINITY_DN730_c0_g2_i1  Batch   CENPK   -6.4740873961699    8.70152404919916    0   0\nTRINITY_DN741_c0_g1_i1  Batch   CENPK   -6.31231314146213   10.2830859720565    0   0\nTRINITY_DN287_c1_g1_i1  Batch   CENPK   -6.26650019458591   8.47203996977476    0   0\nTRINITY_DN65_c0_g1_i3   Batch   CENPK   -4.13189957112457   10.730475730689 0   0\nTRINITY_DN1298_c0_g1_i1 Batch   CENPK   -3.19109633696326   8.91247526735535    0   0\nTRINITY_DN1253_c0_g1_i1 Batch   CENPK   -3.03997649547823   8.99502681145441    6.47317750883953e-301   3.75444295512693e-298\nTRINITY_DN51_c0_g1_i1   Batch   CENPK   -4.59451625218422   10.4521118447878    5.94332889429699e-289   3.01623941385572e-286\nTRINITY_DN708_c0_g1_i1  Batch   CENPK   -4.1838974791094    7.70916912374135    3.84699757684993e-275   1.73542335133453e-272\n<\/code><\/pre>\n<pre><code>#transferring plots\ncd edgeR_results\ncp *.pdf \/home\/formationX\/\n# from your local machine\nscp formationX@bioinfo-nas.ird.fr:*.pdf .<\/code><\/pre>\n<p><img decoding=\"async\" width=\"100%\" 
class=\"img-responsive\" src=\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/maplot.png\" alt=\"\" \/><br \/>\n<img decoding=\"async\" width=\"100%\" class=\"img-responsive\" src=\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/volcano.png\" alt=\"\" \/><\/p>\n<h2>4.3 Clustering analysis<\/h2>\n<p>Hierarchical clustering and k-means clustering of samples and genes can be done using Trinity scripts. <\/p>\n<p>The clustering is performed only on differentially expressed genes, with the FDR and logFC cutoffs defined by the -P and -C parameters.<\/p>\n<p>In this example, we set K=4 for k-means analysis. The genes will be separated into 4 groups based on their expression patterns. <\/p>\n<p>Two prefiltered files are produced: <code>*DE_results.P1e-3_C2.Batch-UP.subset<\/code> and <code>*DE_results.P1e-3_C2.CENPK-UP.subset<\/code>, containing the differentially expressed genes (FDR cutoff 0.001, logFC cutoff 2 and -2). <\/p>\n<pre><code>#go to edgeR_results\ncd edgeR_results\n# running analyze_diff_expr\n$path_to_trinity\/Analysis\/DifferentialExpression\/analyze_diff_expr.pl \\\n--matrix ..\/Trinity_trans.isoform.TMM.EXPR.matrix --samples ..\/design.txt -P 1e-3 -C 2 \\\n--output cluster_results\n# running define_clusters_by_cutting_tree\n$path_to_trinity\/Analysis\/DifferentialExpression\/define_clusters_by_cutting_tree.pl \\\n-K 4 -R cluster_results.matrix.RData <\/code><\/pre>\n<pre><code>#transferring plots\ncp *.pdf \/home\/formationX\/\n# from your local machine\nscp formationX@bioinfo-nas.ird.fr:*.pdf .<\/code><\/pre>\n<p><img decoding=\"async\" width=\"100%\" class=\"img-responsive\" src=\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/k4.png\" alt=\"\" \/><\/p>\n<h3>Before you finish ...<\/h3>\n<p>Go <a href=\"https:\/\/itrop.ird.fr\/wordpress\/index.php\/trainings-2019-trinity\/\">here<\/a> to retrieve the scripts used in this training.<\/p>\n<hr \/>\n<h3>License<\/h3>\n<p><a name=\"license\"><\/a><\/p>\n<div>\nThe resource material is 
licensed under the Creative Commons Attribution 4.0 International License (<a href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\">here<\/a>).<br \/>\n<center><img decoding=\"async\" width=\"25%\" class=\"img-responsive\" src=\"http:\/\/creativecommons.org.nz\/wp-content\/uploads\/2012\/05\/by-nc-sa1.png\"\/><br \/>\n<\/center>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Trinity Practice Description Hands On Lab Exercises for RNASeq assembly Related-course materials Linux for Dummies Related-course materials RNAseq Authors Julie&hellip; <br \/> <a class=\"read-more\" href=\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\">Lire la suite<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":1067,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"inline_featured_image":false,"footnotes":""},"class_list":["post-1071","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Trainings 2019 - Trinity - Practice - itrop<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\" \/>\n<meta property=\"og:locale\" content=\"fr_FR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Trainings 2019 - Trinity - Practice - itrop\" \/>\n<meta property=\"og:description\" content=\"Trinity Practice Description Hands On Lab Exercises for RNASeq assembly Related-course materials Linux for Dummies Related-course materials RNAseq Authors Julie&hellip; Lire la suite\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\" \/>\n<meta property=\"og:site_name\" content=\"itrop\" \/>\n<meta property=\"article:modified_time\" content=\"2022-04-06T12:50:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@ItropBioinfo\" \/>\n<meta name=\"twitter:label1\" content=\"Dur\u00e9e de lecture estim\u00e9e\" \/>\n\t<meta name=\"twitter:data1\" content=\"20 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\",\"url\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\",\"name\":\"Trainings 2019 - Trinity - Practice - 
itrop\",\"isPartOf\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png\",\"datePublished\":\"2020-11-10T15:15:00+00:00\",\"dateModified\":\"2022-04-06T12:50:08+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#breadcrumb\"},\"inLanguage\":\"fr-FR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage\",\"url\":\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png\",\"contentUrl\":\"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\/\/bioinfo.ird.fr\/index.php\/en\/front-page-2\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Trainings &#8211; FR\",\"item\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Trainings 2019 &#8211; Trinity\",\"item\":\"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Trainings 2019 &#8211; Trinity &#8211; 
Practice\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/bioinfo.ird.fr\/#website\",\"url\":\"https:\/\/bioinfo.ird.fr\/\",\"name\":\"itrop\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/bioinfo.ird.fr\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"fr-FR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/bioinfo.ird.fr\/#organization\",\"name\":\"i-Trop\",\"url\":\"https:\/\/bioinfo.ird.fr\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\/\/bioinfo.ird.fr\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/bioinfo.ird.fr\/wp-content\/uploads\/2021\/10\/i-tropTwt5.png\",\"contentUrl\":\"https:\/\/bioinfo.ird.fr\/wp-content\/uploads\/2021\/10\/i-tropTwt5.png\",\"width\":1356,\"height\":1356,\"caption\":\"i-Trop\"},\"image\":{\"@id\":\"https:\/\/bioinfo.ird.fr\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/ItropBioinfo\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Trainings 2019 - Trinity - Practice - itrop","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/","og_locale":"fr_FR","og_type":"article","og_title":"Trainings 2019 - Trinity - Practice - itrop","og_description":"Trinity Practice Description Hands On Lab Exercises for RNASeq assembly Related-course materials Linux for Dummies Related-course materials RNAseq Authors Julie&hellip; Lire la suite","og_url":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/","og_site_name":"itrop","article_modified_time":"2022-04-06T12:50:08+00:00","og_image":[{"url":"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_site":"@ItropBioinfo","twitter_misc":{"Dur\u00e9e de lecture estim\u00e9e":"20 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/","url":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/","name":"Trainings 2019 - Trinity - Practice - 
itrop","isPartOf":{"@id":"https:\/\/bioinfo.ird.fr\/#website"},"primaryImageOfPage":{"@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage"},"image":{"@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage"},"thumbnailUrl":"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png","datePublished":"2020-11-10T15:15:00+00:00","dateModified":"2022-04-06T12:50:08+00:00","breadcrumb":{"@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#breadcrumb"},"inLanguage":"fr-FR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/"]}]},{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#primaryimage","url":"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png","contentUrl":"https:\/\/southgreenplatform.github.io\/trainings\/\/images\/Ex50.png"},{"@type":"BreadcrumbList","@id":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/trainings-2019-trinity-practice\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/bioinfo.ird.fr\/index.php\/en\/front-page-2\/"},{"@type":"ListItem","position":2,"name":"Trainings &#8211; FR","item":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/"},{"@type":"ListItem","position":3,"name":"Trainings 2019 &#8211; Trinity","item":"https:\/\/bioinfo.ird.fr\/index.php\/trainings-fr\/trainings-2019-trinity\/"},{"@type":"ListItem","position":4,"name":"Trainings 2019 &#8211; Trinity &#8211; 
Practice"}]},{"@type":"WebSite","@id":"https:\/\/bioinfo.ird.fr\/#website","url":"https:\/\/bioinfo.ird.fr\/","name":"itrop","description":"","publisher":{"@id":"https:\/\/bioinfo.ird.fr\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/bioinfo.ird.fr\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"fr-FR"},{"@type":"Organization","@id":"https:\/\/bioinfo.ird.fr\/#organization","name":"i-Trop","url":"https:\/\/bioinfo.ird.fr\/","logo":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/bioinfo.ird.fr\/#\/schema\/logo\/image\/","url":"https:\/\/bioinfo.ird.fr\/wp-content\/uploads\/2021\/10\/i-tropTwt5.png","contentUrl":"https:\/\/bioinfo.ird.fr\/wp-content\/uploads\/2021\/10\/i-tropTwt5.png","width":1356,"height":1356,"caption":"i-Trop"},"image":{"@id":"https:\/\/bioinfo.ird.fr\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/ItropBioinfo"]}]}},"_links":{"self":[{"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/pages\/1071","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/comments?post=1071"}],"version-history":[{"count":2,"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/pages\/1071\/revisions"}],"predecessor-version":[{"id":1074,"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/pages\/1071\/revisions\/1074"}],"up":[{"embeddable":true,"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/pages\/1067"}],"wp:attachment":[{"href":"https:\/\/bioinfo.ird.fr\/index.php\/wp-json\/wp\/v2\/media?parent=10
71"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}