Scripture walkthrough example

 0) Choose whether to align reads or use pre-aligned reads

 1) Download reads from the Gene Expression Omnibus (GEO) and decompress them

>> gunzip SRR039999_1.fastq.gz
>> gunzip SRR039999_2.fastq.gz
>> gunzip SRR040000_1.fastq.gz
>> gunzip SRR040000_2.fastq.gz
>> gunzip SRR040001_1.fastq.gz
>> gunzip SRR040001_2.fastq.gz

 

 2) Download the Bowtie reference index for mouse (mm9) and unzip it

>> unzip mm9.ebwt.zip

 

 3) Perform spliced alignment using TopHat

>> tophat -o tophat_out_SRR039999_1 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR039999_1.fastq
>> tophat -o tophat_out_SRR039999_2 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR039999_2.fastq
>> tophat -o tophat_out_SRR040000_1 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR040000_1.fastq
>> tophat -o tophat_out_SRR040000_2 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR040000_2.fastq
>> tophat -o tophat_out_SRR040001_1 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR040001_1.fastq
>> tophat -o tophat_out_SRR040001_2 [PATH_TO_BOWTIE_REFERENCE_INDEX]/mm9 SRR040001_2.fastq
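The six TopHat invocations above differ only in the run accession and the mate number, so they can be generated with a loop. The following is a minimal sketch, not part of the original walkthrough: BOWTIE_INDEX_DIR is an assumed variable standing in for [PATH_TO_BOWTIE_REFERENCE_INDEX], and the loop only prints the commands so they can be reviewed before running.

```shell
# Assumed placeholder for the Bowtie index directory; replace with your own path.
BOWTIE_INDEX_DIR="${BOWTIE_INDEX_DIR:-/path/to/bowtie_index}"

# Build one TopHat command per read file (3 runs x 2 mates = 6 commands).
tophat_cmds=$(
  for run in SRR039999 SRR040000 SRR040001; do
    for mate in 1 2; do
      echo "tophat -o tophat_out_${run}_${mate} ${BOWTIE_INDEX_DIR}/mm9 ${run}_${mate}.fastq"
    done
  done
)

# Print the commands for review.
echo "$tophat_cmds"
```

Once the printed commands look right, they can be executed by piping them to a shell, for example `echo "$tophat_cmds" | sh`.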

 4) Make paired-end alignment files using Scripture

>> sed '1,2d' tophat_out_SRR039999_1/accepted_hits.sam | sort > tophat_out_SRR039999_1/accepted_hits.sorted.sam
>> sed '1,2d' tophat_out_SRR039999_2/accepted_hits.sam | sort > tophat_out_SRR039999_2/accepted_hits.sorted.sam
>> sed '1,2d' tophat_out_SRR040000_1/accepted_hits.sam | sort > tophat_out_SRR040000_1/accepted_hits.sorted.sam
>> sed '1,2d' tophat_out_SRR040000_2/accepted_hits.sam | sort > tophat_out_SRR040000_2/accepted_hits.sorted.sam
>> sed '1,2d' tophat_out_SRR040001_1/accepted_hits.sam | sort > tophat_out_SRR040001_1/accepted_hits.sorted.sam
>> sed '1,2d' tophat_out_SRR040001_2/accepted_hits.sam | sort > tophat_out_SRR040001_2/accepted_hits.sorted.sam 

>> java -Xmx4000m -jar scripture.jar -task makePairedFile -pair1 tophat_out_SRR039999_1/accepted_hits.sorted.sam -pair2 tophat_out_SRR039999_2/accepted_hits.sorted.sam -out SRR039999.paired.sam -sorted
>> java -Xmx4000m -jar scripture.jar -task makePairedFile -pair1 tophat_out_SRR040000_1/accepted_hits.sorted.sam -pair2 tophat_out_SRR040000_2/accepted_hits.sorted.sam -out SRR040000.paired.sam -sorted
>> java -Xmx4000m -jar scripture.jar -task makePairedFile -pair1 tophat_out_SRR040001_1/accepted_hits.sorted.sam -pair2 tophat_out_SRR040001_2/accepted_hits.sorted.sam -out SRR040001.paired.sam -sorted
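The `sed '1,2d' ... | sort` idiom in this step deletes the first two lines of each SAM file and sorts the remaining records line by line, which orders them by read name because the read name is the first column. The sketch below illustrates this on a made-up three-record file (toy.sam and its contents are invented for illustration). It also shows an alternative, `grep -v '^@'`, which removes every header line instead of assuming there are exactly two; that variant is a suggestion and not part of the original walkthrough.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# A made-up SAM file: two header lines followed by three unsorted records.
printf '@HD\tVN:1.0\tSO:unsorted\n' >  "$workdir/toy.sam"
printf '@PG\tID:TopHat\n'           >> "$workdir/toy.sam"
printf 'read3\t0\tchr19\t300\t255\t4M\t*\t0\t0\tACGT\tIIII\n'  >> "$workdir/toy.sam"
printf 'read1\t16\tchr19\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$workdir/toy.sam"
printf 'read2\t0\tchr19\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n'  >> "$workdir/toy.sam"

# Same idiom as the walkthrough: delete lines 1-2, then sort (read name comes first).
sed '1,2d' "$workdir/toy.sam" | sort > "$workdir/toy.sorted.sam"

# Alternative: drop every line starting with '@', however many headers there are.
grep -v '^@' "$workdir/toy.sam" | sort > "$workdir/toy.sorted.alt.sam"

# Show the read names in their new order.
cut -f1 "$workdir/toy.sorted.sam"
```

If a TopHat output file has more or fewer than two header lines, the `sed '1,2d'` form would either leave a header in place or delete an alignment record, so it is worth checking the top of accepted_hits.sam before relying on it.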

 5) Combine TopHat alignments, then sort and index

>> cat tophat_out_SRR039999_1/accepted_hits.sorted.sam tophat_out_SRR039999_2/accepted_hits.sorted.sam tophat_out_SRR040000_1/accepted_hits.sorted.sam tophat_out_SRR040000_2/accepted_hits.sorted.sam tophat_out_SRR040001_1/accepted_hits.sorted.sam tophat_out_SRR040001_2/accepted_hits.sorted.sam > all_alignments.sam

>> igvtools sort all_alignments.sam all_alignments.sorted.sam

 

>> igvtools index all_alignments.sorted.sam

 

 6) Combine paired-end alignments, then sort and index

>> cat SRR039999.paired.sam SRR040000.paired.sam SRR040001.paired.sam > all_alignments.paired.sam

>> igvtools sort all_alignments.paired.sam all_alignments.paired.sorted.sam

>> igvtools index all_alignments.paired.sorted.sam

 

 7) Run Scripture

>> java -jar scripture.jar -alignment all_alignments.sorted.sam -out chr19.scriptureESTest.segments -sizeFile mm9.sizes -chr chr19 -chrSequence chr19.fa -pairedEnd all_alignments.paired.sorted.sam

Scripture prints progress messages similar to the following:

Using Version VPaperR2

Computing weights..... upweighting? false weight: 1.0

Has pairs: true

Has upweighting turned on: false

Computing weights..... upweighting? false weight: 1.0

AlignmentDataModel loaded, initializing model stats

model stats loaded, initializing model

Built the model: 0.001 free memory: 92240704

Loaded chromosome Sequence

Segmenting accross graph

Going to get read iterator to make graph with counts

Got read iterator

Made it through all reads

Made first graph

Got extended piecesMade second graph

Done making spliced graph

Got Simple paths...

Estimated distribution

Made path trees

Done adding paired ends (if available)

Done getting paths. Total: 99832

Done with local segmentation

Done setting local rate in graph

Done scoring paths.

Note: This step takes about 15 minutes to complete on a standard machine.

 

 8) Visualizing the transcript graphs

The extractDot task takes the following parameters:

-in: The path to the Dot file from a previous Scripture run

-chr: The chromosome of the region to extract

-start: The start coordinate of the region to extract

-end: The end coordinate of the region to extract

-out: The output file for the new extracted dot file

>> java -jar scripture.jar -task extractDot -in chr19.scriptureESTest.segments.dot -chr chr19 -start 32165265 -end 32455031 -out sgms1.dot
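Assuming the extracted file is in Graphviz Dot format and that Graphviz is installed, one way to view sgms1.dot is to render it to an image with the `dot` program. The helper below is a sketch only: render_dot is a hypothetical name, not part of Scripture, and SVG is chosen here simply because it is a core Graphviz output format.

```shell
# Render a Dot file to SVG if Graphviz's `dot` is on the PATH.
# render_dot is a hypothetical helper name, not part of Scripture.
render_dot() {
  in_file=$1
  out_file=${in_file%.dot}.svg
  if ! command -v dot >/dev/null 2>&1; then
    echo "Graphviz 'dot' not found; install Graphviz to render $in_file" >&2
    return 1
  fi
  # -Tsvg selects SVG output; -o names the output file.
  dot -Tsvg "$in_file" -o "$out_file" && echo "Wrote $out_file"
}

# Usage (after the extractDot command has produced sgms1.dot):
#   render_dot sgms1.dot
```

The resulting sgms1.svg can be opened in a web browser. Large regions may produce graphs that are slow to lay out, which is one reason to extract a small coordinate range first.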