This tool provides simple, powerful read-clipping capabilities to remove low-quality strings of bases, sections of reads, and reads containing user-provided sequences.
It allows the user to clip bases in reads with poor quality scores, that match particular sequences, or that were generated by particular machine cycles.
Input: any number of BAM files.
Output: a new BAM file containing all of the reads from the input BAMs, with the user-specified clipping operation applied to each read.
Number of examined reads 13
Number of clipped reads 13
Percent of clipped reads 100.00
Number of examined bases 988
Number of clipped bases 126
Percent of clipped bases 12.75
Number of quality-score clipped bases 126
Number of range clipped bases 0
Number of sequence clipped bases 0
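The two percentage rows follow from the counts next to them. As a quick arithmetic check, here is a minimal sketch; the helper name `percent` is hypothetical, and the two-decimal formatting is inferred from the sample output rather than taken from the ClipReads source:

```python
def percent(part: int, total: int) -> str:
    """Format part/total as a percentage with two decimal places."""
    if total == 0:
        return "0.00"  # assumption: report zero when nothing was examined
    return f"{100.0 * part / total:.2f}"


print(percent(13, 13))    # clipped reads / examined reads
print(percent(126, 988))  # clipped bases / examined bases
```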
314KGAAXX090507:1:19:1420:1123#0 16 chrM 3116 29 76M * * *
TAGGACCCGGGCCCCCCTCCCCAATCCTCCAACGCATATAGCGGCCGCGCCTTCCCCCGTAAATGATATCATCTCA
#################4?6/?2135;;;'1/=/<'B9;12;68?A79@,@==@9?=AAA3;A@B;A?B54;?ABA
If we are clipping reads with -QT 10 and -CR WRITE_NS, we get:
314KGAAXX090507:1:19:1420:1123#0 16 chrM 3116 29 76M * * *
NNNNNNNNNNNNNNNNNTCCCCAATCCTCCAACGCATATAGCGGCCGCGCCTTCCCCCGTAAATGATATCATCTCA
#################4?6/?2135;;;'1/=/<'B9;12;68?A79@,@==@9?=AAA3;A@B;A?B54;?ABA
Whereas with -CR WRITE_Q0S:
314KGAAXX090507:1:19:1420:1123#0 16 chrM 3116 29 76M * * *
TAGGACCCGGGCCCCCCTCCCCAATCCTCCAACGCATATAGCGGCCGCGCCTTCCCCCGTAAATGATATCATCTCA
!!!!!!!!!!!!!!!!!4?6/?2135;;;'1/=/<'B9;12;68?A79@,@==@9?=AAA3;A@B;A?B54;?ABA
Or -CR SOFTCLIP_BASES:
314KGAAXX090507:1:19:1420:1123#0 16 chrM 3133 29 17S59M * * *
TAGGACCCGGGCCCCCCTCCCCAATCCTCCAACGCATATAGCGGCCGCGCCTTCCCCCGTAAATGATATCATCTCA
#################4?6/?2135;;;'1/=/<'B9;12;68?A79@,@==@9?=AAA3;A@B;A?B54;?ABA
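The three records above differ only in how the same 17-base clip is written out. The Python sketch below mimics that effect for illustration; it is not the ClipReads implementation. The `Read` class and `clip_prefix` function are hypothetical names, and the sketch assumes a clip at the left edge of a read whose CIGAR is a single match block.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Read:
    pos: int    # 1-based leftmost aligned position
    cigar: str
    seq: str
    qual: str   # Phred+33 encoded qualities


def clip_prefix(read: Read, n: int, representation: str) -> Read:
    """Return a copy of `read` with its first n bases clipped.

    Simplifying assumptions: the CIGAR is a single match block (e.g. 76M)
    and 0 < n < len(read.seq).
    """
    if not re.fullmatch(r"\d+M", read.cigar):
        raise ValueError("this sketch only handles a single M block")
    if not 0 < n < len(read.seq):
        raise ValueError("n must leave at least one unclipped base")
    if representation == "WRITE_NS":
        # Overwrite the clipped bases with N; qualities are untouched.
        return replace(read, seq="N" * n + read.seq[n:])
    if representation == "WRITE_Q0S":
        # Set clipped qualities to Q0 ('!' in Phred+33); bases are untouched.
        return replace(read, qual="!" * n + read.qual[n:])
    if representation == "SOFTCLIP_BASES":
        # Keep bases and qualities, shift the start, mark n bases as soft clipped.
        return replace(read, pos=read.pos + n, cigar=f"{n}S{len(read.seq) - n}M")
    raise ValueError(f"unsupported representation: {representation}")


# The example read from this page.
read = Read(
    pos=3116,
    cigar="76M",
    seq="TAGGACCCGGGCCCCCCTCCCCAATCCTCCAACGCATATAGCGGCCGCGCCTTCCCCCGTAAATGATATCATCTCA",
    qual="#################4?6/?2135;;;'1/=/<'B9;12;68?A79@,@==@9?=AAA3;A@B;A?B54;?ABA",
)
for mode in ("WRITE_NS", "WRITE_Q0S", "SOFTCLIP_BASES"):
    clipped = clip_prefix(read, 17, mode)
    print(mode, clipped.pos, clipped.cigar, clipped.seq, clipped.qual, sep="\t")
```

Under SOFTCLIP_BASES the alignment start moves from 3116 to 3133 because the 17 soft-clipped bases no longer count toward the aligned portion of the read.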
-T ClipReads -I my.bam -I your.bam -o my_and_your.clipped.bam -R Homo_sapiens_assembly18.fasta \
-XF seqsToClip.fasta -X CCCCC -CT "1-5,11-15" -QT 10
| Name | Type | Default value | Summary |
|---|---|---|---|
| Required | | | |
| --out | StingSAMFileWriter | stdout | Write BAM output here |
| Optional | | | |
| --clipRepresentation | ClippingRepresentation | WRITE_NS | How should we actually clip the bases? |
| --clipSequence | String[] | NA | Remove sequences within reads matching this sequence |
| --clipSequencesFile | String | NA | Remove sequences within reads matching the sequences in this FASTA file |
| --cyclesToTrim | String | NA | String indicating machine cycles to clip from the reads |
| --outputStatistics | PrintStream | NA | Write output statistics to this file |
| --qTrimmingThreshold | int | -1 | If provided, the Q-score clipper will be applied |
How should we actually clip the bases? The different values for this argument determine how ClipReads applies clips to the reads. This can range from writing Ns over the clipped bases to hard clipping away the bases from the BAM.
The --clipRepresentation argument is an enumerated type (ClippingRepresentation). The values used in the examples above are WRITE_NS, WRITE_Q0S, and SOFTCLIP_BASES.
Remove sequences within reads matching this sequence. Clips bases in the reads that match the provided sequence (SEQ). This argument can be provided any number of times on the command line.
Remove sequences within reads matching the sequences in this FASTA file. Reads the sequences in the provided FASTA file and clips any bases that exactly match any of the sequences in the file.
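To make "exactly match" concrete, the sketch below locates every exact occurrence of a clip sequence within a read's bases. It is an illustration only: `find_exact_matches` is a hypothetical helper, and whether ClipReads also considers reverse complements or how it treats overlapping hits is not stated on this page.

```python
def find_exact_matches(read_bases: str, clip_seq: str) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) spans where clip_seq occurs.

    Overlapping occurrences are all reported.
    """
    if not clip_seq:
        raise ValueError("clip_seq must be non-empty")
    spans = []
    start = read_bases.find(clip_seq)
    while start != -1:
        spans.append((start, start + len(clip_seq)))
        # Advance by one base so overlapping matches are also found.
        start = read_bases.find(clip_seq, start + 1)
    return spans


print(find_exact_matches("ACCCCCCG", "CCCCC"))
```

A FASTA file of sequences would be handled the same way in this sketch, by calling the helper once per sequence.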
String indicating machine cycles to clip from the reads. Clips machine cycles from the read. Accepts a string of ranges of the form start1-end1,start2-end2, etc. For each start/end pair, removes bases in machine cycles from start to end, inclusive. These are 1-based values (positions). For example, 1-5,10-12 clips the first 5 bases, and then three bases at cycles 10, 11, and 12.
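The range syntax can be illustrated with a short parser. This is a sketch of the format as described above, not the ClipReads parser; `parse_cycle_ranges` and `cycles_to_clip` are hypothetical names, and rejecting ranges with start < 1 or end < start is an assumption.

```python
def parse_cycle_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse 'start1-end1,start2-end2' into a list of 1-based inclusive pairs."""
    ranges = []
    for part in spec.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start < 1 or end < start:
            raise ValueError(f"invalid cycle range: {part!r}")
        ranges.append((start, end))
    return ranges


def cycles_to_clip(spec: str) -> list[int]:
    """Expand the ranges into the sorted list of machine cycles they cover."""
    cycles = set()
    for start, end in parse_cycle_ranges(spec):
        cycles.update(range(start, end + 1))  # end is inclusive
    return sorted(cycles)


print(cycles_to_clip("1-5,10-12"))
```

The result lists machine cycles, not indices into the stored read; how cycles map to stored base positions (for example on reverse-strand reads) is outside this sketch.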
Write BAM output here. The output SAM/BAM file will be written here.
Write output statistics to this file. If provided, ClipReads will write summary statistics about the clipping operations applied to the reads to this file.
If provided, the Q-score clipper will be applied. If a value greater than 0 is provided, the quality-score-based read clipper is applied to the reads using this quality score threshold.
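The sketch below shows one way a threshold can select bases to clip. It is a deliberately naive stand-in: it scans from the left edge of the stored read and stops at the first base at or above the threshold. The scan direction, the strict `<` comparison, and the function names are all assumptions, since this page does not describe the algorithm ClipReads uses. Qualities are assumed to be Phred+33 encoded, so `#` decodes to Q2.

```python
def phred33_to_scores(qual: str) -> list[int]:
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in qual]


def low_quality_prefix_length(qual: str, threshold: int) -> int:
    """Count leading bases whose quality is below `threshold`.

    Returns 0 when threshold <= 0, following the documented rule that
    only values greater than 0 enable the quality clipper.
    """
    if threshold <= 0:
        return 0
    count = 0
    for score in phred33_to_scores(qual):
        if score >= threshold:
            break
        count += 1
    return count


print(low_quality_prefix_length("##5I", 10))
```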
GATK version 2.3-9-ge5ebf34 built at 2013/01/11 22:47:55.