I have whole genome sequencing data from one individual, sequenced on 8 lanes of an Illumina HiSeq. I was wondering whether the SplitSamFile tool can be used to divide the reads by lane number instead of by subject (since I only have one subject). Also, where in the pipeline should I use this tool? Presumably before the RealignerTargetCreator step?
Thank you! Stephanie
If I have a bam file with three different read groups, and use SplitSamFile to split it like so:
java -Xmx2g -jar $GATKJAR -T SplitSamFile -I $INBAM -R $GENOME --outputRoot $PROJD/$IND/
Each of the output bam files has all three read groups. Is that the intended behavior? I would like each file to have only its own read group info in the header. Sorry for the bash variables in the command above; they make it readable at least.
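One possible workaround, offered only as a sketch and not as anything SplitSamFile does itself: filter the header text so that a single @RG line survives, then attach that header to the matching output file. The function name keep_one_rg and the read group ID lane1 below are hypothetical; the function only touches SAM header text on stdin and assumes the standard tab-separated @RG line with an ID: field.

```shell
# Hypothetical helper: read SAM header text on stdin and print it back,
# keeping every non-@RG line but only the @RG line whose ID matches $1.
keep_one_rg() {
    awk -F'\t' -v id="$1" '
        # Pass through all header lines that are not read group lines.
        $1 != "@RG" { print; next }
        # For @RG lines, print only when one of the fields is ID:<id>.
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "ID:" id) { print; next }
            }
        }
    '
}
```

Assuming samtools is installed, this could be used along the lines of "samtools view -H lane1.bam | keep_one_rg lane1 > lane1.header.sam" followed by "samtools reheader lane1.header.sam lane1.bam > lane1.fixed.bam", where the file names are placeholders for whatever SplitSamFile wrote to the output directory.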