Tagged with #lanes
0 documentation articles | 0 announcements | 2 forum discussions

No posts found with the requested search criteria.

Created 2014-03-18 14:12:40 | Updated 2014-03-18 14:13:47 | Tags: pipeline markduplicates lanes gatk-best-practices
Comments (2)

Referring to broadinstitute.org/gatk/guide/article?id=3060, is it necessary to remove duplicates twice, once per lane and then again per sample?

Is it not enough to mark the duplicates just once, in the final BAM file with all the lanes merged, which should take care of both optical and PCR duplicates (I am using Picard MarkDuplicates.jar)? Specifically, with respect to the link above, what is wrong with generating:

  • sample1_lane1.realn.recal.bam
  • sample1_lane2.realn.recal.bam
  • sample2_lane1.realn.recal.bam
  • sample2_lane2.realn.recal.bam

Then, merging them to get

  • sample1.merged.bam
  • sample2.merged.bam

and finally, running de-duplication ("de-dupping") only on the merged BAM files:

  • sample1.merged.dedup.realn.bam
  • sample2.merged.dedup.realn.bam
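
The merge-then-dedup order proposed above can be sketched as a pair of Picard command lines per sample. This is only an illustration of the question being asked, not a recommended pipeline: the helper names `group_by_sample` and `merge_then_dedup_commands`, the intermediate file name `<sample>.merged.dedup.bam`, and the metrics file name are assumptions made for the example, and the jar paths assume the older per-tool Picard jars sit in the working directory. The sketch only builds the commands and stops before any realignment step.

```python
from collections import defaultdict


def group_by_sample(lane_bams):
    """Group per-lane BAM names like 'sample1_lane1.realn.recal.bam' by sample.

    Assumes the sample name is everything before the first underscore.
    """
    groups = defaultdict(list)
    for bam in lane_bams:
        sample = bam.split("_", 1)[0]
        groups[sample].append(bam)
    return dict(groups)


def merge_then_dedup_commands(sample, lane_bams):
    """Build (but do not run) the merge and mark-duplicates commands."""
    merged = f"{sample}.merged.bam"
    dedup = f"{sample}.merged.dedup.bam"      # hypothetical output name
    metrics = f"{sample}.dup_metrics.txt"     # hypothetical metrics file name

    # Merge all lanes of one sample into a single BAM.
    merge_cmd = (
        ["java", "-jar", "MergeSamFiles.jar"]
        + [f"INPUT={bam}" for bam in lane_bams]
        + [f"OUTPUT={merged}"]
    )
    # Mark duplicates once, on the merged per-sample BAM.
    dedup_cmd = [
        "java", "-jar", "MarkDuplicates.jar",
        f"INPUT={merged}",
        f"OUTPUT={dedup}",
        f"METRICS_FILE={metrics}",
    ]
    return [merge_cmd, dedup_cmd]


if __name__ == "__main__":
    lanes = [
        "sample1_lane1.realn.recal.bam",
        "sample1_lane2.realn.recal.bam",
        "sample2_lane1.realn.recal.bam",
        "sample2_lane2.realn.recal.bam",
    ]
    for sample, bams in group_by_sample(lanes).items():
        for cmd in merge_then_dedup_commands(sample, bams):
            print(" ".join(cmd))
```

The commands are printed rather than executed so they can be inspected first; passing each list to `subprocess.run(cmd, check=True)` would run them.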

Created 2013-01-09 11:16:39 | Updated 2013-01-09 11:17:10 | Tags: exome pooling lanes
Comments (2)


I have exome data run on two lanes per library. Is it better to combine the lanes into one, or to run each lane independently through GATK? What are the pros and cons? Many thanks.