Dear all, I am using Queue and DataProcessingPipeline.scala (from https://github.com/broadgsa/gatk/blob/master/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala) to process my BAM file. The input is a sample-level BAM file that has been aligned with BWA, processed with Samtools sampe, and merged with Picard, with duplicates removed. The output BAM file (~200 GB/sample) is much larger than the input BAM file (~80 GB/sample). What information was added to the BAM file? Thanks a lot. My Queue version is Queue-2.1-10. My script:
java \
-Xmx4g \
-Djava.io.tmpdir=../tmp/ \
-jar ./Queue-2.1-10/Queue.jar \
-S ./DataProcessingPipeline.scala \
-i input.bam \
-R /db/human_g1k_v37.fasta \
-D /db/dbSnp_b137.vcf \
-run