I am calling SNPs on an organism without a reference genome or database of known polymorphisms, so I'm trying to follow the advice posted here (and in the BaseRecalibrator documentation).
I've successfully called SNPs on the un-recalibrated .bam file, then used those SNPs to recalibrate, then called SNPs on the recalibrated .bam file. As expected, I got significantly fewer (and presumably more accurate) results.
I then used the new, reduced set of SNPs to recalibrate again. When I attempted to call SNPs on this "Round Two" recalibrated .bam file, I got the following error:
SAM/BAM file recalibrated.2.bam is malformed: Program record with group id GATK PrintReads already exists in SAMFileHeader!
I attempted to use PicardTools ValidateSamFile and CleanSam but received the same message (as an IllegalArgumentException). I would definitely consider myself a novice in the field. Any advice you can give will be greatly appreciated.