Greetings GATK users, I'm trying to run QualifyMissingIntervals in GATK, and want to verify the output of my command. I am using:
java -jar GenomeAnalysisTK.jar -T QualifyMissingIntervals -o outputtest.grp -R ref.fasta -I input.bam -L list.interval_list --targetsfile targets.intervals
My interval list looks like this:
@HD VN:1.4 SO:coordinate
@SQ SN:1 LN:4000000
chromosome 1 4000000 + target1
This is a subset of my targets file, which was output by RealignerTargetCreator:
chromosome:889608-889611
chromosome:926218-926667
... 24 lines
My output gives me data on only a single interval:
INTERVAL GC BQ MQ DP POS_IN_TARGET TARGET_SIZE BAITED MISSING_SIZE INTERPRETATION
chromosome:1-4411709 0.65615955 31.01751693 42.77457476 421.83747409 -3522098 4 true 4000000 UNKNOWN
I get the feeling that one of my files is formatted improperly, but I can't figure out which one. I have tried several variations of the -L and --targetsfile inputs, based on both the documentation and previous forum posts, but to no avail; usually the command does not run at all.
I would very much appreciate any help that might be provided!
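One thing that may be worth ruling out in the interval list above is that the @SQ header declares a sequence named 1, while the interval line (and the output) use the name chromosome. The sketch below is a minimal consistency check, not part of GATK: check_interval_list is a hypothetical helper, and it assumes a tab-separated, Picard-style interval list (header lines starting with @, then contig, start, end, strand, name).

```python
def check_interval_list(text):
    """Return a list of problems found in Picard-style interval_list text.

    Assumes tab-separated fields; this is a sanity check, not a full parser.
    """
    contigs = {}   # contig name -> length, taken from @SQ header lines
    problems = []
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
                if "SN" in tags and "LN" in tags:
                    contigs[tags["SN"]] = int(tags["LN"])
            continue
        if len(fields) != 5:
            problems.append(
                f"line {n}: expected 5 tab-separated fields, got {len(fields)}")
            continue
        contig, start, end, strand, name = fields
        if contig not in contigs:
            problems.append(
                f"line {n}: contig '{contig}' is not declared in any @SQ header line")
            continue
        if not (1 <= int(start) <= int(end) <= contigs[contig]):
            problems.append(
                f"line {n}: {start}-{end} falls outside 1-{contigs[contig]}")
    return problems


# The interval list from the post, with tabs assumed between fields.
sample = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:4000000\n"
    "chromosome\t1\t4000000\t+\ttarget1\n"
)
for problem in check_interval_list(sample):
    print(problem)
```

On the sample above this reports that contig 'chromosome' on line 3 is not declared in any @SQ header line. Whether that mismatch explains the single-interval output is only a guess; the check just makes the inconsistency easy to spot.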
Hey guys -
We've started to play with QualifyMissingIntervals, and pretty quickly ran into a ReviewedStingException ("BED files must be parsed through Tribble; parsing them as intervals through the GATK engine is no longer supported") that confused us for a bit. We tracked it down to our use of BED files with the baits and targets arguments, and the use of IntervalUtils.intervalFileToList in the QMI initializer. We'll modify our reference files in the meantime, but could we request that those arguments use the interval parsing code used by -L? I think it's a relatively minor change, but I just don't have time to play with code right now (much less test it…)
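As a stopgap for the "modify our reference files" step, one option is to convert each BED file into the contig:start-stop form shown in the targets file earlier in the thread. The sketch below is an illustration only: bed_to_gatk_intervals is a hypothetical helper, not GATK code, and it assumes standard BED coordinates (0-based start, exclusive end), which are shifted to 1-based inclusive positions.

```python
def bed_to_gatk_intervals(bed_text):
    """Convert BED lines to 'contig:start-stop' interval strings.

    BED starts are 0-based and ends are exclusive, so only the start
    is shifted by one to get 1-based inclusive coordinates.
    """
    intervals = []
    for line in bed_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and BED track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return intervals


# Hypothetical two-line BED input, for illustration only.
example_bed = "track name=baits\nchr1\t99\t200\tbaitA\nchr1\t499\t600\tbaitB\n"
for interval in bed_to_gatk_intervals(example_bed):
    print(interval)
```

For the made-up input above this prints chr1:100-200 and chr1:500-600. Whether the baits and targets arguments accept the resulting file without further changes is an assumption that would need to be tested.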