Hello,
When I run CountCovariates I get the following error message:
java.lang.ArrayIndexOutOfBoundsException: 0
I think this has to do with the BAM output of an upstream stage of my pipeline, because when I run CalculateHsMetrics with lenient validation stringency I get hundreds of errors like the following:
Ignoring SAM validation error: ERROR: Record 542418, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542419, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542420, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542421, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542422, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542423, Read name (null), Zero-length read without CS or CQ tag
When I examine some of these lines in the bam file I get the following...
samtools view 19542Js.bam | head -542420 | tail -5
(null) 73 11 67353661 25 0M = 67353661 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 73 11 67353661 25 0M = 67353661 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 97 11 67353661 25 0M 9 98215962 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:25 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 99 11 67353661 17 0M = 67391488 37850 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:17 AM:i:17 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 97 11 67353661 25 0M 2 47378438 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:25 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
Is there a problem with these reads? It looks like the reads aren't present. Is that causing the out of bounds error? How can I fix the bam file?
Any help would be greatly appreciated.
-Rob
Here is the rest of my output...
INFO 2012-12-04 14:29:58 SinglePassSamProgram Processed 1,000,000 records.
INFO 2012-12-04 14:30:09 ProcessExecutor null device
INFO 2012-12-04 14:30:09 ProcessExecutor 1
INFO 2012-12-04 14:31:20 ProcessExecutor null device
INFO 2012-12-04 14:31:20 ProcessExecutor 1
INFO 2012-12-04 14:31:20 ProcessExecutor null device
INFO 2012-12-04 14:31:20 ProcessExecutor 1
[Tue Dec 04 14:31:20 EST 2012] net.sf.picard.analysis.CollectMultipleMetrics done. Elapsed time: 2.26 minutes.
Runtime.totalMemory()=2176253952
Run the recalibration
INFO 14:31:28,081 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,084 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.5-32-g2761da9, Compiled 2012/04/26 15:31:17
INFO 14:31:28,084 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:31:28,084 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 14:31:28,084 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 14:31:28,085 HelpFormatter - Program Args: -T CountCovariates -l INFO -U ALLOW_UNSET_BAM_SORT_ORDER --default_platform illumina -R /proj/re2sqs/re2sq00/Resources/Bundle/human_g1k_v37.fasta --knownSites /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf -I 19542Js.bam --standard_covs -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile 19542JCovars.csv
INFO 14:31:28,085 HelpFormatter - Date/Time: 2012/12/04 14:31:28
INFO 14:31:28,085 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,086 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,142 RodBindingArgumentTypeDescriptor - Dynamically determined type of /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf to be VCF
INFO 14:31:28,155 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:31:28,398 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:31:28,433 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 14:31:28,459 RMDTrackBuilder - Loading Tribble index from disk for file /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf
INFO 14:31:29,552 CountCovariatesWalker - The covariates being used here:
INFO 14:31:29,553 CountCovariatesWalker - ReadGroupCovariate
INFO 14:31:29,553 CountCovariatesWalker - QualityScoreCovariate
INFO 14:31:29,553 CountCovariatesWalker - CycleCovariate
INFO 14:31:29,553 CountCovariatesWalker - DinucCovariate
INFO 14:31:30,029 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 14:31:30,029 TraversalEngine - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 14:32:00,000 TraversalEngine - 3:20577904 2.34e+05 30.4 s 2.2 m 16.5% 3.1 m 2.6 m
INFO 14:32:30,211 TraversalEngine - 5:113081909 4.36e+05 60.7 s 2.3 m 32.1% 3.2 m 2.1 m
INFO 14:33:00,249 TraversalEngine - 7:5852794 7.07e+05 90.7 s 2.1 m 40.0% 3.8 m 2.3 m
INFO 14:33:30,337 TraversalEngine - 9:78429646 9.62e+05 2.0 m 2.1 m 52.1% 3.9 m 110.8 s
INFO 14:34:07,411 GATKRunReport - Uploaded run statistics report to AWS S3
java.lang.ArrayIndexOutOfBoundsException: 0
    at org.broadinstitute.sting.gatk.walkers.recalibration.DinucCovariate.getValues(DinucCovariate.java:82)
    at org.broadinstitute.sting.gatk.walkers.recalibration.RecalDataManager.computeCovariates(RecalDataManager.java:615)
    at org.broadinstitute.sting.gatk.walkers.recalibration.CountCovariatesWalker.map(CountCovariatesWalker.java:381)
    at org.broadinstitute.sting.gatk.walkers.recalibration.CountCovariatesWalker.map(CountCovariatesWalker.java:134)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:78)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:63)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:246)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:128)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:92)
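For what it's worth, an ArrayIndexOutOfBoundsException at index 0 means something asked for the first element of an empty array, which would fit DinucCovariate being handed a read with no bases, like the 0M records with * for the sequence shown earlier. That is an inference from the trace, not something checked against the GATK source. A minimal Python sketch of how such records could be spotted and dropped from SAM text follows; the function names are made up for illustration, and it assumes properly tab-separated SAM lines.

```python
def is_zero_length(sam_line: str) -> bool:
    """Return True if a SAM alignment line carries no bases."""
    fields = sam_line.rstrip("\n").split("\t")
    # SAM column 10 (index 9) is SEQ; '*' means no sequence is stored.
    # A line with fewer than the 11 mandatory columns is treated as empty too.
    return len(fields) < 11 or fields[9] in ("*", "")


def filter_sam(lines):
    """Yield header lines unchanged, plus alignment lines that have bases."""
    for line in lines:
        if line.startswith("@") or not is_zero_length(line):
            yield line
```

To try it on the real file, filter_sam could be fed sys.stdin and written to sys.stdout, with samtools view -h supplying the SAM text. Note that dropping one read of a pair leaves its mate's flags pointing at a record that no longer exists, so finding out why the upstream stage emits empty reads is probably the better fix.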
I'm trying to run CountCovariates and am getting the following error:
"java.lang.IllegalStateException: This method hasn't been implemented yet for GAIIx"
I used the following program arguments:
Program Args: -T CountCovariates -R /home/adlab/ngs_data/databases/human_g1k_v37.fasta -I D5_b37_aligned_sorted_realn_DupRm.bam -knownSites:dbsnp,VCF /home/adlab/ngs_data/databases/dbsnp_132.b37.vcf -recalFile D5_b37_aligned_sorted_realn_DupRm_countcovariates.csv -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -standard
While troubleshooting, I found that I had added the read group in bwa sampe with the platform set to "GAIIx", as shown below, and that this seems to be the root of the error.
"@RG\tID:Clone-D5\tPL:GAIIx\tLB:LIB-RDT\tSM:UNKNOWN\I:200"
What exactly is wrong with my argument list or read group syntax, and what could be causing the error? Kindly help.
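The message suggests the tool does not recognise GAIIx as a platform name. GAIIx is an Illumina instrument model, whereas the PL field in the SAM specification is meant to hold a technology name such as ILLUMINA. A minimal Python sketch of that check follows. The set of names is taken from the SAM specification (v1.4 era) and is only an assumption about what this GATK version accepts; the function names are hypothetical, and the trailing "\I:200" fragment of the read group is left out because it is unclear what it was meant to be.

```python
# Platform names listed for the PL field in the SAM specification (v1.4 era).
# Assumption: the tool in use accepts these; its own list may differ.
SAM_PLATFORMS = {
    "CAPILLARY", "LS454", "ILLUMINA", "SOLID",
    "HELICOS", "IONTORRENT", "PACBIO",
}


def parse_read_group(header_line: str) -> dict:
    """Split a tab-separated @RG header line into a {tag: value} dict."""
    fields = header_line.rstrip("\n").split("\t")
    if fields[0] != "@RG":
        raise ValueError("not an @RG header line")
    # Each remaining field looks like TAG:value; split on the first colon only.
    return dict(f.split(":", 1) for f in fields[1:] if ":" in f)


def platform_is_recognised(header_line: str) -> bool:
    """Return True if the PL tag matches one of the SAM_PLATFORMS names."""
    platform = parse_read_group(header_line).get("PL", "")
    return platform.upper() in SAM_PLATFORMS


# The read group from the post (real tabs here, not the escaped "\t" given to bwa).
print(platform_is_recognised("@RG\tID:Clone-D5\tPL:GAIIx\tLB:LIB-RDT\tSM:UNKNOWN"))
print(platform_is_recognised("@RG\tID:Clone-D5\tPL:ILLUMINA\tLB:LIB-RDT\tSM:UNKNOWN"))
```

The first call prints False and the second prints True, so under these assumptions changing PL:GAIIx to PL:ILLUMINA in the read group would be the thing to try.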