How can I fix an ArrayIndexOutOfBoundsException? How can I fix the bam file?
Posted in Ask the team | Last updated on



Hello,

When I run CountCovariates, I get the following error message:

java.lang.ArrayIndexOutOfBoundsException: 0

I think this is caused by the BAM output of an upstream stage of my pipeline, because when I run CalculateHsMetrics with lenient validation stringency I get hundreds of errors like the following:


Ignoring SAM validation error: ERROR: Record 542418, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542419, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542420, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542421, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542422, Read name (null), Zero-length read without CS or CQ tag
Ignoring SAM validation error: ERROR: Record 542423, Read name (null), Zero-length read without CS or CQ tag
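To check how many records are affected, and whether every message is of the same kind, the lenient-mode output can be tallied by message text. This is a minimal sketch that assumes the message layout quoted above; `tally_errors` is a hypothetical helper name, not part of Picard.

```python
import re
from collections import Counter

# Layout assumed from the messages quoted above:
#   "... ERROR: Record <n>, Read name <name>, <message>"
PATTERN = re.compile(r"ERROR: Record (\d+), Read name (.*?), (.+)$")


def tally_errors(log_lines):
    """Count validation messages by their message text (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line.rstrip("\n"))
        if match:
            counts[match.group(3).strip()] += 1
    return counts
```

Calling `tally_errors(open(path))` on a saved log (the path is whatever file the output was redirected to) returns a `Counter` keyed by message, so a single dominant entry would suggest one upstream cause.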


When I examine some of these records in the BAM file, I get the following:

samtools view 19542Js.bam | head -542420 | tail -5
(null) 73 11 67353661 25 0M = 67353661 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 73 11 67353661 25 0M = 67353661 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 97 11 67353661 25 0M 9 98215962 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:25 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 99 11 67353661 17 0M = 67391488 37850 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:17 AM:i:17 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0
(null) 97 11 67353661 25 0M 2 47378438 0 * * RG:Z:HaloPilot-19542J XT:A:U NM:i:0 SM:i:25 AM:i:25 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:0

Is there a problem with these reads? It looks like the read sequences are missing (both the sequence and quality fields are "*" and the CIGAR is 0M). Is that causing the out-of-bounds error? How can I fix the BAM file?
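One possible workaround, offered as a sketch and not a confirmed fix, is to drop the records that have no sequence before running recalibration. The filter below reads SAM text and assumes that a zero-length read is one whose SEQ field (the tenth tab-separated column) is `*` or empty. The function names and the script name `drop_zero_length.py` are hypothetical.

```python
import sys


def is_zero_length(sam_line: str) -> bool:
    """True for an alignment record whose SEQ field is '*' or empty."""
    fields = sam_line.rstrip("\n").split("\t")
    # SAM alignment lines have 11 mandatory fields; SEQ is the 10th.
    return len(fields) >= 11 and fields[9] in ("*", "")


def drop_zero_length(lines):
    """Yield header lines unchanged plus alignment lines that carry a sequence."""
    for line in lines:
        if line.startswith("@") or not is_zero_length(line):
            yield line


if __name__ == "__main__":
    # Stream SAM text from stdin to stdout, skipping zero-length reads.
    sys.stdout.writelines(drop_zero_length(sys.stdin))
```

It could be used in a pipe such as `samtools view -h 19542Js.bam | python drop_zero_length.py | samtools view -b -o filtered.bam -`, where `filtered.bam` is a placeholder output name and older samtools releases may also need `-S` to read SAM input. Removing a read leaves its mate without a partner, and the filter does not address whichever upstream stage emitted the empty reads, so that stage is still worth investigating.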

Any help would be greatly appreciated.

-Rob

Here is the rest of my output...


INFO 2012-12-04 14:29:58 SinglePassSamProgram Processed 1,000,000 records.
INFO 2012-12-04 14:30:09 ProcessExecutor null device
INFO 2012-12-04 14:30:09 ProcessExecutor 1
INFO 2012-12-04 14:31:20 ProcessExecutor null device
INFO 2012-12-04 14:31:20 ProcessExecutor 1
INFO 2012-12-04 14:31:20 ProcessExecutor null device
INFO 2012-12-04 14:31:20 ProcessExecutor 1
[Tue Dec 04 14:31:20 EST 2012] net.sf.picard.analysis.CollectMultipleMetrics done. Elapsed time: 2.26 minutes.
Runtime.totalMemory()=2176253952
Run the recalibration
INFO 14:31:28,081 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,084 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.5-32-g2761da9, Compiled 2012/04/26 15:31:17
INFO 14:31:28,084 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:31:28,084 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 14:31:28,084 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 14:31:28,085 HelpFormatter - Program Args: -T CountCovariates -l INFO -U ALLOW_UNSET_BAM_SORT_ORDER --default_platform illumina -R /proj/re2sqs/re2sq00/Resources/Bundle/human_g1k_v37.fasta --knownSites /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf -I 19542Js.bam --standard_covs -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile 19542JCovars.csv
INFO 14:31:28,085 HelpFormatter - Date/Time: 2012/12/04 14:31:28
INFO 14:31:28,085 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,086 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:31:28,142 RodBindingArgumentTypeDescriptor - Dynamically determined type of /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf to be VCF
INFO 14:31:28,155 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:31:28,398 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:31:28,433 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 14:31:28,459 RMDTrackBuilder - Loading Tribble index from disk for file /proj/re2sqs/re2sq00/Resources/Bundle/dbsnp_132.b37.vcf
INFO 14:31:29,552 CountCovariatesWalker - The covariates being used here:
INFO 14:31:29,553 CountCovariatesWalker - ReadGroupCovariate
INFO 14:31:29,553 CountCovariatesWalker - QualityScoreCovariate
INFO 14:31:29,553 CountCovariatesWalker - CycleCovariate
INFO 14:31:29,553 CountCovariatesWalker - DinucCovariate
INFO 14:31:30,029 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 14:31:30,029 TraversalEngine - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 14:32:00,000 TraversalEngine - 3:20577904 2.34e+05 30.4 s 2.2 m 16.5% 3.1 m 2.6 m
INFO 14:32:30,211 TraversalEngine - 5:113081909 4.36e+05 60.7 s 2.3 m 32.1% 3.2 m 2.1 m
INFO 14:33:00,249 TraversalEngine - 7:5852794 7.07e+05 90.7 s 2.1 m 40.0% 3.8 m 2.3 m
INFO 14:33:30,337 TraversalEngine - 9:78429646 9.62e+05 2.0 m 2.1 m 52.1% 3.9 m 110.8 s
INFO 14:34:07,411 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: 0
    at org.broadinstitute.sting.gatk.walkers.recalibration.DinucCovariate.getValues(DinucCovariate.java:82)
    at org.broadinstitute.sting.gatk.walkers.recalibration.RecalDataManager.computeCovariates(RecalDataManager.java:615)
    at org.broadinstitute.sting.gatk.walkers.recalibration.CountCovariatesWalker.map(CountCovariatesWalker.java:381)
    at org.broadinstitute.sting.gatk.walkers.recalibration.CountCovariatesWalker.map(CountCovariatesWalker.java:134)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:78)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:63)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:246)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:128)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:92)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 1.5-32-g2761da9):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
ERROR
ERROR MESSAGE: 0
ERROR ------------------------------------------------------------------------------------------
