Tagged with #empty
0 documentation articles | 0 announcements | 3 forum discussions


No posts found with the requested search criteria.

Created 2015-07-15 10:04:10 | Updated 2015-07-15 10:05:21 | Tags: realignertargetcreator empty out-interval genomeanalysistk-jar
Comments (2)

Hi,

I am running GATK for mouse variation analysis and am trying to do indel realignment using RealignerTargetCreator. The program runs successfully, but the out.intervals file contains no data.

**********command **********

java -Xmx5g -Xms5g -Djava.io.tmpdir=`pwd`/tmp -jar /share/apps/gatk/src/GenomeAnalysisTK.jar \
 -nt 36\
 -T RealignerTargetCreator \
  -R mm_ref_GRCm38.p2_Genome_renamed_reordered.fa \
  -o out.intervals \
   -known:myvcf,VCF /home/cparsania/Database/mm_ref_GRCm38.p2/mgp.v5.merged.snps_all.dbSNP142_renamed_sorted.vcf 

**********Output Log**********

INFO  16:07:22,948 HelpFormatter - Executing as cparsania@compute-2-3.local on Linux 2.6.18-308.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_25-b17. 
INFO  16:07:22,948 HelpFormatter - Date/Time: 2015/07/15 16:07:22 
INFO  16:07:22,948 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  16:07:22,948 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  16:07:23,280 GenomeAnalysisEngine - Strictness is SILENT 
INFO  16:07:23,403 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
INFO  16:07:23,574 MicroScheduler - Running the GATK in parallel mode with 36 total threads, 1 CPU thread(s) for each of 36 data thread(s), of 48 processors available on this machine 
INFO  16:07:23,701 GenomeAnalysisEngine - Preparing for traversal 
INFO  16:07:23,705 GenomeAnalysisEngine - Done preparing for traversal 
INFO  16:07:23,706 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  16:07:23,706 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining 
INFO  16:07:23,706 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime 
INFO  16:07:53,723 ProgressMeter - NC_000067.6:144540201      1.28E8    30.0 s       0.0 s        5.3%     9.4 m       8.9 m 
INFO  16:08:23,806 ProgressMeter - NC_000068.7:11335701      1.94E8    60.0 s       0.0 s        7.6%    13.2 m      12.2 m 
INFO  16:08:53,841 ProgressMeter - NC_000068.7:78999901   2.62471971E8    90.0 s       0.0 s       10.1%    14.9 m      13.4 m 
INFO  16:09:23,964 ProgressMeter - NC_000068.7:136999901   3.25471971E8   120.0 s       0.0 s       12.2%    16.4 m      14.4 m 
INFO  16:09:54,102 ProgressMeter - NC_000069.6:16999901   3.88585195E8     2.5 m       0.0 s       14.5%    17.3 m      14.8 m 
INFO  16:10:24,427 ProgressMeter - NC_000069.6:81057901   4.46585195E8     3.0 m       0.0 s       16.8%    17.8 m      14.8 m 
INFO  16:10:54,599 ProgressMeter - NC_000069.6:148214401   5.04585195E8     3.5 m       0.0 s       19.3%    18.1 m      14.6 m 
INFO  16:11:24,921 ProgressMeter - NC_000070.6:38804101   5.69624875E8     4.0 m       0.0 s       21.1%    19.0 m      15.0 m 
INFO  16:11:55,015 ProgressMeter - NC_000070.6:105999901   6.29624875E8     4.5 m       0.0 s       23.6%    19.1 m      14.6 m 
INFO  16:12:25,096 ProgressMeter - NC_000071.6:7734201   6.89624875E8     5.0 m       0.0 s       25.8%    19.5 m      14.5 m 
INFO  16:12:55,351 ProgressMeter - NC_000071.6:68405801   7.52132991E8     5.5 m       0.0 s       28.0%    19.7 m      14.2 m 
INFO  16:13:25,454 ProgressMeter - NC_000071.6:140296601   8.12132991E8     6.0 m       0.0 s       30.6%    19.7 m      13.6 m 
INFO  16:13:55,785 ProgressMeter - NC_000072.6:41999901   8.81967675E8     6.5 m       0.0 s       32.6%    20.1 m      13.5 m 
INFO  16:14:25,985 ProgressMeter - NC_000072.6:94999901   9.29967675E8     7.0 m       0.0 s       34.5%    20.4 m      13.3 m 
INFO  16:14:56,287 ProgressMeter - NC_000073.6:7225501   9.88967675E8     7.5 m       0.0 s       36.8%    20.5 m      12.9 m 
INFO  16:15:26,297 ProgressMeter - NC_000073.6:51999901   1.044704221E9     8.0 m       0.0 s       38.4%    20.9 m      12.9 m 
INFO  16:15:56,453 ProgressMeter - NC_000073.6:128014401   1.097704221E9     8.5 m       0.0 s       41.2%    20.7 m      12.2 m 
INFO  16:16:26,581 ProgressMeter - NC_000074.6:32999901   1.17014568E9     9.0 m       0.0 s       43.1%    21.0 m      11.9 m 
INFO  16:16:56,874 ProgressMeter - NC_000074.6:97999901   1.22914568E9     9.6 m       0.0 s       45.5%    21.0 m      11.5 m 
INFO  16:17:27,138 ProgressMeter - NC_000075.6:33693401   1.289546893E9    10.1 m       0.0 s       47.9%    21.0 m      11.0 m 
INFO  16:17:57,593 ProgressMeter - NC_000075.6:83999901   1.348546893E9    10.6 m       0.0 s       49.7%    21.2 m      10.7 m 
INFO  16:18:27,651 ProgressMeter - NC_000076.6:32999901   1.416142003E9    11.1 m       0.0 s       52.4%    21.1 m      10.0 m 
INFO  16:18:57,828 ProgressMeter - NC_000076.6:92999901   1.474142003E9    11.6 m       0.0 s       54.6%    21.2 m       9.6 m 
INFO  16:19:27,910 ProgressMeter - NC_000077.6:20049601   1.532836996E9    12.1 m       0.0 s       56.7%    21.3 m       9.2 m 
INFO  16:19:58,237 ProgressMeter - NC_000077.6:78999901   1.602836996E9    12.6 m       0.0 s       58.9%    21.3 m       8.8 m 
INFO  16:20:28,247 ProgressMeter - NC_000078.6:25999901   1.665919539E9    13.1 m       0.0 s       61.4%    21.3 m       8.2 m 
INFO  16:20:58,267 ProgressMeter - NC_000078.6:90999901   1.732919539E9    13.6 m       0.0 s       63.8%    21.3 m       7.7 m 
INFO  16:21:28,376 ProgressMeter - NC_000079.6:38925901   1.794048561E9    14.1 m       0.0 s       66.3%    21.2 m       7.2 m 
INFO  16:21:58,544 ProgressMeter - NC_000079.6:97482301   1.851048561E9    14.6 m       0.0 s       68.4%    21.3 m       6.7 m 
INFO  16:22:28,722 ProgressMeter - NC_000080.6:45083501   1.9224702E9    15.1 m       0.0 s       70.9%    21.3 m       6.2 m 
INFO  16:22:58,892 ProgressMeter - NC_000080.6:117999901   1.9884702E9    15.6 m       0.0 s       73.6%    21.2 m       5.6 m 
INFO  16:23:28,971 ProgressMeter - NC_000081.6:58225001   2.052372444E9    16.1 m       0.0 s       76.0%    21.2 m       5.1 m 
INFO  16:23:59,036 ProgressMeter - NC_000082.6:15999901   2.110372444E9    16.6 m       0.0 s       78.3%    21.2 m       4.6 m 
INFO  16:24:29,111 ProgressMeter - NC_000082.6:85802101   2.178416129E9    17.1 m       0.0 s       80.8%    21.1 m       4.0 m 
INFO  16:24:59,175 ProgressMeter - NC_000083.6:33999901   2.238623897E9    17.6 m       0.0 s       82.5%    21.3 m       3.7 m 
INFO  16:25:29,331 ProgressMeter - NC_000083.6:84758401   2.290623897E9    18.1 m       0.0 s       84.4%    21.4 m       3.3 m 
INFO  16:25:59,601 ProgressMeter - NC_000084.6:42999901   2.351611168E9    18.6 m       0.0 s       86.4%    21.5 m       2.9 m 
INFO  16:26:29,851 ProgressMeter - NC_000085.6:26999901   2.420313807E9    19.1 m       0.0 s       89.1%    21.4 m       2.3 m 
INFO  16:27:00,149 ProgressMeter - NC_000086.7:55531001   2.506745373E9    19.6 m       0.0 s       92.4%    21.2 m      96.0 s 
INFO  16:27:30,234 ProgressMeter - NC_000086.7:140398101   2.590745373E9    20.1 m       0.0 s       95.5%    21.0 m      56.0 s 
INFO  16:27:49,193 ProgressMeter -            done   2.725537669E9    20.4 m       0.0 s      100.0%    20.4 m       0.0 s 
INFO  16:27:49,193 ProgressMeter - Total runtime 1225.49 secs, 20.42 min, 0.34 hours 
INFO  16:28:12,666 GATKRunReport - Uploaded run statistics report to AWS S3 

I don't understand why out.intervals is empty even though the command ran successfully.

Please help

Thank you, Chirag
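
A possible first check, offered as a sketch rather than a diagnosis: RealignerTargetCreator builds its target intervals around indels, and the command above passes only a known-sites VCF whose file name suggests SNPs (snps_all), with no -I input BAM. If that VCF holds no indel records, an empty interval list would not be surprising. The helper below, count_indels (a hypothetical name, not part of GATK), counts VCF records in which REF and at least one ALT allele differ in length; it assumes an uncompressed, tab-separated VCF.

```shell
# Count indel records in a VCF: data lines where REF and at least one
# non-symbolic ALT allele have different lengths.
# count_indels is a hypothetical helper, not a GATK command.
count_indels() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      n = split($5, alts, ",")         # ALT may list several alleles
      for (i = 1; i <= n; i++) {
        # ignore symbolic alleles such as <DEL> or *
        if (alts[i] ~ /^[ACGTNacgtn]+$/ && length(alts[i]) != length($4)) {
          c++
          break                        # count each record once
        }
      }
    }
    END { print c + 0 }                # prints 0 when no indels were seen
  ' "$1"
}
```

Calling count_indels on the file given to -known would show whether any indels are available to build targets from. A result of 0 would point towards supplying an indel VCF, a BAM via -I, or both.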


Created 2013-12-17 17:07:40 | Updated 2013-12-17 17:08:47 | Tags: unifiedgenotyper vcf empty
Comments (8)

Hey there,

I was trying to build an analysis pipeline for paired reads with BWA, duplicate removal, local realignment and base quality score recalibration, to finally use GATK's UnifiedGenotyper for SNP and indel calling. However, for both SNPs and indels, I get no called variants, no matter how low I set the thresholds. The quality values of the reads look OK, and leaving out dbSNP does not change the results. I have used the same reference throughout the whole pipeline. I use GATK 2.7, but switching to GATK 1.6 did not change anything.

This is my shell command for SNP calling on chromosome X (GATK returns no variants on any chromosome):

java -Xmx4g -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -R Homo_sapiens_assembly19.fasta -stand_call_conf 30.0 -stand_emit_conf 30.0 -glm SNP -mbq 17 -I test.bam -L X -o test.snps.vcf -D dbsnp_135.hg19.excluding_sites_after_129.vcf

Entries in my BAM file look like this:

SRR389458.1885965 113 X 10092397 37 76M = 10092397 1 CCTGTTTCCCCTGGGGCTGGGCTNGANACTGGGCCCAACCNGTGGCTCCCACCTGCACACACAGGGCTGGAGGGAC 98998999989:99:9:999888#88#79999:;:89998#99:;:88:989:;:91889888:;:9;:::::999 X0:i:1 X1:i:0 BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN MD:Z:23G2G13C35 PG:Z:MarkDuplicates RG:Z:DEFAULT XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:37 NM:i:3 SM:i:37 XM:i:3 XO:i:0 MQ:i:37 XT:A:U
SRR389458.1885965 177 X 10092397 37 76M = 10092397 -1 CCTGTTTCCCCTGGGGCTGGGCTNGANACTGGGCCCAACCNGTGGCTCCCACCTGCACACACAGGGCTGGAGGGAC 98998999989:99:9:999888#88#79999:;:89998#99:;:88:989:;:91889888:;:9;:::::999 X0:i:1 X1:i:0 BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN MD:Z:23G2G13C35 PG:Z:MarkDuplicates RG:Z:DEFAULT XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:37 NM:i:3 SM:i:37 XM:i:3 XO:i:0 MQ:i:37 XT:A:U
SRR389458.1888837 113 X 14748343 37 76M = 14748343 1 TCGTGAAAGTCGTTTTAATTTTAGCGGTTATGGGATGGGTCACTGCCTCCAAGTGAAAGATGGAACAGCTGTCAAG 889999:9988;98:9::9;9986::::99:8:::::999988989:8;;9::989:999:9;9:;:99:98:999 X0:i:1 X1:i:0 BD:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN MD:Z:76 PG:Z:MarkDuplicates RG:Z:DEFAULT XG:i:0 BI:Z:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 MQ:i:37 XT:A:U

And this is the output of the UnifiedGenotyper:

INFO 17:57:00,575 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:57:00,578 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
INFO 17:57:00,578 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:57:00,578 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:57:00,582 HelpFormatter - Program Args: -T UnifiedGenotyper -R /hana/exchange/reference_genomes/hg19/Homo_sapiens_assembly19.fasta -stand_call_conf 30.0 -stand_emit_conf 30.0 -glm SNP -mbq 17 -I test.bam -L X -o testX.snps.vcf -D dbsnp_135.hg19.excluding_sites_after_129.vcf
INFO 17:57:00,583 HelpFormatter - Date/Time: 2013/12/17 17:57:00
INFO 17:57:00,583 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:57:00,583 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:57:00,943 ArgumentTypeDescriptor - Dynamically determined type of /hana/exchange/reference_genomes/hg19/dbsnp_135.hg19.excluding_sites_after_129.vcf to be VCF
INFO 17:57:01,579 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:57:02,228 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250
INFO 17:57:02,237 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 17:57:02,364 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.13
INFO 17:57:02,594 RMDTrackBuilder - Loading Tribble index from disk for file /hana/exchange/reference_genomes/hg19/dbsnp_135.hg19.excluding_sites_after_129.vcf
INFO 17:57:02,867 IntervalUtils - Processing 155270560 bp from intervals
INFO 17:57:02,943 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 17:57:03,166 GenomeAnalysisEngine - Done preparing for traversal
INFO 17:57:03,167 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 17:57:03,167 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 17:57:33,171 ProgressMeter - X:11779845 1.01e+07 30.0 s 2.0 s 7.6% 6.6 m 6.1 m
INFO 17:58:03,173 ProgressMeter - X:24739805 1.89e+07 60.0 s 3.0 s 15.9% 6.3 m 5.3 m
INFO 17:58:33,175 ProgressMeter - X:37330641 3.25e+07 90.0 s 2.0 s 24.0% 6.2 m 4.7 m
INFO 17:59:03,177 ProgressMeter - X:49404077 4.94e+07 120.0 s 2.0 s 31.8% 6.3 m 4.3 m
INFO 17:59:33,178 ProgressMeter - X:64377965 5.36e+07 2.5 m 2.0 s 41.5% 6.0 m 3.5 m
INFO 18:00:03,180 ProgressMeter - X:75606869 7.54e+07 3.0 m 2.0 s 48.7% 6.2 m 3.2 m
INFO 18:00:33,189 ProgressMeter - X:88250233 7.74e+07 3.5 m 2.0 s 56.8% 6.2 m 2.7 m
INFO 18:01:03,190 ProgressMeter - X:100393213 9.94e+07 4.0 m 2.0 s 64.7% 6.2 m 2.2 m
INFO 18:01:33,192 ProgressMeter - X:110535705 1.09e+08 4.5 m 2.0 s 71.2% 6.3 m 109.0 s
INFO 18:02:03,193 ProgressMeter - X:121257489 1.20e+08 5.0 m 2.0 s 78.1% 6.4 m 84.0 s
INFO 18:02:33,195 ProgressMeter - X:132533757 1.32e+08 5.5 m 2.0 s 85.4% 6.4 m 56.0 s
INFO 18:03:03,197 ProgressMeter - X:144498909 1.41e+08 6.0 m 2.0 s 93.1% 6.4 m 26.0 s
INFO 18:03:30,079 ProgressMeter - done 1.55e+08 6.4 m 2.0 s 100.0% 6.4 m 0.0 s
INFO 18:03:30,079 ProgressMeter - Total runtime 386.91 secs, 6.45 min, 0.11 hours
INFO 18:03:30,080 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 150 total reads (0.00%)
INFO 18:03:30,080 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
INFO 18:03:30,080 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 18:03:30,080 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 18:03:30,081 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 18:03:30,081 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 18:03:30,081 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 18:03:30,081 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 18:03:32,167 GATKRunReport - Uploaded run statistics report to AWS S3

Am I missing anything here?

Best,

Cindy
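
One detail in the log above may be worth a look: the MicroScheduler line reports approximately 150 total reads for the whole of chromosome X, which would probably leave almost every site without enough coverage to support a call. A small sketch for pulling that figure out of a saved log follows; total_reads_from_log is a hypothetical helper name, and it assumes the GATK output was redirected to a text file.

```shell
# Extract the total read count from a GATK MicroScheduler summary line, e.g.
#   "... out of approximately 150 total reads (0.00%)"
#   "... out of 418 total (0.00%)"
# total_reads_from_log is a hypothetical helper, not a GATK command.
total_reads_from_log() {
  sed -n 's/.*out of \(approximately \)\{0,1\}\([0-9][0-9]*\) total.*/\2/p' "$1" |
    head -n 1                          # keep only the first match
}
```

If the number it prints is far below what the sequencing run should have produced, the reads were probably lost somewhere upstream of UnifiedGenotyper, and comparing read counts after each pipeline step might locate where.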


Created 2013-06-11 14:13:49 | Updated 2013-06-11 14:16:58 | Tags: unifiedgenotyper empty
Comments (3)

Command:

java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf

Gives me this output:

INFO 15:48:37,070 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,074 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 15:48:37,074 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 15:48:37,074 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 15:48:37,081 HelpFormatter - Program Args: -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf
INFO 15:48:37,081 HelpFormatter - Date/Time: 2013/06/11 15:48:37
INFO 15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,220 GenomeAnalysisEngine - Strictness is SILENT
INFO 15:48:37,331 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1
INFO 15:48:37,342 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 15:48:37,366 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 15:48:37,504 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 15:48:37,527 GenomeAnalysisEngine - Done creating shard strategy
INFO 15:48:37,527 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 15:48:37,528 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 15:48:38,635 ProgressMeter - done 1.20e+03 1.0 s 15.3 m 99.9% 1.0 s 0.0 s
INFO 15:48:38,636 ProgressMeter - Total runtime 1.11 secs, 0.02 min, 0.00 hours
INFO 15:48:38,636 MicroScheduler - 0 reads were filtered out during traversal out of 418 total (0.00%)
INFO 15:48:45,809 GATKRunReport - Uploaded run statistics report to AWS S3

The reference genome is 1200 nt long. All 418 reads are 100 nt long and map between positions 100 and 1100 of this reference. The reads were generated by Illumina and mapped with BWA. The BAM file contains paired-end data, but none of the pairs are properly paired. The output looks as if everything worked, and with -dcov 1 I expected to find many SNPs, yet none are reported.
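
As a rough sanity check on the numbers above (a back-of-the-envelope sketch, not a diagnosis), the average depth implied by 418 reads of 100 nt on a 1200 nt reference can be computed directly. mean_depth is a hypothetical helper name. The log for this run shows "Target Coverage: 1", so -dcov 1 asks GATK to downsample from this depth to about 1x per sample, which may leave too little evidence at any site to call a SNP.

```shell
# Mean depth = reads * read length / reference length.
# mean_depth is a hypothetical helper; arguments: read count, read length, reference length.
mean_depth() {
  awk -v n="$1" -v len="$2" -v ref="$3" 'BEGIN { printf "%.1f\n", n * len / ref }'
}

mean_depth 418 100 1200   # prints 34.8
```

Rerunning the same command without -dcov 1, or with a much larger value, would be one way to test whether downsampling is what removes the calls.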

All the steps I ran after mapping and before GATK:

Convert SAM to BAM: samtools view -u -S -b test.sam > test.bam

Sort the BAM file: samtools sort test.bam testSorted

Add the missing read group header line to the BAM file: java -jar AddOrReplaceReadGroups.jar I=testSorted.bam O=testWh.bam LB=test PL=illumina PU=lane SM=samplename

Index the BAM file: samtools index ../testFiles/output//test/testWh.bam

Does anyone see where I made a mistake?

Is there another setting I need in order to find SNPs with a small dataset on a small reference genome?