Tagged with #trio
0 documentation articles | 0 announcements | 5 forum discussions

No posts found with the requested search criteria.
No posts found with the requested search criteria.

Created 2015-08-19 00:13:46 | Updated | Tags: trio denovo
Comments (9)


I have followed the GATK guidelines for trio calling, recalculating posteriors, and annotating possible de novo. However, I am confused about some of the 'hiConfDeNovo' calls.

Here is an example:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  child   father  mother
chr1    144674391   rs61810930  C   T   1177.92 PASS    AC=3;AF=0.500;AN=6;BaseQRankSum=2.82;ClippingRankSum=0.306;DB;DP=71;FS=2.137;MLEAC=3;MLEAF=0.500;MQ=64.80;MQRankSum=-1.675e+00;PG=0,0,0;QD=16.59;ReadPosRankSum=-1.813e+00;SOR=1.034;hiConfDeNovo=child GT:AD:DP:GQ:JL:JP:PGT:PID:PL:PP 0/1:16,17:33:99:127:127:0|1:144674371_T_C:645,0,612:645,0,612   0/1:11,5:16:99:127:127:0|1:144674371_T_C:151,0,447:151,0,447    0/1:11,11:22:99:127:127:0|1:144674371_T_C:412,0,692:412,0,692
chr1    145037984   rs2489136   T   G   1401.92 PASS    AC=3;AF=0.500;AN=6;BaseQRankSum=-4.595e+00;ClippingRankSum=0.717;DB;DP=107;FS=43.850;MLEAC=3;MLEAF=0.500;MQ=69.55;MQRankSum=-9.500e-02;PG=0,0,0;QD=13.35;ReadPosRankSum=0.032;SOR=0.296;hiConfDeNovo=child  GT:AD:DP:GQ:JL:JP:PGT:PID:PL:PP 0/1:20,21:41:99:127:127:.:.:510,0,640:510,0,640 0/1:10,18:28:99:127:127:0|1:145037818_T_C:454,0,293:454,0,293   0/1:17,19:36:99:127:127:.:.:468,0,504:468,0,504
chr1    145053717   rs6670785   T   C   74.13   PASS    AC=1;AF=0.167;AN=6;BaseQRankSum=1.91;ClippingRankSum=-1.220e-01;DB;DP=114;FS=5.902;MLEAC=1;MLEAF=0.167;MQ=68.67;MQRankSum=0.203;PG=0,0,0;QD=3.22;ReadPosRankSum=1.01;SOR=2.712;hiConfDeNovo=child   GT:AD:DP:GQ:JL:JP:PGT:PID:PL:PP 0/0:48,0:48:99:96:96:.:.:0,102,1530:0,102,1590  0/1:19,4:23:99:96:96:0|1:145053711_C_T:105,0,826:105,0,886  0/0:42,0:42:99:96:96:.:.:0,99,1485:0,99,1545
chr1    145121901   rs10752808  A   G   3364.92 PASS    AC=3;AF=0.500;AN=6;BaseQRankSum=4.39;ClippingRankSum=0.353;DB;DP=173;FS=3.597;MLEAC=3;MLEAF=0.500;MQ=63.47;MQRankSum=-3.750e-01;PG=0,0,0;QD=19.45;ReadPosRankSum=0.164;SOR=0.550;hiConfDeNovo=child GT:AD:DP:GQ:JL:JP:PL:PP 0/1:20,37:57:99:127:127:1187,0,627:1187,0,627   0/1:21,24:45:99:127:127:717,0,736:717,0,736 0/1:24,47:71:99:127:127:1491,0,706:1491,0,706
chr2    91766984    rs55874359  A   T   6057.92 PASS    AC=3;AF=0.500;AN=6;BaseQRankSum=0.467;ClippingRankSum=-5.510e-01;DB;DP=404;FS=23.592;MLEAC=3;MLEAF=0.500;MQ=67.76;MQRankSum=-5.800e-02;PG=0,0,0;QD=15.78;ReadPosRankSum=0.318;SOR=0.878;hiConfDeNovo=child  GT:AD:DP:GQ:JL:JP:PGT:PID:PL:PP 0/1:69,62:131:99:127:127:.:.:1543,0,2330:1543,0,2330    0/1:44,40:84:99:127:127:0|1:91766961_T_C:1493,0,1698:1493,0,1698    0/1:87,82:169:99:127:127:0|1:91766961_T_C:3052,0,3450:3052,0,3450

If the all samples are called as 0/1 why is it called a deNovo event? Perhaps I am misunderstanding the definition used? This was done using gatk v3.4.

Created 2013-09-30 00:26:16 | Updated 2013-09-30 00:57:39 | Tags: phasebytransmission trio
Comments (3)

(EDIT: solution found and explained below, mostly an error on my end, sorry)

I have what I know is a de novo variant (validated) and GATK PhaseByTransmission refuses to see it. Here is what I am starting with in my VCF file: 7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL 0/0:12,0:12:36:0,36,414 0/0:20,0:20:60:0,60,669 0/1:6,15:20:99:389,0,108

So: - the father is 12 ref, 0 alt - the mother is 20 ref, 0 alt - the offspring is 6 ref, 15 alt

When I run java -Xmx2g -jar GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar -R fasta/human_g1k_v37.fasta -T PhaseByTransmission --DeNovoPrior 0.00001 -V trio1_1553_1554_1555_small.recode.vcf -ped trio1_1553_1554_1555.ped -o trio1_1553_1554_1555.vcf --MendelianViolationsFile trio1_1553_1554_1555_noMendel.tab

I get the following output VCF line: 7 151092903 . G A 338.83 PASS . GT:AD:DP:GQ:PL:TP 1|0:12,0:12:0:0,36,414:13 0|0:20,0:20:60:0,60,669:13 1|0:6,15:20:99:389,0,108:13

So the father is eventually called a het.This happens even when I set the prior to a low value of 10^-5. That does not seem like the right behavior to me, a more appropriate call would be to call both parents ref homs. The genotype likelihood certainly suggest that for a 10^-5 prior of de novo event, this would make sense.

EDIT: OK, I wish I could remove this post. I don't think I can but I can edit the answer at least. I was just misreading the genotype likelihood. The evidence in favour of a homozygous call in the father is in fact weaker than I thought. A prior of de novo calls of 5x10^-4 fixes things, and with that threshold I am getting a proper de novo call at this location. I apologize for the pointless post!

Created 2013-07-23 16:20:21 | Updated | Tags: phasebytransmission readbackedphasing trio genotype
Comments (4)

I am doing a WGS project on a family with seven siblings. We have data on the mother but the father passed many years ago. I tried splitting variant recalibrated vcf file and ped file into "trios" with just the mother and a sibling (seven times) then running PhaseByTransmission on the combined vcf. The job was successfully completed but nothing appears phased (all "/," and no "|") in the output vcf. I also tried the variant recalibrated vcf file separately with ReadBackedPhasing. The job was successfully completed as well but again nothing appears phased (all "/" and no "|" or assigned "PQ" scores). The ProduceBeagleInput walker (to use Beagle for genotype refinement) appears to only support unrelated individuals and my set involves related individuals. Do you have any other suggestions for phasing incomplete "trios?" Thanks in advance!

Created 2013-04-22 14:51:02 | Updated | Tags: unifiedgenotyper trio exome
Comments (2)


I have exome data for a few sets of parent-offspring trios, in which offspring have phenotypically related but probably genetically different diseases. Their parents are unaffected so I am particularly interested in identifying de novo mutations. My plan was to analyse each individual separately up to the variant calling phase and then to input three (mother, father, child) analysis-ready BAMs into the UnifiedGenotyper along with a ped file. My questions are:

1) Can you tell me whether the UnifiedGenotyper uses pedigree information in the ped file to call genotypes more accurately? In other words, is this better than calling variants jointly without supplying the ped file? If not, does GATK recommend any external tools for doing this step?

2) It is better to call variants jointly using all the trios (even though they are not related and probably don't share the same disease-causing mutations)?

Best wishes,


Created 2012-07-23 17:25:42 | Updated 2012-12-17 14:51:49 | Tags: phasebytransmission trio pedigree
Comments (4)

Is it possible to use PhaseByTransmission with families that are larger than a single trio? I have a family with four siblings. If I include all of the siblings in the PED I get:

PhaseByTransmission - Caution: Family BMD has 6 members; At the moment Phase By Transmission only supports trios and parent/child pairs. Family skipped.
ERROR MESSAGE: Bad input: No PED file passed or no trios found in PED file. Aborted.

And if I just include the one key trio with the proband, I get the following:

ERROR MESSAGE: Sample BMD006_R found in data sources but not in pedigree files with STRICT pedigree validation

There does not seem to be an accessible argument for relaxing the pedigree validation. Is there a way to use PhaseByTransmission with my larger family?