Tagged with #methods

Created 2012-07-23 16:45:48 | Updated 2014-12-08 17:56:15 | Tags: inbreedingcoeff phasebytransmission pedigree intermediate phasing methods plink allelefrequency

There are two types of GATK tools that are able to use pedigree (family structure) information:

Tools that require a pedigree to operate

PhaseByTransmission and CalculateGenotypePosteriors will not run without a properly formatted pedigree file. These tools are part of the Genotype Refinement workflow, which is documented here.

Tools that can use a pedigree to generate standard variant annotations

The two variant callers (HaplotypeCaller and the deprecated UnifiedGenotyper) as well as VariantAnnotator and GenotypeGVCFs are all able to use pedigree information if you request an annotation that involves population structure (e.g. Inbreeding Coefficient). To be clear though, the pedigree information is not used during the variant calling process; it is only used during the annotation step at the end.

If you already have VCF files that were called without pedigree information and you want to add pedigree-related annotations (e.g. to use Variant Quality Score Recalibration (VQSR) with InbreedingCoeff as a feature annotation), don't panic. Just run the latest version of VariantAnnotator to re-annotate your variants, requesting any missing annotations, and make sure you pass your PED file to VariantAnnotator as well. If you forget to provide the pedigree file, the tool will run successfully but pedigree-related annotations may not be generated (this behavior differs in some older versions).
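
As a rough illustration of the re-annotation step, here is a minimal Python sketch that only assembles a command line and does not run anything. It assumes a GATK 3.x-style invocation (`-T VariantAnnotator`, `-A` for each requested annotation, `-ped` for the pedigree file); the jar name and all file names are placeholders, and the helper `variant_annotator_args` is hypothetical. Check the flags against the documentation for the version you are actually running.

```python
from typing import List, Sequence


def variant_annotator_args(
    reference: str,
    input_vcf: str,
    ped_file: str,
    output_vcf: str,
    annotations: Sequence[str] = ("InbreedingCoeff",),
) -> List[str]:
    """Build an argument list for re-annotating a VCF with a pedigree.

    Assumption: GATK 3.x-style flags. Nothing is executed here.
    """
    args = [
        "java", "-jar", "GenomeAnalysisTK.jar",  # placeholder jar path
        "-T", "VariantAnnotator",
        "-R", reference,
        "-V", input_vcf,
        "-ped", ped_file,  # without this, pedigree annotations may be skipped
        "-o", output_vcf,
    ]
    # Request each missing annotation explicitly.
    for name in annotations:
        args += ["-A", name]
    return args


# Placeholder file names, for illustration only.
cmd = variant_annotator_args(
    "reference.fasta", "cohort.vcf", "cohort.ped", "cohort.annotated.vcf"
)
print(" ".join(cmd))
```

Building the arguments as a list keeps the pedigree file an explicit, required parameter, which guards against the "forgot the PED file" case described above.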

About the PED format

The PED files used as input for these tools are based on PLINK pedigree files. The general description can be found here.

For these tools, the PED file must contain only the first six columns of a PLINK-format PED file (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) and no allele columns, like a PLINK FAM file.
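
To make the six-column requirement concrete, here is a small Python sketch that checks a PED file before it is handed to a tool. The `PedRecord` type, the `parse_ped` helper and the example trio are all made up for illustration. The column order follows the PLINK convention (family ID, individual ID, paternal ID, maternal ID, sex, phenotype), with `0` standing for an unknown parent and `-9` for a missing phenotype. This is not how GATK itself parses the file.

```python
from typing import List, NamedTuple


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str   # "0" means the father is not in the file
    maternal_id: str   # "0" means the mother is not in the file
    sex: str           # PLINK convention: 1 = male, 2 = female, other = unknown
    phenotype: str     # PLINK convention: -9 (or 0) = missing


def parse_ped(text: str) -> List[PedRecord]:
    """Parse whitespace-delimited PED text, insisting on exactly six columns."""
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            # Extra columns usually mean genotype (allele) data was left in.
            raise ValueError(
                f"line {lineno}: expected 6 columns, found {len(fields)}"
            )
        records.append(PedRecord(*fields))
    return records


# A made-up trio: two founders and one child.
example = """\
FAM001 father 0 0 1 -9
FAM001 mother 0 0 2 -9
FAM001 child father mother 1 -9
"""
trio = parse_ped(example)
for record in trio:
    print(record.individual_id, record.paternal_id, record.maternal_id)
```

A full PLINK PED file with allele columns would fail this check, which is the point: such a file needs to be trimmed to its first six columns first.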

Created 2015-03-24 09:24:28 | Updated | Tags: methods

Hello, I can't seem to find a description of indel calling (especially the calculation of the indel allele likelihood) in the original GATK paper. Is there a comprehensive definition of the model used (I guess it is based on an HMM)?