DepthPerAlleleBySample

The depth of coverage of each allele per sample

Category: Variant Annotations

VCF Field: FORMAT (sample genotype-level)

Type: StandardAnnotation


Overview

AD and DP are complementary fields that represent two important ways of thinking about the depth of the data for this sample at this site. While the sample-level (FORMAT) DP field describes the total depth of reads that passed the caller's internal quality control metrics (MAPQ > 17, for example), the AD values (one for each of the REF and ALT alleles) are the unfiltered counts of all reads that carried the REF and ALT alleles. The reason for this distinction is that the DP is in some sense reflective of the power I have to determine the genotype of the sample at this site, while the AD tells me how many times I saw each of the REF and ALT alleles in the reads, free of any bias potentially introduced by filtering the reads. If, for example, I believe there really is an A/T polymorphism at a site, then I would like to know the counts of A and T bases in this sample, even for reads with poor mapping quality that would normally be excluded from the statistical calculations going into GQ and QUAL.

Please note, however, that the AD isn't necessarily calculated exactly for indels. Only reads that statistically favor one allele over the other are counted. Because of this, the sum of AD may differ from the individual sample depth, especially when there are many non-informative reads.
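To make the relationship between the two fields concrete, here is a minimal Python sketch that pulls AD and DP out of one sample column of a VCF record and compares the sum of AD with DP. The function name `parse_sample_depths` and the FORMAT and sample strings are hypothetical, made up for illustration; the sketch also assumes both fields are present and not set to the missing value `.`.

```python
def parse_sample_depths(format_field, sample_field):
    """Return (AD list, DP) for one sample column of a VCF record."""
    # FORMAT keys and sample values are colon-separated and positionally paired.
    fields = dict(zip(format_field.split(":"), sample_field.split(":")))
    # AD holds one comma-separated count per allele: REF first, then each ALT.
    ad = [int(count) for count in fields["AD"].split(",")]
    dp = int(fields["DP"])
    return ad, dp


# Hypothetical FORMAT and sample columns, not taken from real data.
ad, dp = parse_sample_depths("GT:AD:DP:GQ:PL", "0/1:12,9:20:99:255,0,300")
ad_total = sum(ad)
print("AD per allele:", ad)
print("sum(AD) =", ad_total, "DP =", dp, "difference =", ad_total - dp)
```

A nonzero difference is not an error in itself: AD counts unfiltered reads, so it can exceed DP, and at indels non-informative reads are left out of AD, so it can also fall below DP.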

Because the AD includes reads and bases that were filtered by the caller and, in the case of indels, is based on a statistical computation, one should not base assumptions about the underlying genotype on it; instead, the genotype likelihoods (PLs) are what determine the genotype calls.
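As a rough illustration of reading the genotype from the PLs rather than from AD, the sketch below picks the diploid genotype with the smallest PL value. It assumes the genotype ordering defined by the VCF specification for PL (0/0, 0/1, 1/1, 0/2, ...) and assumes that the called genotype corresponds to the lowest PL; the function names and the PL values are hypothetical, and this is not the caller's own implementation.

```python
def diploid_genotypes(n_alleles):
    """List diploid genotypes in the order the VCF specification uses for PL."""
    # For alleles j <= k, the genotype j/k sits at index k*(k+1)/2 + j.
    return [(j, k) for k in range(n_alleles) for j in range(k + 1)]


def most_likely_genotype(pl, n_alleles):
    """Return the genotype (as allele indices) with the smallest PL value."""
    genotypes = diploid_genotypes(n_alleles)
    if len(pl) != len(genotypes):
        raise ValueError("PL length does not match the number of genotypes")
    best_index = min(range(len(pl)), key=lambda i: pl[i])
    return genotypes[best_index]


# Hypothetical PL values for a biallelic site (REF plus one ALT).
pl = [0, 9, 120]
genotype = most_likely_genotype(pl, n_alleles=2)
print("/".join(str(allele) for allele in genotype))
```

With these made-up values the lowest PL belongs to 0/0, regardless of how many ALT-supporting reads AD might report for the same sample.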



GATK version 3.2-2-gec30cee built at 2014/07/17 17:54:48. GTD: NA