Tool Documentation Index 3.4-46-gbc02625

Name Summary
CommandLineGATK All command line parameters accepted by all tools in the GATK.

Name Summary
ASEReadCounter Calculate read counts per allele for allele-specific expression analysis
AnalyzeCovariates Create plots to visualize base recalibration results and the quality of a recalibration run
BaseCoverageDistribution Evaluate coverage distribution per base
CallableLoci Collect statistics on callable, uncallable, poorly mapped, and other parts of the genome
CheckPileup Compare GATK's internal pileup to a reference Samtools pileup
CompareCallableLoci Compare callability statistics
CountBases Count the number of bases in a set of reads
CountIntervals Count contiguous regions in an interval list
CountLoci Count the total number of covered loci
CountMales Count the number of reads seen from male samples
CountRODs Count the number of ROD objects encountered
CountRODsByRef Count the number of ROD objects encountered along the reference
CountReadEvents Count the number of read events
CountReads Count the number of reads
CountTerminusEvent Count the number of reads ending in insertions, deletions or soft-clips
CoveredByNSamplesSites Report well-covered intervals
DepthOfCoverage Assess sequence coverage by a wide array of metrics, partitioned by sample, read group, or library
DiagnoseTargets Analyze coverage distribution and validate read mates per interval and per sample
ErrorRatePerCycle Compute the read error rate per position
FastaStats Calculate basic statistics about the reference sequence itself
FindCoveredIntervals Outputs a list of intervals that are covered above a given threshold
FlagStat Collect statistics about sequence reads based on their SAM flags
GCContentByInterval Calculates the GC content of the reference sequence for each interval
Pileup Print read alignments in Pileup-style format
PrintRODs Print out all of the RODs in the input data set
QCRef Quality control for the reference fasta
QualifyMissingIntervals Collect quality metrics for a set of intervals
ReadClippingStats Collect read clipping statistics
ReadGroupProperties Collect statistics about read groups and their properties
ReadLengthDistribution Collect read length statistics
SimulateReadsForVariants Generate simulated reads for variants

Name Summary
BaseRecalibrator Generate base recalibration table to compensate for systematic errors
ClipReads Read clipping based on quality, position or sequence matching
IndelRealigner Perform local realignment of reads around indels
LeftAlignIndels Left-align indels within reads in a bam file
PrintReads Write out sequence read data (for filtering, merging, subsetting etc)
ReadAdaptorTrimmer Utility tool to blindly strip base adaptors
RealignerTargetCreator Define intervals to target for local realignment
SplitNCigarReads Splits reads that contain Ns in their CIGAR string
SplitSamFile Split a BAM file by sample

Name Summary
ApplyRecalibration Apply a score cutoff to filter variants based on a recalibration table
BeagleOutputToVCF Takes files produced by Beagle imputation engine and creates a vcf with modified annotations.
GenotypeGVCFs Perform joint genotyping on gVCF files produced by HaplotypeCaller
HaplotypeCaller Call SNPs and indels simultaneously via local re-assembly of haplotypes in an active region
PhaseByTransmission Compute the most likely genotype combination and phasing for trios and parent/child pairs
ProduceBeagleInput Converts the input VCF into a format accepted by the Beagle imputation/analysis program.
ReadBackedPhasing Annotate physical phasing information
UnifiedGenotyper Call SNPs and indels on a per-locus basis
VariantRecalibrator Build a recalibration model to score variant quality for filtering purposes
VariantsToBeagleUnphased Produces an input file to Beagle imputation engine, listing unphased, hard-called genotypes for a single sample in input variant file.

Name Summary
CalculateGenotypePosteriors Calculate genotype posterior likelihoods given panel data
CatVariants Concatenate VCF files of non-overlapping genome intervals, all with the same set of samples
CombineGVCFs Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file
CombineVariants Combine variant records from different sources
FilterLiftedVariants Filters a lifted-over VCF file for reference bases that have been changed
GenotypeConcordance Genotype concordance between two callsets
HaplotypeResolver Haplotype-based resolution of variants in separate callsets.
LeftAlignAndTrimVariants Left-align indels in a variant callset
LiftoverVariants Lifts a VCF file over from one build to another
RandomlySplitVariants Randomly split variants into different sets
RegenotypeVariants Regenotypes the variants from a VCF containing PLs or GLs.
SelectHeaders Selects headers from a VCF source
SelectVariants Select a subset of variants from a larger callset
VariantAnnotator Annotate variant calls with context information
VariantEval General-purpose tool for variant evaluation (% in dbSNP, genotype concordance, Ti/Tv ratios, and a lot more)
VariantFiltration Filter variant calls based on INFO and FORMAT annotations
VariantsToAllelicPrimitives Simplify multi-nucleotide variants (MNPs) into more basic/primitive alleles.
VariantsToBinaryPed Convert VCF to binary pedigree file
VariantsToTable Extract specific fields from a VCF file to a tab-delimited table
VariantsToVCF Convert variants from other file formats to VCF format

Name Summary
ListAnnotations Utility program to print a list of available annotations

Name Summary
FastaAlternateReferenceMaker Generate an alternative reference sequence over the specified interval
FastaReferenceMaker Create a subset of a FASTA reference sequence

Name Summary
GenotypeAndValidate Genotype and validate a dataset and the calls of another dataset using the Unified Genotyper
ValidateVariants Validate a VCF file with an extra strict set of criteria
ValidationSiteSelector Randomly select variant records according to specified options
VariantValidationAssessor Annotate a validation VCF with QC metrics

GATK Engine arguments that filter or transform incoming SAM/BAM data files

Name Summary
BadCigarFilter Filter out reads with wonky CIGAR strings
BadMateFilter Filter out reads whose mate maps to a different contig
DuplicateReadFilter Filter out duplicate reads
FailsVendorQualityCheckFilter Filter out reads that fail the vendor quality check
HCMappingQualityFilter Filter out reads with low mapping qualities for HaplotypeCaller
LibraryReadFilter Only use reads from the specified library
MalformedReadFilter Filter out malformed reads
MappingQualityFilter Filter out reads with low mapping qualities
MappingQualityUnavailableFilter Filter out reads with no mapping quality information
MappingQualityZeroFilter Filter out reads with mapping quality zero
MateSameStrandFilter Filter out reads with bad pairing (and related) properties
MaxInsertSizeFilter Filter out reads that exceed a given insert size
MissingReadGroupFilter Filter out reads without read group information
NoOriginalQualityScoresFilter Filter out reads that do not have an original quality score (OQ) tag
NotPrimaryAlignmentFilter Filter out reads that are secondary alignments
OverclippedReadFilter Filter out reads that are over-soft-clipped
Platform454Filter Filter out reads produced by 454 technology
PlatformFilter Filter out reads that were generated by a specific sequencing platform
PlatformUnitFilter Filter out reads with blacklisted platform unit tags
ReadGroupBlackListFilter Filter out reads matching a read group tag value
ReadLengthFilter Filter out reads based on length
ReadNameFilter Only use reads with this read name
ReadStrandFilter Filter out reads based on strand orientation
ReassignMappingQualityFilter Set the mapping quality of all reads to a given value.
ReassignOneMappingQualityFilter Set the mapping quality of reads with a given value to another given value.
SampleFilter Only use reads belonging to a specific sample
SingleReadGroupFilter Only use reads from the specified read group
UnmappedReadFilter Filter out unmapped reads

Tribble codecs for reading reference ordered data (ROD) files such as VCF or BED

Name Summary
BeagleCodec Codec for Beagle imputation engine
BedTableCodec The standard table codec that expects loci as contig start stop, not contig:start-stop
RawHapMapCodec A codec for the file types produced by the HapMap consortium
RefSeqCodec Allows for reading in RefSeq information
SAMPileupCodec Decoder for SAM pileup data.
SAMReadCodec Decodes a simple SAM text string.
TableCodec Reads tab-delimited tabular text files

Annotations available to VariantAnnotator and the variant callers (some restrictions apply)

Name Summary
AlleleBalance Allele balance across all samples
AlleleBalanceBySample Allele balance per sample
AlleleCountBySample Allele count and frequency expectation per sample
BaseCounts Count of A, C, G, T bases across all samples
BaseQualityRankSumTest Rank Sum Test of REF versus ALT base quality scores
ChromosomeCounts Counts and frequency of alleles in called genotypes
ClippingRankSumTest Rank Sum Test for hard-clipped bases on REF versus ALT reads
Coverage Total depth of coverage per sample and over all samples.
DepthPerAlleleBySample Depth of coverage of each allele per sample
DepthPerSampleHC Depth of informative coverage for each sample.
FisherStrand Strand bias estimated using Fisher's Exact Test
GCContent GC content of the reference around the given site
GenotypeSummaries Summarize genotype statistics from all samples at the site level
HaplotypeScore Consistency of the site with strictly two segregating haplotypes
HardyWeinberg Hardy-Weinberg test for transmission disequilibrium
HomopolymerRun Largest contiguous homopolymer run of the variant allele
InbreedingCoeff Likelihood-based test for the inbreeding among samples
LikelihoodRankSumTest Rank Sum Test of per-read likelihoods of REF versus ALT reads
LowMQ Proportion of low quality reads
MVLikelihoodRatio Likelihood of being a Mendelian Violation
MappingQualityRankSumTest Rank Sum Test for mapping qualities of REF versus ALT reads
MappingQualityZero Count of all reads with MAPQ = 0 across all samples
MappingQualityZeroBySample Count of reads with mapping quality zero for each sample
NBaseCount Percentage of N bases
PossibleDeNovo Existence of a de novo mutation in at least one of the given families
QualByDepth Variant confidence normalized by unfiltered depth of variant samples
RMSMappingQuality Root Mean Square of the mapping quality of reads across all samples.
ReadPosRankSumTest Rank Sum Test for relative positioning of REF versus ALT alleles within reads
SampleList List samples that are non-reference at a given site
SnpEff Top effect from SnpEff functional predictions
SpanningDeletions Fraction of reads containing spanning deletions
StrandAlleleCountsBySample Number of forward and reverse reads that support each allele
StrandBiasBySample Number of forward and reverse reads that support REF and ALT alleles
StrandOddsRatio Strand bias estimated by the Symmetric Odds Ratio test
TandemRepeatAnnotator Tandem repeat unit composition and counts per allele
TransmissionDisequilibriumTest Wittkowski transmission disequilibrium test
VariantType General category of variant

Name Summary
ErrorThrowing A walker that simply throws errors.
GATKPaperGenotyper A simple Bayesian genotyper that outputs a text-based call format.

Errors caused by incorrect user behavior, such as bad files, bad arguments, etc.

Name Summary
ArgumentException Generic class for handling misc parsing exceptions.
ArgumentsAreMutuallyExclusiveException An exception indicating that mutually exclusive options have been passed in the same command line.
DynamicClassResolutionException Class for handling common failures of dynamic class resolution
InvalidArgumentException An exception for undefined arguments.
InvalidArgumentValueException An exception for values whose format is invalid.
MissingArgumentException An exception indicating that some required arguments are missing.
MissingArgumentValueException Specifies that a value was missing when attempting to populate an argument.
TooManyValuesForArgumentException An exception indicating that too many values have been provided for the given argument.
UnknownEnumeratedValueException An exception for when an argument doesn't match any of the enumerated options for that variable type
UnmatchedArgumentException An exception for values that can't be mated with any argument.
UserException Represents the common user errors detected by GATK. Root class for all GATK user errors, as well as the container for the errors themselves
UserException.FileSystemInabilityToLockException A special exception that happens only in the case where the filesystem, by design or configuration, is completely unable to handle locking.
UserException.HardwareFeatureException A trivial specialization of UserException to mark that a hardware feature is not supported

GATK version 3.4-46-gbc02625 built at 2015/07/09 18:36:44.