Guide Index
Your guide to the Guide

If this is your first rodeo, you're probably asking yourself:

  • What can GATK do for me? Identify variants in a bunch of sample sequences, with great sensitivity and specificity.

  • How do I get GATK to do that? You run the recommended Best Practices steps, one by one, from start to finish, as described in the Best Practices documentation.

  • No but really, how do I know what to do? For each step in the Best Practices, there is a tutorial that details how to run the tools involved, with example commands. The idea is to daisy-chain all those tutorials, in the order that they're referenced in the Best Practices doc, into a pipeline.

  • Oh, you mean I can just copy/paste all the tutorial commands as they are? Not quite, because there are a few things that need to be tweaked. For example, the tutorials use the -L/--intervals argument to restrict analysis for demo purposes, but depending on your data and experimental design, you may need to remove it (e.g. for whole-genome sequencing, WGS) or adapt it (for whole-exome sequencing, WEx). Hopefully it's explained clearly enough in the tutorials.

  • Why don't you just provide one script that runs all the tools? It's really hard to build and maintain a one-size-fits-all pipeline solution. Really really hard. And not nearly as much fun as developing new analysis methods. We do provide a pipelining program called Queue that has the advantage of understanding GATK argument syntax natively, but you still have to actually write scripts yourself in Scala to use it. Sorry. Maybe one day we will be able to offer GATK analysis on the Cloud. But not today.

  • What if I want to know what a command line argument does or change a parameter? First, check out the basic GATK command syntax FAQ if it's your first time using GATK, then consult the relevant Tool Documentation page. Keep in mind that some arguments are "engine parameters" that are shared by many tools, and are listed in a separate document. Also, you can always use the search box to find an argument description really quickly.

  • The documentation seems chaotic. Is there any logic to how it's organized? Sort of. (And, ouch. Tough crowd.) The main category names should be obvious enough (if not, see the "Documentation Categories" tab). Within categories, everything is just in alphabetical order. In future, we're going to try to provide more use-case based structure, but for now this is what we have. The best way to find practical information is to either start from the Best Practices doc (which provides links to all FAQs, method articles and tutorials directly related to a given step), or use the search box and search-by-tag functions (see the "Search" tab). Be sure to also check out the Presentations section, which provides workshop materials and videos that explain a lot of the motivation and methods behind the Best Practices.

  • Does GATK include other tools besides the ones in the Best Practices? Oh sure, there's a whole bunch of them, all listed in the Tool Documentation section, categorized by type of analysis. But be aware that anything that's not part of the Best Practices is most likely either a tool that was written for a one-off analysis years ago, an experimental feature that we're still not sure is actually useful, or an accessory utility that can be used in many different ways and takes expert inside knowledge to use properly. All these may be buggy, insufficiently documented, or both. We provide support for them as well as humanly possible but ultimately, you use them at your own risk.

  • Why do the answers to these questions keep getting longer and longer? I don't know what you're talking about.

  • What else should I know before I start? You should probably browse the titles of the Frequently Asked Questions -- there will be at least a handful you'll want to read, but it's hard for us to predict which ones.
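
To make the -L/--intervals point above a bit more concrete, here is a minimal sketch in Python of how you might keep a tutorial-style command in one place and switch the interval restriction on or off. It assumes the usual GATK command layout (java -jar, then -T, -R, -I, -o and optionally -L); the helper name build_gatk_command and all file names are made up for illustration, and not every tool takes -I or -o, so check the Tool Documentation page for the tool you're actually running.

```python
from typing import List, Optional


def build_gatk_command(
    tool: str,
    reference: str,
    input_bam: str,
    output: str,
    intervals: Optional[str] = None,
    jar: str = "GenomeAnalysisTK.jar",  # assumed jar name; adjust to your install
) -> List[str]:
    """Build the argument list for one GATK run.

    The -L argument is only added when `intervals` is given, so the same
    helper covers the demo, whole-genome and exome cases.
    """
    cmd = ["java", "-jar", jar, "-T", tool,
           "-R", reference, "-I", input_bam, "-o", output]
    if intervals is not None:
        cmd += ["-L", intervals]
    return cmd


# Tutorial-style run: restricted to a single contig to keep the demo fast.
demo = build_gatk_command("HaplotypeCaller", "reference.fasta",
                          "sample.bam", "demo.vcf", intervals="20")

# Whole genome (WGS): drop the restriction entirely.
wgs = build_gatk_command("HaplotypeCaller", "reference.fasta",
                         "sample.bam", "wgs.vcf")

# Exome (WEx): point -L at your capture targets file (hypothetical name).
wex = build_gatk_command("HaplotypeCaller", "reference.fasta",
                         "sample.bam", "wex.vcf",
                         intervals="targets.interval_list")

for name, cmd in (("demo", demo), ("wgs", wgs), ("wex", wex)):
    print(name + ": " + " ".join(cmd))
```

The helper only builds the command; it doesn't run anything. If you want to execute the result, passing the list to subprocess.run(cmd, check=True) is one option.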

The categories menu on the left-hand side shows the top categories under which the documentation articles are classified.

  • Guide Index: Your guide to the Guide
  • Search Tags: Look up the information you need by tag
  • Best Practices: Official guidelines on how to best use our tools for data processing and analysis
  • Tool Documentation: Detailed technical documentation (arguments, options etc.) for each tool
  • Methods and Workflows: Recommended methods and typical workflows for multi-step analyses
  • FAQs: Frequently Asked Questions
  • Tutorials: Tutorials and how-to articles for our tools and pipelines
  • Presentations: Materials from conferences, workshops and online events
  • Pipelining with Queue: Building analysis pipelines to run GATK and other tools efficiently
  • Developer Zone: All you need to know to write your own tools on top of the GATK
  • Third-Party Tools: Tools built on top of the GATK developed by other groups
  • Version History: Historical record of changes by version

In the unlikely event that the documentation doesn't solve all of your problems, do your holiday shopping and make your coffee too, please use our support forum to get help from a live human. The live human in question will most likely be a member of the development team. However, in some cases you may get responses from other people in the user community who are not part of our group. This is usually a good thing, as there are quite a few expert users whom we happily trust with answering most questions, and we welcome the help (hurray for free labor). But as with any public forum, we do recommend you exercise judgment in following advice from strangers on the Internet. One thing you can do is check people's role and post count on their public profile.

Now, in practice:

  • Questions, comments? If you have any questions about a specific documentation article, you can post a comment directly on the article in the forum by clicking on the "comment" link that shows up at the beginning and end of each article. Or if you can't find an article that covers the topic you're interested in, you can start your own discussion thread in the forum.
  • Getting an error or found a bug? If you're encountering an error while running our tools, or you think you have found a bug, please search the forum to see if anyone else has encountered and reported it already. In many cases, the bug has already been found and fixed, and all you need to do is upgrade to the latest version of the tools. Sometimes it's not even a bug, just a usage error on your part (it happens, especially with complex software like this), and someone has already posted a workaround or advice to fix the mistake. But if it looks like your issue is a new bug, then by all means, please report it in the forum. We just ask that you make sure you've tried running the latest version and validated your input files before posting. When you do post, we'll need a succinct description of the problem, a description of the data you're working with, and the command lines that led to the issue.