We know this field can be confusing or even overwhelming to newcomers, and getting to grips with a large and varied toolkit like the GATK can be a big challenge. We have produced a presentation that we hope will help you review all the background information that you need to know in order to use the GATK:
In addition, the following links feature a lot of useful educational material about concepts and terminology related to next-generation sequencing:
A basic review of the sequencing process.
An excellent, detailed overview of the myriad next-gen sequencing methodologies.
A nice piece explaining the problems inherent in trying to analyze terabytes of data. The GATK addresses this issue by requiring that all datasets be sorted in reference order, so that only small chunks of the genome need to be in memory at any one time, as explained here.
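To see why reference order keeps memory use small, here is a minimal Python sketch of the general idea. It is not how the GATK itself is implemented: the `iter_windows` helper, the `(contig, position)` record shape, and the window size are all assumptions made up for illustration. Because the records arrive already sorted, each window can be processed and discarded before the next one is read.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Hypothetical, simplified record: (contig name, 1-based position).
Record = Tuple[str, int]


def iter_windows(
    records: Iterable[Record], window_size: int
) -> Iterator[Tuple[str, int, List[Record]]]:
    """Yield (contig, window_index, records) for records in reference order.

    groupby only merges adjacent items, so this relies on the input
    already being sorted by contig and position. Only one window's
    worth of records is held in memory at a time.
    """
    def window_key(record: Record) -> Tuple[str, int]:
        return (record[0], record[1] // window_size)

    for (contig, window), group in groupby(records, key=window_key):
        yield contig, window, list(group)


# Toy input, already in reference order (illustrative values only).
reads = [("chr1", 5), ("chr1", 40), ("chr1", 120), ("chr2", 10)]
windows = [
    (contig, window, len(chunk))
    for contig, window, chunk in iter_windows(iter(reads), window_size=100)
]
```

If the input were not sorted, records belonging to the same window could turn up anywhere in the file, and the only safe options would be to hold everything in memory or to make repeated passes over the data.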