Presentations
Materials from conferences, workshops and online events


Below is a list of past events for which we provide materials such as presentation slides and videos. For upcoming events, please check the blog.



Created 2015-08-17 23:35:00 | Updated 2015-08-17 23:35:19 | Tags: best-practices workshop presentations slides

Laura Gauthier, Yossi Farjoun and Geraldine Van der Auwera presented this workshop in Pretoria, South Africa, upon invitation from the University of Pretoria.

This workshop covered the core steps involved in calling germline and somatic variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. The presentation materials describe why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

Additional considerations were covered, such as calling cohorts efficiently, as well as dealing with non-human data, RNAseq data, whole-genome vs. exome, basic quality control, and performance.

This was complemented by sets of hands-on tutorials aiming to teach basic GATK usage to new users, as well as introduce pipelining concepts using Queue.

The workshop was structured into five modules:

  • Introductory materials
  • Best Practices Phase 1: Pre-processing
  • Best Practices Phase 2A: Calling germline variants
  • Best Practices Phase 2B: Calling somatic variants
  • Best Practices Phase 3: Preliminary analyses

The workshop materials are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.


Created 2015-08-14 01:57:37 | Updated 2015-08-14 01:58:03 | Tags: best-practices workshop workflow presentations

Joel Thibault, Valentin Ruano-Rubio and Geraldine Van der Auwera presented this workshop in Edinburgh, Scotland, and Cambridge, England, upon invitation from the Universities of Edinburgh and Cambridge.

This workshop included two modules:

  • Best Practices for Variant Calling with the GATK

    The core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. The presentation materials describe why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

  • Beyond the Best Practices

    Additional considerations such as calling variants in RNAseq data and calling cohorts efficiently, as well as dealing with non-human data, whole-genome vs. exome, basic quality control, and performance.

This was complemented by a set of hands-on exercises aiming to teach basic GATK usage to new users.

The workshop materials are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.


Created 2015-08-14 01:50:16 | Updated 2015-08-14 01:56:56 | Tags: best-practices workshop workflow presentations

The full GATK team presented this workshop at the Broad Institute with support from the BroadE education program.

This workshop covered the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. The presentation materials describe why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

The workshop materials are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.


Created 2014-10-31 23:11:22 | Updated 2014-11-01 00:01:23 | Tags: best-practices presentations ashg

Ami Levy-Moonshine presented this condensed 90-minute workshop at ASHG 2014 in San Diego, CA, on October 21.

This workshop covered all the core steps involved in calling variants with the GATK, using the “Best Practices” developed by the GATK team. The presentation materials outline why each step is essential to the calling process and which key operations are performed on the data at each step. This includes specific information about variant calling in RNAseq data and efficient analysis of cohorts.

His slide deck is available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.

GATK was also featured in another mini-workshop at ASHG which covered the iSeqTools network, focused on cloud-based analysis. The presentation slides will be posted to the iSeqTools website in the near future.

Best Practices for variant discovery in DNA:

Best Practices for variant discovery in RNAseq:

Excerpt from Ami's ASHG poster:


Created 2014-10-31 22:45:38 | Updated 2014-10-31 23:12:02 | Tags: best-practices workshop presentations

Eric Banks, Sheila Chandran and Geraldine Van der Auwera presented this workshop in Philadelphia, PA, upon invitation from the School of Medicine at UPenn.

This workshop covered all the core steps involved in calling variants with the GATK, using the “Best Practices” developed by the GATK team. The presentation materials describe why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. This includes specific information about variant calling in RNAseq data and efficient analysis of cohorts.

The material was presented over two days, organized in the following modules:

  • Data pre-processing: From FASTQ to analysis-ready BAM
  • Variant Discovery: From BAM to analysis-ready VCF

This was complemented by a set of hands-on exercises aiming to teach basic GATK usage to new users.

The workshop materials are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.


Created 2014-10-31 21:17:51 | Updated 2014-10-31 22:45:14 | Tags: presentations c conference gamgee

Mauricio Carneiro presented this talk at CppCon (the C++ conference) in Bellevue, WA, on September 8, 2014. His slide deck and a link to the video are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.

Abstract

Our group has defined the standards for DNA and RNA sequencing data processing and analysis for disease research and clinical applications. In the last 5 years we have published our tools in the GATK (Genome Analysis Toolkit), which is written entirely in Java. With the scaling of next-generation sequencing and the immense amount of data that needs to be processed, we hit a performance wall: the language limited our ability to make optimizations and to rewrite the algorithms in a way that would conform better to modern hardware.

Enter Gamgee, a free and open-source C++14 library that offers much of the functionality of the GATK framework with the performance necessary to scale to the hundreds of petabytes of today's complex disease projects. We will show how the tools developed using the Gamgee library replaced legacy Java GATK tools in the production pipeline of the Broad Institute. We will also talk about how the algorithms have changed to take advantage of the native libraries and modern hardware features such as SSE/AVX and GPUs.


Created 2014-10-31 17:44:07 | Updated 2014-10-31 23:14:44 | Tags: workshop presentations

Laura Gauthier, David Roazen and Geraldine Van der Auwera presented this workshop in Brussels, Belgium, upon invitation from the Royal Belgian Institute for Natural Sciences.

This workshop included two modules:

  • Best Practices for Variant Calling with the GATK

    The core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. The presentation materials describe why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

  • Beyond the Best Practices

    Additional considerations such as calling variants in RNAseq data and calling cohorts efficiently, as well as dealing with non-human data, whole-genome vs. exome, basic quality control, and performance.

This was complemented by a set of hands-on exercises aiming to teach basic GATK usage to new users.

The workshop materials are available at this link if you're viewing this post in the forum, or below if you are viewing the presentation page already.


Created 2014-09-03 09:12:32 | Updated 2014-10-31 20:39:54 | Tags: talks conference ukgs2014

This talk was presented by Geraldine Van der Auwera at Genome Science UK at Oxford on September 2, 2014. Get the slide deck here; abstract below.

Abstract

Variant discovery is greatly empowered by the ability to analyse large cohorts of samples rather than single samples taken in isolation, but doing so presents considerable challenges. Variant callers that operate per-locus (such as Samtools and GATK’s UnifiedGenotyper) can handle fairly large cohorts (thousands of samples) and produce good results for SNPs, but they perform poorly on indels. More recently developed callers that operate using assembly graphs (such as Platypus and GATK’s HaplotypeCaller) perform much better on indels, but their runtime and computational requirements tend to increase exponentially with cohort size, limiting their application to cohorts of hundreds at most. In addition, traditional multisample calling workflows suffer from the so-called “N+1 problem”, where full cohort analysis must be repeated each time new samples are added.

To overcome these challenges, we developed an innovative workflow that decouples the two steps in the multisample variant discovery process: identifying evidence of variation in each sample, and interpreting that evidence in light of the evidence gathered for the entire cohort. Only the second step needs to be done jointly on all samples, while the first step can be done just as well (and much faster) on one sample at a time. This decoupling hinges on the use of a novel method for reference confidence estimation that produces a genomic VCF (gVCF) intermediate for each sample.

The new workflow enables fast, highly accurate and computationally cheap variant discovery in cohort sizes that were previously intractable: it has already been applied successfully to a cohort of nearly one hundred thousand samples. This replaces previous brute-force approaches and lowers the threshold of accessibility of sophisticated cohort analysis methods for all, including researchers who do not have access to large amounts of computing power.
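To make the “N+1 problem” from the abstract concrete, here is a toy cost model in Python. It is only a sketch under stated assumptions, not GATK code: the function name `per_sample_runs` and the idea of counting one "run" per sample analysed are illustrative choices, and real runtimes depend on far more than a simple count.

```python
def per_sample_runs(batch_sizes, decoupled):
    """Count sample-level analysis runs as samples arrive in batches.

    Toy model (hypothetical, for illustration only):
    - traditional joint calling re-analyses the whole cohort every time
      a batch is added, so each batch costs one run per sample in the
      cohort so far;
    - the decoupled workflow analyses each sample once, keeps its
      per-sample intermediate, and only repeats the joint step.
    """
    cohort_size = 0
    runs = 0
    for batch in batch_sizes:
        cohort_size += batch
        # Decoupled: only the new samples are processed.
        # Traditional: every sample in the cohort is processed again.
        runs += batch if decoupled else cohort_size
    return runs


if __name__ == "__main__":
    # Start with 100 samples, then add one new sample three times.
    batches = [100, 1, 1, 1]
    print("traditional:", per_sample_runs(batches, decoupled=False))
    print("decoupled:  ", per_sample_runs(batches, decoupled=True))
```

In this model the joint interpretation step still runs once per batch in both cases; the saving comes entirely from not repeating the per-sample step for samples that were already processed.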


Created 2013-12-20 22:01:53 | Updated 2014-10-31 20:36:41 | Tags: performance presentations conferences

Mauricio Carneiro presented this slide deck at the workshop organized by Mount Sinai School of Medicine on December 10, 2013. The other presentations made at the workshop were posted here.

Please note that we cannot guarantee content hosted on other websites; if outgoing links become outdated, please let us know.


Created 2013-10-25 16:35:32 | Updated 2014-10-31 23:13:13 | Tags: best-practices workshop presentations videos

The full GATK team presented this workshop at the Broad Institute with support from the BroadE education program.

This workshop included two modules:

  • Best Practices for Variant Calling with the GATK

    The core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. View the workshop materials to learn why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

  • Building Analysis Pipelines with Queue

    An introduction to the Queue pipelining system. View the workshop materials to learn how to use Queue to create analysis pipelines, scatter-gather them, and run them locally or in parallel on a computing farm.
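For readers unfamiliar with the term, the scatter-gather pattern mentioned above can be sketched generically in Python. This is not Queue's API and does not reflect how Queue is implemented; the names `scatter`, `process_chunk`, `gather` and `scatter_gather` are hypothetical, and the per-chunk work is a placeholder standing in for something like analysing one genomic interval.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Placeholder for the real per-chunk work (hypothetical)."""
    return [x * x for x in chunk]


def gather(results):
    """Concatenate per-chunk results, preserving chunk order."""
    merged = []
    for part in results:
        merged.extend(part)
    return merged


def scatter_gather(items, n_chunks=4, workers=4):
    chunks = scatter(items, n_chunks)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # so the gathered output lines up with the original items.
        results = list(pool.map(process_chunk, chunks))
    return gather(results)


if __name__ == "__main__":
    print(scatter_gather(list(range(10)), n_chunks=3))
```

Threads are used here only to keep the sketch self-contained; on a computing farm each chunk would typically be submitted as a separate job instead.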


Created 2013-08-13 07:37:07 | Updated 2014-10-31 23:13:24 | Tags: official best-practices workshop

The full GATK team presented this workshop at the Broad Institute with support from the BroadE education program.

This workshop covered the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. View the workshop materials to learn why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.


Created 2013-02-24 00:11:00 | Updated 2014-10-31 20:39:34 | Tags: conferences

Mark and Mauricio presented these slide decks about technical issues in human medical genomics at the 14th annual AGBT meeting (20-23 Feb 2013).


Created 2013-01-05 01:17:21 | Updated 2014-10-31 23:13:36 | Tags: best-practices workshop presentations videos

The full GATK team presented this workshop at the Broad Institute with support from the BroadE education program.

This workshop covered the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. View the workshop materials to learn why each step is essential to the calling process, which key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.