Tagged with #workshop
4 documentation articles | 13 announcements | 2 forum discussions


Comments (2)

This workshop included two modules:

  • Best Practices for Variant Calling with the GATK

    The core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. View the workshop materials to learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

  • Building Analysis Pipelines with Queue

    An introduction to the Queue pipelining system. View the workshop materials to learn how to use Queue to create analysis pipelines, scatter-gather them, and run them locally or in parallel on a computing farm.

Comments (0)

Note: the exact data files we used in this tutorial are no longer available. However, you can use the files in the resource bundle to work through this tutorial.


Map and mark duplicates

http://gatkforums.broadinstitute.org/discussion/2799/howto-map-and-mark-duplicates

To save time, we start from reads that have already been aligned (mapped) and deduplicated (dedupped), stored in a .bam file.

- Generate index

Create an index file to enable fast seeking through the file.

java -jar BuildBamIndex.jar I=dedupped_20.bam

- Prepare reference to work with GATK

http://gatkforums.broadinstitute.org/discussion/2798/howto-prepare-a-reference-for-use-with-bwa-and-gatk

Create a dictionary file and index for the reference.

java -jar CreateSequenceDictionary.jar R=human_b37_20.fasta O=human_b37_20.dict

samtools faidx human_b37_20.fasta 
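Before moving on to the GATK commands, it can help to confirm that the index and dictionary files from the steps above are actually in place. The following is a minimal sketch, not part of the original workshop materials: `check_inputs` is a hypothetical helper name, and it assumes the default output names (a `.bai` file next to the BAM, and `.dict` and `.fasta.fai` files next to the reference).

```shell
# Sketch only: check_inputs is a hypothetical helper, not a GATK, Picard or samtools command.
# Usage: check_inputs <reference.fasta> <reads.bam>
check_inputs() {
    ref=$1
    bam=$2
    rc=0
    # Reference companions: <name>.dict (sequence dictionary) and <name>.fasta.fai (faidx index).
    for f in "${ref%.fasta}.dict" "$ref.fai"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    # BAM index: accept either <name>.bai or <name>.bam.bai (assumed naming conventions).
    if [ ! -s "${bam%.bam}.bai" ] && [ ! -s "$bam.bai" ]; then
        echo "missing BAM index for: $bam" >&2
        rc=1
    fi
    return "$rc"
}
```

For example, `check_inputs human_b37_20.fasta dedupped_20.bam` returns a non-zero status and names each file it could not find.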

Getting to know GATK

- Run a simple walker: CountReads

Identify basic syntax, console output: version, command recap line, progress estimates, result if applicable.

java -jar GenomeAnalysisTK.jar -T CountReads -R human_b37_20.fasta -I dedupped_20.bam -L 20

- Add a filter to count how many duplicates were marked

Look at filtering summary.

java -jar GenomeAnalysisTK.jar -T CountReads -R human_b37_20.fasta -I dedupped_20.bam -L 20 -rf DuplicateRead

- Demonstrate how to select a subset of read data

This can come in handy for bug reports.

java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37_20.fasta -I dedupped_20.bam -L 20:10000000-11000000 -o snippet.bam
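The `-L` argument in the command above takes an interval written as contig:start-stop. When preparing several snippets, a small helper can assemble that string and catch swapped coordinates early. This is only a sketch; `make_interval` is a hypothetical name, not a GATK feature.

```shell
# Sketch only: make_interval is a hypothetical helper, not part of GATK.
# Usage: make_interval <contig> <start> <stop>
# Prints an interval string of the form contig:start-stop.
make_interval() {
    if [ "$2" -gt "$3" ]; then
        echo "start ($2) must not exceed stop ($3)" >&2
        return 1
    fi
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

# Possible use (commented out so the block runs without GATK installed):
# java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37_20.fasta -I dedupped_20.bam \
#     -L "$(make_interval 20 10000000 11000000)" -o snippet.bam
```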

- Demonstrate the equivalent for variant calls

Refer to docs for many other capabilities including selecting by sample name, up to complex queries.

java -jar GenomeAnalysisTK.jar -T SelectVariants -R human_b37_20.fasta -V dbsnp_b37_20.vcf -o snippet.vcf -L 20:10000000-11000000
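As a rough sanity check on the resulting file, you can count its variant records, much as CountReads does for reads. The sketch below relies only on the fact that VCF header lines start with `#`; `count_records` is a hypothetical helper name, not a GATK tool.

```shell
# Sketch only: count_records is a hypothetical helper, not part of GATK.
# Usage: count_records <file.vcf>
# Prints the number of non-header lines (variant records) in an uncompressed VCF.
count_records() {
    grep -vc '^#' "$1"
}
```

Running `count_records snippet.vcf` after changing the `-L` interval gives a quick view of how many records were selected.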

Back to data processing

- Realign around Indels

http://gatkforums.broadinstitute.org/discussion/2800/howto-perform-local-realignment-around-indels

java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R human_b37_20.fasta -I dedupped_20.bam -known indels_b37_20.vcf -o target_intervals.list -L 20 

java -jar GenomeAnalysisTK.jar -T IndelRealigner -R human_b37_20.fasta -I dedupped_20.bam -known indels_b37_20.vcf -targetIntervals target_intervals.list -o realigned_20.bam -L 20

- Base recalibration

http://gatkforums.broadinstitute.org/discussion/2801/howto-recalibrate-base-quality-scores-run-bqsr

java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I realigned_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o recal_20.table -L 20

java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37_20.fasta -I realigned_20.bam -BQSR recal_20.table -o recalibrated_20.bam -L 20

java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I recalibrated_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o post_recal_20.table -L 20

java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R human_b37_20.fasta -before recal_20.table -after post_recal_20.table -plots recalibration_plots.pdf -L 20 
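The four recalibration commands form a chain in which each step reads a file written by an earlier one, so the file names have to line up. The dry-run sketch below prints the chain with names derived from a single suffix. It runs nothing itself. `bqsr_commands` is a hypothetical helper, and using the same value for the file suffix and the `-L` interval is an assumption that happens to hold for the chromosome 20 files in this tutorial.

```shell
# Sketch only: bqsr_commands is a hypothetical helper that prints commands; it executes nothing.
# Usage: bqsr_commands <suffix>    e.g. bqsr_commands 20
bqsr_commands() {
    s=$1
    gatk="java -jar GenomeAnalysisTK.jar"
    ref="human_b37_${s}.fasta"
    known="-knownSites dbsnp_b37_${s}.vcf -knownSites indels_b37_${s}.vcf"
    # 1. Build the recalibration table from the realigned reads.
    echo "$gatk -T BaseRecalibrator -R $ref -I realigned_${s}.bam $known -o recal_${s}.table -L $s"
    # 2. Apply the table and write the recalibrated reads.
    echo "$gatk -T PrintReads -R $ref -I realigned_${s}.bam -BQSR recal_${s}.table -o recalibrated_${s}.bam -L $s"
    # 3. Build a second table from the recalibrated reads.
    echo "$gatk -T BaseRecalibrator -R $ref -I recalibrated_${s}.bam $known -o post_recal_${s}.table -L $s"
    # 4. Plot the before and after tables.
    echo "$gatk -T AnalyzeCovariates -R $ref -before recal_${s}.table -after post_recal_${s}.table -plots recalibration_plots.pdf -L $s"
}
```

Reviewing the output of `bqsr_commands 20` before running anything makes it easy to spot an input file that no earlier step produces.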

- ReduceReads

http://gatkforums.broadinstitute.org/discussion/2802/howto-compress-read-data-with-reducereads

java -jar GenomeAnalysisTK.jar -T ReduceReads -R human_b37_20.fasta -I recalibrated_20.bam -o reduced_20.bam -L 20 

- HaplotypeCaller

http://gatkforums.broadinstitute.org/discussion/2803/howto-call-variants-on-a-diploid-genome-with-the-haplotypecaller

java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R human_b37_20.fasta -I reduced_20.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o variants_20.vcf -L 20 
Comments (0)

This workshop covered the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. View the workshop materials to learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

Comments (0)

This week, three lucky GATK team members are teaching an invited workshop at Mahidol University in Bangkok, Thailand! The slide decks for each day will be available at the start of the day here in the GSA dropbox. After the workshop, all materials will be available in the Presentations section of the GATK website as usual.

Comments (0)

The presentation videos for:

  • Day 1 (Best Practices for Variant Calling with GATK)
  • Day 2 (Building Analysis Pipelines with Queue)

are available here: http://www.broadinstitute.org/gatk/guide/events?id=3391

Comments (0)

Slides for:

  • Day 1 (Best Practices for Variant Calling with GATK)
  • Day 2 (Building Analysis Pipelines with Queue)

are available at this link in our Dropbox slide archive:

https://www.dropbox.com/sh/ryz7bx4aeqc7mkl/44Q_2jvIyH

Comments (2)

Register now for a spot at the upcoming GATK workshop, which will be held in Cambridge, MA on October 21-22.

http://www.cvent.com/events/broade-workshop-gatk-best-practices-building-analysis-pipelines-with-queue/event-summary-de1eaa027413404ba6dc04c128d52c63.aspx

This workshop will cover the following topics:

  • GATK Best Practices for Variant Detection
  • Building Analysis Pipelines with Queue

The workshop is scheduled right before ASHG Boston, so if you're going to be in town for the conference, make sure you come a couple of days early and attend the GATK workshop!

Comments (3)

We're planning the next GATK workshop to fall right before the ASHG conference in Boston, so that ASHG attendees traveling in from out of town can kill two birds with one stone.

Program details will follow, but roughly, we'll have one full day of talks devoted to the Best Practices workflow (Monday 21st) and one morning of talks devoted to pipelining analyses with Queue (Tuesday 22nd), plus an optional hands-on session on Tuesday afternoon. Registration for the talk sessions will be open to all, but the hands-on session will be reserved for Broadies.

As usual, all workshop materials will be posted online.

Comments (1)

The slides from the July 2013 Best Practices workshop are available here:

https://www.dropbox.com/sh/jhlk451jntywfdy/vJqbKTZZd_

The videos will be put online once they are processed.

Comments (1)

Before the workshop, you should run through this tutorial to install all the software on your laptop:

We use a small test dataset that you can download from this link (1.1 GB):

  • https://s3.amazonaws.com/gatk-workshop/mini-bundle.zip

During the hands-on session of the workshop, we walk through the following tutorials, with some minor modifications:

Comments (0)

Registration is now LIVE for BroadE Workshops: Best Practices for Variant Calling with the GATK.

BroadE Workshop: Best Practices for Variant Calling with the GATK
Tuesday, July 9 & Wednesday, July 10, 2013
9:00 AM - 12:00 PM

REGISTER HERE

This workshop will focus on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. You will learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

The workshop will last two days, divided into lecture-style sessions in the morning and optional hands-on sessions in the afternoon (note that for practical reasons, attendance at the latter will be limited). In the morning, you’ll hear from the GATK development team and invited guests, who will explain the rationale, theory and real-life applications of the Best Practices. In the afternoon, the GATK team will help you work through interactive exercises and tutorials in which you will apply the Best Practices to real datasets.

Optional: Hands-on tutorial sessions & Breakout session on XHMM and GenomeSTRiP

Broad Institute
Auditorium
7 Cambridge Center
Cambridge, Massachusetts 02142

Registration closes June 21 at 5:00 PM.

Notification of acceptance or wait list status sent June 27.

Comments (4)

We are organizing a user workshop on GATK Best Practices, which will take place July 9-10 at the Broad Institute in Cambridge, MA, USA.

The workshop will take the same format as last time, with lectures in the morning and hands-on tutorials in the afternoon on both days.

It will be free and open to all comers, but space at the hands-on tutorials will be limited, so be on the lookout for our announcement when registration opens. We will announce it here on the forum and by Twitter (follow us at @gatk_dev).

We regret that we are not able to offer any financial assistance for travel or lodging. However, all the workshop materials will be posted online on our Events page on the GATK website, so be sure to check them out if you can't be here in person.

Comments (0)

These are videos of the presentations given at the GATK workshop on 4-5 Dec 2012 on "Best Practices for variant calling with the GATK".

http://www.broadinstitute.org/gatk/guide/events?id=2038

Comments (4)

Here is a Dropbox link to the presentations given at the Dec 4-5 GATK workshop:

https://www.dropbox.com/sh/0puyz82ecswm4ig/V2fW--1ZFS

Comments (1)

Registration for the GATK workshop (Dec 4-5) is now open to the public. Attendance for the hands-on portion of the workshop is limited, so be sure to sign up soon!

http://www.regonline.com/builder/site/Default.aspx?EventID=1162708

Please note that this workshop is primarily intended for Cambridge and Boston-area users; users from further afield are welcome to come, but at this time we regretfully cannot offer any financial assistance for travel or lodging. Rest assured that we will make videos of the talks and all teaching materials available online following the workshop.

Comments (12)

Hi all,

We are organizing a two-day GATK workshop to be held at the Broad Institute in Cambridge, MA on Dec 4-5 2012.

This workshop will focus on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. You will learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

The workshop will last two days, divided into lecture-style sessions in the morning and optional hands-on sessions in the afternoon (note that for practical reasons, attendance at the latter will be limited, so be sure to sign up early). In the morning, you’ll hear from the GATK development team, who will explain the rationale, theory and real-life applications of the Best Practices. In the afternoon, the GATK team will help you work through interactive exercises and tutorials in which you will apply the Best Practices to real datasets.

Registration is not yet open; we will post another announcement when it is, but be sure to save the date if you are interested in attending.

Please note that this workshop is primarily intended for Cambridge and Boston-area users; users from further afield are welcome to come, but at this time we regretfully cannot offer any financial assistance for travel or lodging. Rest assured that we will make videos of the talks and all teaching materials available online following the workshop.

Comments (1)

Will there be a GATK Best Practices and Building Analysis Pipelines with Queue workshop in 2014? I would like to attend, and to put my attendance as part of a career development plan I am building. Thanks, Jimmy

Comments (15)

Hi Team, I wonder whether you're planning to hold GATK workshops or user meetings before or after ASHG this year. I assume many users will make their way to Boston, and for those from far away (like myself), this would be very useful and efficient. Hope I didn't miss an announcement somewhere, searched but only found an IGV tutorial... Hopefully yours, K