Genome Sequencing and Analysis Group

The Genome Sequencing and Analysis Group (GSA) in Medical and Population Genetics at the Broad Institute is a team of computational biologists, software engineers, and hosted students and researchers. The group develops algorithms for next-generation DNA sequencers for medical and population genetics and cancer applications, and applies these algorithms to answer fundamental scientific questions.

What does GSA do?

GSA has extensive experience processing next-generation DNA sequencer data, as well as genotyping and validation data, and performing downstream analysis of these data for medical and population genetics studies. The group was one of the most active participants in the 1000 Genomes Project pilot, generating SNP and indel calls for deep whole-genome sequencing of individual samples (Pilot 2), low-pass whole genomes (Pilot 1), and deep targeted capture sequencing (Pilot 3), and contributing to the official call sets for all three wings of the project.

The method development arm of GSA, led by Eric Banks, has created a powerful framework, the Genome Analysis Toolkit (GATK), for analyzing next-generation sequencer data and the variation discovered by NGS. The GATK was designed to simplify the process of developing efficient, robust tools for working with NGS data, and currently supports Solexa, SOLiD, 454, Complete Genomics, and Sanger sequencer data in a single integrated framework. Using this framework we have developed and released now widely used tools such as base quality score recalibration, local realignment around indels, and multi-sample SNP and indel callers, as well as read and variation call QC tools. These tools are now integrated into the 1000 Genomes Project, The Cancer Genome Atlas, and the Broad's production sequencing pipeline, and are used at many other sequencing centers and individual labs with sequencing machines.

While continuing to develop novel methods for working with NGS data, we are increasingly applying these tools to technology development, production data analysis, and medical genetics projects as part of the Analysis Team led by Kiran Garimella, such as:

  • Comparative analysis of new sequencing technologies such as Roche-454, ABI SOLiD, Illumina GAII, Illumina HiSeq, and Complete Genomics sequencers
  • Design and analysis of whole exome hybrid capture technology
  • ARRA-funded GO Exome Sequencing Project
  • ARRA-funded GO Type-2 Diabetes Sequencing Project
  • ARRA-funded GO Autism Sequencing Project
  • Framingham Heart Study targeted sequencing project
  • HLA typing and HIV elite controllers
  • NHGRI "extremes" phenotype-driven whole-exome sequencing projects
    • Ciliopathies
    • Hutterites
    • Familial Combined Hypolipidemia
    • Many other projects arriving weekly

Group members

Current members

  • Group Leader

Former members

Recruitment

We currently have one open position:

Previously open and then filled positions are:

Please apply directly at Career Center. Email depristo@broadinstitute.org if you'd like more information.

The Genome Analysis Toolkit (GATK)

See the following page for detailed information about our programming framework and the tools built upon it: The Genome Analysis Toolkit

Queue

See the following page for detailed information about our process management framework: Queue

Tribble

See the following page for detailed information about the common reference meta-data framework Tribble: Tribble

GSA Firehose Pipeline Documentation

Documentation of GSA QC methodology is available here. For information on GATK analyses and parameters run in the standard Firehose pipeline, go here.

GSA next-generation sequencing workshop

On Feb. 4th, 2010, GSA organized a next-generation sequencing workshop at the Broad Institute. The topics and speakers were:

A complete video of the workshop will be available shortly.

GSA provided data

Internal GSA website

See [1]
