2013 Workshops

Proteomics
This half-day course will present essential aspects of proteomics of general interest to biologists and clinicians. Attendees will come away with a strong understanding of the important technologies and experimental approaches used in modern mass spectrometry-based proteomics.
Topics and applications presented will include:
  • Key types of proteomics experiments and sample requirements 
  • Quantitative MS approaches for global proteome and posttranslational analyses 
  • Basics of targeted MS for precise, reproducible measurements of peptides, proteins, and their modifications in biology and medicine 
  • Data analysis approaches and statistical methods for confident identification of true differential proteins/peptides
  • Useful tools for annotating MS results, extracting knowledge, and organizing results into pathways
  December 10   Workshop Materials
De novo & Genome-guided Transcript Reconstruction from RNA-Seq
Participants in this workshop will learn the basics of how to reconstruct and analyze transcriptomes starting from RNA-Seq data using genome-guided and genome-free methods. Genome-guided reconstruction will be performed using the Tuxedo software suite, and genome-free reconstruction will use Trinity. The course consists of a lecture followed by direct exploration of sample RNA-Seq data using the tools in the context of each analysis framework.
  December 2    
Sharing Genomics Data and Analytical Results with GenomeView
This workshop will teach you the steps involved in setting up a GenomeView instance that is preloaded with your genomics data and analytical results to share with collaborators and colleagues. The focus of this session is on using GenomeView as a genomics visualization platform to share data, rather than on using GenomeView per se. We will start by reviewing how to prepare your data for visualization, along with some best practices for different data types that have served us well. After preprocessing the data, we will set up an online repository with this data and make it secure using standard web authentication. Next, we will tweak the configuration settings to ensure that people visiting the repository see the data exactly as we intend. Finally, we will expand the repository with interactive components that enable us to guide users to a more meaningful interaction with our data.

The workshop will start with a short introductory presentation and continue with a step-by-step walkthrough to set up the repository. After the walkthrough, participants can complete guided exercises with the example data or can set up their own data repository with the newly learned skills. The examples will include data from various types of sequencing assays: ChIP-seq, RNA-seq, and WGS. We will also highlight visualizing different comparative genomics analyses with two examples. The first one will look at read-based genome diversity through variant calls. The second one will look at whole-genome-based comparative studies.

Note: This is not an introductory workshop to using GenomeView as a visualization tool.

  November 19  

 

Integrative Genomics Analysis with GenomeSpace
This workshop will focus on integrative genomics analysis with GenomeSpace, an environment that brings together a diverse set of computational tools, enabling nonprogramming scientists to easily combine the tools’ capabilities through a user-friendly point-and-click interface. It offers a common space to create, manipulate, and share an ever-growing range of genomic visualizations and analyses.

GenomeSpace features support for cloud-based data storage and analysis, multi-tool analyses, automatic conversion of data formats, and ease of connecting new tools to the environment. A set of six “GenomeSpace-enabled” seed tools developed by collaborating organizations provides a comprehensive platform for the analysis of genome data: Cytoscape, Galaxy, GenePattern, Genomica, Integrative Genomics Viewer, and the UCSC Genome Browser. The extensible format of the system has empowered a wider range of analyses through the continual addition of new tools and resources.

Participants in this workshop will learn how to use GenomeSpace to utilize the visualization and analysis capabilities of multiple tools in several research scenarios. Through the demonstration of a number of short analysis “recipes,” we will give participants the essential elements to construct powerful integrative genetic and genomic analyses.

  November 8  

 

GenomeView (Part I)
This workshop will teach you how to get started with GenomeView, an interactive genome browser and annotation editor. It allows you to visually explore various data types that come out of (comparative) genomic studies. These include reference genomes with annotation, sequence reads, read coverage, whole-genome alignments with translated annotations, and read-based variation. You will learn how to prepare your data for visualization and how to get started with the various visualization tracks, and you will get some tips and tricks on how to leverage read data, annotations, and whole-genome alignments to verify your data generation and downstream analyses. Finally, we'll explore how you can use the various visualizations to generate hypotheses.

The workshop will start with an introductory presentation and a step-by-step walkthrough of some of the most salient features with interactive examples. After the walkthrough, participants can either complete tutor-guided exercises with the example data or explore their own data with the newly learned skills. The examples during the workshop will focus on non-vertebrate genomes, i.e., plants, fungi, and bacteria, but are equally applicable to any other genomes. The examples will include data from various types of sequencing assays: ChIP-seq, RNA-seq, and WGS. We will also highlight visualizing different comparative genomics analyses with two examples. The first one will look at read-based genome diversity through variant calls. The second one will look at whole-genome-based comparative studies. Example data sets will be provided.

Participants should bring their own computer and are very much encouraged to bring their own data in any of the supported file formats.

  October 22    
GATK Best Practices and Building Analysis Pipelines with Queue
The first part of this workshop, GATK Best Practices, will focus on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. The GATK development team and invited guests will give talks explaining the rationale, theory, and real-life applications of the Best Practices. You will learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.

An optional half-day hands-on session will be available to select participants. In this session, the GATK team will help beginners work through interactive exercises and tutorials to learn how to use GATK and apply the Best Practices to real data.

The second part of this workshop, Building Analysis Pipelines with Queue, will focus on using the Queue software package to build analysis pipelines. Queue is a program developed by the GATK team as a companion to GATK. It features powerful pipelining capabilities that can be used to streamline your analysis work and take advantage of high-performance computing infrastructure to deliver results faster. The GATK development team and invited guests will give talks explaining how to use Queue to build analysis pipelines that involve popular analysis tools such as BWA, GATK, and others. You will learn how to write a pipeline script, customize inputs and outputs, and run the pipeline on a computing cluster.

  October 21 and 22   Workshop Materials and Videos
Introduction to Experimental Design of Sequencing-Based Studies
In this workshop, attendees will get a broad overview of experimental design criteria for successful sequencing experiments of a number of types. High-throughput genome sequencing is a very powerful and versatile tool that is being adopted by a wide range of new researchers as costs drop. However, no technology or analysis can save a poorly designed or underpowered experiment. We will discuss important questions that should be addressed prior to generating sequencing data, the approximate amounts and types of sequencing needed for reasonably powered experiments, how to decide what kind of sequencing to do or whether to sequence at all, and basic analysis needs for different types of experiments.

This course is primarily for researchers new to sequencing or project managers and analysts who are not familiar with the upstream considerations of the experiments they manage and analyze.

  October 4   Workshop Materials
Statistical Genetics
This workshop provides an introduction to the basic principles of statistical genetic analysis. The course is aimed at individuals who are interested in learning the basics of genetic analysis. Specific areas of focus include study design considerations for genetic association tests, quality control (QC) procedures for genetic data, basic analysis of genome-wide association SNP data, and an introduction to rare variant testing approaches.

The workshop will be a blend of lectures introducing each of these topics and hands-on practical application, in particular for QC and common variant analysis. Tutorial datasets will be provided for the introduction to the analysis of genetic data. Participants will not need to bring a laptop, as computers will be provided. At the end of the workshop, participants should have a basic understanding of the elements of genetic analysis for common and rare variation. In particular, this workshop will provide a foundation for starting genetic analysis of real data and an introduction to how best to learn more about current techniques. The practical components of the workshop will include command-line work, so basic familiarity with the Unix environment is strongly recommended.
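
To give a flavor of the "basic analysis of genome-wide association SNP data" mentioned above, here is a minimal sketch of a single-SNP allelic association test. It is an illustration only, not the workshop's actual material: the function name `allelic_chi_square` and the example counts are hypothetical, and real analyses would normally rely on dedicated tools after QC.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of
    allele counts: rows are cases/controls, columns are alt/ref alleles.

    Returns (chi-square statistic, p-value). Illustrative sketch only.
    """
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    n = row_case + row_ctrl
    if min(row_case, row_ctrl, col_alt, col_ref) == 0:
        raise ValueError("every row and column of the table must be non-empty")

    # Shortcut formula for the Pearson statistic of a 2x2 table.
    chi2 = (n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
            / (row_case * row_ctrl * col_alt * col_ref))

    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP (made up for illustration).
chi2, p = allelic_chi_square(case_alt=60, case_ref=40, ctrl_alt=40, ctrl_ref=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide setting this test is repeated for every SNP, which is why a much stricter significance threshold than 0.05 is used in practice.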

  September 27   Workshop Materials and Videos
xBrowse
This workshop gives an introduction to xBrowse, a software platform for analyzing exome sequencing data from families affected by Mendelian disease. xBrowse allows researchers to filter potential causal variants from next-generation sequencing data and to prioritize candidate variants and genes using a variety of annotation tools and external datasets. While xBrowse has both a web-based front end and a command-line interface, in this workshop we will only review the web-based front end. Programming experience is not necessary.

The workshop will be broken into two halves. In the first half we will start with an introduction to exome sequencing, our approach to upstream data processing and variant-calling, and various considerations around sample/family selection and experimental approaches. We will then introduce the xBrowse pipeline, including data upload options and various analytical approaches. We will show demonstrations using real research data from the MacArthur lab.

Participants will not need data or a computer. After this tutorial, we will have an open “coffee-hour” style discussion where we will explore participants’ own datasets as a group. Any participants who want to review their own data should contact us as soon as possible so that we can ensure their data are processed and uploaded to the system in time. We will ensure that all patient data are presented in a deidentified fashion, but participants should only provide data that are not sensitive in nature.

  September 13    
Using CellProfiler for Biological Image Analysis
This workshop will instruct participants in the use of CellProfiler, an open-source, freely downloadable software package designed for large-scale, automated phenotypic image analysis. Attendees are encouraged to contribute sample images from their assays as part of the demonstration. We will also briefly discuss the basic principles of supervised machine learning in order to score phenotypes where phenotypic differences between samples are not visible by eye.
  August 9    
Best Practices For Variant Calling With The GATK
This workshop will focus on the core steps involved in calling variants with the Broad’s Genome Analysis Toolkit, using the “Best Practices” developed by the GATK team. You will learn why each step is essential to the calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset.
 
The workshop will last two days, divided into lecture-style sessions in the morning and optional hands-on sessions in the afternoon (note that for practical reasons, attendance at the latter will be limited, so be sure to sign up early). In the morning, you’ll hear from the GATK development team and invited guests, who will explain the rationale, theory and real-life applications of the Best Practices. In the afternoon, the GATK team will help you work through interactive exercises and tutorials in which you will apply the Best Practices to real datasets.

All participants should be familiar with general next-gen sequencing terms and data formats. The lecture-based morning sessions are open to all existing, new, or prospective users of the GATK. The hands-on afternoon sessions are restricted to existing users of the GATK who are familiar with the command-line work environment and at least capable of running simple analyses as described in an online tutorial.

  July 9-10   Workshop Materials and Videos

 
 
   
Planning & Execution of Successful Small Molecule Probe Development
The goal of this workshop is to educate our community about successful probe development strategies, engage workshop participants through real-world case studies and examples, and provide a framework for follow-up interaction with the Therapeutics Platform.

The course will begin with presentations on how to approach a probe development project, covering topics such as project planning, assay development, chemical libraries, data analysis and computational mining, and working with the platform. In the second half of the session, participants will work through complex fictional situations to design and execute a successful probe development project.

  June 10   Workshop Materials
Modern Statistical Ideas for the Life Sciences
This workshop will cover modeling and estimation basics for life scientists. The workshop is aimed at biologists who want to work closely with statisticians to develop either mechanistic or phenomenological models of cellular behavior. Statistical models can be used to gain insight into quantitative aspects of any cellular behavior of interest that are not directly measurable, and need to be inferred from indirect measurements. Model-based statistical analysis and inference are especially useful in high-throughput studies, and to disentangle small-to-medium effects from noise. If you are expecting 50-fold effects, counting will work just fine.
  June 7
 
 
 
Charting the Epigenome with ChIP
ChIP (chromatin immunoprecipitation) is a very powerful technique that enables the localization of proteins on DNA throughout the genome. The technique relies on the selective enrichment of a chromatin fraction containing a specific antigen, by immunoprecipitation. Antibodies that recognize a protein or protein modification are used to capture the chromatin (protein-DNA complex), and in the contemporary method of ChIP-Seq, next-gen sequencing libraries are derived from the recovered DNA. The libraries are sequenced and the recovered sequences are aligned to a genomic scaffold to map the locations of the antigen recognized by the antibody. The ChIP technique can be used in any area of research to further elucidate gene function and regulation in their native state.

Application of ChIP to the genome-wide localization of DNA-binding proteins, transcription factors, chromatin-modifying enzymes, and histone modifications has helped develop an understanding of the mechanisms that regulate chromatin organization. These methods have contributed to our understanding of embryogenesis and tissue-specific cellular differentiation, while aberrant chromatin structure is associated with developmental disorders and other diseases, such as cancer.

  June 3   Workshop Materials and Videos
Visual Representation for Exploring and Explaining Scientific Data
Visual representation can serve two distinct purposes: to guide the data-exploration process as the scientific story is still unfolding and to communicate research findings. Each goal entails a different approach to data representation, but sound graphic design principles are important in both. This workshop will focus on data-visualization techniques to parallel the research trajectory from lab to publication. Prospective attendees will be guided through essential design principles, graphical methods for depicting information, and software tools for implementing visualization ideas. Hands-on activities will give participants opportunities to put concepts into practice.
  May 10   Visual Strategies

 

Intro to RNA-Seq
RNA-Seq is revolutionizing our ability to analyze the transcriptome. This seminar will present participants with an overview of RNA-Seq principles, experimental considerations, steps of the RNA-Seq analysis process, and current state of the art. It will provide the basis for future in-depth RNA-Seq courses and workshops.
  May 7   Workshop Materials
GenePattern
In a half-day, hands-on format, participants learn to use GenePattern features, including an intuitive graphical user interface for users at all levels of computational sophistication, a comprehensive repository of over 180 tools for the analysis of gene expression, sequence variation, proteomics, and more, and a pipeline environment that allows users to chain tasks together to create and share reproducible analysis workflows.
  May 3   Workshop Materials
Using CellProfiler for Biological Image Analysis
This workshop will instruct participants in the use of CellProfiler, an open-source, freely downloadable software package designed for large-scale, automated phenotypic image analysis. Attendees are encouraged to contribute sample images from their assays as part of the demonstration. We will also briefly discuss the basic principles of supervised machine learning in order to score phenotypes where phenotypic differences between samples are not visible by eye.
  April 12   Workshop Materials
Functional Genomic Screens in the RNAi Platform
The RNAi Platform workshop will explore functional genomics resources at the Broad, both for those interested in performing genetic screens and for those interested in using these tools to answer specific questions in their area of interest. The workshop is aimed at bench scientists who might use these resources as well as computationalists who want to understand more about the biological mechanisms and experimental approaches underlying the data sets that emerge. We will cover the range of perturbations available in the platform, including shRNAs, ORFs, and TALENs, along with background on how they work and how they are delivered into cells. We will then discuss the planning and execution of small-scale and genome-wide screens using these reagents. Additionally, we will provide hands-on examples of how to analyze and prioritize hits that emerge from screens, and discuss how to move from primary screening data to figures 3 through 7 of your publication.
  April 5   Workshop Materials and Videos
Genome Assembly
This half-day workshop is intended as an introduction to the concepts of assembly and assembly analysis. The content is well-suited to non-experts who wish to learn about the fundamentals of assembly algorithms and basic best practices for understanding levels of quality in an assembly. Basic understanding of the concepts of genome sequencing and genome assembly is expected.

The first third of this course will focus on what an assembly is, and how changes in sequencing technology have impacted assembly algorithms. We will provide insights into how assemblers work, and why assembly is still an open problem. Attendees are expected to gain a basic understanding of current assembly algorithms. This section contains an interactive problem-solving session designed to develop understanding of assembly algorithms.

The second two-thirds of the course will focus on utilizing and analyzing genomic assemblies. We will outline why assemblies are not perfect, and explain how not all assemblies are created equal. Participants should leave with a good understanding of what metrics should be assessed when judging the quality of an assembly. This session contains an interactive analysis session where participants will be able to look at real assemblies to diagnose issues.
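
As one concrete example of an assembly quality metric, the sketch below computes N50, a widely used contiguity statistic. The workshop description does not name specific metrics, so the choice of N50 here is an assumption, and the function name `n50` and the example contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length L of the contig at which,
    walking from the longest contig to the shortest, the running total of
    bases first reaches at least half of the total assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bases) for a toy assembly.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

N50 only describes contiguity. It says nothing about correctness, so a misassembled genome can still score well, which is one reason no single metric is sufficient for judging an assembly.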

  March 20   Workshop Materials
RNA-Seq Basics
This workshop will cover the basic concepts behind library construction, sequencing, and initial analysis of RNA-Seq data. The workshop is aimed at biologists who want to learn the basics of RNA-Seq analysis, computationalists who want to understand more about the basics of RNA-Seq data generation, or anyone who is interested in RNA-Seq but is not familiar with basic high-throughput sequencing technologies. We will start with very basic concepts from three perspectives: generating high-throughput (e.g., Illumina) sequencing data; making RNA-Seq libraries; and understanding mRNA structure, annotation, and quantitation. We will proceed through the steps of basic sequencing into the common first-pass methods for analysis. Prospective attendees who are already familiar with the majority of these concepts and are looking for a more in-depth or hands-on workshop on specific RNA-Seq techniques are advised to look for our more advanced workshop in May.
  March 6   Workshop Materials
Differential Expression Analysis of RNA-Seq with the Tuxedo Tools
RNA-Seq is now a routine assay for measuring gene expression, but the analysis of the data can be daunting. This course will cover basic principles of RNA-Seq analysis and include hands-on training for the "Tuxedo Tools". Users will learn to run TopHat, Cuffdiff, and CummeRbund to detect differentially expressed genes and transcripts.

Objective
Participants in this workshop will learn fundamental concepts of RNA-Seq analysis and be able to perform a basic differential analysis at gene- and isoform-resolution using the Tuxedo Tools.

Skill Level
Those new to RNA-Seq. Familiarity with UNIX and/or R is a strong plus.

Prerequisites
Users are strongly encouraged to familiarize themselves with R prior to coming to the workshop. You will get much more out of this workshop if you are comfortable with starting and stopping an R session and performing very simple tasks such as opening and viewing files in R.

Users are also encouraged to have a look at ggplot2, which is a beautiful plotting package written in R. CummeRbund, which we will use to explore our RNA-Seq data, is built on top of ggplot2.

There is an excellent, easy-to-read book on ggplot2, and you can find sample chapters here: http://ggplot2.org/book/. Chapter 2 is available for free, and you are strongly encouraged to try running the bits of R code in chapter 2 on your own to get a feel for what exploring data with R and ggplot2 is like. The entire book is available through Amazon.com and is an extremely worthwhile read.

  January 11