2016 Workshops

Broad Cancer Research Data Resources & Tools 101: An Introductory Symposium Sponsored by the Cancer Program and BroadE This introductory symposium aims to give any interested scientist an understanding of the landscape of some of the tools, datasets, and resources available for cancer research at the Broad, including FireCloud, TCGA data, Copy Number Portal, GTEx, CMap, CCLE, Achilles, and more. Currently, these topics are only systematically discussed together in the 'deep-dive' format of the two-week-long annual Cancer Program Postdoc BootCamp. Therefore, this BroadE symposium provides an exciting opportunity for people to learn more about how and why these resources and datasets may be applicable to their own research. The session may also provide introductory information about how to access or get assistance using these resources. Attendees do not need computational experience. Additionally, subsequent BroadE workshops are being planned that will explore and demonstrate these individual tools and datasets in greater detail through hands-on training. We encourage you to register to attend this new symposium sponsored and organized by BroadE and the Broad Cancer Program!	December 14
FireCloud Workshop for Analysts FireCloud is a cloud-based, cancer genomics analysis platform with co-located TCGA data. Modeled after the Broad Institute’s Firehose analysis infrastructure, FireCloud democratizes data access and facilitates collaboration by providing a robust, scalable platform accessible to the community at large. Using the elastic compute capacity of Google Cloud, FireCloud empowers analysts, tool developers, and production managers to perform large-scale analysis, engage in data curation, and store or publish results. On FireCloud, users can upload their own analysis methods to workspaces or run the Broad Institute’s best practice tools and pipelines. FireCloud also includes tutorial workspaces as well as carefully curated open and controlled-access TCGA workspaces that users can clone. The aim of this Workshop is to introduce analysts to the FireCloud environment via hands-on tutorials.	December 5
Dynamic Work Design and Visual Management Over the last few years the Genomics Platform has adopted new approaches to operational design and management, referred to here as Dynamic Work Design and Visual Management. As a result, we have seen dramatic reductions in production cycle time and cost. Additionally and just as importantly, we have reached a new level of transparency, communication, and accountability around all aspects of our operations, ranging from the allocation of development resources, to orchestrating production activities, to the integration of GP and the Data Sciences Platform. Dynamic Work Design and Visual Management is not only applicable to operations and/or production, but also has practical applications in project management, academic research, and administration. We will present this information in four parts: (1) Background and Theory; (2) Visual Tools and Management; (3) Understanding “Value Added”; (4) The A3 Structured Problem Solving Approach. This workshop is designed to introduce the operational management methodology of dynamic work design, and to then provide several tools that can be applied across a diverse range of work environments for lab managers, project managers, and platform employees.	November 29
CellProfiler: Learn to analyze 1 image or a 1,000,000 Microscopy and image processing methods improve every year. As the capacity to acquire and analyze images continues to grow, so too does CellProfiler, an open-source, freely-downloadable software designed for large-scale, automated phenotypic image analysis. Workshop attendees will learn the fundamentals of building CellProfiler pipelines to analyze image data, and will gain knowledge of the following: Basics of image analysis. Obtaining measurements from objects. Basics of machine-learning for phenotype identification. This hands-on introduction to CellProfiler will be followed by case-studies on HCS and cell-type classification. At the end of the workshop there will be a breakout session where attendees will receive guidance on analyzing their own image data. If you are curious about automating the analysis of your microscopy data or want to become familiar with "what's possible," come to the workshop and see what's new in CellProfiler for 2016.	November 14
Quantitative Proteomics in Biology, Chemistry and Medicine The course will provide biologists, chemical biologists and clinicians with a working knowledge of the most relevant proteomic technologies and data analysis methods and will describe how these methods are being applied in a wide range of collaborative research at Broad.	November 9	Workshop Materials and Video
Best Practices for Variant Calling with the GATK Workshop attendees will gain broad insight into the rationale of the GATK Best Practices for variant discovery, as well as a solid understanding of how individual GATK tools work and how to apply them in practice. Novices to the GATK will come out of the workshop knowing enough to identify which questions they can address using GATK tools, how to get started on designing their experiment and analytical workflow, and how to run the tools on their own computer. Existing GATK users will come out with a deeper understanding of how the GATK works under the hood and how to improve their results further, especially with respect to the latest innovations. The workshop consists of an all-day lecture series and two optional hands-on tutorials on the second day. *Day 1: November 7 — Lecture* This workshop will focus on the core steps involved in calling variants with Broad's Genome Analysis Toolkit (GATK), using best practices developed by the GATK team. The GATK development team and invited guests will give talks explaining the rationale, theory, and real-world applications of the GATK Best Practices. You will learn why each step is essential to the variant-calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. Special lecture topics include somatic variant calling with MuTect2 and somatic copy number variant calling. *Day 2: November 8 — Hands-on Tutorials* One of two optional hands-on tutorials will be available to select participants. In each tutorial, the GATK team will guide participants in applying GATK tools on provided example data. The two sessions reflect different stages of research — data generation and evaluation of results. The morning session is titled Joint Variant Discovery, and participants will come out knowing how to use the GVCF workflow and how to interpret results. 
The afternoon session is titled Variant Callset Filtering and Evaluation, and participants will come out knowing how to use tools to evaluate and filter a variant callset.	November 7-8
Meta'omic Analyses for Microbial Communities Microbial ecology is one of many fields that has benefited greatly from technical advances in DNA sequencing. In particular, low-cost, culture-independent sequencing has made metagenomic and metatranscriptomic (“meta’omic”) surveys of microbial communities practical, including bacteria, archaea, viruses, and fungi associated with the human body, other hosts, and the environment. The resulting data have stimulated the development of new computational approaches to meta’omic sequence analysis, including metagenomic assembly, microbial identification, strain tracking, and gene, transcript, and pathway functional profiling. We will present a high-level introduction to computational meta’omics, highlighting the state-of-the-art in the field as well as outstanding challenges. This will include an introduction to the biological goals of typical meta’omic studies and the bioinformatic processes currently available to achieve them. Topics will include: reference genome-based community composition and functional profiling; methods for constructing new genomic references by de novo assembly; the challenges associated with precisely quantifying members of a microbial community; functional analysis of the gene families in a community; the association of gene families with their source organisms; and the combination of gene families into pathways for metabolic profiling. We will conclude with an overview of the statistical challenges inherent to analyzing the compositional data arising in meta’omic studies. Workshop attendees will gain hands-on experience with these analyses using bioBakery: a comprehensive platform for the analysis of shotgun meta’omic sequencing data.
bioBakery includes tools for fast, accurate microbial taxonomic profiling (MetaPhlAn2), organism-specific functional profiling (HUMAnN2), identification/tracking of microbial strains (StrainPhlAn/PanPhlAn), and pattern discovery in microbial communities (MaAsLin, BAnOCC, and HAllA). Interspersed with lecture content, attendees will work through meta’omic analysis tutorials in Google Cloud instances of the bioBakery virtual machine. This workshop will be run by members of the Huttenhower Lab of the Harvard T.H. Chan School of Public Health and the Broad Institute.	November 1
CRISPR in a Nutshell This workshop will consist of three modules: (1) Introduction to CRISPR-Cas9 systems and basic genome editing methodologies, including guide design, in vitro and in vivo delivery, and assays to monitor editing; (2) Guide to CRISPR-Cas9-based screening applications (both loss- and gain-of-function); (3) Overview of frontiers in genome editing, including therapeutic delivery modes, approaches to enhancing specificity, and novel RNA-guided RNA-targeting enzymes. Each module will be followed by ample time for questions to ensure that participants get detailed advice tailored to their specific needs.	October 12	Videos
CellProfiler: Learn to analyze 1 image or a 1,000,000 Microscopy and image processing methods improve every year. As the capacity to acquire and analyze images continues to grow, so too does CellProfiler, an open-source, freely-downloadable software designed for large-scale, automated phenotypic image analysis. Workshop attendees will learn the fundamentals of building CellProfiler pipelines to analyze image data, and will gain knowledge of the following: Basics of image analysis. Obtaining measurements from objects. Basics of machine-learning for phenotype identification. This hands-on introduction to CellProfiler will be followed by case-studies on HCS and cell-type classification. At the end of the workshop there will be a breakout session where attendees will receive guidance on analyzing their own image data. If you are curious about automating the analysis of your microscopy data or want to become familiar with "what's possible," come to the workshop and see what's new in CellProfiler for 2016.	September 29
FireCloud Workshop for Tool Developers FireCloud is a cancer genome analysis platform with co-located TCGA data. Modeled after the Broad Institute’s Firehose analysis infrastructure, FireCloud democratizes data access and facilitates collaboration by providing a robust, scalable platform accessible to the community at large. Using the elastic compute capacity of Google Cloud, FireCloud empowers analysts, tool developers, and production managers to perform large-scale analysis, engage in data curation, and store or publish results. Users can upload their own analysis methods to the FireCloud Method Repository or run the Broad Institute’s best practice tools and pipelines. FireCloud also includes pre-loaded workspaces with curated TCGA data and results that users can clone. The aim of this workshop is to introduce tool developers to the FireCloud environment through hands-on exercises. We will introduce core FireCloud concepts such as the workspace data model, method configs, and billing projects. In addition, we will cover the steps needed to upload tools to FireCloud, and provide an overview of Docker and Workflow Description Language (WDL). Prerequisites: Basic familiarity with bioinformatics tools and command-line interfaces. We will send out software installation instructions ahead of the workshop.	August 12	Videos
FireCloud Workshop for Firehose Users The aim of this workshop is to introduce cancer genomics and bioinformatics professionals to the FireCloud environment through interactive demos and hands-on tutorials. FireCloud is a cancer genome analysis platform built on a cloud computing environment (Google) with co-located TCGA data that was created and designed for the community of cancer researchers. FireCloud is modeled on Firehose, the cancer genome analysis platform built by the Getz lab at the Broad Institute, which supports both small groups and major projects (e.g., TCGA, GTEx). Firehose is used both by production managers for large-scale analysis and by analysts for interactive analysis, curation, and manual review of data for publication. Like Firehose, FireCloud is built around "workspaces," which have robust security and access control and which hold data, tools, workflows, and results. FireCloud will hold best-practice tools and workflows that are currently in use at the Broad. In addition, FireCloud will host pre-loaded workspaces (e.g., TCGA-THCA workspace) holding data, pipelines, and results. These workspaces can be cloned and used to co-analyze user-uploaded data. TCGA data, both protected and open access, will be co-located with FireCloud. Protected access data will require eRA Commons authorization and appropriate dbGaP permissions.	June 24
Integrative Genomics Analysis with GenePattern In a hands-on format, participants will learn to use the GenePattern platform for integrative genomics analysis. GenePattern includes an intuitive graphical user interface for users at all levels of computational sophistication; a repository of hundreds of tools for the analysis of gene expression, sequence variation, proteomics, and more; and a "pipeline" environment that allows users to chain tasks together to encapsulate and share their research as reproducible workflows. A new GenePattern Notebook environment, based on the popular Jupyter Notebook system, allows users to interleave text, graphics, and analyses into unified "research narratives" that can be shared and published. In this workshop, participants will learn how to: navigate the GenePattern environment; analyze and visualize gene expression (including RNA-seq) and other genomic data; create reproducible pipelines of their research; and create research narratives with the GenePattern Notebook environment.	June 16	Workshop Materials
ChIP-seq: From Lab Process to Integrative Analysis This will be a whirlwind tour of the ChIP-seq method from some Broad experts. It will cover the fundamentals of why we do ChIP, how we do ChIP, and how we interpret the data from ChIP-seq at both a primary level (getting to tracks we can display in a genome browser) and an integrative level. Finally, it will touch upon applications of ChIP-seq in research, illustrating the power of the method to reveal fundamental processes and mechanisms, in a way that is complementary to insights from genetic and transcriptomic analyses. Along the way we will highlight best practices and share pointers to relevant resources, including web sites that describe laboratory methods, antibody validation resources, and public ChIP-seq data repositories. The intended audience includes everyone from curious people to post-docs who are interested in incorporating ChIP-seq methods in their research.	May 25
FireCloud Workshop for Firehose Users The aim of this workshop is to introduce Firehose users to the FireCloud environment via an introduction and a hands-on tutorial. FireCloud is a cancer genome analysis platform built on a cloud computing environment (Google) with co-located TCGA data that was created and designed for the community of cancer researchers. FireCloud is modeled on Firehose, the cancer genome analysis platform built by the Getz lab at the Broad Institute, which supports both small groups and major projects (e.g., TCGA, GTEx). Firehose is used both by production managers for large-scale analysis and by analysts for interactive analysis, curation, and manual review of data for publication. Like Firehose, FireCloud is built around "workspaces," which have robust security and access control, and which hold data, tools, workflows, and results. FireCloud will hold best-practice tools and workflows that are currently in use at the Broad. In addition, FireCloud will host pre-loaded workspaces (e.g., TCGA-THCA workspace) holding data, pipelines, and results. These workspaces can be cloned and used to co-analyze user-uploaded data. TCGA data, both protected and open access, will be co-located with FireCloud. Protected access data will require eRA Commons authorization and appropriate dbGaP permissions.	May 2
Statistical Genetics 201 This workshop teaches extensions in statistical genetics to interpret and leverage the results of genome-wide association studies. This course is aimed at individuals who are already familiar with the basics of what a genome-wide association study (GWAS) is and how it is performed, and are interested in learning additional techniques to understand the genetic drivers of a trait or disease. Specific topics that will be covered are: fine mapping to identify candidate causal variants in GWAS loci; LD Score regression to distinguish confounding from polygenicity, and to estimate heritability and genetic correlation between traits; identity-by-descent mapping as an alternative to single-variant GWAS; and pathway analysis to detect enrichment of association signals in biologically relevant gene sets. The workshop will combine lectures introducing each of these topics with demonstrations and hands-on practicals, in particular for performing LD Score regression and pathway analysis. Tutorial datasets will be provided for attendees to try out demonstrated tools. At the end of the workshop, participants should have an understanding of the different approaches that are available to interpret genetic association study results.	April 6
Drop-seq: A Practical Tutorial Drop-seq is a new technology that enables preparation of thousands of single-cell gene expression profiles in affordable, facile experiments. In this workshop, we will provide a practical guide to designing, performing, and analyzing Drop-seq experiments.	April 1
Fragment-based Drug Discovery 101 Over the last two decades, significant technical advances have been made that enable the efficient discovery and optimization of drugs for newly identified and challenging targets. In line with this, biophysical methods are critical for the identification of compounds via fragment-based drug discovery (FBDD) and are also valuable for the "on-target" validation and optimization of compounds discovered by other methods early in the discovery process. This seminar will highlight key concepts that are relevant for the implementation of Fragment Based Drug Discovery and Structure-Based Drug Design (SBD). Major topics include: key factors that motivate FBDD and SBD; physicochemical properties and chemical space; NMR methods for screening and structure-based design; other biophysical methods for screening and structure-based design; and examples of success in the clinic.	March 4	Workshop Materials and Video
FireCloud Workshop for Firehose Users The aim of this workshop is to introduce Firehose users to the FireCloud environment via an introduction and a hands-on tutorial. FireCloud is a cancer genome analysis platform built on a cloud computing environment (Google) with co-located TCGA data that was created and designed for the community of cancer researchers. FireCloud is modeled on Firehose, the cancer genome analysis platform built by the Getz lab at the Broad Institute, which supports both small groups and major projects (e.g., TCGA, GTEx). Firehose is used both by production managers for large-scale analysis and by analysts for interactive analysis, curation, and manual review of data for publication. Like Firehose, FireCloud is built around "workspaces," which have robust security and access control, and which hold data, tools, workflows, and results. FireCloud will hold best-practice tools and workflows that are currently in use at the Broad. In addition, FireCloud will host pre-loaded workspaces (e.g., TCGA-THCA workspace) holding data, pipelines, and results. These workspaces can be cloned and used to co-analyze user-uploaded data. TCGA data, both protected and open access, will be co-located with FireCloud. Protected access data will require eRA Commons authorization and appropriate dbGaP permissions.	January 12