2018 Workshops

Broad Cancer Research Data Resources & Tools 101
This introductory symposium offers any interested scientist an overview of the tools, datasets, and resources available for cancer research at the Broad, including CCLE, Achilles, FireCloud, Copy Number Portal, GTEx, CLUE/CMap, GenePattern, CellProfiler, and more. Currently, these topics are only systematically discussed together in the ‘deep-dive’ format of the two-week-long annual Cancer Program Postdoc BootCamp.

Therefore, this BroadE provides an exciting opportunity for people to learn more about how and why these resources and datasets may be applicable to their own research. The session may also provide introductory information about how to access or get assistance using these resources. Attendees do not need computational experience. 

 

 
December 10
Scale with Hail 0.2: A hands-on tutorial for genomic analysis
This hands-on Hail 0.2 tutorial will be led by Jon Bloom and Tim Poterba of the Hail team, based in the Neale lab in the Stanley Center.
 
Hail is an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data. Similar to the R or Python scientific computing stacks, Hail supports data frame queries, statistics, linear algebra, and plotting, both interactively and with scripts. Unlike these stacks, Hail:
  • scales from laptop to large compute cluster or cloud, with the same code
  • is designed to work with datasets that do not fit in memory
  • has first-class support for multi-dimensional structured data, like genomic data
At Broad, Hail is the analytical engine behind dozens of studies, the Genome Aggregation Database (gnomad.broadinstitute.org), and the Neale lab mega-GWAS (nealelab.is/uk-biobank). Beyond Broad, Hail is used by academia and industry on data ranging from mouse models to GTEx.
 
Target audience: Scientists analyzing genomic variant datasets and their relationship to phenotypes, gene expression, or other data. Participants should bring a fully charged laptop but need not pre-install any software; they will use Hail on the cloud through a browser.
 
November 30
Integrative Genomic Analysis with GenePattern
GenePattern, www.genepattern.org, enables researchers at all levels of computational expertise to use hundreds of tools for the analysis of gene expression, sequence variation, proteomics, and more, through an intuitive interface that requires no coding.  
 
In this hands-on workshop, participants will learn how to:
  • Analyze and visualize gene expression (including RNA-seq) and other genomic data
  • Identify GenePattern analyses relevant to their scientific objectives
  • Ensure that their analyses are reproducible
  • Create and publish research narratives that serve as a live, executable, sharable representation of a study
 
June 12
The Genotype-Tissue Expression Project: Data, Resources, and Interpretation
The goal of the Genotype-Tissue Expression (GTEx) Project is to understand how genetic variation contributes to the regulation of gene expression among individuals, tissues, and cell types. This project provides an unprecedented breadth of transcriptomic (RNA-seq) and genetic (WGS) data from >50 non-diseased human tissues across ~960 donors.
 
This interactive workshop will showcase the GTEx data resource, and demonstrate how to access, interpret, and visualize the various types of data produced, using downloadable files and the GTEx Portal. Given the complexity of interpreting genetic associations with gene expression, we will review mapping of expression quantitative trait loci (eQTLs) and identification of a high-confidence set of causal variants at a given locus.
 
The tutorial will cover several common data interpretation examples, including using eQTLs to interpret genetic associations with complex traits, and will showcase how to query specific subsets of the available data types. We will also discuss best practices and caveats for using GTEx data.
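
As a rough illustration of what mapping an eQTL means, the sketch below regresses one gene's expression on genotype dosage at one variant using ordinary least squares. It is a toy example, not the GTEx pipeline: the function name fit_eqtl and all sample values are hypothetical, and real analyses also adjust for covariates and test significance across many variant-gene pairs.

```python
# Toy single-variant eQTL fit: expression ~ intercept + slope * dosage.
# The function name and every data value here are made up for illustration.

def fit_eqtl(dosages, expression):
    """Ordinary least squares of expression on genotype dosage.

    dosages: number of alternate alleles per sample (0, 1, or 2).
    expression: normalized expression of one gene in the same samples.
    Returns (slope, intercept, r_squared).
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least 3 samples")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    slope = sxy / sxx                      # expression change per allele
    intercept = mean_y - slope * mean_x
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: six donors, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, intercept, r2 = fit_eqtl(dosages, expression)
print(f"effect size per allele: {slope:.3f}, r^2: {r2:.3f}")
```

The slope is the per-allele effect size that eQTL summaries typically report; a slope near zero would suggest the variant is not associated with expression of that gene in that tissue.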
 
June 4
Analysis of Biological Images with CellProfiler
CellProfiler is open-source, freely downloadable software designed for large-scale, automated analysis of biological images. Attendees will get hands-on experience with CellProfiler, followed by case studies on high-content screening and cell-type classification. At the end of the workshop there will be a breakout session where attendees receive guidance on analyzing their own image data.
 
May 16
An Introduction to Image Analysis with CellProfiler
CellProfiler is open-source, freely downloadable software designed for large-scale, automated analysis of biological images. Attendees will get a hands-on introduction to CellProfiler, followed by case studies on high-content screening and cell-type classification. At the end of the workshop there will be a breakout session where attendees will receive guidance on analyzing their own image data. If you are curious about automating the analysis of your microscopy data or want to become familiar with "what's possible", come to the workshop and see what's new in CellProfiler for 2018.
 
March 28
 
Using Morpheus for Matrix Visualization and Analysis
In this hands-on workshop, participants will learn to use Morpheus, a web-based application for matrix visualization and analysis. Participants will learn how to interact with multiple data types (e.g., gene expression and mutation in a heat-map-based view) and how to cluster, sort, and filter their data.
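
To make "filter and sort" concrete, here is a minimal sketch of the idea on a small in-memory matrix: drop low-variance rows, then order the rest by mean value. This is not Morpheus code (Morpheus is used through its web interface); the function name filter_and_sort, the gene labels, and the values are all hypothetical.

```python
from statistics import mean, pvariance

# Conceptual sketch only: row filtering and sorting on a labeled matrix.
# Names and values are invented for illustration.

def filter_and_sort(matrix, min_variance):
    """Keep rows with population variance >= min_variance,
    then sort the kept rows by descending row mean.

    matrix: dict mapping a row label to a list of numeric values.
    Returns a list of (label, values) pairs.
    """
    kept = {
        label: values
        for label, values in matrix.items()
        if pvariance(values) >= min_variance
    }
    return sorted(kept.items(), key=lambda item: mean(item[1]), reverse=True)


# Hypothetical expression matrix: three genes across four samples.
matrix = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],   # nearly flat, filtered out
    "GENE_B": [0.0, 4.0, 0.0, 4.0],
    "GENE_C": [5.0, 9.0, 5.0, 9.0],
}
result = filter_and_sort(matrix, min_variance=1.0)
for label, values in result:
    print(label, values)
```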
 
March 5
 
Proteomics Toolset for Integrative Data Analysis
The Proteomics Platform develops and applies advanced quantitative proteomics methods to a variety of biological questions. To let proteomics researchers interactively explore the acquired data matrices of quantified proteins or post-translational modifications with an integrated set of analysis tools, we have developed the Proteomics Toolset for Integrative Data Analysis (Protigy). Although Protigy was developed primarily for the Proteomics Platform, we are proud to open access to our tools to a broader audience. Protigy streamlines proteomics data analysis, provides an intuitive interface for lab researchers to analyze and explore proteomics datasets, and ensures reproducible data analysis by keeping track of workflows and parameters. Some of the features of Protigy that we may cover in the workshop include:
  • Data normalization/filtering
  • Data QC
  • Marker selection
  • Interactive visualization of results
  • Integration of protein-protein interaction databases
  • Saving analysis sessions and sharing with collaborators
  • Export of results in Excel and PDF formats

The last part of the workshop will be a hands-on session in which participants will analyze proteomics datasets provided by the instructors. Protigy has been implemented as an R/Shiny app and runs on a commercially licensed Shiny Server Professional maintained by the Proteomics Platform, which will be used for part of the workshop. So that all participants can take part in the hands-on session, participants are encouraged to bring their own laptops. We will demonstrate how to install and use Protigy on Windows/Linux/Mac computers. Protigy can be freely accessed on GitHub (https://github.com/karstenkrug/modT).

 
 
February 28
 
 
Integrative Genomic Analysis with GenePattern
GenePattern enables researchers at all levels of computational expertise to use hundreds of tools for the analysis of gene expression, sequence variation, proteomics, and more, through an intuitive interface that requires no coding.

GenePattern makes reproducible research easy: analyses can be rerun at any time with the same inputs; every version of each tool is tracked, so that a result can be reproduced even if the code that produced it changes in the future; and researchers can chain analyses together to encapsulate and share their research as reproducible workflows.

A new GenePattern Notebook environment (http://www.genepattern-notebook.org/), based on the popular Jupyter Notebook system, further allows users to interleave text, graphics, and analyses in unified "research narratives" that can be shared and published.

 
In this hands-on workshop, participants will learn how to:
  • Analyze and visualize gene expression (including RNA-seq) and other genomic data
  • Identify GenePattern analyses relevant to their scientific objectives   
  • Ensure that their analyses are reproducible
  • Create and publish research narratives that serve as a live, executable, sharable representation of a study
 
 
February 5