Introduction to Deep Mutational Scanning collaborations with the GPP
In this virtual workshop, we will provide an overview of Deep Mutational Scanning pooled screens and discuss the questions for which these assays are particularly well suited. We will also cover common experiment types for which Deep Mutational Scanning would be a poor choice. We will walk through CDK6 deep mutational scanning experiments as an example, discuss the process of collaborating with the Genetic Perturbation Platform (GPP) on these experiments, and touch on the alternative sub-saturating mutagenesis techniques available. |
October 1
|
||
ENCODE Data Utilization Workshop
The ENCODE project is building a comprehensive parts list of functional elements in the human genome and mapping the regulatory mechanisms that control gene expression. ENCODE consists of ~15,000 curated datasets and continues to grow. This workshop is divided into two virtual half-days. Please take note of the descriptions and registration instructions below.
May 20th, Lectures: In this virtual half-day workshop, you’ll learn how to search, analyze, and visualize ENCODE data. First, leaders from the NHGRI will give an overview of the types of datasets available on the ENCODE portal. Next, our partners at the ENCODE Data Coordination Center (DCC) at Stanford and the Data Analysis Center (DAC) at UMass will teach you how to access and work with data via both the ENCODE portal and SCREEN. These panelists will be available to answer your questions and help troubleshoot as needed. This workshop is open to all Broad members, affiliates, and ENCODE consortium members, and will be run in a webinar format.
May 21st, Hands-on workshop: We are also offering a second half-day of virtual hands-on instruction. It will include hands-on demos from Noam from Broad Epigenomics and the DSP Terra team on how to access and analyze ENCODE data in the cloud, via both the ENCODE Data Coordination Center (DCC)/AWS and Terra/Google Cloud. Finally, you’ll learn from Neva in the Aiden lab how to identify and visualize 3D genome interactions using ENCODE data in Juicebox. Participation in this hands-on portion is limited to 30 attendees to facilitate interaction in virtual breakout rooms and maximize one-on-one attention between instructors and attendees. No prior experience is necessary, as there will be demos and experts on hand to teach at different levels. |
May 20-21
|
||
Scalable Genomic Analysis using Hail
Hail is an open-source library that provides accessible interfaces for exploring genomic data, with a backend that automatically scales to take advantage of large compute clusters. Hail enables those without expertise in parallel computing to flexibly, efficiently, and interactively analyze large sequencing datasets. Hail is the analytical engine behind projects such as the Genome Aggregation Database, the UK Biobank mega-GWAS, eQTLs in GTEx, TOPMed, the Psychiatric Genomics Consortium, and the Centers for Mendelian Genomics. This workshop provides an introduction to Hail through hands-on exploration and analysis of public 1000 Genomes data. Following a brief conceptual overview, participants will be guided through a hands-on tutorial with interactive exercises. The workshop covers some of the most common use cases: general-purpose data exploration; variant and sample quality control; common variant association; and rare variant burden tests. By the end, participants will be ready to begin using Hail to answer their own scientific questions. |
March 5
|
||
Introduction to Machine Learning on Biomedical Data
The course will begin with a very brief overview of the mathematical foundations of ML, specifically linear regression, logistic regression, and multilayer perceptrons. Applying these models to MNIST, the well-studied dataset of handwritten digits, we’ll gain first-hand experience with model selection, training, and validation. We will then introduce several abstractions from the ML4CVD codebase (TensorMaps, Model Factories, and Tensorization) which accelerate and simplify the process of preparing data for ML, building models, and evaluating them. Then we will examine several real-world biomedical datasets of various sizes, structures, and quality. Specifically, from the Allen Brain Atlas we will load high-dimensional brain MRI data linked with gene expression microarrays; from Qure.ai we will use the CQ500 dataset of 500 CT scans containing 193K slices linked with medical reports from 3 senior radiologists; and from the Erowid website we will use raw natural-language testimonials of drug experiences linked with basic demographics. These data will present many new challenges, such as noisy labels, small sample sizes, confounding by indication, and batch effects. These issues are less prevalent in typical ML data, like MNIST or ImageNet, but very common in biomedical data. After visually and statistically exploring the datasets, we will set up several ML problems using them as raw data. Lastly, we will explore model interpretation by visualizing saliency maps, class activation maps, and training adversarial examples. The course will conclude with a general discussion on framing biological questions as machine learning problems and an opportunity for participants to brainstorm how ML might apply to their own datasets. |
February 28
|
||
PyMOL is molecular visualization software for viewing proteins, DNA, and RNA in 3D. It allows users to view and analyze their system of interest and to create high-quality images and movies of their work. This half-day workshop, designed for new and existing users, will be run by a team from Schrödinger, the developers of PyMOL.
Who should attend
How to get PyMOL at the Broad? |
February 11
|
||
From Missense Variants to Protein Sequence and Structure |
January 24
|