Broad Institute of MIT and Harvard

High-dimensional features such as RNA expression levels or ATAC-seq peaks are ubiquitous in modern biological datasets. Although these datasets are rich, extracting reliable insights from high-dimensional data poses significant computational and statistical challenges. In this talk I give an overview of how one approach to high-dimensional statistics, Bayesian variable selection (BVS), can be applied to problems in bioinformatics. In the first application, I illustrate how BVS can identify the genetic determinants of differential viral fitness from SARS-CoV-2 genomic surveillance data. In the second, I describe how BVS can pinpoint regulatory elements in CRISPR tiling screen data. I also give a short demo of millipede, an open-source package for BVS.
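
For readers unfamiliar with BVS, the following is a minimal sketch of the underlying idea, not the talk's method and not millipede's API. It assumes a linear model with known noise variance, an independent Bernoulli prior on whether each feature is included, and a Gaussian prior on the included coefficients. The function name, the hyperparameters (`prior_inclusion`, `tau`, `sigma2`), and the synthetic data are all illustrative assumptions. It computes posterior inclusion probabilities by enumerating every subset of features, which is only feasible for a handful of features; practical tools for high-dimensional data rely on approximate methods such as MCMC instead.

```python
import itertools
import numpy as np


def posterior_inclusion_probabilities(X, y, prior_inclusion=0.2, tau=1.0, sigma2=1.0):
    """Exact posterior inclusion probabilities by enumerating all 2^p models.

    Assumed model (illustrative): y = X_gamma @ beta + noise, noise ~ N(0, sigma2),
    beta ~ N(0, sigma2 / tau), and each feature included with prior probability
    `prior_inclusion`.
    """
    n, p = X.shape
    models, log_weights = [], []
    for gamma in itertools.product([0, 1], repeat=p):
        idx = np.flatnonzero(gamma)
        # Marginal covariance of y after integrating out the coefficients.
        cov = sigma2 * np.eye(n)
        if idx.size:
            Xg = X[:, idx]
            cov = cov + (sigma2 / tau) * (Xg @ Xg.T)
        _, logdet = np.linalg.slogdet(cov)
        quad = y @ np.linalg.solve(cov, y)
        # Gaussian log marginal likelihood, up to a constant shared by all models.
        log_marginal = -0.5 * (logdet + quad)
        k = idx.size
        log_prior = k * np.log(prior_inclusion) + (p - k) * np.log1p(-prior_inclusion)
        models.append(gamma)
        log_weights.append(log_marginal + log_prior)

    # Normalize in log space to obtain posterior model probabilities.
    log_weights = np.array(log_weights)
    weights = np.exp(log_weights - log_weights.max())
    weights /= weights.sum()
    # Inclusion probability of feature j = total posterior mass of models containing j.
    return weights @ np.array(models, dtype=float)


# Synthetic data: only features 0 and 3 truly affect the response.
rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(n)

pip = posterior_inclusion_probabilities(X, y)
for j, prob in enumerate(pip):
    print(f"feature {j}: posterior inclusion probability = {prob:.3f}")
```

The output is one probability per feature. In a genomics setting, the features might be mutations or genomic tiles and the response a fitness or screen readout, with high inclusion probabilities flagging candidate causal features.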
