Reading the rules of gene regulation from the human noncoding genome
Abstract: Functional genomics approaches that better model genotype-phenotype relationships have important applications in understanding genomic function and improving human health. In particular, thousands of noncoding loci associated with diseases and physical traits lack a mechanistic explanation. I'll present a machine-learning system that predicts cell type-specific epigenetic and transcriptional profiles in large mammalian genomes from DNA sequence alone. Using convolutional neural networks, this system identifies promoters and distal regulatory elements and synthesizes their content to make effective gene expression predictions. I'll show that the model's predictions for the influence of genomic variants on gene expression align well with causal variants underlying eQTLs in human populations and can be useful for generating mechanistic hypotheses to enable fine-mapping of GWAS loci.
Broad Data Sciences Platform
Primer: Classifying genomic sequences with convolutional neural networks
Abstract: Initially developed for image processing, Convolutional Neural Networks (CNNs) have been applied to genomic data with promising results. This primer will trace some of the history of neural networks with an eye towards the practical lessons learnt along the way. Then, building on the idea of the Position Weight Matrix as a motif detector, we will explore exactly what convolution means when applied to a DNA sequence. While drawing examples from computer vision and natural language processing, our focus will be on the application of CNNs to genomic data. Lastly, we will cover recent advances in CNNs, including residual connections and dilated convolutions.
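
To make the Position Weight Matrix analogy concrete, here is a minimal sketch (not the primer's own material) of scanning a one-hot encoded DNA sequence with a single filter, which is what one convolutional filter computes before any nonlinearity. The TATA-like weights, the example sequence, and the helper names `one_hot` and `scan` are illustrative assumptions. As in most deep-learning libraries, the operation is strictly a cross-correlation, since the filter is not flipped. The optional `dilation` argument spaces the filter rows apart, which is the idea behind dilated convolutions.

```python
import numpy as np

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; bases outside ACGT become all-zero rows."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    x = np.zeros((len(seq), 4), dtype=float)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            x[pos, index[base]] = 1.0
    return x


def scan(x, pwm, dilation=1):
    """Slide a (width, 4) filter along x and return one score per valid start position."""
    width = pwm.shape[0]
    span = (width - 1) * dilation + 1  # sequence positions covered by the filter
    n_out = x.shape[0] - span + 1
    if n_out <= 0:
        return np.zeros(0)
    scores = np.empty(n_out)
    for start in range(n_out):
        window = x[start:start + span:dilation]  # every `dilation`-th position
        scores[start] = np.sum(window * pwm)
    return scores


# Hypothetical filter: +1 for the matching base of "TATA", -1 otherwise (illustrative weights only).
pwm = np.full((4, 4), -1.0)
for pos, base in enumerate("TATA"):
    pwm[pos, ALPHABET.index(base)] = 1.0

seq = "GGTATAGC"
scores = scan(one_hot(seq), pwm)
best = int(np.argmax(scores))
print(best, scores[best])  # the motif starts at index 2 in this toy sequence
```

In a trained CNN the filter weights are learned rather than specified, and many such filters run in parallel; the scan itself is the same.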