Identifying the functional effects of noncoding variants is a major challenge in human genetics. I will discuss our deep learning–based algorithmic framework, DeepSEA (http://deepsea.princeton.edu/), which predicts noncoding-variant effects de novo from genomic sequence. DeepSEA directly learns a regulatory sequence code from large-scale chromatin-profiling data, enabling prediction of the chromatin effects of sequence alterations with single-nucleotide sensitivity. We further used this capability to improve prioritization of functional variants and to predict tissue-specific expression from genomic sequence alone.
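To make the idea of sequence-based variant scoring concrete, here is a minimal sketch of in silico mutagenesis: score the reference sequence, score the same sequence with one base changed, and take the difference. This is not DeepSEA's code or API. The scoring function `toy_model`, the motif weights `TOY_MOTIF`, the `bias` value, and the example sequence are all hypothetical stand-ins for a trained network and real data.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def toy_model(encoded, motif_weights, bias=6.0):
    """Hypothetical stand-in for a trained network.

    Slides a small weight matrix along the sequence, keeps the best match,
    and squashes (best - bias) to a probability-like value in (0, 1).
    """
    width = len(motif_weights)
    best = max(
        sum(encoded[start + i][j] * motif_weights[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(encoded) - width + 1)
    )
    return 1.0 / (1.0 + math.exp(-(best - bias)))

def variant_effect(ref_seq, pos, alt_base, motif_weights):
    """Return score(alt) - score(ref) for a single-nucleotide substitution."""
    if alt_base not in BASES:
        raise ValueError("alt_base must be one of A, C, G, T")
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    p_ref = toy_model(one_hot(ref_seq), motif_weights)
    p_alt = toy_model(one_hot(alt_seq), motif_weights)
    return p_alt - p_ref

# Assumed toy weights: +2 for a base matching the motif "GATA", -1 otherwise.
TOY_MOTIF = [[2.0 if b == m else -1.0 for b in BASES] for m in "GATA"]

# Assumed example sequence; the motif occupies 0-based positions 4 to 7.
REF = "TTCAGATAAGGC"

if __name__ == "__main__":
    # Substituting the A at position 5 disrupts the motif, so the score drops.
    print(variant_effect(REF, 5, "C", TOY_MOTIF))
```

In this sketch a negative value means the substitution lowers the predicted signal. A real model would output many chromatin features per sequence, and the same reference-versus-alternate comparison would be applied to each.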
I will then discuss our work on building tissue-specific networks (http://hb.flatironinstitute.org/) to understand cell- and tissue-specific gene function and regulation, and on applying these networks to the study of autism spectrum disorder (ASD). ASD is a complex neurodevelopmental disorder with a strong genetic basis. Yet only a small fraction of potentially causal genes (about 65 of an estimated several hundred) are supported by strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to produce a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence.