Using deep learning regulatory models and random DNA for evolutionary inference

Assistant Professor, School of Biomedical Engineering, UBC

Genetic variation in cis-regulatory sequences can alter gene regulation and is a major driver of phenotypic variation. Here, I will describe two recent advances in our understanding of the molecular evolution of cis-regulatory DNA, gleaned through gene regulatory “big data” and machine learning. Using yeast as a model system, we recently demonstrated that random DNA (where each base is randomly selected from the four possibilities) placed in a promoter-like context has diverse gene regulatory activity. Measuring the activity of random DNA at scale enabled us to train highly accurate machine learning models that capture gene regulation. These models demonstrated that gene expression evolution is highly dynamic and enabled us to chart the past and future course of cis-regulatory evolution. The diverse expression observed in random promoter sequences implied that regulatory activities are easy to evolve. Using a combination of experimentation and inference, we estimated how frequently gene regulatory features (e.g. transcripts and chromatin marks) occur by chance in entire chromosomes of evolutionarily naïve DNA, finding that such features are both predicted and observed to be frequent. Since gene regulatory features are expected to occur by chance in the absence of selection, many of the biochemically active sequences in genomes are unlikely to be adaptive.
