Computational tools for deciphering the RNA structural code

Goodarzi Lab, University of California San Francisco; Vector Institute

Goodarzi Lab, University of California San Francisco

We present two approaches, pyTEISER and Pythia, both inspired by context-free grammars (CFGs), that better capture structural RNA elements and their underlying regulons from transcriptomic measurements. pyTEISER scans a large sampled population of structural elements (modeled as CFGs) against the transcriptome and uses a high-information criterion to identify likely functional CFGs. A more exhaustive local search is then used to identify the best representation of the structural cis-regulatory elements that underlie the observed transcriptomic modulations (e.g., changes in RNA stability or processing). Pythia reimagines this concept by modeling context-free grammar rules as fixed-dilation convolutional layers in neural network architectures. This enables a neural network model to build informative context-free grammars from scratch, as opposed to scoring pre-existing ones. We show that Pythia passes this structural representation of RNA to a neural network capable of learning RNA-binding protein preferences with high accuracy and precision. Together, these frameworks allow us to reveal and interrogate the fundamental contribution of RNA structural elements to post-transcriptional regulatory programs in health and disease.
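To make the scan-and-score idea concrete, here is a minimal sketch. It is not pyTEISER's actual code or API. It assumes that the "information criterion" can be illustrated with mutual information between the presence of a structural element and a discretized expression bin, and it stands in a single fixed stem-loop pattern for a sampled CFG. The function names `has_stem_loop` and `mutual_information`, the pairing table, and the toy sequences and bins are all hypothetical.

```python
import math
from collections import Counter

# Watson-Crick and G-U wobble pairs (a common convention, assumed here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def has_stem_loop(seq, stem_len=3, loop_len=4):
    """Return True if seq contains stem_len nested pairs enclosing loop_len bases."""
    span = 2 * stem_len + loop_len
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        # Pair the k-th base of the window with the k-th base from its end.
        if all((window[k], window[span - 1 - k]) in PAIRS for k in range(stem_len)):
            return True
    return False


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length lists of discrete labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Toy data (made up): one sequence and one expression bin (0 = down, 1 = up) per transcript.
seqs = ["GGGAAAACCC", "GCGUUUUCGC", "AAAAAAAAAA", "CCCCCCCCCC"]
bins = [1, 1, 0, 0]

presence = [has_stem_loop(s) for s in seqs]
score = mutual_information(presence, bins)
print(presence, score)
```

In this toy setting, an element whose presence tracks the expression bins closely gets a higher score than one distributed independently of them, which is the intuition behind ranking candidate elements before any finer local search.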
