MIA Talks

The science of information: Case studies from DNA and RNA assembly

April 27, 2016
Dept. of Electrical Engineering, Stanford University; Dept. of Electrical Engineering and Computer Sciences, UC Berkeley

Claude Shannon invented information theory in 1948 to study the fundamental limits of communication. The theory not only establishes the baseline against which all communication schemes are judged but also inspires the design of schemes that are simultaneously information-optimal and computationally efficient. In this talk, we discuss how this point of view can be applied to the problems of de novo DNA and RNA assembly from shotgun sequencing data. We establish information limits for these problems and show how efficient assembly algorithms can be designed to attain them, even though combinatorial optimization formulations of these problems are NP-hard. We discuss Shannon, a de novo RNA-seq assembler designed on these principles, and compare its performance against state-of-the-art assemblers on several datasets.