SPLASH unifies genomic analysis and discovery through a paradigm shift to statistics-first

Stanford University

Myriad mechanisms diversify the sequence content of DNA and of RNA transcripts. Currently, these events are detected using tools that require, as a first step, alignment to a necessarily incomplete reference genome; this incompleteness is especially prominent in human genetic diseases such as cancer, in the microbial world, and in non-model organisms, where it severely limits the speed and scope of discovery. Second, the next step in analysis requires a custom choice of bioinformatic procedure: for example, to detect splicing, RNA editing, or V(D)J recombination, among many others. I will present the theory for why SPLASH, a new statistics-first analytic approach, captures myriad forms of genome regulation, without a reference or sample metadata, by performing statistical inference directly on raw sequencing reads. By design, SPLASH as an algorithm is highly efficient. Thanks to joint work with Professor Sebastian Deorowicz’s group, SPLASH is now implemented so that it is efficient and simple to run. A snapshot of its findings includes new insights into RNA splicing, cancer transcriptomes, single-cell RNA editing, and mobile genetic elements, as well as the discovery of new genes in non-model organisms.
