MIA Talks

Viral diagnostic design with model-based optimization

February 24, 2021
Sabeti Lab, Broad Institute

In recent years, technological developments have enhanced viral detection, spawning new tools for diagnostics and surveillance. Yet there has been limited progress in using computational design to improve these tools. Designing assays from viral genomic data is still done largely by hand, without well-defined objectives and with a great deal of trial and error. In this talk, we discuss an approach that combines a deep learning model with combinatorial optimization to design viral diagnostics. Concentrating on CRISPR-based diagnostics, we screen 19,000 guide-target pairs and train a deep neural network that predicts the diagnostic signal of a prospective design better than other techniques do. We use this model within a genome-wide optimization algorithm to construct assays that have maximal sensitivity, in expectation over a virus's full genomic variation, and stringent specificity. We also discuss ADAPT, a design system that implements our approach while automatically leveraging the latest public viral genomic data. We show that ADAPT rapidly designs optimal CRISPR-based diagnostic assays for the 1,933 vertebrate-infecting viral species, providing a proactive resource of broadly effective viral diagnostics. ADAPT's designs are sensitive and specific down to the lineage level when tested against extensive viral variation, including viral taxa that pose challenges of diversity and specificity, and they exhibit significantly higher fluorescence and lower limits of detection than designs from standard techniques. Finally, we describe related work on CATCH, an algorithm for designing efficient assays that simultaneously enrich all known whole-genome variation across hundreds of viral species, which permits more sensitive viral metagenomics on patient and environmental samples.
The work in this talk integrates locality-sensitive hashing, machine learning, submodular maximization, and a novel trie-based data structure for enforcing high taxon-specificity.
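
As a rough illustration of the submodular maximization mentioned above, the sketch below greedily picks a small set of guides to maximize expected predicted activity over a collection of genomes. It is a toy under stated assumptions and not ADAPT's implementation: the function names, the guide names, and the activity table are all hypothetical, and the objective (the mean over genomes of the best predicted activity among the chosen guides) is a standard monotone submodular function chosen for illustration.

```python
# Toy sketch: greedy maximization of a monotone submodular objective.
# All names and numbers here are hypothetical, not taken from ADAPT.

def expected_activity(chosen, activity):
    """Mean over genomes of the best predicted activity among chosen guides."""
    if not chosen:
        return 0.0
    n_genomes = len(next(iter(activity.values())))
    total = sum(max(activity[g][j] for g in chosen) for j in range(n_genomes))
    return total / n_genomes


def greedy_select(activity, k):
    """Pick up to k guides, each time adding the one with the largest gain."""
    chosen = []
    for _ in range(k):
        current = expected_activity(chosen, activity)
        best_guide, best_gain = None, 0.0
        for guide in activity:
            if guide in chosen:
                continue
            gain = expected_activity(chosen + [guide], activity) - current
            if gain > best_gain:
                best_guide, best_gain = guide, gain
        if best_guide is None:  # no remaining guide improves the objective
            break
        chosen.append(best_guide)
    return chosen


# Hypothetical predicted activity of each guide against four genomes.
toy = {
    "guide_A": [0.9, 0.8, 0.0, 0.0],
    "guide_B": [0.0, 0.0, 0.9, 0.7],
    "guide_C": [0.5, 0.5, 0.5, 0.5],
}
picked = greedy_select(toy, 2)
print(picked, expected_activity(picked, toy))
```

On this toy input the greedy choice is guide_C followed by guide_A, with an objective of about 0.675, while the best pair (guide_A and guide_B) reaches 0.825. That gap is expected: for monotone submodular objectives under a cardinality constraint, the greedy algorithm guarantees only a (1 - 1/e) fraction of the optimum, in exchange for evaluating far fewer candidate sets than exhaustive search.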