|Publication Type||Journal Article|
|Year of Publication||2020|
|Authors||Arbab, M, Shen, MW, Mok, B, Wilson, C, Matuszek, Ż, Cassa, CA, Liu, DR|
|Date Published||2020 Jun 09|
Although base editors are widely used to install targeted point mutations, the factors that determine base editing outcomes are not well understood. We characterized sequence-activity relationships of 11 cytosine and adenine base editors (CBEs and ABEs) on 38,538 genomically integrated targets in mammalian cells and used the resulting outcomes to train BE-Hive, a machine learning model that accurately predicts base editing genotypic outcomes (R ≈ 0.9) and efficiency (R ≈ 0.7). We corrected 3,388 disease-associated SNVs with ≥90% precision, including 675 alleles with bystander nucleotides that BE-Hive correctly predicted would not be edited. We discovered determinants of previously unpredictable C-to-G, or C-to-A editing and used these discoveries to correct coding sequences of 174 pathogenic transversion SNVs with ≥90% precision. Finally, we used insights from BE-Hive to engineer novel CBE variants that modulate editing outcomes. These discoveries illuminate base editing, enable editing at previously intractable targets, and provide new base editors with improved editing capabilities.