Kate Sanders, a junior computer science major at Hendrix College, created two programs to help researchers find information about potentially pathogenic gene variants.
Working at the Broad has been a dream of mine for years, but I never imagined I would be able to come here as an undergraduate. This experience has reshaped my view of science and renewed my sense of wonder with the world. It was refreshing to interact with many brilliant, passionate, and humble scientists. Though I am leaving the Broad physically, Broadie values will stay close to my heart.
Kidney disease is the 9th leading cause of death in the United States, primarily due to late diagnosis. Determining causal variants for kidney disease could lead to reduced patient mortality through earlier diagnosis and personalized treatments. However, clicking through web browsers to find information about variants is time-consuming and error-prone. My goal was to create a user-friendly program to gather, filter, annotate, and sort gene variant information related to kidney disease. I took two approaches to account for both Mendelian and complex disease variants.
We first took a Mendelian approach to look for likely pathogenic variants in MUC1, a gene associated with Medullary Cystic Kidney Disease. After inputting a phenotype, the program returns files containing information about diseases and genes related to that phenotype. If a researcher is interested in looking at a specific gene’s variants, they can download a CSV of variants from gnomAD. Using functions in the Mendelian Pipeline, the researcher can remove unwanted variants, add variant annotations, and sort the variants by the number of pathogenicity score thresholds each one passes.
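The filter, annotate, and sort steps can be illustrated with a minimal Python sketch. It is not the Mendelian Pipeline's actual code: the column names (variant_id, allele_freq, cadd, revel), the choice of scores, the cutoffs, and the toy rows are all assumptions made for illustration, and a real gnomAD export uses different headers.

```python
import csv
import io

# Hypothetical cutoffs: the scores and thresholds used by the original
# pipeline are not given in the text, so these are illustrative only.
THRESHOLDS = {"cadd": 20.0, "revel": 0.5}
MAX_ALLELE_FREQ = 0.001  # assumed rarity cutoff for "unwanted" common variants


def scores_passed(row):
    """Count how many scores in THRESHOLDS meet or exceed their cutoff."""
    count = 0
    for name, cutoff in THRESHOLDS.items():
        value = row.get(name, "")
        # A missing score is treated as not passing.
        if value != "" and float(value) >= cutoff:
            count += 1
    return count


def rank_variants(csv_text):
    """Filter out common variants, annotate each row, and sort by scores passed."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = [r for r in rows if float(r["allele_freq"]) <= MAX_ALLELE_FREQ]
    for r in kept:
        r["scores_passed"] = scores_passed(r)  # annotation step
    kept.sort(key=lambda r: r["scores_passed"], reverse=True)
    return kept


# Made-up rows standing in for a downloaded variant table.
EXAMPLE_CSV = """variant_id,allele_freq,cadd,revel
1-100-A-G,0.00002,25.1,0.71
1-200-C-T,0.25,30.0,0.90
1-300-G-A,0.0005,12.3,
1-400-T-C,0.0001,22.0,0.10
"""

ranked = rank_variants(EXAMPLE_CSV)
for r in ranked:
    print(r["variant_id"], r["scores_passed"])
```

In this toy input the common variant is dropped and the remaining three are ordered by how many cutoffs they pass, so the variant passing both scores comes first.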
To look at complex disease variants, I incorporated three sources of information: GWAS hits to associate variants with complex traits, cis-eQTLs to show a variant’s effect on gene expression in kidney tissues, and linkage disequilibrium for fine-mapping. This framework is useful because there are currently no web browsers for visualizing all of these data sources together for kidney diseases.
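One way to picture how the three sources fit together is the Python sketch below. It is a simple LD-proxy join rather than statistical fine-mapping, and it is not the Complex Disease Framework itself: the data structures, the r-squared cutoff of 0.8, and every variant, trait, and gene name are hypothetical placeholders.

```python
from collections import defaultdict

R2_CUTOFF = 0.8  # assumed threshold for calling two variants "in LD"


def link_sources(gwas_hits, eqtls, ld_pairs, r2_cutoff=R2_CUTOFF):
    """For each GWAS lead variant, list eQTL genes of the lead and its LD proxies.

    gwas_hits: dict mapping lead variant -> associated trait
    eqtls:     dict mapping variant -> list of genes whose expression it affects
    ld_pairs:  iterable of (variant_a, variant_b, r2) tuples
    """
    # Build a symmetric LD lookup table.
    ld = defaultdict(dict)
    for a, b, r2 in ld_pairs:
        ld[a][b] = r2
        ld[b][a] = r2

    results = []
    for lead, trait in gwas_hits.items():
        # The lead variant is trivially in perfect LD with itself.
        linked = {lead: 1.0}
        linked.update({v: r2 for v, r2 in ld[lead].items() if r2 >= r2_cutoff})
        for variant, r2 in sorted(linked.items()):
            for gene in eqtls.get(variant, []):
                results.append((trait, lead, variant, r2, gene))
    return results


# Placeholder inputs; none of these names refer to real variants or genes.
gwas_hits = {"var_lead": "trait_X"}
eqtls = {"var_proxy": ["GENE_A"], "var_far": ["GENE_B"]}
ld_pairs = [("var_lead", "var_proxy", 0.92), ("var_lead", "var_far", 0.40)]

links = link_sources(gwas_hits, eqtls, ld_pairs)
for row in links:
    print(row)
```

Here only the proxy in strong LD with the lead variant contributes a gene, while the weakly linked variant is ignored even though it is an eQTL.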
These tools will enable researchers to discover interesting variants more effectively. Little user input is needed for the programs to run successfully. Additionally, only slight modifications would be needed to apply both programs to diseases in other organs. Currently, I am working on developing R packages for distributing the Mendelian Pipeline and Complex Disease Framework.
Project: Characterizing kidney disease gene variants: a programmatic approach
Mentors: Jamie Marshall and Qingbo Wang, Medical and Population Genetics