Tofunmi Sodeinde
Tofunmi Sodeinde, a junior majoring in Mathematics and Scientific Computation & Data Sciences at the University of Texas at Austin, created a pipeline to serve drug side-effect information to the Biomedical Data Translator Project.
I came to the Broad not knowing what I was going to do for the next nine weeks. This was my first research experience as an undergraduate student, so I had no expectations. Once the program started, I truly appreciated every opportunity that came my way. Not only did I enjoy my research project and learn new computational skills and software, but I also learned what I am capable of and how hard I can push myself to achieve my goals. Along with my newfound research skills, I learned how to communicate scientific information effectively, what options I have after my undergraduate career, and whether I want to pursue a Ph.D. and/or MD. I also had the pleasure of being around a diverse group of students who are passionate about their respective disciplines and always pushed me to do my best. During the program, I told a friend of mine about my summer in Boston, and he jokingly called the Broad Summer Research Program (BSRP) ‘the ultimate summer program’. Now that it has ended, I can honestly say that it was the ultimate summer program.
As a result of scientific advancements in biomedical research and clinical interactions, there is a large amount of public experimental and observational data. These data are useful for understanding diseases and developing treatments for them. However, these data are spread out and there are few community standard formats for most data types, so there is a high barrier to making them interoperable. Consequently, the National Center for Advancing Translational Sciences (NCATS) developed the Biomedical Data Translator Project, which aims to integrate multiple types of existing data sources and create a one-stop hub where anyone, including medical professionals and biomedical researchers, can access this information. The Translator will receive input from eight knowledge providers that supply biomedical data, including clinical, genetic, and molecular data.
My research focused on the Molecular Data Provider (MolePro), which serves chemical biology data for the Translator. At first, there was no way to integrate drug side-effect information into the Translator, so I created a new pipeline for MolePro that stores and serves this information. First, I reorganized the drug side-effect information from SIDER, a side-effect resource that contains a database of drugs, their side effects, and the frequency of these side effects. This information was then stored in a newly created database using SQLite Studio and Python programming. Next, I wrapped these data in a new mechanism called a transformer, which converts the data into one standard format so that they are interpretable by MolePro. With this new transformer, Translator users will be able to query drug side-effect information.
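The store-then-transform flow described above can be sketched roughly as follows. This is a minimal illustration only, not the actual MolePro code: the table layout, the function names `build_database` and `transform`, the output dictionary shape, and the sample rows are all hypothetical placeholders rather than real SIDER records or the real MolePro format.

```python
import sqlite3

# Placeholder rows (NOT real SIDER records): (drug, side effect, frequency)
SAMPLE_ROWS = [
    ("drug_a", "side_effect_x", 0.10),
    ("drug_a", "side_effect_y", 0.02),
    ("drug_b", "side_effect_x", 0.05),
]


def build_database(rows):
    """Store drug side-effect rows in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (drug, side effect) pair.
    conn.execute(
        "CREATE TABLE side_effect ("
        "drug TEXT NOT NULL, effect TEXT NOT NULL, frequency REAL)"
    )
    conn.executemany("INSERT INTO side_effect VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def transform(conn, drug):
    """Look up one drug and return its side effects in one uniform shape.

    The dictionary layout is an assumption made for this sketch.
    """
    cursor = conn.execute(
        "SELECT effect, frequency FROM side_effect "
        "WHERE drug = ? ORDER BY frequency DESC",
        (drug,),
    )
    return {
        "source": "SIDER",
        "drug": drug,
        "side_effects": [
            {"name": effect, "frequency": frequency}
            for effect, frequency in cursor
        ],
    }


conn = build_database(SAMPLE_ROWS)
result = transform(conn, "drug_a")
print(result)
```

Keeping the storage step separate from the transformer step means the database can be rebuilt from a new SIDER release without changing the code that answers queries.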
Project: Developing a New MolePro API to serve SIDER for the Biomedical Data Translator Project
Mentors: Paul Clemons, PhD, and Vlado Dancik, PhD, Chemical Biology and Therapeutics Science Program