Newton North High School
Newton, MA

Andrew Dunford and Timothy Wood
Cancer Program

Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma. Recently published results from the Broad indicate that DLBCL diagnoses can be broken down into five subtypes, determinable solely by applying clustering algorithms to patients’ genomic data. Eugene verified this result using a different dataset and a different clustering algorithm. By applying k-means clustering to a set of data from both the Broad Institute and Duke University, he confirmed the presence of five DLBCL subtypes in the genomic data. He also identified a handful of key genes in each cluster. These results will inform future research aimed at developing DLBCL treatments that are more targeted than those currently available.

Eugene has always been interested in science and math, but felt a bit frustrated that his school rarely connected the two subjects. This summer, he was excited to see how both mathematics and machine learning play a role in biology research. That said, Eugene’s favorite part of working at the Broad was the people: “Everyone at the Broad is also very eager to help and showcase their work, even people who have PhDs and numerous patents, papers, and citations under their belts!”