Eric Bueno

Eric Bueno, a senior computer science & engineering major at the University of Connecticut, designed a machine learning pipeline that segments cell nuclei.

My BSRP 2020 experience has taught me that we can adapt to anything and still thrive so long as we have each other. Despite the numerous hurdles in our way, my cohort has come out on the other side as better scientists. I am proud to call each and every member my friend, despite never seeing any of them face to face. I’ve also learned many new skills this summer, from proper scientific communication to how to design a machine learning pipeline. None of this would have been possible if it weren’t for the culture of open communication and collaboration at the Broad. I am eager to continue learning how I can use computational techniques to bolster biological research, and to see what else this cohort will accomplish. Remember: once a Broadie, always a Broadie.

Object segmentation is the computer science problem of distinguishing the objects of interest in an image from the background. It has applications in the biological sciences, notably in the analysis of microscopy images. We present a novel method of segmenting nuclei in microscopy images using a machine learning algorithm trained on the 2018 Data Science Bowl dataset. While the results of the competition were promising, even the top algorithm was prohibitively expensive computationally and struggled with certain kinds of images. Our approach uses deep learning trained on a normalized variant of the dataset, in the hope that identifying tissue color and nuclei size before segmentation will improve accuracy and efficiency. Because the kinds of microscopy images were unevenly represented in the original dataset, we randomly selected images of each type and cropped random sections from them to artificially expand our training set. Before attempting to segment the nuclei, we used a separate neural network to identify the color of the tissue and to estimate the size of the nuclei in the image. We hypothesize that controlling for these features will make segmentation more efficient because it lets us apply the most suitable preprocessing technique to each image. Ideally, the completed pipeline will increase throughput in the laboratory by easing one of the most time-consuming steps in many experiments.
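
The sampling and cropping step described in the abstract could look roughly like the sketch below. It is an illustration only, not the project's actual code: the function name `sample_balanced_crops`, the grouping of images by type in a dictionary, and the square crop size are all assumptions made for the example, which uses NumPy arrays as stand-ins for microscopy images.

```python
import numpy as np


def sample_balanced_crops(images_by_type, crops_per_type, crop_size, rng):
    """Draw the same number of random square crops from every image type.

    images_by_type: dict mapping a type label (hypothetical, e.g. "histology")
    to a list of arrays shaped (height, width) or (height, width, channels).
    """
    crops, labels = [], []
    for image_type, images in images_by_type.items():
        for _ in range(crops_per_type):
            # Pick a random source image of this type (with replacement).
            image = images[rng.integers(len(images))]
            height, width = image.shape[:2]
            if height < crop_size or width < crop_size:
                raise ValueError("image is smaller than the requested crop")
            # Choose the top-left corner so the crop stays inside the image.
            top = rng.integers(0, height - crop_size + 1)
            left = rng.integers(0, width - crop_size + 1)
            crops.append(image[top:top + crop_size, left:left + crop_size])
            labels.append(image_type)
    return crops, labels
```

Calling it with `rng = np.random.default_rng(0)` and, say, `crops_per_type=4` yields four crops per image type regardless of how many source images each type has, which is one simple way to even out an unbalanced dataset. In a real segmentation setting the ground-truth mask would need to be cropped with the same window as its image.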

 

Project: Streamlining nuclei segmentation using machine learning techniques

Mentors: Juan C. Caicedo, Cell Circuits and Epigenomics Programs and Imaging Platform