CellProfiler’s identifying features

Tools of the Trade: CellProfiler

It’s one of the great quandaries of the Information Age: as advances in digital technologies allow us to generate data at an ever-increasing pace, there is a concomitant need to find new ways to process and analyze the resulting deluge of information.

Such has been the case in cell imaging. As modern microscopes have become increasingly powerful and automated – able not only to bring the smallest cellular sub-compartments into focus, but also to capture tens of thousands of images of cell activity per day – the output has become impossible for researchers to process with the naked eye.

Enter CellProfiler and CellProfiler Analyst – complementary tools created to help scientists measure and analyze microscopic changes happening at the cellular level. The tools were first conceptualized by Anne Carpenter, director of the Broad’s Imaging Platform. Today, her team continues to refine and expand the software, making it increasingly robust and enabling researchers from across biomedical fields to tackle the deluge of visual data.

Carpenter first recognized the need for the software a decade ago when, as a postdoctoral researcher in the lab of Broad senior associate member David Sabatini, she couldn’t find adequate tools to measure cells in her Drosophila fruit fly experiments. Though Carpenter wasn’t a computer scientist (she is a cell biologist by training), she decided to tackle the problem herself, using MATLAB, a programming language for numerical computation, to help write the software she needed.

As her colleagues throughout the institute began asking for help in quantifying microscopy experiments, Carpenter quickly realized that she wasn’t the only one who needed the software. She teamed up with computational biologist Thouis (Ray) Jones, now affiliated with the Harvard School of Engineering and Applied Sciences, to turn her prototype into an open source program that could assist other biologists who were conducting image-based experiments. The result was CellProfiler, a versatile and user-friendly program that helps researchers measure features such as the size, shape, or brightness of stained compartments within cells. It is often used in large-scale experiments to measure changes in cells that have been treated with chemical compounds or subjected to genetic perturbations that alter the function of specific genes.

It is the digital nature of the images that makes such measurements possible.

“Since the images are digital, each pixel is actually a number. You can do math on numbers, and one of the reasons you might want to do math on images is to identify biologically relevant parts within those images,” Carpenter explains.
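To make the idea of "doing math on images" concrete, here is a minimal sketch in Python. It is not CellProfiler's code or its actual algorithm; it simply assumes a tiny made-up grid of brightness values, treats every pixel above a chosen threshold as part of a stained object, groups touching bright pixels together, and reports each object's size and average brightness. The function name `measure_objects`, the threshold value, and the sample data are all hypothetical.

```python
from collections import deque

# A tiny synthetic "image": each number is one pixel's brightness (0-255).
# Hypothetical data for illustration only, not from a real experiment.
IMAGE = [
    [10, 12, 11, 10, 10, 10],
    [10, 200, 210, 10, 10, 10],
    [10, 190, 205, 10, 10, 10],
    [10, 10, 10, 10, 150, 10],
    [10, 10, 10, 10, 160, 10],
]


def measure_objects(image, threshold):
    """Group touching bright pixels into objects and measure each one.

    Returns a list of dicts with the object's area (pixel count) and
    mean brightness, in the order the objects are found scanning
    left to right, top to bottom.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill outward from this pixel (up, down, left, right).
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append(image[y][x])
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append({
                "area": len(pixels),
                "mean_brightness": sum(pixels) / len(pixels),
            })
    return objects


if __name__ == "__main__":
    for i, obj in enumerate(measure_objects(IMAGE, threshold=100), start=1):
        print(f"object {i}: area={obj['area']} px, "
              f"mean brightness={obj['mean_brightness']:.1f}")
```

Real microscopy images are far larger and noisier, so production tools rely on more sophisticated methods for separating objects from background; the sketch only shows that size and brightness fall out of simple arithmetic once pixels are treated as numbers.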

A selection of cells from past research projects, viewed under CellProfiler’s analytical gaze. Video by Nick Dua, Broad Communications.

Once measurements are taken, researchers can analyze patterns in the data using CellProfiler Analyst, which uses machine-learning algorithms (akin to the facial recognition tools commonly used by photo editing software and social networking sites) to home in on the cellular features that researchers want to study. Researchers customize their analysis by “training” the software: they show the software examples of what they’re looking for, then refine CellProfiler Analyst’s search abilities by providing feedback each time it makes a mistake or correctly identifies the desired feature. Once the program “knows” what to look for, it can quickly and accurately scan and analyze thousands of images.
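The train-and-correct loop described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the algorithm CellProfiler Analyst uses: it assumes each cell is summarized by two made-up measurements (area and mean brightness) and swaps in a very simple nearest-centroid rule, which labels a new cell according to whichever group of examples it most resembles on average. The class name `FeedbackClassifier`, the function `train_with_feedback`, and all the numbers are hypothetical.

```python
class FeedbackClassifier:
    """Nearest-centroid classifier that learns from labeled examples."""

    def __init__(self):
        # Maps each label to the list of example measurements seen so far.
        self.examples = {}

    def add_example(self, features, label):
        """Store one example the researcher has labeled."""
        self.examples.setdefault(label, []).append(list(features))

    def predict(self, features):
        """Return the label whose average example is closest to `features`."""
        best_label, best_distance = None, float("inf")
        for label, rows in self.examples.items():
            center = [sum(col) / len(rows) for col in zip(*rows)]
            distance = sum((a - b) ** 2 for a, b in zip(features, center))
            if distance < best_distance:
                best_label, best_distance = label, distance
        return best_label


def train_with_feedback(classifier, reviewed_cells):
    """Show the classifier cells one at a time and feed back the true label.

    Returns how many guesses were wrong before the correction was applied.
    """
    mistakes = 0
    for features, true_label in reviewed_cells:
        if classifier.predict(features) != true_label:
            mistakes += 1
        # Feedback is recorded whether the guess was right or wrong.
        classifier.add_example(features, true_label)
    return mistakes


if __name__ == "__main__":
    clf = FeedbackClassifier()
    # Initial examples: (area, mean brightness) with the researcher's label.
    clf.add_example([4, 200], "positive")
    clf.add_example([2, 60], "negative")

    reviewed = [
        ([3, 120], "positive"),
        ([3, 130], "positive"),
        ([2, 70], "negative"),
    ]
    print("mistakes during review:", train_with_feedback(clf, reviewed))
    print("prediction for [3, 120] after feedback:", clf.predict([3, 120]))
```

In this toy run the classifier initially mislabels the dim-but-positive cell, then gets it right once that correction has been added to its examples, which mirrors the feedback cycle researchers go through before letting the software loose on thousands of images.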

The CellProfiler programs have been used in a wide range of experiments, analyzing images of specimens such as cells, yeast colonies, and worms in support of research on diseases ranging from breast cancer and leukemia to liver disease and HIV. At last count, the software had been cited in more than 700 papers since its publication in October 2006.

The program is open source, and is launched over 250 times per day by users across the globe.

“It’s designed for anyone to use, but it’s definitely geared toward biologists so that they can do this type of image-based research on their own. Which is great,” Carpenter adds, “because they know their experiments the best.”

CellProfiler, which was awarded Bio-IT World’s Best Practices Award for IT and Informatics in 2009, has evolved dramatically since Carpenter first released the version that she created with MATLAB. In 2008, a grant from the National Institutes of Health enabled the platform to hire software engineer Lee Kamentsky long-term to refine the software.

“He rewrote the whole thing from scratch,” Carpenter says. “It has the same basic, user-friendly design, but it’s way, way better under the hood.”

Kamentsky still leads the CellProfiler software development effort. He, Carpenter, and others from the Imaging Platform continue to evolve the software and, in the next couple of months, will release a major overhaul that will make CellProfiler even easier to use. The new version will also be more powerful, boosting capacity by taking advantage of computers that have multiple processing cores. The reboot is part of the platform’s ongoing effort to respond to the research community’s needs, which shift with advances in technology and new discoveries.

These needs are continually assessed in the field; members of the CellProfiler team collaborate closely with researchers from the Broad community on a variety of scientific studies. As new challenges arise, the team adapts the software to solve them. As they’re developed, new algorithms and iterations are added to the Imaging Platform’s open source offerings.

“We collaborate with different research groups on particular projects, but our end goal is always the same,” Carpenter says. “We want to share whatever image-processing or data-mining algorithms we develop through the CellProfiler project with the rest of the research world.”

Want to learn more about CellProfiler? Consider taking a BroadE workshop at the Broad Institute. These workshops are open to all Broad staff and to researchers at MIT, Harvard, and Harvard-affiliated hospitals. You can also connect with the Imaging Platform’s CellProfiler team on Facebook or LinkedIn.