Researchers announce GenomeSpace environment to connect genomic tools
By Haley Bridger, Broad Communications
April 25, 2012
Researchers from the Broad Institute of MIT and Harvard have announced that GenomeSpace, a software environment that seamlessly connects genomic analysis tools, is now available to the scientific community. During her keynote address at the Bio-IT World Conference and Expo on Tuesday, Jill Mesirov, director of computational biology and bioinformatics at the Broad Institute, invited biomedical researchers and tool developers to explore this beta release of the new resource and to use it in their work.
Currently, to make use of multiple analysis tools and data sources, biologists need to convert their data between the different formats those tools and sources require. This often involves error-prone spreadsheet manipulations or requires the programming skills to write conversion scripts. Mesirov’s team and her collaborators set out to change that.
“Our goal is to bring the ever-changing wealth of genomic analysis methods and whatever data are required to the fingertips of any biologist,” said Mesirov.
The GenomeSpace environment currently connects six tools: GenePattern, Galaxy, Integrative Genomics Viewer (IGV), Cytoscape, Genomica, and the UCSC Genome Browser. Many projects in genomic research rely on one or more of these tools. For instance, if researchers want to test a hypothesis about genetic differences between two stages of breast cancer, they might first use an analytical tool such as GenePattern to detect genes of interest; then IGV to view their genetic sequence; and then Cytoscape to see protein-protein interactions. GenomeSpace allows them to seamlessly transition between all of these tools to carry a project through to completion.
But GenomeSpace can also be used for smaller inquiries or simple conversions from one tool to another. GenomeSpace’s designers worked closely with scientists from the Broad Institute and beyond to determine many kinds of scientific problems for which GenomeSpace could be used. “We strove to identify a range of critical biological problems — from 'microproblems' involving a couple of steps in two tools, to complex scenarios on the scale of an extensive research paper,” said Aviv Regev, a core faculty member of the Broad Institute. Regev, who is also an associate professor at MIT and an Early Career Scientist of the Howard Hughes Medical Institute, and members of her lab have made examples of ways to solve these kinds of problems available on GenomeSpace as tutorials for others.
Michael Reich, director of informatics development for the Broad Institute’s Cancer Program, is one of the architects of GenomeSpace. He describes GenomeSpace as a connection layer that allows different tools to communicate – it can detect the different data formats each tool requires and make the necessary conversions. “GenomeSpace acts as a broker, automatically detecting and converting files from one format to another for the user,” said Reich.
Anton Nekrutenko, an associate professor at Pennsylvania State University and one of the developers of the aggregation tool Galaxy, notes that tools like Galaxy and GenePattern already integrate hundreds of tools. GenomeSpace pulls these aggregation tools together.
“GenomeSpace is an integration of integrators,” Nekrutenko said. “The benefit to the user is that this brings together distinctive collections of functionalities offered by individual tools.”
“We couldn’t be more pleased that Cytoscape is plugged into GenomeSpace,” said Trey Ideker, Division Chief of Medical Genetics at University of California, San Diego School of Medicine. “GenomeSpace will connect our network analysis tools with hundreds of other state-of-the-art programs and enable our users quick access to expression clustering and classification and browsing, access to the genome sequence, and so on.”
Mesirov, Reich, and their colleagues are eager for other biologists to test drive GenomeSpace and offer feedback on its utility. “We’re committed to rapidly responding to the needs of the scientific community and supporting the widest range of genomic research,” Reich said.
“GenomeSpace will empower biologists with no computational or programming background to maximize their ability to weave together biological insight with best-in-class computational tools,” said Regev. “We hope it will make analyses accessible that were beyond the reach of many biologists.”
The GenomeSpace project is a collaboration of the Mesirov and Regev laboratories at the Broad Institute; the Chang laboratory at Stanford University; the Ideker laboratory at the University of California, San Diego; the Nekrutenko laboratory at Pennsylvania State University; the Segal laboratory at the Weizmann Institute of Science; and the Haussler and Kent laboratories at the University of California, Santa Cruz. The project is funded by the National Human Genome Research Institute with additional support from Amazon Web Services.
About the Broad Institute of Harvard and MIT
The Eli and Edythe L. Broad Institute of Harvard and MIT was launched in 2004 to empower this generation of creative scientists to transform medicine. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods and data openly to the entire scientific community.
Founded by MIT, Harvard and its affiliated hospitals, and the visionary Los Angeles philanthropists Eli and Edythe L. Broad, the Broad Institute includes faculty, professional staff and students from throughout the MIT and Harvard biomedical research communities and beyond, with collaborations spanning over a hundred private and public institutions in more than 40 countries worldwide. For further information about the Broad Institute, go to http://www.broadinstitute.org.