A protein interaction map for finding interesting biology in large-scale genetic and genomic data

InWeb_InBioMap brings together data on more than a half-million protein interactions to put genetic data into functional context and reveal promising pathways for further study.

Credit: Taibo Li, Lage lab/Lauren Solomon, Broad Communications

A team of researchers from the Broad Institute of MIT and Harvard and institutions in Denmark and the United Kingdom has launched a new resource for mapping protein interaction networks from genetic data. Reported in a Nature Methods paper, InWeb_InBioMap (a.k.a. InWeb_IM) includes data on more than half a million protein-protein interactions and can help scientists interpret the biological meaning of data generated from genome-wide association studies and large-scale exome sequencing studies of traits or illnesses.

Protein-protein interaction resources, which specialists curate by laboriously extracting proteomic data from the published literature, can help reveal functional networks and target future research. Such resources have been particularly useful for implicating unsuspected pathways in cancer and other conditions fueled by somatic mutations and other genetic alterations. However, the resources available today differ in the number of protein interactions they capture, the organisms they include, and the experimental approaches used to generate the data.

With InWeb_IM, the team, led by Taibo Li and Kasper Lage in the Stanley Center for Psychiatric Research at the Broad Institute and Massachusetts General Hospital, has created a large, standardized, and transparent resource particularly well suited for functionally interpreting large genomic data sets. InWeb_IM contains data from eight established protein interaction resources, covering 585,843 interactions and spanning 87% of reviewed human proteins cataloged in UniProt (a major protein sequence and annotation resource). In the paper, the team reported that the networks generated with InWeb_IM ranked well for accuracy and quality when benchmarked against several of its source databases using genomic data from ~4,700 cancer genomes and tissue-specific data from the Genotype-Tissue Expression (GTEx) project.

InWeb_InBioMap is being updated regularly and is available through a dedicated online portal, as well as through the GeNets pathway analysis and visualization platform.

Paper(s) cited

Li T, et al. A scored human protein-protein interaction network to catalyze genomic interpretation. Nature Methods. Online November 28, 2016. DOI: 10.1038/nmeth.4083.