Integrative analysis of multiple genomic variables using a hierarchical Bayesian model.

Bioinformatics
Abstract

Motivation: Genes showing congruent differences in several genomic variables between two biological conditions are crucial to unravel causalities behind phenotypes of interest. Detecting such genes is important in biomedical research, e.g. when identifying genes responsible for cancer development. Small sample sizes common in next-generation sequencing studies are a key challenge, and there are still only very few statistical methods to analyze more than two genomic variables in an integrative, model-based way. Here, we present a novel bioinformatics approach to detect congruent differences between two biological conditions in a larger number of different measurements such as various epigenetic marks or mRNA transcript levels.

Results: We propose a coefficient quantifying the degree to which genes present consistent alterations in multiple (more than two) genomic variables when comparing samples presenting a condition of interest (e.g. cancer) to a reference group. A hierarchical Bayesian model is employed to assess uncertainty at the gene level, incorporating information on functional relationships between genes. We demonstrate the approach on different data sets containing RNA-seq gene transcription and up to four ChIP-seq histone modification measurements. Both the coefficient-based ranking and the model-based inference lead to a plausible prioritization of candidate genes when analyzing multiple genomic variables.

Availability and implementation: BUGS code is provided in the Supplement.

Contact: m.schaefer@uni-duesseldorf.de.

Supplementary information: Supplementary data are available at Bioinformatics online.

Year of Publication: 2017
Journal: Bioinformatics
Volume: 33
Issue: 20
Pages: 3220-3227
Date Published: 2017 Oct 15
ISSN: 1367-4811
DOI: 10.1093/bioinformatics/btx356
PubMed ID: 28582573