Integrative analysis of multiple genomic variables using a hierarchical Bayesian model.
Motivation: Genes showing congruent differences in several genomic variables between two biological conditions are crucial for unraveling the causal mechanisms behind phenotypes of interest. Detecting such genes is important in biomedical research, e.g. when identifying genes responsible for cancer development. Small sample sizes, common in next-generation sequencing studies, are a key challenge, and there are still very few statistical methods to analyze more than two genomic variables in an integrative, model-based way. Here, we present a novel bioinformatics approach to detect congruent differences between two biological conditions across a larger number of different measurements, such as various epigenetic marks or mRNA transcript levels.
Results: We propose a coefficient quantifying the degree to which genes present consistent alterations in multiple (more than two) genomic variables when comparing samples presenting a condition of interest (e.g. cancer) to a reference group. A hierarchical Bayesian model is employed to assess uncertainty on a gene level, incorporating information on functional relationships between genes. We demonstrate the approach on different data sets containing RNA-seq gene transcription and up to four ChIP-seq histone modification measurements. Both the coefficient-based ranking and the inference based on the model lead to a plausible prioritization of candidate genes when analyzing multiple genomic variables.
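The idea of ranking genes by congruent differences can be illustrated with a minimal sketch. The abstract does not define the proposed coefficient, so the score below is a stand-in chosen for illustration only: a gene's score is zero unless all variables change in the same direction, in which case it is the smallest standardized difference. The function names, variable names, and toy values are all hypothetical, and the hierarchical Bayesian model is not represented here.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(case, reference):
    """Welch-style standardized mean difference between two small samples."""
    se = sqrt(stdev(case) ** 2 / len(case) + stdev(reference) ** 2 / len(reference))
    return (mean(case) - mean(reference)) / se if se > 0 else 0.0


def congruence_score(differences):
    """Illustrative stand-in, NOT the coefficient proposed in the paper.

    Returns the difference closest to zero if all variables change in the
    same direction, and 0.0 if the directions disagree.
    """
    if all(d > 0 for d in differences):
        return min(differences)
    if all(d < 0 for d in differences):
        return max(differences)
    return 0.0


# Hypothetical toy data: gene -> variable -> (case samples, reference samples).
toy = {
    "geneA": {
        "rna": ([8.1, 8.4, 8.0], [6.0, 6.2, 5.9]),
        "histone_mark_1": ([5.5, 5.7, 5.4], [4.0, 4.1, 4.2]),
        "histone_mark_2": ([3.9, 4.2, 4.0], [3.0, 3.1, 2.9]),
    },
    "geneB": {
        "rna": ([8.0, 8.3, 8.1], [6.1, 6.0, 6.2]),
        "histone_mark_1": ([3.0, 3.2, 2.9], [4.5, 4.4, 4.6]),
        "histone_mark_2": ([4.0, 4.1, 3.9], [3.0, 3.2, 3.1]),
    },
}

scores = {
    gene: congruence_score(
        [standardized_difference(case, ref) for case, ref in variables.values()]
    )
    for gene, variables in toy.items()
}

# Rank genes by the magnitude of their congruence score, largest first.
ranking = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
```

In this toy setting geneA changes in the same direction in all three variables and is ranked first, while geneB shows opposing changes and receives a score of zero. A model-based analysis would additionally quantify the uncertainty of such a ranking, which this sketch does not attempt.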
Availability and implementation: BUGS code in the Supplement.
Supplementary information: Supplementary data are available at Bioinformatics online.
Year of Publication: 2017 Oct 15