Integrative annotation of chromatin elements from ENCODE data.

Nucleic Acids Res

Authors	Michael Hoffman, Jason Ernst, Steven Wilder, Anshul Kundaje, Robert Harris, Max Libbrecht, Belinda Giardine, Paul Ellenbogen, Jeffrey Bilmes, Ewan Birney, Ross Hardison, Ian Dunham, Manolis Kellis, William Noble
Keywords	Humans; Genome-Wide Association Study; Chromatin; Enhancer Elements, Genetic; Genome, Human; Molecular Sequence Annotation; Promoter Regions, Genetic; Proteins; Transcription, Genetic; Regulatory Elements, Transcriptional; Terminator Regions, Genetic; Insulator Elements
Abstract	The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotation of non-coding regulatory elements correlates strongly with mammalian evolutionary constraint, and provides an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.
Year of Publication	2013
Journal	Nucleic Acids Res
Volume	41
Issue	2
Pages	827-41
Date Published	2013 Jan
ISSN	1362-4962
URL	http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=23221638
DOI	10.1093/nar/gks1284
PubMed ID	23221638
PubMed Central ID	PMC3553955
Grant list	DK065806 / DK / NIDDK NIH HHS / United States; HG004695 / HG / NHGRI NIH HHS / United States; R01 HG004037 / HG / NHGRI NIH HHS / United States; RC2 HG005573 / HG / NHGRI NIH HHS / United States; HG006259 / HG / NHGRI NIH HHS / United States; HG005573 / HG / NHGRI NIH HHS / United States; HG005334 / HG / NHGRI NIH HHS / United States; HG004570 / HG / NHGRI NIH HHS / United States; 095908 / Wellcome Trust / United Kingdom; R01 DK065806 / DK / NIDDK NIH HHS / United States; K99 HG006259 / HG / NHGRI NIH HHS / United States
