Epigenome Mapping

Producing the original sequence of the human genome was a landmark achievement. Yet it posed a new challenge: the genome sequence was an endless stream of As, Cs, Gs, and Ts, devoid of functional interpretation. Beyond protein-coding genes, which constitute only 1% of the genome, we were unable to decode the functions of the remaining sequences.

Epigenomic maps provide an opportunity to identify and understand the sequences, interacting proteins, and chromosomal structures that act throughout the other 99% of the genome to control gene activity. We use DNA sequencing-based mapping technologies such as ChIP-Seq, Bisulfite-Seq, Mint-ChIP, RNA-Seq, and chromatin conformation capture assays to annotate the genome and advance our understanding of how it functions (see Technologies for more details).

Genome annotation also allows us to examine the consistent structural features of the genome and provides a framework for interpreting mutations that contribute to disease, including cancer.

A major goal of the program is to generate reference epigenomic data as part of collaborative, international projects. For an idea of scale, the Epigenomics Program creates over 1,000 epigenomic maps per year. These data are utilized by thousands of researchers worldwide to better annotate and understand the human genome.

Epigenome mapping projects

  • ENCODE 4: The NIH Encyclopedia of DNA Elements (ENCODE) project is now in its fourth iteration and 10th year. We collaborate with scientists around the world to create reference maps of the epigenome and understand the regulatory potential of the DNA between protein-coding genes. For ENCODE 4, the Epigenomics Program, together with BTL, forms one of five epigenome mapping centers located in the USA. We also serve as a major data coordination center. ENCODE 4 will also incorporate rare cell populations from human organoids, as well as disease samples. All data are openly available to the community.
  • Epigenome mapping in cancer
  • Epigenome mapping in disease: translating GWAS variants to function

Technology instantiation and quality control

Our core labs, led by Chuck Epstein, Noam Shoresh, and Andi Gnirke, work together to optimize technologies for reproducible, large-scale epigenomic research. We adapt methods to enable rigorous quality control and, where feasible, high-throughput robotic operation. We devise computational pipelines to support automated processing and sharing of data.

Our current production-scale capabilities include:

See our Technologies page for information about other technologies utilized by the program.

Past mapping projects

  • NIH Roadmap: This five-year program mapped the epigenomic landscape of many cell types, resulting in a public reference map against which researchers can compare the aberrant epigenomic characteristics associated with particular diseases. Data are also available via this resource.
  • ENCODE 2 and ENCODE 3

Key Papers

Roadmap Epigenomics Consortium, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015.

Farh KK, Marson A, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015.

Tsankov AM, et al. Transcription factor binding dynamics during human ES cell differentiation. Nature. 2015.

Cacchiarelli D, Trapnell C, Ziller MJ, et al. Integrative analyses of human reprogramming reveal dynamic nature of induced pluripotency. Cell. 2015.

Zhu J, Adli M, et al. Genome-wide chromatin state transitions associated with developmental and environmental cues. Cell. 2013.

The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012.

Ernst J, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011.

Mikkelsen TS, Xu Z, et al. Comparative epigenomic analysis of murine and human adipogenesis. Cell. 2010.