Deciphering chromatin: Many marks, millions of histones at a time
A new high-resolution technique for reading combinations of chemical flags in the epigenome could help uncover new rules underlying cell fate and provide important clues for understanding diseases like cancer.
Chromatin, that protein-DNA hybrid that helps neatly bundle six or so feet of DNA into each of our trillions of cells, is more than an elaborate packaging system. It also guides gene regulation by adding a layer of "epigenetic" instructions on top of our genetic code. That layer comprises a diverse set of chemical modifications or marks — molecular flags fluttering in the nucleoplasm to alert the cell as to which stretches of DNA are available to read and which are not.
Marks on chromatin's histones (the protein spools around which DNA wraps) hold great sway over how DNA is “read.” At the moment, we only have a rudimentary understanding of what these marks say. For instance, three methyl groups hooked to lysine 4 on the tail of histone protein 3 (an H3K4me3 modification) mean “Read me!” The same number on lysine 27 (H3K27me3) means “Ignore me!”
From here it gets complicated. Not only are histone marks dynamic, but any one histone can carry more than one mark at a time, even in contradictory or "bivalent" combinations (such as simultaneous “read me” and “ignore me” marks). Those combinations might encode hitherto unknown rules for specifying and maintaining a cell's fate — what makes a T-cell a T-cell, a neuron a neuron, and so on.
“There are more than 100 different histone modifications, but probably not many actual combinations,” said Brad Bernstein, co-director of the Epigenomics Program at Broad Institute and a professor of pathology at Massachusetts General Hospital and Harvard Medical School, who a decade ago identified bivalent histones in embryonic stem cells. "The question is, how do you study those combinations?"
The problem is today’s technologies aren't set up for this herculean challenge. Take ChIP-seq — a powerful and popular approach combining chromatin immunoprecipitation (ChIP, which uses antibodies to extract modified histones or other DNA-bound proteins from cells) and genome sequencing. Researchers can use it to locate all of the histones in a sample with one kind of mark, but it is no good for looking at combinations.
"You also end up with an average signal for your population of cells," said Efrat Shema, a research fellow in Bernstein's laboratory, of ChIP-seq. "You have no idea which signals come from which cells."
Other technologies can pick up multi-marked histones, but can’t pinpoint where those histones reside in the genome.
Bernstein, Shema, and Dan Jones (president of the biotech company SeqLL) wanted to create a kind of ChIP-seq on steroids — a single-molecule resolution, high-throughput assay that reveals the combinations of marks on individual histones and maps those histones’ genomic locations. And a paper released this week in Science shows that they’ve taken their first steps.
A histone landscape on a slide
Their assay starts with millions of individual nucleosomes — the functional unit of chromatin, made up of histones and the DNA embracing them — isolated from cells, fixed on a slide, and indexed for position. Fluorescently labeled antibodies (visualized using an imaging approach called total internal reflection fluorescence [TIRF] microscopy) let the team see, and individually count, which nucleosomes harbor histones with which marks and in what combinations.
The team then chemically removes the histones, leaving the nucleosomes' DNA attached to the slide, ready for genomic mapping using SeqLL's single-molecule sequencing technology.
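The bookkeeping behind these two steps can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the team's actual software: the slide positions, the `decode_spots` helper, and the example read are all hypothetical, and real data would need image registration rather than exact coordinate matches.

```python
from collections import defaultdict


def decode_spots(detections, reads):
    """Combine antibody detections and sequencing reads by slide position.

    detections: iterable of (position, mark) pairs, one per fluorescent spot
    reads: dict mapping position -> genomic location string
    Both inputs are hypothetical stand-ins for imaging and sequencing output.
    """
    # Collect every mark seen at each slide position.
    marks_at = defaultdict(set)
    for position, mark in detections:
        marks_at[position].add(mark)

    # Attach the genomic location read at the same position, if any.
    decoded = {}
    for position, marks in marks_at.items():
        decoded[position] = {
            "marks": frozenset(marks),
            "locus": reads.get(position),  # None when no read was obtained
        }
    return decoded


# Toy input: (x, y) positions and the locus string are made up.
detections = [
    ((10, 42), "H3K4me3"),
    ((10, 42), "H3K27me3"),
    ((87, 5), "H3K4me3"),
]
reads = {(10, 42): "chr1:1000-1147"}

decoded = decode_spots(detections, reads)
```

Keying everything on slide position is what lets a combination of marks and a genomic location be assigned to the same single nucleosome.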
Image: Nucleosomes (green) waving H3K9ac flags (red). Courtesy of Efrat Shema.
The result? High-resolution maps revealing the locations of specific multiply marked histones within the genome, data that can reveal marking patterns unique to specific cell types or developmental stages.
"In principle, you can examine tens or dozens of marks simultaneously across millions of nucleosomes, effectively decoding the nucleosomes for the major flags," Bernstein said.
Case in point: In their Science paper, the team reports that histones in embryonic stem cells (ESCs) generally carry either bivalent "read me"+"ignore me" marks (H3K4me3 plus H3K27me3) or a pair of "read me" marks (H3K4me3 plus H3K27ac, a modification involving an acetyl group). Lung fibroblasts (which are fully committed adult cells), on the other hand, have many histones with the double "read me" combination, but very few with the bivalent combination.
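The categories in this comparison amount to a simple lookup on the set of marks detected on each nucleosome. Here is a minimal Python sketch, assuming each nucleosome is represented as a set of mark names. The function names, the "other" category, and the toy input are illustrative only and are not data from the paper.

```python
from collections import Counter

ACTIVE = "H3K4me3"       # "read me"
REPRESSIVE = "H3K27me3"  # "ignore me"
ACETYL = "H3K27ac"       # second "read me" mark


def classify(marks):
    """Label one nucleosome by the combination of marks it carries."""
    marks = set(marks)
    # Checked first, so a nucleosome carrying all three marks counts as bivalent
    # (a choice made for this sketch, not a rule from the paper).
    if {ACTIVE, REPRESSIVE} <= marks:
        return "bivalent"
    if {ACTIVE, ACETYL} <= marks:
        return "double read-me"
    return "other"


def tally(nucleosomes):
    """Count how many nucleosomes fall into each category."""
    return Counter(classify(marks) for marks in nucleosomes)


# Toy input, made up for illustration.
toy_sample = [
    {"H3K4me3", "H3K27me3"},
    {"H3K4me3", "H3K27ac"},
    {"H3K4me3"},
    {"H3K27me3", "H3K4me3"},
]
counts = tally(toy_sample)
```

Running the same tally on samples from different cell types is the kind of comparison the paragraph above describes, at far larger scale.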
"For the first time, we've been able to count how many bivalent nucleosomes are present in ESCs and map them to specific locations in the genome," said Shema.
The team also validated the assay in T-cell leukemia, glioblastoma, lymphoma, and kidney cancer cell lines. "You also see increased bivalency in cancer cells," Shema explained, noting that the enzymes that add methyl groups to histones are often mutated in cancer. And indeed, the lines they examined had more bivalent histones than normal differentiated cells, but fewer than ESCs.
A product of collaboration
For Bernstein and Shema, the success has been a long time coming. Shema noted that she had been working on the assay for more than three years, and that while she had been able to study the histone mark combinations, she was struggling to find a way to sequence the nucleosomes' underlying DNA.
"The challenge was that everything has to be done on the same surface. You need to always know the exact coordinates on the slide for each nucleosome to map them to the genome," Shema continued. "Brad and I brainstormed for many weeks about how to get that done," she added, "but none of the techniques we tried really worked."
By chance, she heard about SeqLL, a young biotech that had bought Helicos Biosciences' single-molecule sequencing technology after Helicos filed for bankruptcy in 2012. She contacted Dan Jones, a former Helicos employee who founded SeqLL in 2013.
"The key from our perspective was to provide enhanced stability during the transition from the antibody imaging step to the sequencing step, to survive both while keeping the DNA's integrity intact," Jones said. "With other technologies you might get a bulk idea that there's some modification within a given region, but you wouldn't be able to pinpoint these modifications to a specific molecule. That's the power of this approach, allowing us to examine the same molecules dozens and potentially hundreds of times."
A good starting point
Bernstein, Shema, and Jones are all up front that theirs is a proof-of-concept study, and that a lot of work remains to refine the technique and scale up its throughput.
"Right now we can study 3 to 5 million nucleosomes on one slide in a good experiment," Shema said. She thinks that as they enhance the method, they can increase that number more than 10-fold.
"This technique of single-molecule nucleosome decoding and sequencing could, in theory, replace ChIP-seq," Bernstein said. "But there's a bit of work to be done to get to that point."
There are any number of directions one could go with the technology. While the Science paper centered on whole-genome nucleosome analysis, nothing says the approach can't be used to study only nucleosomes from specified genomic regions (e.g., nucleosomes associated with gene regulatory regions). Or to profile histone combinations within single cells. Or, with the right antibodies, to look at combinations of transcription factors bound to regulatory regions of DNA.
Shema believes the technique could give the community a new way to look at and gain deeper insights into gene regulation. "Because we can look at the different combinations at single-molecule resolution, we might be able to tease out new regulatory principles."
Bernstein thinks the assay could help researchers plumb some of the fundamental biochemical mechanisms of cell division and cell fate.
"Can we follow modification patterns through the cell cycle? Can we learn how cells of a given lineage 'remember' to keep certain genes on when they divide? Can we learn more about the consistency or variability in how individual loci are modified? Can we find features of drug response or drug resistance?" he asked. "These are basic biological questions that were previously inaccessible, but the answers could yield great insights into the biochemical nature of human diseases."
Shema E, Jones D, et al. Single-molecule decoding of combinatorially modified nucleosomes. Science. May 6, 2016. DOI: 10.1126/science.aad7701