The sites where transcription factors bind within regulatory DNA fall into six distinct patterns that overlap with the factors' functions, Broad scientists find, helping advance a goal of regulatory genomics.
Sketching out a transcription factor code: Binding patterns reflect factors' gene expression roles
The rules dictating how regulatory DNA elements called enhancers control genes' expression remain murky, but it is clear that they act through specialized proteins called transcription factors (TFs). A group of Broad scientists has found that TFs' binding sites within enhancers cluster in distinct patterns reflecting the factors' roles in gene expression control.
Motif clustering patterns within noncoding DNA. Adapted from Grossman et al. PNAS 2018. (Credit: Susanna M. Hamilton, Broad Communications)
These patterns, which may constitute a position-based code, could be a boon to researchers trying to learn how to predict enhancers' activities from sequence data alone.
TFs bind to enhancers (which oversee whole programs of gene expression) and other regulatory elements called promoters (from which gene transcription begins) at specific sequences called motifs. Once bound, these proteins perform a variety of jobs, from unraveling DNA to reading genes and writing RNA.
The Broad team surveyed 103 factors' motifs in 47 cell types, focusing on nucleosome-depleted regions (NDRs): stretches of unwound regulatory DNA where TFs can bind. When the team compared the motifs' locations, six distinct groups emerged, overlapping with the factors' known roles.
The groups were not evenly distributed across all of the NDRs the team examined. Rather, the team noted, certain groups consistently occurred together in all 47 cell types. For instance, enhancers harboring group 4 motifs also contained more motifs from group 3 and fewer from groups 5 and 6, suggesting that different combinations of TFs may work with different kinds of enhancers.
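To make the co-occurrence idea concrete, here is a minimal Python sketch of one way such a comparison could be tallied. It is not the team's actual method: the motif counts are invented toy values, and the names `ndr_motif_counts` and `mean_count_by_presence` are hypothetical. The sketch simply asks whether NDRs that contain motifs from one group carry, on average, more or fewer motifs from another group than NDRs that lack it.

```python
from statistics import mean

# Hypothetical toy data (NOT from the study): one dict per NDR,
# mapping a motif group number (1-6) to the count of motifs from that group.
ndr_motif_counts = [
    {3: 2, 4: 1},
    {3: 3, 4: 2},
    {3: 1, 4: 1},
    {5: 2, 6: 1},
    {5: 1, 6: 2},
    {6: 3},
]


def safe_mean(values):
    """Mean of a list, or 0.0 if the list is empty."""
    return mean(values) if values else 0.0


def mean_count_by_presence(ndrs, anchor_group, other_group):
    """Compare average counts of `other_group` motifs in NDRs that do
    and do not contain at least one `anchor_group` motif.

    Returns (mean_with_anchor, mean_without_anchor).
    """
    with_anchor = [
        n.get(other_group, 0) for n in ndrs if n.get(anchor_group, 0) > 0
    ]
    without_anchor = [
        n.get(other_group, 0) for n in ndrs if n.get(anchor_group, 0) == 0
    ]
    return safe_mean(with_anchor), safe_mean(without_anchor)


# Compare groups 3, 5, and 6 against the presence of group 4 motifs.
for other in (3, 5, 6):
    with_g4, without_g4 = mean_count_by_presence(ndr_motif_counts, 4, other)
    print(f"group {other}: with group 4 = {with_g4}, without group 4 = {without_g4}")
```

In this toy data set, NDRs with group 4 motifs hold more group 3 motifs and fewer group 5 and 6 motifs, mirroring the direction of the pattern described above. A real analysis would also need to account for NDR length, background motif frequency, and statistical significance, none of which this sketch attempts.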
The team's findings represent a new step toward a long-term goal: defining a model for predicting enhancers' activities within a given cell type by looking at their DNA sequence. They open the door to additional insights into what constitutes an active, functional enhancer, as opposed to an inactive one.
This work was supported by the National Human Genome Research Institute and the National Institute of General Medical Sciences.