Unforeseen treasures wrapped in human genome's scraps

Scavenging can turn up some surprising finds, even in the human genome. Comparing the DNA of several different animals, including the opossum and the chicken, a team of scientists has sifted through our own genetic clutter and stumbled upon some unexpected jewels. These may hold important clues for understanding the most fundamental components of the human genome.

Recent studies reveal that about 5% of human DNA has been immaculately preserved from our genetic predecessors, thanks to evolution’s diligent care of these sequences. In light of such careful swaddling, scientists speculate that these genetic heirlooms must fill prominent roles. But only a small fraction, which carry the instructions for making the body’s proteins, are understood on a functional level. The remainder, referred to as conserved noncoding elements (CNEs), have been particularly difficult to decipher, in part because of their inordinate diversity. With few similarities among known CNEs, scientists have been unable to glean useful details about their function.

To address this problem, Broad scientists Mike Kamal, Xiaohui Xie and Eric Lander turned to a collection of ancient, related DNA sequences in the human genome, known as ancestral repeats (ARs) and often referred to as genomic “junk.” The researchers compared human ARs with those of other mammals, including dog, mouse and rat, looking for short stretches of DNA perfectly matched among all four species: in other words, CNEs. Their efforts, reported February 13 in the online edition of the Proceedings of the National Academy of Sciences, uncovered more than 100 previously unidentified CNEs, many of which show significant similarities in DNA sequence. These newly identified CNEs contained a surprisingly large contingent from the MER121 family, a group of related ARs that consists of over 900 members in humans.

A closer look revealed this AR family to be a previously undiscovered genomic treasure. As a group, MER121 members tend to be an eclectic lot, with substantial variability in DNA sequence. However, when compared across other mammals, including mouse, rat and dog, they show far fewer alterations in length and in DNA sequence than do other AR families. The researchers identified family members in other, more-distant genomes, including the chicken and the opossum Monodelphis domestica (a marsupial), which allowed them to reconstruct the family’s lineage. They uncovered hundreds of MER121 members in the opossum genome, many of which possessed a corresponding human ortholog, but found evidence of only two MER121-like sequences in the chicken genome. This suggests that the AR family emerged sometime after the mammalian lineage diverged from that of birds but before marsupials split from other mammals, and that evolutionary pressures have helped to protect and maintain its members in mammals.

In light of such significant evolutionary conservation, the scientists then looked for telling signs of the family’s purpose in humans. They found little evidence that MER121 DNA is transcribed into RNA, suggesting that it is indeed a noncoding feature of the genome. Some family members contain short, conserved sequence motifs that may form binding sites for regulatory proteins, such as transcription factors. Though family members appear primarily in remote regions of the human genome that are secluded from genes, there are a few large clusters that reside near genes.

While the function of this remarkably conserved family of ARs awaits further study, these findings suggest that what was previously thought of as junk, at least within the human genome, needs to be further sifted before being relegated to the scientific curb.

Paper(s) cited

Kamal M, Xie X, Lander ES. A large family of ancient repeat elements in the human genome is under strong selection. PNAS; doi:10.1073/pnas.0511238103