Lots of variation uncovered, but more remains hidden

Leah Eisenstadt, February 7th, 2011
Image courtesy of Jane Ades, NHGRI

As data from the Human Genome Project accumulated, scientists realized that there was a significant amount of variation in the human genome, especially in the form of single-letter changes known as single nucleotide polymorphisms (SNPs). The study of SNPs in the human genome and their influence on disease has been a major focus of genome research over the past decade and has revealed hundreds of SNPs associated with common diseases.

More recently, scientists have recognized the importance of another form of genetic variation: structural variation. Last week, we described results from a study by researchers at the Broad and elsewhere that used next-generation sequencing data from the 1000 Genomes Project to discover and map structural variants in the human genome. Scientists can now use that map and research tools created from it, such as microarrays, to conduct more studies to uncover associations between structural variants and disease, as was done with SNPs. Charles Lee, co-senior author of the new study and a Broad associate member, explains that each person is estimated to carry between 1,100 and 1,400 structural variants that are longer than 500 base pairs, or “letters,” of DNA.

Structural variation comes in a few forms:

- One is deletion or duplication, known as copy number variation, in which the genome is missing DNA or contains extra copies of it. The new study identified 22,000 common deletions in the human genome; however, the team estimates that they are still missing 20-30% of the existing deletions, says Lee. The study methods were less successful at picking up duplications, identifying 501 of these variants, an estimated 5-10% of what’s out there.

- Another is “mobile elements,” bits of DNA that can move around the genome. The team found more than 5,000 of them. “We think there are more [mobile elements] out there, but don’t have enough of an estimate to tell how much yet,” Lee says.

- Still others are inversions and translocations, in which segments of DNA are reversed on a chromosome or switched between chromosomes, respectively. The computer programs used in the study were not able to reliably identify these.
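
For a rough sense of what those percentages imply, here is a back-of-the-envelope sketch in Python. It assumes that "missing 20-30%" means the 22,000 deletions found are 70-80% of all common deletions, and that the 501 duplications are 5-10% of all common duplications. The function name `implied_total` is made up for this illustration, and the resulting totals are simple arithmetic on the quoted figures, not numbers reported by the study.

```python
def implied_total(found, pct_low, pct_high):
    """Estimate the total number of variants from the number found.

    found    -- count of variants the study identified
    pct_low  -- lowest assumed percentage of the total that was detected
    pct_high -- highest assumed percentage of the total that was detected

    Returns (low_estimate, high_estimate). A higher detection rate
    implies a smaller total, so pct_high gives the low estimate.
    """
    return found * 100 / pct_high, found * 100 / pct_low


# Assumption: missing 20-30% of deletions means 70-80% were detected.
deletions_low, deletions_high = implied_total(22_000, 70, 80)

# Assumption: the 501 duplications are 5-10% of the total.
duplications_low, duplications_high = implied_total(501, 5, 10)

print(f"Implied common deletions: {deletions_low:,.0f} to {deletions_high:,.0f}")
print(f"Implied common duplications: {duplications_low:,.0f} to {duplications_high:,.0f}")
```

Under those assumptions, the quoted figures point to somewhere around 27,500 to 31,400 common deletions and roughly 5,000 to 10,000 common duplications in total.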

The researchers plan to refine their methods in an attempt to discover the remaining common structural variation. Until then, the influx of data on structural variants in the human genome offers valuable tools for disease research and a newfound appreciation for the complexity of the human genome.