Broad Institute unleashes dog genome
Tasha, the boxer dog (image courtesy of NIH)
"In my friend, I find a second self."
Friendship often provides a looking glass in which we may truly behold ourselves. In a poetic testimonial, scientists have glimpsed the biological essence of humans by translating the genetic code of man's best friend — the dog.
An international research team led by scientists at the Broad Institute of MIT and Harvard has decoded the DNA of the domestic dog and pinpointed millions of genetic differences that distinguish dog breeds. The study also includes the first comparative analysis to encompass three distinct mammalian genomes, revealing important DNA elements common among them. Such shared genetic signatures offer crucial insights into genome organization and function, particularly in humans. Their efforts, described in the December 8 issue of Nature, shed light on the genetic similarities between dogs and humans as well as the genetic differences between dog breeds, and may guide future discoveries that improve the health of both species.
"Of the more than 5,500 mammals living today, dogs are arguably the most remarkable," said senior author Eric Lander, director of the Broad Institute, professor of biology at MIT and of systems biology at Harvard Medical School, and a member of the Whitehead Institute for Biomedical Research. "The incredible physical and behavioral diversity of dogs — from Chihuahuas to Great Danes — is encoded in their genomes. It can uniquely help us understand embryonic development, neurobiology, human disease and the basis of evolution."
Dogs lead the hunt for common threads in mammalian genomes
More than two years ago, the Nature paper's authors embarked on a mission to assemble a complete map of the dog genome. In the first phase of the project they obtained high-quality DNA sequence from a female boxer named "Tasha," covering nearly 99% of the dog's genome. Because dogs sit at a key branch point in the evolutionary tree relative to humans, the dog genome sequence enabled researchers to make novel observations regarding the genetic similarities among mammals.
Broad researchers recognized that 5% of human DNA is shared with dogs and mice. These common elements are genetic evidence of evolution's handiwork and have been protected from changes that might otherwise have accumulated during the 100 million years that separate the species. Such careful preservation of these shared sequences suggests their significance in genome function.
In a geographical sense, the researchers found that many of the conserved sequences display a clearly asymmetric distribution in humans, with half clustered around just a few hundred sites in the sprawling genomic landscape. Moreover, only a handful of genes dwell within these clusters of shared sequences. Searching within this small number of genes, Broad scientists noted that many carry the instructions for transcription factors and axon guidance molecules, developmental proteins that determine cellular identity and guide the wiring of the brain, respectively. Thus, about half of the conserved sequences oversee a tiny slice of the human genome (just 1% of all genes), one that speaks during the earliest, most impressionable chapter of life.
"The clustering of regulatory sequences is incredibly interesting," said Kerstin Lindblad-Toh, first author of the Nature paper and co-director of the Genome Sequencing and Analysis Program at Broad. "It means that a small subset of crucial human genes is under much more elaborate control than we had ever imagined."
Prior to this analysis, genome scientists had confined their investigations to a single group — known as a clade in evolutionary parlance — whose representatives include humans, chimpanzees, mice and rats. Since dogs are housed in a neighboring clade, the completion of the dog genome lends newfound strength to genome comparisons that run the evolutionary gamut.
"Incorporating the dog genome into a comparison spanning three mammalian species enabled us to identify some of the most highly conserved, and therefore biologically important, genetic elements in humans," said author Tarjei Mikkelsen, a graduate student in the Genome Sequencing and Analysis Program at Broad. "This is a significant step toward assembling a complete parts list for the human genome."
Although sequencing of the dog genome gives substantial insight into the architecture of the human genome, even greater insight will come from adding more mammalian genomes to the analysis. This is already underway at the Broad, as part of a large NIH-funded collaboration that includes three other research centers and that will provide low-coverage genome sequence for at least 16 other mammals. Several have been chosen based on their evolutionary relationship to humans, and include representatives from the two mammalian clades that have not yet been sampled. The elephant, armadillo and rabbit genomes are already complete, and several more are soon to follow, including cat, bat, and squirrel. In this context, the information contributed by the dog genome will serve as an important link between humans and more diverged mammals. By putting this veritable zoo in their test tube, Broad scientists hope to gain a global view of mammalian genomes and the critical insights needed to unravel how they work.
Dogs' tricks revealed by SNPs
In the project's second phase, Broad scientists used the boxer genome sequence as a genetic "compass" to navigate the genomes of 10 different dog breeds and other related canine species, including gray wolves and coyotes. They identified tiny spots of genetic variation, called single nucleotide polymorphisms (SNPs, pronounced "snips"), which collectively form a catalog of genetic variability between individuals. In contrast to genome comparisons, which highlight trends across multiple species, sampling genetic variation within a species, like dogs, is essential for reconstructing its ancestry and, more importantly, for identifying the genetic mutations that underlie disease.
Using SNPs as a gauge, Broad researchers measured stretches of DNA, called haplotype blocks, in the genomes of several different dog breeds. Haplotype blocks are often inherited en masse as DNA is passed from parent to offspring. But, with each successive generation, a genetic shuffling, known as recombination, intervenes to shorten these blocks and eventually whittles them to scant fragments. Much like the number of nested rings in a tree's trunk reveals its age, the length of haplotype blocks in a population provides a genetic readout of its history.
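The tree-ring analogy can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not the method used in the study: it treats crossover breakpoints along a single lineage as a Poisson process that accumulates at one expected crossover per Morgan per generation, and the function names and the 1-Morgan chromosome length are hypothetical choices made for illustration.

```python
import random


def simulate_block_lengths(generations, length_morgans=1.0, seed=0):
    """Return the lengths of unbroken blocks on a toy chromosome.

    Assumption (illustrative only): breakpoints accumulate as a Poisson
    process with a rate of `generations` breakpoints per Morgan.
    """
    if generations <= 0:
        return [length_morgans]  # no recombination yet: one intact block
    rng = random.Random(seed)
    breakpoints = []
    # Exponential gaps between events generate a Poisson process.
    position = rng.expovariate(generations)
    while position < length_morgans:
        breakpoints.append(position)
        position += rng.expovariate(generations)
    edges = [0.0] + breakpoints + [length_morgans]
    return [right - left for left, right in zip(edges, edges[1:])]


def mean_block_length(generations, length_morgans=1.0, seed=0):
    """Average block length after the given number of generations."""
    blocks = simulate_block_lengths(generations, length_morgans, seed)
    return sum(blocks) / len(blocks)


if __name__ == "__main__":
    for g in (10, 100, 1000):
        print(g, "generations -> mean block length",
              round(mean_block_length(g), 5), "Morgans")
```

In this simplified model, blocks that have been exposed to more generations of recombination come out shorter on average, which is the sense in which block length acts as a readout of a population's history.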
"By surveying haplotype structure, we found clear genetic evidence of the two bottlenecks that modern dogs have undergone, signifying early domestication and subsequent breed development," said author Claire Wade, senior research scientist in the Center for Human Genetic Research at Massachusetts General Hospital and a Broad researcher. "These bottlenecks have given dogs a very special genomic structure that will greatly assist us in finding disease genes."
In probing the details of this configuration, Broad scientists discovered that haplotype blocks in dogs resemble a patchwork quilt of DNA. Short haplotype blocks are shared across dog breeds and represent the vestigial DNA inherited from ancestral dogs that has been trimmed by recombination over many generations. These are interspersed with long haplotype blocks, which are common among dogs of the same breed and are longer due to their more recent origins when modern breeds emerged.
Such a mosaic topology has profound practical implications. Using computers to simulate a small stretch of the dog genome, the scientists played a game of hide-and-seek with an imaginary disease-causing mutation. They discovered that, due to the unusual haplotype block structure in dogs, just 10,000 SNPs are required to locate the genes associated with particular traits, such as disease susceptibility.
Friends in need
The companionship shared between dogs and humans has a lengthy history, tracing back to dogs' initial domestication from gray wolves at least 30,000 years ago. Through selective breeding over the past few centuries, modern dog breeds have become a testament to biological diversity with an impressive medley of body sizes and structures, temperaments and behaviors. Although breeding practices aim to sustain the preferred traits of one generation in the next, they also have the unintended consequence of predisposing many dog breeds to genetic diseases, including heart disease, cancer, blindness, cataracts, epilepsy, hip dysplasia and deafness. As a consequence, veterinary medicine confronts the challenge of identifying the causal genes, so that more effective preventive, diagnostic and treatment modalities can be developed.
Through the branches of nature's evolutionary tree, we see that similar hurdles exist in humans. We suffer from many of the same illnesses as our four-legged friends and even show similar symptoms, but the genetic underpinnings have proved difficult to trace. Because of the distinctive genomic architecture in dog breeds — the haplotype blocks are 50 to 100 times longer than those found in humans — identifying the genes associated with disease should be more straightforward. "Finding the genes that cause disease in humans can be a daunting task, requiring at least 300,000 SNPs and thousands of patients," said author Elinor Karlsson, a graduate student in the Bioinformatics Program at Boston University working at Broad. "Because of their unique genomic organization, we need about 30 times fewer SNPs and only a few hundred volunteers to accomplish this task in dogs."
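As a quick arithmetic check, the figures quoted above can be combined in a few lines. The variable names are illustrative; the numbers are the ones given in this article, and the result matches the 10,000-SNP estimate from the simulation described earlier.

```python
# Figures quoted in the article (variable names are illustrative).
human_snps = 300_000      # "at least 300,000 SNPs" for human studies
reduction_factor = 30     # "about 30 times fewer SNPs" in dogs

dog_snps = human_snps // reduction_factor
print(dog_snps)  # 10000
```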
Broad scientists are currently applying the knowledge gained from SNP analysis to find disease genes. To this end, the dog-owner community has been an essential collaborator. "We deeply appreciate the generous cooperation of individual dog owners and breeders, breed clubs and veterinary schools in providing blood samples for genetic analysis and disease gene mapping," said Lindblad-Toh. "Without their interest and help we could not be doing this work."
Dogs mirror humans in their DNA and in their biology, which makes them critical for understanding diseases in both species. Second only to humans, dogs are the most intensively studied mammal in medicine, often with more detailed genetic and family histories than those available for humans. Moreover, due to their relatively long lifespan, dogs naturally develop many diseases, such as cancer, that afflict aging humans as well. Now, with the availability of comprehensive genomic tools in dogs, future endeavors to map disease-causing mutations will be strengthened and accelerated, heralding medical advances for canines and humans alike.