Mycobacterium tuberculosis PGG1
Our central premise is that the lack of adequate comparative MTB full-genome sequence data significantly hinders our understanding of strain dissemination, strain genotype-patient phenotype relationships, the molecular genetic basis of drug resistance, and fundamental differences in strain behavior. This lack greatly restricts the speed with which new basic science discoveries are made and translated into practical anti-tuberculosis maneuvers. Moreover, it significantly impairs our ability to develop very rapid, relatively inexpensive, high-throughput molecular strategies for use in public health programs designed to limit MTB spread, including dissemination of MDR-TB strains.
Full-genome sequence datasets have rapidly become essential for beginning to understand certain types of biologic problems, regardless of the species or type of problem. The overall goal of the proposed project is to provide a unique, data-rich, enabling set of polymorphic genetic markers to facilitate the drive toward conducting precise, real-time epidemiologic studies of MTB spread and developing rapid, genetically based strategies to identify drug-resistant organisms that are a crucial threat to successful MTB control. Compared with many bacterial pathogens, the allelic diversity of MTB is restricted. Thus, our central premise is that only by aggressive, ultra-deep re-sequencing of the genomes of multiple rationally chosen MTB strains will the genetic information become available that can subsequently be exploited for developing novel diagnostic, epidemiologic, public health, and therapeutic maneuvers. We have arrived at this premise after years of studying MTB population genetics, evolution, and the molecular basis of drug resistance, and after examining our recent data and insights obtained from sequencing the genomes of 100 strains of serotype M3 group A Streptococcus. We therefore propose that the genomes of 100 strains of MTB be sequenced to high fold-coverage. We propose to concentrate specifically on phylogenetically related strains of principal genetic group 1 (PGG1), which are known to commonly cause large outbreaks in many regions of the United States and globally. Strains of the so-called W or W-Beijing clone are members of PGG1. PGG1 isolates also are commonly represented among multidrug-resistant (MDR-TB) and extensively drug-resistant (XDR-TB) organisms in the United States and globally.
The project will exploit strains already present in unique collections of many thousands of MTB strains from diverse geographic sites, obtained from patients of known clinical disease type and known ethnicity, and with known drug-resistance phenotype. All strains have already been extensively characterized for many genetic markers, including principal genetic group, IS6110 RFLP profile, spoligotype, and mycobacterial interspersed repetitive unit (MIRU) type. In addition, many strains have been characterized in our previous studies of single nucleotide polymorphism (SNP) genotype based on a genome-wide sample of 578 SNPs. The dataset that will derive from this proposed project will fundamentally change how we can think about and address many areas of MTB research and public health; that is, it will be a 'disruptive,' research-enabling dataset that will be mined and used by MTB investigators and public health officials worldwide. Thus, the results of the project will have widespread international implications for clinical, translational, and basic science.
Samples are in the queue for sequencing; check back for data.
This project description was taken from a white paper authored by James M. Musser, M.D., Ph.D., Edward Graviss, Ph.D., and Stephen B. Beres, Ph.D., from The Methodist Hospital Research Institute.
This sequencing project was supported by the Genome Sequencing Center for Infectious Diseases at the Broad Institute, which is funded by the National Institute of Allergy and Infectious Diseases, National Institutes of Health.