The Cancer Genome Atlas (TCGA) Datasets
The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. TCGA began as a three-year pilot in 2006 with an investment of $50 million each from the National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI). The TCGA pilot project confirmed that an atlas of changes could be created for specific cancer types. It also showed that a national network of research and technology teams working on distinct but related projects could pool the results of their efforts, create an economy of scale and develop an infrastructure for making the data publicly accessible. Importantly, it proved that making the data freely available would enable researchers anywhere around the world to make and validate important discoveries. The success of the pilot led the National Institutes of Health to commit major resources to TCGA to collect and characterize more than 20 additional tumor types.
A full description of the project as well as access to the data can be found at: http://cancergenome.nih.gov/
The Integrative Genomics Viewer (IGV) client server provides an interactive display of the open-access data from the TCGA project without the need for further downloads. Some datasets require approval to access and cannot be provided in this public forum. Individual users who have permission to access these datasets can download them from the above website and load them into IGV locally.
Information about the data analysis steps used to create these datasets can be found here: