Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs.

Nat Biotechnol
Abstract

Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5' start sites, 3' ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.

Year of Publication
2010
Journal
Nat Biotechnol
Volume
28
Issue
5
Pages
503-10
Date Published
2010 May
ISSN
1546-1696
DOI
10.1038/nbt.1633
PubMed ID
20436462
PubMed Central ID
PMC2868100
Grant list
R01 HG005111 / HG / NHGRI NIH HHS / United States
R01 HG005111-01 / HG / NHGRI NIH HHS / United States
DP1 OD003958-01 / OD / NIH HHS / United States
Howard Hughes Medical Institute / United States
DP1 OD003958 / OD / NIH HHS / United States