Publications

High-resolution view of the yeast meiotic program revealed by ribosome profiling.

Project: Aim 1, 3
Citation: Brar GA, Yassour M, Friedman N, Regev A, Ingolia NT, Weissman JS
Journal: Science
Publication Date: Feb. 3 2012
Paper Name: Brar.Science.2011.1215110
Link: http://dx.doi.org/10.1126/science.1215110

Abstract: Meiosis is a complex developmental process that generates haploid cells from diploid progenitors. We measured messenger RNA (mRNA) abundance and protein production through the yeast meiotic sporulation program and found strong, stage-specific expression for most genes, achieved through control of both mRNA levels and translational efficiency. Monitoring of protein production timing revealed uncharacterized recombination factors and extensive organellar remodeling. Meiotic translation is also shifted toward noncanonical sites, including short open reading frames (ORFs) on unannotated transcripts and upstream regions of known transcripts (uORFs). Ribosome occupancy at near-cognate uORFs was associated with more efficient ORF translation; by contrast, some AUG uORFs, often exposed by regulated 5' leader extensions, acted competitively. This work reveals pervasive translational control in meiosis and helps to illuminate the molecular basis of the broad restructuring of meiotic cells.

Combinatorial patterning of chromatin regulators uncovered by genome-wide location analysis in human cells.

Project: Aim 1, 2, 4
Citation: Ram O, Goren A, Amit I, Shoresh N, Yosef N, Ernst J, Kellis M, Gymrek M, Issner R, Coyne M, Durham T, Zhang X, Donaghey J, Epstein CB, Regev A, Bernstein BE
Journal: Cell
Publication Date: Dec. 23 2011
Paper Name: Ram.Cell.2011.09.057
Link: http://dx.doi.org/10.1016/j.cell.2011.09.057

Abstract: Hundreds of chromatin regulators (CRs) control chromatin structure and function by catalyzing and binding histone modifications, yet the rules governing these key processes remain obscure. Here, we present a systematic approach to infer CR function. We developed ChIP-string, a meso-scale assay that combines chromatin immunoprecipitation with a signature readout of 487 representative loci. We applied ChIP-string to screen 145 antibodies, thereby identifying effective reagents, which we used to map the genome-wide binding of 29 CRs in two cell types. We found that specific combinations of CRs colocalize in characteristic patterns at distinct chromatin environments, at genes of coherent functions, and at distal regulatory elements. When comparing between cell types, CRs redistribute to different loci but maintain their modular and combinatorial associations. Our work provides a multiplex method that substantially enhances the ability to monitor CR binding, presents a large resource of CR maps, and reveals common principles for combinatorial CR function.

Strategies to discover regulatory circuits of the mammalian immune system.

Project: Aim 1, 3
Citation: Amit I, Regev A, Hacohen N
Journal: Nat Rev Immunol
Publication Date: Nov. 18 2011
Paper Name: Amit.2011.NRI.3109
Link: http://dx.doi.org/10.1038/nri3109

Abstract: Recent advances in technologies for genome- and proteome-scale measurements and perturbations promise to accelerate discovery in every aspect of biology and medicine. Although such rapid technological progress provides a tremendous opportunity, it also demands that we learn how to use these tools effectively. One application with great potential to enhance our understanding of biological systems is the unbiased reconstruction of genetic and molecular networks. Cells of the immune system provide a particularly useful model for developing and applying such approaches. Here, we review approaches for the reconstruction of signalling and transcriptional networks, with a focus on applications in the mammalian innate immune system.

Systematic Discovery of TLR Signaling Components Delineates Viral-Sensing Circuits.

Project: Aim 1, 3
Citation: Chevrier N, Mertins P, Artyomov MN, Shalek AK, Iannacone M, Ciaccio MF, Gat-Viks I, Tonti E, Degrace MM, Clauser KR, Garber M, Eisenhaure TM, Yosef N, Robinson J, Sutton A, Andersen MS, Root DE, von Andrian U, Jones RB, Park H, Carr SA, Regev A, Amit I, Hacohen N. (2011) Cell 147(4):853-67
Journal: Cell
Publication Date: Nov. 11 2011
Paper Name: Chevrier.Cell.2011.10.022
Link: http://dx.doi.org/10.1016/j.cell.2011.10.022

Abstract: Deciphering the signaling networks that underlie normal and disease processes remains a major challenge. Here, we report the discovery of signaling components involved in the Toll-like receptor (TLR) response of immune dendritic cells (DCs), including a previously unknown pathway shared across mammalian antiviral responses. By combining transcriptional profiling, genetic and small-molecule perturbations, and phosphoproteomics, we uncover 35 signaling regulators, including 16 known regulators, involved in TLR signaling. In particular, we find that Polo-like kinases (Plk) 2 and 4 are essential components of antiviral pathways in vitro and in vivo and activate a signaling branch involving a dozen proteins, among which is Tnfaip2, a gene associated with autoimmune diseases but whose role was unknown. Our study illustrates the power of combining systematic measurements and perturbations to elucidate complex signaling circuits and discover potential therapeutic targets.

Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes.

Project: Aim 2
Citation: Ingolia NT, Lareau LF, Weissman JS 
Journal: Cell
Publication Date: Nov. 11 2011
Paper Name: Ingolia.Cell.2011.10.002
Link: http://dx.doi.org/10.1016/j.cell.2011.10.002

Abstract: The ability to sequence genomes has far outstripped approaches for deciphering the information they encode. Here we present a suite of techniques, based on ribosome profiling (the deep sequencing of ribosome-protected mRNA fragments), to provide genome-wide maps of protein synthesis as well as a pulse-chase strategy for determining rates of translation elongation. We exploit the propensity of harringtonine to cause ribosomes to accumulate at sites of translation initiation together with a machine learning algorithm to define protein products systematically. Analysis of translation in mouse embryonic stem cells reveals thousands of strong pause sites and unannotated translation products. These include amino-terminal extensions and truncations and upstream open reading frames with regulatory potential, initiated at both AUG and non-AUG codons, whose translation changes after differentiation. We also define a class of short, polycistronic ribosome-associated coding RNAs (sprcRNAs) that encode small proteins. Our studies reveal an unanticipated complexity to mammalian proteomes.

Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. 

Project: Aim 2
Citation: Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. 
Journal: Genes Dev
Publication Date: Sep. 15 2011
Paper Name: Cabili.2011.GenesDev.17446611
Link: http://dx.doi.org/10.1101/gad.17446611

Abstract: Large intergenic noncoding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalog of >8000 human lincRNAs. Our catalog unifies previously existing annotation sources with transcripts we assembled from RNA-seq data collected from 4 billion RNA-seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of >30 properties, including sequence, structural, transcriptional, and orthology features. We found that lincRNA expression is strikingly tissue-specific compared with coding genes, and that lincRNAs are typically coexpressed with their neighboring genes, albeit to an extent similar to that of pairs of neighboring protein-coding genes. We distinguish an additional subset of transcripts that have high evolutionary conservation but may include short ORFs and may serve as either lincRNAs or small peptides. Our integrated, comprehensive, yet conservative reference catalog of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes.

lincRNAs act in the circuitry controlling pluripotency and differentiation. 

Project: Aim 4
Citation: Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I, Meissner A, Regev A, Rinn JL, Root DE, Lander ES 
Journal: Nature
Publication Date: Aug. 28 2011
Paper Name: Guttman.2011.Nature.10398
Link: http://dx.doi.org/10.1038/nature10398

Abstract: Although thousands of large intergenic non-coding RNAs (lincRNAs) have been identified in mammals, few have been functionally characterized, leading to debate about their biological role. To address this, we performed loss-of-function studies on most lincRNAs expressed in mouse embryonic stem (ES) cells and characterized the effects on gene expression. Here we show that knockdown of lincRNAs has major consequences on gene expression patterns, comparable to knockdown of well-known ES cell regulators. Notably, lincRNAs primarily affect gene expression in trans. Knockdown of dozens of lincRNAs causes either exit from the pluripotent state or upregulation of lineage commitment programs. We integrate lincRNAs into the molecular circuitry of ES cells and show that lincRNA genes are regulated by key transcription factors and that lincRNA transcripts bind to multiple chromatin regulatory proteins to affect shared gene expression programs. Together, the results demonstrate that lincRNAs have key roles in the circuitry controlling ES cell state.

Metabolic labeling of RNA uncovers principles of RNA production and degradation dynamics in mammalian cells.

Project: Aim 1, 3
Citation: Rabani M, Levin JZ, Fan L, Adiconis X, Raychowdhury R, Garber M, Gnirke A, Nusbaum C, Hacohen N, Friedman N, Amit I, Regev A. 
Journal: Nat Biotechnol
Publication Date: May 29 2011
Paper Name: Rabani.2011.NBT.1861
Link: http://dx.doi.org/10.1038/nbt.1861

Abstract: Cellular RNA levels are determined by the interplay of RNA production, processing and degradation. However, because most studies of RNA regulation do not distinguish the separate contributions of these processes, little is known about how they are temporally integrated. Here we combine metabolic labeling of RNA at high temporal resolution with advanced RNA quantification and computational modeling to estimate RNA transcription and degradation rates during the response of mouse dendritic cells to lipopolysaccharide. We find that changes in transcription rates determine the majority of temporal changes in RNA levels, but that changes in degradation rates are important for shaping sharp 'peaked' responses. We used sequencing of the newly transcribed RNA population to estimate temporally constant RNA processing and degradation rates genome wide. Degradation rates vary significantly between genes and contribute to the observed differences in the dynamic response. Certain transcripts, including those encoding cytokines and transcription factors, mature faster. Our study provides a quantitative approach to study the integrative process of RNA regulation.

Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Project: Aim 2
Citation: Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A.
Journal: Nat Biotechnol
Publication Date: May 15 2011
Paper Name: Grabherr.2011.NBT.1883
Link: http://dx.doi.org/10.1038/nbt.1883

Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

Comprehensive comparative analysis of strand-specific RNA sequencing methods.

Project: Aim 1
Citation: Levin JZ, Yassour M, Adiconis X, Nusbaum C, Thompson DA, Friedman N, Gnirke A, Regev A.
Journal: Nat Methods
Publication Date: Sep. 7 2010
Paper Name: Levin.2010.NMeth.1491
Link: http://dx.doi.org/10.1038/nmeth.1491

Abstract: Strand-specific, massively parallel cDNA sequencing (RNA-seq) is a powerful tool for transcript discovery, genome annotation and expression profiling. There are multiple published methods for strand-specific RNA-seq, but no consensus exists as to how to choose between them. Here we developed a comprehensive computational pipeline to compare library quality metrics from any RNA-seq method. Using the well-annotated Saccharomyces cerevisiae transcriptome as a benchmark, we compared seven library-construction protocols, including both published and our own methods. We found marked differences in strand specificity, library complexity, evenness and continuity of coverage, agreement with known annotations and accuracy for expression profiling. Weighing each method's performance and ease, we identified the dUTP second-strand marking and the Illumina RNA ligation methods as the leading protocols, with the former benefitting from the current availability of paired-end sequencing. Our analysis provides a comprehensive benchmark, and our computational pipeline is applicable for assessment of future protocols in other organisms.

Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. 

Project: Aim 2, 4
Citation: Guttman M, Garber M, Levin JZ, Donaghey J, Robinson J, Adiconis X, Fan L, Koziol MJ, Gnirke A, Nusbaum C, Rinn JL, Lander ES, Regev A.
Journal: Nat Biotechnol
Publication Date: May 28 2010
Paper Name: Guttman.2010.NBT.1633
Link: http://dx.doi.org/10.1038/nbt.1633

Abstract: Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5' start sites, 3' ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.

Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

Project: Aim 1
Citation: Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J.
Journal: Science
Publication Date: Oct. 9 2009
Paper Name: Lieberman-Aiden.2009.Science.1181369
Link: http://dx.doi.org/10.1126/science.1181369

Abstract: We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 megabase. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free, polymer conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.

Unbiased reconstruction of a mammalian transcriptional network mediating pathogen responses.

Project: Aim 3
Citation: Amit I, Garber M, Chevrier N, Leite AP, Donner Y, Eisenhaure T, Guttman M, Grenier JK, Li W, Zuk O, Schubert LA, Birditt B, Shay T, Goren A, Zhang X, Smith Z, Deering R, McDonald RC, Cabili M, Bernstein BE, Rinn JL, Meissner A, Root DE, Hacohen N, Regev A.
Journal: Science
Publication Date: Oct. 9 2009
Paper Name: Amit.2009.Science.1179050
Link: http://dx.doi.org/10.1126/science.1179050

Abstract: Models of mammalian regulatory networks controlling gene expression have been inferred from genomic data but have largely not been validated. We present an unbiased strategy to systematically perturb candidate regulators and monitor cellular transcriptional responses. We applied this approach to derive regulatory networks that control the transcriptional response of mouse primary dendritic cells to pathogens. Our approach revealed the regulatory functions of 125 transcription factors, chromatin modifiers, and RNA binding proteins, which enabled the construction of a network model consisting of 24 core regulators and 76 fine-tuners that help to explain how pathogen-sensing pathways achieve specificity. This study establishes a broadly applicable, comprehensive, and unbiased approach to reveal the wiring and functions of a regulatory network controlling a major transcriptional response in primary mammalian cells.