RNA-Seq data

Data from Pauli et al., 2012:

- Embryonic Transcriptome


This track contains all 56,535 confidently identified transcripts (single and multi-exonic) in 28,912 loci. The ‘Embryonic Transcriptome’ data set is based on more than two billion 76-bp paired-end strand-specific RNA-Seq reads derived from PolyA+ selected RNA from eight embryonic stages (2-4 cell, 1000 cell, dome, shield, bud, 28 hours post fertilization (hpf), 48 hpf, 120 hpf). All included transcripts were identified at least twice: either assembled with both Scripture and Cufflinks, or identified in at least two developmental stages.
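The "identified at least twice" rule can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the record layout, and the transcript names below are hypothetical, and only the rule itself (both assemblers, or at least two stages) is taken from the description above.

```python
def seen_at_least_twice(assemblers, stages):
    """Return True if a transcript meets the 'identified at least twice' rule.

    assemblers: names of the assemblers that reported the transcript
    stages:     developmental stages in which it was identified
    """
    both_assemblers = {"Scripture", "Cufflinks"} <= set(assemblers)
    multiple_stages = len(set(stages)) >= 2
    return both_assemblers or multiple_stages


# Hypothetical evidence records, invented for illustration only.
candidates = {
    "tx_a": (["Scripture", "Cufflinks"], ["shield"]),  # both assemblers
    "tx_b": (["Cufflinks"], ["dome", "bud"]),          # two stages
    "tx_c": (["Scripture"], ["48 hpf"]),               # seen only once
}

kept = [name for name, (asm, stg) in candidates.items()
        if seen_at_least_twice(asm, stg)]
print(kept)  # ['tx_a', 'tx_b']
```

Stages are compared as a set, so the same stage listed twice counts once.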
Note: transcripts on the + strand appear in purple; transcripts on the - strand appear in green.
DOWNLOAD:
    - bed file (56,535 isoforms in .bed format)
    - fasta file (transcript sequences of 56,535 isoforms in fasta format)
    - gtf file (conversion of isoforms (Zv9_... IDs) to loci (Xloc_... IDs))
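For readers working with the downloaded bed file, the following is a small sketch of how transcripts could be tallied by strand, which corresponds to the purple/green coloring in the track. It relies only on the standard BED layout, where the strand is the sixth column. The function name and the sample lines are made up for illustration and are not taken from the actual file.

```python
from collections import Counter


def count_strands(bed_lines):
    """Count BED records per strand using column 6 of each record."""
    counts = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # no strand column in this record
        counts[fields[5]] += 1
    return counts


# Invented sample records (chrom, start, end, name, score, strand).
sample = [
    "chr1\t100\t900\texample_tx1\t0\t+",
    "chr1\t1500\t2400\texample_tx2\t0\t-",
    "chr2\t50\t700\texample_tx3\t0\t+",
]

strand_counts = count_strands(sample)
print(strand_counts["+"], strand_counts["-"])  # 2 1
```

To run it on the real download, pass an open file handle instead of the sample list, for example `count_strands(open(path))`, where `path` is wherever the bed file was saved.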

- lncRNAs


This track contains all 1,133 multi-exonic lncRNAs. These transcripts were classified as non-coding by our stringent filtering pipeline, which aimed to remove protein-coding transcripts. Filtering criteria included PhyloCSF (phylogenetic Codon Substitution Frequency), blastx/blastp and hmmer, maximal ORF length, and sense-overlap with protein-coding transcripts. According to their genomic location, these 1,133 lncRNAs can be partitioned into 397 intergenic lncRNAs, 184 intronic overlapping lncRNAs and 566 antisense exonic overlapping lncRNAs.
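Of the filtering criteria listed above, maximal ORF length is the easiest to illustrate in isolation. The sketch below is one plausible way to compute it and is not the pipeline's actual implementation: it assumes an ORF runs from an ATG to the first in-frame stop codon, scans only the three sense-strand frames, and ignores ORFs without a stop codon. The length cutoff used in the pipeline is not stated here, so none is applied.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nucleotides of the longest complete ATG..stop ORF.

    Scans the three forward reading frames; the stop codon is included
    in the reported length. Returns 0 if no complete ORF is found.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # look for further ORFs downstream
    return best


# Toy sequence, invented for illustration: ATG AAA TAG in the third frame.
print(longest_orf_length("CCATGAAATAGGG"))  # 9
```

A real filter would compare this value against a chosen threshold and combine it with the other criteria (PhyloCSF score, blastx/blastp and hmmer hits, sense-overlap) before calling a transcript non-coding.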

DOWNLOAD:
    - bed file (1,133 lncRNAs in .bed format)
    - fasta file (transcript sequences of 1,133 lncRNAs in fasta format)



