Supplementary Info
All supplementary information is linked at: http://www.broad.mit.edu/seq/Saccharomyces/
The download page contains:
S1. Sequences and Alignments
- a. The raw contigs and scaffolds for each species.
- b. The fasta sequences of unambiguous ORFs and flanking intergenic regions for each species.
- c. Multiple alignment of unambiguous ORFs and intergenic regions.
- d. Protein alignments of unambiguous ORFs are linked at the SGD website.
S2. Annotation
- a. All predicted ORFs for each species and their correspondence with S. cerevisiae.
- b. Blocks of conserved gene order (synteny) between each species and S. cerevisiae.
- c. Small and large homology groups for each species.
- d. Other intergenic features: tRNA table and counts per species; transposon tables.
- e. Multiple alignment of tRNA genes, other RNA genes, centromeres, ARS, LTR.
S3. Visualization of gene correspondence
- a. Dotplots between ORFs in each assembly and ORFs in S. cerevisiae chromosomes.
- b. Tiling of local gene order and ORF correspondence in 50kb windows.
- c. Interactive synteny viewer at SGD website.
S4. Mutation counts
- a. Overall counts of transitions, transversions, insertions, deletions along phylogenetic tree.
- b. Ka/Ks rate for uninterrupted S. cerevisiae ORFs.
S5. Rearrangements
- a. Distances between consecutive ORFs in each species and S. cerevisiae.
- b. Table of genomic rearrangements for each species (translocations, inversions, segment duplications).
- c. Table of all insertions/deletions of at least 2kb between consecutive syntenic ORFs.
- d. Clusters of ORFs with ambiguous correspondence define regions of rapid change.
S6. RFC test
- a. Test outcome for every ORF as kept, rejected, or no_call.
- b. RFC score for every ORF across each species and S. cerevisiae.
- c. Window-based RFC analysis for each species and S. cerevisiae.
- d. Correlation of RFC test with length of open frame for short ORFs of varying lengths.
S7. Revisiting S. cerevisiae annotation
- a. Table of proposed changes in ORF boundaries (changed start, changed end, merged ORFs).
- b. Alignment of proposed start/end boundary changes.
- c. Alignment of apparent single-species frame-shift mutations.
- d. Results of resequencing proposed frame-shifts in S. cerevisiae.
- e. Alignment of resequenced regions and PCR primers used.
- f. Table of proposed novel introns and proposed changed introns.
- g. Alignment of known, changed, and proposed novel introns.
S8. Genome-wide motifs
- a. Table of mini-motifs with CC1, CC2, CC3 scores, extension and conservation counts.
- b. Sequence-based motif collapsing for each test.
- c. Co-occurrence-based collapsing of grouped consensi.
S9. Category-based motif discovery
- a. Increased enrichment of known motifs by using multiple genomes.
- b. Comparison of our category-based motif discovery and MEME for known motifs.
- c. All category-based motifs discovered, clustered by sequence similarity.
