Neurospora Assembly

Methodology Overview

The Neurospora genome was sequenced using the Whole Genome Shotgun methodology, whereby:
  1. Neurospora DNA is shattered into small fragments (~4 kb or ~40 kb)
  2. Each fragment is inserted into a vector and cloned
  3. The two ends of the fragment are sequenced, creating paired reads
  4. The assembly process uses the paired reads to identify contiguous stretches of sequence (contigs)
  5. Contigs are ordered and linked together into larger supercontigs by using paired reads lying in different contigs
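
Step 5 can be illustrated with a minimal sketch. It is not the assembler actually used for this project; it only shows the general idea of grouping contigs that are connected by read pairs whose two ends fall in different contigs. The function name, the input layout, and the min_links threshold are all assumptions made for illustration, and the sketch groups contigs without ordering or orienting them.

```python
from collections import defaultdict


def build_supercontigs(contigs, read_pairs, read_to_contig, min_links=2):
    """Group contigs joined by read pairs (hypothetical helper, illustration only).

    contigs        -- list of contig names
    read_pairs     -- list of (left_read, right_read) tuples from one clone each
    read_to_contig -- dict mapping a read name to the contig it was placed in
    min_links      -- assumed threshold: pairs needed before two contigs are joined
    """
    # Count read pairs whose two ends lie in two different contigs.
    link_counts = defaultdict(int)
    for left, right in read_pairs:
        a = read_to_contig.get(left)
        b = read_to_contig.get(right)
        if a is None or b is None or a == b:
            continue  # unplaced read, or both ends in the same contig
        link_counts[tuple(sorted((a, b)))] += 1

    # Union-find over contigs: each set becomes one supercontig.
    parent = {c: c for c in contigs}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for (a, b), count in link_counts.items():
        if count >= min_links:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for c in contigs:
        groups[find(c)].append(c)

    # Largest groups first; ties broken by member names.
    return sorted(groups.values(), key=lambda g: (-len(g), g))
```

A real scaffolder would also use the expected insert size (~4 kb or ~40 kb) and read orientation to order the contigs and estimate the gaps between them.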

Assembly Data

Assembly 3, February 1, 2001

Sequencing Facts

  • Greater than 10X sequencing coverage of the genome
  • 821 contigs longer than 2 kb
  • 173 supercontigs
  • 46 kb average contig length (range 2-522 kb)
  • 219 kb average supercontig length (range 2-1829 kb)
  • 38 Mb total length of combined contigs (38,044,343 bp)
  • Average base lies in a contig of length 91 kb
  • Average base lies within a supercontig of length 608 kb
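
The list above quotes two different averages. The "average contig length" is the ordinary mean over contigs, while "average base lies in a contig of length ..." is, on the usual reading, a length-weighted mean: pick a base at random and ask how long its contig is. The sketch below shows both calculations on made-up lengths; the function names are illustrative and the numbers are not taken from the assembly.

```python
def mean_length(lengths):
    """Ordinary mean: every contig counts once."""
    return sum(lengths) / len(lengths)


def length_weighted_mean(lengths):
    """Expected contig length for a randomly chosen base.

    A contig of length L holds L bases, so it is weighted by L:
    sum(L * L) / sum(L).
    """
    total = sum(lengths)
    return sum(length * length for length in lengths) / total


# Toy example (invented values, in kb): two short contigs and one long one.
toy = [10, 10, 80]
print(mean_length(toy))           # ordinary mean
print(length_weighted_mean(toy))  # pulled toward the long contig
```

Because long contigs hold most of the bases, the length-weighted figure is always at least as large as the ordinary mean, which is why 91 kb exceeds 46 kb here. As a rough check on the ordinary mean, 38,044,343 bp divided by 821 contigs is about 46 kb.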

Supercontig/Contig Numbering

  • Supercontig and contig numbers are preceded by the version of the assembly. For example:
    • Contig 3.25 - refers to contig number 25 within assembly 3.
    • Supercontig 3.2 - refers to supercontig number 2 within assembly 3. Supercontig 3.2 contains contigs 3.18, 3.19, ..., 3.40.

  • Supercontigs are numbered in order of decreasing length. For example, supercontig 3.1 is the largest with 1.7 Mb, and supercontig 3.173 is the smallest with 2 kb.

    See Supercontig Table for a list of all supercontigs with their lengths and contained contigs, or download a comma-separated file supercontigs.csv.

  • Contigs within supercontigs are ordered positionally. For example, supercontig 2.1 contains contigs 1, 2, 3, ..., 26, 27 (in that order).

    See Contig Table for a list of all contigs with their lengths and supercontigs, or download a comma-separated file contigs.csv.

    There is no correspondence between contig or supercontig numbers in different assemblies.
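
When handling these identifiers in a script, note that a label such as 3.25 is not a decimal number: it means item 25 of assembly 3, so 3.2 comes before 3.18. A minimal sketch of parsing and sorting such labels follows; the AssemblyId type and parse_id function are hypothetical names, not part of any published tool.

```python
from typing import NamedTuple


class AssemblyId(NamedTuple):
    """Hypothetical container for a label such as '3.25'."""
    assembly: int  # assembly version, e.g. 3
    number: int    # contig or supercontig number within that assembly


def parse_id(label):
    """Split 'assembly.number' into two integers.

    Raises ValueError if the label does not have exactly one dot
    or if either part is not an integer.
    """
    assembly, number = label.strip().split(".")
    return AssemblyId(int(assembly), int(number))


# Sorting by the parsed value keeps 3.2 ahead of 3.18 and 3.40.
labels = ["3.40", "3.18", "3.2"]
print(sorted(labels, key=parse_id))
```

Keeping the assembly version as a separate field also makes it easy to refuse comparisons between labels from different assemblies, since their numbering is unrelated.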

Library Clones

    We end-sequenced cosmid and BAC libraries from the Fungal Genetics Stock Center and added these reads to the assembly data.

    Library Name        # Clones mapped to Assembly 3
    BAC                 3,965
    Cosmid pLORIST      6,209
    Cosmid pMOcosX      2,850
    Total               13,024

    You can download a comma-separated file listing all clones: clones.csv (see Download Clones for data details).

    You may also search for clones that overlap a contig region.
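
    For readers who prefer to work from the downloaded clone list, an overlap search of this kind reduces to an interval test. The sketch below is only an illustration: the ClonePlacement record, its field names, and the 1-based inclusive coordinates are assumptions, and the actual layout of clones.csv may differ.

```python
from typing import NamedTuple


class ClonePlacement(NamedTuple):
    # Hypothetical record layout; the real clones.csv columns may differ.
    clone: str   # clone name
    contig: str  # contig label, e.g. "3.25"
    start: int   # first base covered (assumed 1-based, inclusive)
    end: int     # last base covered (assumed 1-based, inclusive)


def overlapping_clones(placements, contig, start, end):
    """Return the names of clones overlapping [start, end] on one contig.

    Two inclusive intervals overlap when each one starts
    no later than the other one ends.
    """
    return [
        p.clone
        for p in placements
        if p.contig == contig and p.start <= end and p.end >= start
    ]
```

    Whatever a search like this returns, the verification advice in the ordering note still applies before a clone is used.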

    Ordering Clones from FGSC:
    As a note of caution, it is not uncommon for cross-well contamination to occur during the preparation and replication of arrayed libraries. As a result, some library copies may have multiple clones within a single well, and the predominant clone in any well may vary from one copy of the library to another.

    You should verify the identity of the clones prior to investing your time and materials. Ideally this verification would begin with isolation of single colonies and be sequence-based, e.g., an end-sequencing reaction, a restriction fingerprint, or a PCR assay based on the expected sequence.

    Order clones from the Fungal Genetics Stock Center

Genetic Markers