Description of Blast Databases

Neurospora

This file contains all the contigs for the current assembly (genomic DNA only; mitochondrial sequence is not included).

Neurospora mitochondria

This file contains all the contigs for the current assembly of the mitochondrial sequence.

Neurospora small contigs

In sequencing a whole genome, the entire genomic DNA is broken into small fragments, and 500-nucleotide reads are sequenced from these fragments. The millions of reads are then assembled into larger contiguous sequence fragments, termed contigs. See Neurospora Assembly for details.

Neurospora Assembly 3 was created on 12/13/02 and consists of the 821 contigs larger than 2000 nucleotides. These 821 contigs, totalling 38 Mb, were analyzed and annotated. Their sequence is available for download and database searching.

The assembly process also produced an additional 813 contigs smaller than 2000 nucleotides. These small contigs are not presented in our assembly because they tend to be redundant and of low quality.

However, for completeness, we provide these small contigs in a separate file (nc3_small_contigs) that you may download or search using our BLAST interface.
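The 2000-nucleotide cutoff described above can be illustrated with a short Python sketch. This is not the procedure used to build the assembly; it only shows how a multifasta file could be partitioned by contig length. The function names read_fasta and split_by_length are hypothetical, and the treatment of a contig of exactly 2000 nucleotides (counted here as small) is an assumption, since the text does not say which side of the cutoff it falls on.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    records: Iterable[Tuple[str, str]], threshold: int = 2000
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate contigs longer than `threshold` from the rest.

    Assumption: a contig of exactly `threshold` nucleotides is
    treated as small.
    """
    large: Dict[str, str] = {}
    small: Dict[str, str] = {}
    for name, seq in records:
        if len(seq) > threshold:
            large[name] = seq
        else:
            small[name] = seq
    return large, small
```

To apply it to a downloaded file, pass an open file handle, for example split_by_length(read_fasta(open(path))), where path is wherever you saved the multifasta file.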

Neurospora excluded reads

Some of the reads are excluded during the assembly process and are not used in the creation of contigs. These are high-quality reads that the assembly software could not place confidently. A common reason for a read to be excluded is that it falls within a highly repetitive region.

We provide these 83924 sequence reads in a separate file (nc3_excluded_reads) that you may download or search using our BLAST interface.

Neurospora upstream / downstream

We provide seven files to help you search for regulatory regions and other elements of 5' and 3' UTRs:
    nc3_gene_upstream300
    nc3_gene_upstream500
    nc3_gene_upstream1000
    nc3_gene_upstream2000
    nc3_gene_downstream300
    nc3_gene_downstream500
    nc3_gene_downstream1000

Each file contains one sequence fragment for each of the 10082 genes predicted in Neurospora Assembly 3. The number in the file name gives the length of the region: the nc3_gene_upstream files contain the 300, 500, 1000, or 2000 bp upstream of each gene, and the nc3_gene_downstream files contain the 300, 500, or 1000 bp downstream of each gene. Each sequence in these multifasta files has a header in the following format:

>UPSTREAM300 (NCU10033.1) predicted protein Neurospora crassa contig 3.736 [1438-1737] reverse complement:

indicating:

  1. Direction of region: UPSTREAM or DOWNSTREAM
  2. Length of region: 300, 500, 1000, or 2000
  3. (NCU#####.#): locus of gene, with version number
  4. Gene name
  5. Contig name
  6. Region of contig corresponding to the displayed region
  7. "reverse complement" if the gene is on the - strand
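
If you process these files with a script, the header fields listed above can be pulled apart with a regular expression. The following Python sketch is based only on the single example header shown here, so the pattern may need adjusting for headers that differ from it. The names HEADER_RE, RegionHeader, and parse_header are hypothetical. The free text between the locus and the word "contig" (gene name followed by the organism name in the example) is kept as one description field.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the one example header; treat it as an assumption.
HEADER_RE = re.compile(
    r"^>?(?P<direction>UPSTREAM|DOWNSTREAM)(?P<length>\d+)\s+"
    r"\((?P<locus>NCU\d+\.\d+)\)\s+"
    r"(?P<description>.*?)\s+"
    r"contig\s+(?P<contig>\S+)\s+"
    r"\[(?P<start>\d+)-(?P<end>\d+)\]"
    r"(?P<revcomp>\s+reverse complement)?:?\s*$"
)


class RegionHeader(NamedTuple):
    direction: str            # "UPSTREAM" or "DOWNSTREAM"
    length: int               # 300, 500, 1000, or 2000
    locus: str                # e.g. "NCU10033.1"
    description: str          # gene name plus organism, as written
    contig: str               # e.g. "3.736"
    start: int                # first contig position of the region
    end: int                  # last contig position of the region
    reverse_complement: bool  # True if the gene is on the - strand


def parse_header(line: str) -> Optional[RegionHeader]:
    """Parse one header line; return None if it does not match."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    return RegionHeader(
        direction=m.group("direction"),
        length=int(m.group("length")),
        locus=m.group("locus"),
        description=m.group("description"),
        contig=m.group("contig"),
        start=int(m.group("start")),
        end=int(m.group("end")),
        reverse_complement=m.group("revcomp") is not None,
    )
```

In the example header the coordinates 1438-1737 span 300 positions when both ends are counted, which matches the stated region length. A check such as h.end - h.start + 1 == h.length may therefore be a useful sanity test, although regions near the end of a contig could plausibly be shorter.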