Sequencing & Assembly
The Ustilago maydis genome was sequenced using the Whole Genome Shotgun methodology, whereby:
- Ustilago maydis DNA is randomly sheared into fragments (~4 kb or ~40 kb)
- Each fragment is inserted into a vector and cloned
- The two ends of the fragment are sequenced, creating paired reads
- The assembly process uses the paired reads to identify contiguous stretches of sequence (contigs)
- Contigs are ordered and linked together into larger supercontigs by using paired reads lying in different contigs
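The linking step above can be illustrated with a minimal sketch. It assumes each read end has already been placed in a contig, and it only counts how many read pairs span each pair of contigs; the read names, contig names, and the `count_contig_links` helper are hypothetical and are not taken from the assembler actually used for this genome.

```python
from collections import Counter

def count_contig_links(read_to_contig, pairs):
    """Count read pairs whose two ends were placed in different contigs.

    read_to_contig: dict mapping a read name to the contig it was placed in.
    pairs: iterable of (read_name, mate_name) tuples from the same clone.
    """
    links = Counter()
    for left, right in pairs:
        a = read_to_contig.get(left)
        b = read_to_contig.get(right)
        # Skip pairs with an unplaced end, or with both ends in one contig.
        if a is None or b is None or a == b:
            continue
        # Store the contig pair in a fixed order so A-B and B-A are merged.
        links[tuple(sorted((a, b)))] += 1
    return links

# Hypothetical toy data: four clones, each sequenced from both ends.
read_to_contig = {
    "clone1.f": "contigA", "clone1.r": "contigB",
    "clone2.f": "contigB", "clone2.r": "contigA",
    "clone3.f": "contigB", "clone3.r": "contigB",
    "clone4.f": "contigB", "clone4.r": "contigC",
}
pairs = [
    ("clone1.f", "clone1.r"),
    ("clone2.f", "clone2.r"),
    ("clone3.f", "clone3.r"),
    ("clone4.f", "clone4.r"),
]
links = count_contig_links(read_to_contig, pairs)
```

Contig pairs supported by several independent read pairs are the natural candidates for joining into a supercontig. A real assembler would also check read orientation and the expected insert size (~4 kb or ~40 kb here) before accepting a link.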
Assembly 1, May 28, 2003
- Total length of assembly: 19.683 Mb
- >10X sequencing coverage of the genome
- 274 contigs in 48 scaffolds (supercontigs)
- 19.683 Mb total length of combined contigs (19,683,350 bp)
- Half of all assembled bases lie in a contig with length at least 127 Kb (contig N50)
- Half of all assembled bases lie within a supercontig with length at least 818 Kb (supercontig N50)
- Supercontig and contig numbers are preceded by the version of the assembly. For example:
  - Contig 1.25 refers to contig number 25 within assembly 1.
  - Supercontig 1.2 refers to supercontig number 2 within assembly 1.
- Supercontigs are numbered in order of decreasing length. For example, supercontig 1.1 is the largest with 2.465 Mb, and supercontig 1.48 is the smallest with 3.049 Kb.
- Contigs within supercontigs are ordered positionally. For example, supercontig 1.1 contains contigs 1,2,3...26 (in that order).
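The N50 figures quoted in the statistics above follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the combined length. The sketch below uses hypothetical toy lengths, not the real contig lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first covers at least half
    of the combined length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Hypothetical toy assembly with four contigs (lengths in kb).
toy_lengths = [10, 40, 20, 30]
toy_n50 = n50(toy_lengths)
```

For the toy input the combined length is 100, the 40 kb contig alone covers less than half, and adding the 30 kb contig passes the halfway point, so the N50 is 30.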
There is no correspondence between contig or supercontig numbers in different assemblies.
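Because numbers are only meaningful within one assembly, it can help to keep the assembly version attached whenever identifiers are handled programmatically. The following is a minimal sketch that splits labels of the form "Contig 1.25" or "Supercontig 1.2"; the `AssemblyId` type and `parse_id` function are hypothetical names used for illustration only.

```python
import re
from typing import NamedTuple

class AssemblyId(NamedTuple):
    kind: str      # "contig" or "supercontig"
    assembly: int  # assembly version, the part before the dot
    number: int    # sequence number within that assembly

_ID_PATTERN = re.compile(r"^(contig|supercontig)\s+(\d+)\.(\d+)$", re.IGNORECASE)

def parse_id(label):
    """Split a label such as 'Contig 1.25' into kind, assembly, and number."""
    match = _ID_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier: {label!r}")
    kind, assembly, number = match.groups()
    return AssemblyId(kind.lower(), int(assembly), int(number))

contig_id = parse_id("Contig 1.25")
supercontig_id = parse_id("Supercontig 1.2")
```

Comparing the full tuple, rather than the number alone, avoids treating contig 25 of one assembly as the same sequence as contig 25 of another.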
We end-sequenced the following types of libraries: Plasmid and Fosmid
# Clone ends mapped to Assembly 1