Reads.ids

From ArachneWiki

reads.ids is a file that lists the names of the reads used in an assembly, sorted by read id. The file appears in SUBDIRs as output, and other assembly modules often use it as input.

For an assembly containing n reads, the file reads.ids will contain n+1 lines. The first line contains the number n (useful information for memory-allocation routines such as READ in System). The following n lines contain the read names, one per line. Note that the ordering of these names is extremely important because it determines the read ids: the first name (on line 2) belongs to read 0, the second name to read 1, and so forth. In general, read k is named on line k+2. A sample file, with 5 reads, might look like this:

5
read_id_0
read_id_1
read_id_2
read_id_3
read_id_4
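
The layout described above can be checked with a few lines of shell. This is only a minimal sketch, not part of Arachne: the function names check_reads_ids and read_name_for_id are hypothetical, and it assumes a standard awk and sed are available.

```shell
# Hypothetical helper: succeed only if the count on line 1 equals
# the number of name lines that follow it.
check_reads_ids() {
    file=$1
    awk 'NR == 1 { n = $1 } END { exit (NR - 1 == n) ? 0 : 1 }' "$file"
}

# Hypothetical helper: print the name of read <id>.
# Read 0 is on line 2, so read k is on line k+2.
read_name_for_id() {
    file=$1
    id=$2
    sed -n "$((id + 2))p" "$file"
}
```

For example, with the sample file above, check_reads_ids reads.ids would succeed, and read_name_for_id reads.ids 3 would print the name stored on line 5.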

Making reads.ids

If necessary, it is easy to create reads.ids from reads.fasta using UNIX commands. First, run the following command:

grep ">" reads.fasta | cut -c2- > reads.ids

This extracts all the read names in the correct order. Next, you must add the first line to reads.ids, which is easily done manually: run wc -l reads.ids to find out the number of reads (i.e., the number of lines in the file at this point), then prepend that number to reads.ids as a line of its own.
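
If you would rather not edit the file by hand, the two steps can be combined. The following is a sketch under stated assumptions rather than an Arachne tool: make_reads_ids is a hypothetical function name, and the pattern is anchored as '^>' so that only lines beginning with ">" are treated as headers. As with the command above, everything after the ">" on a header line is kept as the read name.

```shell
# Hypothetical helper: build <out> (reads.ids format) from <fasta>.
make_reads_ids() {
    fasta=$1
    out=$2
    tmp=$(mktemp) || return 1
    # Step 1: collect the read names in file order, dropping the leading '>'.
    grep '^>' "$fasta" | cut -c2- > "$tmp"
    # Step 2: write the count first, then the names.
    # tr strips the padding that some wc implementations add.
    { wc -l < "$tmp" | tr -d ' '; cat "$tmp"; } > "$out"
    rm -f "$tmp"
}
```

It would be called as make_reads_ids reads.fasta reads.ids. The names go to a temporary file first so that the count is taken before the output file is written.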
