The DATA directory is one of the input directories required to run Arachne. It is given as a subdirectory of PRE, and it in turn contains the subdirectories that hold Arachne output, most importantly RUN. Each DATA directory corresponds to a particular genome-sequencing project.

The DATA directory is where the raw input data is stored. It contains a fasta/ subdirectory with fasta reads, a qual/ subdirectory with qual files, and a traceinfo/ subdirectory with XML ancillary files. It also contains the files genome.size, insert.sites, and others (see Input for a complete list). The files in DATA are used during pre-processing and assembly.
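As a rough illustration of the layout described above, the following Python sketch reports which of the named entries are absent from a DATA directory. It is not part of Arachne: the function name missing_data_entries is hypothetical, and the check covers only the entries named in the text, not the complete list given on the Input page.

```python
from pathlib import Path

# Entries named in the text above. The complete list is on the Input page,
# so this is a partial check only.
EXPECTED_DIRS = ("fasta", "qual", "traceinfo")
EXPECTED_FILES = ("genome.size", "insert.sites")


def missing_data_entries(data_dir):
    """Return the names of expected entries that are absent from data_dir.

    Hypothetical helper for illustration; Arachne does not provide it.
    Directory names are returned with a trailing slash.
    """
    root = Path(data_dir)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling missing_data_entries("/pre/data") before starting an assembly returns an empty list when all of the entries named above are present.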

For example, the script Assemblez requires the following command-line form:

Assemblez PRE=/pre DATA=data RUN=run

This will fail unless the directories /pre and /pre/data already exist. If it succeeds, it will create /pre/data/run and put output there. Note that DATA must be a relative directory location, not an absolute one (i.e., it cannot begin with a slash). However, DATA may contain internal slashes, so that /PRE/DATA is multiple levels below /PRE in the directory tree. DATA may also contain symbolic links.
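The rules above can be checked before launching a run. The sketch below is an assumption-laden illustration, not Arachne code: resolve_data_dir is a hypothetical helper that rejects an absolute DATA value and confirms that PRE and PRE/DATA already exist.

```python
import os


def resolve_data_dir(pre, data):
    """Return the path PRE/DATA, or raise ValueError if a rule is broken.

    Hypothetical helper for illustration; Arachne does not provide it.
    """
    # DATA must be relative to PRE, so it may not begin with a slash.
    if os.path.isabs(data):
        raise ValueError("DATA must be a relative location: %r" % data)
    if not os.path.isdir(pre):
        raise ValueError("PRE directory does not exist: %r" % pre)
    # DATA may contain internal slashes, so PRE/DATA can sit several
    # levels below PRE.
    data_path = os.path.join(pre, data)
    # os.path.isdir follows symbolic links, so a symlinked DATA is accepted.
    if not os.path.isdir(data_path):
        raise ValueError("DATA directory does not exist: %r" % data_path)
    return data_path
```

With the example command above, resolve_data_dir("/pre", "data") returns "/pre/data" when both directories exist, while resolve_data_dir("/pre", "/data") raises ValueError.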

