Hello,

I've been trying, unsuccessfully, to write a script for calculating coverage per gene, and I have now found that GATK already does this nicely. I would very much like to use the DepthOfCoverage calculation for each gene, but I cannot find the geneList in the format explained here. I have a RefSeq gene list downloaded from the UCSC Table Browser, which contains the RefSeq name, cds_start and cds_end, and the "chr" information. Is this acceptable?

I want to do this for the exons that fall inside the genes, which I downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/exome_pull_down_targets/ (phase3), so this would be my intervals list. It contains only the chromosome and the start and end positions.

I have also run bedtools genomecov with the "bedgraph" option on my BAM files, and I wrote a script that calculates the mean coverage for each exon that has reads falling on it. Does DepthOfCoverage calculate coverage on the same principle?

Moreover, I cannot find a table in UCSC that combines the RefSeq name with the commonly used gene name. Are the two combined in the geneList you provide with GATK? Could you tell me where to find the exact URL for /humgen/.../geneList.txt, or whether my own list could work, and whether the exons table is acceptable with only these 3 columns?

I'm a registered member, at least for posting in the forum. Is any extra procedure needed to access your database?

Thanks in advance!
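For reference, the per-exon mean coverage calculation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the bedGraph records are taken to be 0-based, half-open `chrom start end depth` lines, the exon intervals are assumed to use the same coordinate convention, and positions not listed in the bedGraph are counted as depth 0. The function names `parse_bedgraph` and `mean_coverage` are made up for this example, and nothing here confirms that DepthOfCoverage computes its per-interval mean the same way.

```python
from collections import defaultdict


def parse_bedgraph(lines):
    """Group bedGraph records (chrom, start, end, depth) by chromosome."""
    by_chrom = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, depth = line.split()[:4]
        by_chrom[chrom].append((int(start), int(end), float(depth)))
    return by_chrom


def mean_coverage(by_chrom, chrom, start, end):
    """Mean depth over [start, end); bases absent from the bedGraph count as 0."""
    length = end - start
    if length <= 0:
        raise ValueError("interval must have positive length")
    total = 0.0
    for seg_start, seg_end, depth in by_chrom.get(chrom, []):
        # Number of bases shared by the bedGraph segment and the exon.
        overlap = min(seg_end, end) - max(seg_start, start)
        if overlap > 0:
            total += overlap * depth
    return total / length


if __name__ == "__main__":
    # Hypothetical records, not real data.
    records = ["chr1\t100\t110\t10", "chr1\t110\t120\t20"]
    coverage = parse_bedgraph(records)
    print(mean_coverage(coverage, "chr1", 105, 115))
```

This version scans every segment on the chromosome for each exon, which keeps the sketch short; for whole-exome target lists, sorting the segments and using a binary search or an interval tree would be considerably faster.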

I am learning to use the DepthOfCoverage function to obtain gene coverage information for a collection of bacterial contigs to which metagenomic reads were mapped. The original post introducing this function is here: http://gatkforums.broadinstitute.org/discussion/40/depthofcoverage-v3-0-how-much-data-do-i-have#latest

In the post, you mentioned the gene list, as follows:

-geneList /path/to/gene/list.txt

The provided gene list must be of the following format:

585     NM_001005484    chr1    +       58953   59871   58953   59871   1       58953,  59871,  0       OR4F5   cmpl    cmpl    0,
587     NM_001005224    chr1    +       357521  358460  357521  358460  1       357521, 358460, 0       OR4F3   cmpl    cmpl    0,

I have the following inquiries:

  1. Can you please provide headers to the values in each column?
  2. I am working with bacterial genomic contigs. Can you please specify what basic information is needed for a gene list (e.g., name of contig, name of gene, location of the gene in the contig, from ... to ..., etc.)?
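
On the column headers: the two sample rows above appear to match the layout of the UCSC `refGene` table (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, cdsStartStat, cdsEndStat, exonFrames). Assuming that layout, a minimal sketch for reading such rows could look like the one below. The names `REFGENE_COLUMNS` and `parse_refgene_line` are made up for this illustration, and whether DepthOfCoverage requires every one of these fields for bacterial contigs is not something this sketch establishes.

```python
# Column names as in the UCSC refGene table schema (assumed to apply here).
REFGENE_COLUMNS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
    "score", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
]


def parse_refgene_line(line):
    """Turn one whitespace-separated refGene-style row into a dict."""
    fields = line.split()
    if len(fields) != len(REFGENE_COLUMNS):
        raise ValueError(
            f"expected {len(REFGENE_COLUMNS)} columns, got {len(fields)}"
        )
    record = dict(zip(REFGENE_COLUMNS, fields))
    # Convert the simple integer columns.
    for key in ("txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount"):
        record[key] = int(record[key])
    # exonStarts/exonEnds are comma-terminated lists such as "58953,".
    starts = [int(x) for x in record["exonStarts"].split(",") if x]
    ends = [int(x) for x in record["exonEnds"].split(",") if x]
    record["exons"] = list(zip(starts, ends))
    return record


if __name__ == "__main__":
    row = ("585 NM_001005484 chr1 + 58953 59871 58953 59871 1 "
           "58953, 59871, 0 OR4F5 cmpl cmpl 0,")
    gene = parse_refgene_line(row)
    # "name" holds the RefSeq accession, "name2" the gene symbol.
    print(gene["name"], gene["name2"], gene["chrom"], gene["exons"])
```

Under the same assumption, the `name` column carries the RefSeq accession and `name2` the gene symbol, so a single row pairs the two.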

Thanks so much!