Hi there, I have two basic MutSig questions that I was hoping you could help with.
I would like to generate my own MutSig-compliant coverage file. To do this, I am counting the number of nucleotides in each effect/category zone for every gene, and I am wondering how you reduce the complexity of gene transcripts. Do you use the RefSeq gene model? When a gene has more than one RefSeq transcript, how do you choose which one to use?
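To make the transcript question concrete, here is a minimal sketch of one possible way to collapse transcripts: taking the union of all exon intervals per gene so that overlapping bases are counted once. This is only my assumption for illustration, not necessarily what MutSig does; the function names, gene names, transcript names, and coordinates are all made up, and intervals are assumed to be half-open (start, end).

```python
from collections import defaultdict


def union_length(intervals):
    """Count bases covered by half-open (start, end) intervals, overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def collapsed_coding_length(exons):
    """exons: iterable of (gene, transcript, start, end) -> {gene: base count}."""
    by_gene = defaultdict(list)
    for gene, _transcript, start, end in exons:
        by_gene[gene].append((start, end))
    return {gene: union_length(ivals) for gene, ivals in by_gene.items()}


# Toy input: hypothetical genes and transcripts with made-up coordinates.
toy_exons = [
    ("GENE_A", "tx1", 100, 200),
    ("GENE_A", "tx1", 300, 400),
    ("GENE_A", "tx2", 150, 250),
    ("GENE_B", "tx1", 1000, 1100),
]
lengths = collapsed_coding_length(toy_exons)
```

The alternative would be to pick a single transcript per gene (for example the longest one), which gives different per-gene counts, so it would help to know which convention the coverage file expects.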
My other question is whether non-coding variants are used when calculating the significance of mutations. I ask because I have whole-genome data but would like to process only the variants in coding sequences through MutSig, so I wanted to double-check that omitting non-coding variants would not have any unforeseen consequences.
Many thanks for your help