What columns are necessary in MAF for MutSigCV?

From some of the previous posts, it sounds like the 'categ' and 'effect' columns are not necessary. If we are okay with using the new defaults (i.e. using the inputs derived from HeLa cells and the cell line encyclopedia, that is, MAF only), which columns are necessary / used by the algorithm? Knowing this would be helpful to me and, I think, would make for more complete documentation.

Thank you,

What columns are necessary in MAF for MutSigCV?

The MAF file should have the following fields:

Hugo_Symbol or gene
Tumor_Sample_Barcode or patient
Variant_Classification or type
Chromosome or chr
Start_position or start
Reference_Allele or ref_allele
Tumor_Seq_Allele1, Tumor_Seq_Allele2 or newbase

Note that the values in the Variant_Classification field must match those in its dictionary.
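
If it helps, here is a rough sketch of a header check you could run before submitting a MAF. It is not part of MutSigCV; the function name missing_maf_fields and the FIELD_ALTERNATIVES table are made up for illustration. It assumes a tab-delimited file with an optional block of leading '#' comment lines, exact (case-sensitive) column names, and that any one of Tumor_Seq_Allele1, Tumor_Seq_Allele2 or newbase is enough for the last field, which is only one reading of the list above.

```python
import csv
import io

# Accepted column names per required field, taken from the list above.
# Assumption: names are matched exactly (case-sensitive), and any single
# name within a group is enough to satisfy that field.
FIELD_ALTERNATIVES = [
    ("Hugo_Symbol", "gene"),
    ("Tumor_Sample_Barcode", "patient"),
    ("Variant_Classification", "type"),
    ("Chromosome", "chr"),
    ("Start_position", "start"),
    ("Reference_Allele", "ref_allele"),
    ("Tumor_Seq_Allele1", "Tumor_Seq_Allele2", "newbase"),
]


def missing_maf_fields(maf_text):
    """Return the field groups for which no accepted column name is present.

    maf_text is the content of a tab-delimited MAF file. Lines starting
    with '#' are skipped, so the first remaining line is taken as the header.
    """
    lines = (ln for ln in io.StringIO(maf_text) if not ln.startswith("#"))
    header = next(csv.reader(lines, delimiter="\t"), [])
    present = {name.strip() for name in header}
    return [group for group in FIELD_ALTERNATIVES if not present.intersection(group)]
```

To use it, read the file into a string and call missing_maf_fields(text); an empty list means every field above was found under one of its accepted names. It only looks at the header, so it does not check that the Variant_Classification values are in the dictionary.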

Gordon

Variant_Classification field

The ones in the mutation_type_dictionary_file.txt, gotcha!