Cytosine methylation is a key component of epigenetic regulation of gene expression and frequently occurs at CpG sites throughout the genome. Bisulfite sequencing is a technique used to analyze genome-wide methylation profiles at single-nucleotide resolution [doi:10.1093/nar/gki901]. Sodium bisulfite efficiently and selectively deaminates unmethylated cytosine residues to uracil without affecting 5-methylcytosine (methylated). When restriction enzymes and PCR are used to enrich for regions of the genome that have high CpG content, the resulting reduced genome comprises ~1% of the original genome but includes key regulatory sequences as well as repeated regions.
The protocol involves several steps. First, genomic DNA is digested with a restriction endonuclease such as MspI, which cuts at CCGG sites regardless of the methylation state of the internal CpG. This results in DNA fragments with CG at the ends. Next, the fragments are size selected (via gel electrophoresis), which enriches for CpG-containing sequences. The DNA is then treated with bisulfite to convert unmethylated C nucleotides to uracil (U), while methylated cytosines remain intact. The bisulfite-treated DNA is amplified with a proofreading-deficient DNA polymerase, which can amplify templates containing both methylated cytosines and the C -> U converted bases. After PCR amplification, each original unmethylated cytosine reads as either a T (+ strand) or an A (- strand), while a methylated C remains a C (+ strand) or a G (- strand). The PCR products are then sequenced using conventional methods and aligned to a reference.
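The steps above can be sketched in silico. This is a minimal illustration only, not part of any real pipeline: the function names, the toy sequence, and the size window are all hypothetical, and the only biological facts assumed are that MspI cuts C^CGG after the first C and that unmethylated C reads as T after conversion and PCR (G reads as A for the opposite strand, in + strand coordinates).

```python
def mspi_digest(seq):
    """Split seq at every CCGG site, cutting after the first C (C^CGG)."""
    fragments, start = [], 0
    pos = seq.find("CCGG")
    while pos != -1:
        cut = pos + 1                      # cut between the first C and CGG
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("CCGG", pos + 1)    # look for the next site
    fragments.append(seq[start:])          # trailing fragment
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length is in [min_len, max_len] (hypothetical window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def bisulfite_convert_plus(seq, methylated=frozenset()):
    """+ strand after conversion and PCR: unmethylated C -> T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def bisulfite_convert_minus(seq, methylated=frozenset()):
    """- strand in + strand coordinates: unmethylated G -> A, methylated G stays G."""
    return "".join(
        "A" if base == "G" and i not in methylated else base
        for i, base in enumerate(seq)
    )


if __name__ == "__main__":
    toy = "AACCGGTTTTCCGGAA"               # made-up sequence with two MspI sites
    frags = mspi_digest(toy)
    print(frags)                            # ['AAC', 'CGGTTTTC', 'CGGAA']
    print(size_select(frags, 4, 10))        # ['CGGTTTTC', 'CGGAA']
    # Position 1 (0-based) is treated as a methylated C and is protected.
    print(bisulfite_convert_plus("ACGTCC", {1}))   # ACGTTT
    print(bisulfite_convert_minus("ACGTCG", {2}))  # ACGTCA
```

The conversion functions also show why an ordinary variant caller struggles with such reads: a C -> T difference from the reference may be either a real SNP or an unmethylated cytosine.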
I have bisulfite-treated sequence data mapped using Bismark and Bowtie2, and I'd like to call SNPs and indels from it. I have used Bis-SNP to call SNPs, but it doesn't call indels. Can I use GATK to call indels from the mapped data? Do you have any support for bisulfite data? Another question, please: the data is a mix from 6 different people. Do you have any support for pooled data? Thanks for your help.