Tagged with #fasta
0 documentation articles | 0 announcements | 5 forum discussions

No articles to display.

Created 2016-05-10 22:25:04 | Updated 2016-05-10 22:29:05 | Tags: fasta snps fastaalternatereferencemaker

Comments (0)

Hi everyone, I have a set of high-quality SNPs that were jointly called from a merged, realigned BAM. I then created a VCF for each sample using SelectVariants, re-ran UnifiedGenotyper with EMIT_ALL_SITES, and filtered the resulting VCF down to no-call sites so I could mask them. Next, I used FastaAlternateReferenceMaker to produce a FASTA file for each sample, with Ns at no-call sites and SNPs at their appropriate positions. How can I use GATK to identify contigs that have no supporting reads for a given sample, and then exclude those regions from the FastaAlternateReferenceMaker output via the -L flag? Apologies if this is trivial, but I haven't found a clear solution.
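One possible way to build such a -L list, sketched under assumptions rather than as a built-in GATK feature: `samtools idxstats` prints one tab-separated line per contig (name, length, mapped reads, unmapped reads), so contigs with a nonzero mapped count can be collected into an interval list. The function name and the sample text below are hypothetical.

```python
def contigs_with_reads(idxstats_text, min_mapped=1):
    """Return contig names whose mapped-read count is at least min_mapped.

    idxstats_text is the tab-separated output of `samtools idxstats`:
    contig name, contig length, mapped reads, unmapped reads.
    """
    keep = []
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, _length, mapped, _unmapped = fields[:4]
        if name == "*":
            continue  # the "*" row holds reads not placed on any contig
        if int(mapped) >= min_mapped:
            keep.append(name)
    return keep


# Hypothetical idxstats output for one sample (not real data).
example = "contig1\t5000\t120\t0\ncontig2\t3000\t0\t0\n*\t0\t0\t15\n"
intervals = contigs_with_reads(example)
print("\n".join(intervals))
```

Saved one name per line to an intervals file (the file name and extension are up to you; check which extensions your GATK version accepts for -L), the result could then restrict FastaAlternateReferenceMaker to contigs that actually have reads for that sample.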


Created 2015-10-29 17:03:50 | Updated | Tags: fasta index-file

Comments (1)


I've followed the exact command-line instructions for two separate reference files (hg.fa and Homo_sapiens_assembly19.fasta). The index files are definitely being generated, but other tools (bwa mem) find no index when I pass the reference as an argument. Full example below:

$ bwa index -a bwtsw -p Homo_sapiens_assembly19 Homo_sapiens_assembly19.fasta
$ samtools faidx Homo_sapiens_assembly19.fasta
$ picard CreateSequenceDictionary \
    REFERENCE=Homo_sapiens_assembly19.fasta \
    OUTPUT=Homo_sapiens_assembly19.dict

(all of these finish without an error)

$ ls Homo_sapiens_assembly19*
Homo_sapiens_assembly19.amb    Homo_sapiens_assembly19.bwt     Homo_sapiens_assembly19.fasta        Homo_sapiens_assembly19.pac
Homo_sapiens_assembly19.ann    Homo_sapiens_assembly19.dict    Homo_sapiens_assembly19.fasta.fai    Homo_sapiens_assembly19.sa


$ bwa mem Homo_sapiens_assembly19.fasta filtered.fastq
[E::bwa_idx_load_from_disk] fail to locate the index files

Any ideas? I could swear I've run this exact same pipeline before with the same files without any issues...

I'm running bwa-0.7.12 with picard-1.140 and samtools-1.2 on a MacBook Pro (OS X El Capitan). Every command was executed within the same directory.

Thanks for any help!
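A likely cause, judging from the listing in the post: `bwa index -p Homo_sapiens_assembly19` writes the index under that prefix (for example Homo_sapiens_assembly19.bwt), while `bwa mem Homo_sapiens_assembly19.fasta` looks for files named after the prefix it is given (Homo_sapiens_assembly19.fasta.bwt and so on). The sketch below is a hypothetical helper, not part of bwa, that checks which of the five index files are present for a given prefix.

```python
# Suffixes of the files that `bwa index` writes next to its prefix.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(prefix, existing_files):
    """List the index files expected for `prefix` that are not present."""
    present = set(existing_files)
    return [prefix + s for s in BWA_INDEX_SUFFIXES if prefix + s not in present]


# File names copied from the `ls` output in the post.
listing = [
    "Homo_sapiens_assembly19.amb",
    "Homo_sapiens_assembly19.ann",
    "Homo_sapiens_assembly19.bwt",
    "Homo_sapiens_assembly19.dict",
    "Homo_sapiens_assembly19.fasta",
    "Homo_sapiens_assembly19.fasta.fai",
    "Homo_sapiens_assembly19.pac",
    "Homo_sapiens_assembly19.sa",
]

# Prefix as passed to `bwa mem` in the post: every index file is missing.
print(missing_bwa_index_files("Homo_sapiens_assembly19.fasta", listing))
# Prefix as passed to `bwa index -p`: nothing is missing.
print(missing_bwa_index_files("Homo_sapiens_assembly19", listing))
```

If that is the problem, either re-running `bwa index` without `-p` (so the prefix defaults to the FASTA file name) or calling `bwa mem Homo_sapiens_assembly19 filtered.fastq` should let bwa find the index.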

Created 2015-09-27 22:16:57 | Updated | Tags: haplotypecaller fasta

Comments (4)


I have tried to solve several issues that came up while trying to run HaplotypeCaller. For this one, I didn't find anything on Google, and to be honest, when I paste the error, Google doesn't even find anything similar.

ERROR MESSAGE: Badly formed genome loc: Contig NC_007605 given as location, but this contig isn't present in the Fasta sequence dictionary

Can anyone please tell me what's the problem here? The fasta file I got was the one downloaded from the bundle: human_g1k_v37.fasta.gz

Any help would be really appreciated. Thank you!!
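The message suggests that some input (a BAM, a VCF, or an interval list) names a contig that the reference's sequence dictionary does not contain. One way to narrow this down, sketched under assumptions: compare the @SQ names in the input's header (for example from `samtools view -H`) with those in the reference .dict file. The function names and the header snippets below are hypothetical and for illustration only.

```python
def sequence_names(header_text):
    """Extract contig names (SN: fields) from @SQ lines of a SAM-style header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def contigs_missing_from_reference(input_header, reference_dict):
    """Return contigs named in input_header that the reference dict lacks."""
    known = set(sequence_names(reference_dict))
    return [n for n in sequence_names(input_header) if n not in known]


# Illustrative, shortened headers (not copied from real files).
ref_dict = "@HD\tVN:1.4\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:MT\tLN:16569\n"
bam_header = (
    "@HD\tVN:1.4\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:MT\tLN:16569\n"
    "@SQ\tSN:NC_007605\tLN:171823\n"
)
print(contigs_missing_from_reference(bam_header, ref_dict))
```

If NC_007605 shows up only on the input side, the data were probably aligned to a different reference build than the one passed to HaplotypeCaller (possibly the decoy variant of b37, which includes that contig), and using the matching reference would be the thing to try.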

Created 2015-04-28 15:14:42 | Updated 2015-04-28 15:15:42 | Tags: vcf fasta

Comments (3)

Hello GATK Team,

Is there a tool within GATK that takes a multiple sequence alignment in FASTA format and converts it to VCF? If not, could anyone point me to a tool that can do this?

Many thanks, Nick
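To illustrate what such a conversion involves, here is a minimal sketch, not a GATK tool. It assumes the first aligned sequence serves as the reference, reports only single-base substitutions, and ignores indels and per-sample genotype columns. The function name and example alignment are hypothetical. Outside GATK, snp-sites is one existing tool that extracts SNPs from a multi-FASTA alignment.

```python
def msa_snps(aligned, chrom="ref"):
    """Return (chrom, pos, ref_base, alt_bases) tuples for SNP columns.

    aligned: list of equal-length aligned sequences; the first one is
    treated as the reference. pos is 1-based in ungapped reference
    coordinates, as in VCF.
    """
    reference = aligned[0].upper()
    others = [s.upper() for s in aligned[1:]]
    records = []
    pos = 0
    for col, ref_base in enumerate(reference):
        if ref_base == "-":
            continue  # insertion relative to the reference: skipped here
        pos += 1
        if ref_base not in "ACGT":
            continue  # ambiguous reference base: skipped
        alts = []
        for seq in others:
            base = seq[col]
            if base in "ACGT" and base != ref_base and base not in alts:
                alts.append(base)
        if alts:
            records.append((chrom, pos, ref_base, ",".join(alts)))
    return records


# Hypothetical three-sequence alignment.
alignment = ["ACGT-A", "ACCT-A", "AGGTTA"]
print("#CHROM\tPOS\tID\tREF\tALT")
for chrom, pos, ref_base, alt in msa_snps(alignment):
    print(f"{chrom}\t{pos}\t.\t{ref_base}\t{alt}")
```

A full VCF would also need header lines and genotype columns, which this sketch leaves out.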

Created 2014-07-30 13:26:04 | Updated | Tags: fasta bam-files

Comments (3)


I am a PhD student working in Sweden, currently trying to apply NGS data to phylogenetics.

I would like to know if there is a way to convert a sorted BAM file into a fasta sequence using only the mapped reads, i.e., without incorporating any of the reference into the fasta sequence?

I have sorted BAM files that result from mapping reads of one species to a reference of a different species. Right now I am extracting the reads from the BAM files and re-assembling them, but the result is sub-optimal because I often get multiple contigs, probably due to low-coverage portions in the BAM file, and this causes many alignment problems. I would like to get a single contig, perhaps with gaps inserted where there are no reads to match the reference, which would be much easier to align to other samples.

Regards, Filipe de Sousa
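The idea of a single contig with gaps where no reads map can be sketched as follows. This assumes per-position consensus base calls have already been derived from the BAM by some other step (for example from a pileup); the function name and the example calls are hypothetical, and this is not a GATK feature.

```python
def masked_consensus(length, calls, gap_char="N"):
    """Build one sequence of `length` bases in reference coordinates.

    calls maps 1-based reference positions to the base called from the
    mapped reads; positions without a call are filled with gap_char, so
    no reference bases are incorporated.
    """
    seq = [gap_char] * length
    for pos, base in calls.items():
        if 1 <= pos <= length:
            seq[pos - 1] = base
    return "".join(seq)


# Hypothetical calls: positions 1, 4 and 5 have no read support.
calls = {2: "A", 3: "C", 6: "T"}
print(">sample1")
print(masked_consensus(6, calls))
```

Because every sample ends up in the same reference coordinates, the resulting sequences already have equal length, which should make them easier to compare across samples.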