I am putting together a class on NGS data analysis and am working with one of Illumina's "Platinum" data sets (http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?cmd=viewer&m=data&s=viewer&run=ERR262996). It comes from a long-range mate-pair experiment with about 33x coverage. When I run ReduceReads with default parameters on the mapped, aligned, and indel-realigned BAM file, I get only about a 20% reduction in file size. Closer inspection shows that none of the reads were actually removed. I tried this with both the complete data set and a chromosome 20 subset.
What am I missing?