While aligning a 19-nt sequence to the human reference genome (hg38) with bwa aln, I unexpectedly get fewer reported alignments when allowing 6 mismatches than when allowing 5.
I just ran into the same issue.
It happens to me with many different short sequences of lengths 25-30 against the hg19 reference: I get fewer hits when running with 5 mismatches than with 4 mismatches (with fewer mismatches it behaves as expected).
Here is an example:
@read1
GTGGGCGGGCAGGTGCAGGTGGGT
+
########################
With 4 mismatches I get 501 hits in X1 (bwa aln -N -n4 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 0 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:501
With 5 mismatches I get only 39 hits in X1 (bwa aln -N -n5 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 7 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:39
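To compare the two runs without reading the records by eye, the X0/X1 tags (number of best and suboptimal hits, as reported by BWA) can be pulled out of each SAM record. This is only a minimal sketch: `sam_tags` is a hypothetical helper name, and it splits on any whitespace because the records pasted above lost their tabs. On a real SAM file, splitting on tabs would be safer.

```python
def sam_tags(record: str) -> dict:
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as a dict."""
    tags = {}
    # The first 11 fields of a SAM record are mandatory; the rest are optional tags.
    # Assumption: fields are whitespace-separated, as in the records pasted above.
    for field in record.split()[11:]:
        tag, typ, value = field.split(":", 2)
        # Integer-typed tags (type "i") are converted; everything else stays a string.
        tags[tag] = int(value) if typ == "i" else value
    return tags


# The two records reported above, copied verbatim.
run_n4 = ("19:7591367-7591594/16,24/+ 0 19 7591571 0 24M * 0 0 "
          "GTGGGCGGGCAGGTGCAGGTGGGT ######################## "
          "XT:A:U NM:i:0 X0:i:1 X1:i:501")
run_n5 = ("19:7591367-7591594/16,24/+ 0 19 7591571 7 24M * 0 0 "
          "GTGGGCGGGCAGGTGCAGGTGGGT ######################## "
          "XT:A:U NM:i:0 X0:i:1 X1:i:39")

for label, record in (("4 mismatches", run_n4), ("5 mismatches", run_n5)):
    tags = sam_tags(record)
    print(label, "X0 =", tags["X0"], "X1 =", tags["X1"])
```

Run on the two records above, this makes the drop in X1 from 501 to 39 easy to spot, and the same loop could be pointed at the output for other mismatch settings.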
My toy example FASTQ file (toyExample.fastq):
My code for 6 mismatches:
I set the seed length to a large number to ignore seed-specific constraints.
Version: 0.7.17-r1198-dirty