bwa aln returning less alignments for higher number of mismatches #358

Jfortin1 · 2022-06-23T19:20:24Z

While aligning a 19-nt sequence to the human reference genome (hg38) with bwa aln, I unexpectedly get fewer reported alignments when allowing 6 mismatches than when allowing 5.

My toy example fastq file (toyExample.fastq):

@AGCATGGGGAGCTCCCGGG
AGCATGGGGAGCTCCCGGG
+AGCATGGGGAGCTCCCGGG
~~~~~~~~~~~~~~~~~~~

My code for 6 mismatches:

bwa aln -n 6 -k 0 -l 100 -o 0 -N myBWAIndex toyExample.fastq > out_n6.sai

I set the seed length to a large number to ignore seed-specific constraints.
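To check at which mismatch count the number of hits starts to drop, the same command can be repeated for every value of -n. Below is a minimal Python sketch that only builds and prints the command lines; it does not run bwa. The flags are the ones from the command above. The helper names `aln_command` and `sweep`, and the `out_n{n}.sai` naming for values other than 6, are assumptions for illustration.

```python
import shlex


def aln_command(n, index="myBWAIndex", fastq="toyExample.fastq"):
    """Build the bwa aln argument list from the report, varying only -n."""
    return ["bwa", "aln", "-n", str(n), "-k", "0", "-l", "100",
            "-o", "0", "-N", index, fastq]


def sweep(max_n):
    """Return (argv, output path) pairs for n = 0..max_n.

    The out_n{n}.sai pattern extends the out_n6.sai name used in the
    report; it is an assumption, not something bwa requires.
    """
    return [(aln_command(n), f"out_n{n}.sai") for n in range(max_n + 1)]


if __name__ == "__main__":
    # Print shell-ready command lines; redirect stdout to the .sai file.
    for argv, out in sweep(6):
        print(f"{shlex.join(argv)} > {out}")
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run` with stdout redirected to the `.sai` path.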

Version: 0.7.17-r1198-dirty


mikiger · 2022-07-10T12:41:15Z

I also just ran into this issue.
It happens to me with many different short sequences of length 25-30 against the hg19 reference: I get fewer hits when running with 5 mismatches than with 4 (with fewer mismatches it behaves as expected).

Here is an example:
@read1
GTGGGCGGGCAGGTGCAGGTGGGT
+
########################

With 4 mismatches I get 501 hits in X1: (bwa aln -N -n4 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 0 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:501

With 5 mismatches I get only 39 hits in X1: (bwa aln -N -n5 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 7 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:39
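For comparing runs without reading the tags by eye, here is a minimal Python sketch that pulls X0 (number of best hits) and X1 (number of suboptimal hits) out of a SAM record and checks whether the total ever decreases as the mismatch limit goes up. It assumes whitespace-separated fields, as pasted above, and the function names `hit_counts` and `first_drop` are hypothetical helpers, not part of bwa.

```python
# The two SAM records quoted in this thread (fields separated by spaces).
LINE_N4 = ("19:7591367-7591594/16,24/+ 0 19 7591571 0 24M * 0 0 "
           "GTGGGCGGGCAGGTGCAGGTGGGT ######################## "
           "XT:A:U NM:i:0 X0:i:1 X1:i:501")
LINE_N5 = ("19:7591367-7591594/16,24/+ 0 19 7591571 7 24M * 0 0 "
           "GTGGGCGGGCAGGTGCAGGTGGGT ######################## "
           "XT:A:U NM:i:0 X0:i:1 X1:i:39")


def hit_counts(sam_line):
    """Return (X0, X1) from the optional tags of one SAM record.

    A missing tag is reported as 0.
    """
    tags = {}
    # The first 11 fields are mandatory; optional TAG:TYPE:VALUE follow.
    for field in sam_line.split()[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[1] == "i":
            tags[parts[0]] = int(parts[2])
    return tags.get("X0", 0), tags.get("X1", 0)


def first_drop(lines_by_n):
    """Return the first mismatch limit whose total hit count (X0 + X1)
    is lower than at the previous limit, or None if there is no drop."""
    previous = None
    for n in sorted(lines_by_n):
        total = sum(hit_counts(lines_by_n[n]))
        if previous is not None and total < previous:
            return n
        previous = total
    return None


if __name__ == "__main__":
    print(hit_counts(LINE_N4), hit_counts(LINE_N5))
    print(first_drop({4: LINE_N4, 5: LINE_N5}))
```

On the two records above, the totals are 502 hits at 4 mismatches and 40 at 5, so the drop is flagged at 5.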

bwa version: 0.7.17-r1188

lh3 · 2025-03-22T16:27:50Z

@Jfortin1 and @mikiger are correct that the result is unexpected. This is related to gapped alignment, but I don't know why.

npsonis · 2025-03-22T19:57:36Z

There is a possible explanation here: https://www.biostars.org/p/16221/#16256


lh3 added Investigation needed bwa-aln labels Mar 22, 2025
