bwa aln returning less alignments for higher number of mismatches #358

Open
Jfortin1 opened this issue Jun 23, 2022 · 4 comments

@Jfortin1

While aligning a 19-nt sequence to the human reference genome (hg38) using bwa aln, I unexpectedly get fewer reported alignments when allowing 6 mismatches than when allowing 5.

My toy example fastq file (toyExample.fastq):

@AGCATGGGGAGCTCCCGGG
AGCATGGGGAGCTCCCGGG
+AGCATGGGGAGCTCCCGGG
~~~~~~~~~~~~~~~~~~~

My code for 6 mismatches:

bwa aln -n 6 -k 0 -l 100 -o 0 -N myBWAIndex toyExample.fastq > out_n6.sai

I set the seed length (-l 100) to a value larger than the read length so that seed-specific constraints are ignored.

Version: 0.7.17-r1198-dirty

@mikiger

mikiger commented Jul 10, 2022

I also just ran into this issue.
It happens to me with many different short sequences of lengths 25-30 against the hg19 reference, where I get fewer hits when running with 5 mismatches than with 4 mismatches (with fewer mismatches it behaves as expected).

Here is an example:
@read1
GTGGGCGGGCAGGTGCAGGTGGGT
+
########################

With 4 mismatches I get 501 hits in X1 (bwa aln -N -n4 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 0 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:501

With 5 mismatches I get only 39 hits in X1 (bwa aln -N -n5 hg19.fa read1.fastq):
19:7591367-7591594/16,24/+ 0 19 7591571 7 24M * 0 0 GTGGGCGGGCAGGTGCAGGTGGGT ######################## XT:A:U NM:i:0 X0:i:1 X1:i:39

bwa version: 0.7.17-r1188
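
To compare runs without reading the tags by eye, the X0 (best hits) and X1 (suboptimal hits) counts can be pulled out of the SAM output. This is only a minimal sketch: count_hits is a hypothetical helper, not part of bwa, and it assumes real tab-delimited SAM records on stdin (the lines pasted above have lost their tabs).

```shell
# Hypothetical helper (not part of bwa): summarize the X0 (best hits) and
# X1 (suboptimal hits) tags for each SAM record read from stdin.
# Assumes tab-delimited SAM, where optional tags start at column 12.
count_hits() {
  awk -F'\t' '
    !/^@/ {                              # skip SAM header lines
      x0 = 0; x1 = 0                     # a missing tag is counted as 0
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^X0:i:/) x0 = substr($i, 6)
        if ($i ~ /^X1:i:/) x1 = substr($i, 6)
      }
      printf "%s\tX0=%d\tX1=%d\ttotal=%d\n", $1, x0, x1, x0 + x1
    }'
}
```

For example, something like `bwa samse hg19.fa out_n4.sai read1.fastq | count_hits` (with out_n4.sai being an assumed name for the bwa aln output) prints one summary line per read, so the -n4 and -n5 runs can be diffed directly.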

@lh3
Owner

lh3 commented Mar 22, 2025

@Jfortin1 and @mikiger are correct that the result is unexpected. This is related to gapped alignment, but I don't know why.

@npsonis

npsonis commented Mar 22, 2025

There is a possible explanation here: https://www.biostars.org/p/16221/#16256
