Were your input reads all of uniform read length? I'm surprised by this behavior; would you be willing to provide some data with which I can reproduce the issue?
Hi,
The read length varies a bit (+/- 2 bp), as you can see in my original post. I have also observed this phenomenon at a low level in other datasets. By accident, I tried the UMI generation with the option "-n 5", and then the consensus reads were correct. However, I have no clue why this option changes the output. Do you?
Hi,
I collapsed amplicon data and got some truncated reads during the collapse step.
The data were paired-end reads which were stitched into single reads. The stitched reads were then collapsed with the UMI inline.
bmftools collapse inline -S -l 10 -s <homing> -f <prefix> -z <stitched reads>
After mapping, I observed some reads which did not span the entire amplicon region. I traced one such read back to the UMI file and to the stitched reads file. The "original" stitched read file contained about 12900 reads with this UMI; 99.9% were full length and only about 10 were shorter. However, even the shortest stitched read was still longer than the consensus read in the UMI read file.
UMI: GCATCCACAAAT
Stitched reads with this UMI: 12963 reads
length distribution (count / length):
1 96
1 129
1 130
10 131
161 132
12787 133
2 134
length of the consensus of the UMI family: 69 bp
The homing sequence is 3 nt and the barcode is 10 nt. Therefore, even the 96 nt read should result in a consensus read longer than 69 bp.
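To make the arithmetic behind this expectation explicit, here is a minimal Python sketch. It assumes that collapsing simply removes the 10 nt barcode and the 3 nt homing sequence from the start of each stitched read; that trimming model is an assumption for illustration, not a statement about how bmftools works internally. The counts are the ones from the length distribution above, and `trimmed_length` is a hypothetical helper.

```python
# Length distribution reported above: stitched read length -> read count.
length_counts = {96: 1, 129: 1, 130: 1, 131: 10, 132: 161, 133: 12787, 134: 2}

BARCODE_LEN = 10    # from "-l 10" in the command
HOMING_LEN = 3      # homing sequence length stated in the post
CONSENSUS_LEN = 69  # observed consensus length for this UMI family


def trimmed_length(read_len, barcode_len=BARCODE_LEN, homing_len=HOMING_LEN):
    """Length left after removing barcode and homing sequence.

    Assumption: both are stripped as a simple prefix of the stitched read.
    """
    return read_len - barcode_len - homing_len


total_reads = sum(length_counts.values())
shortest = min(length_counts)                     # shortest stitched read
modal = max(length_counts, key=length_counts.get)  # most common read length

print(f"reads in family:         {total_reads}")
print(f"shortest after trimming: {trimmed_length(shortest)} nt")
print(f"modal after trimming:    {trimmed_length(modal)} nt")
print(f"observed consensus:      {CONSENSUS_LEN} bp")

# Under this trimming model, no input read explains a 69 bp consensus.
unexplained = trimmed_length(shortest) > CONSENSUS_LEN
print(f"consensus shorter than every trimmed read: {unexplained}")
```

Under that assumption, the shortest member of the family would still be longer than the reported consensus, so the truncation cannot be attributed to any single short input read.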
Do you have any suggestions? Or was this observed before?