Truncated consensus sequence #104

Benni96 · 2017-06-30T13:51:31Z

Hi,
I collapsed amplicon data and got some truncated reads during the collapse step.

The data were paired-end reads, which were stitched into single reads. The stitched reads were collapsed with the UMI inline.

bmftools collapse inline -S -l 10 -s <homing> -f <prefix> -z <stitched reads>

After mapping, I observed some reads that did not span the entire amplicon region. I looked up the affected read in the UMI file and in the stitched reads file. The "original" stitched read file contained 12900 reads with the UMI; 99.9% were full length and only 10 were shorter. However, the shortest read was still longer than the read in the UMI read file.

UMI: GCATCCACAAAT
Stitched reads with this UMI: 12963 reads
length distribution (count / length):
1 96
1 129
1 130
10 131
161 132
12787 133
2 134
length of the consensus of the UMI family: 69 bp
The homing sequence is 3 nt and the barcode is 10 nt. Therefore, even the shortest (96 nt) read should result in a consensus read longer than 69 bp.
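To make the arithmetic explicit, here is a minimal Python sketch using only the numbers posted above. It assumes that the barcode and homing sequence are both removed from the start of each read, which is my reading of the inline mode and not confirmed behavior; the function name `summarize` is just for illustration.

```python
# Length distribution reported above for UMI GCATCCACAAAT: {read length: count}
reported = {96: 1, 129: 1, 130: 1, 131: 10, 132: 161, 133: 12787, 134: 2}

HOMING_NT = 3      # homing sequence length, from the post
BARCODE_NT = 10    # inline barcode length (-l 10), from the post
CONSENSUS_NT = 69  # observed consensus length for this UMI family


def summarize(dist, prefix_nt):
    """Return (total reads, most common length, shortest length after
    removing prefix_nt bases), assuming the prefix is cut from each read."""
    total = sum(dist.values())
    modal = max(dist, key=dist.get)
    shortest_after_prefix = min(dist) - prefix_nt
    return total, modal, shortest_after_prefix


total, modal, shortest = summarize(reported, HOMING_NT + BARCODE_NT)
print(total, modal, shortest)   # 12963 133 83
print(shortest > CONSENSUS_NT)  # True
```

Under that assumption, even the shortest input read would leave 83 nt, which is still longer than the 69 bp consensus.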

Do you have any suggestions? Or was this observed before?


dnbaker · 2017-07-01T00:14:35Z

Were your input reads all of uniform read length? I'm surprised by this behavior; would you be willing to provide some data with which I can reproduce the issue?

Thank you!

Benni96 · 2017-07-03T11:52:46Z

Hi,
The read length varies a bit (+/- 2 bp), as you can also see in my previous post. I have also observed this phenomenon at a low level in other datasets. By accident, I tried the UMI generation with the option "-n 5", and then the consensus reads were correct. However, I have no clue why this option changes the output. Do you?

dnbaker · 2017-07-05T19:14:16Z

BMFtools assumes uniform read length, which is why adapter masking, not trimming, is suggested.

Are you using Illumina data?

-n only changes memory requirements.

How many reads passed the homing sequence check?
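For readers unfamiliar with the masking versus trimming distinction mentioned above, the following Python sketch illustrates the idea. It is not BMFtools code: the helper names `mask_adapter` and `is_uniform_length` are hypothetical, the adapter is found by exact match only, and using `#` as the quality character for masked bases is an assumption.

```python
def mask_adapter(seq, qual, adapter, mask_qual="#"):
    """Replace the adapter and everything after it with N instead of
    cutting it off, so the read keeps its original length.
    Exact-match search only; real adapter tools tolerate mismatches."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual  # no adapter found, read is unchanged
    n_masked = len(seq) - pos
    return seq[:pos] + "N" * n_masked, qual[:pos] + mask_qual * n_masked


def is_uniform_length(seqs):
    """True if all reads have the same length (vacuously true if empty)."""
    return len({len(s) for s in seqs}) <= 1


# Trimming would shorten the read to 4 nt; masking keeps all 11 positions.
seq, qual = mask_adapter("ACGTAGATCGG", "IIIIIIIIIII", "AGATCGG")
print(seq, qual)  # ACGTNNNNNNN IIII#######
```

Checking `is_uniform_length` on the stitched reads before collapsing would show whether the uniform-length assumption holds for a given input.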
