Skip to content

rnaseqmut can not deal with indel reads correctly without a reference fasta. #4

@y9c

Description

@y9c

I test a bam file with indel reads, and all the reads has a MD tag. If the reference fasta file is provided, the reference base in the ouput will be the correct one. But if I do not provide the reference sequence, and let rnaseqmut infer that from the MD tag., then some position can no be inferred correctly.

Demo run without reference seq:

(The 4th row in the output show ref base of pos:4 is G, while it should be C)

$ rnaseqmut -s 6 -m 3 -i 1 -k test.bam | head -5

rRNA-RNA18SN1   3       C       G       35      0       4       0
rRNA-RNA18SN1   3       C       A       35      0       2       0
rRNA-RNA18SN1   4       C       T       45      0       1       0
rRNA-RNA18SN1   4       G       T       45      0       1       0
rRNA-RNA18SN1   5       T       G       38      0       4       0

Demo run withreference seq:

$ rnaseqmut -s 6 -m 3 -i 1 -k -r test.fa test.bam | head -5

rRNA-RNA18SN1   3       C       G       36      0       4       0
rRNA-RNA18SN1   3       C       A       36      0       2       0
rRNA-RNA18SN1   4       C       T       45      0       1       0
rRNA-RNA18SN1   5       T       G       42      0       4       0
rRNA-RNA18SN1   5       T       A       42      0       1       0

demo files: test.tar.gz

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions