I test a bam file with indel reads, and all the reads has a MD tag. If the reference fasta file is provided, the reference base in the ouput will be the correct one. But if I do not provide the reference sequence, and let rnaseqmut infer that from the MD tag., then some position can no be inferred correctly.
Demo run without reference seq:
(The 4th row in the output show ref base of pos:4 is G, while it should be C)
$ rnaseqmut -s 6 -m 3 -i 1 -k test.bam | head -5
rRNA-RNA18SN1 3 C G 35 0 4 0
rRNA-RNA18SN1 3 C A 35 0 2 0
rRNA-RNA18SN1 4 C T 45 0 1 0
rRNA-RNA18SN1 4 G T 45 0 1 0
rRNA-RNA18SN1 5 T G 38 0 4 0
Demo run withreference seq:
$ rnaseqmut -s 6 -m 3 -i 1 -k -r test.fa test.bam | head -5
rRNA-RNA18SN1 3 C G 36 0 4 0
rRNA-RNA18SN1 3 C A 36 0 2 0
rRNA-RNA18SN1 4 C T 45 0 1 0
rRNA-RNA18SN1 5 T G 42 0 4 0
rRNA-RNA18SN1 5 T A 42 0 1 0
demo files: test.tar.gz
I test a bam file with indel reads, and all the reads has a MD tag. If the reference fasta file is provided, the reference base in the ouput will be the correct one. But if I do not provide the reference sequence, and let rnaseqmut infer that from the MD tag., then some position can no be inferred correctly.
Demo run without reference seq:
(The 4th row in the output show ref base of pos:4 is G, while it should be C)
$ rnaseqmut -s 6 -m 3 -i 1 -k test.bam | head -5 rRNA-RNA18SN1 3 C G 35 0 4 0 rRNA-RNA18SN1 3 C A 35 0 2 0 rRNA-RNA18SN1 4 C T 45 0 1 0 rRNA-RNA18SN1 4 G T 45 0 1 0 rRNA-RNA18SN1 5 T G 38 0 4 0Demo run withreference seq:
$ rnaseqmut -s 6 -m 3 -i 1 -k -r test.fa test.bam | head -5 rRNA-RNA18SN1 3 C G 36 0 4 0 rRNA-RNA18SN1 3 C A 36 0 2 0 rRNA-RNA18SN1 4 C T 45 0 1 0 rRNA-RNA18SN1 5 T G 42 0 4 0 rRNA-RNA18SN1 5 T A 42 0 1 0demo files: test.tar.gz