Embl2Fasta creates non-unique sequence names for embl files with multiple sequences #9

Open
0xaf1f opened this issue Jun 28, 2020 · 1 comment

Comments

@0xaf1f
Contributor

0xaf1f commented Jun 28, 2020

When using transfer type Multiple, if one of the reference EMBL files contains two sequences (such as a chromosome and a plasmid), I get the following error:

ERROR: The reference file may contain sequences with non-unique
       header Ids, please check your input files and try again
ERROR: postnuc returned non-zero

Looking at the file being passed to nucmer, as well as the Embl2Fasta function in main.ratt.pl, it's clear that every sequence in the EMBL file is given the same FASTA header. I have not checked whether patching the function to make the names unique would cause problems for the way RATT handles the matches at later stages (such as retrieving the annotations).
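To illustrate the kind of change I mean, here is a minimal sketch in Python (not the Perl in main.ratt.pl, and not a tested patch). It assumes each EMBL record starts with an `ID` line, has its sequence after an `SQ` line, and ends with `//`. The function name `embl_to_fasta`, the `fallback` parameter, and the `.2`, `.3` suffix scheme are all hypothetical; later RATT stages may expect a different naming convention.

```python
import re


def embl_to_fasta(embl_text, fallback="seq"):
    """Convert EMBL text to FASTA, giving every record a distinct header.

    Hypothetical helper for illustration only; it does not mirror
    RATT's Embl2Fasta.
    """
    used = set()      # headers already written
    records = []      # finished FASTA records
    name, seq, in_seq = None, [], False

    for line in embl_text.splitlines():
        if line.startswith("ID"):
            # The first token after "ID" is the identifier, e.g. "chr1;"
            fields = line[2:].split()
            name = fields[0].rstrip(";") if fields else None
        elif line.startswith("SQ"):
            in_seq = True
        elif line.startswith("//"):
            # End of record: pick a header that has not been used yet
            base = name or fallback
            header, n = base, 1
            while header in used:
                n += 1
                header = f"{base}.{n}"
            used.add(header)
            records.append(f">{header}\n{''.join(seq)}\n")
            name, seq, in_seq = None, [], False
        elif in_seq:
            # Sequence lines carry spaces and a trailing position count
            seq.append(re.sub(r"[^A-Za-z]", "", line))

    return "".join(records)
```

With a two-record file in which both `ID` lines read `chr1`, this would emit the headers `>chr1` and `>chr1.2`, so nucmer would no longer see duplicate IDs. Whether the annotation transfer can map `chr1.2` back to the right record is exactly the open question above.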

@haessar
Collaborator

haessar commented Mar 17, 2023

Might a switch to BioPerl also fix this? (See #12.)
