RATT transfers order( , ) coordinates and loses a parenthesis #12
Comments
Hi,
The issue is the order tag. I think I used to have a regular expression to replace it. Let me have a look at your fix.
Best,
Thomas
On 16 Feb 2023, at 21:56, Afif Elghraoui wrote:
The reference annotation <https://www.ncbi.nlm.nih.gov/nuccore/NC_000962.3> contains
FT   gene            3593369..3593852
FT                   /locus_tag="Rv3216"
FT                   /pseudogene="unknown"
FT                   /db_xref="GeneID:888845"
FT   misc_feature    order(3593369..3593437,3593439..3593852)
FT                   /locus_tag="Rv3216"
FT                   /note="acetyltransferase (2.3.1.-), contains GNAT domain
FT                   (GCN5-like N-acetyltransferase. See Vetting et al. 2005),
FT                   probably pseudogene as appears frameshifted due to 1bp
FT                   insertion at position 3593438. Frameshift present in all
FT                   sequenced tubercle bacilli. Start changed since first
FT                   submission, extended by 50aa."
FT                   /pseudogene="unknown"
FT                   /db_xref="PSEUDO:CCP46032.1"
which gets transferred to the input assembly as
FT   gene            complement(116773..117256)
FT                   /locus_tag="Rv3216"
FT                   /note="*pseudogene: unknown"
FT                   /db_xref="GeneID:888845"
FT                   /gene="Rv3216"
FT   misc_feature    complement(order(116773..117256)
FT                   /locus_tag="Rv3216"
and then parsing the annotation file fails because the misc_feature coordinate has an unbalanced parenthesis.
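For anyone who wants to spot these lines before a downstream parser trips over them, here is a minimal Python sketch that scans an EMBL feature table for locations with unbalanced parentheses. It is not part of RATT; the function name `unbalanced_locations` is made up for illustration, and it assumes each location fits on a single `FT` line (long `join(...)` locations that wrap onto continuation lines would be reported as false positives).

```python
import re

# A feature line is "FT", a feature key, then a location made only of
# location characters. Qualifier lines ("FT /locus_tag=...") do not match
# because the key pattern excludes "/".
FEATURE_LINE = re.compile(r"^FT\s+([A-Za-z0-9_'\-]+)\s+([A-Za-z0-9.,()<>^:]+)\s*$")


def unbalanced_locations(embl_text):
    """Return (line_number, location) pairs whose parentheses do not balance."""
    problems = []
    for lineno, line in enumerate(embl_text.splitlines(), start=1):
        match = FEATURE_LINE.match(line)
        if match is None:
            continue
        location = match.group(2)
        # Simple count check: enough to catch a dropped closing parenthesis.
        if location.count("(") != location.count(")"):
            problems.append((lineno, location))
    return problems


if __name__ == "__main__":
    sample = "\n".join([
        "FT   gene            complement(116773..117256)",
        'FT                   /locus_tag="Rv3216"',
        "FT   misc_feature    complement(order(116773..117256)",
        'FT                   /locus_tag="Rv3216"',
    ])
    for lineno, location in unbalanced_locations(sample):
        print(f"line {lineno}: unbalanced location {location}")
```

Run on the transferred annotation above, this flags only the `misc_feature` line, which is the one that breaks parsing.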
@0xaf1f Thomas refers to a fix, are you aware of this?
No, I haven't gotten to it yet since I've been working on my own code. I think RATT would benefit from using BioPerl to read/write EMBL files (it might even take care of #10), but I haven't looked into how disruptive that would be compared with updating a regex. I wouldn't suggest waiting for me when your focus is already here.
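To make the "updating a regex" option concrete, here is a rough Python sketch of one way to handle `order( , )` without dropping a parenthesis: peel the operators off the location, work on the plain coordinate pairs, then wrap the operators back on. This is an illustration only, not RATT's code (RATT is Perl and I have not checked how it rewrites locations). The names `parse_location` and `format_location` are hypothetical, and the sketch assumes operators wrap one flat list of `start..end` ranges; anything else raises `ValueError`.

```python
import re

# One operator wrapped around the whole remaining string, e.g. order(...).
WRAPPER = re.compile(r"^(complement|join|order)\((.*)\)$")
# A plain start..end range.
SEGMENT = re.compile(r"^(\d+)\.\.(\d+)$")


def parse_location(location):
    """Split a location into (operators outermost-first, [(start, end), ...])."""
    wrappers = []
    inner = location
    while True:
        match = WRAPPER.match(inner)
        if match is None:
            break
        wrappers.append(match.group(1))
        inner = match.group(2)
    segments = []
    for part in inner.split(","):
        match = SEGMENT.match(part)
        if match is None:
            # Covers malformed input such as a missing closing parenthesis.
            raise ValueError(f"unsupported or malformed location: {location!r}")
        segments.append((int(match.group(1)), int(match.group(2))))
    return wrappers, segments


def format_location(wrappers, segments):
    """Rebuild the location string; every operator gets its closing parenthesis."""
    text = ",".join(f"{start}..{end}" for start, end in segments)
    for name in reversed(wrappers):
        text = f"{name}({text})"
    return text


if __name__ == "__main__":
    ops, segs = parse_location("order(3593369..3593437,3593439..3593852)")
    print(ops, segs)
    print(format_location(["complement", "order"], [(116773, 117256)]))
```

Because the closing parentheses are generated from the operator list rather than copied from the input string, the output is balanced by construction, and the malformed `complement(order(116773..117256)` is rejected instead of being written out.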
Using Bio::SeqIO (BioPerl) would allow me to essentially replace main.ratt.pl:300-500 or so with only a few lines of code, if I have it right. Will put it on the to-do list.
But it requires installing BioPerl, which was annoying in the past…
Best,
Thomas
On 17 Mar 2023, at 15:02, Will Haese-Hill wrote:
Using Bio::SeqIO (Bioperl) would allow me to essentially replace main.ratt.pl:300-500 or so with only a few lines of code, if I have it right. Will put it on the to-do list.