genomic / cdna coord conversion of upstream/downstream regions on NR transcripts #43
Open
rolfschr wants to merge 3 commits into counsyl:master
added 3 commits
November 16, 2017 09:25
…es for non coding transcripts.
I think "NR_002717.2:n.-327" violates the HGVS spec: https://varnomen.hgvs.org/bg-material/numbering/
Under "non-coding DNA reference sequences" it says: "It is not allowed to describe variants in nucleotides which are not covered by the transcript using a non-coding DNA reference sequence."
g-b-f added a commit to g-b-f/pyhgvs2 that referenced this pull request May 26, 2025
Hi,
I found some issues with the genomic / cdna coord conversion for NR transcripts. I base my findings on comparing the results from pyhgvs with Alamut Visual (which in turn is consistent with https://mutalyzer.nl/position-converter). Specifically, I saw that upstream (before the first exon) and downstream (after the last exon) cdna coords were completely off for non-coding (NR) transcripts. Exonic and intronic regions are fine. I believe the issue stems from trying to use the (non-existent) start (or end) position of the CDS region as an offset. I have added some unit tests which fail.

I have also supplied a patch which makes the unit tests pass. However, I do not think this is how one would properly implement `genomic_to_cdna_coord()`; it is more of a hint. I also added a crappy hack for a very specific scenario in `cdna_to_genomic_coord()` where I could not find a more correct way to do it.

As an example:
- `chr13:g.70681018del` on ATXN8OS NR_002717.2 (upstream) is coded as `NR_002717.2:n.-327` by Alamut/Mutalyzer, but `pyhgvs.genomic_to_cdna_coord('NR_002717.2', 70681018)` returns `-326`.
- `chr13:g.70714013del` (same tx, downstream) should be `*128` but is `1600`.
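
To make the expected numbering concrete, here is a minimal sketch of how I understand the Alamut/Mutalyzer convention for non-coding transcripts: upstream positions count backwards from the first transcript base, downstream positions count forwards from the last one with a `*` prefix. This is not the patch itself. The function name `noncoding_coord` and the toy exon coordinates are made up for illustration and are not part of pyhgvs; intronic positions are deliberately left out.

```python
def noncoding_coord(exons, strand, pos):
    """Return an n.-style coordinate string for a genomic position.

    Hypothetical helper, not part of pyhgvs.
    exons:  list of (start, end) genomic intervals, 1-based inclusive,
            sorted by ascending genomic position.
    strand: '+' or '-'.
    pos:    1-based genomic position.
    """
    if strand == '-':
        # Mirror the coordinates so the transcript always reads left to right.
        exons = [(-end, -start) for start, end in reversed(exons)]
        pos = -pos

    tx_start, tx_end = exons[0][0], exons[-1][1]

    if pos < tx_start:
        # Upstream: the base directly before the transcript is -1.
        return str(pos - tx_start)
    if pos > tx_end:
        # Downstream: the base directly after the transcript is *1.
        return "*%d" % (pos - tx_end)

    consumed = 0  # transcript bases in the exons already passed
    for start, end in exons:
        if start <= pos <= end:
            return str(consumed + pos - start + 1)
        consumed += end - start + 1

    raise NotImplementedError("intronic positions are not covered by this sketch")


# Toy transcript with two exons (10 + 5 bases); coordinates are invented.
toy_exons = [(100, 109), (200, 204)]
print(noncoding_coord(toy_exons, '+', 99))   # -1
print(noncoding_coord(toy_exons, '+', 205))  # *1
print(noncoding_coord(toy_exons, '-', 205))  # -1
```

Note that nothing here depends on a CDS start or end, which is the point: for an NR transcript the only sensible anchors are the transcript boundaries.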