Article ID parsing issue. #1

Closed
iacopy opened this issue Feb 25, 2022 · 0 comments · Fixed by #4
Labels
bug Something isn't working

Comments

@iacopy
Owner

iacopy commented Feb 25, 2022

See gijswobben#22

Describe the bug
This reproduces the first report of this bug.

While iterating on articles resulting from a PubMed query, I noticed that some article ids have parsing issues.

For instance:
Query: ((Haliaeetus leucocephalus[Title/Abstract])) AND ((prey[Title/Abstract]) OR (diet[Title/Abstract]))

Returns (when printing the first 10 results):
pubmed_id = '22822430\n18959310\n21310968\n21295371\n20439737'
abstract = ('Bald eagles (Haliaeetus leucocephalus) are recovering from severe population declines...
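
Until the parsing is fixed, a caller could work around the symptom by keeping only the first line of the returned string. This is a minimal sketch, assuming the article's own ID comes first in the newline-joined value; `first_pubmed_id` is a hypothetical helper, not part of the library.

```python
def first_pubmed_id(raw_id: str) -> str:
    """Return the first non-empty line of a newline-joined ID string.

    Assumption: the article's own PubMed ID precedes the IDs of the
    papers it cites.
    """
    for line in raw_id.splitlines():
        line = line.strip()
        if line:
            return line
    return ""


raw = "22822430\n18959310\n21310968\n21295371\n20439737"
print(first_pubmed_id(raw))  # prints 22822430
```

This only hides the problem on the caller's side; the cited-paper IDs are still being collected during parsing.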


@iacopy iacopy added the bug Something isn't working label Feb 25, 2022
iacopy added a commit that referenced this issue Feb 25, 2022
Fix #1

This fix avoids also returning the IDs of cited papers
(they are within the ReferenceList element of the XML).

Note: this issue was tracked as #22 in the original repository (now archived).

An alternative XPath to be used:
path = ".//PubmedData/ArticleIdList/ArticleId[@IdType='pubmed']"
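
The difference between a broad path and the narrower one can be checked with the standard library's `xml.etree.ElementTree`. This is a sketch under stated assumptions: the XML fragment is hand-written and only mimics the PubMed layout, the IDs are made up, and the attribute is spelled `IdType` because attribute matching is case-sensitive and that is, as far as I know, the spelling in PubMed XML. The broad path is a guess at the kind of expression that would produce the reported symptom; the source does not show the original one.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment (hypothetical IDs) that mimics the PubMed layout:
# the article's own IDs sit directly under PubmedData/ArticleIdList,
# while cited papers carry their IDs inside ReferenceList.
SAMPLE_XML = """
<PubmedArticle>
  <MedlineCitation><PMID>11111111</PMID></MedlineCitation>
  <PubmedData>
    <ArticleIdList>
      <ArticleId IdType="pubmed">11111111</ArticleId>
      <ArticleId IdType="doi">10.0000/example</ArticleId>
    </ArticleIdList>
    <ReferenceList>
      <Reference>
        <Citation>Hypothetical cited paper</Citation>
        <ArticleIdList>
          <ArticleId IdType="pubmed">22222222</ArticleId>
        </ArticleIdList>
      </Reference>
    </ReferenceList>
  </PubmedData>
</PubmedArticle>
""".strip()

# Assumed broad path: matches every pubmed ArticleId at any depth.
BROAD_PATH = ".//ArticleId[@IdType='pubmed']"
# Narrow path: only an ArticleIdList that is a direct child of PubmedData.
NARROW_PATH = ".//PubmedData/ArticleIdList/ArticleId[@IdType='pubmed']"


def find_ids(root, path):
    """Return the text of every element matched by path."""
    return [el.text for el in root.findall(path) if el.text]


root = ET.fromstring(SAMPLE_XML)
broad = find_ids(root, BROAD_PATH)
narrow = find_ids(root, NARROW_PATH)

print(broad)   # ['11111111', '22222222']
print(narrow)  # ['11111111']
```

Joining the broad result with newlines gives a multi-ID string of the same shape as the one reported above, while the narrow path returns only the article's own ID.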
@iacopy iacopy mentioned this issue Feb 25, 2022
7 tasks
@iacopy iacopy closed this as completed in #4 Feb 25, 2022