pubmed_parser
:clipboard: A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset
**Describe the bug** I am using pubmed parser to extract table information from XML files. The tests were performed with two papers: PMCIDs 535340 and 535341. Both papers have tables;...
Using parse_pubmed_table with PMC535341.xml and PMC535340.xml (as test papers). In both cases, parse_pubmed_table is not retrieving table info. Other modules like parse_captions / paragraphs work fine. Any hints at what...
**Describe the bug** consider [PMC 1280406](https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=%271280406%27): Published online: 2005 May 31 Published in journal: 2005 Sep valid dates would be '2005-09' or '2005-05-31', but `pp.parse_pubmed_xml` yields '2005-09-31' The culprit is...
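The invalid result reported above is easy to confirm with the standard library alone. This is a minimal sketch, not part of pubmed_parser: `is_valid_iso_date` is a hypothetical helper name, and it only checks that a `YYYY-MM-DD` string is a real calendar date.

```python
from datetime import datetime


def is_valid_iso_date(value: str) -> bool:
    """Return True if value is a real calendar date in YYYY-MM-DD form."""
    try:
        # strptime rejects impossible dates such as 31 September.
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


print(is_valid_iso_date("2005-05-31"))  # True
print(is_valid_iso_date("2005-09-31"))  # False
```

A check like this could be run over parsed records to flag dates that mix fields from different publication-date entries.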
**Describe the bug** I was preparing for a dataset requiring paragraph-level parsing of PMC_OA articles. However, when I try to parse this article with PMC id [PMC8075838](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8075838/), there are actually...
**Describe the bug** Tags in the `` value cause: ```python-traceback File ".../pubmed_oa_parser.py", line 153, in parse_pubmed_xml journal = " ".join([j.text for j in journal_node]) TypeError: sequence item 0: expected str...
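The `TypeError` in this traceback is consistent with how element text works in ElementTree-style APIs: when an element's content starts with a nested tag, its `.text` attribute is `None`, and `str.join` refuses `None`. The sketch below reproduces that behaviour with the standard library. The XML snippet and the `node_text` helper are hypothetical illustrations, not the parser's actual code or fix.

```python
import xml.etree.ElementTree as ET

# Hypothetical journal title whose content begins with a nested tag.
node = ET.fromstring(
    "<journal-title><italic>Acta</italic> Example</journal-title>"
)

# There is no text before the first child element, so .text is None.
print(node.text)  # None


def node_text(element) -> str:
    """Collect all text inside an element, including nested tags."""
    return "".join(element.itertext()).strip()


print(node_text(node))  # Acta Example
```

Using `itertext()` gathers the text of the element and all of its descendants, so nested markup no longer produces `None` entries.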
**Describe the bug** It seems like now every file from the PubMed XML follows the MEDLINE XML format. Therefore, running `pp.parse_pubmed_xml()` on any file will always result in the issue...
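One way to tell the two formats apart before choosing a parser is to look at the root element. The sketch below assumes MEDLINE/PubMed files have a `PubmedArticleSet` root and PMC Open-Access (JATS) files have an `article` root. `root_tag` and `guess_format` are hypothetical helpers, not pubmed_parser functions.

```python
import io
import xml.etree.ElementTree as ET


def root_tag(source) -> str:
    """Return the root element name without parsing the whole document."""
    for _event, element in ET.iterparse(source, events=("start",)):
        # The first "start" event is always the root element.
        return element.tag
    raise ValueError("empty document")


def guess_format(source) -> str:
    """Guess the file format from the root tag (assumed tag names)."""
    tag = root_tag(source)
    if tag == "PubmedArticleSet":
        return "medline"
    if tag == "article":
        return "pmc-oa"
    return "unknown"


sample = io.BytesIO(b"<PubmedArticleSet><PubmedArticle/></PubmedArticleSet>")
print(guess_format(sample))  # medline
```

`source` can be a file path or a binary file object. A file identified as `"medline"` would then go to the MEDLINE parser rather than `pp.parse_pubmed_xml()`.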
First, thank you for your tool, it is very very useful. I was wondering why there is a difference in the processing of list elements according to the parser used....
It looks like the pubmed parser doesn't support the pubmed baseline files? I get the error below. It also doesn't look like the test file is using a similar file...
Does anyone know if there is a parser available that was used to take the original PDF files and convert them to the PubMed Open format?
I am trying to use your parser but there are some problems. My OS is Windows 10 and I took the following steps: I installed Git. From Git I ran: $...