ChemDataExtractor
Automatically extract chemical information from scientific documents
I know this package has not been updated for some time; however, is there any way to install it on more recent Python versions?
http://chemdataextractor.org/results/abb704de-ca52-4bc4-973b-d34ee8f1407a
I have been working with this library to extract chemical information from HTML pages. I followed http://chemdataextractor.org/demo and saved https://pubs.rsc.org/en/content/articlelanding/2015/TC/C5TC02626A as an HTML file (input3.html). Below is my code. with open('input/input3.html',...
Can anyone tell me how to write a custom parser to fetch a chemical molecule name with its constituent details in the desired format [Chemical name + addition : Constituents],[Chemical name +...
[USpatenttest.xml.zip](https://github.com/mcs07/ChemDataExtractor/files/5243401/USpatenttest.xml.zip) I'm having trouble reading in this XML file with the generic XMLReader. It was downloaded from the WIPO PATENTSCOPE site. I run: `from chemdataextractor import Document` `f = open('USpatenttest.xml', 'rb')` `doc=Document.from_file(f)`...
Does anyone know how to write a custom parser to extract a named entity inside an entity? For example, from the following sentence I want to extract 'boiling' which will...
Please share or provide a sample like the one showcased in the demo (http://chemdataextractor.org/demo). Thanks in advance.
In the NlmXmlReader class
```python
def detect(self, fstring, fname=None):
    """"""
    if fname and not (fname.endswith('.xml') or fname.endswith('.nxml')):
        return False
    if b'xmlns="http://jats.nlm.nih.gov/ns/archiving' in fstring:
        return True
    if b'JATS-archivearticle1.dtd' in fstring:
        return...
```
China's access to the Internet is too slow
I am trying to create a custom parser to extract the boiling points from the following texts, so that the text between "boiling point" and "of" is optional.
```
Paragraph(u'The...
```
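One way to prototype the "optional text between 'boiling point' and 'of'" idea before writing a library parser is a plain regular expression. The sketch below does not use ChemDataExtractor's parser classes; it relies only on Python's standard `re` module. The names `BOILING_POINT` and `find_boiling_points`, the limit of five filler words, and the sample sentences are all hypothetical, since the original example text is cut off.

```python
import re

# Hypothetical pattern (not part of ChemDataExtractor): "boiling point",
# then up to five optional filler words, then "of", a number and a unit.
BOILING_POINT = re.compile(
    r"boiling\s+point"             # the anchor phrase
    r"(?:\s+(?!of\b)\w+){0,5}?"    # optional filler words that are not "of"
    r"\s+of\s+"                    # the literal "of"
    r"(?P<value>-?\d+(?:\.\d+)?)"  # integer or decimal value
    r"\s*°?\s*"                    # optional degree sign
    r"(?P<units>[CKF])\b",         # single-letter temperature unit
    re.IGNORECASE,
)


def find_boiling_points(text):
    """Return a list of (value, units) tuples for every match in text."""
    return [
        (float(m.group("value")), m.group("units").upper())
        for m in BOILING_POINT.finditer(text)
    ]


# Hypothetical sample sentences, with and without text before "of".
samples = [
    "The boiling point of 82.5 °C was recorded.",
    "The boiling point limit of 100 °C was reached.",
]
for sentence in samples:
    print(find_boiling_points(sentence))
```

The lazy, bounded group in the middle is what makes the intervening text optional: it matches zero words when "of" follows immediately and grows only as far as needed otherwise. The same optional-element idea would still have to be expressed with the library's own parsing elements to plug into a custom parser.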