
chemdner kb implementation needs normalizations

Open galtay opened this issue 3 years ago • 1 comments

https://github.com/bigscience-workshop/biomedical/blob/master/bigbio/biodatasets/chemdner/chemdner.py https://github.com/bigscience-workshop/biomedical/pull/326

The current implementation says it supports the text classification and named entity recognition tasks. The text classification task has MeSH codes, but the NER task does not. This issue is to investigate why the MeSH codes are not available in the normalized field of the kb entity schema, and whether we can make this a named entity disambiguation task as well.

galtay, Jun 05 '22 02:06

The text classification task has MeSH codes, but the NER task does not.

This is because the MeSH codes are assigned as document-level (global) tags and are used for indexing purposes in PubMed. Unfortunately, no annotation is provided at the mention level, so individual entity mentions cannot be linked to MeSH codes. This is why it is a NER and TEXT_CLASSIFICATION dataset.

sg-wbi, Jun 07 '22 14:06
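
To make the document-level versus mention-level distinction from the reply concrete, here is a minimal sketch using plain Python dicts. The field names (`labels`, `entities`, `offsets`, `normalized`) are assumptions modeled on the bigbio text and kb schemas, not copied from chemdner.py, and the sentence, the MeSH ID `D000000`, the entity type, and the helper `has_mention_level_links` are all hypothetical placeholders.

```python
# Illustrative only: record shapes are assumed, not taken from chemdner.py.

# Text-classification view: MeSH codes are tags on the whole document.
doc_level = {
    "document_id": "example-1",
    "text": "Aspirin inhibits cyclooxygenase.",
    "labels": ["D000000"],  # placeholder MeSH ID, not a real annotation
}

# KB (NER) view: each mention has offsets, but nothing to normalize against.
mention_level = {
    "document_id": "example-1",
    "entities": [
        {
            "id": "example-1_T1",
            "type": "CHEMICAL",  # illustrative type label
            "text": ["Aspirin"],
            "offsets": [[0, 7]],
            "normalized": [],  # empty: no mention-level MeSH link exists
        }
    ],
}


def has_mention_level_links(kb_example):
    """Return True if any entity carries at least one normalization."""
    return any(entity["normalized"] for entity in kb_example["entities"])
```

Under these assumptions, `has_mention_level_links(mention_level)` is false. Knowing that the document carries `D000000` does not say which mention, if any, that code refers to, which is why a named entity disambiguation task cannot be derived from the document tags alone.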