OpenML: Disagreement between OpenML keywords and Croissant keywords
For many datasets, the keywords field of the Croissant metadata JSON is wrong; see, for example, https://www.openml.org/search?type=data&status=active&id=925&sort=runs. The mismatch is evident when comparing the Croissant keywords with the OpenML keywords and with the datasets' content. For some reason, the keywords "Life Science" and "Chemistry" appear in many Croissant metadata files even though the datasets are unrelated to those topics.
Just to confirm, this is an issue with the OpenML management of keywords, right?
They seem to be correct on the OpenML website, but not correct when the JSON is extracted. I am not sure where the issue is introduced.
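
To narrow down where the mismatch is introduced, it may help to diff the two keyword lists for a given dataset. The following is only a rough sketch: it assumes the Croissant JSON has already been downloaded and exposes a top-level "keywords" entry (either a list or a comma-separated string), and that the OpenML keywords were copied by hand from the website. The function name `keyword_mismatch` and the sample values are hypothetical and are not taken from any real dataset.

```python
import json


def keyword_mismatch(croissant_metadata: dict, openml_keywords: list) -> dict:
    """Return the keywords that appear on only one side (case-insensitive)."""
    raw = croissant_metadata.get("keywords", [])
    # Assumption: keywords may be stored as one comma-separated string.
    if isinstance(raw, str):
        raw = raw.split(",")

    def normalize(keywords):
        return {k.strip().lower() for k in keywords if k.strip()}

    croissant = normalize(raw)
    openml = normalize(openml_keywords)
    return {
        "only_in_croissant": sorted(croissant - openml),
        "only_in_openml": sorted(openml - croissant),
    }


# Hypothetical inputs for illustration only; replace with the real
# extracted JSON and the keywords shown on the OpenML dataset page.
sample_json = '{"name": "example", "keywords": ["Life Science", "Chemistry"]}'
sample_openml_keywords = ["example-tag"]

print(keyword_mismatch(json.loads(sample_json), sample_openml_keywords))
```

Running this over a handful of dataset IDs would show whether the same unexpected keywords recur in the extracted JSON, which might point to a default value or a mapping problem rather than per-dataset errors.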