datascienceontology
High-level, informal concepts of data science
The ontology should perhaps include high-level concepts of data science, such as "data cleaning/preprocessing", "inference", and "evaluation". Such concepts would clearly be useful, but including them raises several difficulties. Unlike the concepts currently in the ontology, these high-level concepts are
- informal and imprecise, i.e., they do not admit a clean mathematical description
- usually present only implicitly in code or natural-language text, i.e., they must be either inferred using NLP methods or manually annotated by the author of the data analysis
How to proceed is an open question.
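
To make the manual-annotation option above more concrete, here is a minimal sketch of what author-supplied labels might look like. Everything in it is hypothetical: the label set is taken only from the examples in this issue, and `StageAnnotation` and `concepts_at` are illustrative names, not part of the ontology or any existing tooling. Ranges are allowed to overlap, since the boundaries between informal stages are often fuzzy.

```python
from dataclasses import dataclass

# Hypothetical label set, taken from the examples in this issue.
# These are NOT existing ontology concepts.
HIGH_LEVEL_CONCEPTS = {"data-cleaning", "inference", "evaluation"}


@dataclass(frozen=True)
class StageAnnotation:
    """Author-supplied label for a contiguous range of lines in an analysis script."""
    concept: str
    first_line: int
    last_line: int

    def __post_init__(self):
        # Reject labels outside the agreed vocabulary and malformed ranges.
        if self.concept not in HIGH_LEVEL_CONCEPTS:
            raise ValueError(f"unknown concept: {self.concept!r}")
        if not 1 <= self.first_line <= self.last_line:
            raise ValueError("line range must satisfy 1 <= first_line <= last_line")


def concepts_at(annotations, line):
    """Return the sorted high-level concepts whose annotated range covers `line`."""
    return sorted(a.concept for a in annotations
                  if a.first_line <= line <= a.last_line)


# Example annotations for an imaginary 40-line analysis script.
annotations = [
    StageAnnotation("data-cleaning", 1, 12),
    StageAnnotation("inference", 13, 30),
    StageAnnotation("evaluation", 25, 40),  # overlaps with inference on purpose
]
```

With these annotations, `concepts_at(annotations, 27)` returns both `"evaluation"` and `"inference"`. This sketch does nothing about the first difficulty: the labels remain informal strings with no mathematical semantics.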