Support for fixing problems detected by report queries
I would benefit from having some of the problems detected by report queries repaired automatically.
For example, annotation_whitespace constraint violations can be fixed by a SPARQL Update (leaving aside the efficiency of this particular solution), e.g. as here: https://github.com/ontodev/robot/compare/master...psiotwo:robot:feature/fixing-report-issues
A few questions:
- Does such a feature make sense to you? Would you benefit from such 'repairing' functionality for your use cases? It is also possible that I am missing a way to accomplish this with the current ROBOT command set.
- If so, what sort of architecture + API would make sense to you? For example, a --fix-profile parameter to the repair command, with syntax analogous to the profile parameter of the report command, might do the job IMO.
Thanks for feedback.
Interesting idea! How many checks can actually be fixed automatically?
For the default constraint set it seems to me that the following constraints should be repairable automatically:
- annotation_whitespace
- illegal_use_of_built_in_vocabulary
- label_formatting
- label_whitespace
- lowercase_definition
- missing_obsolete_label
- missing_synonymtype_declaration
So I added example queries for annotation_whitespace, label_formatting and label_whitespace, plus tests, to see how it can work: https://github.com/ontodev/robot/compare/master...psiotwo:robot:feature/fixing-report-issues
Alright, you are obviously very serious about this. I don't want you to do crazy work: there is a real problem with using SPARQL Update for applying fixes. While it is easy to query for problems, deleting triples is hard because of reification. Most annotations can easily have an annotation on top, and when that is the case, the SPARQL queries you propose will only fix the direct triple; the redundant reified triple will make the broken values re-emerge during the next serialisation step. So if you really want to work on a repair language, maybe it's better to use the OWL API directly for defining repairs, and extend the repair command? I don't know, I'm just guessing. But SPARQL Update queries are IMO not the right tool for most realistic ontologies.
:-D My point was not so much about defending the SPARQL Update solution (although for my cases the sample queries would be enough) but rather about the "fix feature" itself.
Yet, in general, the point you are raising makes absolute sense! Indeed, SPARQL Update would not work well for annotated axioms. A SPARQL Update-based solution could only be considered an incomplete approach (some problems might get fixed, but not all) that covers simple assertions (the annotation patterns you mention could be matched in the WHERE clause to avoid the problems you are describing).
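To make the reification problem concrete, here is a minimal sketch in plain Java. It is not ROBOT or OWL API code: triples are modelled as simple string tuples, and the names Triple, ReificationSketch, ex:A and ex:xref are made up for illustration. It assumes the usual RDF serialisation of an annotated axiom, i.e. the direct triple plus an owl:Axiom node carrying owl:annotatedSource, owl:annotatedProperty and owl:annotatedTarget.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified triple: all terms are plain strings. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class ReificationSketch {
    private ReificationSketch() {}

    /** A label with stray whitespace that also carries an axiom annotation. */
    static List<Triple> annotatedLabel() {
        List<Triple> graph = new ArrayList<>();
        graph.add(new Triple("ex:A", "rdfs:label", " heart "));
        graph.add(new Triple("_:ax", "rdf:type", "owl:Axiom"));
        graph.add(new Triple("_:ax", "owl:annotatedSource", "ex:A"));
        graph.add(new Triple("_:ax", "owl:annotatedProperty", "rdfs:label"));
        graph.add(new Triple("_:ax", "owl:annotatedTarget", " heart "));
        graph.add(new Triple("_:ax", "ex:xref", "EX:0000001"));
        return graph;
    }

    /** Trims only the direct rdfs:label triple, as a simple update query would. */
    static List<Triple> naiveFix(List<Triple> graph) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : graph) {
            boolean isLabel = t.predicate.equals("rdfs:label");
            out.add(isLabel ? new Triple(t.subject, t.predicate, t.object.trim()) : t);
        }
        return out;
    }

    /** Also trims owl:annotatedTarget on axiom nodes that reify an rdfs:label triple. */
    static List<Triple> reificationAwareFix(List<Triple> graph) {
        Set<String> labelAxiomNodes = new HashSet<>();
        for (Triple t : graph) {
            if (t.predicate.equals("owl:annotatedProperty") && t.object.equals("rdfs:label")) {
                labelAxiomNodes.add(t.subject);
            }
        }
        List<Triple> out = new ArrayList<>();
        for (Triple t : naiveFix(graph)) {
            boolean isTarget = t.predicate.equals("owl:annotatedTarget")
                    && labelAxiomNodes.contains(t.subject);
            out.add(isTarget ? new Triple(t.subject, t.predicate, t.object.trim()) : t);
        }
        return out;
    }

    /** Counts triples whose object still has leading or trailing whitespace. */
    static long countUntrimmed(List<Triple> graph) {
        long count = 0;
        for (Triple t : graph) {
            if (!t.object.equals(t.object.trim())) {
                count++;
            }
        }
        return count;
    }
}
```

After naiveFix the owl:annotatedTarget value is still untrimmed, which is the value that would come back when the annotated axiom is parsed again; the second method shows the extra pattern a WHERE clause would need to cover.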
To make the repair strategies complete, a dedicated (OWL-based) language would be great, but it might be overkill. Still, individual fix strategies, at least for some of the problems with literals (e.g. annotation_whitespace, label_whitespace, label_formatting, lowercase_definition), can be implemented directly as services over OWLAPI.
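As a rough illustration of how small the literal-level fixes could be, here is a sketch of the string transformations only. It deliberately leaves out the OWLAPI part (finding the annotation assertions and replacing them), and it assumes my reading of the checks: whitespace checks flag leading/trailing whitespace, label_formatting flags newline, tab and similar characters, and lowercase_definition flags definitions starting with a lowercase letter. The class name LiteralFixes and its methods are hypothetical.

```java
/** Hypothetical helpers: pure string fixes for literal-level report checks. */
final class LiteralFixes {
    private LiteralFixes() {}

    /** annotation_whitespace / label_whitespace: drop leading and trailing whitespace. */
    static String trimWhitespace(String value) {
        return value.trim();
    }

    /** label_formatting: replace runs of newline, return, tab or form feed with one space. */
    static String collapseFormatting(String label) {
        return label.replaceAll("[\\n\\r\\t\\f]+", " ");
    }

    /** lowercase_definition: upper-case the first character, leave the rest untouched. */
    static String capitalizeFirst(String definition) {
        if (definition.isEmpty()) {
            return definition;
        }
        return Character.toUpperCase(definition.charAt(0)) + definition.substring(1);
    }
}
```

Whether blindly capitalizing a definition is always desirable (e.g. for definitions that start with a gene symbol) is a separate question; the point is only that these repairs are local to one literal.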
Yes, OWLAPI already has a framework for repairs. One clean way to do this would be to implement a ROBOTReportProfile() akin to the OWL API Profile() checking mechanism (as implemented in robot validate-profile), then add --repair to the validate-profile command and use OWLAPI's own repair strategies as well as the ones we implement for ROBOTReportProfile(). This would be an excellent summer project for a student!
Even a more naive approach, where you simply create an interface ROBOTReportRepair and then keep a map with ROBOT report checks as keys and repair strategies as Java objects, would work too. Again, this needs a volunteer. But before we embark on this, we should wait for James to chime in, which could be a while!
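A minimal sketch of that naive variant, assuming a much simpler contract than a real implementation would have: here a repair maps one literal string to a fixed string, whereas a real ROBOTReportRepair would presumably take the ontology and produce OWLAPI changes. Only the interface name comes from the comment above; RepairRegistry, its methods and the registered defaults are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-in: one repair strategy per report check, working on a literal. */
interface ROBOTReportRepair {
    String repair(String literal);
}

/** Hypothetical registry: report check names as keys, repair strategies as values. */
final class RepairRegistry {
    private final Map<String, ROBOTReportRepair> repairs = new LinkedHashMap<>();

    RepairRegistry register(String checkName, ROBOTReportRepair repair) {
        repairs.put(checkName, repair);
        return this;
    }

    boolean canRepair(String checkName) {
        return repairs.containsKey(checkName);
    }

    /** Applies the registered repair, or returns the literal unchanged if none exists. */
    String apply(String checkName, String literal) {
        ROBOTReportRepair repair = repairs.get(checkName);
        return repair == null ? literal : repair.repair(literal);
    }

    /** Assumed defaults for two whitespace checks; other checks stay unrepaired. */
    static RepairRegistry defaults() {
        return new RepairRegistry()
                .register("annotation_whitespace", String::trim)
                .register("label_whitespace", String::trim);
    }
}
```

Checks without a registered strategy simply pass through, so the set of auto-fixable checks can grow one entry at a time.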
This may seem like overkill, but if we added support for a subset of KGCL, it could provide a general high-level update mechanism for annotations.