
Add "compare" step

Open annennenne opened this issue 6 years ago • 0 comments

This is a planned extension of dataMaid. Feel free to add suggestions or comments.

In addition to the three existing dataMaid data screening steps (summarize, visualize and check), we will add a new step: compare. The current data screening steps only look at variables one at a time, so they cannot accommodate row-wise comparisons or checks. The "compare" step will make this possible: here, the user can specify comparisons between variables, e.g. linear constraints such as variableA > variableB.
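To make the idea concrete, here is a minimal base R sketch of what a row-wise comparison could report. It is not part of dataMaid: the helper `compareRows()`, its return format and the toy data are all assumptions made up for illustration.

```r
# Toy data; the column names are invented for this example.
toyData <- data.frame(
  id = 1:5,
  variableA = c(10, 7, 3, 8, NA),
  variableB = c(4, 7, 5, 2, 1)
)

# Hypothetical helper (not a dataMaid function): evaluate a row-wise
# comparison inside the data and report the rows that fail it, plus the
# rows where it cannot be evaluated because of missing values.
compareRows <- function(data, rule) {
  ok <- eval(rule, envir = data, enclos = parent.frame())
  list(
    problem = any(!ok, na.rm = TRUE),  # TRUE if at least one row fails
    violatingRows = which(!ok),        # rows where the comparison is FALSE
    missingRows = which(is.na(ok))     # rows where the comparison is NA
  )
}

res <- compareRows(toyData, quote(variableA > variableB))
res$violatingRows
res$missingRows
```

Returning row indices rather than only a yes/no flag is a deliberate choice in this sketch, since it would let a report point back to the offending observations.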

We are considering the following:

  • Support specifying comparisons with rules from the validate package.
  • Allow users to specify comparisons class-wise. E.g. for all factor variables, check that they are not identical to variable a.
  • Allow users to specify comparisons for specific variables only. This would be very flexible, but less straightforward to make user-friendly.
  • No default comparisons should be included, but a few different comparison functions should be implemented and readily available.
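As a rough illustration of the class-wise option, the following base R sketch checks, for every factor variable, whether it is identical to a reference variable `a`. The function name `identicalToRef()`, its arguments and the toy data are hypothetical and not an existing dataMaid or validate interface.

```r
# Toy data; the column names are invented for this example.
toyData <- data.frame(
  a = factor(c("x", "y", "x")),
  b = factor(c("x", "y", "x")),      # same values as a
  other = factor(c("y", "y", "x")),  # differs from a
  num = c(1, 2, 3)                   # not a factor, so it is skipped
)

# Hypothetical helper: among all variables of class `cls` (except the
# reference itself), flag those whose values are identical to `ref`.
identicalToRef <- function(data, ref, cls = "factor") {
  isCls <- vapply(data, inherits, logical(1), what = cls)
  candidates <- setdiff(names(data)[isCls], ref)
  vapply(candidates, function(v) {
    identical(as.character(data[[v]]), as.character(data[[ref]]))
  }, logical(1))
}

flags <- identicalToRef(toyData, ref = "a")
flags
```

The values are compared as character vectors here so that two factors with the same values but differently ordered levels still count as identical. Whether that is the desired behaviour is an open design question.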

Issues that are related to this extension and will be solvable after its implementation:

  • Online tool feature wish: check for unique identifiers [#41]
  • Connection from the identified outliers to the row or unique identifier [#15]

annennenne · Mar 05 '19 16:03