Predictions rejects test-set features with incorrect variable names even after they are renamed to the correct name
What's wrong?
If Predictions receives inputs from a test set whose features have variable names different from those in the training data, it won't recognize the features, and the predictors will fail. This is as expected. However, using Edit Domain it should be possible to change the feature's variable name to the 'correct' name (the same as in the training set) so that Predictions recognizes the feature. It doesn't. Somehow, it still seems to 'see' the unchanged, 'wrong' variable name.
On the other hand, if the feature has the 'correct' name in the test set, is changed to a 'wrong' name using Edit Domain, and is then changed back to the 'correct' name, Predictions will recognize it as the intended feature.
How can we reproduce the problem?
See the attached workflow (using URLs for input files), which tests several options.
What's your environment?
- Operating system: Mac OS 12.4 (on Silicon)
- Orange version: 3.32.0
- How you installed Orange: from DMG

Predictions bug.ows.zip
This is not a bug, it is by design. In the background, Orange uses compute_value, a function attached to a variable whose goal is to transform any new rows into the appropriate form in the same way the original data was transformed. This is particularly useful in Test and Score and Predictions, where one doesn't have to apply any transformation to the test data (everything is done automatically).
We have already foreseen the issue you are having. In Edit Domain, simply check "unlink variable from its source variable", which will remove its compute_value, thus enabling you to compare variables by name only (not by their inherent similarity).
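To make the behaviour described above easier to follow, here is a minimal toy model in plain Python. It is not Orange's code: `ToyVariable`, `rename`, `unlink` and `matches` are hypothetical names, and the matching rule (same name and same raw origin) is an assumption chosen to mirror what was reported in this thread. The `source` field plays the role that compute_value plays in Orange.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyVariable:
    """Hypothetical stand-in for an Orange variable (not the real class)."""
    name: str
    source: Optional["ToyVariable"] = None  # plays the role of compute_value

    def root(self) -> "ToyVariable":
        # Follow the chain of renames back to the original raw variable.
        var = self
        while var.source is not None:
            var = var.source
        return var


def rename(var: ToyVariable, new_name: str) -> ToyVariable:
    # A rename keeps a link to the variable it was derived from.
    return ToyVariable(new_name, source=var)


def unlink(var: ToyVariable) -> ToyVariable:
    # Unlinking drops the history, so only the name remains.
    return ToyVariable(var.name)


def matches(a: ToyVariable, b: ToyVariable) -> bool:
    # Assumed rule: the names must agree AND the raw origins must agree.
    return a.name == b.name and a.root().name == b.root().name


train = ToyVariable("correct")

# Test-set feature that was 'wrong' and got renamed to 'correct'.
renamed = rename(ToyVariable("wrong"), "correct")
# Test-set feature renamed 'correct' -> 'wrong' -> 'correct'.
round_trip = rename(rename(ToyVariable("correct"), "wrong"), "correct")

print(matches(train, renamed))          # False
print(matches(train, unlink(renamed)))  # True
print(matches(train, round_trip))       # True
```

Under these assumptions, the renamed variable still carries its 'wrong' origin and is rejected, the unlinked one is compared by name only, and the round-trip rename matches because its origin already had the 'correct' name.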
> In Edit Domain, simply check "unlink variable from its source variable"

@ajdapretnar Thanks for the explanation. I'd like to do that, but that option is greyed out...
I am reopening this one. I know Orange is strict about variable reuse, but this case is different. So here the user had a raw variable and renamed it, which still means they should have a functionally raw (but renamed) variable. So I think the matches should indeed be possible in their case and not allowing them is a bug.
We tried it: this problem could be solved if Edit Domain allowed unlinking of renamed variables. The widget needs to be changed in at least two places: it should no longer disable the checkbox, and requires_unlinking should not check whether there are any transformations that might add compute_value (because this test does not seem to work properly?).
The checkbox has a decent tooltip, but it may have to also include a tip that one can rename variables in that fashion. Perhaps also include why you may not want to unlink -- the "history" of changes (e.g. discretization) is lost so the model may not properly apply to new data.