
[FEATURE] Allow running evaluation on validation set during training or right after training is done

Open lthoang opened this issue 1 year ago • 4 comments

Description

This is a breaking change. In many training/validation/test pipelines, evaluation on the validation data happens during training (for monitoring/model selection). The current version of cornac evaluates the validation set in the same way as the test set.

Expected behavior with the suggested feature

  • Evaluate on the validation set periodically, every n epochs, and report the validation results.
  • Some models (e.g., MMNR) use different external data for validation and test, which requires a way to distinguish validation evaluation from test evaluation.
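The first bullet could be sketched roughly as follows. This is a minimal, hypothetical illustration of periodic validation during training, not cornac's actual API; `train_epoch`, `evaluate`, and `fit_with_validation` are placeholder names.

```python
# Sketch: evaluate on the validation set every `eval_every` epochs
# during training and report the results. `train_epoch` and `evaluate`
# are hypothetical stand-ins for a real model's update and metric code.

def train_epoch(model_state):
    # Pretend one epoch of training improves the model slightly.
    return model_state + 1

def evaluate(model_state, dataset):
    # Pretend metric: a larger model_state yields a better score.
    return model_state / 10.0

def fit_with_validation(n_epochs, eval_every, val_set):
    model_state = 0
    history = []  # (epoch, validation score) pairs for monitoring
    for epoch in range(1, n_epochs + 1):
        model_state = train_epoch(model_state)
        if epoch % eval_every == 0:
            history.append((epoch, evaluate(model_state, val_set)))
    return model_state, history

model_state, history = fit_with_validation(n_epochs=6, eval_every=2, val_set="val")
print(history)  # -> [(2, 0.2), (4, 0.4), (6, 0.6)]
```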

Other Comments

lthoang avatar Apr 03 '24 03:04 lthoang

Couple of things:

  • if it's only for performance-reporting purposes on the validation set, we can implement it as part of the model training loop instead of as part of the general pipeline
  • for the MMNR model, what does it mean to use different external data for validation and test? Can we have a specific example?

qtuantruong avatar Apr 03 '24 22:04 qtuantruong

for the MMNR model, what does it mean to use different external data for validation and test? Can we have a specific example?

@tqtg, if we look closely at this function, we can see that MMNR uses different history matrices for the train, validation, and test data.

My idea is to break down the evaluation pipeline into train, validation, and test stages to reduce the redundancy of the current implementation (currently, cornac allows manipulating val_set along with train_set inside the fit function, which may cause val_set to be re-evaluated once or twice inside the score or rank functions).

lthoang avatar Apr 04 '24 13:04 lthoang

We provide models with both val_set and train_set so that they can perform early stopping, hyper-parameter optimization, or any other kind of trade-off for model selection inside the training loop. I don't get your last sentence about the score and rank functions.
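The early-stopping use of val_set mentioned here could look roughly like the following sketch. It is a generic illustration (not cornac's actual early-stopping mechanism); the function name, the precomputed score list, and the higher-is-better metric are all assumptions.

```python
# Sketch: using validation scores inside the training loop for early
# stopping. Training is abstracted away: `scores_per_epoch` stands in
# for the validation metric computed after each epoch.

def early_stopping_fit(scores_per_epoch, patience=2):
    """Stop once the validation score fails to improve `patience` epochs in a row."""
    best_score, best_epoch, bad_epochs = float("-inf"), 0, 0
    for epoch, val_score in enumerate(scores_per_epoch, start=1):
        if val_score > best_score:
            best_score, best_epoch, bad_epochs = val_score, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # early stop: validation score has plateaued
    return best_epoch, best_score

# Validation scores peak at epoch 3, then degrade -> training stops early.
print(early_stopping_fit([0.2, 0.3, 0.35, 0.34, 0.33, 0.32]))  # -> (3, 0.35)
```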

qtuantruong avatar Apr 04 '24 16:04 qtuantruong

@tqtg, let's say we evaluate val_set inside fit for early stopping/monitoring. After training is done and we perform evaluation on VALIDATION, the model has to run inference on val_set again.
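One way to avoid the repeated inference described above would be to cache the validation predictions computed during fit and reuse them in the post-training evaluation step. A minimal sketch, with entirely hypothetical names (this is not cornac's actual API):

```python
# Sketch: cache predictions for a dataset so that evaluating val_set
# after training does not trigger a second round of inference.

class CachingModel:
    def __init__(self):
        self.inference_calls = 0  # counts how often real inference runs
        self._cache = {}

    def _infer(self, dataset):
        self.inference_calls += 1
        return f"scores for {dataset}"  # stand-in for real predictions

    def score(self, dataset):
        # Serve repeated requests for the same dataset from the cache.
        if dataset not in self._cache:
            self._cache[dataset] = self._infer(dataset)
        return self._cache[dataset]

model = CachingModel()
model.score("val_set")   # during fit: monitoring / early stopping
model.score("val_set")   # after fit: evaluation pipeline, served from cache
print(model.inference_calls)  # -> 1
```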

lthoang avatar Apr 05 '24 03:04 lthoang