autogluon
Feature: Programmatic model training interruptions
Add a feature in Tabular which allows for model training to be interrupted after a given time has passed.
Example: time_limits=100, but the model has been training for 200 seconds and has still not returned.
Currently, AutoGluon waits for the model to return and then exits.
With this feature, AutoGluon could interrupt the model (for example, by raising an Exception in its training call) once it goes significantly over the time budget, without the model's own training code needing to check the elapsed time.
Potential Solution: https://stackoverflow.com/questions/492519/timeout-on-a-function-call
Drawbacks / Reasons this is not easy: https://eli.thegreenplace.net/2011/08/22/how-not-to-set-a-timeout-on-a-computation-in-python
Requirements:
- Works identically on Linux, Mac, and Windows.
- Lightweight or toggle-able to avoid slowing down ordinary training runs.
- Able to handle situations involving models using multi-threading / multi-processing.
- Safely exits model training.
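
One way the requirements above might be approached is to run the training call in a separate OS process that the parent can kill once the budget is exceeded. The sketch below is only an illustration under assumptions: `run_with_time_limit` is a hypothetical helper, not part of AutoGluon, and a short Python snippet stands in for an actual model fit. It relies only on the standard library `subprocess` module, which behaves the same on Linux, Mac, and Windows.

```python
import subprocess
import sys


def run_with_time_limit(code: str, time_limit: float) -> dict:
    """Run a Python snippet in a child interpreter and kill it after time_limit seconds.

    Hypothetical helper for illustration only; not an AutoGluon API.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=time_limit,  # subprocess.run kills the child when this expires
        )
    except subprocess.TimeoutExpired:
        # The child was killed; whatever it was computing is lost.
        return {"timed_out": True, "returncode": None, "stdout": ""}
    return {
        "timed_out": False,
        "returncode": completed.returncode,
        "stdout": completed.stdout,
    }


# Stand-ins for a fit call that overruns its budget and one that finishes in time.
slow = run_with_time_limit("import time; time.sleep(30)", time_limit=1.0)
fast = run_with_time_limit("print('model trained')", time_limit=30.0)
```

Because the child is a separate process, it can be stopped even if it is busy inside multi-threaded native code. The trade-offs are that only the direct child is killed (worker processes it spawned would need separate handling), nothing the child computed survives unless it saved its progress beforehand, and starting a process adds overhead, which is one reason to keep such a mechanism toggle-able.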
Primary Benefits:
- CatBoost can sometimes take very long to return a model regardless of time_limits due to the lack of callback support; this feature should fix that situation.
- Similarly, KNN can also overrun its time limit, although far less often and less severely than CatBoost.
- LinearRegression / LogisticRegression can similarly overrun their time limits.
- AutoGluon could potentially recover from any model that somehow gets stuck.