[ADD] fit ensemble
This PR enables fitting an ensemble after search has finished. It also fixes issue #299.
Types of changes
- [x] New feature (non-breaking change which adds functionality)
Note that a Pull Request should only contain one of refactoring, new features or documentation changes. Please separate these changes and send us individual PRs for each. For more information on how to create a good pull request, please refer to The anatomy of a perfect pull request.
Checklist:
- [x] My code follows the code style of this project.
- [x] My change requires a change to the documentation.
- [x] I have updated the documentation accordingly.
- [x] Have you checked to ensure there aren't other open Pull Requests for the same update/change?
- [x] Have you added an explanation of what your changes do and why you'd like us to include them?
- [x] Have you written new tests for your core changes, as applicable?
- [x] Have you successfully run tests with your changes locally?
Description
This PR adds a function called fit_ensemble, which creates an ensemble after search has finished. It uses the same backend working directory as the search and builds the ensemble from the predictions of the models saved during search. Because ensemble creation is now a separate step, it no longer makes sense to instantiate a task tied to a fixed ensemble_size and ensemble_nbest. These parameters are therefore no longer class parameters; instead, they are passed to the search function or the fit_ensemble function.
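To illustrate the interface change described above, here is a minimal, self-contained sketch. `ToyTask` is a hypothetical stand-in, not the autoPyTorch class; only the names `search`, `fit_ensemble`, `ensemble_size` and `ensemble_nbest` come from this PR, and the real signatures and defaults may differ.

```python
from typing import Dict, List, Optional


class ToyTask:
    """Hypothetical stand-in for a task object (not the autoPyTorch API)."""

    def __init__(self) -> None:
        # Plays the role of the backend working directory:
        # model name -> validation score saved during search.
        self.saved_scores: Dict[str, float] = {}
        self.ensemble: Optional[List[str]] = None

    def search(self, candidates: Dict[str, float], *,
               ensemble_size: int, ensemble_nbest: int) -> "ToyTask":
        # "Search" here only records the score of every candidate model.
        self.saved_scores.update(candidates)
        # Ensemble options are arguments of search, not of the constructor.
        if ensemble_size > 0:
            self.fit_ensemble(ensemble_size=ensemble_size,
                              ensemble_nbest=ensemble_nbest)
        return self

    def fit_ensemble(self, *, ensemble_size: int,
                     ensemble_nbest: int) -> "ToyTask":
        if not self.saved_scores:
            raise ValueError("fit_ensemble needs models saved by search")
        # Rank saved models by score, keep the nbest, cap at ensemble_size.
        ranked = sorted(self.saved_scores,
                        key=lambda name: self.saved_scores[name],
                        reverse=True)
        self.ensemble = ranked[:ensemble_nbest][:ensemble_size]
        return self


task = ToyTask().search(
    {"mlp": 0.81, "resnet": 0.86, "catboost": 0.84},
    ensemble_size=0, ensemble_nbest=3,
)
assert task.ensemble is None  # ensemble_size=0: nothing built during search
task.fit_ensemble(ensemble_size=2, ensemble_nbest=3)
print(task.ensemble)  # ['resnet', 'catboost']
```

The toy ranks models by a single score for brevity; the actual ensemble builder works on saved predictions.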
Additionally, as raised in #299, a warning that the ensemble could not be built was raised even when the user did not want an ensemble in the first place. This PR fixes that.
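The intended behaviour can be sketched as follows. `report_ensemble_status` is a hypothetical helper written for this illustration, not a function from the code base, and the warning text is made up.

```python
import warnings
from typing import List, Optional


def report_ensemble_status(ensemble_size: int,
                           ensemble: Optional[List[str]]) -> None:
    """Warn about a missing ensemble only if one was requested."""
    if ensemble_size <= 0:
        # The user opted out of ensembling, so there is nothing to report.
        return
    if not ensemble:
        warnings.warn("No ensemble could be built from the saved models.",
                      UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_ensemble_status(ensemble_size=0, ensemble=None)  # stays silent
    report_ensemble_status(ensemble_size=5, ensemble=None)  # warns
print(len(caught))  # 1
```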
Motivation and Context
This PR makes it possible to fit an ensemble post hoc. This allows multiple ensembles to be created from the same models found during search. It can also save time during search by removing the overhead of fitting the ensemble, which is otherwise passed as a callback to SMAC. Moreover, in the future, we could also enable creating ensembles from models stored on disk with a new task object.
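As a rough illustration of why post hoc fitting is useful, the sketch below builds two ensembles of different sizes from the same saved predictions. It uses generic greedy ensemble selection with replacement; whether this matches the implementation in autoPyTorch/ensemble/ensemble_selection.py is an assumption, and the model names and prediction values are invented.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical predictions (probability of class 1) saved during search.
saved_predictions: Dict[str, List[float]] = {
    "model_a": [0.9, 0.4, 0.6, 0.2],
    "model_b": [0.6, 0.1, 0.3, 0.7],
    "model_c": [0.2, 0.3, 0.9, 0.1],
}
y_true = [1, 0, 1, 0]


def mse(pred: List[float], target: List[int]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def greedy_ensemble(preds: Dict[str, List[float]], target: List[int],
                    ensemble_size: int) -> Dict[str, float]:
    """Greedy selection with replacement; returns model name -> weight."""
    chosen: List[str] = []
    running_sum = [0.0] * len(target)

    def loss_if_added(name: str) -> float:
        # Loss of the averaged prediction if `name` joined the members.
        n = len(chosen) + 1
        averaged = [(s + p) / n for s, p in zip(running_sum, preds[name])]
        return mse(averaged, target)

    for _ in range(ensemble_size):
        best = min(preds, key=loss_if_added)
        chosen.append(best)
        running_sum = [s + p for s, p in zip(running_sum, preds[best])]

    # A model picked k times gets weight k / ensemble_size.
    return {name: k / ensemble_size for name, k in Counter(chosen).items()}


# The same saved predictions yield ensembles of different sizes,
# without rerunning the search.
small = greedy_ensemble(saved_predictions, y_true, ensemble_size=1)
large = greedy_ensemble(saved_predictions, y_true, ensemble_size=3)
print(small)
print(large)
```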
How has this been tested?
Ensemble fitting was already tested in test_api, so I have added tests for the new function _init_ensemble_builder. In addition, the posthoc_ensemble_fit example verifies that search runs smoothly with ensemble_size=0.
Codecov Report
Merging #366 (200aa7f) into development (0e574af) will decrease coverage by 57.65%. The diff coverage is 4.95%.
@@ Coverage Diff @@
## development #366 +/- ##
================================================
- Coverage 85.50% 27.84% -57.66%
================================================
Files 231 230 -1
Lines 16303 16331 +28
Branches 3009 3022 +13
================================================
- Hits 13940 4548 -9392
- Misses 1524 11781 +10257
+ Partials 839 2 -837
| Impacted Files | Coverage Δ | |
|---|---|---|
| autoPyTorch/api/tabular_classification.py | 46.66% <ø> (-44.45%) | :arrow_down: |
| autoPyTorch/api/base_task.py | 15.61% <4.95%> (-68.20%) | :arrow_down: |
| ...cessing/time_series_preprocessing/scaling/utils.py | 8.00% <0.00%> (-84.00%) | :arrow_down: |
| ...tup/network_backbone/forecasting_backbone/cells.py | 8.56% <0.00%> (-83.80%) | :arrow_down: |
| ...mponents/setup/forecasting_target_scaling/utils.py | 7.44% <0.00%> (-82.98%) | :arrow_down: |
| ...mponents/setup/network/forecasting_architecture.py | 10.10% <0.00%> (-80.50%) | :arrow_down: |
| ...omponents/setup/network_backbone/ResNetBackbone.py | 19.62% <0.00%> (-80.38%) | :arrow_down: |
| ...twork_head/forecasting_network_head/NBEATS_head.py | 18.84% <0.00%> (-78.27%) | :arrow_down: |
| autoPyTorch/ensemble/ensemble_selection.py | 18.75% <0.00%> (-78.13%) | :arrow_down: |
| ...oPyTorch/data/time_series_forecasting_validator.py | 9.52% <0.00%> (-76.79%) | :arrow_down: |
| ... and 206 more | | |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0e574af...200aa7f. Read the comment docs.