
check_estimator outside of unittests

Open remiadon opened this issue 5 years ago • 2 comments

Hi there, Thanks for this awesome template :+1:

For the moment, calls to check_estimator are made inside unit tests (see test_common.py).

For my project, I find it useful to automatically run check_estimator on all classes in some predefined modules.

proposed solution

a python script at the root of the project, implementing the following process:

  • load all classes from a module
  • filter those classes according to a given criterion (I check if they have a .fit method)
  • call check_estimator with generate_only=True, and sequentially run all tests for all candidate estimators
  • pretty print the result on the standard output

Here is my code

import inspect

import sklearn
from sklearn.utils.estimator_checks import check_estimator

import skmine.itemsets
# TODO : add other modules here

MODULES = [
    skmine.itemsets,
]

OK = '\x1b[42m[ OK ]\x1b[0m'
FAIL = "\x1b[41m[FAIL]\x1b[0m"

def is_estimator(e):
    _, est = e
    meth = getattr(est, "fit", None)
    return callable(meth)

if __name__ == '__main__':
    for module in MODULES:
        # inspect the module being iterated over, not a hard-coded one
        clsmembers = inspect.getmembers(module, inspect.isclass)
        estimators = filter(is_estimator, clsmembers)
        for est_name, est in estimators:
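            # from sklearn 0.23 check_estimator takes an instance as input;
            # compare (major, minor) as integers, string comparison misorders versions
            obj = est() if tuple(map(int, sklearn.__version__.split('.')[:2])) >= (0, 23) else est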
            checks = check_estimator(obj, generate_only=True)
            for arg, check in checks:
                check_name = check.func.__name__  # unwrap partial function
                desc = '{} === {}'.format(est_name, check_name)
                try:
                    check(arg)
                    print(OK, desc)
                except Exception as e:
                    print(FAIL, desc, e)

and here is the kind of output I get when calling

python check_estimators.py

from the project root

[image: screenshot of the script output]

LIMITS

Note that with sklearn >= 0.23, check_estimator is called with an instance of an estimator. The script above always instantiates it with default parameters, so it's not perfect, but the point here is to provide a quick compatibility check.
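To make the default-parameters limitation concrete, here is a minimal sketch of how the script could detect classes that cannot be built with no arguments and fall back to per-class keyword arguments. The names required_init_params, instantiate and DEFAULT_KWARGS, as well as the two toy classes, are hypothetical and not part of the script above or of scikit-learn.

```python
import inspect


def required_init_params(cls):
    """Return the names of constructor parameters that have no default value."""
    params = inspect.signature(cls).parameters.values()
    named_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return [
        p.name
        for p in params
        if p.default is inspect.Parameter.empty and p.kind in named_kinds
    ]


# Hypothetical per-class overrides for estimators without usable defaults.
DEFAULT_KWARGS = {"NeedsArgs": {"n_items": 3}}


def instantiate(cls):
    """Build an instance, using overrides when defaults are not enough."""
    kwargs = DEFAULT_KWARGS.get(cls.__name__, {})
    missing = [name for name in required_init_params(cls) if name not in kwargs]
    if missing:
        raise TypeError(
            "{} needs values for: {}".format(cls.__name__, ", ".join(missing))
        )
    return cls(**kwargs)


# Toy classes standing in for real estimators.
class WithDefaults:
    def __init__(self, alpha=1.0):
        self.alpha = alpha


class NeedsArgs:
    def __init__(self, n_items, alpha=1.0):
        self.n_items = n_items
        self.alpha = alpha


print(required_init_params(WithDefaults), required_init_params(NeedsArgs))
```

With something like this, the loop could call instantiate(est) instead of est() and report a clear message for classes that still lack required values.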

remiadon, May 15 '20 14:05

Thanks @remiadon ! +1 for a function to list all estimators, basically adapting https://github.com/scikit-learn/scikit-learn/blob/1986c89a12203a2df02f65e0764acea2bcd027cc/sklearn/utils/__init__.py#L1146
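As a rough illustration of what such a listing function might look like, here is a sketch using only the standard library. It is not the scikit-learn implementation linked above; list_estimator_classes is a hypothetical name, and the toy module stands in for a real one such as skmine.itemsets.

```python
import inspect
import types


def list_estimator_classes(modules):
    """Return sorted (name, class) pairs for public classes with a callable fit."""
    found = {}
    for module in modules:
        for name, cls in inspect.getmembers(module, inspect.isclass):
            if name.startswith("_"):
                continue  # skip private helpers
            if inspect.isabstract(cls):
                continue  # abstract base classes cannot be instantiated
            if callable(getattr(cls, "fit", None)):
                found[name] = cls
    return sorted(found.items())


# Toy module standing in for a real package module.
demo = types.ModuleType("demo")


class Good:
    def fit(self, X, y=None):
        return self


class NotAnEstimator:
    pass


demo.Good = Good
demo.NotAnEstimator = NotAnEstimator
demo._Hidden = Good  # private alias, should be ignored

estimators = list_estimator_classes([demo])
print([name for name, _ in estimators])
```

Keying the result by name means a class re-exported from several modules is only listed once.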

However, tests should use pytest and the parametrize_with_checks decorator.
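For reference, a minimal sketch of that pattern, assuming a scikit-learn version recent enough to provide parametrize_with_checks and to accept estimator instances. LogisticRegression is only a stand-in here; a real project would list its own estimators instead.

```python
# test_common.py
from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import parametrize_with_checks

# Stand-in instance; replace with the project's own estimator instances.
ESTIMATORS = [LogisticRegression()]


@parametrize_with_checks(ESTIMATORS)
def test_sklearn_compatible_estimator(estimator, check):
    # pytest generates one test case per (estimator, check) pair.
    check(estimator)
```

Running pytest on this file then reports each check separately, which replaces the manual OK/FAIL printing of a standalone script.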

@remiadon Would you be interested in making a PR with it?

rth, May 18 '20 13:05

@rth thanks for pointing out the all_estimators function. And yes, I will make a PR for this :)

remiadon, May 18 '20 14:05

I made the change to use parametrize_with_checks. I'll check whether I can quickly adapt the functions to list all public estimators.

glemaitre, May 01 '24 10:05