Partial dependence plots
An implementation of partial dependence (PD) and individual conditional expectation (ICE) plots, leveraging the `sklearn` implementation.
Some of the functionality it includes:
- PD and ICE for numerical features
- PD and ICE for categorical features
- PD and ICE for combinations of numerical and/or categorical features
- Plots for all the above cases
- Custom grids
- Usage of any black-box model (i.e. not restricted to `sklearn` estimators)
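As a rough illustration of the PD and ICE outputs involved, here is a minimal sketch using `sklearn`'s public `partial_dependence` function directly (not this package's API):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=200, n_features=4, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind='both' returns the averaged PD curve plus the per-sample ICE curves
res = partial_dependence(model, X, features=[0], kind="both", method="brute")
res["average"]     # shape (n_outputs, n_grid_points)
res["individual"]  # shape (n_outputs, n_samples, n_grid_points)
```

The PD curve is simply the mean of the ICE curves over the samples, which is why `kind='both'` is cheap to request once the ICE values are computed.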
TODOs:
- [x] Method description notebook
- [x] Example usage notebook
@RobertSamoilescu I've attempted to fix the problem with the docs CI failing. It was due to the $f_S(x_{S})$ equation in PartialDependence.ipynb. There were a number of minor issues:
- The use of `$$` surrounding the `\begin{align}`. Our combination of `nbsphinx` and `myst-parser` means amsmath environments such as `align` are automatically interpreted as latex math. If you also add `$$` the html docs build will be OK, but it will screw with our latex docs build.
- The use of `|...|` for abs. Sphinx uses `|...|` for some sort of substitution functionality (see here), so with `|S|` it was treating `S` as a substitution. The solution is to use `\lvert` and `\rvert` instead.
- Indents have a specific meaning to `myst-parser`. We need to make sure we don't indent the content in math environments.
With the above changes I think the equation should render correctly for html and pdf builds, but probably good to check!
P.s. It seems there is still an issue with the hyperlink format I advised you to use. I'll look into this now.
@RobertSamoilescu you are doing everything correctly here, i.e. creating a header anchor in overview/high-level.md:
(partial-dependence)=
#### Partial Dependence
(note you don't have to do this for heading levels 1 to 3 since our config setting myst_heading_anchors = 3 means anchors are generated automatically for these). You should be able to reference this like you are doing in PartialDependence.ipynb:
[Partial Dependence](../overview/high_level.md#partial-dependence)
But... I'm afraid this will not work due to our combination of nbsphinx for parsing .ipynb files and then myst-parser for doing the final rendering. Long story short myst-parser looks for ../overview/high_level.md#partial-dependence, but nbsphinx has already converted it to ../overview/high_level.html#partial-dependence. This would be fixed if we transition from nbsphinx to myst-nb one day, or better yet stop writing methods docs as jupyter-notebooks (my strong preference!).
I agree with moving towards .md format with myst syntax entirely for docs. Would need a boring PR that replaces all the .ipynb docs files with the .md + myst equivalents but not sure if the effort at the moment is justified.
@RobertSamoilescu btw are the 3rd and 4th files here redundant?

@RobertSamoilescu should we try to add a tqdm progress bar around the for loop over features_list? Can't remember if there were some issues in certain Python environments with this...
@RobertSamoilescu I think it would be great if we could have another example, preferably classification to show off interpretation of multiple targets and black-box to show that it works the same way as sklearn models. This could be a follow-up PR.
Codecov Report
Merging #721 (79d2c43) into master (7c5e48c) will decrease coverage by 1.33%. The diff coverage is 60.05%.
@@ Coverage Diff @@
## master #721 +/- ##
==========================================
- Coverage 80.99% 79.65% -1.34%
==========================================
Files 105 107 +2
Lines 11869 12657 +788
==========================================
+ Hits 9613 10082 +469
- Misses 2256 2575 +319
| Impacted Files | Coverage Δ | |
|---|---|---|
| alibi/utils/visualization.py | 18.98% <14.28%> (-5.55%) | :arrow_down: |
| alibi/explainers/partial_dependence.py | 45.88% <45.88%> (ø) | |
| alibi/explainers/tests/conftest.py | 92.30% <78.57%> (-1.19%) | :arrow_down: |
| alibi/api/defaults.py | 100.00% <100.00%> (ø) | |
| alibi/explainers/__init__.py | 100.00% <100.00%> (ø) | |
| alibi/explainers/tests/test_partial_dependence.py | 100.00% <100.00%> (ø) | |
| alibi/datasets/default.py | 69.56% <0.00%> (-1.03%) | :arrow_down: |
| alibi/explainers/anchors/anchor_tabular.py | 89.57% <0.00%> (-0.71%) | :arrow_down: |
After an offline discussion we decided to copy the private methods from sklearn to compute PD using the brute-force approach and use that directly on black-box models, without wrapping them in a sklearn-like wrapper. This allows us to get rid of the confusing `predictor_kw` kwarg, making the interface easier and more transparent for the end-user. It also has the added benefit of giving us more control, since we no longer rely on private sklearn functions.
On the flip side, this means for now we only support numpy arrays for both black-box and sklearn models (contrast with the sklearn implementation, which allows estimators fitted on pandas dataframes, sparse matrices, etc.). This is in line with our current approach of mainly supporting models fitted on numpy arrays.
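For illustration, the brute-force computation on plain numpy arrays amounts to roughly the following (a simplified sketch with a hypothetical `brute_pd` helper, not the actual code copied from sklearn):

```python
import numpy as np

def brute_pd(predict, X, feature, grid):
    """Hypothetical helper: brute-force PD of `predict` w.r.t. one feature.

    For each grid value, the feature column is overwritten for *all* rows
    and the predictions are averaged over the samples.
    """
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        averages.append(predict(X_mod).mean(axis=0))
    return np.asarray(averages)

# any callable returning predictions works -- no sklearn wrapper needed
X = np.random.default_rng(0).normal(size=(50, 3))
pd_vals = brute_pd(lambda X: 2 * X[:, 0] + X[:, 1], X, feature=0,
                   grid=[-1.0, 0.0, 1.0])
```

Because only a callable is required, the same loop serves sklearn estimators and arbitrary black-box predictors alike.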
Decided to remove the option response_method='auto'. This is because for a binary classifier the number of output targets can change based on the other parameters (e.g., method='recursive' and kind='average' vs method='brute' and kind='both'). If method='recursive' and kind='average' are set, then the decision function is used to compute the PD, which means that for a binary classifier the output will have just one column. On the other hand, if method='brute' and kind='both', then predict_proba will be used instead, which in our implementation will result in two output columns. The problem arises when plotting, since the plots can use the target_names specified in the constructor. For example, if target_names=['output'], then a plotting error will arise when calling the explain method with the second pair of parameters, since in that case the output has 2 columns but the user specified only one target name. Another failure can arise if target_names=[class_0, class_1]. If the first pair of parameters is used, then the PD corresponding to the decision score will be labeled with class_0, which is not correct.
Also, I decided to move the parameter response_method into the __init__, since it fully specifies the function to be used for the PD computation. Thus, we will avoid the plotting errors above. If the user wants to use another function, they will have to create a new explainer object.
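The shape mismatch behind this decision is easy to reproduce with any sklearn binary classifier exposing both response methods (a minimal sketch, independent of the explainer itself):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
clf = LogisticRegression().fit(X, y)

# one score per sample vs. one probability per class: the number of
# "targets" differs, so a single fixed `target_names` list cannot fit both
scores = clf.decision_function(X)  # shape (100,)   -> one output column
probas = clf.predict_proba(X)      # shape (100, 2) -> two output columns
```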
Also decided to remove method='auto' option. The sklearn logic is the following:
```python
if method == Method.AUTO:
    if isinstance(self.predictor, BaseGradientBoosting) and self.predictor.init is None:
        method = Method.RECURSION.value
    elif isinstance(self.predictor, (BaseHistGradientBoosting, DecisionTreeRegressor,
                                     RandomForestRegressor)):
        method = Method.RECURSION.value
    else:
        method = Method.BRUTE.value

if method == Method.RECURSION:
    if not isinstance(self.predictor, (BaseGradientBoosting, BaseHistGradientBoosting,
                                       DecisionTreeRegressor, RandomForestRegressor)):
        supported_classes_recursion = (
            "GradientBoostingClassifier",
            "GradientBoostingRegressor",
            "HistGradientBoostingClassifier",
            "HistGradientBoostingRegressor",
            "DecisionTreeRegressor",
            "RandomForestRegressor",
        )
        raise ValueError(f"Only the following estimators support the 'recursion' "
                         f"method: {supported_classes_recursion}. "
                         f"Try using method='{Method.BRUTE.value}'.")
    if response_method == ResponseMethod.AUTO:
        response_method = ResponseMethod.DECISION_FUNCTION.value
    if response_method != ResponseMethod.DECISION_FUNCTION:
        raise ValueError(f"With the '{Method.RECURSION.value}' method, the response_method "
                         f"must be '{ResponseMethod.DECISION_FUNCTION.value}'. "
                         f"Got {response_method}.")
```
Removing the method='auto' option is a consequence of removing response_method='auto'.
Consider a model that supports both decision_function and predict_proba. Furthermore, assume it is a model that supports the recursion method. Then we have at least the following cases:
- `kind='average'`, `response_method='decision_function'`, `method='auto'`. In this case, `method` will become `method='recursion'` because the model supports the recursion option (first set of ifs). Everything works well.
- `kind='average'`, `response_method='predict_proba'`, `method='auto'`. In this case, `method` will also become `method='recursion'` because the model supports recursion (first set of ifs). But after that, a value error will be thrown because the `response_method` cannot be set or was not set to `decision_function`. The error would not have been thrown if `response_method='auto'`, because in that case `response_method` would have become `'decision_function'`. But since that option is no longer available, IMO it doesn't make sense to keep the option `method='auto'`.
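The two cases can be exercised with a stripped-down sketch of the resolution logic described above (illustrative only, not the actual sklearn/alibi code):

```python
def resolve(method, response_method, supports_recursion):
    # simplified sketch of the auto-resolution cascade described above
    if method == "auto":
        method = "recursion" if supports_recursion else "brute"
    if method == "recursion":
        if response_method == "auto":
            response_method = "decision_function"
        if response_method != "decision_function":
            raise ValueError("recursion requires response_method='decision_function'")
    return method, response_method

# case 1: resolves cleanly to ('recursion', 'decision_function')
ok = resolve("auto", "decision_function", supports_recursion=True)

# case 2: method='auto' silently picks 'recursion', then errors out,
# since response_method='auto' is no longer there to paper over it
try:
    resolve("auto", "predict_proba", supports_recursion=True)
    raised = False
except ValueError:
    raised = True
```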
Following an offline discussion it was decided to simplify the implementation and user interface by splitting the implementation into two distinct classes:
- `PartialDependence` for use with black-box models, calculating PD using a brute-force approach
- `TreePartialDependence` for use with white-box models (currently only a small selection of `sklearn` estimators) that support a recursive algorithm for calculating PD, which is faster than the brute-force approach
This allows us to remove all of the slightly confusing arguments discussed previously.
Also, @RobertSamoilescu checked that the recursive algorithm returns slightly different values than performing the brute-force PD on the same estimators, which further justifies splitting the implementation into two public classes (similar to KernelShap and TreeShap).
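The discrepancy can be probed through sklearn's public `partial_dependence` function, which exposes both methods (a sketch; the exact values depend on the sklearn version):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=200, n_features=3, random_state=0)
gbr = GradientBoostingRegressor(random_state=0).fit(X, y)

pd_brute = partial_dependence(gbr, X, features=[0], kind="average", method="brute")
pd_recursion = partial_dependence(gbr, X, features=[0], kind="average", method="recursion")
# same grid and output shape, but the values are generally not identical:
# 'recursion' works on the raw decision values and handles the baseline
# differently from averaging predictions over the dataset
```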
@jklaise, check the note at the end of this section which confirms that the two methods differ in the values they return