
Multiclass Prediction with PyTorch / TF Model?

Open bruno-hoermann opened this issue 3 years ago • 8 comments

Hello everyone!

Is there support for multiclass prediction with PyTorch models? My first attempt failed, and after a quick glance at the code of the DicePyTorch class it appears to me that only binary classification is supported. I have the same impression for the corresponding TensorFlow class. Is my observation correct that multiclass prediction is not supported for deep learning models, or am I doing something wrong? Any suggestions on what to try are much appreciated!

bruno-hoermann avatar Mar 04 '22 15:03 bruno-hoermann

@prunprun, could you paste the error that you saw with DicePyTorch?

Regards,

gaugup avatar Mar 06 '22 17:03 gaugup

Here is my code:

import torch.jit
import dice_ml
from utils.utils import transform_torch_model, create_random_dataset


model = transform_torch_model(torch.jit.load("models/model.pt").state_dict())
print(model)
df = create_random_dataset(model)
df.info()

continuous_features = df.drop("action", axis=1).columns.tolist()

d = dice_ml.Data(dataframe=df,
                 continuous_features=continuous_features,
                 outcome_name="action")


m = dice_ml.Model(model=model, backend='PYT', model_type='classifier')

exp = dice_ml.Dice(d, m)

query_instance = {'order1': 0.5,
                  'order2': -1.5,
                  'order3': 1.5,
                  'order4': -0.5,
                  'order5': 0.5,
                  'action': 2
                  }

cf = exp.generate_counterfactuals(query_instance, total_CFs=2, desired_class=3)
cf.visualize_as_dataframe(show_only_changes=True)

Running the script prints the following output:

TestModel(
  (layer0): Linear(in_features=5, out_features=64, bias=True)
  (layer1): Linear(in_features=64, out_features=64, bias=True)
  (output_layer): Linear(in_features=64, out_features=5, bias=True)
  (relu): ReLU()
)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   order1  100 non-null    float64
 1   order2  100 non-null    float64
 2   order3  100 non-null    float64
 3   order4  100 non-null    float64
 4   order5  100 non-null    float64
 5   action  100 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 4.8 KB

The error message is:

Exception has occurred: ValueError
cannot convert float NaN to integer
  File "/home/bruno/anaconda3/envs/reif-exp/lib/python3.7/site-packages/dice_ml/explainer_interfaces/explainer_base.py", line 420, in do_posthoc_sparsity_enhancement
    diff = query_instance[feature].iat[0] - int(final_cfs_sparse.at[cf_ix, feature])
  File "/home/bruno/anaconda3/envs/reif-exp/lib/python3.7/site-packages/dice_ml/explainer_interfaces/dice_pytorch.py", line 577, in find_counterfactuals
    final_cfs_df_sparse, test_instance_df, posthoc_sparsity_param, posthoc_sparsity_algorithm)
  File "/home/bruno/anaconda3/envs/reif-exp/lib/python3.7/site-packages/dice_ml/explainer_interfaces/dice_pytorch.py", line 130, in generate_counterfactuals
    tie_random, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm)
  File "/home/bruno/rl-pmsp/explain.py", line 31, in <module>
    cf = exp.generate_counterfactuals(query_instance, total_CFs=2, desired_class=3)

The final_cfs_sparse dataframe only contains zeros and NaN values, so the error must already originate before the posthoc sparsity computation.

bruno-hoermann avatar Mar 07 '22 09:03 bruno-hoermann

@prunprun, could you turn off the post-hoc sparsity enhancement by setting posthoc_sparsity_param to None in your generate_counterfactuals() call? That way do_posthoc_sparsity_enhancement() wouldn't get called. Let's see if the counterfactuals are computed without it; we can then debug the scenario with posthoc_sparsity_param.
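
A minimal sketch of that call, reusing exp, query_instance and total_CFs from your script above (the parameter name is the same one that appears in your traceback):

# Sketch: same call as before, but with post-hoc sparsity enhancement disabled
# so that do_posthoc_sparsity_enhancement() is never reached.
cf = exp.generate_counterfactuals(query_instance,
                                  total_CFs=2,
                                  desired_class=3,
                                  posthoc_sparsity_param=None)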

Regards,

gaugup avatar Mar 07 '22 20:03 gaugup

With posthoc_sparsity_param=None I receive the following message:

No Counterfactuals found for the given configuation,  perhaps try with different values of proximity (or diversity) weights or learning rate... ; total time taken: 00 min 24 sec
Query instance (original outcome : -1)
   order1  order2  order3  order4  order5  action
0     0.5    -1.5     1.5    -0.5     0.5  -1.213

This is strange because a regular forward pass of the model on this query instance returns:

tensor([ -1.6103, -12.0566,   7.0264,  -8.2856,  -0.6247],
       grad_fn=<AddBackward0>)

so, since the largest logit is at index 2, I would have expected action = 2 instead of action = -1.213.
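
A quick sanity check on those logits (a minimal sketch; the tensor values are copied from the output above):

import torch

# The model returns raw logits over the 5 classes, so the predicted class is
# the argmax (index 2 here), not the raw value that DiCE reports as the outcome.
logits = torch.tensor([-1.6103, -12.0566, 7.0264, -8.2856, -0.6247])
print(torch.argmax(logits).item())   # -> 2
print(torch.softmax(logits, dim=0))  # class probabilities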

bruno-hoermann avatar Mar 08 '22 09:03 bruno-hoermann

@gaugup, if you could take a look at this it would be much appreciated!

bruno-hoermann avatar Mar 09 '22 11:03 bruno-hoermann

@prunprun Hi, according to the interface description of generate_counterfactuals(), the desired_class parameter can only take 0 or 1. That seems to confirm your guess.

sunsssk avatar Mar 10 '22 08:03 sunsssk

@sunsssk Thank you! Maybe the documentation should state which backends work with multiclass prediction.

bruno-hoermann avatar Mar 11 '22 08:03 bruno-hoermann

@sunsssk However, the multiclass example uses the DiceGenetic class, and the documentation of its generate_counterfactuals() likewise says that desired_class can only take 0 or 1. So I think you can't rely too much on the documentation.
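
For reference, a rough sketch of what the multiclass example does, reusing d, m and query_instance from my script above (whether the genetic explainer actually honours a desired_class other than 0 or 1 is exactly what the docs leave unclear):

# Sketch (assumption): route the same data and model through the genetic
# explainer that the multiclass example uses, instead of DicePyTorch.
exp_genetic = dice_ml.Dice(d, m, method='genetic')
cf = exp_genetic.generate_counterfactuals(query_instance,
                                          total_CFs=2,
                                          desired_class=3)
cf.visualize_as_dataframe(show_only_changes=True)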

bruno-hoermann avatar Mar 11 '22 08:03 bruno-hoermann