Why is index 1 out of bounds for axis 0 with size 1?
The size of my dataset is (150, 32) and the y-labels are (150,). I don't get an error with the open-source dataset from your package, but I do get one with my own dataset. Why is that?
I have a similar problem; it seems to be related to this segment of the code in interpret/utils/all.py:
```python
def unify_predict_fn(predict_fn, X):
    predictions = predict_fn(X[:1])
    if predictions.ndim == 2:
        new_predict_fn = lambda x: predict_fn(x)[:, 1]  # noqa: E731
        return new_predict_fn
    else:
        return predict_fn
```
The first line of that function makes a prediction for the first entry in the dataset, which my Keras model (IMHO understandably) returns as a (1, 1) array, because Keras predict outputs have shape (batch_size, output_vector_size), and mine is a regression problem, so only one value is produced per example.
Accordingly, Python raises the IndexError at the `[:, 1]` indexing, since index 1 is out of bounds for an axis of size 1.
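The mismatch above can be reproduced without Keras at all; the sketch below uses a simple NumPy lambda as a stand-in for a Keras regression model's predict (the stand-in name is illustrative, not from the library):

```python
import numpy as np

# Same logic as the library helper quoted above.
def unify_predict_fn(predict_fn, X):
    predictions = predict_fn(X[:1])
    if predictions.ndim == 2:  # assumed to mean "classification probabilities"
        return lambda x: predict_fn(x)[:, 1]
    return predict_fn

# Stand-in for a Keras regression model: output shape is (batch_size, 1).
keras_style_predict = lambda x: np.zeros((len(x), 1))

X = np.zeros((5, 3))
wrapped = unify_predict_fn(keras_style_predict, X)
try:
    wrapped(X)  # tries np.zeros((5, 1))[:, 1]
except IndexError as err:
    print(err)  # -> index 1 is out of bounds for axis 1 with size 1
```

Because the (n, 1) regression output is 2-dimensional, it is misclassified as probability output and indexed at column 1, which does not exist.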
I get this error both when trying to do a LIME explanation and when computing PDPs of each feature.
I used the Boston housing dataset (imported from Keras) as input to train a Keras Sequential model. Did I mess up the input, or is this an error on the framework side?
I got the same problem with my Keras model @J0s3c4rl0s. It states that my index 1 is out of bounds for line 1. I got it for all local and global methods (LIME, SHAP, PDP, Salib). I am currently evaluating not only my Keras model via InterpretML; I also tried an XGBoost model and the k-nearest-neighbors regressor. They worked under the same circumstances without any problems.
For further understanding: I did a train/test split, then transformed the categorical features with one-hot encoding and used the StandardScaler. I also tried the standalone SHAP and LIME packages on my Keras model and they worked. Train/test shapes: (2051, 242) (2051,) (879, 242) (879,)
Here is the code for my Keras model:
```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam

keras_callbacks = [
    EarlyStopping(monitor='val_loss', patience=30, mode='min', min_delta=0.01)]
# mode='min' so the checkpoint tracks the lowest val_loss
mc = ModelCheckpoint('best_model_3.h5', monitor='val_loss', mode='min',
                     verbose=1, save_best_only=True)

# Hyperparameter definitions
EPOCHS = 500
BATCH_SIZE = 128
DROPOUT = 0.15
OPT = Adam(learning_rate=0.0015)

model = tf.keras.Sequential()
model.add(Dense(X_train.shape[1], activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(64, activation='relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(128, activation='relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(1))
model.compile(optimizer=OPT, loss='mae', metrics=['mse', 'mae', 'mape'])

r = model.fit(X_train, y=y_train,
              validation_data=(X_test, y_test),
              batch_size=BATCH_SIZE,
              epochs=EPOCHS,
              # fit() expects a flat list of callbacks
              callbacks=keras_callbacks + [mc])
```
Here is an example of the InterpretML LIME usage:
```python
# Blackbox explainers need a predict function, and optionally a dataset
lime = LimeTabular(predict_fn=model.predict, data=X_train, random_state=1)
# Pick the instances to explain, optionally pass in labels if you have them
lime_local = lime.explain_local(X_train[0:2], y_train[0:2], name='LIME')
show(lime_local)
```
which fails with:
```
IndexError: index 1 is out of bounds for axis 1 with size
```
Does anyone know why this error is occurring? Am I doing something wrong?
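One possible workaround, until the library handles (n, 1) outputs itself, is to wrap the predict function so its trailing axis is flattened before the explainer sees it. This is a sketch: `keras_like_predict` below is a hypothetical stand-in for `model.predict`, and I have not verified the wrapper against every explainer:

```python
import numpy as np

# Stand-in for model.predict; a Keras regression model returns
# arrays of shape (batch_size, 1).
keras_like_predict = lambda x: np.zeros((len(x), 1))

# Flatten the (n, 1) output to a 1-D (n,) array before passing it on.
predict_fn = lambda x: keras_like_predict(x).ravel()

print(predict_fn(np.zeros((3, 5))).shape)  # (3,)
```

With a real model this would mean passing `predict_fn=lambda x: model.predict(x).ravel()` to `LimeTabular` instead of `model.predict`.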
Hi @Turningl, @J0s3c4rl0s, and @Emirbeg4 -- thank you for reporting this and for the excellent debugging. I have fixed the underlying issue, which, as @J0s3c4rl0s surmised, was that we were treating any 2-dimensional array from the predict function as classification output. This works for most scikit-learn estimators, but apparently not for Keras. We now also check whether the 2nd dimension has size 1.
The fix is in the develop branch and will be available in our next release, which will be v0.3.1 or higher.
This should now work for LIME, SHAP, PDP, and Salib.
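The check described above can be sketched as follows (a sketch of the described behavior, not the exact library patch): a 2-D prediction is only treated as classification output when it has at least two columns.

```python
import numpy as np

def unify_predict_fn(predict_fn, X):
    predictions = predict_fn(X[:1])
    # Only treat 2-D output as classification probabilities when
    # there are >= 2 columns; (n, 1) regression output passes through.
    if predictions.ndim == 2 and predictions.shape[1] > 1:
        return lambda x: predict_fn(x)[:, 1]  # positive-class column
    return predict_fn

reg_predict = lambda x: np.zeros((len(x), 1))      # Keras regression: (n, 1)
clf_predict = lambda x: np.full((len(x), 2), 0.5)  # classifier proba: (n, 2)

X = np.zeros((4, 3))
print(unify_predict_fn(reg_predict, X)(X).shape)  # (4, 1) -- passed through unchanged
print(unify_predict_fn(clf_predict, X)(X).shape)  # (4,)   -- positive-class column extracted
```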