causalml causalml.metrics.visualize.get_qini gives incorrect results for multiple models

Describe the bug

Just like #272, when the qini_score function scores multiple models at the same time, the result for the second model can be influenced by the first model.
This problem is caused by re-using the same df for evaluation for each model in the get_qini function.
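
The mechanism can be illustrated without causalml. The sketch below assumes the evaluation loop sorts the shared frame by each model column and reassigns it, so the second sort starts from the row order left by the first (this is an assumption about the implementation, not a quote of it). The function names and the `row_id` column are hypothetical, and a stable sort is requested explicitly so that tie handling is deterministic.

```python
import pandas as pd


def tie_order_shared(df, model_cols):
    """Sort one frame repeatedly, reassigning it each time (the suspected bug pattern)."""
    orders = {}
    for col in model_cols:
        # The frame sorted for the previous model becomes the input for this one.
        df = df.sort_values(col, ascending=False, kind="mergesort")
        orders[col] = df["row_id"].tolist()
    return orders


def tie_order_independent(df, model_cols):
    """Sort a fresh view of the original frame for every model."""
    orders = {}
    for col in model_cols:
        # The original frame is never overwritten, so models cannot affect each other.
        sorted_df = df.sort_values(col, ascending=False, kind="mergesort")
        orders[col] = sorted_df["row_id"].tolist()
    return orders


df = pd.DataFrame(
    {
        "row_id": range(10),
        "learner_1": [i / 10 for i in range(5)] * 2,
        "learner_2": [1] + [0] * 9,  # nine tied predictions
    }
)

shared = tie_order_shared(df, ["learner_1", "learner_2"])
independent = tie_order_independent(df, ["learner_1", "learner_2"])

print(shared["learner_2"])       # tied rows keep the order left by learner_1's sort
print(independent["learner_2"])  # tied rows keep the original order
```

Because learner_2 ties on nine of ten rows, the order of those rows (and therefore any cumulative metric computed over them) depends on whichever model was sorted before it.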

To Reproduce

import pandas as pd
from causalml.metrics.visualize import get_qini, RANDOM_COL, qini_score

test_df = pd.DataFrame(
    {
        "y": [0, 0, 0, 0, 1, 0, 0, 1, 1, 1],
        "w": [0] * 5 + [1] * 5,
    }
)

good_uplift = [i / 10 for i in range(5)]
bad_uplift = [1] + [0] * 9
test_df["learner_1"] = good_uplift * 2
# learner_2 is a bad model: it predicts zero uplift for almost every row
test_df["learner_2"] = bad_uplift

print("TEST 2 models:")
print(qini_score(test_df))
print()

print("TEST learner_1 only:")
print(qini_score(test_df[['y', 'w', 'learner_1']]))
print()

print("TEST learner_2 only:")
print(qini_score(test_df[['y', 'w', 'learner_2']]))
print()

Results:

TEST 2 models:
learner_1    0.088636
learner_2    0.068939
Random       0.000000
dtype: float64

TEST learner_1 only:
learner_1    0.088636
Random       0.000000
dtype: float64

TEST learner_2 only:
learner_2   -0.378788
Random       0.000000
dtype: float64

Expected behavior

learner_2 should get the same score whether or not it is evaluated together with learner_1:

TEST 2 models:
learner_1    0.088636
learner_2   -0.378788
Random       0.000000
dtype: float64

TEST learner_1 only:
learner_1    0.088636
Random       0.000000
dtype: float64

TEST learner_2 only:
learner_2   -0.378788
Random       0.000000
dtype: float64
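
Until a fix is available, one possible workaround is to score each model column on its own frame, which is what the separate runs above do. The sketch below is an assumption-laden illustration: `score_each_model` is a hypothetical helper, not part of causalml, and `column_mean_score` is only a stand-in scorer so the example runs without causalml installed. It assumes the scoring function accepts a single DataFrame and returns a Series indexed by model name, as `qini_score` does in the report.

```python
import pandas as pd


def score_each_model(df, model_cols, score_fn, base_cols=("y", "w")):
    """Score every model on its own frame so earlier models cannot influence later ones."""
    scores = {}
    for col in model_cols:
        # Build a frame with only the outcome, treatment, and this model's predictions.
        subset = df[list(base_cols) + [col]]
        scores[col] = score_fn(subset)[col]
    return pd.Series(scores)


def column_mean_score(frame):
    """Stand-in scorer (hypothetical): mean of every non-base column."""
    return frame.drop(columns=["y", "w"]).mean()


demo_df = pd.DataFrame(
    {
        "y": [0, 1, 0, 1],
        "w": [0, 0, 1, 1],
        "a": [0.0, 1.0, 2.0, 3.0],
        "b": [1.0, 0.0, 0.0, 0.0],
    }
)

scores = score_each_model(demo_df, ["a", "b"], column_mean_score)
print(scores)
```

With causalml installed, the call would presumably look like `score_each_model(test_df, ["learner_1", "learner_2"], qini_score)`, relying on the default `y` and `w` column names used in the report.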

Environment (please complete the following information):

OS: macOS 12.3.1 (21E258)
Python Version: 3.8.9
Versions of Major Dependencies (pandas, scikit-learn, cython): numpy==1.22.3, scikit-learn==0.22.2, pandas==1.4.2

Additional context

I fixed it in my own forked repo here. Would it be okay for me to submit a PR to fix this issue? Many thanks.

Jun 12 '22 08:06 enzoliao

Thanks @enzoliao for identifying the issue. The fix looks good to me. Please submit a PR for CausalML as well. Appreciate your contribution!

Jun 13 '22 05:06 ppstacy

Thanks. I just created a PR #520 here.

Jun 13 '22 06:06 enzoliao