tree_influence

Error when fitting the estimator

Open aclarkse opened this issue 1 year ago • 2 comments

Hello,

I was using your implementation of BoostIn to fit my own data, but I came across an error, so I thought it might be due to some inherent inconsistency with my features. However, when fitting it to the iris data provided by the sklearn package (as cited in your example document in the repository), I came across this very same error:

    180 # compute leaf derivative w.r.t. each train example in leaf_docs
    181 numerator = g[leaf_docs, class_idx] + leaf_vals[leaf_idx] * h[leaf_docs, class_idx]  # (no. docs,)
--> 182 denominator = np.sum(h[leaf_docs, class_idx]) + l2_leaf_reg
    183 leaf_dvs[leaf_docs, boost_idx, class_idx] = numerator / denominator * lr  # (no. docs,)
    185 # update approximation

TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'

Could you please give me some guidance on what might be going wrong? For context, I am using an XGBoost model here, and I have to pass scale_pos_weight=1 explicitly to avoid an assertion error. It would be nice if that could be changed as well. Thank you!

aclarkse, Mar 16 '24 02:03

Hi @aclarkse! Can you please provide:

  • Python version
  • ibug version
  • XGBoost version
  • Snippet of code that reproduces the issue

Thanks!

jjbrophy47, Mar 30 '24 17:03

Hi there! Of course:

  • python 3.10
  • ibug 0.0.9
  • xgboost 2.0.3

The error occurs when fitting the estimator. The model is defined as follows:

    model = XGBClassifier(scale_pos_weight=1).fit(X_train, y_train)

And after running this line:

    explainer = BoostIn().fit(model, X_train, y_train)

I get the following error message:


TypeError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 explainer = BoostIn().fit(model, X_train, y_train)

File c:\Users\andre\anaconda3\envs\tree_influence\lib\site-packages\tree_influence\explainers\boostin.py:53, in BoostIn.fit(self, model, X, y)
     50 self.n_train_ = X.shape[0]
     51 self.loss_fn_ = util.get_loss_fn(self.model_.objective, self.model_.n_class_, self.model_.factor)
---> 53 self.train_leaf_dvs_ = self._compute_leaf_derivatives(X, y)  # (X.shape[0], n_boost, n_class)
     54 self.train_leaf_idxs_ = self.model_.apply(X)  # shape=(X.shape[0], no. boost, no. class)
     56 return self

File c:\Users\andre\anaconda3\envs\tree_influence\lib\site-packages\tree_influence\explainers\boostin.py:182, in BoostIn._compute_leaf_derivatives(self, X, y)
    180 # compute leaf derivative w.r.t. each train example in leaf_docs
    181 numerator = g[leaf_docs, class_idx] + leaf_vals[leaf_idx] * h[leaf_docs, class_idx]  # (no. docs,)
--> 182 denominator = np.sum(h[leaf_docs, class_idx]) + l2_leaf_reg
    183 leaf_dvs[leaf_docs, boost_idx, class_idx] = numerator / denominator * lr  # (no. docs,)
    185 # update approximation

TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'

aclarkse, Apr 03 '24 21:04

I think this issue should be resolved in v0.1.7. Please give that a try, thank you!

jjbrophy47, May 27 '24 21:05
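For anyone who lands here with the same traceback: the message itself says that `l2_leaf_reg` was `None` when line 182 tried to add it to the summed hessians. The sketch below reproduces that failure in isolation with plain Python, without calling tree_influence or XGBoost. The helper names `leaf_denominator` and `resolve_l2` are hypothetical, and the fallback of 1.0 is an assumption based on XGBoost's documented default for `reg_lambda`. Why the value arrives as `None` is not confirmed in this thread. One plausible cause is that the model reports unset hyperparameters as `None`.

```python
def leaf_denominator(hessians, l2_leaf_reg):
    """Mirror the shape of the failing line: sum of hessians plus the L2 term."""
    return sum(hessians) + l2_leaf_reg


def resolve_l2(value, default=1.0):
    """Hypothetical helper: substitute a default when the parameter is None.

    The default of 1.0 is an assumption, not taken from tree_influence.
    """
    return default if value is None else value


# Toy hessian values for the training examples that fall into one leaf.
hessians = [0.25, 0.25, 0.5]

# Passing None reproduces the TypeError from the traceback.
try:
    leaf_denominator(hessians, None)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Resolving None to a number before the addition avoids the error.
fixed = leaf_denominator(hessians, resolve_l2(None))

print(error_message)
print(fixed)
```

If upgrading is not an option, setting the regularization parameter explicitly when constructing the model might sidestep the problem in the same way `scale_pos_weight=1` did for the assertion, though that is untested here.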