Error when running CatBoost 1.1.1 on valid training and target sets
def train_model_pandas(self, train_df: pd.DataFrame, valid_df: pd.DataFrame = None):
    labels = ...  # some function call that extracts labels successfully
    # this part works
    training = train_df.merge(labels, on="common key")
    y = training[['target variable']]
    Y = pd.get_dummies(y)
    X = np.vstack(training['feature_scaled'].fillna(0))
I checked both X and Y for nulls and found none. They have the same number of rows: the target has shape (422, 8) and the training features have shape (422, 23). The target is a pandas DataFrame and X is a NumPy array.
    # now call catboost here
    # model training fails
    self.__model = CatBoostClassifier(
        iterations=10000,
        learning_rate=0.01,
        loss_function='MultiCrossEntropy'
    ).fit(X, Y)
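For anyone who wants to repeat the null and shape checks described above, here is a minimal sketch. The helper name `check_training_inputs` and the toy data are hypothetical stand-ins, not the code from my project, and `.tolist()` is my addition to make the stacking explicit.

```python
import numpy as np
import pandas as pd


def check_training_inputs(X, Y):
    """Return (n_rows, n_features, n_labels), or raise ValueError on a mismatch."""
    X_values = np.asarray(X, dtype=float)
    Y_values = np.asarray(Y, dtype=float)
    if X_values.ndim != 2 or Y_values.ndim != 2:
        raise ValueError("X and Y must both be 2-dimensional")
    if X_values.shape[0] != Y_values.shape[0]:
        raise ValueError("X and Y have different numbers of rows")
    if np.isnan(X_values).any() or np.isnan(Y_values).any():
        raise ValueError("X or Y contains NaN values")
    return X_values.shape[0], X_values.shape[1], Y_values.shape[1]


# Toy stand-in for the merged training frame (hypothetical values).
training = pd.DataFrame({
    "target": ["a", "b", "a", "c"],
    "feature_scaled": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
})
Y = pd.get_dummies(training[["target"]], dtype=int)  # one column per class
X = np.vstack(training["feature_scaled"].tolist())   # one row per sample

print(check_training_inputs(X, Y))
```

With the real data, the same helper should report 422 rows, 23 features, and 8 label columns if the shapes quoted above are right.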
Problem:
I get the following error:
94, in train_model_pandas
    ).fit(X,Y)
  File "/usr/local/lib/python3.10/site-packages/catboost/core.py", line 5128, in fit
    self._fit(X, y, cat_features, text_features, embedding_features, None, sample_weight, None, None, None, None, baseline, use_best_model,
  File "/usr/local/lib/python3.10/site-packages/catboost/core.py", line 2355, in _fit
    self._train(
  File "/usr/local/lib/python3.10/site-packages/catboost/core.py", line 1759, in _train
    self._object._train(train_pool, test_pool, params, allow_clear_pool, init_model._object if init_model else None)
SystemError: <method '_train' of '_catboost._CatBoost' objects> returned a result with an exception set
catboost version: 1.1.1
Operating System: Windows 10, PySpark kernel, but I am using pandas for training due to the nature of the predictions.
CPU: NA
GPU: NA
For more context, this worked on an EC2 instance but failed on EMR. Thanks.

After looking at my screen all day, I realized that setting verbose to False worked.
- Try the latest stable release (1.2.5); this bug may already be fixed there.
- Can you create a fully end-to-end reproducible code example?
- > I realized that turning verbose to False worked.

  This is strange; verbose settings should not cause any such errors.