
Error while using the train.fit() function in MultiTrain

Open jbdatascience opened this issue 3 years ago • 4 comments

I am trying to implement an example of using MultiTrain, from this article: https://www.analyticsvidhya.com/blog/2022/09/make-model-training-and-testing-easier-with-multitrain/

I receive an error in this part of the code:

After splitting the features and labels into train, test is appended to a variable named split. This variable then holds X_train, X_test, y_train, and y_test; we would need it in the next function below.

train.fit(X: str = None, y: str = None, split_self: bool = False, X_train: str = None, X_test: str = None, y_train: str = None, y_test: str = None, split_data: str = None, splitting: bool = False, kf: bool = False, fold: int = 5, excel: bool = False, return_best_model: str = None, show_train_score: bool = False)

fit = train.fit(X=features, y=labels, splitting=True, split_data=split)

The error:

LinearRegression(n_jobs=-1) fitting LinearRegression(n_jobs=-1)

TypeError                                 Traceback (most recent call last)
in
      5 y=labels,
      6 splitting=True,
----> 7 split_data=split)

/usr/local/lib/python3.7/dist-packages/MultiTrain/regression/regression_models.py in fit(self, X, y, split_self, X_train, X_test, y_train, y_test, split_data, splitting, kf, fold, excel, return_best_model, show_train_score)
    396 mae = mean_absolute_error(true, pred)
    397 rmse = np.sqrt(mean_squared_error(true, pred))
--> 398 r2 = r2_score(true, pred, force_finite=True)
    399 try:
    400 rmsle = np.sqrt(mean_squared_log_error(true, pred))

TypeError: r2_score() got an unexpected keyword argument 'force_finite'

Can you give me some guidance on this?

jbdatascience • Oct 18 '22 10:10

  1. Which version of MultiTrain are you working with? The force_finite argument was removed in the latest version of MultiTrain (v0.13.11).

  2. You need to use the train.split() method to split your dataset; only then can you set split_data=split and splitting=True. For example:

import pandas as pd
from MultiTrain import MultiClassifier

train = MultiClassifier()
df = pd.read_csv('file.csv')

features = df.drop("nameOflabelcolumn", axis = 1)
labels = df["nameOflabelcolumn"]

split = train.split(X=features, 
                    y=labels, 
                    sizeOfTest=0.3, 
                    randomState=42,
                    strat=True,
                    shuffle_data=True)

fit = train.fit(splitting=True,
                split_data=split)
  3. If you used the train_test_split function directly from sklearn, you have to pass each split to its corresponding argument and set split_self=True. For example:
import pandas as pd
from sklearn.model_selection import train_test_split
from MultiTrain import MultiClassifier
train = MultiClassifier()

df = pd.read_csv('filename.csv')

features = df.drop('labelName', axis=1)
labels = df['labelName']

X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)
fit = train.fit(X_train=X_train,
                X_test=X_test,
                y_train=y_train,
                y_test=y_test,
                split_self=True)  # always set this to True if you used the traditional train_test_split
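
On the force_finite error itself: as far as I know, r2_score only gained the force_finite keyword in scikit-learn 1.1, and scikit-learn 1.1 needs Python 3.8 or newer, so the Python 3.7 environment in the traceback most likely has an older scikit-learn. Below is a minimal sketch for confirming that. The helper accepts_keyword is a hypothetical name, not part of MultiTrain or scikit-learn.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # The keyword is accepted if it is declared explicitly ...
    if name in params and params[name].kind in keyword_kinds:
        return True
    # ... or if the function swallows arbitrary keywords via **kwargs.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        import sklearn
        from sklearn.metrics import r2_score
    except ImportError:
        print("scikit-learn is not installed")
    else:
        print("scikit-learn version:", sklearn.__version__)
        print("r2_score accepts force_finite:",
              accepts_keyword(r2_score, "force_finite"))
```

If the second line prints False, the installed scikit-learn predates the keyword, and either a MultiTrain version that no longer passes force_finite or a newer Python with a newer scikit-learn should avoid the TypeError.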

LOVE-DOCTOR • Oct 18 '22 10:10

I installed it like this:

  • From https://github.com/LOVE-DOCTOR/MultiTrain#installation: !pip install MultiTrain
  • Successfully installed MultiTrain-0.1.30 catboost-1.1 pyaml-21.10.1 scikit-optimize-0.9.0

And after that I did:

  • If you experience issues or come across a bug while using MultiTrain, make sure to update to the latest version with !pip install --upgrade MultiTrain

Now I am not sure which version I have at this point.

jbdatascience • Oct 18 '22 13:10

You can check your version of MultiTrain like this:

import MultiTrain
print(MultiTrain.__version__)

Make sure your version is 0.13.11. If it's lower, run:

pip install MultiTrain==0.13.11

If that doesn't work, please tell me your OS, Python version, and MultiTrain version.
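
The check above can also be scripted. This is a rough sketch that reads the installed version from the package metadata and compares it with 0.13.11. The function names are made up for illustration, "MultiTrain" is assumed to be the distribution name (it is what pip printed earlier in this thread), and the parser only handles plain dotted numbers such as 0.1.30.

```python
try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7 has no importlib.metadata; this needs the
    # importlib_metadata backport to be installed.
    from importlib_metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a plain dotted version such as '0.13.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, required="0.13.11"):
    """Return True when the installed version is older than the required one."""
    return parse_version(installed) < parse_version(required)


def installed_version(dist_name="MultiTrain"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("MultiTrain is not installed")
    elif needs_upgrade(found):
        print("MultiTrain", found, "is older than 0.13.11; upgrade it")
    else:
        print("MultiTrain", found, "is recent enough")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.1.30" and "0.13.11" would be compared character by character, while as tuples (0, 1, 30) sorts before (0, 13, 11) as intended.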

LOVE-DOCTOR • Oct 19 '22 11:10

@jbdatascience

Have you been able to fix this error?

LOVE-DOCTOR • Oct 22 '22 16:10