
Segmentation issue keeps crashing the kernel

Open enfeizhan opened this issue 1 year ago • 2 comments

Attached: issue.csv

AutoMLSearch keeps crashing on this simple dataset. Running the code in a terminal produces a segmentation fault. Running it in a Jupyter Notebook crashes the kernel, which then restarts.

import pandas as pd
import evalml

fm = pd.read_csv('issue.csv')
fm.ww.init()

fm.ww.describe()

y = fm.ww.pop('label')

automl = evalml.AutoMLSearch(
    X_train=fm,
    y_train=y,
    problem_type='binary',
    random_seed=3,
    max_batches=5
)
automl.search()

The data contains no infinite or null values. In principle, the search shouldn't crash the kernel, even if it doesn't produce a great model.

enfeizhan avatar Jun 05 '24 01:06 enfeizhan
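
Since the crash takes the whole kernel down, one way to keep investigating is to run the reproduction in a child process and inspect how it exited. The following is a minimal sketch using only the standard library, not something from the report: `run_isolated` and `describe_exit` are hypothetical helper names, and the snippet passed in would be the reproduction script above.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter with faulthandler enabled."""
    # "-X faulthandler" makes the child print a Python traceback to stderr
    # when it receives a fatal signal such as SIGSEGV.
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe_exit(returncode: int) -> str:
    """Translate a child's return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

If the child is killed by a segmentation fault, `result.stderr` should contain the faulthandler traceback, which usually points at the Python call that entered the crashing native code. The parent process (terminal session or notebook kernel) stays alive either way.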

The search completes once its scope is limited to random forest and linear models: allowed_model_families=["random_forest", "linear_model"]. Further investigation shows the problem is with lightgbm: as long as lightgbm is excluded, the search runs fine.

enfeizhan avatar Jun 05 '24 03:06 enfeizhan
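
The workaround described in the comment above can be sketched as follows. The other arguments are copied from the original reproduction. `SAFE_FAMILIES`, `search_kwargs`, and `run_restricted_search` are hypothetical names introduced for illustration, and `evalml` is imported lazily so the argument builder can be inspected without it installed.

```python
# Model families the reporter found to complete without crashing.
SAFE_FAMILIES = ["random_forest", "linear_model"]


def search_kwargs(X, y, families=SAFE_FAMILIES):
    """Build AutoMLSearch keyword arguments restricted to the given model families."""
    return dict(
        X_train=X,
        y_train=y,
        problem_type="binary",
        random_seed=3,
        max_batches=5,
        allowed_model_families=list(families),  # lightgbm is deliberately absent
    )


def run_restricted_search(X, y):
    """Run the search from the report, limited to the families listed above."""
    import evalml  # lazy import: only needed when the search actually runs

    automl = evalml.AutoMLSearch(**search_kwargs(X, y))
    automl.search()
    return automl
```

With the data loaded as in the original snippet, `run_restricted_search(fm, y)` would run the same search without lightgbm. This sidesteps the crash but does not explain it.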

Thanks for reporting and investigating, @enfeizhan. Could you share which evalml and lightgbm versions you're running, as well as a bit more information about your data (types, size, etc.)?

eccabay avatar Jun 13 '24 12:06 eccabay
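
One possible way to gather the requested version information is sketched below, using the standard library's `importlib.metadata`. `report_versions` is a hypothetical helper, and the default package list is an assumption about what is relevant here (pandas and woodwork are included because the reproduction uses them).

```python
from importlib import metadata


def report_versions(packages=("evalml", "lightgbm", "pandas", "woodwork")):
    """Return a {package: version} mapping, marking packages that are missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}=={version}")
```

For the data side of the question, `fm.shape` and `fm.dtypes` on the loaded DataFrame give its size and column types.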