
segmentation fault on autoML fit

Open yairVanti opened this issue 3 years ago • 13 comments

This happens on a recent version of mljar (0.11.3).

Relevant details:

Translated Report (Full Report Below)

Process: python3.8 [36912]
Path: /Users/USER/*/python
Identifier: python3.8
Version: ???
Code Type: X86-64 (Native)
Parent Process: pycharm [32492]
Responsible: pycharm [32492]
User ID: 501

Date/Time: 2022-10-03 14:27:12.1979 +0300
OS Version: macOS 12.6 (21G115)
Report Version: 12
Bridge OS Version: 6.6 (19P6067)

Crashed Thread: 10

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000010
Exception Codes: 0x0000000000000001, 0x0000000000000010
Exception Note: EXC_CORPSE_NOTIFY

VM Region Info: 0x10 is not in any region. Bytes before following region: 140737486778352
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
VM_ALLOCATE 7fffffe7f000-7fffffe80000 [ 4K] r-x/r-x SM=ALI

Thread 10 crashed with X86 Thread State (64-bit):
rax: 0x0000000140063e80 rbx: 0x00007fd330c6b240 rcx: 0x0000000000000004 rdx: 0x0000000000000000
rdi: 0x0000000000000002 rsi: 0x000070000b878b90 rbp: 0x000070000b878b90 rsp: 0x000070000b878aa0
r8: 0x0000000000000001 r9: 0x00000000000000c0 r10: 0x0000000000000001 r11: 0x0000000000000246
r12: 0x0000000000000000 r13: 0x0000000000000000 r14: 0x0000000000000002 r15: 0x000070000b878b90
rip: 0x0000000140018a6c rfl: 0x0000000000010206 cr2: 0x0000000000000010

Logical CPU: 8
Error Code: 0x00000004 (no mapping for user data read)
Trap Number: 14

Thread 10 instruction stream:
70 48 8b 74 24 78 48 8b-bc 24 80 00 00 00 48 89 pH.t$xH..$....H.
03 33 c0 4d 8b 0b 4d 8b-53 08 4d 8b 63 10 48 89 .3.M..M.S.M.c.H.
53 08 48 89 4b 10 49 89-28 49 89 70 08 49 89 78 S.H.K.I.(I.p.I.x
10 4d 89 4d 00 4d 89 55-08 4d 89 65 10 e8 62 bb .M.M.M.U.M.e..b.
f9 ff 66 90 41 55 41 56-41 57 53 55 48 83 ec 40 ..f.AUAVAWSUH..@
48 89 f5 48 8d 05 1a b4-04 00 48 63 ff 48 8b 10 H..H......Hc.H..
[4c]8b 2c fa 4c 89 ef e8-f8 2a ff ff 4d 8d b5 c0 L.,.L....*..M... <==
05 00 00 4c 89 f7 e8 f3-1d 00 00 89 c3 85 db 0f ...L............
85 12 05 00 00 48 8b 55-20 48 85 d2 0f 84 e4 04 .....H.U H......
00 00 b8 01 00 00 00 86-02 48 8b 45 28 4c 8b 38 .........H.E(L.8
48 8d 1d 0d b5 04 00 49-89 ad 98 01 00 00 83 3b H......I.......;
00 0f 84 ae 04 00 00 83-7b 14 00 74 03 0f ae f0 ........{..t....

Packages from the requirements file:

aiohttp==3.7.4.post0 albumentations==1.2.0 alembic==1.6.5 antlr4-python3-runtime==4.9.3 aporia==1.0.79 appnope==0.1.2 astroid==2.4.2 async-timeout==3.0.1 atomicwrites==1.4.0 attrs==21.2.0 backcall==0.2.0 blis==0.7.8 boto3==1.17.88 botocore==1.20.112 bson==0.5.10 catalogue==2.0.7 catboost==1.0.6 category-encoders==2.3.0 certifi==2020.12.5 chardet==4.0.0 click==7.1.2 cliff==3.8.0 cloudpickle==2.2.0 cmaes==0.8.2 cmd2==2.1.2 colorama==0.4.4 colorlog==5.0.1 colour==0.1.5 commonmark==0.9.1 confluent-kafka==1.6.1 croniter==1.3.5 cycler==0.10.0 cymem==2.0.6 dacite==1.6.0 dask==2022.9.2 dataclasses==0.6 dataclasses-json==0.5.2 deap==1.3.1 debugpy==1.6.0 decorator==4.4.2 deepchecks==0.9.0 dill==0.3.5.1 distributed==2022.9.2 dnspython==2.2.1 docker==5.0.3 dtreeviz==1.3.3 entrypoints==0.4 et-xmlfile==1.0.1 evidently==0.1.47.dev1 fastai==2.5.6 fastcore==1.4.4 fastdownload==0.0.6 fastjsonschema==2.15.3 fastprogress==1.0.2 Flask==1.1.2 fonttools==4.28.3 fsspec==2022.5.0 future==0.18.2 graphviz==0.17 greenlet==1.1.1 gunicorn==20.1.0 HeapDict==1.0.1 hydra-core==1.2.0 idna==2.10 ImageHash==4.2.1 imbalanced-learn==0.8.0 imblearn==0.0 imgaug==0.4.0 importlib-resources==5.7.1 iniconfig==1.1.1 ipykernel==6.6.0 ipython==7.23.1 ipython-genutils==0.2.0 isort==5.7.0 itsdangerous==1.1.0 jdcal==1.4.1 jedi==0.18.0 Jinja2==3.0.0 jmespath==0.10.0 joblib==1.2.0 jsonschema==4.6.0 jupyter-client==7.2.0 jupyter-core==4.10.0 kafka-python==2.0.2 kiwisolver==1.3.1 langcodes==3.3.0 lazy-object-proxy==1.4.3 lightgbm==3.3.2 llvmlite==0.39.1 locket==1.0.0 Mako==1.1.4 Markdown==3.3.4 MarkupSafe==2.0.0rc2 marshmallow==3.10.0 marshmallow-enum==1.5.1 marshmallow-oneofschema==3.0.1 matplotlib==3.3.4 matplotlib-inline==0.1.3 mccabe==0.6.1 mljar-supervised==0.11.3 msgpack==1.0.4 mttkinter==0.6.1 multidict==5.2.0 murmurhash==1.0.7 mypy-extensions==0.4.3 nbformat==5.4.0 nest-asyncio==1.5.5 nonechucks==0.4.2 numba==0.56.2 numpy==1.23.3 omegaconf==2.2.2 opencv-python==4.5.4.60 openpyxl==3.0.6 optuna==2.9.1 
orjson==3.6.4 packaging==21.0 pandas==1.5.0 parso==0.8.1 partd==1.2.0 pathy==0.6.1 patsy==0.5.1 pbr==5.6.0 pendulum==2.1.2 pexpect==4.8.0 pickleshare==0.7.5 Pillow==8.3.2 plotly==5.5.0 pluggy==0.13.1 prefect==1.2.4 preshed==3.0.6 prettytable==2.1.0 prompt-toolkit==3.0.16 psutil==5.9.1 ptyprocess==0.7.0 py==1.10.0 pydantic==1.8.2 pyDeprecate==0.3.2 Pygments==2.8.1 pylint==2.6.2 pymongo==4.1.0 pyparsing==2.4.7 pyperclip==1.8.2 pyreadline3==3.3 pyrsistent==0.18.1 pytest==6.2.4 pytest-mock==3.7.0 python-dateutil==2.8.2 python-editor==1.0.4 pytorch-ignite==0.4.9 pyts==0.12.0 python-box==6.0.2 python-dateutil==2.8.2 python-editor==1.0.4 python-slugify==6.1.2 pyts==0.12.0 pytz==2022.2.1 pytzdata==2020.1 PyYAML==5.4.1 pyzmq==23.1.0 requests==2.25.1 rich==12.4.4 rope==0.18.0 s3transfer==0.4.2 scikit-learn==1.1.2 scikit-plot==0.3.7 scipy==1.9.1 seaborn==0.11.1 setuptools-scm==6.3.2 simplejson==3.17.2 six==1.16.0 sktime==0.7.0 slicer==0.0.7 smart-open==5.2.1 sortedcontainers==2.4.0 spacy==3.4.1 spacy-legacy==3.0.9 spacy-loggers==1.0.2 split-folders==0.4.3 SQLAlchemy==1.4.22 srsly==2.4.3 statsmodels==0.12.2 stevedore==3.3.0 stopit==1.1.2 stringcase==1.2.0 tabulate==0.8.9 tblib==1.7.0 tenacity==6.3.1 text-unidecode==1.3 thinc==8.1.2 threadpoolctl==3.1.0 tk==0.1.0 toml==0.10.2 tomli==1.2.2 toolz==0.12.0 torch==1.10.1 torchmetrics==0.8.2 torchvision==0.11.2 tornado==6.1 tqdm==4.64.1 traitlets==5.1.0 tsai==0.3.1 typer==0.4.1 typing-inspect==0.8.0 typing_extensions==4.3.0 update-checker==0.18.0 urllib3==1.26.3 wasabi==0.9.1 wcwidth==0.2.5 webencodings==0.5.1 wordcloud==1.8.1 wrapt==1.12.1 yarl==1.7.0 zipp==3.8.1 websocket-client==1.3.3 Werkzeug==1.0.1 xgboost==1.6.2 zict==2.2.0 pyod==1.0.4 suod==0.0.8 importlib_metadata==4.12.0 shap==0.39.0

yairVanti avatar Oct 03 '22 11:10 yairVanti

@yairVanti thanks for reporting. Could you please provide the code+data to reproduce the issue? What processor type do you have?

pplonski avatar Oct 03 '22 11:10 pplonski

The processor is a 2.6 GHz 6-core Intel Core i7. I don't think it's a data issue; it happens on many kinds of data. Another thing: it happens only if n_jobs is greater than 1. With a single thread everything works.
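
The single-thread workaround can be sketched roughly as below. This is a minimal sketch, assuming `AutoML` from the `supervised` package takes an `n_jobs` argument as described in this thread; `build_automl_kwargs` and `fit_single_threaded` are hypothetical helper names used only for illustration.

```python
def build_automl_kwargs(single_thread: bool = True) -> dict:
    """Return keyword arguments for AutoML (hypothetical helper)."""
    # n_jobs=1 keeps training on one thread, the setting reported to avoid the crash;
    # -1 asks for all available cores.
    return {"n_jobs": 1 if single_thread else -1}


def fit_single_threaded(X, y):
    """Fit AutoML on a single thread and return the fitted object."""
    # Imported lazily so build_automl_kwargs can be used without mljar installed.
    from supervised.automl import AutoML

    automl = AutoML(**build_automl_kwargs(single_thread=True))
    automl.fit(X, y)
    return automl
```

This only sidesteps the crash at the cost of training time; it does not explain why the multi-threaded run fails.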

yairVanti avatar Oct 03 '22 13:10 yairVanti

Thank you. Are you able to track the reason?
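
One way to narrow this down might be Python's standard `faulthandler` module, which prints the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV. The sketch below is only a suggestion; `enable_crash_traceback` and the log path are hypothetical names, not part of mljar.

```python
import faulthandler
import sys


def enable_crash_traceback(log_path=None):
    """Dump Python tracebacks of all threads when a fatal signal arrives."""
    # The file object must stay open for as long as faulthandler is enabled,
    # so it is returned to the caller instead of being closed here.
    stream = open(log_path, "w") if log_path else sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return stream
```

Calling `enable_crash_traceback("crash.log")` before `automl.fit(...)` should leave a traceback in the log showing which Python frame each thread was executing at the moment of the crash.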

pplonski avatar Oct 03 '22 13:10 pplonski

No. The linear regression works, and then it crashes on the simple/default algorithms step.

yairVanti avatar Oct 03 '22 13:10 yairVanti

@pplonski - I suspect it's an issue with the macOS Monterey version. Maybe the import order matters? If lightgbm is imported first, will it pass? See https://github.com/microsoft/LightGBM/issues/4229 and https://www.pythonfixing.com/2021/11/fixed-python-multithreading-didn-work.html
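
Crashes of this kind on macOS are sometimes attributed to more than one OpenMP runtime being loaded into the same process, for example one copy bundled with LightGBM and another bundled with a different package. Whether that is the cause here is an assumption, not something confirmed in this thread. A rough sketch for listing the OpenMP libraries present under a set of directories (`find_openmp_libraries` is a hypothetical helper):

```python
from pathlib import Path

# File-name prefixes of common OpenMP runtimes (LLVM, Intel, GNU).
OPENMP_PREFIXES = ("libomp", "libiomp", "libgomp")


def find_openmp_libraries(roots):
    """Return sorted paths of OpenMP shared libraries found under the given directories."""
    found = set()
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # skip entries that do not exist or are not directories
        for path in root.rglob("*"):
            name = path.name
            is_shared_lib = ".dylib" in name or ".so" in name
            if path.is_file() and name.startswith(OPENMP_PREFIXES) and is_shared_lib:
                found.add(str(path))
    return sorted(found)
```

Passing the environment's `site-packages` directory as a root lists the candidates; more than one distinct runtime in the output would make the hypothesis worth testing further.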

yairVanti avatar Oct 06 '22 12:10 yairVanti

Thanks @yairVanti for the investigation. Can you confirm that import order matters? Can you try to import lightgbm in your script before the automl import?
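
A small sketch of that experiment, assuming mljar-supervised is importable as `supervised`; `import_in_order` is a hypothetical helper that imports modules in a fixed order and reports which ones loaded:

```python
import importlib


def import_in_order(module_names):
    """Import modules one by one in the given order and report which ones loaded."""
    loaded = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            loaded[name] = True
        except ImportError:
            loaded[name] = False
    return loaded
```

Running `import_in_order(["lightgbm", "supervised"])` at the very top of the script, before any other third-party import, and then swapping the order in a second run, would show whether the order changes the outcome.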

pplonski avatar Oct 06 '22 12:10 pplonski

I tried it; it didn't work.

yairVanti avatar Oct 08 '22 16:10 yairVanti

I got the same error because of limited memory: the process is killed by the system, but mljar cannot quit automatically. The trained model is preserved, though.

xinlnix avatar Oct 15 '22 13:10 xinlnix

When I ran all algorithms except LightGBM, everything worked. So the problem is indeed narrowed down to the LightGBM algorithm in AutoML running on a Mac (currently my version is 12.6.1).
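
Excluding LightGBM can be sketched as follows. This assumes `AutoML` accepts an `algorithms` list and that the names below match the ones in the mljar-supervised documentation for the installed version; `algorithms_without` and `build_automl_without_lightgbm` are hypothetical helpers.

```python
# Algorithm names as assumed from the mljar-supervised documentation;
# check them against the installed version before relying on this list.
ALL_ALGORITHMS = [
    "Baseline", "Linear", "Decision Tree", "Random Forest", "Extra Trees",
    "LightGBM", "Xgboost", "CatBoost", "Neural Network", "Nearest Neighbors",
]


def algorithms_without(excluded, available=ALL_ALGORITHMS):
    """Return the available algorithm names minus the excluded ones (case-insensitive)."""
    excluded_lower = {name.lower() for name in excluded}
    return [name for name in available if name.lower() not in excluded_lower]


def build_automl_without_lightgbm():
    """Create an AutoML object that never trains LightGBM."""
    # Imported lazily so algorithms_without can be used without mljar installed.
    from supervised.automl import AutoML

    return AutoML(algorithms=algorithms_without(["LightGBM"]))
```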

yairVanti avatar Dec 06 '22 12:12 yairVanti

@yairVanti great catch! What processor do you have on Mac? M1/M2?

pplonski avatar Dec 06 '22 12:12 pplonski

2.6 GHz 6-Core Intel Core i7

yairVanti avatar Dec 06 '22 13:12 yairVanti

Are you able to manually update only the lightgbm package and check whether the problem still occurs?

pplonski avatar Dec 06 '22 13:12 pplonski

Updated lightgbm to 3.3.0 (latest version on PyPI), and the problem persists.
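
To confirm which versions are actually active in the environment that crashes, a quick check with the standard library could look like this (`installed_version` is a hypothetical helper; the package names are the ones from the requirements list above):

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("lightgbm", "mljar-supervised", "numpy", "scikit-learn")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}
```

Printing `report_versions()` from inside the same interpreter PyCharm uses rules out the case where the upgrade landed in a different environment.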

yairVanti avatar Dec 06 '22 13:12 yairVanti