[bug]: segmentation fault on startup [Python 3.11]

Open keturn opened this issue 2 years ago • 16 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

OS

Linux

GPU

cuda

VRAM

12

What version did you experience this issue on?

bb9460d2781276d688d8da6287957826c2c05023

What happened?

I'm trying to get a development environment going with Python 3.11. Dependencies all installed successfully, but invokeai-web segfaults immediately.

faulthandler log
$ PYTHONFAULTHANDLER=True invokeai-web
Fatal Python error: Segmentation fault

Current thread 0x00007f13ff078000 (most recent call first):
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/torch/jit/_script.py", line 1345 in script
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/activations_me.py", line 60 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/create_act.py", line 8 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/evo_norm.py", line 32 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/create_norm_act.py", line 12 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/conv_bn_act.py", line 9 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/__init__.py", line 10 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/fx_features.py", line 18 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/helpers.py", line 21 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/beit.py", line 50 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/__init__.py", line 1 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/__init__.py", line 2 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/vit.py", line 3 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/blocks.py", line 4 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/dpt_depth.py", line 6 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/api.py", line 9 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/__init__.py", line 11 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/__init__.py", line 7 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  ...

Extension modules: pydantic.typing, pydantic.errors, pydantic.version, pydantic.utils, pydantic.class_validators, pydantic.config, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.error_wrappers, pydantic.fields, pydantic.parse, pydantic.schema, pydantic.main, pydantic.dataclasses, pydantic.annotated_types, pydantic.decorator, pydantic.env_settings, pydantic.tools, pydantic, yaml._yaml, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, charset_normalizer.md, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, PIL._imaging, regex._regex, scipy._lib._ccallback_c, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.linalg._flinalg, scipy.special._ellip_harm_2, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.optimize._minpack2, 
scipy.optimize._group_columns, scipy._lib.messagestream, scipy.optimize._trlib._trlib, numpy.linalg.lapack_lite, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.spatial._ckdtree, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.optimize._direct, pywt._extensions._dwt, pywt._extensions._cwt, pywt._extensions._pywt, pywt._extensions._swt, scipy._lib._uarray._uarray, PIL._imagingft, psutil._psutil_linux, psutil._psutil_posix, skimage._shared.geometry, skimage.measure._find_contours_cy, skimage.measure._marching_cubes_lewiner_cy, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, skimage.measure._moments_cy, scipy.signal._sigtools, scipy.signal._max_len_seq_inner, scipy.signal._upfirdn_apply, scipy.signal._spline, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.signal._sosfilt, scipy.signal._spectral, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, 
scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.signal._peak_finding_utils, skimage.measure._pnpoly, skimage.measure._ccomp (total: 169)
Segmentation fault (core dumped)

segfault.txt
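The log above was captured by setting PYTHONFAULTHANDLER in the shell. As a rough sketch, the same handler can also be switched on from inside Python using the standard library's faulthandler module, which is handy when the entry point is a script you control. The helper name enable_crash_traceback and its log_file parameter are made up for illustration and are not part of InvokeAI.

```python
import faulthandler
import sys


def enable_crash_traceback(log_file=None):
    """Enable faulthandler so a fatal signal (e.g. SIGSEGV) dumps the Python stack.

    log_file is a hypothetical parameter: any open file object with a real
    file descriptor. When omitted, the dump goes to sys.stderr.
    """
    target = log_file if log_file is not None else sys.stderr
    # all_threads=True dumps every thread, not just the one that crashed.
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Call it as early as possible, before the imports that crash, for example enable_crash_traceback() as the first statement of a launcher script. The file passed in must stay open for as long as the handler is enabled.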

Additional context

Using python3.11 package on Ubuntu 22.04.2 LTS.

keturn avatar Jul 27 '23 21:07 keturn

I see timm in there, and our dependencies are pinned to an old version of that, but upgrading that to the latest didn't help.

The top line is in torch.jit, so I guess something is messed up with my torch installation?

keturn avatar Jul 27 '23 21:07 keturn

if that line number can be trusted, it's falling over when torch jitscript is trying to copy over the docstring reference?

https://github.com/pytorch/pytorch/blob/e9ebda29d87ce0916ab08c06ab26fd3766a870e5/torch/jit/_script.py#L1345

that is, uh, not something I expected.

keturn avatar Jul 27 '23 21:07 keturn

https://pytorch.org/docs/stable/jit.html#disable-jit-for-debugging — using PYTORCH_JIT=0 to disable it allows the process to start, but it's obviously not a fix.
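For anyone who wants to try the same diagnostic from a script, here is a minimal sketch that launches a command in a child process with PYTORCH_JIT=0 set. The function names are hypothetical. A child process is used on the assumption that the variable needs to be in the environment before torch is imported; like the shell variant, this only sidesteps the crash and does not fix it.

```python
import os
import subprocess
import sys


def env_with_jit_disabled(base=None):
    """Return a copy of the environment with PYTORCH_JIT=0 added."""
    env = dict(os.environ if base is None else base)
    env["PYTORCH_JIT"] = "0"
    return env


def run_with_jit_disabled(command):
    """Run command (a list of arguments) with TorchScript disabled via the environment."""
    # check=False: the caller inspects returncode, since a crash is an expected outcome here.
    return subprocess.run(command, env=env_with_jit_disabled(), check=False)
```

Usage would look like run_with_jit_disabled(["invokeai-web"]). On Linux, a negative returncode means the child was killed by a signal, so -11 would indicate it still segfaulted.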

keturn avatar Jul 27 '23 22:07 keturn

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00000000005266a0 in _PyDictKeys_StringLookup (dk=0x0, key='__doc__') at ../Objects/dictobject.c:1011
1011    ../Objects/dictobject.c: No such file or directory.
(gdb) bt
#0  0x00000000005266a0 in _PyDictKeys_StringLookup (dk=0x0, key='__doc__') at ../Objects/dictobject.c:1011
#1  0x0000000000504e03 in specialize_dict_access (kind=<optimized out>, base_op=95, hint_op=159, values_op=154, name=<optimized out>, type=0x448b4e0, instr=0x4cd1ea6, owner=<torch._C.ScriptFunction at remote 0x7fff435acad0>)
    at ../Python/specialize.c:625
#2  _Py_Specialize_StoreAttr (name=<optimized out>, instr=0x4cd1ea6, owner=<torch._C.ScriptFunction at remote 0x7fff435acad0>) at ../Python/specialize.c:813

That dk=0x0 -- a null got passed in as the DictKeys object? how does this even happen

keturn avatar Jul 27 '23 22:07 keturn

building a new version of Python 3.11.4 (using pyenv) instead of using the python3.11 in Ubuntu LTS seems to have fixed things.

So I guess this is not-a-bug?

but maybe we have to explain to people that python 3.11 works unless you're using Ubuntu LTS? ugh.
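Since the crash went away with a pyenv-built 3.11.4 but not with the distro package, it helps to record exactly which interpreter a virtual environment is running when reporting this. The sketch below collects that information with the standard library only; describe_interpreter is a hypothetical helper name.

```python
import platform
import sys


def describe_interpreter():
    """Collect details that distinguish a distro-packaged Python from a self-built one."""
    return {
        "executable": sys.executable,              # path of the running interpreter
        "version": platform.python_version(),      # version string, e.g. a 3.11.x release
        "build": platform.python_build(),          # (build number, build date) tuple
        "compiler": platform.python_compiler(),    # compiler used to build Python
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Run it with the venv's own python and paste the output alongside the faulthandler log.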

keturn avatar Jul 27 '23 23:07 keturn

Yikes. @Millu , let's add a warning in the docs about potential python 3.11 issues on Ubuntu LTS (22.04).

Here's a recipe from @gogurtenjoyer to build python on linux: https://discord.com/channels/1020123559063990373/1049495067846524939/1134255238963011644

Is that the process you followed @keturn ?

psychedelicious avatar Jul 31 '23 08:07 psychedelicious

No, I used https://github.com/pyenv/pyenv

keturn avatar Jul 31 '23 19:07 keturn

Seems like this is happening on python 3.10 too 😬

See #3967

Millu avatar Aug 07 '23 05:08 Millu

Both segfaults, but very different places. This one was at the very start of the process launch, long before being able to attempt image generation.

keturn avatar Aug 07 '23 06:08 keturn

If I can add to this:

./invoke.sh: line 54: 39533 Segmentation fault      (core dumped) invokeai-web $PARAMS

JohnDevlopment avatar Oct 22 '23 20:10 JohnDevlopment

> ./invoke.sh: line 54: 39533 Segmentation fault      (core dumped) invokeai-web $PARAMS

This also happens on my system (Manjaro), but it might be a different issue because setting PYTORCH_JIT=0 does not fix this issue for me.

SpecificProtagonist avatar Oct 23 '23 19:10 SpecificProtagonist

Segmentation fault with fresh install of invoke 4 on Manjaro Linux: invoke.sh: line 37: 29423 Segmentation fault (core dumped) invokeai-web $PARAMS

No ideas how to debug this!

arigbs avatar Apr 05 '24 21:04 arigbs

This seems to be dependent on the python version installed. You can try installing the latest python using pyenv or building yourself a fresh python.

psychedelicious avatar Apr 05 '24 22:04 psychedelicious

> Segmentation fault with fresh install of invoke 4 on Manjaro Linux: invoke.sh: line 37: 29423 Segmentation fault (core dumped) invokeai-web $PARAMS
>
> No ideas how to debug this!

It turns out that in my case it was a patchmatch issue. I recalled trying to fix a recurrent patchmatch warning by following the steps in the repo on how to stop that warning. So I disabled patchmatch in the invokeai.yaml file, and now I no longer get the segmentation fault and the web UI loads.

arigbs avatar Apr 05 '24 23:04 arigbs

@arigbs that's a good catch. Some of the users who have this error had successfully compiled patchmatch. You can see whether it loaded in the startup logs.

psychedelicious avatar Apr 06 '24 02:04 psychedelicious

Same problem here. Fedora 40, nvidia, Python 3.11, v4.2.2post1

Crashes on startup. Disabling patchmatch in the config fixes it.

daleglass avatar May 22 '24 22:05 daleglass

Delete python3.11 completely:

sudo apt-get remove python3.11-venv 
sudo apt list --installed | grep python3.11
sudo apt-get purge python3.11
sudo apt-get autoremove
sudo rm -rf /usr/local/lib/python3.11
sudo rm -rf /usr/local/bin/python3.11
sudo apt-get clean
sudo apt-get autoclean

Then install python3.10-venv:

sudo apt install git python3.10-venv -y

It worked for me.

heloess avatar Jun 12 '24 16:06 heloess