Document the workaround for "ModuleNotFoundError" when setup.py depends on custom dependencies like 'torch'
Issue Kind
Other
Description
I'm deploying AI/ML models to production and use Poetry for dependency management, notably for its lock file capabilities. However, I keep running into a common problem: many AI packages import custom dependencies such as torch in their setup.py:
- facebookresearch/detectron2
- open-mmlab/mmdetection
- open-mmlab/mmagic
- facebookresearch/xformers and related issue
- see more on sourcegraph
- note also the closed poetry issue
- see also one of unanswered stackoverflow threads
Steps to reproduce:
$ cd `mktemp -d`
$ poetry new my_package && cd my_package
$ python -m venv .venv
$ source .venv/bin/activate
Now I edit my pyproject.toml to have a dependency on https://github.com/open-mmlab/mmagic.git:
$ cat pyproject.toml
[tool.poetry]
name = "my-package"
version = "0.1.0"
description = ""
authors = ["My Name <[email protected]>"]
readme = "README.md"
[tool.poetry.dependencies]
python = "^3.10"
mmagic = {git = "https://github.com/open-mmlab/mmagic.git", rev = "0a560bb"}
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
Locking:
$ poetry lock
Updating dependencies
Resolving dependencies... (6.2s)
Unable to determine package info for path: /private/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmp.xxZdsKad/my_package/.venv/src/mmagic
Command ['/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmpuvi1xtvh/.venv/bin/python', '-I', '-W', 'ignore', '-c', "import build\nimport build.env\nimport pyproject_hooks\n\nsource = '/private/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmp.xxZdsKad/my_package/.venv/src/mmagic'\ndest = '/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmpuvi1xtvh/dist'\n\nwith build.env.DefaultIsolatedEnv() as env:\n builder = build.ProjectBuilder.from_isolated_env(\n env, source, runner=pyproject_hooks.quiet_subprocess_runner\n )\n env.install(builder.build_system_requires)\n env.install(builder.get_requires_for_build('wheel'))\n builder.metadata_path(dest)\n"] errored with the following return code 1
Error output:
Traceback (most recent call last):
File "<string>", line 13, in <module>
File "/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmpuvi1xtvh/.venv/lib/python3.10/site-packages/build/__init__.py", line 239, in get_requires_for_build
with self._handle_backend(hook_name):
File "/Users/a.yushkovskiy/.pyenv/versions/3.10.10/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmpuvi1xtvh/.venv/lib/python3.10/site-packages/build/__init__.py", line 360, in _handle_backend
raise BuildBackendException(exception, f'Backend subprocess exited when trying to invoke {hook}') from None
build._exceptions.BuildBackendException: Backend subprocess exited when trying to invoke get_requires_for_build_wheel
Fallback egg_info generation failed.
Command ['/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmpuvi1xtvh/.venv/bin/python', 'setup.py', 'egg_info'] errored with the following return code 1
Output:
Traceback (most recent call last):
File "/private/var/folders/6g/pv8036cd7gnb1c86vc3jhx69dh2brq/T/tmp.xxZdsKad/my_package/.venv/src/mmagic/setup.py", line 9, in <module>
import torch
ModuleNotFoundError: No module named 'torch'
So the problem is that mmagic's setup.py imports torch, i.e. torch is a build dependency of mmagic.
Attempt 1: try to add this dependency to [build-system]:
$ cat pyproject.toml
...
[build-system]
requires = ["poetry-core", "torch"]
build-backend = "poetry.core.masonry.api"
$ poetry lock
#> same problem: ModuleNotFoundError: No module named 'torch'
Attempt 2: try to install torch into my poetry virtualenv:
$ pip install torch
Collecting torch
Downloading torch-2.4.1-cp310-none-macosx_11_0_arm64.whl.metadata (26 kB)
...
$ python -c 'from torch import __version__; print(__version__)'
2.4.1
$ poetry run python -c 'from torch import __version__; print(__version__)' # just in case to check
2.4.1
$ poetry lock
#> same problem: ModuleNotFoundError: No module named 'torch'
So the problem is that Poetry builds the package in a new isolated virtual environment created with virtualenv, which by default does not expose the system site packages (--system-site-packages is off). However, this default can be overridden with the environment variable VIRTUALENV_SYSTEM_SITE_PACKAGES=true:
$ export VIRTUALENV_SYSTEM_SITE_PACKAGES=true
$ poetry lock
Updating dependencies
Resolving dependencies... (35.7s)
Writing lock file
$ cat poetry.lock | grep mmagic
name = "mmagic"
url = "https://github.com/open-mmlab/mmagic.git"
Success!
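For scripted use (for example in CI), the same workaround can be applied to a single `poetry lock` call without exporting the variable for the whole shell session. This is a sketch under the assumption that `poetry` is on PATH; the helper names are hypothetical, and the only thing taken from the steps above is the environment variable itself.

```python
import os
import subprocess


def env_with_system_site_packages(base=None):
    """Copy an environment mapping and enable system site packages for virtualenv."""
    env = dict(os.environ if base is None else base)
    # Same variable as in the shell session above.
    env["VIRTUALENV_SYSTEM_SITE_PACKAGES"] = "true"
    return env


def lock_with_system_site_packages(project_dir="."):
    """Run `poetry lock` in project_dir with the variable set for that call only."""
    return subprocess.run(
        ["poetry", "lock"],
        cwd=project_dir,
        env=env_with_system_site_packages(),
        check=True,  # raise CalledProcessError if locking fails
    )


# Usage (not executed here): lock_with_system_site_packages("my_package")
```

Scoping the variable to one subprocess keeps other virtualenv invocations in the same session unaffected.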
Conclusion
I think this workaround should be officially supported (by guaranteeing that --system-site-packages won't be overridden by Poetry when it creates temporary virtual envs) and clearly documented in a troubleshooting section.
Impact
See details in Description.
Workarounds
So Poetry can't handle the listed packages, even when running inside a Docker image where torch is definitely installed, for example:
FROM pytorch/pytorch:2.1.2-cuda11.8-cudnn8-runtime
WORKDIR /app
RUN pip install "poetry==1.8.3"
COPY pyproject.toml /app/
COPY poetry.lock /app/
RUN poetry install # or poetry lock
#> error: ModuleNotFoundError: No module named 'torch'
I don't know a better workaround for this issue than RUN poetry export --output=requirements.txt followed by pip install -r requirements.txt. That is a dirty hack, because it solves only part of the problem (installing in a Docker image in CI/CD) and doesn't work for setting up my local dev environment.
This is not a good workaround: the whole point of an isolated build environment is that it should be isolated.
You should prefer to raise issues with projects that do not build in isolated environments, asking them to do either (or both) of the following:
- declare their build dependencies, so that they do build in an isolated environment
- publish wheels, so that end users don't have to care about this
The reality is that a large number of Python packages have this specific problem. I agree that this is a problem with the packages, but it would be nice if Poetry, for example, allowed specifying build requirements for such packages in pyproject.toml, since it's often not realistic to get the packages themselves fixed in any reasonable amount of time.
@deivse I have to agree with @dimbleby here, this is a problem for the project. While it might be nice to provide an override feature, it introduces a significant maintenance burden for Poetry maintainers. And considering that standards exist for a reason and that a feature like that would encourage misbehaving actors in the ecosystem, I do not think there is sufficient value there. I am not saying that this specific case is misbehavior; it is likely a simple misconfiguration that hasn't been caught yet in the development process, as a lot of these cases are.
I understand that it would be nice to have the flexibility, but amongst other priorities this might not be a top one. I have raised https://github.com/open-mmlab/mmagic/pull/2154 to help fix the issue. But note that torchvision does not have a compatible release after python 3.9 (that is more a torchvision issue).
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.