RHEL 9 (Plow)/Poetry: LD library paths not set correctly when installing PyTorch, and a requirement (numpy) is not installed
Description
When I add PyTorch and then import torch, I keep getting an error saying that the libcudnn.so file cannot be found. Here is how to reproduce the error:
poetry new torch-newest
cd torch-newest
poetry add torch  # defaults to the newest release, 2.2.2
poetry shell
python -c "import torch"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/central/home/zwang2/torch-newest/.venv/lib64/python3.9/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import * # noqa: F403
ImportError: libcudnn.so.8: cannot open shared object file: No such file or directory
Workarounds
poetry new torch-newest
cd torch-newest
poetry add torch
poetry add numpy
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/$USER/.cache/pypoetry/virtualenvs/torch-newest-bPtN3B6m-py3.9/lib/python3.9/site-packages/nvidia/cudnn/lib/:/home/$USER/.cache/pypoetry/virtualenvs/torch-newest-bPtN3B6m-py3.9/lib/python3.9/site-packages/nvidia/nccl/lib
poetry shell
python -c "import torch"
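The export line above hard-codes the virtualenv hash and only two NVIDIA components. As a rough sketch of the same workaround (not something Poetry or PyTorch provides), the directories could be discovered from whichever interpreter runs the script. The helper names `nvidia_lib_dirs` and `ld_library_path` are made up for illustration, and the `nvidia/<component>/lib` layout is assumed from the paths used in the workaround.

```python
import os
import sysconfig
from pathlib import Path


def nvidia_lib_dirs(site_packages: Path) -> list:
    """Return every nvidia/<component>/lib directory under site_packages, sorted."""
    return sorted(str(p) for p in site_packages.glob("nvidia/*/lib") if p.is_dir())


def ld_library_path(site_packages: Path, existing: str = "") -> str:
    """Append the discovered directories to an existing LD_LIBRARY_PATH value."""
    parts = [existing] if existing else []
    parts.extend(nvidia_lib_dirs(site_packages))
    return ":".join(parts)


if __name__ == "__main__":
    # "platlib" is where platform-specific (binary) wheels are installed
    # for the interpreter that runs this script.
    site_packages = Path(sysconfig.get_paths()["platlib"])
    print(ld_library_path(site_packages, os.environ.get("LD_LIBRARY_PATH", "")))
```

Run with the project's interpreter, for example through poetry run python, the printed value can be copied into an export LD_LIBRARY_PATH=... line. This only automates the workaround; it does not explain why the libraries are not found in the first place.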
Poetry Installation Method
pip
Operating System
RHEL9
Poetry Version
1.8.2
Poetry Configuration
cache-dir = "/home/zwang2/.cache/pypoetry"
experimental.system-git-client = false
installer.max-workers = null
installer.modern-installation = true
installer.no-binary = null
installer.parallel = true
keyring.enabled = true
solver.lazy-wheel = true
virtualenvs.create = true
virtualenvs.in-project = null
virtualenvs.options.always-copy = false
virtualenvs.options.no-pip = false
virtualenvs.options.no-setuptools = false
virtualenvs.options.system-site-packages = false
virtualenvs.path = "{cache-dir}/virtualenvs" # /home/zwang2/.cache/pypoetry/virtualenvs
virtualenvs.prefer-active-python = false
virtualenvs.prompt = "{project_name}-py{python_version}"
warnings.export = true
Python Sysconfig
No response
Example pyproject.toml
No response
Poetry Runtime Logs
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/central/home/zwang2/torch-newest/.venv/lib64/python3.9/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import * # noqa: F403
ImportError: libcudnn.so.8: cannot open shared object file: No such file or directory
It is not Poetry's responsibility to set your environment variables. Actually, it is not even within Poetry's power.
Then could you explain why I don't see this issue when I skip Poetry and just run pip3 install torch?
No idea, but it has nothing to do with setting LD_LIBRARY_PATH.
If you hope for someone to help you debug this, then providing a way to reproduce it would be best, e.g. in a Dockerized form.
But maybe now that you know the environment variable is a red herring, you will have better luck digging into it yourself.
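One possible starting point for that digging is to compare what is actually installed in the Poetry environment against a plain pip environment. The sketch below uses only the standard library's importlib.metadata. The helper name `installed_versions` is hypothetical, and the NVIDIA distribution names are an assumption about how the CUDA libraries are packaged for this torch build, so adjust them as needed.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # The nvidia-* names are assumptions; they depend on the CUDA build of torch.
    names = ["torch", "numpy", "nvidia-cudnn-cu12", "nvidia-nccl-cu12"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running this once in each environment and diffing the output would show whether the two installs really contain the same set of packages.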
I am not affiliated with Poetry, but I use it daily. This doesn't feel like a Poetry issue. I am also a little curious about the order in which you ran your commands. I am going to spin up a VM on my Prox host and test this to see what is happening, so this can hopefully be closed.
From what I gather, you want to create a new Poetry environment and install torch, which should install numpy and set the library path. I don't understand your order of operations. You create your environment, cd into it, then run poetry add, but you're not in your shell and you didn't use poetry -c; only later do you go into a shell and then use poetry -c.
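When the order of commands is in question, it can help to confirm which interpreter and environment a given python invocation actually uses. This is a minimal standard-library sketch; the function names `in_virtualenv` and `describe_interpreter` are made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv/virtualenv."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> dict:
    """Collect the details that identify the active Python environment."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the prefix printed inside poetry shell differs from the path that appears in the traceback or in the LD_LIBRARY_PATH workaround, the commands are running against different environments.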
I think these steps will achieve the same thing you're going for, and they work without issue in RHEL 9.3 for me as far as Poetry goes.
Update after OS install:
sudo dnf update -y
Install pip
sudo dnf install pip -y
Now I did get an error here related to pip but that is a different issue.
WARNING: Value for scheme.platlib does not match. Please report this to <https://github.com/pypa/pip/issues/10151>
distutils: /home/zorro/.local/lib/python3.9/site-packages
sysconfig: /home/zorro/.local/lib64/python3.9/site-packages
WARNING: Additional context:
user = True
home = None
root = None
prefix = None
Install Poetry, create environment, install torch
python -m pip install poetry
poetry new torchster
cd torchster
poetry shell
poetry add torch torchvision
That installed all dependencies including numpy. If you're having trouble with the env variable after following these, it isn't going to be related to Poetry. Poetry's part in this is fine from my testing.
I am not trying to step on toes, but there hasn't been a response from the person who opened this in three weeks. It's hard to convey tone online. I say this with no ill intent or condescension. I just feel like this can be put to bed so the Poetry devs can focus their time on other issues.
It seems clear to me from the original post that this is not a Poetry issue. It is an order-of-operations issue. If @neonine2 would simply create the project, cd into it, run poetry shell, and then begin running poetry add, it would solve their problem.
I tested on RHEL 9.3 (see above), and Poetry worked without any issues. I genuinely feel this can be closed. I apologize if I am overstepping.
Sorry, I just haven't had the chance to test your solution. I will close it now.