
Pixi requires cuda in `system-requirements` even if CUDA is installed by Pixi via dependencies

garymm opened this issue 1 year ago

Checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pixi, using pixi --version.

Reproducible example

[project]
name = "jax-env"
channels = ["conda-forge", "nvidia"]
platforms = ["linux-64"]

[dependencies]
jax = ">=0.4.23,<0.5.0"

[host-dependencies]
python = ">=3.10,<3.11"

[target.linux-64.dependencies]
jaxlib = { version = ">=0.4.23,<0.5.0", build = "*cuda12*" }
cuda-nvcc = ">=12.0,<13"
cuda-cupti = "*"

With this file, pixi install fails with:

 × failed to solve the conda requirements of 'default' 'linux-64'
  ╰─▶ Cannot solve the request because of: jaxlib >=0.4.23,<0.5.0 *cuda12* cannot be installed because there are no viable options:
      └─ jaxlib 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 | 0.4.23 would require
         └─ __cuda *, for which no candidates were found.

Issue description

But if I add:

[system-requirements]
cuda = "12"

Then things work. This is ugly because:

  1. It's not a system requirement: it's managed by pixi.
  2. I only want it to apply to a specific target, not all targets.

Expected behavior

No system-requirements required.
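
For reference, a rough sketch of how this could be scoped per environment instead of globally (untested; it assumes pixi's feature/environment support and that system-requirements can be set per feature):

[feature.cuda.system-requirements]
cuda = "12"

[feature.cuda.target.linux-64.dependencies]
jaxlib = { version = ">=0.4.23,<0.5.0", build = "*cuda12*" }
cuda-nvcc = ">=12.0,<13"
cuda-cupti = "*"

[environments]
cuda = ["cuda"]

That way the requirement would at least not leak into the default environment.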

garymm avatar Apr 26 '24 22:04 garymm

We are having the same issue. I think it's related to https://github.com/prefix-dev/pixi/issues/480

dennis-wey avatar May 02 '24 13:05 dennis-wey

Editing to add that this behavior also occurs on Windows. I've been trying to run pytorch stuff on Windows and was pulling my hair out as to why CUDA wasn't available. The system-requirements behavior described above is the reason.

Adding:

[tool.pixi.system-requirements]
cuda = "12"

to the pyproject.toml is required to get pytorch with cuda to work.

Executing pixi run python -c "import torch;print(torch.cuda.is_available())"

returns "False" without the tool.pixi.system-requirements table and the cuda requirement added to the file.

2themaxx avatar May 07 '24 19:05 2themaxx

Hey sorry for the late reaction,

TL;DR: This is not a bug, we're looking into improved user experiences to lower this issue's impact.

This is not a bug but a deliberate decision. We needed a way to specify the virtual packages on a system while solving.

We know this is currently a bad user experience and we're brainstorming weekly on a proper fix, but it's a hard problem to get right. One of the solutions currently written down is the one in this issue: https://github.com/prefix-dev/pixi/issues/346#issuecomment-2094906047.

If you would like to give your input there, that would help us.

To explain the technical side:

The cuda-* dependencies are not the same as the cuda system requirement. Dependencies like cuda-version (the root of most cuda packages) have a dependency on __cuda. The __ prefix means that it is not a package conda-forge can provide you, but a "thing" that the system you are running on needs to provide. This is called a virtual_package, but we didn't like that name so we renamed it to system-requirements.

Thus the toml:

[system-requirements]
cuda = "12.1"

tells pixi to add __cuda==12.1 to the set of available packages that can be used to solve the environment. Without this information the dependency solver will not accept the cuda-version package as a viable option, because it cannot find the __cuda package anywhere.

If we always used the virtual packages detected on your local system, that would make your project very limited in which machines it can run on. So we define some default virtual packages that we expect to be present; on Linux, for example, we add __glibc==2.17 because most conda-forge packages are built for that version. For cuda a default doesn't make a lot of sense, as we can't say something like "all Linux machines should at least have __cuda==12" without it breaking more often than not.
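
To make that concrete, here is roughly how the defaults can be raised or extended per project (a sketch; check the system-requirements documentation for the exact keys):

[system-requirements]
# minimum Linux kernel version the project may assume
linux = "5.10"
# raise the default __glibc==2.17 baseline
libc = { family = "glibc", version = "2.28" }
# add __cuda, for which there is no sensible default
cuda = "12"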

So, as described in the other issue, we need a smarter way. Ideas are trickling in, and we would like to solve this in the not-too-distant future.

ruben-arts avatar May 25 '24 10:05 ruben-arts

So I guess part of the issue is that some packages in conda-forge are incorrectly declaring a dependency on system-wide CUDA, even though CUDA can be installed via conda packages (specifically from the nvidia channel)? Is there a way for the packages to declare a dependency on either system cuda or a package that provides cuda? Or some other solution to this from the packager side?

garymm avatar May 28 '24 16:05 garymm

Actually the nvidia channel has the same logic. You can check the dependencies, or actually constraints, of the nvidia cudatoolkit package.

[screenshot of the nvidia cudatoolkit package metadata showing its constraints]

Setting the [system-requirements] table lets pixi know what to do.

[system-requirements]
cuda = "11"

ruben-arts avatar May 28 '24 18:05 ruben-arts

I'm able to install pytorch-cuda without any system-requirement, but other packages such as jaxlib I'm not able to. So it seems like whatever the pytorch-cuda package is doing (it seems to depend on cuda-libraries) works without a system-requirement. I'm not sure why jaxlib or the cudatoolkit package requires the system requirement.
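
For context, the kind of manifest I mean that solves without [system-requirements] looks roughly like this (a hypothetical example, not my exact file; pytorch-cuda from the pytorch channel seems to pull the CUDA libraries in as regular dependencies):

[project]
name = "torch-env"
channels = ["conda-forge", "pytorch", "nvidia"]
platforms = ["linux-64"]

[dependencies]
pytorch = { version = "*", channel = "pytorch" }
pytorch-cuda = { version = ">=12.0,<13", channel = "pytorch" }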

garymm avatar May 28 '24 18:05 garymm

I'm not sure, but there might be some runtime code in those packages that decides whether to run with or without CUDA, even if it is installed as pytorch-cuda.

ruben-arts avatar May 29 '24 11:05 ruben-arts

OK I guess my real issue is with the packages declaring this dependency unnecessarily (I think, though not very sure). I will close this, and hopefully https://github.com/prefix-dev/pixi/issues/480 can be fixed to make it more tolerable :-)

garymm avatar May 29 '24 16:05 garymm

FYI opened https://github.com/conda-forge/jaxlib-feedstock/issues/254 against jaxlib.

garymm avatar May 29 '24 17:05 garymm