Request discussion of distributing with conda if you have optional pip dependencies
Hi and thanks for your great guide! 🚀 I'm finding it very useful as I navigate switching from setup.py 💯
This question was sparked by a package review and some helpful comments from @NickleDave.
I use packages such as Deep Lab Cut that contain optional feature dependencies, so there are convenient pip install options like pip install deeplabcut[gui]. You have a nice discussion of optional dependencies in your guide: https://www.pyopensci.org/python-package-guide/package-structure-code/declare-dependencies.html
In my admittedly limited understanding, conda doesn't support optional dependencies. It would be useful to have guidelines on what to do when you want them, and on how this should factor into whether you distribute on conda at all. As it currently stands, the guide makes it seem like you should default to creating a conda distribution. With this, I'm concerned about conda's apparent lack of flexibility with optional dependencies: e.g., in one package I help maintain we are considering splitting it into two repos because conda doesn't support optional dependencies. It's a real quandary!
I think it would be helpful to see a discussion of this point, with some suggestions or options. I should add I'm not an expert in package management, so there's a chance I'm just missing something obvious 😆
hey @EricThomson 👋 welcome to pyOpenSci!! this is a great comment - let's see what we can hash out on our discourse with the other conda maintainers regarding this quandary. and then from there we can talk about what the best approach would be for conda users and what to add to our guidebook. admittedly we've spent the least amount of time on the conda end of things (just because of all the work on the pip end) but our conda tutorial will be coming out in the upcoming months and i think this is important for packages that as you mention have feature deps that are optional!
@EricThomson it is going to be a little hard to have a guide on this because conda really doesn't have a good framework for it. For your example, deeplabcut[gui], there are two typical approaches I see:
1. The conda-forge package deeplabcut comes with "batteries included" and includes the gui dependencies. A secondary output package called deeplabcut-base comes without the gui dependencies. This is, for example, how matplotlib works on conda-forge. The matplotlib package comes with the optional pyqt and tornado dependencies to support GUI stuff, and depends on the matplotlib-base package: https://github.com/conda-forge/matplotlib-feedstock/blob/main/recipe/meta.yaml#L79-L92
2. The conda-forge package deeplabcut only includes the base dependencies but specifies run constraints (run_constrained requirements in the recipe's meta.yaml) so that an optional dependency installed into the same environment has the correct version constraints applied. This works well when single dependencies add functionality, like adding pyarrow to an environment with pandas to get improved I/O performance.
For the two approaches, I would generally suggest (1) if the optional functionality has multiple dependencies and (2) if it has a single dependency. All of this breaks down quickly when there are many sets of optional dependencies, though this typically is a sign that things should be extracted into separate plugins/packages. Either way, deeplabcut will probably want to handle missing dependencies gracefully and suggest to users how to install them.
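To make the "handle missing dependencies gracefully" part concrete, here is a minimal sketch of one common pattern: import the optional module lazily and, if it is missing, raise an error that tells the user what to install. The helper name `import_optional`, the module name `some_gui_toolkit`, and the install hint are all made up for illustration; nothing here is taken from deeplabcut's actual code.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message.

    `install_hint` is free text supplied by the caller, e.g. a pip extra
    or the name of the conda package that bundles the dependency.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with instructions instead of a bare "No module named ...".
        raise ImportError(
            f"The optional dependency '{module_name}' is not installed. "
            f"To enable this feature, run: {install_hint}"
        ) from err


if __name__ == "__main__":
    # Hypothetical module name and hint, used only to trigger the error path.
    try:
        import_optional("some_gui_toolkit", "pip install 'deeplabcut[gui]'")
    except ImportError as err:
        print(err)
```

Calling the helper inside the function that needs the GUI, rather than at the top of the package, keeps a base install importable. The same message could just as well point to the conda package name, depending on which of the two approaches the recipe uses.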