
Remove jupyter as a dependency (+maybe others)

Open chimaerase opened this issue 1 year ago • 4 comments

Installing TPOT2 unexpectedly added Jupyter as a requirement for my project. As a general rule, software dependencies should be minimized to reduce bloat and improve resource use.

Context of the issue

Installing TPOT2 unexpectedly added Jupyter as a requirement for my project. Perhaps some dependencies can be installed as optional extras and the install instructions updated? Possibly some of these dependencies are being pulled in by Dask, but this subjectively looks like a lot of unnecessary bloat for my application, where we don't have or want a user interface. Switching back to the older TPOT removed jupyter as a requirement.

Process to reproduce the issue

# Start a fresh Python environment using Docker (or any other way)
$ docker run -it --rm --entrypoint /bin/bash python:3.12-slim-bookworm
# No packages installed
root@7d268534f551:/# pip freeze
# Install TPOT2
root@7d268534f551:/# pip install tpot2

The result is a very large list of dependencies: dependency-list.txt
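To tell direct requirements apart from ones pulled in transitively (for example by Dask), the installed package metadata can be inspected from Python. The following is a minimal sketch using only the standard library; the helper names `requirement_name` and `direct_requirements` are made up for illustration, and the name parsing is a simplified regex rather than a full requirement parser.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def requirement_name(req: str) -> str:
    """Extract the normalized project name from a requirement string.

    Simplified: takes the leading name token and ignores version
    specifiers, extras, and environment markers.
    """
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    if match is None:
        raise ValueError(f"cannot parse requirement: {req!r}")
    # Normalize runs of '-', '_' and '.' to '-' and lowercase the name.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def direct_requirements(dist_name: str) -> list[str]:
    """Return the sorted names a distribution declares directly.

    Returns an empty list if the distribution is not installed.
    """
    try:
        declared = requires(dist_name) or []
    except PackageNotFoundError:
        return []
    return sorted({requirement_name(r) for r in declared})


if __name__ == "__main__":
    names = direct_requirements("tpot2")
    print("jupyter declared directly:", "jupyter" in names)
    print(names)
```

Run inside the same container after `pip install tpot2`. If `jupyter` is absent from this list but still shows up in `pip freeze`, it is coming from another package rather than from TPOT2 itself.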

chimaerase avatar Sep 11 '24 19:09 chimaerase

We have removed jupyter as a dependency in PR #130. The current version of main does not include it. But we haven't pushed an update to pip in a while, which we should do again soon. The older version currently on pip does require jupyter.

We used to have an export to the pipeline API from the baikal package, which was eventually removed due to a refactor and never re-implemented. We could remove baikal or include it as an optional dependency.

I believe all the other packages are required, though.

perib avatar Sep 11 '24 20:09 perib

Thank you for your reply! That makes sense, especially with TPOT2 still in alpha status.

chimaerase avatar Sep 12 '24 21:09 chimaerase

Confirmed that jupyter is removed as a dependency in 0.1.8a0. I'll leave this issue open for now in case it's helpful as a reminder to consider removing any others (e.g. @perib mentioned baikal above).

chimaerase avatar Sep 18 '24 16:09 chimaerase

I also noticed that nvidia-nccl-cu12 is a dependency of xgboost as installed here. It appears that's an NVIDIA-specific library that will only work on systems with NVIDIA GPUs (but will still be installed). Maybe TPOT2 can install xgboost-cpu by default, and include xgboost's GPU support as an extra?
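To illustrate what such a split would look like in the installed metadata, here is a small sketch that groups requirement strings by their `extra == "..."` marker. The `hypothetical_metadata` list and the `gpu` extra name are assumptions made up for this example; TPOT2 does not currently declare them, and `split_by_extra` is an illustrative helper, not part of any library.

```python
import re
from collections import defaultdict

# Matches the extra name in a marker such as: extra == "gpu"
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def split_by_extra(requirements):
    """Group requirement strings by the extra that enables them.

    Requirements without an extra marker go under the empty-string key,
    meaning they are always installed.
    """
    groups = defaultdict(list)
    for req in requirements:
        spec, _, marker = req.partition(";")
        match = _EXTRA_MARKER.search(marker)
        key = match.group(1) if match else ""
        groups[key].append(spec.strip())
    return dict(groups)


# Hypothetical metadata: CPU-only xgboost by default, GPU build as an extra.
hypothetical_metadata = [
    "scikit-learn",
    "xgboost-cpu",
    'xgboost; extra == "gpu"',
]

if __name__ == "__main__":
    for extra, specs in split_by_extra(hypothetical_metadata).items():
        print(extra or "(default)", specs)
```

With a layout like this, a plain `pip install tpot2` would pull only the default group, while something like `pip install "tpot2[gpu]"` would add the GPU-enabled build, assuming the project chose to define that extra.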

chimaerase avatar Oct 24 '24 00:10 chimaerase