Causalml Installation in Databricks Runtime 9.1
**Describe the bug**
Error while installing the causalml library in Databricks Runtime 9.1.
**To Reproduce**
!pip install causalml
**Expected behavior**
Collecting causalml
Using cached causalml-0.12.3.tar.gz (406 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [60 lines of output]
/databricks/python3/lib/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-3vtz2qv0/causalml_bcf06cd8516f45eaa55c61a966172194/causalml/inference/tree/uplift.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
/databricks/python3/lib/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-3vtz2qv0/causalml_bcf06cd8516f45eaa55c61a966172194/causalml/inference/tree/causaltree.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
```
Error compiling Cython file:
------------------------------------------------------------
...
cdef double node_impurity(self) nogil:
"""Evaluate the impurity of the current node, i.e. the impurity of
samples[start:end]."""
cdef double* sum_total = self.sum_total
^
------------------------------------------------------------
causalml/inference/tree/causaltree.pyx:77:37: Cannot assign type 'double[::1]' to 'double *'
Error compiling Cython file:
------------------------------------------------------------
...
cdef SIZE_t* samples = self.samples
cdef SIZE_t start = self.start
cdef SIZE_t pos = self.pos
cdef SIZE_t end = self.end
cdef double* sum_left = self.sum_left
^
------------------------------------------------------------
causalml/inference/tree/causaltree.pyx:135:36: Cannot assign type 'double[::1]' to 'double *'
Error compiling Cython file:
------------------------------------------------------------
...
cdef SIZE_t start = self.start
cdef SIZE_t pos = self.pos
cdef SIZE_t end = self.end
cdef double* sum_left = self.sum_left
cdef double* sum_right = self.sum_right
^
------------------------------------------------------------
causalml/inference/tree/causaltree.pyx:136:37: Cannot assign type 'double[::1]' to 'double *'
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-3vtz2qv0/causalml_bcf06cd8516f45eaa55c61a966172194/setup.py", line 71, in <module>
ext_modules=cythonize(extensions, annotate=True),
File "/databricks/python3/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1127, in cythonize
cythonize_one(*args)
File "/databricks/python3/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1250, in cythonize_one
raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: causalml/inference/tree/causaltree.pyx
Compiling causalml/inference/tree/causaltree.pyx because it depends on /databricks/python3/lib/python3.8/site-packages/sklearn/tree/_tree.pxd.
Compiling causalml/inference/tree/uplift.pyx because it depends on /databricks/python3/lib/python3.8/site-packages/numpy/__init__.pxd.
[1/2] Cythonizing causalml/inference/tree/uplift.pyx
[2/2] Cythonizing causalml/inference/tree/causaltree.pyx
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```
**Environment (please complete the following information):**
- OS: Linux 5.4.0-1080-aws
- Python Version: 3.8
- Versions of Major Dependencies
  - pandas==1.4.3
  - scikit-learn
  - cython==0.29.30
I had a similar issue. Making sure that I had the required versions of Cython, scikit-learn and numpy installed solved the problem.
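To see which versions of those three packages are actually installed in the active environment, something like the following could be run in a notebook cell. This is only a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_versions` is made up for illustration, and the sketch does not know which versions causalml requires, so the output still has to be compared by hand against the project's requirements.txt.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the comment above; extend the tuple as needed.
PACKAGES = ("Cython", "scikit-learn", "numpy")


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```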
I am facing this issue too, on macOS Monterey 12.5. I am able to reproduce it in a conda virtual environment.
The pip freeze output of the environment is as follows:
certifi @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_884c889c-96af-444f-bd6d-daddb5e9a462ykj3l5n_/croots/recipe/certifi_1655968814730/work/certifi
cycler==0.11.0
Cython==0.29.32
dill==0.3.5.1
fonttools==4.35.0
future==0.18.2
graphviz==0.20.1
importlib-metadata==4.12.0
joblib==1.1.0
kiwisolver==1.4.4
llvmlite==0.39.0
matplotlib==3.5.3
numba==0.56.0
numpy==1.22.4
opt-einsum==3.3.0
packaging==21.3
pandas==1.4.3
patsy==0.5.2
Pillow==9.2.0
progressbar2==4.0.0
pydotplus==2.0.2
pygam==0.8.0
pyparsing==3.0.9
pyro-api==0.1.2
pyro-ppl==1.8.1
python-dateutil==2.8.2
python-utils==3.3.3
pytz==2022.2.1
scikit-learn==1.1.2
scipy==1.9.0
seaborn==0.11.2
shap==0.37.0
six==1.16.0
slicer==0.0.3
statsmodels==0.13.2
threadpoolctl==3.1.0
torch==1.12.1
tqdm==4.64.0
typing_extensions==4.3.0
xgboost==1.6.1
zipp==3.8.1
I am trying to install v0.12.2 (I tried 0.12.3 as well), and all major dependencies match the versions listed in requirements.txt.
Still, I am getting the same bug and a similar stack trace.
@valeria-io, would you please share the package versions in your environment by running pip freeze?
The interesting ones are those mentioned in requirements.txt.
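To share only the relevant lines, the pip freeze output can be filtered down to the packages named in requirements.txt. The sketch below is one rough way to do that with plain string handling; the function names are hypothetical, the parsing covers only simple requirement lines (a name followed by an optional version specifier), and the requirement names in the usage example are placeholders rather than causalml's actual requirements.

```python
import re


def normalize(name):
    """Normalize a package name as in PEP 503 (case and -_. insensitive)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text):
    """Extract package names from requirements.txt-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def filter_freeze(freeze_text, requirements_text):
    """Keep only the pip freeze lines whose package appears in the requirements."""
    wanted = requirement_names(requirements_text)
    kept = []
    for line in freeze_text.splitlines():
        # pip freeze lines look like "name==1.2.3" or "name @ file:///..."
        name = re.split(r"==|\s@\s", line.strip(), maxsplit=1)[0]
        if name and normalize(name) in wanted:
            kept.append(line.strip())
    return kept


if __name__ == "__main__":
    # Freeze lines copied from the output above; requirement names are placeholders.
    freeze = "Cython==0.29.32\nnumpy==1.22.4\nscikit-learn==1.1.2\ntqdm==4.64.0"
    requirements = "numpy\nscikit-learn\nCython"
    for line in filter_freeze(freeze, requirements):
        print(line)
```

In practice the two strings would come from the real `pip freeze` output and the contents of the project's requirements.txt.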
@thangarajan8
I believe the issue you're having is that gcc and g++ are not installed. You might want to run:
sudo apt-get install -y gcc g++
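Whether the compilers are already present can be checked before installing anything. Here is a small sketch using `shutil.which` from the standard library; the helper name `find_compilers` is hypothetical, and a compiler being on PATH does not by itself guarantee that the build will succeed.

```python
import shutil


def find_compilers(names=("gcc", "g++")):
    """Return a dict mapping each compiler name to its path on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```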