rMIDAS can't find dependencies, python can

Open aeggers opened this issue 3 years ago • 10 comments

Hello there -- trying rMIDAS again and from your vignette I have

set_python_env(x = "/usr/bin/python3")

(some more code)

adult_train <- train(adult_conv, training_epochs = 20, layer_structure = c(128,128), input_drop = 0.75, seed = 89)

and I get an error:

Initialising Python connection
Error: ModuleNotFoundError: No module named 'matplotlib'

But matplotlib is actually installed. (I type python3 in the terminal and import matplotlib and can make plots etc.)

aeggers avatar Feb 16 '23 14:02 aeggers

Hi @aeggers, are you using an Apple Silicon Mac by any chance?

tsrobinson avatar Feb 16 '23 15:02 tsrobinson

Yes -- Apple M1 Pro

aeggers avatar Feb 16 '23 15:02 aeggers

Sorry for the slow reply @aeggers! We are working on a general fix to this issue, which should be released soon.

In the meantime, we would recommend using miniforge to set up a conda environment (rather than anaconda or miniconda, as these don't work well with Apple chips at the moment).

To help, once you have installed miniforge, you can run the following at the command line (using the unzipped file attached below):

conda env create -f path/to/attached/file/midas-env-arm64.yml

This will create a conda environment called "rmidas" with the right dependencies, which you can then load into rMIDAS using rMIDAS::set_python_env("rmidas", type = "conda").

midas-env-arm64.yml.zip

tsrobinson avatar Feb 17 '23 15:02 tsrobinson

Hi!

Thanks for the instructions.

Unfortunately, I still get the following error when running the train() command:

Initialising Python connection
Error in py_initialize(config$python, config$libpython, config$pythonhome, :
/Users/username/mambaforge/envs/rmidas/lib/libpython3.8.dylib - dlopen(/Users/username/mambaforge/envs/rmidas/lib/libpython3.8.dylib, 0x000A): tried:
'/Users/username/mambaforge/envs/rmidas/lib/libpython3.8.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')),
'/System/Volumes/Preboot/Cryptexes/OS/Users/username/mambaforge/envs/rmidas/lib/libpython3.8.dylib' (no such file),
'/Users/username/mambaforge/envs/rmidas/lib/libpython3.8.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))

I am running this on an Apple M2 Pro. I installed the following Miniforge: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-x86_64.sh

I would be thankful for any help!

vincentheddesheimer avatar Feb 28 '23 03:02 vincentheddesheimer

Hi @vincentheddesheimer -- I think the issue here is that you are still using the x86 installer rather than the Apple Silicon native installer for your M2 MacBook (https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh).

You may need to remove the previous miniforge installation before installing the new one.

tsrobinson avatar Feb 28 '23 11:02 tsrobinson

Hi @tsrobinson: I tried to run it now with the link you provided but I got the same error message when running the train() command:

Initialising Python connection
Error in py_initialize(config$python, config$libpython, config$pythonhome, :
/Users/xxx/miniforge3/envs/rmidas/lib/libpython3.8.dylib - dlopen(/Users/xxx/miniforge3/envs/rmidas/lib/libpython3.8.dylib, 0x000A): tried:
'/Users/xxx/miniforge3/envs/rmidas/lib/libpython3.8.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')),
'/System/Volumes/Preboot/Cryptexes/OS/Users/xxx/miniforge3/envs/rmidas/lib/libpython3.8.dylib' (no such file),
'/Users/xxx/miniforge3/envs/rmidas/lib/libpython3.8.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))

From the error message I gather that I "need 'x86_64'"?

But when I tried downloading the respective miniforge (https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge-pypy3-MacOSX-x86_64.sh) the terminal gave back: Your operating system appears not to be 64-bit, but you are trying to install a 64-bit version of Miniforge3.

Can you make sense of this? Thanks for your help!

vincentheddesheimer avatar Feb 28 '23 13:02 vincentheddesheimer

Hey @vincentheddesheimer, thanks for trying it and for sharing the output.

The underlying issue is that the system architecture of M1/M2 Mac processors (arm64, sometimes called 'aarch64') is different from the conventional Intel architecture (x86_64). To make things even more complicated, Apple tries to reconcile these differences behind the scenes using a translation layer called Rosetta. This can lead to errors when a single process tries to use both architectures at once, for example an x86_64 build of R trying to load an arm64 build of Python.
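If it helps with diagnosing this, here is a rough Python sketch (not part of rMIDAS or reticulate; the function names `interpreter_report` and `parse_arch_mismatch` are made up for illustration) that reports which interpreter is running and which architecture it was built for, and pulls the "have"/"need" pair out of a dlopen error like the ones pasted above. As far as I understand the dlopen message, "have" is the architecture of the library on disk and "need" is the architecture of the process trying to load it, so the errors above would point to an x86_64 R session trying to load an arm64 Python.

```python
import platform
import re
import sys


def interpreter_report():
    """Return the path, CPU architecture and version of the running Python."""
    return {
        "executable": sys.executable,
        "machine": platform.machine(),  # typically 'arm64' or 'x86_64' on macOS
        "version": platform.python_version(),
    }


def parse_arch_mismatch(message):
    """Extract (have, need) from a dlopen 'incompatible architecture' message.

    Returns None when the message does not contain that pattern.
    """
    match = re.search(r"have '([^']+)', need '([^']+)'", message)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running the script with the Python inside the conda environment and comparing its "machine" value with the platform reported by R.version should show whether the two sides agree.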

This is actually a wider issue than rMIDAS, as we rely on reticulate to interface with Python. See, e.g., this issue which is similar to yours: https://github.com/rstudio/reticulate/issues/1159

That said, it's not insurmountable as I can get it to run on my M1 Macbook Pro. You might want to try following these steps (which is how I configured my system):

  1. Make sure you have the aarch64 version of R: run R.version in the R console and check that the platform contains "aarch64". If not, you can install the correct version by downloading the ...arm64.pkg here
  2. Remove your existing miniforge install (sorry)
  3. Follow these instructions to install miniforge via homebrew (a useful package manager for Mac)

Then, you can set up the rmidas conda environment by downloading our rmidas-env.yml file, and running the following at the command line from the same directory as the file:

conda env create -f rmidas-env.yml

Finally, in a fresh R session, load rMIDAS and set the conda environment:

library(rMIDAS)
set_python_env(x = "rmidas", type = "conda")

tsrobinson avatar Feb 28 '23 14:02 tsrobinson

Hi @tsrobinson, thank you - that worked! One more question: is there a difference in efficiency between imputing the data with rMIDAS vs. trying to set this up in Python with MIDASpy (although I encountered problems with installing TensorFlow for the M2 Pro)? I am working with a fairly large dataset (100,000 rows by 7,000 columns) and trying to figure out whether MIDAS will be able to work with that.

vincentheddesheimer avatar Feb 28 '23 22:02 vincentheddesheimer

Hi @vincentheddesheimer -- great, glad to hear it!

Yes, I would expect there to be some difference at that scale -- not least because there will likely be some efficiency loss from having to route data between Python and R. The benefit of rMIDAS is that it offers some slightly more "ready-made" data pre-processing functions.

Fortunately, the "rmidas" env you set up for rMIDAS should also allow you to try MIDASpy on your machine. At the terminal, just run conda activate rmidas and then that environment will have all the dependencies set up (including mac-compatible tensorflow!).

@edvinskis I'm leaving this open, because we may want to think about whether there's any way we can better address these compatibility issues within rMIDAS itself.
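A quick way to confirm the environment really has what it needs is to ask Python which modules it can find. This is only a sketch: `missing_modules` is a hypothetical helper, and the list below contains just the two packages named in this thread (matplotlib and tensorflow), not the full contents of the environment file.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Only the packages mentioned in this thread; extend the list as needed.
    missing = missing_modules(["matplotlib", "tensorflow"])
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it after conda activate rmidas; anything it lists as missing would also be missing when rMIDAS connects to that environment.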

tsrobinson avatar Feb 28 '23 22:02 tsrobinson

Hi @aeggers and @vincentheddesheimer, the dependency issues have been addressed in rMIDAS v0.5.0, which is also available on CRAN. In this update, rMIDAS includes an automatic setup for interactive sessions that prompts the user on whether to automatically set up a Python environment and its dependencies. If you would like to install and try out the updated version, we would greatly appreciate your feedback. Please don't hesitate to reach out if you have any more questions. I plan to close this issue, but I’ll leave it open for a couple of weeks in case you would like to discuss this any further.

edvinskis avatar Sep 03 '23 09:09 edvinskis