MetView should work out of the box

Open alxmrs opened this issue 3 years ago • 6 comments

Currently, to include the MetView dependency on Dataflow, we require users to build their own custom container image and pass specific Dataflow arguments to the weather-mv command. Ideally, users shouldn't have to worry about Docker containers or passing the right arguments; regridding should work out of the box.
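For context, the extra arguments in question look roughly like the sketch below. `--sdk_container_image` and `--experiments=use_runner_v2` are standard Beam/Dataflow pipeline options; the helper name `custom_container_args` and the example image URI, project, and region are made up for illustration and are not part of weather-mv.

```python
from typing import List


def custom_container_args(image_uri: str, project: str, region: str) -> List[str]:
    """Return the Beam pipeline flags a user currently has to pass by hand.

    Hypothetical helper: it only assembles strings and does not call
    weather-mv or Dataflow.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        # Custom containers run on Dataflow Runner v2; older Beam SDKs
        # need this experiment enabled explicitly.
        "--experiments=use_runner_v2",
        # Points the workers at the user-built image that contains MetView.
        f"--sdk_container_image={image_uri}",
    ]


if __name__ == "__main__":
    # Placeholder values, not real resources.
    flags = custom_container_args(
        "gcr.io/my-project/weather-mv:dev", "my-project", "us-central1"
    )
    print(" ".join(flags))
```

With a published image, the tool could fill in these flags itself instead of asking every user to supply them.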

Implementation notes

  • Short-term approach: Publish a public Docker image that has MetView installed on top of a Dataflow worker image.
  • Medium- / long-term approach: Publish a public Docker image that has Miniconda installed, and modify our setup.py build command so that it runs conda install commands during setup.

alxmrs avatar Jun 27 '22 21:06 alxmrs

A quick note on the long-term approach: check out the docs on multi-stage custom Docker environments (https://cloud.google.com/dataflow/docs/guides/using-custom-containers#use_a_custom_base_image_or_multi-stage_builds). It seems like it would be pretty easy and quick to build an image from a base Miniconda image that also includes the Python Beam SDK.
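A rough sketch of what such a multi-stage image could look like, written as a small Python helper that renders the Dockerfile text. The `COPY --from=apache/beam_python…_sdk` plus `/opt/apache/beam/boot` entrypoint pattern follows the linked docs. Everything else is an assumption and untested against this repo: the helper name `render_dockerfile`, the `continuumio/miniconda3` base image, the conda-forge package `metview-batch`, and the version numbers.

```python
def render_dockerfile(beam_version: str, py_version: str = "3.9") -> str:
    """Render a Dockerfile for a Miniconda-based Beam SDK container.

    Hypothetical helper. The Python version must match the Beam SDK image
    that the boot files are copied from.
    """
    lines = [
        # Assumed base image with conda preinstalled.
        "FROM continuumio/miniconda3",
        # Assumed package name for the MetView binaries on conda-forge.
        f"RUN conda install -y -c conda-forge python={py_version} metview-batch",
        # The Beam SDK inside the image should match the launcher's version.
        f"RUN pip install --no-cache-dir 'apache-beam[gcp]=={beam_version}'",
        # Copy the Beam boot files from the official SDK image.
        f"COPY --from=apache/beam_python{py_version}_sdk:{beam_version} "
        "/opt/apache/beam /opt/apache/beam",
        # Dataflow workers start the container through the Beam boot binary.
        'ENTRYPOINT ["/opt/apache/beam/boot"]',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example version only; pick the one the pipeline actually uses.
    print(render_dockerfile("2.40.0"))
```

The rendered text can be saved as a Dockerfile and built as usual; the version pins are parameters so that the image and the pipeline's Beam SDK stay in sync.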

alxmrs avatar Jun 28 '22 23:06 alxmrs

FTR I can confirm that the current main repo's build instructions do not work:

command

gcloud builds submit weather_mv/ --tag "$IMAGE_URI:dev"

error

[...]
ModuleNotFoundError: No module named 'conda.cli.main_info'
The command '/bin/sh -c conda install python=${py_version} -y' returned a non-zero code: 1
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1

blackvvine avatar Sep 01 '22 21:09 blackvvine

I also tested @bahmandar's image here, and it does get regridding working. The build is pretty resource-intensive (~1h on N1_HIGHCPU_32), and downloading and preparing the 2.5GB image takes about 10 minutes in Dataflow.

blackvvine avatar Sep 01 '22 21:09 blackvvine

For building the image: do you have Anaconda installed on your local machine? (I'm surprised that this seems to be a requirement.)

alxmrs avatar Sep 01 '22 21:09 alxmrs

Is @bahmandar's image already publicly distributed? If so, that would make fixing this much easier.

alxmrs avatar Sep 01 '22 21:09 alxmrs

It is not, unfortunately.

bahmandar avatar Sep 02 '22 15:09 bahmandar