Enable running tfc.run() on a notebook from within an AI Platform hosted notebook.

Open SinaChavoshi opened this issue 5 years ago • 3 comments

Using AI Platform hosted notebooks, we created a Jupyter notebook with the model that we were planning to train and saved it. We then created a separate notebook containing our runner script, similar to:

import tensorflow_cloud as tfc

tfc.run(
    docker_config=tfc.DockerConfig(
        image_build_bucket="somebucket",
        parent_image="gcr.io/xyz"), 
    entry_point="model.ipynb",
    distribution_strategy="auto",
    worker_count=5,
    requirements_txt='requirements.txt',
    chief_config=tfc.COMMON_MACHINE_CONFIGS["CPU"],
    worker_config=tfc.COMMON_MACHINE_CONFIGS["CPU"],
    job_labels={
        "job": "kaggle_competition",
        "team": "base_line",
    },
    stream_logs=False
)

The run fails with the error:

/opt/conda/lib/python3.7/site-packages/tensorflow_cloud/core/preprocess.py in _get_colab_notebook_content()
    207 def _get_colab_notebook_content():
    208     """Returns the colab notebook python code contents."""
--> 209     response = _message.blocking_request("get_ipynb",
    210                                          request="",
    211                                          timeout_sec=200)

AttributeError: 'NoneType' object has no attribute 'blocking_request'

It would be nice to add support for this case, where all requirements and a proper base image are directly provided for the remote run.

SinaChavoshi • Nov 12 '20 21:11

Any news on this?

amanas • Apr 16 '21 11:04

Any updates?

tkawuah • Jun 04 '21 22:06

Try the following. I did not run from an AI Platform notebook but from a private GitLab instance, yet the error seems to be related to the same bug (or imprecise code):

Within the tensorflow-cloud code, there is detection logic to determine whether the code is running from a Google Colab notebook or from a Kaggle notebook. As you can see in the error, it ran into the 'colab' branch, which failed because it was not running from Colab. Within 'preprocess.py', the proper branch is reached if 'called_from_notebook' has the value 'False'. The detection for this is in the 'run.py' module and checks whether 'IPython.get_ipython().__class__.__name__' contains the word "Shell". For me (GitLab) it contains the word "Shell", so the branching goes in the wrong direction.
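To illustrate why that check misfires, here is a minimal sketch of the substring test as described above. It is a simplified reconstruction from the description, not the library's actual source. `looks_like_notebook` is a hypothetical helper that takes the class name as a string so it can be tried without IPython installed, and the example class names are assumed typical values rather than anything taken from this thread.

```python
def looks_like_notebook(shell_class_name: str) -> bool:
    """Mimic the described check: any class name containing "Shell" counts."""
    return "Shell" in shell_class_name


# Class names one might get from IPython.get_ipython().__class__.__name__
# (assumed typical values, not taken from the issue):
for name in (
    "ZMQInteractiveShell",       # Jupyter kernel
    "TerminalInteractiveShell",  # IPython in a terminal
    "NoneType",                  # plain Python, where get_ipython() is None
):
    print(name, looks_like_notebook(name))
```

Any IPython-based shell passes this test, so a check of this kind cannot tell Colab apart from other Jupyter-style environments.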

Long story short, here is a quick and dirty fix:

from unittest.mock import patch

def _called_from_notebook_FIX():
    # Force the non-notebook code path.
    return False

with patch('tensorflow_cloud.core.run._called_from_notebook', new=_called_from_notebook_FIX):
    ...  # tfc.run code here

After monkey-patching it to return 'False', it runs for me. Someone (maybe me) should open a pull request with better environment detection for tensorflow-cloud.
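As a starting point for such a pull request, one possible direction is to test for each host explicitly instead of inferring it from the shell class name. The sketch below is only an illustration, not tensorflow-cloud code. `detect_notebook_host` is a hypothetical function, and it assumes that Colab runtimes have the `google.colab` module loaded and that Kaggle containers set the `KAGGLE_CONTAINER_NAME` environment variable.

```python
import os
import sys


def detect_notebook_host(modules=None, environ=None) -> str:
    """Return "colab", "kaggle", or "other" for the current process.

    Assumptions (not confirmed in this thread): Colab runtimes have the
    `google.colab` module loaded, and Kaggle containers set the
    KAGGLE_CONTAINER_NAME environment variable.
    """
    # Allow callers (and tests) to inject fake module/environment mappings.
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ
    if "google.colab" in modules:
        return "colab"
    if environ.get("KAGGLE_CONTAINER_NAME"):
        return "kaggle"
    return "other"


print(detect_notebook_host())
```

With a check like this, any environment that is neither Colab nor Kaggle would fall through to "other" and could be handled without the Colab-only request that fails in the traceback.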

FHermisch • Jun 09 '21 08:06