diffusers icon indicating copy to clipboard operation
diffusers copied to clipboard

Fix Colab and Notebook checks for `diffusers-cli env`

Open tolgacangoz opened this issue 1 year ago • 16 comments

This pull request updates the is_google_colab check to use an environment variable instead of checking for the presence of the google.colab module. This change ensures that the check is more reliable. Currently, !diffusers-cli env in a Colab's cell gives No.

Also, the check of is_notebook gives No if someone commands !diffusers-cli env in a notebook cell. Because, I guess, the checking code is run in a .py file? What to do here? I took this check from huggingface_hub. At the expansion proposal PR, I don't remember if I tried the command in a cell. Probably I tried the checking code directly within a cell. This was for is_google_colab as well. Should I remove is_notebook completely?

tolgacangoz avatar Jun 05 '24 14:06 tolgacangoz

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

"google.colab" in sys.modules gives False if we run !diffusers-cli env in a cell. But in the cell, "google.colab" in sys.modules gives True. Checking env variable gives True in the two cases. Idk, if the name of the env variable will be changed in the future. A similar thing happens for the is_notebook check as well.

tolgacangoz avatar Jun 06 '24 13:06 tolgacangoz

Oh I see, so somehow Google Colab injects the library into ipython but not into python:

!python -c 'import sys;print("google.colab" in sys.modules)'
False
!ipython -c 'import sys;print("google.colab" in sys.modules)'
22;0tTrue

Relying on a single env var still sounds fragile to me, as there is no guarantee that this can't change at any moment. The most official statement I've seen is this reddit commit by a Google Colab employee, and this uses the current approach.

I wonder if we should check if any(k.startswith("COLAB_") for k in os.environ) to be a bit more robust.

BenjaminBossan avatar Jun 06 '24 13:06 BenjaminBossan

Also, !diffusers-cli env produces "No" for notebook checking as well if it is run in a notebook cell. I guess, because the checking code itself is in a .py file? Let's remove is_notebook completely for now?

tolgacangoz avatar Jun 07 '24 08:06 tolgacangoz

Also, !diffusers-cli env produces "No" for notebook checking as well if it is run in a notebook cell.

Even with the proposed changes?

BenjaminBossan avatar Jun 07 '24 09:06 BenjaminBossan

I meant this: https://github.com/huggingface/diffusers/blob/0d68ddf3275b20b0d12cfd3d0a9f002fecfe001c/src/diffusers/utils/import_utils.py#L324-L333 get_ipython() can be run in a notebook/ipython. But, it is in a .py file :thinking:.

tolgacangoz avatar Jun 07 '24 10:06 tolgacangoz

Can/Should I remove is_notebook for now?

tolgacangoz avatar Jun 15 '24 08:06 tolgacangoz

I meant this:

I see. Indeed that doesn't work inside a colab notebook but at least locally, it works in a Jupyter notebook. So maybe we can extend the logic to say _is_notebook = _is_notebook or _is_google_colab (except if _is_notebook is supposed to not include colab notebooks, @sayakpaul do you know this?).

BenjaminBossan avatar Jun 19 '24 13:06 BenjaminBossan

When I run !diffusers-cli env in a Jupyter notebook cell on my laptop, it says No. Do you see Yes on your local machine?

tolgacangoz avatar Jun 19 '24 13:06 tolgacangoz

except if _is_notebook is supposed to not include colab notebooks, @sayakpaul do you know this?

We could include the logic of Colab check within the scope of the notebooks, too if that makes sense?

sayakpaul avatar Jun 19 '24 13:06 sayakpaul

When I run !diffusers-cli env in a Jupyter notebook cell on my laptop, it says No. Do you see Yes on your local machine?

Ah yes you're right, sorry for my mistake.

At least in my environment, I see that jupyter sets JPY_SESSION_NAME and JPY_PARENT_PID. So similarly to what we do for colab, we could check for env vars that start with JPY_.

BenjaminBossan avatar Jun 19 '24 14:06 BenjaminBossan

I have just removed the notebook check completely. Let's merge this?

tolgacangoz avatar Jun 21 '24 19:06 tolgacangoz

@BenjaminBossan WDYT?

So, from what I understand, we are removing the notebook check because it can be broken and we cannot be deterministic most of the times, yeah?

sayakpaul avatar Jun 22 '24 02:06 sayakpaul

So my understanding is that the any(k.startswith("JPY_") for k in os.environ) should work for Jupyter notebooks. No idea about vscode notebooks, as I don't use vscode. So it would be possible to make this attribute work, even if it's not super robust.

As to whether it's needed: I don't know enough about diffusers to answer that. Would it in any way help to debug diffusers issue if we had this information? If not, I would tend to remove it. If yes, I'd keep it, even if it's not super reliable.

BenjaminBossan avatar Jun 24 '24 13:06 BenjaminBossan

Yeah I don’t think having this information is super important for debugging purposes TBH.

sayakpaul avatar Jun 24 '24 13:06 sayakpaul

Thanks for the approvals!

@DN6 Gently pinging here.

tolgacangoz avatar Jul 18 '24 07:07 tolgacangoz

Sorry for the delay @tolgacangoz appreciate your patience here.

DN6 avatar Jul 23 '24 12:07 DN6

Thanks all for the reviews and merging!

tolgacangoz avatar Jul 23 '24 12:07 tolgacangoz