How to run OmniParser on Google Colab?

Open agn-7 opened this issue 8 months ago • 9 comments

Is it possible to run OmniParser on Google Colab?

agn-7 avatar May 23 '25 11:05 agn-7

That depends on what you mean. OmniParser V2 is made of:
- OmniParser: labels a screenshot so the AI can identify elements on it
- OmniTool: the Windows VM
- Gradio app: acts as glue between the two

OmniTool requires virtualization, and Colab can't do nested virtualization.

So you can't run the whole package. I managed to run OmniParser plus the Gradio app on a free T4 Colab instance.

paciox avatar Jul 07 '25 00:07 paciox

Actually, I got past the issue I had. But yes, I meant only OmniParser, not OmniTool.

agn-7 avatar Jul 07 '25 15:07 agn-7

Well, OK. In any case, I managed to run it on Colab successfully. May I ask what you do with it? Does it expose an API usable from outside?

paciox avatar Jul 07 '25 16:07 paciox

In fact, I wanted to install OmniParser locally, but at the time I had some CUDA-related issues, so I decided to set it up on Colab to see how OmniParser works there.

In the end, by disabling the CUDA parts, I was able to set it up locally and test it.

agn-7 avatar Jul 08 '25 07:07 agn-7

@paciox Would it be possible for you to share the Colab script you used to get it working? I seem to be having issues with the transformers library not recognizing Florence-2. I am simply trying to run the demo at https://github.com/microsoft/OmniParser/blob/master/demo.ipynb

kharanpv avatar Jul 28 '25 00:07 kharanpv

Here it is, please let me know if you manage to run the full thing in the future.

!apt-get upgrade
# 0) Clone repo (if you haven't yet) and go into it
!git clone https://github.com/microsoft/OmniParser.git
%cd OmniParser

# 1) Install the other Python deps (no Conda needed on Colab)
!pip install -q -r requirements.txt

If I remember correctly, I ran both fixes even though they conflict. Otherwise, try running one or the other.

# FIX 1
# The crash has nothing to do with OmniParser itself. It is triggered by recent Transformers releases (≥ v4.50), which expect every sub-module to expose an _initialize_weights method.
# The custom Florence-2 "DaViT" backbone shipped with OmniParser only implements the old _init_weights hook, so loading fails. Downgrading Transformers to 4.49 (or 4.46.3–4.49.3) or, alternatively, patching the DaViT class fixes the problem.

# If the commands below do not work, try these two and then rerun them:
# !pip uninstall -y -q transformers
# !pip install -q transformers==4.49.0   # or 4.48.3 / 4.46.3 if you prefer

# remove half-installed wheels that show up as "~ransformers"
!rm -rf /usr/local/lib/python3.11/dist-packages/~ransformers*
# start from a known-good transformers build (keeps DaViT happy)
!pip install --force-reinstall -q "transformers==4.49.0"
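
Before loading Florence-2, it can help to confirm which side of the 4.50 boundary the runtime is on. The following is a minimal sketch, not part of the original script: the helper names `parse_version`, `needs_downgrade`, and `installed_transformers_version` are hypothetical, and the 4.50 threshold is taken from the comment above rather than verified independently.

```python
from importlib import metadata

# Threshold taken from the FIX 1 comment: releases from 4.50 on trigger the crash.
FIRST_BAD = (4, 50)


def parse_version(text):
    """Turn '4.49.0' into (4, 49, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_downgrade(version_text):
    """True when the major.minor version is at or above the 4.50 boundary."""
    return parse_version(version_text)[:2] >= FIRST_BAD


def installed_transformers_version():
    """Return the installed Transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif needs_downgrade(found):
        print(f"transformers {found}: pin to 4.49.0 before loading Florence-2")
    else:
        print(f"transformers {found}: below 4.50, no downgrade needed")
```

Run it in its own cell after the runtime restart, since a version imported before the reinstall can stay loaded in memory.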

#FIX 2
#Pydantic 2.11 changed JSON-Schema generation so that some fields
# Microsoft's requirements.txt doesn't pin either package, so a fresh Colab pulls in the latest Pydantic 2.11.x via FastAPI while OmniParser's old Gradio 4.44 remains, and the two are incompatible.
# Gradio 5.20.0+ includes the patch that tolerates the new Pydantic schemas.
!pip uninstall -y gradio gradio_client
# gradio_client pinned to 1.8.0 to match the known-good stack further down
!pip install "gradio==5.23.2" "gradio_client==1.8.0"

Yes, restart the Colab runtime without resetting it.

# 0️⃣ restart the Colab runtime first (Runtime ▸ Restart), then run:

# 1. nuke the half-uninstalled dirs that show up as "~ransformers"
!rm -rf /usr/local/lib/python3.11/dist-packages/~ransformers*

# 2. yank incompatible wheels
!pip uninstall -y gradio gradio-client transformers numpy pydantic requests packaging anyio fsspec

# 3. reinstall a **known-good stack**
!pip install --force-reinstall transformers==4.49.0 gradio==5.23.2 gradio-client==1.8.0 numpy==1.26.4 pydantic==2.11.7 requests==2.32.3 packaging==24.0 anyio==4.9.0 fsspec==2025.3.2 --no-cache-dir
# download the model checkpoints to local directory OmniParser/weights/
!for f in icon_detect/{train_args.yaml,model.pt,model.yaml} icon_caption/{config.json,generation_config.json,model.safetensors}; do huggingface-cli download microsoft/OmniParser-v2.0 "$f" --local-dir weights; done
!mv weights/icon_caption weights/icon_caption_florence
!python /content/OmniParser/gradio_demo.py
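
The checkpoint download can also be done from a Python cell instead of the shell loop, which avoids relying on brace expansion. This is a sketch under assumptions: `weight_paths` and `download_weights` are hypothetical helper names, the file list is copied from the loop above, and `hf_hub_download` comes from the `huggingface_hub` package, which must be installed.

```python
from pathlib import Path

REPO_ID = "microsoft/OmniParser-v2.0"

# The same six files that the brace expansion in the shell loop produces.
WEIGHT_FILES = {
    "icon_detect": ["train_args.yaml", "model.pt", "model.yaml"],
    "icon_caption": ["config.json", "generation_config.json", "model.safetensors"],
}


def weight_paths():
    """List the repo-relative paths of every checkpoint file to fetch."""
    return [
        f"{folder}/{name}"
        for folder, names in WEIGHT_FILES.items()
        for name in names
    ]


def download_weights(local_dir="weights"):
    """Download each file, then rename icon_caption as the `mv` line does."""
    # Imported lazily so weight_paths() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    for path in weight_paths():
        hf_hub_download(repo_id=REPO_ID, filename=path, local_dir=local_dir)

    source = Path(local_dir) / "icon_caption"
    target = Path(local_dir) / "icon_caption_florence"
    if source.exists() and not target.exists():
        source.rename(target)
    return target
```

Calling `download_weights()` from inside `/content/OmniParser` should leave the same `weights/` layout as the shell version, after which `gradio_demo.py` can be started as before.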

paciox avatar Jul 28 '25 01:07 paciox

@paciox Thanks for your script; I was able to get it to run. I'm going to see if I can integrate it into a project I am working on, and I will let you know what I figure out.

kharanpv avatar Jul 28 '25 22:07 kharanpv

Any news? What did you do with it? Have you managed to run OmniTool as well?

paciox avatar Aug 20 '25 10:08 paciox

I got OmniParser to work without Gradio, but I'm afraid I haven't done much with it yet except play around with the hyperparameters. I haven't used OmniTool either.


kharanpv avatar Aug 20 '25 20:08 kharanpv