[Bug] MultimodalQnA UT test fails
Priority
P1-Stopper
OS type
Ubuntu
Hardware type
Xeon-GNR
Installation method
- [x] Pull docker images from hub.docker.com
- [x] Build docker images from source
- [ ] Other
- [ ] N/A
Deploy method
- [x] Docker
- [x] Docker Compose
- [ ] Kubernetes Helm Charts
- [ ] Kubernetes GMC
- [ ] Other
- [ ] N/A
Running nodes
Single Node
What's the version?
bb9ec6e5d2a810ac054e7dbfdbb5ad9601ba50f4
Description
https://github.com/opea-project/GenAIExamples/actions/runs/15047914413/job/42329846327
https://github.com/opea-project/GenAIExamples/actions/runs/15061692188/job/42338742210
Reproduce steps
cd MultimodalQnA/tests
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
bash test_compose_on_xeon.sh
Raw log
docker logs lvm-llava
Downloading shards: 0%| | 0/3 [00:00<?, ?it/s]
Downloading shards: 33%|███▎ | 1/3 [00:18<00:36, 18.13s/it]
Downloading shards: 67%|██████▋ | 2/3 [00:32<00:16, 16.16s/it]
Downloading shards: 100%|██████████| 3/3 [00:50<00:00, 16.85s/it]
Downloading shards: 100%|██████████| 3/3 [00:50<00:00, 16.86s/it]
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Loading checkpoint shards: 33%|███▎ | 1/3 [00:00<00:01, 1.33it/s]
Loading checkpoint shards: 67%|██████▋ | 2/3 [00:01<00:00, 1.42it/s]
Loading checkpoint shards: 100%|██████████| 3/3 [00:02<00:00, 1.54it/s]
Loading checkpoint shards: 100%|██████████| 3/3 [00:02<00:00, 1.50it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Device set to use cpu
Passing `prompt` to the `image-to-text` pipeline is deprecated and will be removed in version 4.48 of 🤗 Transformers. Use the `image-text-to-text` pipeline instead
Traceback (most recent call last):
File "/home/user/comps/third_parties/llava/src/llava_server.py", line 271, in <module>
generator(
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/image_to_text.py", line 137, in __call__
return super().__call__(inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1349, in __call__
outputs = list(final_iterator)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
item = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
processed = self.infer(item, **self.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1275, in forward
model_outputs = self._forward(model_inputs, **forward_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/image_to_text.py", line 209, in _forward
model_outputs = self.model.generate(inputs, **model_inputs, **generate_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/generation/utils.py", line 2223, in generate
result = self._sample(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/generation/utils.py", line 3211, in _sample
outputs = self(**model_inputs, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/models/llava/modeling_llava.py", line 427, in forward
raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 1, features 576
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 1s [Retry 1/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 2s [Retry 2/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 4s [Retry 3/5].
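For reference, the deprecation warning in the log above suggests moving from the `image-to-text` pipeline to `image-text-to-text`. A minimal sketch of that newer usage (the model name is taken from the log; the image URL and prompt are placeholders, not the failing inputs):

```python
from transformers import pipeline

# Chat-style pipeline that the deprecation warning points to.
pipe = pipeline("image-text-to-text", model="llava-hf/llava-1.5-7b-hf")

# Placeholder image URL and question, only to show the expected input format.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.jpg"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

outputs = pipe(text=messages, max_new_tokens=64)
print(outputs)
```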
Attachments
No response
Hi @mhbuehler @tileintel, could you please help check this issue? Thank you!
The above error looks like a temporary network failure (HTTP 429 rate limiting from the Hugging Face Hub). You can verify with `wget https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json -O /dev/null`, which now returns immediately. The job was re-run, the service started, and all tests passed here: https://github.com/opea-project/GenAIExamples/actions/runs/15047914413/job/42578809553
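If you prefer to script the same check, here is a rough Python equivalent (a minimal sketch; the URL comes from the log above, and the token variable from the reproduce steps is optional):

```python
import os

import requests

# Same config.json URL that returned HTTP 429 in the failing run.
URL = "https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json"

headers = {}
token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
if token:
    headers["Authorization"] = f"Bearer {token}"

resp = requests.head(URL, headers=headers, allow_redirects=True, timeout=10)
# 200 means the Hub is reachable again; 429 means we are still being rate limited.
print(resp.status_code)
```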
The test failed again: https://github.com/opea-project/GenAIExamples/actions/runs/15202824332
This looks like a new issue. It's the same error reported in https://github.com/opea-project/GenAIComps/issues/1735. The GenAIComps/retriever container named `retriever-redis` fails to start because of an import error that suggests a version mismatch between the neo4j, llama-index-llms-openai, and/or openai dependencies. The relevant part of the log is:
Traceback (most recent call last):
File "/home/user/comps/retrievers/src/opea_retrievers_microservice.py", line 15, in <module>
from integrations.neo4j import OpeaNeo4jRetriever
File "/home/user/comps/retrievers/src/integrations/neo4j.py", line 17, in <module>
from llama_index.llms.openai import OpenAI
File "/usr/local/lib/python3.11/site-packages/llama_index/llms/openai/__init__.py", line 2, in <module>
from llama_index.llms.openai.responses import OpenAIResponses
File "/usr/local/lib/python3.11/site-packages/llama_index/llms/openai/responses.py", line 6, in <module>
from openai.types.responses import (
ImportError: cannot import name 'ResponseTextAnnotationDeltaEvent' from 'openai.types.responses' (/usr/local/lib/python3.11/site-packages/openai/types/responses/__init__.py)
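For anyone hitting this locally, a quick way to confirm the mismatch is to check the installed versions and whether the symbol llama-index expects is actually present (a minimal diagnostic sketch based on the packages named in the traceback, not a fix):

```python
import importlib.metadata as md

# Versions of the packages implicated in the import error.
for pkg in ("openai", "llama-index-llms-openai", "neo4j"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")

# The symbol that llama_index.llms.openai.responses tries to import.
try:
    from openai.types.responses import ResponseTextAnnotationDeltaEvent  # noqa: F401
    print("ResponseTextAnnotationDeltaEvent: present")
except ImportError:
    print("ResponseTextAnnotationDeltaEvent: missing (openai version incompatible with llama-index-llms-openai)")
```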
It looks like it may already be solved, but let me know if I can help further.
https://github.com/opea-project/GenAIExamples/pull/2011