[Bug] MultimodalQnA UT test fails
Priority
P1-Stopper
OS type
Ubuntu
Hardware type
Xeon-GNR
Installation method
- [x] Pull docker images from hub.docker.com
- [x] Build docker images from source
- [ ] Other
- [ ] N/A
Deploy method
- [x] Docker
- [x] Docker Compose
- [ ] Kubernetes Helm Charts
- [ ] Kubernetes GMC
- [ ] Other
- [ ] N/A
Running nodes
Single Node
What's the version?
bb9ec6e5d2a810ac054e7dbfdbb5ad9601ba50f4
Description
https://github.com/opea-project/GenAIExamples/actions/runs/15047914413/job/42329846327
https://github.com/opea-project/GenAIExamples/actions/runs/15061692188/job/42338742210
Reproduce steps
cd MultimodalQnA/tests
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
bash test_compose_on_xeon.sh
Raw log
docker logs lvm-llava
Downloading shards: 0%| | 0/3 [00:00<?, ?it/s]
Downloading shards: 33%|███▎ | 1/3 [00:18<00:36, 18.13s/it]
Downloading shards: 67%|██████▋ | 2/3 [00:32<00:16, 16.16s/it]
Downloading shards: 100%|██████████| 3/3 [00:50<00:00, 16.85s/it]
Downloading shards: 100%|██████████| 3/3 [00:50<00:00, 16.86s/it]
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Loading checkpoint shards: 33%|███▎ | 1/3 [00:00<00:01, 1.33it/s]
Loading checkpoint shards: 67%|██████▋ | 2/3 [00:01<00:00, 1.42it/s]
Loading checkpoint shards: 100%|██████████| 3/3 [00:02<00:00, 1.54it/s]
Loading checkpoint shards: 100%|██████████| 3/3 [00:02<00:00, 1.50it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Device set to use cpu
Passing `prompt` to the `image-to-text` pipeline is deprecated and will be removed in version 4.48 of 🤗 Transformers. Use the `image-text-to-text` pipeline instead
Traceback (most recent call last):
File "/home/user/comps/third_parties/llava/src/llava_server.py", line 271, in <module>
generator(
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/image_to_text.py", line 137, in __call__
return super().__call__(inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1349, in __call__
outputs = list(final_iterator)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
item = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
processed = self.infer(item, **self.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1275, in forward
model_outputs = self._forward(model_inputs, **forward_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/image_to_text.py", line 209, in _forward
model_outputs = self.model.generate(inputs, **model_inputs, **generate_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/generation/utils.py", line 2223, in generate
result = self._sample(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/generation/utils.py", line 3211, in _sample
outputs = self(**model_inputs, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/transformers/models/llava/modeling_llava.py", line 427, in forward
raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 1, features 576
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 1s [Retry 1/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 2s [Retry 2/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json
Retrying in 4s [Retry 3/5].
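For reference, the deprecation warning in the log above suggests moving from the `image-to-text` pipeline to `image-text-to-text`. A minimal sketch of that newer usage (the model name is taken from the log; the image URL and prompt are placeholders, not the failing inputs):

```python
from transformers import pipeline

# Chat-style pipeline that the deprecation warning points to.
pipe = pipeline("image-text-to-text", model="llava-hf/llava-1.5-7b-hf")

# Placeholder image URL and question, only to show the expected input format.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.jpg"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

outputs = pipe(text=messages, max_new_tokens=64)
print(outputs)
```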
Attachments
No response
Hi @mhbuehler @tileintel, could you please help check this issue? Thank you!
The above error looks like a temporary network failure (HTTP 429 rate limiting from the Hugging Face Hub). You can verify with `wget https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json -O /dev/null`, which now returns immediately. The job was re-run, the service started, and all tests passed here: https://github.com/opea-project/GenAIExamples/actions/runs/15047914413/job/42578809553
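If you prefer to script the same check, here is a rough Python equivalent (a minimal sketch; the URL comes from the log above, and the token variable from the reproduce steps is optional):

```python
import os

import requests

# Same config.json URL that returned HTTP 429 in the failing run.
URL = "https://huggingface.co/llava-hf/llava-1.5-7b-hf/resolve/main/config.json"

headers = {}
token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
if token:
    headers["Authorization"] = f"Bearer {token}"

resp = requests.head(URL, headers=headers, allow_redirects=True, timeout=10)
# 200 means the Hub is reachable again; 429 means we are still being rate limited.
print(resp.status_code)
```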
The test failed again: https://github.com/opea-project/GenAIExamples/actions/runs/15202824332
This looks like a new issue. It's the same error reported in https://github.com/opea-project/GenAIComps/issues/1735. The GenAIComps/retriever container named `retriever-redis` fails to start because of an import error that suggests a version mismatch between the neo4j, llama-index-llms-openai, and/or openai dependencies. The relevant part of the log is:
Traceback (most recent call last):
File "/home/user/comps/retrievers/src/opea_retrievers_microservice.py", line 15, in <module>
from integrations.neo4j import OpeaNeo4jRetriever
File "/home/user/comps/retrievers/src/integrations/neo4j.py", line 17, in <module>
from llama_index.llms.openai import OpenAI
File "/usr/local/lib/python3.11/site-packages/llama_index/llms/openai/__init__.py", line 2, in <module>
from llama_index.llms.openai.responses import OpenAIResponses
File "/usr/local/lib/python3.11/site-packages/llama_index/llms/openai/responses.py", line 6, in <module>
from openai.types.responses import (
ImportError: cannot import name 'ResponseTextAnnotationDeltaEvent' from 'openai.types.responses' (/usr/local/lib/python3.11/site-packages/openai/types/responses/__init__.py)
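For anyone hitting this locally, a quick way to confirm the mismatch is to check the installed versions and whether the symbol llama-index expects is actually present (a minimal diagnostic sketch based on the packages named in the traceback, not a fix):

```python
import importlib.metadata as md

# Versions of the packages implicated in the import error.
for pkg in ("openai", "llama-index-llms-openai", "neo4j"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")

# The symbol that llama_index.llms.openai.responses tries to import.
try:
    from openai.types.responses import ResponseTextAnnotationDeltaEvent  # noqa: F401
    print("ResponseTextAnnotationDeltaEvent: present")
except ImportError:
    print("ResponseTextAnnotationDeltaEvent: missing (openai version incompatible with llama-index-llms-openai)")
```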
It looks like it may already be solved, but let me know if I can help further.
https://github.com/opea-project/GenAIExamples/pull/2011