Bug: embedding batch size exceeds the provider limit
embedding [src.pipelines.indexing.db_schema.embedding()] encountered an error
Node inputs: {'chunk': "<Task finished name='Task-1290' coro=<AsyncGraphAd...", 'embedder': '<src.providers.embedder.litellm.AsyncDocumentEmbed...'}
Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 1127, in aembedding
    headers, response = await self.make_openai_embedding_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/litellm_core_utils/logging_utils.py", line 190, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 1080, in make_openai_embedding_request
    raise e
  File "/app/.venv/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 1073, in make_openai_embedding_request
    raw_response = await openai_aclient.embeddings.with_raw_response.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_legacy_response.py", line 381, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/resources/embeddings.py", line 251, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'code': 'InvalidParameter', 'param': None, 'message': '<400> InternalError.Algo.InvalidParameter: Value error, batch size is invalid, it should not be larger than 10.: input.contents', 'type': 'InvalidParameter'}, 'id': '0fcc15b7-efb5-468f-a018-4c2fca8e2597', 'request_id': '0fcc15b7-efb5-468f-a018-4c2fca8e2597'}
Please tell me what I can do.
Hello, has this been resolved? I'm running into the same problem.
I tried changing the underlying logic, but it still doesn't work. The problem is not resolved.
@COCO-hy I ran into the same problem. Did you manage to resolve it in the end?
@COCO-hy I found a solution: configure the embedder in config.yaml and set batch_size to 10, the maximum the endpoint accepts according to the error message.
type: embedder
provider: litellm_embedder
models:
  - model: openai/text-embedding-v4
    alias: default
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    dimension: 2048
    timeout: 120
    batch_size: 10
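For anyone calling the embedding endpoint from their own code instead of through this config, here is a minimal sketch of what a batch size limit amounts to: split the inputs into groups of at most 10 and embed each group separately. The names `batched`, `embed_in_batches`, and `embed_batch` are hypothetical and are not part of this project or of litellm; the limit of 10 is taken from the error message above. `embed_batch` stands in for whatever function actually sends one request.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 10,  # assumption: provider rejects more than 10 inputs per request
) -> List[List[float]]:
    """Embed `texts` in order, never passing more than `batch_size` inputs per call.

    `embed_batch` is a hypothetical callable that sends one request and
    returns one vector per input text, in the same order.
    """
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

With 23 input texts and the default limit, `embed_batch` would be called three times, with 10, 10, and 3 texts.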
Great, thanks! I'll give it a try.
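After adding this parameter I get a new error. Is it because there are too many tables and columns, or is it a compatibility issue between openai/text-embedding-v4 and the vector database?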