
Bug Report: column_indexing_batch_size Not Taking Effect with OpenAI-Compatible Embedding Model

Open · null-ed opened this issue 3 months ago · 1 comment


Environment:

  • WrenAI version: canner/wren-ai-service:0.27.14
  • Embedding model: OpenAI-compatible model with a maximum batch size limit of 10.
  • Configuration: Set column_indexing_batch_size: 10 in config.yaml.

Description:

When using an OpenAI-compatible embedding model that enforces a maximum batch size of 10, setting column_indexing_batch_size: 10 in config.yaml does not appear to take effect. During indexing, the container logs the following error, indicating that the request sent to the embedding model contains more than 10 inputs despite the explicit configuration:

litellm.llms.openai.common_utils.OpenAIError: Error code: 400 - {'error': {'message': '<400> InternalError.Algo.InvalidParameter: Value error, batch size is invalid, it should not be larger than 10.: input.contents', 'type': 'InvalidParameter', 'param': None, 'code': 'InvalidParameter'}, 'id': '2354f194-0475-45e1-ada8-95757467f96a', 'request_id': '2354f194-0475-45e1-ada8-95757467f96a'}

This suggests that the batch size configuration is either being ignored or overridden internally, leading to invalid requests to the embedding model.
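For reference, the client-side chunking that the configuration is expected to perform can be sketched as below. This is a hypothetical illustration, not WrenAI's actual code: `embed_batch` stands in for whatever call ultimately hits the embedding endpoint, and the point is only that no single request should ever carry more inputs than the configured batch size.

```python
from typing import Callable, Sequence


def embed_in_chunks(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list],
    batch_size: int = 10,  # mirrors column_indexing_batch_size (hypothetical wiring)
) -> list:
    """Embed `texts` in chunks of at most `batch_size` items.

    Guarantees that no single request to the embedding backend carries
    more than `batch_size` inputs, however many texts are passed in.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    embeddings: list = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        embeddings.extend(embed_batch(chunk))
    return embeddings
```

If the indexing pipeline honored the setting this way, a backend with a hard limit of 10 would never see an oversized request, so the 400 above suggests the chunking step is either skipped or uses a different value.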

Steps to Reproduce:

  1. Configure column_indexing_batch_size: 10 in config.yaml.
  2. Use an OpenAI-compatible Embedding model with a max batch size of 10.
  3. Run the indexing process or relevant operation.
  4. Observe the error in container logs.

Expected Behavior:

Requests should respect the configured batch size of 10 and never exceed the model's limit, so the 400 error does not occur.

Actual Behavior:

The request fails with a 400 error because the batch sent to the model contains more than 10 inputs, as if the configured value were being ignored.

Additional Notes:

  • This issue persists even with the exact max batch size configured.
  • Please investigate if there's an internal default or override affecting this parameter.
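One plausible failure mode, sketched here purely as an illustration (the names and values are hypothetical, not taken from WrenAI's source), is a settings loader whose built-in default is merged after the values parsed from config.yaml, so the user's value never wins:

```python
# Hypothetical illustration of a config value being silently overridden:
# the internal default is merged AFTER the user's config, so the YAML
# value never reaches the indexing code.

DEFAULT_SETTINGS = {"column_indexing_batch_size": 50}  # hypothetical internal default


def load_settings(user_config: dict) -> dict:
    # BUG: user values merged first, defaults second -> defaults win.
    return {**user_config, **DEFAULT_SETTINGS}


def load_settings_fixed(user_config: dict) -> dict:
    # FIX: defaults first, user config second -> user values win.
    return {**DEFAULT_SETTINGS, **user_config}
```

Logging the effective value of column_indexing_batch_size at startup (at DEBUG level) would quickly confirm or rule out this kind of override.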

null-ed · Nov 10 '25 09:11

config.yaml:

```yaml
type: llm
provider: litellm_llm
timeout: 120
models:
  - api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    model: openai/qwen-max
    alias: default
    timeout: 120
    kwargs:
      max_tokens: 4096
      n: 1
      seed: 0
      temperature: 0

---
type: embedder
provider: litellm_embedder
models:
  - model: openai/text-embedding-v4
    alias: default
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    dimension: 2048
    timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: engine
provider: wren_ibis
endpoint: http://ibis-server:8000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 2048
timeout: 120
recreate_index: true

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: table_description_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_correction
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: followup_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_answer
    llm: litellm_llm.default
  - name: semantics_description
    llm: litellm_llm.default
  - name: relationship_recommendation
    llm: litellm_llm.default
  - name: question_recommendation
    llm: litellm_llm.default
  - name: question_recommendation_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: intent_classification
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: misleading_assistance
    llm: litellm_llm.default
  - name: data_assistance
    llm: litellm_llm.default
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: litellm_embedder.default
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: litellm_embedder.default
    llm: litellm_llm.default
  - name: preprocess_sql_data
    llm: litellm_llm.default
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.default
  - name: chart_adjustment
    llm: litellm_llm.default
  - name: user_guide_assistance
    llm: litellm_llm.default
  - name: sql_question_generation
    llm: litellm_llm.default
  - name: sql_generation_reasoning
    llm: litellm_llm.default
  - name: followup_sql_generation_reasoning
    llm: litellm_llm.default
  - name: sql_regeneration
    llm: litellm_llm.default
    engine: wren_ui
  - name: instructions_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: instructions_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_functions_retrieval
    engine: wren_ibis
    document_store: qdrant
  - name: project_meta_indexing
    document_store: qdrant
  - name: sql_tables_extraction
    llm: litellm_llm.default
  - name: sql_diagnosis
    llm: litellm_llm.default

---
settings:
  doc_endpoint: https://docs.getwren.ai
  is_oss: true
  engine_timeout: 30
  column_indexing_batch_size: 5
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_intent_classification: true
  allow_sql_generation_reasoning: true
  allow_sql_functions_retrieval: true
  enable_column_pruning: false
  max_sql_correction_retries: 3
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false
  historical_question_retrieval_similarity_threshold: 0.9
  sql_pairs_similarity_threshold: 0.7
  sql_pairs_retrieval_max_size: 10
  instructions_similarity_threshold: 0.7
  instructions_top_k: 10
```

null-ed · Nov 10 '25 09:11