
Bug Report: column_indexing_batch_size Not Taking Effect with OpenAI-Compatible Embedding Model

Open · null-ed opened this issue 3 months ago · 1 comment


Environment:

  • WrenAI version: canner/wren-ai-service:0.27.14
  • Embedding model: OpenAI-compatible model with a maximum batch size limit of 10.
  • Configuration: Set column_indexing_batch_size: 10 in config.yaml.

Description:

When using an OpenAI-compatible embedding model that enforces a maximum batch size of 10, setting column_indexing_batch_size: 10 in config.yaml does not appear to take effect. During indexing, the container logs the following error, indicating that the request sent to the embedding model contains more than 10 inputs despite the explicit configuration:

litellm.llms.openai.common_utils.OpenAIError: Error code: 400 - {'error': {'message': '<400> InternalError.Algo.InvalidParameter: Value error, batch size is invalid, it should not be larger than 10.: input.contents', 'type': 'InvalidParameter', 'param': None, 'code': 'InvalidParameter'}, 'id': '2354f194-0475-45e1-ada8-95757467f96a', 'request_id': '2354f194-0475-45e1-ada8-95757467f96a'}

This suggests that the batch size configuration is either being ignored or overridden internally, leading to invalid requests to the embedding model.
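For reference, the client-side chunking that the configuration is expected to perform can be sketched as below. This is a hypothetical illustration, not WrenAI's actual code: `embed_batch` stands in for whatever call ultimately hits the embedding endpoint, and the point is only that no single request should ever carry more inputs than the configured batch size.

```python
from typing import Callable, Sequence


def embed_in_chunks(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list],
    batch_size: int = 10,  # mirrors column_indexing_batch_size (hypothetical wiring)
) -> list:
    """Embed `texts` in chunks of at most `batch_size` items.

    Guarantees that no single request to the embedding backend carries
    more than `batch_size` inputs, however many texts are passed in.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    embeddings: list = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        embeddings.extend(embed_batch(chunk))
    return embeddings
```

If the indexing pipeline honored the setting this way, a backend with a hard limit of 10 would never see an oversized request, so the 400 above suggests the chunking step is either skipped or uses a different value.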

Steps to Reproduce:

  1. Configure column_indexing_batch_size: 10 in config.yaml.
  2. Use an OpenAI-compatible Embedding model with a max batch size of 10.
  3. Run the indexing process or relevant operation.
  4. Observe the error in container logs.

Expected Behavior:

Requests should respect the configured batch size of 10 and never exceed the model's limit, so the 400 error does not occur.

Actual Behavior:

The request fails with a 400 error because the batch sent to the model contains more than 10 inputs, as if the configured value were being ignored.

Additional Notes:

  • This issue persists even with the exact max batch size configured.
  • Please investigate if there's an internal default or override affecting this parameter.
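One plausible failure mode, sketched here purely as an illustration (the names and values are hypothetical, not taken from WrenAI's source), is a settings loader whose built-in default is merged after the values parsed from config.yaml, so the user's value never wins:

```python
# Hypothetical illustration of a config value being silently overridden:
# the internal default is merged AFTER the user's config, so the YAML
# value never reaches the indexing code.

DEFAULT_SETTINGS = {"column_indexing_batch_size": 50}  # hypothetical internal default


def load_settings(user_config: dict) -> dict:
    # BUG: user values merged first, defaults second -> defaults win.
    return {**user_config, **DEFAULT_SETTINGS}


def load_settings_fixed(user_config: dict) -> dict:
    # FIX: defaults first, user config second -> user values win.
    return {**DEFAULT_SETTINGS, **user_config}
```

Logging the effective value of column_indexing_batch_size at startup (at DEBUG level) would quickly confirm or rule out this kind of override.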

null-ed · Nov 10 '25 09:11

config.yaml:

```yaml
type: llm
provider: litellm_llm
timeout: 120
models:
  - api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    model: openai/qwen-max
    alias: default
    timeout: 120
    kwargs:
      max_tokens: 4096
      n: 1
      seed: 0
      temperature: 0

---
type: embedder
provider: litellm_embedder
models:
  - model: openai/text-embedding-v4
    alias: default
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    dimension: 2048
    timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: engine
provider: wren_ibis
endpoint: http://ibis-server:8000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 2048
timeout: 120
recreate_index: true

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: table_description_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_correction
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: followup_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_answer
    llm: litellm_llm.default
  - name: semantics_description
    llm: litellm_llm.default
  - name: relationship_recommendation
    llm: litellm_llm.default
  - name: question_recommendation
    llm: litellm_llm.default
  - name: question_recommendation_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: intent_classification
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: misleading_assistance
    llm: litellm_llm.default
  - name: data_assistance
    llm: litellm_llm.default
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: litellm_embedder.default
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: litellm_embedder.default
    llm: litellm_llm.default
  - name: preprocess_sql_data
    llm: litellm_llm.default
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.default
  - name: chart_adjustment
    llm: litellm_llm.default
  - name: user_guide_assistance
    llm: litellm_llm.default
  - name: sql_question_generation
    llm: litellm_llm.default
  - name: sql_generation_reasoning
    llm: litellm_llm.default
  - name: followup_sql_generation_reasoning
    llm: litellm_llm.default
  - name: sql_regeneration
    llm: litellm_llm.default
    engine: wren_ui
  - name: instructions_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: instructions_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_functions_retrieval
    engine: wren_ibis
    document_store: qdrant
  - name: project_meta_indexing
    document_store: qdrant
  - name: sql_tables_extraction
    llm: litellm_llm.default
  - name: sql_diagnosis
    llm: litellm_llm.default

---
settings:
  doc_endpoint: https://docs.getwren.ai
  is_oss: true
  engine_timeout: 30
  column_indexing_batch_size: 5
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_intent_classification: true
  allow_sql_generation_reasoning: true
  allow_sql_functions_retrieval: true
  enable_column_pruning: false
  max_sql_correction_retries: 3
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false
  historical_question_retrieval_similarity_threshold: 0.9
  sql_pairs_similarity_threshold: 0.7
  sql_pairs_retrieval_max_size: 10
  instructions_similarity_threshold: 0.7
  instructions_top_k: 10
```

null-ed · Nov 10 '25 09:11