Document Embedding issues while using sagemaker model
Self Checks
- [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
15.2
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Adding a document to a knowledge base with a SageMaker model as the embedding model (Cohere and Voyage) throws an error during processing:
Input Validation Error.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
Hey @Nicj228! 👋 I'm here to help you with bugs, questions, and becoming a contributor. Let's squash those bugs together!
I found a similar unsolved discussion titled "Exception: list index out of range", which might be related to your issue. Another discussion, titled "Error: Response output is missing or does not contain embeddings", could also be relevant. Both discussions are still open and might provide some insight into your problem [1][2].
Regarding your issue with input validation errors when using SageMaker models for document embedding, Dify handles embedding models with SageMaker by using the SageMakerEmbeddingModel class. Common error scenarios include connection errors, server unavailability, rate limiting, authorization errors, and bad request errors, which are mapped to specific InvokeError types [3][4].
If you haven't already, you might want to check the configuration of your embedding models and ensure that all required parameters are correctly set. Additionally, reviewing the error logs for specific messages related to input validation might help diagnose the issue further.
To continue talking to Dosu, mention @dosu.
@dosu. "error": "[sagemaker] Error: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "Input Validation Error: 1 validation error for InputData\ninput\n Field required [type=missing, input_value={'inputs': ["Qu'est-ce qu...alse, 'instruction': ''}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.5/v/missing". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/voyage-embed in account 135808918163 for more information.",
The error message shows a 400 client error returned by the endpoint: Pydantic validation of the request body failed because the required field input is missing. The logged input_value shows that the request carried the key inputs (alongside instruction), so the key name in the request does not match the schema the deployed model expects.
To resolve this issue, make sure the request body matches the input schema of the model deployed behind the endpoint. The SageMakerEmbeddingModel class takes the texts to embed as a list of strings [1], so check which key the endpoint expects that list under. The CloudWatch log group linked in the error message may give more detail.
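As a rough illustration of the mismatch in the error above (the validator requires input, while the logged request body uses inputs), here is a minimal sketch of a key-renaming step. The helper name to_endpoint_payload is hypothetical and is not part of Dify or of any SageMaker SDK. The sketch also assumes the endpoint accepts a body containing only input, which the error message does not confirm.

```python
import json
from typing import Any


def to_endpoint_payload(request_body: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: move the texts from `inputs` to `input`.

    Assumption: the endpoint's schema needs only an `input` list of
    strings. Other keys (such as `instruction`) are dropped here.
    """
    if "input" in request_body:
        # Already in the shape the validator asked for.
        return dict(request_body)
    if "inputs" not in request_body:
        raise ValueError("request body has neither 'input' nor 'inputs'")
    texts = request_body["inputs"]
    if isinstance(texts, str):
        # Normalise a single string to a one-element list.
        texts = [texts]
    return {"input": list(texts)}


# Same shape as the body in the error message, with placeholder text.
body = {"inputs": ["example text"], "instruction": ""}
print(json.dumps(to_endpoint_payload(body), ensure_ascii=False))
```

A step like this could sit in a custom inference script or in a thin proxy in front of the endpoint. Whether it is the right fix depends on the schema of the specific Cohere or Voyage model package, which should be checked against its documentation.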