Invalid n value (currently only n = 1 is supported)
- [ ] I have checked the documentation and related resources and couldn't resolve my bug.
**Describe the bug**
openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'}
Ragas version: 0.1.10
**Code to Reproduce**
```python
from ragas import evaluate
from langchain_openai import ChatOpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from datasets import Dataset

# embeddings (HuggingFace BGE)
model_name = "BAAI/bge-m3"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)

import nest_asyncio
nest_asyncio.apply()

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
    n=1,
)

from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
)

data = {
    'question': ['What is the capital of France?'],
    'contexts': [['Paris is the capital of France.']],
    'answer': ['Paris'],
    'ground_truth': ['Paris'],
}
dataset = Dataset.from_dict(data)

result = evaluate(
    dataset=dataset,
    metrics=[
        context_precision,
        context_recall,
        faithfulness,
        answer_relevancy,
    ],
    llm=llm,
    embeddings=embeddings,
)
result
```
**Error trace**
Exception in thread Thread-11:
Traceback (most recent call last):
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 95, in run
results = self.loop.run_until_complete(self._aresults())
File "/usr/local/lib/python3.10/dist-packages/nest_asyncio.py", line 98, in run_until_complete
return f.result()
File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception.with_traceback(self._exception_tb)
File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 83, in _aresults
raise e
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 78, in _aresults
r = await future
File "/usr/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
return f.result() # May raise f.exception().
File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception.with_traceback(self._exception_tb)
File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 37, in sema_coro
return await coro
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 111, in wrapped_callable_async
return counter, await callable(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/base.py", line 125, in ascore
raise e
File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/base.py", line 121, in ascore
score = await self._ascore(row=row, callbacks=group_cm, is_async=is_async)
File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/_answer_relevance.py", line 152, in _ascore
result = await self.llm.generate(
File "/usr/local/lib/python3.10/dist-packages/ragas/llms/base.py", line 93, in generate
return await agenerate_text_with_retry(
File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
return await copy(fn, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 111, in __call__
do = await self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 153, in iter
result = await action(retry_state)
File "/usr/local/lib/python3.10/dist-packages/tenacity/_utils.py", line 99, in inner
return call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 398, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 114, in __call__
result = await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ragas/llms/base.py", line 170, in agenerate_text
return await self.langchain_llm.agenerate_prompt(
File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 691, in agenerate_prompt
return await self.agenerate(
File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 651, in agenerate
raise exceptions[0]
File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 836, in _agenerate_with_cache
result = await self._agenerate(
File "/usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py", line 674, in _agenerate
response = await self.async_client.create(**payload)
File "/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions.py", line 1289, in create
return await self._post(
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1816, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1514, in request
return await self._request(
File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1610, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'}
Hello, @jidechao! I'm here to help you with any bugs or questions you have. Let's work together to solve this issue.
To resolve the openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'} error, ensure that the n parameter is set to 1 in your ChatOpenAI model configuration. The ragas library currently supports only n = 1 for text generation.
Here's the corrected part of your code:
```python
llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
    n=1,  # ensure n is set to 1
)
```
Make sure that the n parameter is not set to any value other than 1. This should resolve the error related to the "Invalid n value" [1].
To continue talking to Dosu, mention @dosu.
I tested it and found that three metrics can be executed. With four metrics, it reports an error: Invalid n value (currently only n = 1 is supported)
The "Invalid n value" error when evaluating four metrics simultaneously is likely due to the internal handling of multiple metrics in the ragas library. Here are a few steps to ensure your setup is correct:
-
Ensure Dataset Format: Verify that your dataset has the correct format and column names (
question,contexts,answer,ground_truth). -
Check Metric Initialization: Ensure that all metrics are properly initialized with the required language models (LLMs) and embeddings.
-
Set
nParameter Correctly: Thenparameter in theChatOpenAIclass should be set to 1, as the library currently supports onlyn = 1for text generation.
Here is an example of how to structure your dataset and call the evaluate function:
```python
from ragas import evaluate
from datasets import Dataset
from langchain_openai import ChatOpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

# Example dataset
data = {
    'question': ["What is the capital of France?"],
    'contexts': [["Paris is the capital of France."]],
    'answer': ["Paris"],
    'ground_truth': ["Paris"],
}
dataset = Dataset.from_dict(data)

# Initialize LLM and embeddings
llm = ChatOpenAI(model="deepseek-chat", api_key="sk-*******", base_url="https://api.deepseek.com/v1", temperature=0, n=1)
embeddings = HuggingFaceBgeEmbeddings(model_name="BAAI/bge-m3", model_kwargs={'device': 'cpu'}, encode_kwargs={'normalize_embeddings': True})

# Import metrics
from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall

# Call the evaluate function
result = evaluate(
    dataset=dataset,
    metrics=[context_precision, context_recall, faithfulness, answer_relevancy],
    llm=llm,
    embeddings=embeddings,
)
print(result)
```
Ensure that the n parameter is set to 1 in the ChatOpenAI configuration and that the dataset is correctly formatted; this should resolve the "Invalid n value" error [1][2].
To continue talking to Dosu, mention @dosu.
I tested it again, and it seems that answer_relevancy has an issue, while other metrics can be executed normally.
error: [screenshot omitted]
correct: [screenshot omitted]
I'm having the same problem.
Same problem.
@jidechao @123gbk @Mervyn-1 which model service are you using?
The issue is that you're using ChatOpenAI, so ragas expects the n parameter to be supported. But I realize we need a way to override it now.
I guess the problem might be with deepseek-chat, which does not support n > 1. I'm not entirely sure about that, but I just found that both @jidechao and I are using DeepSeek's OpenAI-compatible mode.
Maybe we can fix this problem by calling the DeepSeek API n times with one reply each time, rather than calling it once with n replies.
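A minimal sketch of that idea, outside ragas internals (the AsyncOpenAI client setup and the generate_n helper are hypothetical, for illustration only):

```python
# Hedged sketch: emulate n completions by issuing n single-completion
# requests, since the DeepSeek endpoint rejects n > 1.
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key="sk-*******", base_url="https://api.deepseek.com/v1")

async def generate_n(prompt: str, n: int) -> list[str]:
    async def one() -> str:
        # One request per completion instead of a single request with n=<n>.
        resp = await client.chat.completions.create(
            model="deepseek-chat",
            messages=[{"role": "user", "content": prompt}],
            n=1,
        )
        return resp.choices[0].message.content or ""

    return list(await asyncio.gather(*(one() for _ in range(n))))
```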
This issue is specifically related to the DeepSeek API, which doesn't support the OpenAI API feature of generating multiple responses in a single call (the n parameter).
I've implemented a fix by modifying the _ascore method in src/ragas/metrics/_answer_relevance.py. Here's the updated code:
```python
async def _ascore(self, row: t.Dict, callbacks: Callbacks) -> float:
    assert self.llm is not None, "LLM is not set"

    prompt = self._create_question_gen_prompt(row)

    answers = []
    for _ in range(self.strictness):
        result = await self.llm.generate(
            prompt,
            n=1,
            callbacks=callbacks,
        )
        generated_text = result.generations[0][0].text
        answer = await _output_parser.aparse(generated_text, prompt, self.llm)
        if answer is not None:
            answers.append(answer)

    if not answers:
        return np.nan

    return self._calculate_score(answers, row)
```
This solution loops through the generation process self.strictness times, generating a single response each time. The workaround lets the code work with APIs like DeepSeek that don't support the n parameter for multiple generations.
However, it's worth noting that this approach might have performance implications with models that do support generating multiple responses in a single call.
A more optimal solution would be a mechanism that switches between looping and using the n parameter based on the capabilities of the underlying LLM, but I don't know the best way to add this information to the LLM class.
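For illustration, a minimal sketch of such a switch (the CapabilityAwareLLM class and its supports_n flag are assumptions, not existing ragas attributes):

```python
# Hedged sketch: switch between native n-completions and a loop,
# based on a user-declared capability flag (assumed, not ragas API).
import typing as t
from dataclasses import dataclass


@dataclass
class CapabilityAwareLLM:
    llm: t.Any               # anything exposing async generate(prompt, n=..., callbacks=...)
    supports_n: bool = True  # set False for providers like DeepSeek

    async def generate(self, prompt, n: int = 1, callbacks=None):
        if self.supports_n or n == 1:
            # Provider handles multiple completions in a single call.
            return await self.llm.generate(prompt, n=n, callbacks=callbacks)
        # Fall back to n single-completion calls.
        return [await self.llm.generate(prompt, n=1, callbacks=callbacks) for _ in range(n)]
```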
@Xiyuche actually we do that for LLMs that don't support the n parameter, and the solution is similar to what @PhillipRt has already suggested (thanks a lot btw 🙂 ❤️).
The con is that it's hard to keep a list of which models support it. ChatOpenAI, for example, does support n, but since you're using the same class for DeepSeek, this happens.
https://github.com/explodinggradients/ragas/blob/b988f538b968b378604240d2b279f8bf4d8ecca9/src/ragas/llms/base.py#L29-L35
https://github.com/explodinggradients/ragas/blob/b988f538b968b378604240d2b279f8bf4d8ecca9/src/ragas/llms/base.py#L39-L44
I think the solution would be to have an argument inside the LLM for this.
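For readers without the links handy, the referenced helper is roughly an isinstance allowlist; paraphrased (not a verbatim copy of base.py), it looks something like this:

```python
# Paraphrase of the linked base.py lines: ragas keeps an allowlist of
# Langchain classes known to support n, and checks membership with
# isinstance. A DeepSeek endpoint driven through ChatOpenAI passes this
# check even though the backend rejects n > 1, hence this issue.
from langchain_openai import AzureChatOpenAI, AzureOpenAI, ChatOpenAI, OpenAI

MULTIPLE_COMPLETION_SUPPORTED = [OpenAI, ChatOpenAI, AzureOpenAI, AzureChatOpenAI]


def is_multiple_completion_supported(llm) -> bool:
    return any(isinstance(llm, llm_type) for llm_type in MULTIPLE_COMPLETION_SUPPORTED)
```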
Thank you! A lot of API providers like DeepSeek recommend using the OpenAI-compatible mode directly, which is, however, not perfectly compatible with OpenAI 😂, n being one example.
If using LangchainLLMWrapper, pass bypass_n=True when wrapping the LLM.
Even better, migrate to llm_factory() instead of LangchainLLMWrapper.
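A minimal usage sketch of the suggested workaround (assuming bypass_n is accepted by the wrapper's constructor as described above):

```python
# Hedged sketch: wrap the Langchain LLM with bypass_n=True so ragas
# loops single completions instead of sending n > 1 to the provider.
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
)
evaluator_llm = LangchainLLMWrapper(llm, bypass_n=True)
```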
Closing as resolved.