
How can I evaluate Ollama models such as Llama 3.3 using RAGAS?

Open HaniaMo opened this issue 1 year ago • 5 comments

[ ] I checked the documentation and related resources and couldn't find an answer to my question.

HaniaMo avatar Jan 20 '25 13:01 HaniaMo

@dosu Can you solve this?

Vidit-Ostwal avatar Jan 20 '25 14:01 Vidit-Ostwal

You can successfully use RAGAS with Ollama models by using the following code:

from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

# Local Ollama chat model to use as the evaluator LLM
ollama_llm = "mistral"
model = ChatOllama(model=ollama_llm)
# Matching Ollama embeddings for metrics that need them
embeddings_ollama = OllamaEmbeddings(model="mistral")

This solution was suggested in a similar closed issue and received positive feedback [1].
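The snippet above only creates the LangChain objects. To actually run an evaluation with them, they can be wrapped with Ragas' LangChain wrappers and passed to evaluate(); here is a minimal sketch (the wrapping step and the one-row dataset are illustrative, not part of the original suggestion):

from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.metrics import faithfulness

# Local Ollama chat model and embeddings (assumes `ollama serve` is running)
ragas_llm = LangchainLLMWrapper(ChatOllama(model="mistral"))
ragas_embeddings = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="mistral"))

# One-row illustrative dataset
dataset = Dataset.from_dict({
    "question": ["What does Ollama do?"],
    "answer": ["Ollama runs large language models locally."],
    "contexts": [["Ollama is a tool for running LLMs on a local machine."]],
})

results = evaluate(dataset, metrics=[faithfulness], llm=ragas_llm, embeddings=ragas_embeddings)
print(results)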

To continue talking to Dosu, mention @dosu.



dosubot[bot] avatar Jan 20 '25 14:01 dosubot[bot]

I successfully ran RAGAS to evaluate an Ollama model. However, I get a timeout error during evaluation and NaN values for the metrics, even after increasing the timeout in the RunConfig.

Evaluating:   0%|          | 0/1 [00:00<?, ?it/s]
ERROR:ragas.executor:Exception raised in Job[0]: TimeoutError()
{'faithfulness': nan}

This is the function I used to calculate the evaluation metrics:

from datasets import Dataset
from ragas import evaluate
from ragas.llms import llm_factory
from ragas.embeddings import embedding_factory
from ragas.metrics import faithfulness
from ragas.run_config import RunConfig

def RAGAS_metrics(user_query, model_response, contexts_list):
    # 6.2 - Run RAGAS metrics
    print("\nRunning RAGAS evaluation metrics...")
    # Build the single-row evaluation dataset
    data = {
        "question": [user_query],       # user_query is a string of length 182
        "answer": [model_response],     # model_response is a string of length 3361
        "contexts": contexts_list,      # list of contexts: [[size=10]]
    }
    dataset = Dataset.from_dict(data)
    results = evaluate(
        dataset=dataset,
        metrics=[faithfulness],
        llm=llm_factory(),
        embeddings=embedding_factory(),
        run_config=RunConfig(max_workers=8, timeout=1000, log_tenacity=True),
    )
    print("Evaluation Results from ragas:")
    print(results)
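For comparison, a sketch that passes the local Ollama model explicitly instead of relying on llm_factory() and embedding_factory(), which default to OpenAI models; whether this avoids the TimeoutError above is an assumption, not something verified in this thread:

from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Wrap the locally served Ollama model so Ragas uses it as the judge
ollama_judge = LangchainLLMWrapper(ChatOllama(model="llama3.3"))
ollama_embeds = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="llama3.3"))

results = evaluate(
    dataset=dataset,
    metrics=[faithfulness],
    llm=ollama_judge,
    embeddings=ollama_embeds,
    run_config=RunConfig(max_workers=8, timeout=1000, log_tenacity=True),
)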

HaniaMo avatar Jan 22 '25 03:01 HaniaMo

Sadly, this is a duplicate of #1170. Today we don't support Ollama models, but we will get this fixed in the coming weeks.

jjmachan avatar Jan 22 '25 04:01 jjmachan

I tried to solve this with the code below. Though I only use the Ollama embeddings, I suppose the LLM works the same way.

import os

from datasets import Dataset
from ragas import evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import answer_correctness, answer_similarity
from langchain_ollama import OllamaEmbeddings
from langchain_community.chat_models import ChatOpenAI

os.environ['OPENAI_API_KEY'] = 'no-key'

# Wrap a local Ollama embedding model for Ragas
ollama_e = OllamaEmbeddings(model="bge-m3", base_url="http://localhost:11434")
ollama_embed_self = LangchainEmbeddingsWrapper(ollama_e)

# Use a DeepSeek chat model through the OpenAI-compatible API as the judge LLM
deepseek_llm = ChatOpenAI(
    api_key=os.getenv('DEEPSEEK_API_KEY'),
    base_url="you_base_url",
    model_name="deepseek-chat",
)
wrapper = LangchainLLMWrapper(deepseek_llm)

# Point each metric at the wrapped LLM and embeddings
metrics = [
    answer_correctness,
    answer_similarity,
]
for m in metrics:
    setattr(m, "llm", wrapper)
    if hasattr(m, "embeddings"):
        setattr(m, "embeddings", ollama_embed_self)

naive_results = evaluate(
    Dataset.from_dict({
        "question": questions,
        "ground_truth": labels,
        "answer": naive_rag_answers,
    }),
    metrics=[
        # answer_relevancy,
        answer_correctness,
        answer_similarity,
    ],
    embeddings=ollama_embed_self,
    llm=wrapper,
)

local_results = evaluate(
    Dataset.from_dict({
        "question": questions,
        "ground_truth": labels,
        "answer": naive_rag_answers,
    }),
    metrics=[
        # answer_relevancy,
        answer_correctness,
        answer_similarity,
    ],
    embeddings=ollama_embed_self,
    llm=wrapper,
)

If the official project has since added proper support for this, please let me know the new usage.

hsbdkdn avatar Mar 23 '25 07:03 hsbdkdn