Test data generation Ground Truth column is missing
Your Question I am using the following code to evaluate my dataset. I recently upgraded from 0.1.13 to 0.1.18 to use the new metrics (noise_sensitivity_relevant, noise_sensitivity_irrelevant), but I get an error saying that the noise_sensitivity_relevant metric needs the ground_truth column in the dataset, even though this column is present in the dataset.
import pandas as pd
from datasets import Dataset
from ragas import evaluate, RunConfig
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
    context_entity_recall,
    answer_correctness,
    answer_similarity,
    noise_sensitivity_relevant,
    noise_sensitivity_irrelevant
)
from dotenv import load_dotenv
load_dotenv()
df = pd.read_excel('testset_full-100-ES.xlsx')
# Create the 'data_samples' dictionary structure
data_samples = {
    'question': df['question'].tolist(),
    'answer': df['answer'].tolist(),
    'contexts': df['contexts'].apply(lambda x: [x] if pd.notna(x) else []).tolist(),
    'ground_truth': df['ground_truth'].tolist()
}
dataset = Dataset.from_dict(data_samples)
print(dataset)
run_config = RunConfig(timeout=120)
result = evaluate(
    dataset,
    metrics=[
        context_precision,
        faithfulness,
        answer_relevancy,
        context_recall,
        context_entity_recall,
        answer_correctness,
        answer_similarity,
        noise_sensitivity_relevant,
        noise_sensitivity_irrelevant,
    ],
    run_config=run_config
)
df = result.to_pandas()
print(result)
# save evaluation results to csv
df.to_csv('results-100-ES.csv', index=False)
Traceback (most recent call last):
  File "C:\Users\Kaan9\bitbucket\rag\Evaluation\scripts\eval.py", line 33, in <module>
    result = evaluate(
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\_analytics.py", line 129, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\evaluation.py", line 177, in evaluate
    validate_required_columns(dataset, metrics)
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\validation.py", line 62, in validate_required_columns
    raise ValueError(
ValueError: The metric [noise_sensitivity_relevant] that that is used requires the following additional columns ['ground_truth'] to be present in the dataset.
I found a similar open issue that might help you resolve this problem: Not able to calculate noise_sensitivity_relevant using Azure Open AI. The suggested solution is to ensure that the column names in your dataset exactly match the required names. The noise_sensitivity_relevant metric requires the columns user_input, response, ground_truth, and retrieved_contexts.
Here is an example of how to structure your dataset correctly:
from datasets import Dataset
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant
from ragas import evaluate
data_sample = {
    "user_input": ["What is the Life Insurance Corporation of India (LIC) known for?"],
    "ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."],
    "response": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."],
    "retrieved_contexts": [[
        "The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.",
        "LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.",
        "As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.",
        "The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing, etc."
    ]]
}
dataset = Dataset.from_dict(data_sample)
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant]
score = evaluate(dataset, metrics=metrics)
print(score.to_pandas())
Ensure that the column names in your dataset are exactly user_input, response, ground_truth, and retrieved_contexts without any typos or case differences [1].
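Applied to a dataset like the one in the question, one option is to rename the old-style columns before building the Dataset. A minimal sketch (the sample rows below are illustrative stand-ins for the Excel file, not real data):

```python
import pandas as pd

# Illustrative stand-in for pd.read_excel('testset_full-100-ES.xlsx')
df = pd.DataFrame({
    "question": ["What is LIC known for?"],
    "answer": ["Its vast portfolio of investments."],
    "contexts": ["LIC is the largest insurance company in India."],
    "ground_truth": ["LIC is the largest insurer in India."],
})

# Rename the old column names to the names the new metrics expect.
df = df.rename(columns={
    "question": "user_input",
    "answer": "response",
    "contexts": "retrieved_contexts",
})

# retrieved_contexts must be a list of strings per row.
df["retrieved_contexts"] = df["retrieved_contexts"].apply(
    lambda x: [x] if pd.notna(x) else []
)

print(sorted(df.columns))
# -> ['ground_truth', 'response', 'retrieved_contexts', 'user_input']
```

The renamed frame can then be passed to Dataset.from_dict (or Dataset.from_pandas) exactly as in the original script.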
Additionally, there have been changes in the dataset structure requirements between versions 0.1.13 and 0.1.18 of the Ragas library. For version 0.1.13, the required columns are:
REQUIRED_COLS_v1 = {
    "user_input": "question",
    "retrieved_contexts": "contexts",
    "response": "answer",
    "reference": "ground_truth",
}
For version 0.1.18, the column names have been updated. You can use the provided conversion functions to convert between v1 and v2 datasets:
import typing as t

from datasets import Dataset

def convert_row_v1_to_v2(row: t.Dict[str, t.Any]) -> t.Dict[str, t.Any]:
    required_cols_v2 = {k: v for v, k in REQUIRED_COLS_v1.items()}
    return {required_cols_v2[k]: v for k, v in row.items() if k in required_cols_v2}

def convert_v1_to_v2_dataset(dataset: Dataset) -> Dataset:
    columns_map = {v: k for k, v in REQUIRED_COLS_v1.items() if v in dataset.features}
    return dataset.rename_columns(columns_map)

def convert_v2_to_v1_dataset(dataset: Dataset) -> Dataset:
    columns_map = {k: v for k, v in REQUIRED_COLS_v1.items() if k in dataset.features}
    return dataset.rename_columns(columns_map)
These functions show that the column names were renamed between the two versions (e.g. question became user_input, and ground_truth became reference) [2].
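For instance, a single old-style row can be converted with convert_row_v1_to_v2. A minimal self-contained sketch, redefining the mapping and function locally (the row values are illustrative):

```python
import typing as t

# Old (0.1.13-era) names keyed by their newer counterparts, as above.
REQUIRED_COLS_v1 = {
    "user_input": "question",
    "retrieved_contexts": "contexts",
    "response": "answer",
    "reference": "ground_truth",
}

def convert_row_v1_to_v2(row: t.Dict[str, t.Any]) -> t.Dict[str, t.Any]:
    # Invert the mapping so old names point to the new names.
    required_cols_v2 = {k: v for v, k in REQUIRED_COLS_v1.items()}
    return {required_cols_v2[k]: v for k, v in row.items() if k in required_cols_v2}

row_v1 = {
    "question": "What is LIC known for?",
    "answer": "Its vast portfolio of investments.",
    "contexts": ["LIC was established in 1956."],
    "ground_truth": "LIC is the largest insurer in India.",
}
row_v2 = convert_row_v1_to_v2(row_v1)
print(sorted(row_v2))  # keys are now in the new naming scheme
```

Note that under this mapping the old ground_truth column becomes reference in the new schema.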
Please update to 0.3.9 and check out the latest docs on available metrics.