Creating custom code evaluator drops leading and trailing quotation marks in parameter column

Open Bowman74 opened this issue 9 months ago • 1 comments

If the value of a column both starts AND ends with the quotation character ("), those quotation characters are dropped before the value is passed to the evaluator's custom Python function. I wanted to create an evaluator that compares the ground_truth and response columns to see whether the only difference is the leading and trailing quotation marks. Unfortunately, the quotation marks are stripped before the value reaches the custom evaluator, so the difference cannot be detected.
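For reference, here is a minimal sketch of the kind of check I was trying to write. The function name, parameter names, and returned keys are my own placeholders, not anything required by the toolkit; the exact entry-point signature a custom code evaluator expects may differ.

```python
def differs_only_by_outer_quotes(ground_truth: str, response: str) -> dict:
    """Report whether two values differ only by a surrounding pair of double quotes.

    Hypothetical evaluator body: names and result keys are illustrative only.
    """

    def strip_outer(value: str) -> str:
        # Remove one pair of quotes only when the value both starts and ends with one.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            return value[1:-1]
        return value

    exact_match = ground_truth == response
    quote_only_difference = (
        not exact_match and strip_outer(ground_truth) == strip_outer(response)
    )
    return {
        "exact_match": exact_match,
        "quote_only_difference": quote_only_difference,
    }
```

With the behavior described above, a cell containing "hello" (quotes included) would reach this function as hello, so exact_match would come back true and the quote-only difference would never be flagged.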

If the value only starts or ends with a quotation mark, it is not dropped. It is only when it starts and ends with the quotation mark that the problem occurs.

Interestingly, the built-in code evaluators (BLEU, F1 Score, GLEU and METEOR) all seem able to tell that the response is not the same as the ground_truth, so they must be receiving the leading and trailing quotation marks.

Expected behavior: The values sent to the parameters in the python function for a custom code evaluator should be exactly what they are in the dataset's cell with no characters stripped away.
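One way to confirm what the function actually receives is a small diagnostic evaluator that echoes the value back unmodified. This is only a sketch: the function name and the returned keys are hypothetical, and it assumes the evaluator is allowed to return a dictionary of simple values.

```python
def describe_received_value(value: str) -> dict:
    """Echo details about the received string so stripped characters become visible.

    Hypothetical diagnostic helper: not part of any SDK.
    """
    return {
        "repr": repr(value),  # shows the exact characters, including any quotes
        "length": len(value),
        "starts_with_quote": value.startswith('"'),
        "ends_with_quote": value.endswith('"'),
    }
```

If the dataset cell is 7 characters long (hello wrapped in quotes) but the reported length is 5, the quotes were removed before the function was called.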

Apr 17 '25 00:04 Bowman74

Thanks for the feedback. After investigation, it looks like a bug in the azure-ai-evaluation SDK. A bug has been created for tracking: https://github.com/Azure/azure-sdk-for-python/issues/40996

May 14 '25 01:05 qinezh