human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
When I run evaluate_functional_correctness sample.jsonl --problem_file=problem.jsonl, I run into the following problem. Can you help me? Thanks. Detailed log: Reading samples... 1it [00:00, 2118.34it/s] Running test suites... 0%| | 0/1...
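For reference, here is a minimal sketch of the two JSONL formats the command expects, modeled on the repository's data/example_problem.jsonl and data/example_samples.jsonl; the "test/0" task is a placeholder, not a real HumanEval problem:

```python
# Minimal sketch of the two JSONL formats (one JSON object per line),
# modeled on the repository's example data files.
import json

problem = {
    "task_id": "test/0",
    "prompt": "def return1():\n",
    "entry_point": "return1",
    "canonical_solution": "    return 1\n",
    "test": "def check(candidate):\n    assert candidate() == 1\n",
}
sample = {"task_id": "test/0", "completion": "    return 1\n"}

with open("problem.jsonl", "w") as f:
    f.write(json.dumps(problem) + "\n")
with open("sample.jsonl", "w") as f:
    f.write(json.dumps(sample) + "\n")
```

With both files well-formed, the command should complete and, per the README, write its results next to the sample file (sample.jsonl_results.jsonl).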
I created a conda environment with Python 3.7 using the exact command from the docs. Then I used OpenAI's text-davinci-002 to generate a samples.jsonl file with 3 results for each...
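A hedged sketch of such a generation loop, following the pattern shown in the repository README; the OpenAI call uses the legacy (pre-1.0) Completion API, and the max_tokens, temperature, and stop values here are assumptions, not recommendations:

```python
# Sketch of a generation loop producing samples.jsonl with 3
# completions per task, using the legacy openai<1.0 Completion API.
import openai
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Return only the completion, without the prompt itself.
    resp = openai.Completion.create(
        model="text-davinci-002",
        prompt=prompt,
        max_tokens=300,          # assumed budget
        temperature=0.8,         # assumed sampling temperature
        stop=["\ndef ", "\nclass ", "\nif __name__"],
    )
    return resp["choices"][0]["text"]

problems = read_problems()
num_samples_per_task = 3
samples = [
    dict(task_id=task_id,
         completion=generate_one_completion(problems[task_id]["prompt"]))
    for task_id in problems
    for _ in range(num_samples_per_task)
]
write_jsonl("samples.jsonl", samples)
```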
The prompt states, "Knowing that (a) is less than 100." Why are there test cases like assert candidate(11 * 13 * 7) == True (an argument that evaluates to 1001, well above 100) in the...
I have gotten the evaluation program running successfully, but sometimes an error like "No module named XXX" occurs. I want to know which Python libraries can be called when the...
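If I read execution.py correctly, completions are exec'd in a subprocess of the same Python environment that runs the evaluator, so any package installed in that environment should be importable (the reliability guard disables some dangerous os-level calls, not imports). A small check, with an assumed list of candidate modules:

```python
# Hedged sketch: "No module named XXX" simply means the generated code
# imports a package absent from the evaluation environment. The module
# names below are arbitrary examples.
import importlib.util

def module_available(name: str) -> bool:
    """True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None

for mod in ["math", "numpy", "sympy"]:
    print(mod, module_available(mod))
```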
HumanEval
$ evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl Reading samples... 6it [00:00, 7047.28it/s] Running test suites... 100%|...| 6/6 [00:00
During evaluation the code first uses a ThreadPoolExecutor, and each thread then uses the multiprocessing package. Why not use a ProcessPoolExecutor from the start? Is there a performance consideration behind this?
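A plausible reason (an assumption about the design, not a statement from the maintainers) is that each completion must run with a hard timeout in an isolated interpreter: a child process can be killed mid-execution, while a thread cannot, so the thread pool only dispatches and waits. A minimal sketch of the pattern, not the repository's exact code:

```python
# Threads act as cheap dispatchers; each spawns a real OS process so a
# hung or hostile completion can be terminated with Process.kill(),
# which has no equivalent for threads.
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

def check_one(code: str, timeout: float = 3.0) -> str:
    p = multiprocessing.Process(target=exec, args=(code, {}))
    p.start()
    p.join(timeout)
    if p.is_alive():
        p.kill()          # enforce the timeout by killing the process
        p.join()
        return "timed out"
    return "passed" if p.exitcode == 0 else "failed"

if __name__ == "__main__":
    snippets = ["x = 1 + 1", "while True: pass"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(check_one, snippets)))
```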
Dear HumanEval Maintainers, Thank you so much for sharing this awesome Test Set! I fully understand that due to the nature of a Test Set, we want to keep it...
https://github.com/openai/human-eval/blob/312c5e5532f0e0470bf47f77a6243e02a61da530/human_eval/evaluation.py#L26 This code returns 1 when c=0 and n < k, whereas 0 is expected.
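For context, here is a sketch of the estimator in question together with the edge case reported above; the explicit c == 0 check is one possible fix, not the repository's current code:

```python
# Unbiased pass@k estimator. The guard `n - c < k` is meant for "every
# size-k draw contains a correct sample", but with c == 0 it also fires
# whenever n < k and wrongly returns 1.0; checking c == 0 first (an
# assumed fix) returns 0.0 instead.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """n: total samples, c: correct samples, k: draw budget."""
    if c == 0:
        return 0.0        # no correct sample can ever be drawn
    if n - c < k:
        return 1.0        # too few failures to fill a size-k draw
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=3, c=0, k=5))   # 0.0; the linked code returns 1.0
```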