
We evaluate LLaMA on 100 examples from the [`SQuAD`](https://huggingface.co/datasets/squad) dataset using the [Open-evals](https://github.com/open-evals/evals) framework, which extends OpenAI's Evals to other language models. We consider the sentence immediately following the prompt...
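As a rough illustration of how such a score can be computed, the sketch below implements the standard SQuAD exact-match metric (lowercasing, stripping punctuation and articles, collapsing whitespace) over a handful of hypothetical predictions. This is a minimal sketch of the usual SQuAD normalization, not the actual scoring code used by Open-evals:

```python
import re
import string


def normalize(text: str) -> str:
    """Standard SQuAD answer normalization: lowercase, drop punctuation
    and articles (a/an/the), and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """A prediction counts as correct if it matches any reference answer
    after normalization."""
    return any(normalize(prediction) == normalize(ref) for ref in references)


# Hypothetical predictions and gold answers, for illustration only.
predictions = {"q1": "The Eiffel Tower", "q2": "1912"}
references = {"q1": ["Eiffel Tower"], "q2": ["in 1912", "1912"]}

score = sum(
    exact_match(predictions[q], references[q]) for q in predictions
) / len(predictions)
print(f"exact match: {score:.2f}")
```

Averaging this per-example 0/1 score over the 100 sampled examples gives the accuracy reported above.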