qrdlgit
@voynow Why did you close this? It looks good to me, but you should get Andrew's opinion.
Yeah, I saw that as well. TBH though, this seems like a great eval to me, but I'm just a user. Sorting things like this is a very common use...
This seems like an extremely common use case; I feel like we (OP and I) are missing something obvious. If I create a massive (GBs) read-only cache used for...
I downloaded all the merged PRs and asked GPT4 to summarize the common characteristics: The merged evals cover a wide range of topics and skills, including: - Language understanding: Japanese,...
@SkyaTura Yes, absolutely. For those serious about creating an eval here, there is definitely value in going back through all the PRs and reading them closely. That said, it's possible...
Not so much expensive, though perhaps a bit technically challenging. However, we can always ask GPT4, right? Try this prompt: _I'd like to better understand why PRs are being merged...
@SkyaTura I think your deviation was important and there needs to be more discussion around this topic - but you're right. I'll take the blame for the hijack here and...
One suggestion for folks at OpenAI: you might want to add an attribute to the checkbox: [] I understand that opening a PR, even if it meets the requirements...
Try the logic evals: https://github.com/openai/evals/tree/main/evals/registry/data/logic They fail even with chain-of-thought (CoT) reasoning.
Anyone can do this! It works better with GPT4.