Michael Phelps

Results: 12 comments by Michael Phelps

Seems like the suggestion is to move to `django-anymail[mailgun]`. Until I go through the effort of reconfiguring, I'll be replacing `django-mailgun==0.9.1` in my requirements.txt with `git+git://github.com/BradWhittington/[email protected]`.

I'd definitely be open to a PR on this! You could probably add it as one of the last transformation pipelines to be run or something.

Nice idea to do the rounding (I'd tried this but limited the output to integers since I couldn't think of a way to cleanly test the result otherwise). If you're...

> @nottheswimmer I agree with what you had said about accepting more types of answers - I will work on adding more examples in the code. How could I include...

I believe the model may be replying with timezone offsets in some cases instead of using UTC. That behavior actually sounds in line with your instructions -- "If no location...
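A rough sketch of how an offset-style reply could be compared against a UTC ideal answer, assuming both are ISO 8601 strings that carry an explicit offset. The helper names `to_utc` and `same_instant` are hypothetical and are not part of the eval itself:

```python
from datetime import datetime, timezone


def to_utc(iso_string: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(iso_string)
    if parsed.tzinfo is None:
        # Without an offset there is no way to know which instant was meant.
        raise ValueError("timestamp has no offset; cannot normalize to UTC")
    return parsed.astimezone(timezone.utc)


def same_instant(reply: str, ideal: str) -> bool:
    """Return True when two timestamps describe the same moment in time."""
    return to_utc(reply) == to_utc(ideal)
```

With this, a reply of `2023-03-15T01:22:03-04:00` would count as matching an ideal of `2023-03-15T05:22:03+00:00`, since both name the same instant.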

Cool eval but I wonder if there are false negatives where the AI is replying in English instead of writing the digits? Could add multiple ideal values and use the...
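As a minimal sketch of the multiple-ideal-values idea, assuming each sample lists every acceptable spelling of the answer (digits and English words). The functions `normalize` and `matches_any` are hypothetical illustrations, not the evals framework's own API:

```python
def normalize(text: str) -> str:
    """Lowercase, trim, drop a trailing period, and collapse inner whitespace."""
    return " ".join(text.strip().lower().rstrip(".").split())


def matches_any(answer: str, ideals: list[str]) -> bool:
    """Return True when the answer equals any of the acceptable ideal values."""
    cleaned = normalize(answer)
    return any(cleaned == normalize(ideal) for ideal in ideals)


# Example: accept either the digits or the English spelling.
ideals = ["42", "forty-two"]
print(matches_any("Forty-two.", ideals))
print(matches_any("43", ideals))
```

Listing the variants explicitly keeps the check strict while avoiding a false negative when the reply is spelled out in words.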

```
(venv) PS [projectDir]\evals> oaieval gpt-3.5-turbo quadratic-from-three-points
[2023-03-15 01:22:03,131] [registry.py:145] Loading registry from [projectDir]\evals\evals\registry\evals
[2023-03-15 01:22:03,148] [registry.py:145] Loading registry from [userDir]\.evals\evals
[2023-03-15 01:22:03,791] [oaieval.py:178] Run started: 230315052203RH4PSX33
[2023-03-15 01:22:03,795] [data.py:78]...
```

Here is an example of ChatGPT 3.5 and ChatGPT 4.0 failing a prompt. However, notice that ChatGPT 4.0's answer seems to be closer to the correct answer. Purple dots: The...

Sorry @andrew-openai, left a modified filename in the .yaml when I was running a different dataset :P Should be all good now.