ollama-python

unexpected error format in response (status code: 500) with gpt-oss:20b

Open kha84 opened this issue 6 months ago • 8 comments

Hello there!

Not quite sure whether this is related to https://github.com/ollama/ollama/issues/11704 or a separate issue.

I have updated to the latest version of ollama (0.11.4 as of now). I'm using the official ollama Python library (0.5.3) and still constantly get 500 errors with gpt-oss:20b:

unexpected error format in response (status code: 500)
Traceback (most recent call last):
  File "/home/user/test_agent/benchmark.py", line 479, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/test_agent/agent.py", line 581, in llm_process_question
    response = OllamaClient.chat(model=model_name, messages=messages)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/test_ollama/lib/python3.12/site-packages/ollama/_client.py", line 342, in chat
    return self._request(
           ^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/test_ollama/lib/python3.12/site-packages/ollama/_client.py", line 180, in _request
    return cls(**self._request_raw(*args, **kwargs).json())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/test_ollama/lib/python3.12/site-packages/ollama/_client.py", line 124, in _request_raw
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: unexpected error format in response (status code: 500)

I'm not using any tools, just plain chat calls like res = OllamaClient.chat(model=model_name, messages=messages) with some custom scaffolding around them. I even tested overriding the default three-pages-long TEMPLATE with a much simpler TEMPLATE borrowed from simpler models like qwen2.5. With that template gpt-oss:20b behaves much more stably, but it still throws a 500 from time to time.

Is it ollama itself or the ollama Python client that tries (and fails) to parse the model's output here?

kha84 avatar Aug 11 '25 19:08 kha84

Hard to tell what's going wrong from just this. Are you able to run the model fine on the CLI with ollama run gpt-oss:20b? I wouldn't recommend playing with the template too much: it's VERY different from other models' templates and has a special parser for it.

ParthSareen avatar Aug 11 '25 23:08 ParthSareen

Sure, I get it. That's why I mentioned I tried a different template only as a test. Both the gpt-oss:20b setup with the original template and the one with my simplified template eventually raise a 500 from time to time.

kha84 avatar Aug 12 '25 06:08 kha84

> Sure, I get it. That's why I mentioned I tried a different template only as a test. Both the gpt-oss:20b setup with the original template and the one with my simplified template eventually raise a 500 from time to time.

Including on the CLI? Could you send the server logs if so?

ParthSareen avatar Aug 12 '25 06:08 ParthSareen

Sure, guys, thanks a lot. Give me some time and I'll put together a small reproducible example to share along with the server logs. I'll also try to reproduce the same thing in a plain ollama run CLI session.

kha84 avatar Aug 12 '25 07:08 kha84

I was somewhat able to reproduce it in a clean ollama run CLI chat. The model's response was cut off in the middle of thinking and the >>> prompt reappeared, inviting me to enter a new message. No 500 was shown, though, and there was nothing interesting in the log; I'm also not sure whether that ever happens with ollama run, or what kind of log message I should be looking for that corresponds to the 500s I was getting via the Python library.

I'll run a few more experiments to make sure I can reproduce this reliably, and I'll share it all with you if it's still relevant.

kha84 avatar Aug 12 '25 07:08 kha84

If you can make that happen again, running the server with OLLAMA_DEBUG=2 set should make the logs give much more info about what's going on (they'll include both the raw text coming from the model and some of the parsing decisions being made).
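For reference, a sketch of how that might look when running the server in the foreground (the exact invocation depends on how ollama was installed; under systemd the variable would go in the unit's environment instead):

```shell
# Stop any background instance first, then run the server in the
# foreground with verbose debug logging enabled.
OLLAMA_DEBUG=2 ollama serve
```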

drifkin avatar Aug 12 '25 20:08 drifkin

Sorry guys, the workload on my current project doesn't give me a moment even to gasp for air. I haven't vanished; I'm just waiting for a calm hour or two to get back to this ticket.

kha84 avatar Aug 13 '25 21:08 kha84

Not a problem! Come back when you're ready, and good luck on your current project :)

drifkin avatar Aug 15 '25 16:08 drifkin