stream disconnected before completion: failed to parse ErrorResponse: invalid type: null, expected struct Error
What version of Codex is running?
v0.63.0
What subscription do you have?
Usage based
Which model were you using?
gpt-5.1-codex
What platform is your computer?
macOS
What issue are you seeing?
■ stream disconnected before completion: failed to parse ErrorResponse: invalid type: null, expected struct Error
› continue
• Explored
  └ Search progressive in api
    Read composer.py
    Search PayloadGenerator in src
    Search build_adapter_registry
    Read init.py
■ stream disconnected before completion: failed to parse ErrorResponse: invalid type: null, expected struct Error
› continue
• Explored
  └ List ls
    Read architecture.md
    Search progressive in architecture.md
    Search adapter in architecture.md
■ stream disconnected before completion: failed to parse ErrorResponse: invalid type: null, expected struct Error
› continue
• Explored
  └ Search Validation Parser in *.md
    Search Parser in architecture.md
    Read architecture.md
    Search StaticQueryGenerator in *.md
■ stream disconnected before completion: failed to parse ErrorResponse: invalid type: null, expected struct Error
What steps can reproduce the bug?
Uploaded thread: 019ab9d9-83f6-7e71-911f-83579414f28e
What is the expected behavior?
No response
Additional information
No response
Potential duplicates detected. Please review them and close your issue if it is a duplicate.
- #6732
- #6710
I am using Azure to host the gpt-5.1-codex model and constantly get this error. It usually runs fine for a while, then starts failing to connect and says "Reconnecting". Sometimes the reconnect succeeds, but eventually it fails and displays the error. It seems to be related to limits: if you stop using the model for a while, it starts to work again for a bit, then starts failing. I have a quota of 200 (200k tokens/min) in Azure for this model, and it should not be hitting that limit. I am using Codex CLI for one task only, and no one else is using that model. I am on codex-cli v0.63.0. This problem started when I upgraded to 0.58.0; versions prior to that worked fine with the same volume of requests.
@ChrisEdwards, could you use the /feedback slash command to upload one of the sessions that's exhibiting this problem and post the thread ID here?
I will do that tomorrow.
I submitted it. Thread ID 019ae16d-e167-75c3-ae74-ad0f028010c4
I uploaded another scenario where it was failing over and over. I thought it would help to have more context. Thread ID: 019ae16d-e167-75c3-ae74-ad0f028010c4
Same issue. Thread ID: 019adee8-cae3-7ea3-8296-d17a2603bece
I think I am exceeding Azure's TPM limits and that is causing the issue. If I pause for a long time, it runs great for a while, then fails more and more frequently until it's unusable. If I then wait an hour, it is all good again and degrades over time as before. This has to be some sort of throttling; that is my guess.
@ChrisEdwards, yeah, that's a good theory. The Azure endpoints report quota and rate-limit errors differently from the OpenAI endpoints. The Codex code base contains code to gracefully handle these conditions for the latter but not for the former. We've reached out to the team at Microsoft responsible for the Azure endpoints to better understand the differences so we can provide a better experience for Azure users.
@azhar-alhasan & @lingfengchencn, are you also using Azure by any chance? The "stream disconnected" error is a rather general error message, so it likely has multiple root causes.
This has made Codex unusable for me. I have been using Claude when I would rather use Codex. I will watch this bug and consider using Codex again when it is fixed. It can't possibly be token limitations, as it now fails constantly after only a few calls. Codex is the better model; I just can't use it.
What I have found that helps is to set request_max_retries = 20 and max_output_tokens = 16384. These must be set in the provider section of the config.toml file.
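A sketch of what that provider section might look like, based on the two settings named above; the provider id `azure`, the `name`, and the `base_url` here are placeholders, not values from this thread:

```toml
# config.toml — illustrative sketch; provider id and URL are placeholders
[model_providers.azure]
name = "Azure"
base_url = "https://YOUR-RESOURCE.openai.azure.com/openai"
request_max_retries = 20    # keep retrying through Azure throttling windows
max_output_tokens = 16384   # cap Azure's per-request token reservation
```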
The reason is that Azure reserves tokens against your quota based on the request's max_output_tokens. If it is not set, Azure assumes the model's maximum, which is hundreds of thousands of tokens, so you exhaust the quota before you actually should. Setting a smaller max_output_tokens therefore helps. I scanned my sessions for the maximum tokens I had actually used and set the value slightly higher.
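Some back-of-the-envelope arithmetic illustrates why the cap matters. The prompt size and model maximum below are assumed round numbers for illustration; only the 200k TPM quota comes from this thread:

```python
# Illustrative arithmetic for how Azure's TPM quota is consumed when each
# request reserves prompt_tokens + max_output_tokens up front.
TPM_QUOTA = 200_000  # the reporter's 200k tokens/min quota


def requests_before_throttle(prompt_tokens, max_output_tokens):
    """How many requests fit in the per-minute quota if Azure reserves
    prompt_tokens + max_output_tokens for each one."""
    return TPM_QUOTA // (prompt_tokens + max_output_tokens)


# If max_output_tokens is unset and Azure assumes the model maximum
# (say ~128k tokens), a single 20k-token prompt nearly exhausts the quota:
requests_before_throttle(20_000, 128_000)  # → 1
# Capping output at 16384 leaves room for several requests per minute:
requests_before_throttle(20_000, 16_384)   # → 5
```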
The request_max_retries setting helps because retries use exponential backoff, waiting longer and longer between attempts. My use case is to run Codex in yolo mode with full permissions, prompt it, and let it run completely unattended. The problem is that with the default number of retries, it stops retrying and then stops running. You can queue up a bunch of "continue" messages, but it runs through them fast. With a retry value of 20 it never stops retrying before requests start succeeding again, so I don't need continue messages queued up; it won't stop until it finishes the task I gave it.
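The effect of raising the retry count can be sketched as follows. The base delay and cap here are assumptions for illustration, not Codex's actual backoff parameters:

```python
# Sketch of an exponential-backoff retry schedule; base and cap are assumed.
def backoff_delays(max_retries, base=1.0, cap=60.0):
    """Return the wait (in seconds) before each retry attempt."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


# With only a handful of retries the total wait is short, so an unattended
# run gives up while Azure is still throttling:
few = sum(backoff_delays(5))    # 1+2+4+8+16 = 31 seconds of waiting
# With 20 retries, later attempts sit at the cap, so the run keeps waiting
# (here ~15 minutes in total) until quota frees up:
many = sum(backoff_delays(20))
```

The point is not the exact numbers but that the schedule keeps the session alive across a throttling window instead of aborting mid-task.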
I can now use it again. I wish it gave better error messages so I could have figured out the issue sooner. To be clear, THE ISSUE IS INDEED TOKEN THROTTLING. I validated this by checking my Azure monitoring and seeing the throttling in the metrics. These settings make it usable, though still slow due to all the retries. I hope this helps.