GitHub Copilot multiple premium request charges without subagents
Description
Multiple premium requests are deducted within a single session. There is now also a charge after compaction, though this might be intentional. However, you can be charged multiple times while the agent is working, even without subagents.
Plugins
none
OpenCode version
1.1.14
Steps to reproduce
Run a session using GitHub Copilot (Opus makes it easiest to spot). After compaction there is a cost, which might be intentional, but during normal prompts you can be hit with 6-9 request charges at random without a noticeable trigger.
Screenshot and/or share link
No response
Operating System
Windows 11
Terminal
Windows terminal
This issue might be a duplicate of existing issues. Please check:
- #6816: Github Copilot burns premium quota for non-premium models
- #6802: [BUG] Default small_model charges users when free alternative exists on Copilot
Both of these issues describe unintended premium request charges in similar contexts. Feel free to ignore if your specific case differs from those.
I noticed this as well: my GitHub Copilot usage has gone up faster than before over just the last few days.
Looks a whole lot like #8030 -- I hope you weren't burned too badly, @bklonder-harsco or @OpeOginni
I think it is now working as expected. Before, the counter only went up when opening new sessions. If you are using Copilot in VS Code, it has always shown an increment on every premium request you make.
I agree with you, @flupkede that the previous behavior was incorrect. But correcting it (in 1.1.13 and 1.1.14) appears to have introduced an unwanted side effect. On a long-running complex task involving tool attachments, many agent-initiated requests are being counted as user requests.
Or are you not finding this to be the case on 1.1.13 and 1.1.14?
@bowmanjd I think the issue you're running into is specific to images. Are those the types of attachments the agent is reading?
Oh, OK! So perhaps my diagnosis is wrong. I did not think I had any images involved in my request this morning. Honestly, I am unsure how to tell by looking at opencode logs exactly what happened. I do know that I made a couple of requests at most, and this resulted in Copilot logging 62 requests (x 3 premium requests).
I can try later, and see if I can give a guide for reproducing this. I'll use Haiku or similar cheaper model, and a small request, with debug logging enabled, and see where we end up
OK, sorry, folks. I am trying to replicate and cannot. Perhaps there was something else going on at the time. I will study the logs more and see if I can discern. My several attempts just now using Haiku yielded exactly the expected behavior.
@bowmanjd I did some tests using OpenAgents framework and when the main agent requests the new ContextScout (subagent) to kick in, I can see my credits used going up. But this one is also using Haiku so it's also considered as Premium Requests.
I’m experiencing the same issue. I’m not using OpenAgents or anything similar, but every time context compaction happens, it counts a single request as if it were three.
Just to add onto this as well. I downgraded to 1.1.13 and behaviour worked again like normal with no unusual spikes.
Yeah, I did the same and went back to 1.1.13. There is definitely an issue with the new behavior in how Copilot calls are counted as usage.
The new upgrade should cause more premium request usage than it used to, right, because the older implementation did not use enough premium requests. Are you noticing a slight increase or a significant increase?
After using Opus 4.5 today with my Copilot subscription on OpenCode, I've already reached 100% of my monthly premium requests 😢. Even though I'm an extremely lightweight user when it comes to this.
I remember that about a week ago, I used Opus 4.5 heavily on OpenCode for a full day and it only reached 24% (starting point was 18%) of my monthly premium requests.
It’s a significant increase. A single request ended up using about 10 premium requests of my quota without even running subagents. Usually, Copilot counts a single prompt as one request (factoring in the model multiplier), even if compaction happens. Now, though, each context compaction is being counted as three new, separate user prompts.
We had to adjust the headers because people were getting banned for excessive usage. We are working with the Copilot team on an official integration to ensure things are set up properly.
@rekram1-node Even if you talk with the Copilot team and make the change, that won't bring back the option to use a lot of requests as before, will it, @bowmanjd?
It looks like the header is clearly in use right now. Hence, the trick is gone.
Yes, I am finding that "the trick is gone" as well. From my perspective, this is a good move: OpenCode and Github Copilot pair really well, and I am glad we are moving toward compliance so that this pairing can continue into the future.
The mystery is what happened to some of us with Opus -- was it compaction, or perhaps another plugin? It consumed at least ten times the premium requests that it should have.
I have this correct, right?:
- Every single user-initiated request should consume premium token(s)
- Every single agent-initiated request should not consume any premium tokens. So if any part of the agent pipeline is generating requests but labeling them as "user" -- that would be something to seek out and compensate for in some way
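To make the two rules above concrete, here is a minimal sketch of how mislabeling inflates the bill. It assumes only user-initiated requests are billed and that each one costs the model's multiplier. The type and function names are hypothetical, and the numbers are illustrative (a multiplier of 3 and 62 logged requests, as reported earlier in this thread).

```typescript
// Hypothetical types for illustration; not taken from OpenCode or Copilot.
type Initiator = "user" | "agent";

interface LoggedRequest {
  initiator: Initiator;
}

// Premium requests consumed, assuming only user-initiated requests are billed
// and each billed request costs the model's multiplier.
function premiumRequestsConsumed(requests: LoggedRequest[], multiplier: number): number {
  const billable = requests.filter((r) => r.initiator === "user").length;
  return billable * multiplier;
}

// Two real prompts plus 60 follow-up requests made by the agent.
const followUps: LoggedRequest[] = Array.from(
  { length: 60 },
  (): LoggedRequest => ({ initiator: "agent" }),
);
const correctlyLabeled: LoggedRequest[] = [
  { initiator: "user" },
  { initiator: "user" },
  ...followUps,
];

// The same 62 requests, but with every one of them labeled as "user".
const mislabeled: LoggedRequest[] = correctlyLabeled.map(
  (): LoggedRequest => ({ initiator: "user" }),
);

const expected = premiumRequestsConsumed(correctlyLabeled, 3); // 2 * 3 = 6
const observed = premiumRequestsConsumed(mislabeled, 3); // 62 * 3 = 186

console.log({ expected, observed });
```

Under these assumptions the gap between 6 and 186 comes entirely from the labels, not from anything the user typed.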
@bowmanjd The important point right now is that you disclosed the header trick, and I verified from the PR diff that this is the case. I could now simply build opencode from a 1.1.10-like version, which would, after some time, send the header for user instead of agents, and that should work fine until Copilot themselves do something about it.
What plugins did you have? OmO has some injection logic which may have triggered extra usage. DCP also still had user role injections for a few hours after the .114 update, and those counted under the new usage system.
Everything else you said is correct. The usage check is simple: if the last message in the fetch has the tool or assistant role, it doesn't count toward usage.
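As a rough sketch of the check described above (not the actual plugin code), the rule only looks at the role of the last message. The `ChatMessage` shape, the function name, and the example message contents are assumptions made for illustration.

```typescript
// Hypothetical message shape; real provider payloads carry more fields.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Returns true when the request would count toward premium usage under the
// rule described above: only the role of the last message matters.
function countsTowardUsage(messages: ChatMessage[]): boolean {
  const last = messages[messages.length - 1];
  // Assumption: a request with no messages is not counted.
  if (last === undefined) return false;
  return last.role !== "tool" && last.role !== "assistant";
}

// The agent continuing after a tool result: not counted.
const afterToolCall: ChatMessage[] = [
  { role: "user", content: "Fix the failing test" },
  { role: "assistant", content: "Running the test suite" },
  { role: "tool", content: "1 failed, 12 passed" },
];

// A synthetic prompt injected with role "user" (wording is made up): counted.
const injectedPrompt: ChatMessage[] = [
  ...afterToolCall,
  { role: "user", content: "Summarize the conversation so far." },
];

console.log(countsTowardUsage(afterToolCall)); // false
console.log(countsTowardUsage(injectedPrompt)); // true
```

The second case is the one this thread is about: the check cannot tell a typed prompt from an injected one when both carry the "user" role.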
The detail is that in the compaction process the prompt is sent as a "user" role message, and so is the continuation message. Is there a reason for this? Or could they be sent as "assistant" instead, since it is an internal agent instruction to summarize the conversation?
Here compaction is sent with role: "user": https://github.com/anomalyco/opencode/blob/c87939ad122a79150fa6360dacfdef63ea396b2d/packages/opencode/src/session/compaction.ts#L141-L164
Here it sends continue with role: "user": https://github.com/anomalyco/opencode/blob/c87939ad122a79150fa6360dacfdef63ea396b2d/packages/opencode/src/session/compaction.ts#L167-L176
They can't be sent with the assistant role, as that wouldn't work for all providers/models. It would need to be a GitHub-specific change, or a check within the Copilot plugin so that the compaction prompt does not change x initiator to user.
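One way such a plugin-side check might look is sketched below. Everything here is an assumption: the `synthetic` flag, the function names, and the exact header spelling `x-initiator` (written "x initiator" above) are made up for illustration and are not confirmed by the OpenCode source.

```typescript
type Role = "system" | "user" | "assistant" | "tool";

interface OutgoingMessage {
  role: Role;
  content: string;
  // Hypothetical flag: true when OpenCode or a plugin generated the message
  // (for example a compaction or continuation prompt) rather than the user.
  synthetic?: boolean;
}

type InitiatorValue = "user" | "agent";

// The message keeps role "user" so it still works with every provider;
// only the initiator value is decided differently for synthetic messages.
function initiatorFor(messages: OutgoingMessage[]): InitiatorValue {
  const last = messages[messages.length - 1];
  if (last === undefined) return "agent";
  const typedByUser = last.role === "user" && last.synthetic !== true;
  return typedByUser ? "user" : "agent";
}

// Header name is an assumption based on the comment above.
function initiatorHeader(messages: OutgoingMessage[]): Record<string, string> {
  return { "x-initiator": initiatorFor(messages) };
}

const realPrompt: OutgoingMessage[] = [
  { role: "user", content: "Refactor the parser" },
];
const compactionPrompt: OutgoingMessage[] = [
  ...realPrompt,
  { role: "assistant", content: "Done refactoring." },
  { role: "user", content: "Summarize the conversation so far.", synthetic: true },
];

console.log(initiatorHeader(realPrompt)); // { "x-initiator": "user" }
console.log(initiatorHeader(compactionPrompt)); // { "x-initiator": "agent" }
```

The point of the sketch is that the role sent to the model and the initiator reported to Copilot do not have to be derived from the same field.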
as well as DCP for a few hours after the .114 update still had user role injections which counted with the new usage system.
Why yes! I do have DCP; I just started using it the other day (a couple of days before this debacle). The couple of sessions in which I accrued a high number of premium requests were full of compaction and pruning.
They can't be sent as assistant role as that wouldn't work for all providers/models, it would need to be a github specific change or a check within the copilot plugin for the compaction prompt to not change x initiator to user.
I see.
as well as DCP for a few hours after the .114 update still had user role injections
I don't use DCP or any similar plugin.
Sorry about that. It's fixed (kind of) now, but it definitely would have killed usage for a few hours of .114, so that's your mystery solved. I even saw this Copilot plugin change coming in the dev branch but didn't realize DCP would make it so much worse. I say "kind of" because the fix basically replaced user role messages with assistant role, once again making Copilot consume far less usage than it should; hopefully that recovers some of your lost usage. The real fix, which makes usage consistent with how opencode intended, is coming today.
#8393 will probably solve this.
Official integration is in next release
Excellent! I was reviewing the new plugin. It looks great, and simple, as it should. It defines agentic use as "the message has a role other than 'user'" which makes sense.
Just was hoping for a bit of encouragement -- since it is clearly possible for OpenCode and for plugins to generate synthetic messages with role "user"... is there anything you can say to reassure? Honestly, even if there were a plan to emit a warning if the premium requests per minute (or even over a fraction of a minute) exceeded a certain threshold (just spitballing, but hopefully you get the idea), or at least ensure there is good logging to better ascertain what happened if requests are burned faster than anticipated. Or is there any indicator in OpenCode for when a request is agent-synthesized?
Anyways, looking forward to using the new copilot auth mechanism. Just feeling a little hesitant after losing so many requests the other day. I think we are getting to a good place.
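The threshold warning floated above could look roughly like the sliding-window counter below. This is only a sketch of the idea: the class, its parameters, and the example threshold are hypothetical, and nothing like it is confirmed to exist in OpenCode.

```typescript
// Hypothetical monitor for billed requests; names and thresholds are made up.
class PremiumRequestMonitor {
  private timestamps: number[] = [];
  private readonly maxPerWindow: number;
  private readonly windowMs: number;

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Record one billed request at time `now` (milliseconds). Returns true when
  // the number of requests inside the window exceeds the threshold.
  record(now: number): boolean {
    this.timestamps.push(now);
    const cutoff = now - this.windowMs;
    // Keep only the requests that are still inside the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length > this.maxPerWindow;
  }
}

// Example: warn when more than 3 billed requests land within one minute.
const monitor = new PremiumRequestMonitor(3, 60_000);
const results = [0, 1_000, 2_000, 3_000, 70_000].map((t) => monitor.record(t));

console.log(results); // [false, false, false, true, false]
```

A caller would log a warning (or pause and ask for confirmation) whenever `record` returns true, which would at least make a burst like the one described in this thread visible while it is happening.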
@rekram1-node Thanks for your work on this.
However, looking at the solution implemented, I think this issue is not yet fully solved.
A solution could be allowing to choose which model handles compaction/continuation, as long as OpenCode cannot treat these steps as agent messages.
In the official Copilot extension for VS Code, context‑compaction and continuation steps don’t count as separate user requests. In OpenCode, these prompts are still sent with the "user" role though, which results in much higher quota usage than expected.