Need AI Request / Token counter
Pre-submit Checks
- [x] I have searched Warp feature requests and there are no duplicates
- [x] I have searched Warp docs and my feature is not there
Describe the solution you'd like?
You have a limit on the number of AI calls you can make, based on your account type / subscription level. That is fine: AI is an expensive resource to host, and you need to be able to cover costs. However, the number of AI calls that have been made is completely opaque at the moment.
I did some searching to see if there is a way of displaying the AI request usage count from the settings menu somewhere on the main UI, but I didn't find anything.
I'd like to see this counter on the toolbar so I am aware of how many requests I have used and how many I have left, without having to constantly check the settings tab.
This really became an issue when I ran out of credits in the middle of a complex project. Not having this feature is crippling.
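To make the idea concrete, here's a minimal sketch of the kind of display I have in mind. It's purely illustrative: `get_ai_usage()` and its fields are hypothetical stand-ins for wherever Warp keeps the count already shown in Settings, not a real API.

```python
# Hypothetical sketch of a toolbar usage indicator.
# `get_ai_usage` is a made-up stand-in for whatever internal call
# backs the count already shown on the Settings page.

from dataclasses import dataclass

@dataclass
class AiUsage:
    used: int   # requests consumed this billing cycle
    limit: int  # requests allowed by the current plan

def get_ai_usage() -> AiUsage:
    # Placeholder: in reality this would come from Warp's backend.
    return AiUsage(used=285, limit=1000)

def toolbar_label(usage: AiUsage) -> str:
    """Format a compact 'used / limit (remaining)' indicator."""
    remaining = usage.limit - usage.used
    return f"AI: {usage.used}/{usage.limit} ({remaining} left)"

print(toolbar_label(get_ai_usage()))  # -> "AI: 285/1000 (715 left)"
```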
Is your feature request related to a problem? Please describe.
It is very frustrating when I am using a metered resource (AI requests) but it isn't obvious what the consumption rate is.
Additional context
This is a fantastic product, it is going to be the future of development.
Operating system (Optional, if relevant)
Windows
How important is this feature to you?
5 (Can't work without it!)
Warp Internal (ignore) - linear-label:39cc6478-1249-4ee7-950b-c428edfeecd1
None
I came here to make the same request. In addition, it would help users make judicious use of their available AI requests if there were options in Settings to display the following in the terminal itself (a sketch of the data this would need follows the list):
- The number of AI requests used while executing a particular prompt, shown in a single line below the block.
- A summary of the number of AI requests used yesterday or on a particular date (at least for that particular profile).
- A list of recently used prompts with the number of AI requests each one made, with an option to sort by that count.
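All three displays could be backed by the same thing: a per-prompt log with a timestamp and a request count. A rough sketch, where the record layout and function names are entirely made up for illustration:

```python
# Hypothetical sketch: a per-prompt usage log that could back all three
# displays above. The record shape and function names are invented.

from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class PromptUsage:
    prompt: str
    when: datetime
    ai_requests: int  # requests consumed while executing this prompt

def requests_on(log: list[PromptUsage], day: date) -> int:
    """Summary of AI requests used on a particular date."""
    return sum(u.ai_requests for u in log if u.when.date() == day)

def recent_prompts(log: list[PromptUsage], by_cost: bool = False) -> list[PromptUsage]:
    """Recently used prompts, optionally sorted by requests used."""
    key = (lambda u: u.ai_requests) if by_cost else (lambda u: u.when)
    return sorted(log, key=key, reverse=True)

log = [
    PromptUsage("fix failing tests", datetime(2025, 4, 12, 9, 30), 14),
    PromptUsage("refactor parser", datetime(2025, 4, 12, 14, 5), 42),
    PromptUsage("write README", datetime(2025, 4, 13, 11, 0), 3),
]
print(requests_on(log, date(2025, 4, 12)))          # -> 56
print(recent_prompts(log, by_cost=True)[0].prompt)  # -> "refactor parser"
```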
Thanks for this feature request!
To anyone else interested in this feature, please add a 👍 to the original post at the top to signal that you want this feature, and subscribe if you'd like to be notified.
The token accounting in general is a bit suspect. I'm getting token-limit warnings and I only signed up for Turbo a couple of hours ago; there is no way I've sent 100M tokens. Is it an API issue? Who knows. There's no way to see anything about what's happening, which makes it hard to gauge how useful all this is. I'm a heavy Claude Code user, and while the o1 planner is very nice, the file editor is constantly breaking files, which causes a kind of token-munching loop: it breaks a file, finds the break, then fixes that while causing another break. Claude used to do this too, but it's much better now. In any case, without even the slightest indicator of how many tokens I'm using, it's applying diffs that break things and then spending tokens to fix them, driving me toward a token limit at a pace I can't monitor, even if this is dramatically cheaper than Claude. Hard to rely on that.
Token usage needs to be more transparent.
related: https://github.com/warpdotdev/Warp/issues/6304 https://github.com/warpdotdev/Warp/issues/6127
I was also surprised by this. I mistook 100M tokens for the context limit on individual transactions. A reading failure on my part, I guess, but sloppy communication on Warp's part, too.
@ipsod I guess it's a reasonable mistake, but 100M tokens is a LOT of tokens; it would be shocking if the context were 100M tokens. For scale, Google Gemini 2.5 offers 1M, and that's industry-leading. I think there's a bunch of confusion going on in general. I found that after a fair bit of use it eventually exhausts something (likely context, not the actual token limit), and ending the chat and starting another fixes it, but that isn't great. They probably need a bit of time to learn the lessons others have learned. Claude Code used to just bump you out of the context when it got full; now it auto-compacts, keeping the relevant details, which mostly lets you just continue.
@ruzz Sure, I understood how I'd been confused, once I realized that I was confused.
It seems like a very common mistake to make, though. Lots of us don't know by heart whether we're dealing with hundreds of thousands or millions, so it's easy to think you know what "100M tokens" means when you don't.
https://www.warp.dev/pricing
- Up to 1,000 AI requests per month
- 35M Tokens
- Unlimited shared Notebooks and Workflows in Warp Drive
- Unlimited real-time session sharing
- Private email support
Pretty easy to scan this and get the wrong idea.
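Part of the confusion may be that a plan carries two independent ceilings, requests and tokens, and hitting either one blocks you. Here's a toy model of that reading; it's purely my guess at the semantics, not Warp's actual billing logic:

```python
# Toy model of two independent quotas; a guess at the semantics,
# not Warp's actual billing logic. Limits taken from the Pro pricing above.

def can_run(requests_used: int, tokens_used: int,
            request_limit: int = 1000, token_limit: int = 35_000_000) -> bool:
    """A request is allowed only if BOTH quotas still have headroom."""
    return requests_used < request_limit and tokens_used < token_limit

# Well under the request limit, but tokens exhausted -> blocked anyway,
# which would match the "Token limit exceeded" reports further down.
print(can_run(requests_used=285, tokens_used=35_000_000))  # -> False
```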
I feel like a credits counter would make it immediately apparent, though. And it would be useful besides: you'd learn more about how much each of your requests costs, so you'd learn to be more efficient, and you'd be able to govern your usage better.
@ipsod I agree it would help, no question. You need some sense of how close to the limits you are if you're about to begin something that might put you over before it finishes. I think the Warp team will get it sorted.
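That kind of headroom check is easy to picture once a counter exists: extrapolate the current pace and see whether it clears the quota before the cycle resets. The numbers and helper below are invented purely for illustration:

```python
# Hypothetical burn-rate check: will the current pace exhaust the
# monthly quota before the billing cycle resets? All values invented.

def projected_usage(used: int, days_elapsed: int, days_in_cycle: int = 30) -> float:
    """Extrapolate current consumption linearly to the end of the cycle."""
    return used / days_elapsed * days_in_cycle

used, limit = 285, 1000
projection = projected_usage(used, days_elapsed=7)
print(f"Projected: {projection:.0f}/{limit} requests by cycle end")
if projection > limit:
    print("On pace to run out; slow down or plan for an upgrade.")
```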
@dannyneira Can we please get the title changed to include "tokens", so that people searching for this issue can find it more easily?
Actually, on re-reading, maybe the "token" conversation here is misplaced?
My requests for the month are at 285/1000, but I'm getting the following message regardless of the length of my query: "Token limit exceeded: Please try again with a shorter query." I'm a Pro user. Am I out of monthly tokens? Is it a bug? There is no visibility, and paying for a feature I can't depend on seems foolish to me. I'm on a roll with something and it just stops working. I will cancel my Warp subscription if this isn't resolved and made transparent soon.
I'm having a similar issue with the Turbo plan now. I'm on a roll with a project this weekend and ran into the token limit problem. I've only been a subscriber for about a week or so. 😄
@ksylvan We recently introduced the Lite model on the Turbo plan, which can be used once the token limits have been reached to continue using unlimited AI requests. https://docs.warp.dev/features/warp-ai/agent-mode#what-is-lite
That's great, but a token counter still makes sense, so you know how many of the 100 million tokens on the Turbo plan you've used.
Not knowing where you stand is just bad UX.
I'm of the same mind as @C947326
@dannyneira I love Warp and I appreciate what you guys are rolling out -- and I'd love to know where I am on my token limits.
It's a moot point, now, for Turbo accounts.
From an email:
Starting May 1, 2025, Turbo will switch to a simpler and more transparent model: 3,000 AI requests per month and unlimited access to Lite models.
It's possibly still needed for other accounts, though? @dannyneira can you comment?
I don't think it is a moot point for any account, unless the account is unconditionally 100% unlimited, with no strings or substitutions of any sort.
If you are selling people something, just tell them what they have used. Anything else is being deceptive.
Well, there's already a request counter, and on Turbo accounts, at least, tokens are no longer counted.
I didn't use Warp that much this month, and today, in the middle of the month, I got a surprising message about exceeded token limits. I'm subscribed to the Team plan, where the Lite version doesn't work. Having no AI counter means I can't control how many tokens are consumed by particular jobs and models.
Had this exact problem.
Plans would sometimes work extraordinarily well, and sometimes they would dissolve into endless loops of causing a bug, then fixing it while simultaneously causing a new one (stop randomly removing brackets!). I had the Turbo plan and ran out of tokens in the middle of a project.
I really loved parts of Warp, but not knowing how many tokens were left, and then being stuck without the ability to get more, was terrible.
When Warp goes into a loop of self-correcting its own introduced bugs, that should not count against the monthly token limit.
I am on the Teams plan, and last month I spent all my tokens in a week, mostly on Warp looping over self-introduced bugs from the merge_tool and replace_in_file tools not working correctly: leaving old code inside the file, writing new code on top, figuring out that the file has bugs (new code plus old code), adding newer code on top while leaving the older code at the bottom, finding that the file has errors again, and so on.
Yep, same looping as in the message above. Please optimize the context of the "next action" on the user side. I'd also like to be able to connect my self-hosted model to Warp 👍 and handle the less important tasks with it.
Hi folks, we recently added a little "i" chip with info on the latest request cost.
Hover over it to see info on the request cost for that interaction. Note that it's only shown for the latest one, and it disappears once you continue the conversation, so as not to litter the block history. It's also a chance for you to 👍 or 👎 the response to help us improve.
See more details on tracking AI requests in our docs. https://docs.warp.dev/support-and-billing/plans-and-pricing/ai-requests
Closing as resolved.