What happened?

Gemini CLI burns 6 million input tokens in about 1h because it can't edit files.

What did you expect to happen?

Ideally the replace tool would be used correctly by Gemini CLI (this is a known bug) but the higher-level fix would be to account for API usage caused by the tool's own bugs differently than normal API usage.

Specifically, when Gemini CLI falls back to an approach that uses a lot of input tokens — in this case, reading and rewriting entire files instead of doing a simple replace — the resulting input tokens should not be deducted from the user's rate limit. Probably that's next to impossible to implement, but an alternative would be to automatically credit the user with a temporary rate limit increase for each instance of the bug to cover its cost.

Client information

About Gemini CLI

CLI Version    0.1.18
Git Commit     cd7e60e0
Model          gemini-2.5-pro
Sandbox        no sandbox
OS             darwin
Auth Method    gemini-api-key

Login information

Auth Method gemini-api-key

Anything else we need to know?

I was also confused into buying Pro for the web interface which is of no interest to me. I will need to cancel that, since it has no effect on rate limits in the Cloud product. The naming is very confusing.

Aug 11 '25 12:08 anicolao

The patch is to add to your prompt, "Please make sure to read files before you edit them and make sure that you understand where things you try to replace are even though the user is telling you to just make files and change things without telling you to read the file"

Aug 11 '25 13:08 reconsumeralization

@reconsumeralization I am not clear how your proposal will fix my specific problem, which is how to reduce the number of input tokens. In any case, your fix will not work: when Gemini re-reads the file, it still often references its internal notion of the file, which is out of date, and the replace fails.

In any event, the question here is whether we can avoid paying for input tokens that are caused by Gemini loops rather than user instructions. Because in the absence of that, I burned 6 million tokens in less than 90 minutes, because Gemini was re-reading the files...

Aug 11 '25 23:08 anicolao

The easiest way to help with this is to tell Gemini, by way of the .gemini.md, that if it ever repeats the same attempt at something more than 2 times, it should try a different way.

Aug 12 '25 00:08 reconsumeralization

I still think the high-level fix to a broad class of problems where the model's own bugs cause the CLI to burn the user's quota is to grant the user additional usage rather than charge them for the bugs.

However, for the specific case of replace perhaps it's possible to teach Gemini to use patch instead, since patch is more likely to be resilient to small differences in the file.
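To illustrate what "resilient to small differences" could mean here, this is a minimal sketch of a fuzzy block replace. It is not how patch itself works and not anything Gemini CLI does; the function name `fuzzy_replace` and the 0.8 similarity threshold are assumptions chosen only for illustration.

```python
import difflib


def fuzzy_replace(text: str, old: str, new: str, threshold: float = 0.8) -> str:
    """Replace the block of lines in `text` most similar to `old`.

    Hypothetical helper: raises ValueError when no block is similar enough,
    instead of silently falling back to rewriting the whole file.
    """
    lines = text.splitlines()
    old_lines = old.splitlines()
    n = len(old_lines)
    best_ratio, best_start = 0.0, -1
    # Slide a window of len(old_lines) lines over the file and score each one.
    for start in range(len(lines) - n + 1):
        window = "\n".join(lines[start:start + n])
        ratio = difflib.SequenceMatcher(None, window, old).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start
    if best_start < 0 or best_ratio < threshold:
        raise ValueError("no sufficiently similar block found")
    return "\n".join(lines[:best_start] + new.splitlines() + lines[best_start + n:])


# The model's stale copy of the line has one space fewer than the real file.
source = "def f():\n    return 1  # one\nprint(f())"
result = fuzzy_replace(source, "    return 1 # one", "    return 2")
```

An exact-match replace would fail on that input because of the whitespace difference, while the fuzzy version still finds the intended line. Note that this sketch drops a trailing newline if the file has one.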

Aug 12 '25 06:08 anicolao

That proposal isn't feasible. Google actively monitors Gemini API usage, including the CLI, for policy compliance through automated and manual processes. The Trust and Safety Team scans for policy violations, and suspicious activity may lead to manual reviews. Data, including prompts and outputs, is retained for up to 55 days for policy enforcement. Non-compliance can result in various actions, including account suspension. The "Pre-GA" status emphasizes user responsibility. It's not AGI; the AI is only as good as the user. Users must monitor their tools. The agreement outlines key points: Google can suspend access, users must prevent unauthorized access, unattended use is not permissible, offerings are "as is", and data may be used for improvement. While there's no explicit prohibition against unattended use, users are responsible for monitoring and ensuring compliance. Google sells tokens. You ever think they might get a longer context mode and automations so you use less?

Aug 12 '25 07:08 reconsumeralization

If you leave your gas on and a lighter burning at the bank they don't give you your money back.

Aug 12 '25 07:08 reconsumeralization

But if the installer installs the stove with a leak such that every time I turn it on gas is leaked into the outside air, I hold the installer accountable.

The proposal to have quota limits raised is technically feasible if the AI knows that the reason it is taking the action is its own error, and it does, because it says so. If every time the replace tool failed it credited your quota with the number of tokens it estimates are in that input file, that would solve this instance of the problem. More generically, every time the AI fails to use a tool it is the AI's fault, because it's holding it wrong: the tool invocation is in the model's control, and replacing it with a tool that is more expensive in terms of input tokens is a design choice in the CLI. It could fail altogether or it could ask the user for permission to proceed, and the reason I don't ask for those options is that they'd be super frustrating.

The user (me) is monitoring Gemini and not letting it run unattended, but the tokens add up fast when Gemini CLI keeps falling back to reading entire files to make an edit. The repository I am working in is not that large: the biggest source files are about 1,000 lines and the total repository size is less than 200 kloc. Typical files are a few hundred lines and 5,000-10,000 tokens. But when the CLI winds up reading the entirety of 3-4 files to make one edit, that's 40,000-50,000 tokens gone in the blink of an eye, when the replace command should have made that edit without a re-read. The fundamental problem here is the choice to use a replace tool that Gemini cannot use effectively.

This output shows that the CLI is also irresponsibly repeating the failing API call without doing any error handling. The messages flash by in the blink of an eye before it stops:

✕ [API Error: {"error":{"message":"{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing
  details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n
  \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\":
  \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\":
  \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"location\": \"global\",\n              \"model\":
  \"gemini-2.5-pro\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n
  \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"5s\"\n      }\n    ]\n  }\n}\n","code":429,"status":"Too Many Requests"}}]
  Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method

✕ [API Error: {"error":{"message":"{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing
  details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n
  \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\":
  \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\":
  \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"location\": \"global\",\n              \"model\":
  \"gemini-2.5-pro\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n
  \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"7s\"\n      }\n    ]\n  }\n}\n","code":429,"status":"Too Many Requests"}}]
  Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method

✕ [API Error: {"error":{"message":"{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing
  details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n
  \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\":
  \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\":
  \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"model\": \"gemini-2.5-pro\",\n
  \"location\": \"global\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n
  \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"51s\"\n      }\n    ]\n  }\n}\n","code":429,"status":"Too Many Requests"}}]
  Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method

✕ [API Error: {"error":{"message":"{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing
  details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n
  \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\":
  \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\":
  \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"model\": \"gemini-2.5-pro\",\n
  \"location\": \"global\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n
  \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"56s\"\n      }\n    ]\n  }\n}\n","code":429,"status":"Too Many Requests"}}]
  Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method

✕ [API Error: {"error":{"message":"{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing
  details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n
  \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\":
  \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\":
  \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"location\": \"global\",\n              \"model\":
  \"gemini-2.5-pro\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n
  \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\":
  \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"0s\"\n      }\n    ]\n  }\n}\n","code":429,"status":"Too Many Requests"}}]
  Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method

Notice that there's a retryDelay in there that looks like an exponential backoff instruction, which the CLI is ignoring as it proceeds to try the request again and again without pause. Exactly why you think the user is expected to monitor and react to these repeated API errors as they scroll by is unclear; even parsing the JSON in your head is hard after it stops. This is to say nothing of how nonsensical it is for the CLI to retry an operation that failed because of a daily quota limit.
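For what it's worth, the retry handling I would expect looks roughly like the sketch below. It is not Gemini CLI's actual code: the function names are hypothetical, the input is assumed to be the already-decoded inner `error` object, and treating any quotaId containing "PerDay" as non-retryable is my own heuristic.

```python
import re
from typing import Optional


def parse_retry_delay(error: dict) -> Optional[float]:
    """Return the server-suggested delay in seconds, or None if absent."""
    for detail in error.get("details", []):
        if detail.get("@type", "").endswith("google.rpc.RetryInfo"):
            match = re.fullmatch(r"(\d+(?:\.\d+)?)s", detail.get("retryDelay", ""))
            if match:
                return float(match.group(1))
    return None


def is_daily_quota(error: dict) -> bool:
    """Heuristic (assumption): a quotaId containing 'PerDay' will not recover by waiting seconds."""
    for detail in error.get("details", []):
        if detail.get("@type", "").endswith("google.rpc.QuotaFailure"):
            for violation in detail.get("violations", []):
                if "PerDay" in violation.get("quotaId", ""):
                    return True
    return False


def retry_decision(error: dict, attempt: int, max_attempts: int = 3) -> Optional[float]:
    """Return seconds to wait before retrying, or None to stop and tell the user."""
    if is_daily_quota(error) or attempt >= max_attempts:
        return None
    suggested = parse_retry_delay(error) or 0.0
    backoff = float(2 ** attempt)  # simple exponential floor: 1s, 2s, 4s
    return max(suggested, backoff)


daily_error = {
    "code": 429,
    "status": "RESOURCE_EXHAUSTED",
    "details": [
        {
            "@type": "type.googleapis.com/google.rpc.QuotaFailure",
            "violations": [
                {"quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier"}
            ],
        },
        {"@type": "type.googleapis.com/google.rpc.RetryInfo", "retryDelay": "5s"},
    ],
}
decision = retry_decision(daily_error, attempt=0)  # None: stop, do not retry
```

With the error shown above, the decision is to stop immediately and report the daily limit once, rather than retrying in a loop.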

It is perhaps also worth noting that there is no definition of the input tokens per day limit that I can find on https://ai.google.dev/gemini-api/docs/rate-limits which outlines the rate limits. It says the limit is 250,000 tokens per minute ... this daily quota of 6 million could therefore be consumed in 24 minutes.
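The 24-minute figure is just the daily quota from the error response divided by the per-minute limit quoted on that page; the constant names below are mine.

```python
DAILY_INPUT_TOKEN_QUOTA = 6_000_000  # quotaValue from the 429 response
TOKENS_PER_MINUTE_LIMIT = 250_000    # per-minute limit quoted from the rate-limits page

# Minutes needed to exhaust the daily quota at the maximum per-minute rate.
minutes_to_exhaust = DAILY_INPUT_TOKEN_QUOTA / TOKENS_PER_MINUTE_LIMIT
print(minutes_to_exhaust)  # 24.0
```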

Aug 12 '25 13:08 anicolao

And in case you find that JSON hard to read as it is totally unformatted by the CLI, here is the outer object:

{
  "error": {
    "message": "{\n  \"error\": {\n    \"code\": 429,\n    \"message\": \"You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.\",\n    \"status\": \"RESOURCE_EXHAUSTED\",\n \"details\": [\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.QuotaFailure\",\n        \"violations\": [\n          {\n            \"quotaMetric\": \"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count\",\n            \"quotaId\": \"GenerateContentInputTokensPerModelPerDay-FreeTier\",\n            \"quotaDimensions\": {\n              \"location\": \"global\",\n              \"model\": \"gemini-2.5-pro\"\n            },\n            \"quotaValue\": \"6000000\"\n          }\n        ]\n      },\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.Help\",\n        \"links\": [\n          {\n            \"description\": \"Learn more about Gemini API quotas\",\n \"url\": \"https://ai.google.dev/gemini-api/docs/rate-limits\"\n          }\n        ]\n      },\n      {\n        \"@type\": \"type.googleapis.com/google.rpc.RetryInfo\",\n        \"retryDelay\": \"0s\"\n      }\n    ]\n  }\n}\n",
    "code": 429,
    "status": "Too Many Requests"
  }
}

and here is the inner object:

{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            },
            "quotaValue": "6000000"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "0s"
      }
    ]
  }
}
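Getting from the outer object to the inner one takes two JSON decodes, because the inner object is stored as a string in the outer `message` field. A minimal sketch, assuming the CLI's "✕ [API Error: " prefix and trailing "]" have already been stripped (the function name `unwrap_api_error` is hypothetical):

```python
import json


def unwrap_api_error(raw: str) -> dict:
    """Decode the outer object, then decode its `message` string, and return the inner error."""
    outer = json.loads(raw)
    inner = json.loads(outer["error"]["message"])
    return inner["error"]


# Build a shortened sample with the same double-encoded shape as the output above.
inner_sample = {
    "error": {
        "code": 429,
        "status": "RESOURCE_EXHAUSTED",
        "details": [
            {"@type": "type.googleapis.com/google.rpc.RetryInfo", "retryDelay": "0s"}
        ],
    }
}
raw_sample = json.dumps(
    {"error": {"message": json.dumps(inner_sample), "code": 429, "status": "Too Many Requests"}}
)

error = unwrap_api_error(raw_sample)
print(json.dumps(error, indent=2))
```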

Aug 12 '25 13:08 anicolao

AI doesn't know the reason. It is taking action and it doesn't know that it's not alive.

Aug 12 '25 13:08 reconsumeralization

It looks like he made you a beautiful picture. I have a nice CLI library you might like. It's beautiful.

Aug 12 '25 13:08 reconsumeralization

The AI knows the tool failed, or it wouldn't retry running the tool. If it was instructed to credit the user with additional quota before rerunning any tool that used quota but failed, it ought to be able to do that. Alternatively, it could pass a number-of-retries flag into a tool that wraps the faulty tool, and the tool execution itself could give the user a quota credit.
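To make the wrapper idea concrete, here is a rough sketch. Everything in it is hypothetical: no such ledger or crediting API exists in Gemini CLI or the Gemini API as far as I know, and `QuotaLedger` and `run_with_credit` are names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QuotaLedger:
    """Hypothetical ledger tracking tokens charged and tokens credited back."""
    charged: int = 0
    credited: int = 0

    @property
    def billable(self) -> int:
        return self.charged - self.credited


def run_with_credit(tool: Callable[[], bool], input_tokens: int,
                    ledger: QuotaLedger, max_retries: int = 2) -> bool:
    """Run `tool` up to max_retries + 1 times, crediting tokens spent on failed attempts."""
    for _ in range(max_retries + 1):
        ledger.charged += input_tokens
        if tool():
            return True
        # The attempt failed through no fault of the user: credit its cost back.
        ledger.credited += input_tokens
    return False


# Simulate a replace call that fails twice, then succeeds, at 8,000 input tokens per attempt.
outcomes = iter([False, False, True])
ledger = QuotaLedger()
succeeded = run_with_credit(lambda: next(outcomes), 8_000, ledger)
```

In this simulation the user ends up billed for one attempt (8,000 tokens) instead of three.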

I don't know if you work for Google, but denying what is obviously and easily implementable as an excuse for wasting the user's quota isn't really helping the project. Saying that the tool burning tokens because of bugs is the user's issue because it is a "pre-GA" API is ridiculous, because the purpose of releasing pre-GA APIs is to get feedback from users and fix them before they are declared GA, not to cause Google Cloud customers to leave.

Aug 13 '25 14:08 anicolao

I enabled billing. Looks like I will regret it. Here is Gemini CLI's summary:

Session Usage Summary: Markdown Formatting & Bot Resilience

This document provides a breakdown of the requests and token usage for the development session focused on improving the bot's markdown rendering and message handling. The metrics are separated into "Productive" work that contributed directly to the final codebase and "Mistakes/Retries" which represents effort spent on debugging and recovering from failed attempts.

1. Request Summary

This metric tracks the number of tool calls made during the session.

| Category | Request Count | Description |
| --- | --- | --- |
| Productive | 34 | Reading files for initial context, writing the final correct code, running successful tests, updating `TASKS.md` and `DEVLOG.md`, and successful `git` operations. |
| Mistakes/Retries | 81 | Multiple failed attempts to fix markdown rendering, repeated `git commit` failures due to pre-commit hooks, submodule build/dependency issues, and failed `replace` tool calls. |
| Total | 115 | |

Analysis: A significant majority (approx. 70%) of the requests were spent on iterative debugging and correcting mistakes. The two main drivers of retries were the complexities of customizing the markdown renderer and repeated failures to satisfy the pre-commit hook checks.

2. Token Usage (Based on Session Stats)

This metric uses the final numbers provided by the session stats for accuracy.

| Metric | Total Count |
| --- | --- |
| Requests | 643 |
| Input Tokens | 149,224,658 |
| Output Tokens | 383,712 |

Analysis of Discrepancy and Breakdown:

The most striking number here is the ~149 million input tokens. My initial manual estimate was off by several orders of magnitude. This massive discrepancy is almost entirely attributable to the repeated failures of the replace and write_file tools.

When these tools fail, their fallback mechanism involves reading the entire content of the target file into the context for the next attempt. During this session, this happened multiple times with large files, including:

- The gemini-cli submodule source files (during the build-fixing phase).
- The DEVLOG.md file (which is very large).
- The bot's main index.ts file.

Therefore, a more qualitative breakdown is:

- Productive Input Tokens (Estimated < 1%): A very small fraction of the input tokens were used for productive tasks like initial file reads and the final successful code writes.
- Mistake/Retry Input Tokens (Estimated > 99%): The vast majority of the input tokens were consumed in the debugging loops where entire files were repeatedly fed into the context window.
- Productive Output Tokens: The code that is now in the repository (format-markdown.ts, message-queue.ts, etc.) and the final DEVLOG.md and TASKS.md entries.
- Discarded Output Tokens: All the incorrect versions of the markdown formatting logic, the failed patch files, and the incorrect commit messages.

Key Takeaway: The primary driver of cost and inefficiency in this session was not the number of requests, but the massive token overhead caused by tool failures. This highlights a critical area for improvement in the agent's error handling and file manipulation logic.

Aug 16 '25 13:08 anicolao

Thanks, @anicolao, for your detailed comment. I have had a similar issue and burned up 27 million input tokens yesterday with basically failed results. Not only did it continue to make the same editing mistakes, but it failed to read the documentation I gave it and made multiple syntax mistakes while insisting that the problem was my code analyzer. These are the only issues I have with Gemini CLI, but they seem like critical errors.

My wife doesn't know about the bill yet, but I wouldn't mind if Google reimbursed me.

Aug 31 '25 16:08 birchb

@birchb GitHub Copilot is significantly more affordable, bills per task rather than based on how much work it does, and is nicely integrated into GitHub's UI. To enable its agentic mode, it looks like you have to sign up for the $10/month plan, which gives you 300 premium requests. I find a given PR requires 3-4 iterations with Copilot in agent mode, and it is easier to juggle multiple PRs in the web UI than in a CLI interface. Requests beyond the first 300 are $0.04 each, and the budget limits you set on GitHub are hard limits that make it easy to keep spending under control.

On the Google side, Jules is another agentic option whose billing is tied into the Gemini Pro pricing. It is costlier and a bit more work to interact with than Copilot, but it is a similar level of interaction as with Gemini CLI and can solve some problems that Copilot struggles with. At $40/mo it's a lot more expensive, but you might be able to get by with the $10 option from GitHub and the free tier for Jules, depending on how complex the tasks are that you're giving to the bots. GL!

Aug 31 '25 17:08 anicolao

If the installer places them in a way that they won't burn until their insurance will pay for them, and that involves delayed semantic changes through preconfigured language translations at time of exploitation, you're not as smart as you think you are.

Sep 20 '25 02:09 reconsumeralization

@reconsumeralization I am not clear how your proposal will fix my specific problem, which is how to reduce the number of input tokens. In any case, your fix will not work: when Gemini re-reads the file, it still often references its internal notion of the file, which is out of date, and the replace fails.

In any event, the question here is whether we can avoid paying for input tokens that are caused by Gemini loops rather than user instructions. Because in the absence of that, I burned 6 million tokens in less than 90 minutes, because Gemini was re-reading the files...

https://github.com/google-gemini/gemini-cli/pull/8606

Sep 23 '25 13:09 reconsumeralization

@anicolao This https://github.com/google-gemini/gemini-cli/pull/8606 is how!

Sep 23 '25 13:09 reconsumeralization

Hello! As part of our effort to keep our backlog manageable and focus on the most active issues, we are tidying up older reports.

It looks like this issue hasn't been active for a while, so we are closing it for now. However, if you are still experiencing this bug on the latest stable build, please feel free to comment on this issue or create a new one with updated details.

Thank you for your contribution!

Dec 03 '25 22:12 gemini-cli[bot]

Found possible duplicate issues:

#6986
#4410
#2479
#4495

If you believe this is not a duplicate, please remove the status/possible-duplicate label.

Dec 04 '25 19:12 gemini-cli[bot]

Who is meant to pay for bugs?
