
Token spending doubled after updating the gemini plugin 0.2.1

Open juniorsereno opened this issue 8 months ago • 3 comments

Self Checks

  • [x] This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit issues in English, or they will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Dify version

1.4.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I'm using Gemini 2.5 Flash (standard use) as the LLM and see that input token usage has increased regardless of whether an image is involved.

Note this is an existing chatflow before the update.

@QuantumGhost Can you take a look?

Plugin version 0.1.6 — 5,000 token usage: [screenshot]

Plugin version 0.2.1 — 44,000 token usage: [screenshot]

✔️ Expected Behavior

normal token usage.

❌ Actual Behavior

Roughly 9x more input token usage.

juniorsereno avatar May 15 '25 16:05 juniorsereno

The increase in token usage after updating to the Gemini plugin version 0.2.1 could be related to changes in the plugin's configuration settings. Specifically, the max_tokens parameter was updated from 2048 to 8192, which directly affects the maximum output token count that can be used. Additionally, the context_size was increased from 30720 to 1048576, potentially impacting token utilization [1][2][3]. These changes might be contributing to the observed increase in token consumption. You may want to review and adjust these settings to manage token usage more effectively.
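
For context, those parameters live in the plugin's per-model YAML definition. Below is a hypothetical sketch of the relevant fields (the field names follow Dify's model-schema conventions; the exact values and file layout may differ between plugin versions):

```yaml
# Hypothetical excerpt from a Gemini model definition in the plugin.
# context_size and the max_tokens rule are the settings referred to above;
# the surrounding layout is illustrative, not copied from the plugin source.
model: gemini-2.5-flash
model_type: llm
model_properties:
  mode: chat
  context_size: 1048576    # reportedly raised from 30720 in 0.2.x
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    default: 8192           # reportedly raised from 2048 in 0.2.x
    min: 1
    max: 8192
```

Note that max_tokens only caps output length, so on its own it would not explain a 9x jump in *input* tokens.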



dosubot[bot] avatar May 15 '25 16:05 dosubot[bot]

I feel like there's an error in the token calculation. 19 requests consumed 73 million tokens, which scared me. Then I checked the data on Gemini, and the actual usage was 650k. It's really a bit exaggerated.

PS: I'm using Dify Cloud.

Dify analysis: [screenshot]

Gemini usage: [screenshot]

LarchLiu avatar May 18 '25 04:05 LarchLiu

same here...

princehaku avatar May 30 '25 14:05 princehaku

Thank you for your continued support. We are experiencing the same issue. The plugin version is 0.2.3.

Here is Dify's graph: [screenshot]

Here is the Gemini API's graph: [screenshot]

It seems that output token counts are recorded about 1,000 times higher on Dify.

The token calculation status at that time was as follows.

"usage": {
  "prompt_tokens": 213885,
  "prompt_unit_price": "1.25",
  "prompt_price_unit": "0.000001",
  "prompt_price": "0.2673563",
  "completion_tokens": 267618,
  "completion_unit_price": "10",
  "completion_price_unit": "0.000001",
  "completion_price": "2.67618",
  "total_tokens": 481503,
  "total_price": "2.9435363",
  "currency": "USD",
  "latency": 40.789585202001035
},

In the application that ran this, the output limit was set to 65,535 tokens, yet the reported completion_tokens was roughly four times that number. We hope this information is helpful to the contributors.
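
One way to sanity-check the payload above: Dify's reported prices are internally consistent with its (inflated) token counts, i.e. price = tokens × unit_price × price_unit. A minimal Python check, assuming 7-decimal half-up rounding (the rounding rule is my assumption, inferred from the payload, not taken from Dify's source):

```python
# Reconstruct each price field from its token count and unit prices,
# assuming price = tokens * unit_price * price_unit, rounded to 7 decimals.
from decimal import Decimal, ROUND_HALF_UP

def price(tokens: int, unit_price: str, price_unit: str) -> Decimal:
    """Recompute a Dify price field from its inputs (hypothetical helper)."""
    raw = Decimal(tokens) * Decimal(unit_price) * Decimal(price_unit)
    return raw.quantize(Decimal("0.0000001"), rounding=ROUND_HALF_UP)

prompt_price = price(213885, "1.25", "0.000001")    # 0.2673563, matches payload
completion_price = price(267618, "10", "0.000001")  # 2.6761800, matches payload
total_price = prompt_price + completion_price       # 2.9435363, matches payload
```

The arithmetic checks out, which suggests the pricing pipeline is fine and the bug is in the token counts themselves.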

tomy-kyu avatar Jun 02 '25 04:06 tomy-kyu

Gemini plugin 0.2.4 still has this issue. I think it depends on the model used. For instance, Gemini 2.5 Flash gives correct token usage and prices. However, Gemini 2.5 Pro has this issue. [Incorrect prompt_tokens and completion_tokens]

{
  "text": "สวัสดีครับ ยินดีต้อนรับ! มีเรื่องใดที่อยากสอบถามหรือศึกษาต่อเป็นพิเศษบ้างครับ ผมพร้อมช่วยค้นคว้าและอธิบายให้เข้าใจอย่างละเอียดครับ",
  "usage": {
    "prompt_tokens": 200,
    "prompt_unit_price": "1.1",
    "prompt_price_unit": "0.000001",
    "prompt_price": "0.00022",
    "completion_tokens": 259,
    "completion_unit_price": "4.4",
    "completion_price_unit": "0.000001",
    "completion_price": "0.0011396",
    "total_tokens": 459,
    "total_price": "0.0013596",
    "currency": "USD",
    "latency": 7.402578802779317
  },
  "finish_reason": "stop",
  "files": []
}

The "text" output above is approximately 120 tokens, but Dify reported completion_tokens of 259! The prompt_tokens value is also inaccurate.

There is another issue related to the Gemini plugin. [No price returned.]

{
  "text": "สวัสดี! It's a pleasure to receive your message. How may I assist you today? I'm ready to help in any way that I can, so please feel free to ask anything.\n",
  "usage": {
    "prompt_tokens": 1506,
    "prompt_unit_price": "0",
    "prompt_price_unit": "0.000001",
    "prompt_price": "0",
    "completion_tokens": 43,
    "completion_unit_price": "0",
    "completion_price_unit": "0.000001",
    "completion_price": "0",
    "total_tokens": 1549,
    "total_price": "0",
    "currency": "USD",
    "latency": 5.071159431710839
  },
  "finish_reason": "STOP",
  "files": []
}

Here the completion_tokens is quite accurate, but the prompt_price and completion_price are 0!

Please fix this plugin.

boonta avatar Jun 20 '25 05:06 boonta

I once saw an insane token count of up to 10M... definitely a bug.

CooGen-hub avatar Jun 23 '25 01:06 CooGen-hub

A few months after reporting this bug, nothing has been fixed, even though the plugin has had multiple updates.

Is this caused by the Gemini plugin, or is it a bug in the LLM node?

Can anyone clarify this issue? I also have problems with multiple providers, e.g. OpenRouter and AWS Bedrock, so the model configurations may have issues as well. When I use OpenRouter with Gemini 2.5 Pro, it always responds with thinking even when thinking is disabled.

Tanaboon-KU avatar Jul 29 '25 05:07 Tanaboon-KU

Hi, @juniorsereno. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.

Issue Summary:

  • You reported a sharp increase in input token usage after updating the Gemini plugin from v0.1.6 to v0.2.1 in Dify 1.4.0, with token counts rising about ninefold despite no change in usage.
  • Community members noted configuration changes like increased max_tokens and context_size as possible causes.
  • Others observed discrepancies between Dify's token accounting and Gemini API's actual usage, suggesting token counts are significantly exaggerated in Dify.
  • The issue varies by model, with some showing inaccurate token and price reporting.
  • The problem remains unresolved after months and multiple plugin updates, with uncertainty whether the root cause lies in the Gemini plugin or the LLM node.

Next Steps:

  • Please let me know if this issue is still relevant with the latest version of Dify by commenting here to keep the discussion open.
  • Otherwise, I will automatically close this issue in 15 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Sep 01 '25 16:09 dosubot[bot]