Agent tool repetitive output

Open XiaoConstantine opened this issue 1 year ago • 2 comments

Describe the bug: When using mistral:latest as the model, I get repetitive output from agents. Example: Screenshot 2024-04-05 at 12 15 22 PM

I didn't notice this behavior with hermes-2-pro-mistral tho

XiaoConstantine avatar Apr 05 '24 16:04 XiaoConstantine

Probably related to issues on the ollama side, where the mistral model's performance has degraded? Maybe it's worth thinking about a short circuit for when the model's response is not going to help?

XiaoConstantine avatar Apr 05 '24 17:04 XiaoConstantine

I think this really comes down to fine-tuning prompts for each model. I already planned to make all internal prompts editable from the web interface. Maybe we should add a warning next to "unsupported" models.

How would you implement a short circuit? Just returning an error after n times the same response?
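One way to picture the "error after n identical responses" idea is the rough Go sketch below. Go is used here as an assumption about the backend language, and `RepeatGuard`, `NewRepeatGuard`, `Observe`, and `ErrRepeatedOutput` are hypothetical names made up for illustration, not existing LLocalSearch or ollama code.

```go
package main

import (
	"errors"
	"strings"
)

// ErrRepeatedOutput is a hypothetical sentinel error returned once the model
// has produced the same response too many times in a row.
var ErrRepeatedOutput = errors.New("model repeated the same response too many times")

// RepeatGuard counts consecutive identical responses (hypothetical helper).
type RepeatGuard struct {
	limit int    // number of identical responses in a row that trips the guard
	last  string // most recent response, whitespace-trimmed
	count int    // how many times in a row `last` has been seen
}

// NewRepeatGuard returns a guard that trips after `limit` identical responses.
// A limit of 2 or more is assumed; a limit of 1 would trip on the first response.
func NewRepeatGuard(limit int) *RepeatGuard {
	return &RepeatGuard{limit: limit}
}

// Observe records one response and returns ErrRepeatedOutput once the same
// response (ignoring leading and trailing whitespace) has been seen `limit`
// times in a row. Any different response resets the counter.
func (g *RepeatGuard) Observe(response string) error {
	normalized := strings.TrimSpace(response)
	if g.count > 0 && normalized == g.last {
		g.count++
	} else {
		g.last = normalized
		g.count = 1
	}
	if g.count >= g.limit {
		return ErrRepeatedOutput
	}
	return nil
}
```

The agent loop would call `Observe` on every model response and stop as soon as it returns a non-nil error. Exact string comparison is the simplest possible check; near-duplicate responses would need a fuzzier comparison.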

nilsherzig avatar Apr 05 '24 20:04 nilsherzig

Warning on unsupported models sounds good.

Regarding the short circuit, I think it should probably be done at the agent level, e.g. plan? It doesn't seem like something the caller can inject easily (correct me if I'm wrong here).

XiaoConstantine avatar Apr 08 '24 01:04 XiaoConstantine

Yes, this is something I would have to handle in the callback logic of each step. At the moment, callbacks only push new logs to the web interface, but I see no reason why they couldn't be used to short circuit something like that.

A university professor of mine promised me access to more powerful hardware, which would allow me to implement and regularly run a test harness to catch things like this and develop solutions for them.

nilsherzig avatar Apr 08 '24 06:04 nilsherzig

Sounds good! Closing this for now

XiaoConstantine avatar Apr 09 '24 18:04 XiaoConstantine