holmesgpt Got error when handling a large file

Log file size: 11 MB
When asking with a file, an error was raised:

BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'messages[1].content': string too long. Expected a string with maximum length 1048576, but got a string with length 11282561 instead.",
'type': 'invalid_request_error', 'param': 'messages[1].content', 'code': 'string_above_max_length'}}

What is the maximum size allowed? How can I handle a large file?

Jun 28 '24 03:06 tommy04062019

[Screenshot 2024-06-28 at 11 55 30] And this too

Jun 28 '24 04:06 tommy04062019

Thanks, we're looking into better ways to handle this. (And to report the max sizes.)

What are you trying to accomplish with the log analysis? Instead of downloading the file, then analyzing it, are you able to just run a holmes ask command without a file and let holmes itself choose which logs to look at? That tends to handle size limits much much better.

Jun 30 '24 07:06 aantn

As you can see, with nginx ingress, requests and errors can originate from any of the nginx ingress pods. If a deployment has 10 pods, each one has to be investigated individually, which wastes time. Our primary goal is to minimize the time spent on investigation, isn't it? Errors can also be located anywhere in the log. Even if you limit the log reading to the last 10000 lines, the errors might only show up 10500 lines from the end, so the result would be something like: AI: everything is ok

Jun 30 '24 08:06 tommy04062019

@aantn : I think we should implement two things:

First, enforce a size limit.
Second, for large files, read the file in chunks and use the AI to dig into each chunk bit by bit. Then bring all those little findings together to get the big picture.
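
A minimal sketch of the two ideas above, assuming the 1048576-character limit from the error message applies to each message. `chunk_lines`, `analyze_in_chunks` and `fake_analyze` are hypothetical names, not part of HolmesGPT; `fake_analyze` stands in for an LLM call and only flags lines containing "error" so the example runs on its own.

```python
from typing import Callable, Iterable, Iterator, List

# Limit taken from the error message in this issue; other models may differ.
MAX_CONTENT_CHARS = 1_048_576


def chunk_lines(lines: Iterable[str], max_chars: int = MAX_CONTENT_CHARS) -> Iterator[str]:
    """Yield chunks built from whole lines, each at most max_chars long."""
    buf: List[str] = []
    size = 0
    for line in lines:
        # A single line longer than the limit is split into slices.
        while len(line) > max_chars:
            if buf:
                yield "".join(buf)
                buf, size = [], 0
            yield line[:max_chars]
            line = line[max_chars:]
        # Flush the buffer before it would exceed the limit.
        if buf and size + len(line) > max_chars:
            yield "".join(buf)
            buf, size = [], 0
        buf.append(line)
        size += len(line)
    if buf:
        yield "".join(buf)


def analyze_in_chunks(lines: Iterable[str],
                      analyze_chunk: Callable[[str], str],
                      max_chars: int = MAX_CONTENT_CHARS) -> List[str]:
    """Analyze each chunk separately and collect the non-empty findings."""
    findings: List[str] = []
    for i, chunk in enumerate(chunk_lines(lines, max_chars)):
        result = analyze_chunk(chunk)
        if result:
            findings.append(f"chunk {i}: {result}")
    return findings


def fake_analyze(chunk: str) -> str:
    """Hypothetical placeholder for an LLM call: report lines mentioning 'error'."""
    hits = [l.strip() for l in chunk.splitlines() if "error" in l.lower()]
    return "; ".join(hits)


if __name__ == "__main__":
    log = ["ok\n"] * 20
    log[7] = "ERROR upstream timed out\n"
    print(analyze_in_chunks(log, fake_analyze, max_chars=30))
```

The collected findings are small, so they could be passed to one final request that summarizes them. Splitting on line boundaries keeps each log entry intact, but an error that spans several lines can still be cut across two chunks.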

Jun 30 '24 08:06 tommy04062019

What is the trigger for the investigation? (Meaning what is the initial breadcrumb that you saw which caused you to start investigating at all?)

Your ideas are good. I would also be interested in exploring a third option: give the AI a tool to search the logs (e.g. grep).
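
The grep idea could look roughly like the sketch below. `search_log` is a hypothetical helper, not an existing HolmesGPT tool. It streams the file line by line, so the file size does not matter, and it caps the number of matches so that the output stays well under the model's input limit.

```python
import re
from typing import List, Tuple


def search_log(path: str, pattern: str, max_matches: int = 50,
               ignore_case: bool = True) -> List[Tuple[int, str]]:
    """Return up to max_matches (line_number, line) pairs that match pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    matches: List[Tuple[int, str]] = []
    # errors="replace" keeps the scan going if the log contains invalid bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, start=1):
            if regex.search(line):
                matches.append((lineno, line.rstrip("\n")))
                if len(matches) >= max_matches:
                    break
    return matches
```

Unlike reading a fixed tail of 10000 lines, a search like this finds a match anywhere in the file, and the line numbers let a follow-up request fetch only the surrounding context.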

We have an internal framework for benchmarking the results of different approaches. If you're interested in jumping on a short call, I'd love to explore this use case in a little more depth. That might help us find the best solution.

Jun 30 '24 08:06 aantn

Closing in favor of #522 and #523 which should fix this.

Jun 15 '25 11:06 aantn