Proposal: Allow Runtime Configuration of Pixie Logging Level (Env Vars or CLI Flags)
Is your feature request related to a problem? Please describe.
Pixie (Cloud + Vizier) is composed of many services, all of which default to the info logging level. As a result, these services generate a high volume of logs over time, which significantly increases logging costs on cloud providers. In some cases, logging costs can exceed compute costs such as CPU and memory. For system workloads like Pixie, it would be advantageous to use a more restrictive logging level (e.g., error) to reduce noise and control operational costs. However, the logging level appears to be hard-coded to info in the source code.
Describe the solution you'd like
Ideally, the logging level should be configurable at runtime, preferably via environment variables or command-line arguments. This would allow cluster operators to adjust verbosity according to their needs without modifying the source code. From my initial review, this file would likely need to be updated: https://github.com/pixie-io/pixie/blob/HEAD/src/shared/services/logging.go#L42-L42
I have not yet investigated deeper for potential compatibility concerns, but I would be interested in contributing if the change is feasible.
Describe alternatives you've considered
One alternative considered was configuring Fluent Bit to filter out info logs emitted by Pixie pods on GKE. However, this approach has not been tested and is less ideal than having the logging level controlled directly by Pixie components.
Hi @lucascicco, thanks for starting this discussion.
You're right that Pixie's Go services have a hard-coded log level. The C++ code embedded within the query broker, Kelvin, and PEM Vizier services also uses glog, and that portion of the service logging does have runtime toggles, as documented in the previous link.
I don't see an issue with making the logrus logger configurable at runtime, but I'm curious if you have a summary of which log messages contribute to the large log volume. Independent of making the log level configurable at runtime, any noisy messages should have their level reconsidered.
In terms of implementation, I think we'd want to introduce a new service flag (e.g., --log_level) in service_flags.go. This can then be used to configure logrus's logging level in the SetupServiceLogging function.
Happy to discuss any of that in more detail if you have questions, but it seems feasible and would be a worthwhile contribution.