river Note for design: Log customization and tuning

I've gotten feedback that we should take special care around customization of log output.

Currently, we only really support setting a single overall log level, and don't have specific support for setting the log level of our dependencies, including pingora.
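As a starting point for discussion, here is a minimal sketch of what per-dependency log levels could look like: a default level plus per-target overrides, parsed from a directive string such as `info,pingora=warn`. The `Level` and `LevelFilter` types and the directive syntax are hypothetical and not part of River today; this uses only the standard library and is not tied to any particular logging crate.

```rust
/// Log severity, ordered from least verbose (Error) to most verbose (Trace).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

impl Level {
    /// Parses a case-insensitive level name; returns None for unknown names.
    fn parse(s: &str) -> Option<Level> {
        match s.trim().to_ascii_lowercase().as_str() {
            "error" => Some(Level::Error),
            "warn" => Some(Level::Warn),
            "info" => Some(Level::Info),
            "debug" => Some(Level::Debug),
            "trace" => Some(Level::Trace),
            _ => None,
        }
    }
}

/// Hypothetical filter: a default level plus per-target overrides.
#[derive(Debug)]
pub struct LevelFilter {
    default: Level,
    overrides: Vec<(String, Level)>,
}

impl LevelFilter {
    /// Parses directives such as "info,pingora=warn".
    /// Returns None if any directive names an unknown level.
    pub fn parse(spec: &str) -> Option<LevelFilter> {
        let mut filter = LevelFilter {
            default: Level::Info,
            overrides: Vec::new(),
        };
        for directive in spec.split(',').map(str::trim).filter(|d| !d.is_empty()) {
            match directive.split_once('=') {
                Some((target, level)) => {
                    let level = Level::parse(level)?;
                    filter.overrides.push((target.trim().to_string(), level));
                }
                None => filter.default = Level::parse(directive)?,
            }
        }
        Some(filter)
    }

    /// Returns true if a record from `target` at `level` should be emitted.
    /// Plain prefix matching is used, so a `pingora` override also covers a
    /// target such as `pingora_core::...`; the longest matching prefix wins.
    pub fn enabled(&self, target: &str, level: Level) -> bool {
        let max = self
            .overrides
            .iter()
            .filter(|(prefix, _)| target.starts_with(prefix.as_str()))
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, lvl)| *lvl)
            .unwrap_or(self.default);
        level <= max
    }
}
```

With `LevelFilter::parse("info,pingora=warn")`, River's own targets would log at `info` and above while anything under a `pingora` prefix would be limited to `warn` and above.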

As logging is a primary "user interface" for River, we should make another design pass on the desired capabilities and options for logging.

Some additional suggested features include:

Reducing the percentage of logs that are printed, allowing for statistical sampling of high-volume logs, e.g. tracing 1%/10%/100% of all connections verbosely
Tuning the contents and format of primary logs, so that different users can include the information relevant to their use cases
Filtering and anonymization of logs, for example not logging headers that match some sort of regex (to remove PII), or logging hashed values instead, such as the hash of an entire header, so that duplicate requests can still be observed
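To make the sampling and hashing ideas concrete, here is a rough sketch using only the standard library. The function names and the notion of a numeric connection id are assumptions for illustration, not existing River APIs. `DefaultHasher` is not cryptographic and its algorithm may change between Rust releases, so a real implementation that needs stable or privacy-preserving digests would likely want a keyed cryptographic hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes any hashable value to a u64 with the standard library's default hasher.
/// Not cryptographic; only suitable for illustration.
fn hash64<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Returns true if the connection with this (hypothetical) id falls inside the
/// sampled fraction. `percent` is clamped to at most 100.
/// The decision is derived from the id, so every log line for one connection
/// gets the same answer.
pub fn should_log_verbosely(connection_id: u64, percent: u8) -> bool {
    let percent = u64::from(percent.min(100));
    hash64(&connection_id) % 100 < percent
}

/// Replaces a header value with a fixed-width hex digest, so the raw value is
/// not logged but identical values still produce identical log entries.
pub fn anonymize_header(value: &str) -> String {
    format!("{:016x}", hash64(value))
}
```

Sampling on a hash of the connection id rather than on a random number per log line keeps a sampled connection's logs complete, which matters when tracing a single connection verbosely.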

CC @branlwyd

Jul 26 '24 14:07 jamesmunns

Also note: we should make sure we have settings for structured logs that other tools can consume, e.g. OpenTelemetry, and compatibility with Loki, Prometheus, etc.

Jul 31 '24 16:07 jamesmunns

I'd like to see full customization through placeholders, rather than adding, deleting, or renaming fields in a preset JSON structure as Caddy does. That way there is no need to decide whether CLF or JSON is the better format, since users can fully customize the output themselves.
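A minimal sketch of the placeholder idea, assuming a `{name}` syntax and a simple map of field values (both are hypothetical choices, not an existing River format). Unknown placeholders render as `-`, which is how CLF marks missing values. Brace escaping is left out, so a JSON-shaped template with literal braces would need more work than this.

```rust
use std::collections::HashMap;

/// Renders `template`, replacing each `{name}` with the matching field value.
/// Unknown placeholders render as "-". An unterminated `{` is copied through
/// unchanged. Literal braces cannot be escaped in this sketch.
pub fn render_log_line(template: &str, fields: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                let name = &after[..close];
                match fields.get(name) {
                    Some(value) => out.push_str(value),
                    None => out.push('-'),
                }
                rest = &after[close + 1..];
            }
            None => {
                // No closing brace: emit the remainder literally and stop.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

With fields for `remote_addr`, `method`, `path`, and `status`, a template like `{remote_addr} "{method} {path}" {status}` yields a CLF-like line, and a different template over the same fields could yield key=value pairs without any change to the code.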

Additionally, log rotation should be supported, as it is in Caddy.
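For reference, size-based rotation could be sketched roughly as below. The numbered-suffix naming scheme (`river.log.1`, `river.log.2`, ...) and both function names are assumptions for illustration; this does not describe how Caddy implements rotation, and it only computes the plan rather than touching the filesystem.

```rust
/// Returns true if writing `incoming` more bytes would push the current log
/// file past `max_size` bytes.
pub fn needs_rotation(current_size: u64, incoming: u64, max_size: u64) -> bool {
    current_size.saturating_add(incoming) > max_size
}

/// Returns the (from, to) renames to apply, in order, when rotating `base`
/// while keeping at most `keep` old files. The oldest files are shifted first
/// so nothing is overwritten too early; whatever was at `base.keep` is
/// replaced by the first rename. With `keep == 0` the plan is empty and the
/// caller would simply truncate `base`.
pub fn rotation_plan(base: &str, keep: u32) -> Vec<(String, String)> {
    let mut plan = Vec::new();
    for n in (1..keep).rev() {
        plan.push((format!("{base}.{n}"), format!("{base}.{}", n + 1)));
    }
    if keep > 0 {
        plan.push((base.to_string(), format!("{base}.1")));
    }
    plan
}
```

A caller would check `needs_rotation` before each write and, when it returns true, apply the renames from `rotation_plan` (skipping sources that do not exist yet) and reopen a fresh file at `base`.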

Sep 15 '24 03:09 ljianc