
[filter_kubernetes] enhancement: provide mechanism to exclude containers from fluent bit via annotations

Open therealdwright opened this issue 7 years ago • 19 comments

Problem Statement

In the current implementation, the most common way to collect container logs with fluentbit in a kubernetes cluster is a tail input that matches every container log file, like the one detailed here:

input-kubernetes.conf: |
  [INPUT]
      Name              tail
      Tag               kube.*
      Path              /var/log/containers/*.log
      Parser            docker
      DB                /var/log/flb_kube.db

This is a problem for us because dev containers sometimes pollute our log platform with unwanted logs. We only want to include a select few services; the default rule should be to exclude.

Describe the solution you'd like

A simple way to opt in to logging, the inverse of how #555 was implemented: you annotate containers with fluentbit.io/include: "true" and the fluentbit daemonset picks up only those logs.

Describe alternatives you've considered

I've updated my Path in the above config to /var/log/containers/*deployment*.log and ensured that every deployment whose logs I want to aggregate has "deployment" in its name.

Additional context

I have a Kubernetes cluster set up with kops 1.10 and used https://github.com/fluent/fluent-bit-kubernetes-logging to set up fluentbit which then forwards to a fluentd service running the logz.io plugin.

therealdwright avatar Sep 03 '18 22:09 therealdwright

looks like I misunderstood the original requirement, you can try the following annotation:

fluentbit.io/exclude: "true"

edsiper avatar Sep 03 '18 22:09 edsiper

Hi @edsiper thanks for replying, but I think I've poorly worded this enhancement sorry.

I want fluent-bit to exclude by default, and to annotate a select few deployments that should be included.

therealdwright avatar Sep 03 '18 23:09 therealdwright

I'm searching for a similar thing. I want to e.g. say that all pods with label: something should be included, otherwise just discard the logs. Maybe something like:

 K8S-Logging.include label=something,label=anotherthing

kaspernissen avatar Dec 13 '18 12:12 kaspernissen

Hi. is there any solution to this so far? I'm also looking for this. Will be really helpful.

Thx!

botzill avatar Apr 04 '19 15:04 botzill

Hey, Has anyone made any progress here? I'm also trying to figure out a solution to this :)

slayerjain avatar May 15 '19 07:05 slayerjain

This would be a very useful feature: end users would control which logs are sent by adding annotations.

infa-ddeore avatar Jan 22 '20 09:01 infa-ddeore

+1 option to exclude by default and opt in include

albertocsm avatar Jan 30 '20 17:01 albertocsm

Sad that this issue has been open since Sep 2018 and there is still no solution.

infa-ddeore avatar Apr 12 '20 15:04 infa-ddeore

https://github.com/dmytroleonenko/fluent-bit/tree/v1.4.2-include in case anybody is interested. I'm not sure the upstream dev team would be fine with the way I injected the logic, but it works for me. If the config has include mode enabled, only logs from pods with the annotation

fluentbit.io/include: "true"

are sent.

dmytroleonenko avatar May 05 '20 19:05 dmytroleonenko

how to enable include mode and do you have docker image already built?

infa-ddeore avatar May 06 '20 03:05 infa-ddeore

You can enable it the same way as exclude mode: just use the word "include" instead of "exclude" in both the config and the annotation. Include should also work in combination with "exclude" if you want to exclude a particular container or a specific stream (think of stdout) from a pod's logs.

I use https://github.com/aws/aws-for-fluent-bit.git to build an image for an EKS logger. I slightly modified their Dockerfile to get the fluent-bit sources from a zip file (based on my fork) instead of their git clone. I think I can build and push an image to Docker Hub. Check it here https://hub.docker.com/r/melco/aws-for-fluent-bit once Docker Hub manages to build it.

dmytroleonenko avatar May 06 '20 09:05 dmytroleonenko

Let me confirm the expected default behavior:

Only one of K8S-Logging.Exclude or K8S-Logging.Include may be enabled (not both). Behaviors:

K8S-Logging.Exclude   K8S-Logging.Include   Pod Annotation     Process Log?
On                    Off                   exclude: "true"    No
On                    Off                   exclude: "false"   Yes
Off                   Off                   any                Yes
Off                   On                    include: "true"    Yes
Off                   On                    include: "false"   No

Comments?

edsiper avatar May 19 '20 18:05 edsiper

@edsiper I think your approach makes the most sense. We've taken another path but it sounds like there are others who would like this enhancement.

therealdwright avatar May 19 '20 21:05 therealdwright

Is a missing annotation treated the same as a "false" annotation (no annotation == false annotation)? For example, if K8S-Logging.Include is On and none of my pods have any annotations, what would happen?

dmytroleonenko avatar May 20 '20 08:05 dmytroleonenko

@dmytroleonenko the proposal above says: if K8S-Logging.Include is turned on, only the Pods that have an annotation fluentbit.io/include: "true" will be included in the pipeline, otherwise discarded.
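
The proposed behavior can be modeled in a few lines. This is only a Python sketch of the decision table in this thread, not Fluent Bit's implementation; the function name should_process is hypothetical, and K8S-Logging.Include itself is a proposal here rather than an existing option. A missing annotation counts as "not included" in include mode and as "not excluded" in exclude mode.

```python
def should_process(exclude_on: bool, include_on: bool, annotations: dict) -> bool:
    """Return True if a pod's logs would be processed under the proposed table.

    `annotations` is the pod's annotation map (hypothetical input shape).
    """
    if exclude_on and include_on:
        # The proposal allows only one of the two modes at a time.
        raise ValueError("K8S-Logging.Exclude and K8S-Logging.Include cannot both be On")
    if exclude_on:
        # Opt-out mode: process unless the pod is annotated exclude: "true".
        return annotations.get("fluentbit.io/exclude") != "true"
    if include_on:
        # Opt-in mode: process only pods annotated include: "true".
        return annotations.get("fluentbit.io/include") == "true"
    # Both Off: annotations are ignored and everything is processed.
    return True
```

With include mode On, should_process(False, True, {}) returns False, which is the "discarded" case described above for pods without any annotation.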

edsiper avatar May 20 '20 22:05 edsiper

Any progress? Thanks

dmytroleonenko avatar Jan 26 '21 10:01 dmytroleonenko

Any news?

RainingNight avatar May 26 '21 06:05 RainingNight

I had no luck, so I investigated my container input. FluentBit reads your container log files from /var/log/containers. The [INPUT] section in the config of most standard installations uses a wildcard to match all containers. Modify the input on the agent to include only the containers you want, and that will exclude all others.

Standard config wildcard (NOTE the Path field):

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

Change the Path field to match the containers you want. This can be a comma-separated list of patterns:

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/my-container.log, /var/log/containers/my-other-container.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

You could also use the Exclude_Path field of the config. I wrote a blog post with more info.
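
To check which files a given Path / Exclude_Path pair would select before rolling out a config change, a rough Python approximation can help. This is a sketch, not how Fluent Bit matches files internally: the helper name select_files and the sample file names are made up, and fnmatch's * also matches "/", which is harmless for the flat /var/log/containers directory.

```python
from fnmatch import fnmatchcase


def select_files(files, path, exclude_path=""):
    """Approximate which files a tail input would pick up.

    `path` and `exclude_path` are comma-separated glob patterns,
    mirroring the Path and Exclude_Path fields in the config above.
    """
    def split(patterns):
        return [p.strip() for p in patterns.split(",") if p.strip()]

    include = split(path)
    exclude = split(exclude_path)
    return [
        f for f in files
        if any(fnmatchcase(f, p) for p in include)
        and not any(fnmatchcase(f, p) for p in exclude)
    ]


# Hypothetical directory listing for illustration.
files = [
    "/var/log/containers/my-container.log",
    "/var/log/containers/fluent-bit-abc.log",
    "/var/log/containers/other.log",
]
wildcard = select_files(files, "/var/log/containers/*", "/var/log/containers/fluent-bit*")
explicit = select_files(
    files,
    "/var/log/containers/my-container.log, /var/log/containers/my-other-container.log",
)
```

Real file names under /var/log/containers usually also carry the pod name, namespace and container ID, so in practice the patterns tend to need a wildcard rather than an exact file name.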

jackmahoney avatar Jun 05 '22 00:06 jackmahoney

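One solution is to use a Lua filter to drop records based on labels/annotations. An example that drops all records unless the pod has the process-logs="true" label:

function drop_disabled_logs(tag, timestamp, record)
  -- Guard against records without Kubernetes metadata or labels,
  -- which would otherwise raise an "attempt to index a nil value" error.
  local k8s = record["kubernetes"]
  local labels = k8s and k8s["labels"]
  if labels and labels["process-logs"] == "true" then
    -- code 0: keep the record unmodified
    return 0, 0, 0
  else
    -- code -1: drop the record
    return -1, 0, 0
  end
end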

martinkubrak avatar Sep 20 '22 15:09 martinkubrak

I think excluding all container logs by default and only collecting them for containers with an annotation is an important use case.

Side note: wouldn't it be better to skip reading unwanted log files in the first place, so that if the pod isn't annotated we don't even read the file? As I understand it, the proposed solution would make us tail all files and then filter logs out based on the annotation.

Side note: if every pod could provide its log config in annotations, that would allow for maximum customization, like what the Datadog agent does.

For example:

apiVersion: v1
kind: Pod
metadata:
  name: logger
  namespace: logger-ns
  annotations:
    fluentbit.io/config: |
      [INPUT]
          Name    tail
          Tag     kube.*
          Path    /var/log/pods/logger-ns_logger*/busybox/*.log
          Parser  docker
          DB      /var/log/flb_kube.db
spec:
  containers:
   - name: busybox
     image: busybox
     command: [ "/bin/sh", "-c", "--" ]
     args: [ "while true; do sleep 1; echo `date` example file log; done;" ]

This would allow maximum customization, and it could even be simplified by not requiring all of this info, because the log file path always has the form /var/log/pods/<namespace-name>_<pod-name>*/<container-name>/*.log and we can get those values from the metadata. That would make the annotation closer to what Datadog does:

annotations:
    fluentbit.io/<container-name>.config: |
      [INPUT]
          Parser            docker
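
As a rough illustration of deriving the path from metadata, the pattern above can be built from three values. This is a Python sketch only; pod_log_glob is a hypothetical helper, and the sample path with its UID-like suffix is invented for the check.

```python
from fnmatch import fnmatchcase


def pod_log_glob(namespace: str, pod: str, container: str) -> str:
    """Build the per-container log glob from pod metadata.

    Follows the /var/log/pods/<namespace-name>_<pod-name>*/<container-name>/*.log
    layout described in this comment.
    """
    return f"/var/log/pods/{namespace}_{pod}*/{container}/*.log"


pattern = pod_log_glob("logger-ns", "logger", "busybox")

# Hypothetical on-disk path; the suffix after the pod name is made up.
sample = "/var/log/pods/logger-ns_logger_0a1b2c/busybox/0.log"
matches = fnmatchcase(sample, pattern)
```

For the example Pod above this yields the same Path value that was written out by hand in the fluentbit.io/config annotation.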

husseinraoouf avatar Nov 09 '22 08:11 husseinraoouf

I changed some of the code in the Lua filter solution above to make it work well. ^^|||

 return nil, nil, nil

BibbyChung avatar Apr 01 '23 06:04 BibbyChung

Since this feature is not implemented yet, for now I do it like this:

...
    [INPUT]
      Name tail
      Path /var/log/containers/*.log
      multiline.parser docker, cri
      Tag kube.*
      Mem_Buf_Limit 5MB
      Skip_Long_Lines On
      Skip_Empty_Lines On

    [FILTER]
      Name kubernetes
      Match kube.*
      Merge_Log On
      Labels On
      Annotations Off
      Keep_Log Off
      K8S-Logging.Parser On
      K8S-Logging.Exclude On
      Buffer_Size 256KB

    [FILTER]
      Name    grep
      Match   kube.*
      regex   $kubernetes['labels']['logging'] enabled
...

So only pods that have

metadata:
  labels:
    logging: enabled

will have their logs sent to OUTPUT.
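
To reason about which records that grep filter keeps, here is a small Python model of the rule. It is a sketch under the assumption that the regex is applied as an unanchored search and that records without the label are dropped; grep_keep is a hypothetical name, not part of Fluent Bit.

```python
import re


def grep_keep(record: dict, pattern: str = "enabled") -> bool:
    """Model of: regex $kubernetes['labels']['logging'] <pattern>.

    Returns True when the record would be kept by the filter.
    """
    labels = record.get("kubernetes", {}).get("labels", {})
    value = labels.get("logging")
    # A missing label cannot match, so the record is dropped.
    return isinstance(value, str) and re.search(pattern, value) is not None


opted_in = {"kubernetes": {"labels": {"logging": "enabled"}}}
unlabeled = {"kubernetes": {"labels": {"app": "web"}}}
lookalike = {"kubernetes": {"labels": {"logging": "not-enabled"}}}
```

Under that assumption a value such as "not-enabled" would also pass, so anchoring the pattern as ^enabled$ is the stricter choice if such label values can occur.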

gihif avatar Oct 08 '23 17:10 gihif