feat(kubernetes_logs source) support k8s API based logs

Open titaneric opened this issue 3 months ago • 1 comments

Summary

As discussed in #23597 and https://github.com/vectordotdev/vector/discussions/23982, I intend to support the Kubernetes logs API for tailing pod logs in this PR.

Features Implemented

  • Kubernetes API Logs Collection: Added log_collection_strategy configuration option to enable log collection via Kubernetes API instead of file-based tailing
  • Event-driven Reconciler: Implemented a reconciler that watches pod events and automatically starts/stops log tailers for running pods
  • Container Log Streaming: Added real-time log streaming from Kubernetes API with proper timestamp tracking and container identification
  • Pod Information Management: Created PodInfo struct for essential pod metadata extraction and container tracking
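
The event-driven reconciler described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: `ContainerKey`, `PodEvent`, `Action`, and `Reconciler` are hypothetical names, and the pod event is reduced to a pod UID plus the list of running containers. The reconciler only decides which tailers to start or stop; spawning the log streams is left to the caller.

```rust
use std::collections::BTreeSet;

/// Hypothetical key identifying one container of one pod.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ContainerKey {
    pod_uid: String,
    container: String,
}

impl ContainerKey {
    fn new(pod_uid: &str, container: &str) -> Self {
        Self { pod_uid: pod_uid.to_owned(), container: container.to_owned() }
    }
}

/// Hypothetical, simplified pod event. A real watcher would carry full pod objects.
#[derive(Clone, Debug)]
enum PodEvent {
    /// Pod was added or updated; `running` lists its currently running containers.
    Applied { pod_uid: String, running: Vec<String> },
    /// Pod was deleted.
    Deleted { pod_uid: String },
}

/// What the caller should do after an event has been reconciled.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Action {
    StartTailer(ContainerKey),
    StopTailer(ContainerKey),
}

/// Tracks which containers currently have a tailer (assumed design).
#[derive(Default)]
struct Reconciler {
    active: BTreeSet<ContainerKey>,
}

impl Reconciler {
    /// Compares the desired state from the event with the active tailers
    /// and returns the stop and start actions needed to converge.
    fn handle(&mut self, event: PodEvent) -> Vec<Action> {
        let (pod_uid, desired) = match event {
            PodEvent::Applied { pod_uid, running } => {
                let desired: BTreeSet<ContainerKey> = running
                    .iter()
                    .map(|c| ContainerKey::new(&pod_uid, c))
                    .collect();
                (pod_uid, desired)
            }
            // A deleted pod has no desired tailers.
            PodEvent::Deleted { pod_uid } => (pod_uid, BTreeSet::new()),
        };

        let mut actions = Vec::new();

        // Stop tailers of this pod whose container is no longer running.
        let stale: Vec<ContainerKey> = self
            .active
            .iter()
            .filter(|k| k.pod_uid == pod_uid && !desired.contains(*k))
            .cloned()
            .collect();
        for key in stale {
            self.active.remove(&key);
            actions.push(Action::StopTailer(key));
        }

        // Start tailers for containers that are running but not yet tailed.
        for key in desired {
            if self.active.insert(key.clone()) {
                actions.push(Action::StartTailer(key));
            }
        }
        actions
    }
}
```

Because `handle` is idempotent, replaying the same pod event produces no further actions, which keeps repeated watch updates from spawning duplicate tailers.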

TODO Items

  • Batched Lines Sending: Send lines to the event channel in batches instead of one at a time
  • Position Tracking: Implement timestamp-based position management for log continuity
  • Error Handling: Add comprehensive error handling and retry mechanisms
  • Metrics Integration: Add metrics for API log collection monitoring
  • Metadata Annotation: Complete integration with pod metadata annotation pipeline
  • Performance Optimization: Implement connection pooling and efficient streaming
  • Testing Suite: Add comprehensive unit and integration tests
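
As one possible shape for the position-tracking item, the sketch below keeps the timestamp of the last line seen per container. It assumes the log stream is requested with timestamps enabled, so each line starts with an RFC 3339 timestamp followed by a space; the stored value could then serve as the resume point when a stream is reopened (for example via the API's `sinceTime` parameter). `Positions`, `observe`, and `resume_point` are hypothetical names, and persistence to `data_dir` is not shown.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory checkpoint store, keyed by a container identifier.
#[derive(Default)]
struct Positions {
    last_seen: HashMap<String, String>,
}

/// Splits "<timestamp> <message>". Only a cheap shape check is performed,
/// not full RFC 3339 validation.
fn split_timestamp(line: &str) -> Option<(&str, &str)> {
    let (ts, msg) = line.split_once(' ')?;
    let b = ts.as_bytes();
    // "2025-10-15T17:10:00Z" is the shortest accepted form (20 bytes).
    let looks_like_ts = b.len() >= 20 && b[4] == b'-' && b[10] == b'T';
    looks_like_ts.then_some((ts, msg))
}

impl Positions {
    /// Records the timestamp of `line` for `container` and returns the
    /// message without the timestamp prefix. Lines without a recognizable
    /// timestamp are returned unchanged and do not move the position.
    fn observe<'a>(&mut self, container: &str, line: &'a str) -> &'a str {
        match split_timestamp(line) {
            Some((ts, msg)) => {
                self.last_seen.insert(container.to_owned(), ts.to_owned());
                msg
            }
            None => line,
        }
    }

    /// Timestamp of the last line seen for `container`, if any.
    fn resume_point(&self, container: &str) -> Option<&str> {
        self.last_seen.get(container).map(String::as_str)
    }
}
```

Resuming from a stored timestamp may redeliver lines that share that timestamp, so a real implementation would likely need a deduplication step on reconnect.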

Vector configuration

api:
  enabled: true
  graphql: true
  playground: true
  address: "127.0.0.1:8686"
sources:
  my_source_id:
    type: kubernetes_logs
    data_dir: /tmp/vector-k8s-logs
    kube_config_file: ./kind-kubeconfig.yaml
    self_node_name: kind-worker
    api_log: true
sinks:
  my_sink_id:
    type: console
    inputs:
    - my_source_id
    encoding:
      codec: raw_message

How did you test this PR?

Given the following kind cluster config named kind-cluster.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

Create the kind cluster from the config

kind create cluster --config kind-cluster.yaml

Export the cluster's kubeconfig to kind-kubeconfig

kind export kubeconfig --kubeconfig kind-kubeconfig

Apply the following manifest (pod.yaml) to the kind cluster

kubectl apply -f pod.yaml --kubeconfig kind-kubeconfig

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-test-pod
  labels:
    app: multi-container-test
    vector.dev/exclude: "false"
spec:
  containers:
  - name: logger-a
    image: busybox:1.35
    command: [ "/bin/sh" ]
    args:
    - -c
    - |
      echo "Container A starting at $(date)"
      counter=1
      while true; do
        echo "[A-${counter}] Container A log: $(date -Iseconds) - Testing multi-container timestamp tracking"
        counter=$((counter + 1))
        sleep 3
      done
    resources:
      requests:
        memory: "32Mi"
        cpu: "50m"
      limits:
        memory: "64Mi"
        cpu: "100m"
  - name: logger-b
    image: busybox:1.35
    command: [ "/bin/sh" ]
    args:
    - -c
    - |
      echo "Container B starting at $(date)"
      counter=1
      while true; do
        echo "[B-${counter}] Container B log: $(date -Iseconds) - Different container, same pod"
        counter=$((counter + 1))
        sleep 7
      done
    resources:
      requests:
        memory: "32Mi"
        cpu: "50m"
      limits:
        memory: "64Mi"
        cpu: "100m"
  restartPolicy: Always

Build Vector with the necessary features and run it with the given config.

cargo build --no-default-features --features sources-kubernetes_logs --features sinks-console --features api
./target/debug/vector --config vector.yaml -v

Change Type

  • [ ] Bug fix
  • [x] New feature
  • [ ] Non-functional (chore, refactoring, docs)
  • [ ] Performance

Is this a breaking change?

  • [ ] Yes
  • [x] No

Does this PR include user facing changes?

  • [x] Yes. Please add a changelog fragment based on our guidelines.
  • [ ] No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

titaneric, Oct 15 '25 17:10

@pront, I have introduced a new config option named log_collection_strategy and have done a lot of refactoring since. I expect the code review to be time consuming, so please take your time looking at it.

I hope the overall structure is good. If so, we can then focus on the temporary workaround for handling Line and on the rest of the event processing pipeline for the API log collection strategy.

titaneric, Oct 17 '25 17:10