
[BUG] kube-audit logs properties.log field is truncated at 16KiB

Open robinlandstrom opened this issue 1 year ago • 6 comments

Describe the bug I export AKS kube-audit logs to a storage account for later processing. The exported logs are truncated by Azure/AKS/? before being written to the storage account.

The "kube-audit" object is valid json but the underlying properties.log string is truncated and not valid json for longer log entries.

{ 
  "category": "kube-audit",
  "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read", 
  ...
  "properties": {
    "log": "{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io\/v1\",\"level\":\"RequestResponse\", ..." # Tuncated after 16KiB
    ...
  }
}

Observed a couple of times per day per cluster; for example, when system:serviceaccount:kube-system:node-controller updates /api/v1/nodes/<NODE-NAME>/status, the event is logged with "level": "RequestResponse" and both the requestObject and the responseObject are quite large.

To Reproduce "Add diagnostics setting" to collect kube-audit logs to a storage account from an AKS cluster. https://learn.microsoft.com/en-us/azure/aks/hybrid/kubernetes-monitor-audit-events

Parse the log files and observe properties.log strings that are truncated and therefore not valid JSON or audit.k8s.io/v1 objects.

Expected behavior I expect a valid audit.k8s.io/v1 event to always be available in the properties.log field.
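To reproduce the observation programmatically, a minimal detection sketch (the function name and the one-JSON-record-per-line file layout are my own assumptions about the storage-account export format):

```python
import json

def find_truncated(path):
    """Yield kube-audit records whose properties.log does not parse as JSON.

    Assumes the export file contains one diagnostic-log JSON record per line.
    """
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            if record.get("category") != "kube-audit":
                continue
            try:
                # A complete entry is a valid audit.k8s.io/v1 Event object.
                json.loads(record["properties"]["log"])
            except json.JSONDecodeError:
                yield record  # truncated / incomplete fragment
```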

Environment (please complete the following information):

  • Kubernetes version v1.30.5

robinlandstrom avatar Nov 12 '24 11:11 robinlandstrom

https://github.com/Azure/AKS/issues/4750 should have fixed this

aritraghosh avatar May 22 '25 21:05 aritraghosh

Hi @robbiezhang @kthakar1990, are there any updates on investigating this issue?

julia-yin avatar Jun 05 '25 20:06 julia-yin

I still get kube-audit logs where the properties.log entry is not a complete JSON object.

It seems that long audit.k8s.io/v1 events are split into two Microsoft.ContainerService/managedClusters/diagnosticLogs/Read entries.

At least all the data seems to be there, but it would be a pain to parse: on a parse failure you would have to join the next entry and retry.

Event 1 (pretty printed)

{
  "category": "kube-audit",
  "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read",
  "properties": {
    ...
    "log": "{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"level\":\"RequestResponse\", `REDACTED` ,\"k"
  },
  "resourceId": "/SUBSCRIPTIONS/REDACTED/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/REDACTED",
  "serviceBuild": "na",
  "time": "2025-07-08T23:51:50.450205796Z"
}

Event 2 (pretty printed)

{
  "category": "kube-audit",
  "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read",
  "properties": {
    ...
    "log": "ubectl.kubernetes.io/restartedAt\":\"2025-05-27T16:04:45+02:00\" `REDACTED` \\\"daemon-set-controller/kube-system\\\"\"}}\n"
  },
  "resourceId": "/SUBSCRIPTIONS/REDACTED/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/REDACTED",
  "serviceBuild": "na",
  "time": "2025-07-08T23:51:50.450205796Z"
}

It is possible to get a complete event out with some manual stitching:

$ jq '.properties.log' event1.json | sed 's/.$//' > audit.k8s.part1   # strip the trailing quote
$ jq '.properties.log' event2.json | sed 's/^.//' > audit.k8s.part2   # strip the leading quote
$ cat audit.k8s.part1 audit.k8s.part2 | tr -d '\n' | jq fromjson      # rejoin into one JSON string literal and decode the embedded event
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "eec606f2-f143-4e8c-ac81-411b339e1367",
  "stage": "ResponseComplete",
... REDACTED ...
  "requestReceivedTimestamp": "2025-07-08T23:51:50.437461Z",
  "stageTimestamp": "2025-07-08T23:51:50.449278Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"system:controller:daemon-set-controller\" of ClusterRole \"system:controller:daemon-set-controller\" to ServiceAccount \"daemon-set-controller/kube-system\""
  }
}
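The manual stitching above can be generalized into a small accumulator. A minimal Python sketch (the function name is illustrative; it assumes fragments arrive in order and that a truncated prefix never happens to parse as complete JSON on its own):

```python
import json

def stitch_audit_events(records):
    """Join consecutive kube-audit records whose properties.log was split
    across multiple diagnosticLogs/Read entries, yielding parsed events.

    `records` is an iterable of already-parsed diagnostic-log dicts, in
    the order they were written.
    """
    buffer = ""
    for record in records:
        buffer += record["properties"]["log"]
        try:
            yield json.loads(buffer)  # complete audit.k8s.io/v1 Event
            buffer = ""
        except json.JSONDecodeError:
            continue  # fragment; keep accumulating until it parses
```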

robinlandstrom avatar Jul 16 '25 12:07 robinlandstrom

This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed if no further activity occurs within 7 days of this comment. Please review @aritraghosh, @robbiezhang, @kthakar1990.

This issue will now be closed because it hasn't had any activity for 7 days after being marked stale. @robinlandstrom feel free to comment again within the next 7 days to reopen, or open a new issue after that time if you still have a question/issue or suggestion.