Remove task action audit logging and druid_taskLog metadata table
Note
- This PR does not pertain to the audit logging system used by Druid for auditing all major update actions such as run a task, update rules, update dynamic configs, create a supervisor, etc. That information is persisted in the druid_audit metadata table (if druid.audit.manager.type=sql) or simply logged.
- Instead, it deals with the audit logging used only for task actions, i.e. the druid_taskLog metadata table.
Description
Task action audit logging was first deprecated and disabled by default in Druid 0.13, #6368.
As called out in the original discussion #5859, there are several drawbacks to persisting task action audit logs.
- The only usage of the task audit logs is to serve the API /indexer/v1/task/{taskId}/segments, which returns the list of segments created by a task.
- The use case is really narrow and no prod clusters really use this information.
- There can be better ways of obtaining this information, such as the metric segment/added/bytes, which reports both the segment ID and task ID when a segment is committed by a task. We could also include committed segment IDs in task reports.
- A task persisting several segments would bloat the audit logs table, putting unnecessary strain on metadata storage.
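As a rough illustration of the metric-based alternative mentioned above, the sketch below groups segment IDs by task ID from a list of emitted metric events. The MetricEvent class and the dimension keys "taskId" and "segment" are assumptions made for this example and are not Druid's emitter API; the actual dimension names should be checked against the metrics documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified view of an emitted metric event (not a Druid class).
final class MetricEvent
{
  final String metric;
  final Map<String, String> dimensions;

  MetricEvent(String metric, Map<String, String> dimensions)
  {
    this.metric = metric;
    this.dimensions = dimensions;
  }
}

final class SegmentsByTask
{
  // Collects segment IDs per task ID from segment/added/bytes events.
  // The dimension keys "taskId" and "segment" are assumed for illustration.
  static Map<String, List<String>> group(List<MetricEvent> events)
  {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (MetricEvent event : events) {
      if (!"segment/added/bytes".equals(event.metric)) {
        continue; // ignore unrelated metrics
      }
      String taskId = event.dimensions.get("taskId");
      String segmentId = event.dimensions.get("segment");
      if (taskId != null && segmentId != null) {
        result.computeIfAbsent(taskId, k -> new ArrayList<>()).add(segmentId);
      }
    }
    return result;
  }
}
```

Such a grouping would only be as complete as the retained metric history, so it is a substitute for the removed API only where metrics are stored durably.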
Changes
- Remove TaskAuditLogConfig
- Remove method TaskAction.isAudited(). No task action is audited anymore.
- Remove SegmentInsertAction as it is not used anymore. SegmentTransactionalInsertAction is the new incarnation which has been in use for a while.
- Deprecate MetadataStorageActionHandler.addLog() and getLogs(). These are not used anymore but need to be retained for backward compatibility of extensions.
- Do not create druid_taskLog metadata table anymore.
Release notes
- Task action audit logging was deprecated in Druid 0.13 and is being completely removed in this release.
- The API /indexer/v1/task/{taskId}/segments is not supported anymore and will give a 404 NOT FOUND response.
- Druid will not write to or read from the metadata table druid_taskLog anymore.
- The property druid.indexer.auditlog.enabled will be ignored by Druid.
- The metric task/action/log/time will not be emitted anymore.
Extension dev notes
The changes in this PR are backward compatible with all existing metadata storage extensions.
The methods addLog and getLogs of MetadataStorageActionHandler are now deprecated
and not used by the Druid code.
Any new metadata storage extension need not implement these methods.
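One common way to retain deprecated methods without burdening new implementations is to give them default bodies. The sketch below uses a hypothetical, heavily simplified interface (ActionHandlerSketch with an invented insertEntry method); it does not reproduce the real MetadataStorageActionHandler signatures, and whether this PR uses default methods or these exact behaviors is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for a metadata storage handler interface.
interface ActionHandlerSketch<LogType>
{
  // Deprecated: kept only so existing extensions that override it still compile.
  // Throwing here is an assumed behavior, chosen for illustration.
  @Deprecated
  default boolean addLog(String entryId, LogType log)
  {
    throw new UnsupportedOperationException("Task action audit logs are not supported");
  }

  // Deprecated: returns no logs since nothing is written anymore (assumed behavior).
  @Deprecated
  default List<LogType> getLogs(String entryId)
  {
    return Collections.emptyList();
  }

  // Invented non-deprecated method that every implementation must still provide.
  void insertEntry(String entryId);
}

// A new extension implements only the non-deprecated method.
final class NewExtensionHandler implements ActionHandlerSketch<String>
{
  final Set<String> entries = new HashSet<>();

  @Override
  public void insertEntry(String entryId)
  {
    entries.add(entryId);
  }
}
```

With this shape, an older extension that still overrides addLog and getLogs keeps compiling, while a new one can omit them entirely.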
Rolling upgrade concerns
No upgrade concerns as none of the tasks use the SegmentInsertAction.
Future solutions
Which task created a segment?
A preferable approach would be to simply add a task_id column to the segments table.
Something similar has been recently done for pending segments in #16144.
Alternatively, it could also be possible to determine the list of segments committed by a task by inspecting the reports of the task or emitted metrics.
Which user created a segment?
Task submission is already logged and/or persisted depending on configuration by the Druid audit system. Once we can associate segments to task IDs, we would also be able to identify which user created a given segment.
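The two-step association described above (segment to task, then task to user) can be sketched as a pair of lookups. Everything here is hypothetical: the SegmentCreatorLookup class, the in-memory maps standing in for a task_id column on the segments table, and the map standing in for task-submission audit entries are illustrations of the idea, not existing Druid code.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper that resolves the user who created a segment.
final class SegmentCreatorLookup
{
  // segmentId -> taskId, as it might be read from a task_id column on the segments table
  private final Map<String, String> segmentToTask;
  // taskId -> user, as it might be derived from audit entries for task submission
  private final Map<String, String> taskToUser;

  SegmentCreatorLookup(Map<String, String> segmentToTask, Map<String, String> taskToUser)
  {
    this.segmentToTask = segmentToTask;
    this.taskToUser = taskToUser;
  }

  // Returns the submitting user, or empty if either association is missing.
  Optional<String> findCreator(String segmentId)
  {
    return Optional.ofNullable(segmentToTask.get(segmentId))
                   .map(taskToUser::get);
  }
}
```

The lookup returns an empty result when the task's audit entry is no longer available, for example if audit records are cleaned up before the segment is.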
This PR has:
- [ ] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.