Audit and Logging
Operational logging and API audit logging serve different goals. You need both.
- audit logs: who changed cluster state and when
- workload logs: what applications and components are doing at runtime
Kubernetes API audit logging
Audit logs are produced by the API server according to an audit policy. Each log entry captures one stage of a request lifecycle: RequestReceived, ResponseStarted, ResponseComplete, or Panic.
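As a rough illustration of what one such entry carries, the sketch below parses a single JSON audit event and extracts the who, what, when, and stage. The sample event is hand-written for illustration (the user, namespace, and object names are made up); the field names follow the `audit.k8s.io/v1` Event schema.

```python
import json

# Hand-written sample event for illustration only; names and IDs are made up.
SAMPLE_EVENT = json.dumps({
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "auditID": "00000000-0000-0000-0000-000000000000",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/namespaces/demo/configmaps/app-config",
    "verb": "update",
    "user": {"username": "jane@example.com", "groups": ["system:authenticated"]},
    "objectRef": {"resource": "configmaps", "namespace": "demo", "name": "app-config"},
    "responseStatus": {"code": 200},
    "requestReceivedTimestamp": "2024-01-01T12:00:00.000000Z",
    "stageTimestamp": "2024-01-01T12:00:00.012000Z",
})


def summarize(line: str) -> dict:
    """Reduce one JSON audit event to who / what / when / stage."""
    event = json.loads(line)
    ref = event.get("objectRef", {})
    # Cluster-scoped objects have no namespace, so skip missing parts.
    parts = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
    return {
        "who": event["user"]["username"],
        "verb": event["verb"],
        "what": "/".join(p for p in parts if p),
        "when": event["requestReceivedTimestamp"],
        "stage": event["stage"],
        "status": event.get("responseStatus", {}).get("code"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_EVENT))
```

A single request usually produces several events that share one `auditID`, one per stage, so grouping by that field reassembles the request lifecycle.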
Common audit levels:
- None: skip this request entirely
- Metadata: record who, what, and when, with no request or response body
- Request: metadata plus request body
- RequestResponse: metadata plus request and response bodies
For most production systems, Metadata is the practical default. Add Request-level logging selectively for high-value resources such as ConfigMaps and RBAC objects. Keep Secrets at Metadata: both Request and RequestResponse record object bodies, which for a Secret contain the secret values (base64-encoded, not encrypted).
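To make the difference between levels concrete, here is a minimal sketch of which parts of a request each level keeps. The `request` dictionary and the `recorded` helper are hypothetical simplifications, not the API server's data model; only the `requestObject` and `responseObject` key names mirror real audit event fields.

```python
from typing import Optional

LEVELS = ("None", "Metadata", "Request", "RequestResponse")
METADATA_KEYS = ("user", "verb", "resource", "timestamp")


def recorded(level: str, request: dict) -> Optional[dict]:
    """Return the parts of `request` that the given audit level would keep."""
    if level not in LEVELS:
        raise ValueError(f"unknown audit level: {level}")
    if level == "None":
        return None  # nothing is logged at all
    entry = {key: request[key] for key in METADATA_KEYS}
    if level in ("Request", "RequestResponse"):
        entry["requestObject"] = request["request_body"]
    if level == "RequestResponse":
        entry["responseObject"] = request["response_body"]
    return entry


# Hypothetical request used only to exercise the helper.
EXAMPLE = {
    "user": "jane@example.com",
    "verb": "update",
    "resource": "configmaps",
    "timestamp": "2024-01-01T12:00:00Z",
    "request_body": {"data": {"LOG_LEVEL": "debug"}},
    "response_body": {"data": {"LOG_LEVEL": "debug"}, "metadata": {"resourceVersion": "42"}},
}

if __name__ == "__main__":
    for level in LEVELS:
        print(level, recorded(level, EXAMPLE))
```

Whatever a client sends in the body is captured verbatim from the Request level upward, which is why the choice of level matters most for resources that carry sensitive data.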
Policy example
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
  - level: Request
    resources:
      - group: ""
        resources: ["configmaps"]
  - level: Metadata
Use caution with request and response body logging for sensitive resources such as secrets.
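Rule order matters in a policy like the one above: rules are evaluated top to bottom, the first match sets the level, and a request that matches no rule is not logged. The sketch below mimics that behavior for the three rules shown. It is a simplified model, not the API server's matcher: it only understands the `users`, `verbs`, and `resources` fields, and `audit_level` and `rule_matches` are hypothetical helper names.

```python
# Simplified mirror of the example policy; only the fields used there are modelled.
RULES = [
    {"level": "None", "users": ["system:kube-proxy"], "verbs": ["watch"]},
    {"level": "Request", "resources": [{"group": "", "resources": ["configmaps"]}]},
    {"level": "Metadata"},  # catch-all
]


def rule_matches(rule: dict, user: str, verb: str, group: str, resource: str) -> bool:
    """A field that is absent from the rule matches everything."""
    if "users" in rule and user not in rule["users"]:
        return False
    if "verbs" in rule and verb not in rule["verbs"]:
        return False
    if "resources" in rule:
        return any(
            group == entry["group"]
            # An empty resource list is treated as "all resources in the group".
            and (not entry.get("resources") or resource in entry["resources"])
            for entry in rule["resources"]
        )
    return True


def audit_level(user: str, verb: str, group: str, resource: str, rules=RULES) -> str:
    """Return the level of the first matching rule, or "None" if nothing matches."""
    for rule in rules:
        if rule_matches(rule, user, verb, group, resource):
            return rule["level"]
    return "None"


if __name__ == "__main__":
    print(audit_level("system:kube-proxy", "watch", "", "endpoints"))
    print(audit_level("jane@example.com", "update", "", "configmaps"))
    print(audit_level("jane@example.com", "get", "", "secrets"))
```

Because the first match wins, narrow rules such as the kube-proxy exclusion belong above the broader ones, and the catch-all goes last.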
Workload logging pipeline
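Kubernetes does not provide durable log storage. Container logs live only on the node, are rotated, and are lost when a Pod is evicted or its node is removed, so they must be collected and shipped to durable storage.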
flowchart LR
APP[Application\nstdout / stderr] --> RT[Container Runtime\n/var/log/containers/]
RT --> COL[Log Collector DaemonSet\nFluent Bit · Vector · Promtail]
COL --> BE[Central Backend\nOpenSearch · Loki · Splunk]
BE --> DASH[Dashboards\nand Alerts]
Common collectors: Fluent Bit (lightweight, widely deployed), Vector (high-performance, flexible), Promtail (Loki ecosystem).
Typical architecture:
- Application writes to stdout and stderr (not log files).
- Container runtime writes log files that are exposed under /var/log/containers/, and the kubelet rotates them.
- Node-level collector DaemonSet tails log files and ships to backend.
- Central backend stores, indexes, and retains logs.
- Dashboards and alerts consume centralized data.
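The collector step can be sketched as follows: parse one log line and enrich it with the Pod, namespace, and container taken from the file name. This assumes the CRI log format used by containerd and CRI-O (`<timestamp> <stream> <P|F> <message>`) and the usual `<pod>_<namespace>_<container>-<container id>.log` naming under /var/log/containers/. The function name and sample values are made up, and real collectors such as Fluent Bit do considerably more (tailing, checkpointing, joining partial lines).

```python
import re
from typing import Optional

# CRI log line: timestamp, stream, tag (F = full line, P = partial line), message.
CRI_LINE = re.compile(
    r"^(?P<time>\S+) (?P<stream>stdout|stderr) (?P<tag>[FP]) (?P<message>.*)$"
)

# File name: <pod>_<namespace>_<container>-<64 hex char container id>.log
LOG_NAME = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)"
    r"-(?P<container_id>[0-9a-f]{64})\.log$"
)


def parse_record(filename: str, line: str) -> Optional[dict]:
    """Combine file-name metadata with one parsed log line; None if either fails."""
    name = LOG_NAME.match(filename)
    entry = CRI_LINE.match(line.rstrip("\n"))
    if name is None or entry is None:
        return None
    return {**name.groupdict(), **entry.groupdict()}


if __name__ == "__main__":
    sample_file = "web-7d4b9c_demo_app-" + "a" * 64 + ".log"  # made-up names
    sample_line = "2024-01-01T12:00:00.123456789Z stderr F connection refused\n"
    print(parse_record(sample_file, sample_line))
```

Lines tagged `P` are fragments of a longer message; a real collector buffers them until the closing `F` line arrives, which this sketch does not attempt.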
Incident response value
Audit plus workload logs provide full investigation context:
- who changed policy or deployment
- what changed in workload behavior
- when an issue started and how it propagated
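One way to get that combined view is to merge both sources into a single timeline ordered by timestamp. The sketch below assumes both pipelines have already been normalized to dictionaries with an RFC 3339 UTC timestamp at second precision; the record shapes, field names, and sample events are all hypothetical.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp such as 2024-01-01T12:00:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def build_timeline(audit_events: list, workload_events: list) -> list:
    """Merge both sources into (time, source, description) rows sorted by time."""
    rows = [
        (parse_ts(e["ts"]), "audit", f'{e["user"]} {e["verb"]} {e["object"]}')
        for e in audit_events
    ]
    rows += [
        (parse_ts(e["ts"]), "workload", f'{e["pod"]}: {e["message"]}')
        for e in workload_events
    ]
    return sorted(rows, key=lambda row: row[0])


# Made-up sample data.
AUDIT = [
    {"ts": "2024-01-01T12:00:00Z", "user": "jane@example.com",
     "verb": "patch", "object": "demo/deployments/web"},
]
WORKLOAD = [
    {"ts": "2024-01-01T11:59:30Z", "pod": "web-1", "message": "request served"},
    {"ts": "2024-01-01T12:00:20Z", "pod": "web-2", "message": "config key missing"},
]

if __name__ == "__main__":
    for when, source, text in build_timeline(AUDIT, WORKLOAD):
        print(when.isoformat(), source, text)
```

In this sample the error appears twenty seconds after the patch, which is the kind of ordering that points an investigation at the change rather than at the workload.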
Practical controls
- set retention by data class and compliance requirements
- sanitize logs to avoid leaking credentials or tokens
- alert on sensitive action patterns, such as repeated denied attempts to access Secrets
- keep clock synchronization healthy across nodes for timeline accuracy
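The alerting control can be sketched as a scan over audit events that flags users with several 403 responses on Secrets inside a short window. The threshold, the window, and the function name are arbitrary choices for illustration, and the events are assumed to be already parsed `audit.k8s.io/v1` dictionaries with microsecond-precision timestamps. A production rule would normally live in the log backend's alerting engine rather than in a script.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp such as 2024-01-01T12:00:00.000000Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def denied_secret_alerts(events, threshold=3, window=timedelta(minutes=5)) -> set:
    """Return users with at least `threshold` denied Secret requests within `window`."""
    denials = defaultdict(list)
    for event in events:
        # Count each request once, at the stage where the response code is known.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("objectRef", {}).get("resource") != "secrets":
            continue
        if event.get("responseStatus", {}).get("code") != 403:
            continue
        denials[event["user"]["username"]].append(
            parse_ts(event["requestReceivedTimestamp"])
        )

    flagged = set()
    for user, times in denials.items():
        times.sort()
        # Slide over every run of `threshold` consecutive denials.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.add(user)
                break
    return flagged
```

Because the check compares timestamps recorded by the API server, it is only as reliable as clock synchronization on the control plane.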
Summary
Without audit logging, control plane actions are hard to reconstruct. Without workload logging, runtime failures are hard to explain. Production Kubernetes needs both pipelines running continuously.