Audit & Logging

In Kubernetes, there are two stories being told at the same time:

  1. The Detective Story (Audit Logs): "Who changed the system configuration?"
  2. The Operator Story (App Logs): "Why is my application crashing?"

If you don't capture these stories immediately, they vanish. Kubernetes logs are ephemeral; when a Pod dies, its logs die with it.


1. Audit Logs (The "Black Box" Recorder)

Audit logs record every single request sent to the Kubernetes API Server. They answer the questions: Who, What, Where, and When.

  • Who: User alice
  • What: Tried to delete
  • Where: The secret named db-pass
  • When: At 12:05 PM
  • Result: 403 Forbidden
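
On disk, each of those answers is a field of a single JSON line in the audit log. A trimmed Metadata-level event for the scenario above might look like this (the auditID, namespace, and timestamps are illustrative):

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "6b3c...",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/prod/secrets/db-pass",
  "verb": "delete",
  "user": { "username": "alice", "groups": ["system:authenticated"] },
  "objectRef": { "resource": "secrets", "namespace": "prod", "name": "db-pass" },
  "responseStatus": { "code": 403 },
  "requestReceivedTimestamp": "2023-10-01T12:05:00.000000Z"
}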

The 4 Audit Levels

You must configure how much data you want. This is a trade-off between "Visibility" and "Disk Space."

  • None: Log nothing. For frequent, noisy events (e.g., kube-proxy watching Endpoints).
  • Metadata: Log the user, timestamp, resource, and verb, but no payloads. The standard for production (low cost, high value).
  • Request: Log metadata plus the request body sent by the client. For debugging "Why did this object change?"
  • RequestResponse: Log metadata, the request body, and the server's response body. For high-security auditing and deep debugging (generates massive amounts of data).

Security Warning: Secrets in Logs

Be extremely careful using Request or RequestResponse levels on Secret or ConfigMap resources. You might accidentally write your database passwords into your plain-text audit log files!
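
In the policy file (introduced below), a common safeguard is to pin these resources to the Metadata level before any broader rules. The API Server applies the first rule that matches each request, so order matters. A minimal sketch:

# Pin Secrets and ConfigMaps to Metadata so payloads never reach the log.
# Must appear BEFORE any Request/RequestResponse rules: first match wins.
- level: Metadata
  resources:
  - group: ""   # the core API group
    resources: ["secrets", "configmaps"]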

Configuration (The Policy File)

You pass a policy file to the API Server to define rules.

# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # 1. Don't log noisy system calls
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]

  # 2. Log the full request body for Pod modifications
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
    - group: ""
      resources: ["pods"]

  # 3. Default: Log metadata only for everything else
  - level: Metadata
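
The policy does nothing until the API Server is pointed at it. On a kubeadm-style control plane that means editing the static Pod manifest; the paths below are illustrative, and both files must also be mounted into the container via hostPath volumes (not shown):

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml   # the policy above
    - --audit-log-path=/var/log/kubernetes/audit/audit.log    # where events land
    - --audit-log-maxage=30      # days to retain rotated files
    - --audit-log-maxbackup=10   # number of rotated files to keep
    - --audit-log-maxsize=100    # megabytes before rotation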

2. Application Logging (The "Stream")

Kubernetes does not provide a native storage solution for logs. It assumes your application writes to Standard Output (stdout) and Standard Error (stderr).

The container runtime (e.g., containerd) captures these streams and writes them to files on the Node, under /var/log/pods/ with convenience symlinks at /var/log/containers/*.log.
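
Each file uses the CRI logging format: an RFC3339 timestamp, the stream name, a tag (F for a full line, P for a partial one), then the raw message. A single line might look like this (values illustrative):

2023-10-01T12:05:00.123456789Z stdout F {"level":"error","msg":"DB failed","service":"payment"}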

The Logging Pipeline

Since logs on the Node are deleted when the Pod is deleted, you need a Cluster-Level Logging Stack to ship them to safety.

graph LR
    subgraph "Worker Node"
        Pod["App Pod"] -->|stdout| File["/var/log/containers/..."]
        Agent["Log Agent<br/>(Fluent Bit / Promtail)"] -.->|Reads| File
    end

    Agent -->|Push| Backend["Log Storage<br/>(Loki / Elasticsearch)"]
    Backend -->|Query| UI["Dashboard<br/>(Grafana / Kibana)"]

The "DaemonSet" Pattern

The most common architecture is running a Logging Agent (like Fluent Bit or Promtail) as a DaemonSet.

  1. One agent runs on every Node.
  2. It mounts the Node's /var/log directory as a read-only hostPath volume (so it can follow the symlinks in /var/log/containers).
  3. It tails every log file, enriches each line with metadata (Pod name, Namespace), and pushes it to the backend (see the sketch below).
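
A minimal sketch of that pattern with Fluent Bit. The image tag and namespace are illustrative, and a real deployment would add a ConfigMap defining inputs and outputs (most teams use the official Helm chart instead):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit   # needs RBAC to enrich logs with Pod metadata
      tolerations:
      - operator: Exists               # run on every Node, even tainted ones
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.2
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true               # the agent only reads; it never writes here
      volumes:
      - name: varlog
        hostPath:
          path: /var/log               # holds pods/ files and containers/ symlinks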

3. Best Practices

Security

  1. Alert on 403 Forbidden: If your audit logs show a user trying to read Secrets and getting denied 10 times in a minute, you are probably under attack. Alert on this pattern (see the sketch after this list).
  2. Separate Retention: Audit logs are legal documents. Keep them in a separate bucket (e.g., S3 Glacier) for 1 year, even if you only keep App logs for 7 days.
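
As a sketch, that 403 alert could be expressed as a Loki ruler rule. The {job="audit"} selector is an assumption about how your agent labels the shipped audit log; the field names come from the audit Event schema, flattened by Loki's json parser:

groups:
- name: audit-security
  rules:
  - alert: RepeatedForbiddenSecretAccess
    expr: |
      sum by (user_username) (
        count_over_time(
          {job="audit"}
            | json
            | objectRef_resource = "secrets"
            | responseStatus_code = "403"
          [1m]
        )
      ) > 10
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.user_username }} was denied access to Secrets more than 10 times in 1 minute"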

Operations

  1. JSON Logging: Force your developers to log in JSON format.
    • Bad: 2023-10-01 Error: DB failed (Hard to parse).
    • Good: {"level": "error", "msg": "DB failed", "service": "payment"} (Easy to filter).
  2. Don't Log to Files Inside Containers: If your legacy app writes to /app/logs/server.log, standard Kubernetes logging will not catch it. You must use a "Sidecar" container to tail -f that file to stdout (sketched below), or configure the app to write to stdout directly.
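
A sketch of that sidecar (image names are illustrative): the app writes to a shared emptyDir volume, and a companion container streams the file to its own stdout, where the normal pipeline picks it up.

apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
spec:
  containers:
  - name: app
    image: legacy-app:1.0            # hypothetical image that logs to a file
    volumeMounts:
    - name: app-logs
      mountPath: /app/logs
  - name: log-tailer
    image: busybox:1.36
    args: ["/bin/sh", "-c", "tail -n+1 -F /app/logs/server.log"]
    volumeMounts:
    - name: app-logs
      mountPath: /app/logs
      readOnly: true                 # the sidecar only reads the file
  volumes:
  - name: app-logs
    emptyDir: {}                     # shared scratch space; deleted with the Pod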

Summary

  • Audit Logs track API access ("Who did it?"). They are configured via a Policy File on the control plane.
  • App Logs track application health. They rely on the stdout/stderr stream.
  • Persistence: Logs are ephemeral. You must use a collection agent (DaemonSet) to ship them to a central store (Loki/Elastic), or they will be lost when the Pod is deleted.
  • Golden Rule: Avoid Request and RequestResponse logging for Secrets to prevent credential leaks.