Resource Requests and Limits

Resource settings are among the most important workload controls in Kubernetes.

They determine scheduling quality, runtime stability, and autoscaling behavior.

Core model

requests: the amount of CPU and memory the scheduler reserves on a node when it places the pod
limits: the maximum amount of CPU and memory a container may use at runtime

If requests are too low, the scheduler packs pods too tightly and nodes become overcommitted and unstable under load. If they are too high, nodes look full while actual usage stays low, and cluster capacity is wasted.
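The placement effect of requests can be illustrated with a minimal Python sketch. It assumes the scheduler only compares the sum of requests against a node's allocatable capacity; the real scheduler also applies affinity, taints, and scoring. The names `Requests` and `fits` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Requests:
    cpu_millicores: int
    memory_mib: int


def fits(allocatable: Requests, scheduled: list[Requests], candidate: Requests) -> bool:
    """Return True if the candidate's requests fit into what the node has left.

    Only requests are counted. Actual usage and limits play no role here,
    which is why undersized requests lead to overcommitted nodes.
    """
    used_cpu = sum(r.cpu_millicores for r in scheduled)
    used_mem = sum(r.memory_mib for r in scheduled)
    return (
        used_cpu + candidate.cpu_millicores <= allocatable.cpu_millicores
        and used_mem + candidate.memory_mib <= allocatable.memory_mib
    )


# Hypothetical node with 4 cores and 16 GiB allocatable.
node = Requests(cpu_millicores=4000, memory_mib=16384)
pod = Requests(cpu_millicores=250, memory_mib=512)

print(fits(node, [pod] * 15, pod))  # True: the 16th pod brings CPU requests to exactly 4000m
print(fits(node, [pod] * 16, pod))  # False: a 17th pod would exceed allocatable CPU
```

In this sketch the node is "full" after 16 pods regardless of how much CPU those pods actually consume.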

CPU and memory behavior

CPU and memory limits fail differently:

CPU over limit: the container is throttled, which usually shows up as increased latency
Memory over limit: the container is OOM-killed and then restarted according to the pod's restartPolicy

Throttling slows a container down, whereas an OOM kill terminates it. That is why memory sizing errors are typically more disruptive.
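On Linux nodes, CPU limits are typically enforced through CFS bandwidth control: the limit is translated into a quota of CPU time per scheduling period. The sketch below assumes the default 100 ms period and a simplified workload in which every thread is fully CPU-bound and runs in parallel; the function names are hypothetical.

```python
CFS_PERIOD_US = 100_000  # default CFS period of 100 ms (assumed not overridden on the node)


def cfs_quota_us(cpu_limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the container may consume per period."""
    return cpu_limit_millicores * period_us // 1000


def throttled_us_per_period(
    cpu_limit_millicores: int, busy_threads: int, period_us: int = CFS_PERIOD_US
) -> int:
    """Wall-clock time per period during which the container is throttled.

    The quota is shared by all threads, so N busy threads spend it
    N times faster than a single thread would.
    """
    quota = cfs_quota_us(cpu_limit_millicores, period_us)
    wall_time_until_quota_spent = quota // busy_threads
    return max(0, period_us - wall_time_until_quota_spent)


print(cfs_quota_us(1000))                # 100000
print(throttled_us_per_period(1000, 1))  # 0
print(throttled_us_per_period(1000, 4))  # 75000
```

Under these assumptions, a container with a 1000m limit and four busy threads runs for 25 ms and then waits for the remaining 75 ms of each period, which is where the added latency comes from.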

Example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: ghcr.io/example/api:v5.2.1
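          resources:
            requests:        # used by the scheduler for placement
              cpu: 250m      # 0.25 core
              memory: 512Mi
            limits:          # enforced at runtime
              cpu: 1000m     # 1 core; usage above this is throttled
              memory: 1Gi    # usage above this triggers an OOM kill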
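To compare requests with limits numerically, the quantity strings in the manifest above can be converted to plain numbers. The following Python sketch handles only a subset of the Kubernetes quantity syntax (millicore or decimal CPU values, and binary memory suffixes). The helpers `cpu_millicores` and `memory_bytes` are hypothetical and not part of any Kubernetes client library.

```python
# Binary suffixes only; decimal suffixes such as "M" or "G" are not handled in this sketch.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m', '1', or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '1Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


# Ratio of limit to request for the example Deployment.
print(cpu_millicores("1000m") / cpu_millicores("250m"))  # 4.0
print(memory_bytes("1Gi") / memory_bytes("512Mi"))       # 2.0
```

In the example, the container may burst to four times its requested CPU and twice its requested memory.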

QoS classes

Pod QoS class is derived from resource configuration.

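Guaranteed: every container has CPU and memory limits set, and its requests equal those limits
Burstable: at least one container has a CPU or memory request or limit, but the pod does not meet the Guaranteed criteria
BestEffort: no container has any CPU or memory requests or limits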

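Under node pressure, the kubelet first evicts pods whose usage exceeds their requests. BestEffort pods have no requests, so they are usually evicted first, followed by Burstable pods running above their requests.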
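The classification rules above can be expressed as a small function. This is a simplified sketch: it looks only at CPU and memory, and it compares quantity strings literally, so "1" and "1000m" would count as different values even though Kubernetes treats them as equal. The function name `qos_class` and the input shape are assumptions made for illustration.

```python
def qos_class(containers: list[dict]) -> str:
    """Derive the pod QoS class from each container's 'requests' and 'limits' dicts."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            request = requests.get(resource)
            limit = limits.get(resource)
            if request is not None or limit is not None:
                any_set = True
            # A missing request defaults to the limit, so only the limit is
            # mandatory for Guaranteed; an explicit request must match it.
            if limit is None or (request is not None and request != limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


api = {
    "requests": {"cpu": "250m", "memory": "512Mi"},
    "limits": {"cpu": "1000m", "memory": "1Gi"},
}
print(qos_class([api]))  # Burstable
print(qos_class([{}]))   # BestEffort
```

A pod with the resources from the example Deployment is Burstable, because its requests are lower than its limits.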

Sizing guidance

Start from observed p50 and p95 usage, not guesses
Keep requests close to realistic baseline load
Set memory limits with enough headroom for peak behavior
Avoid setting very low CPU limits on latency-sensitive services
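One way to turn this guidance into numbers is sketched below. It assumes a list of observed memory samples in MiB, uses p50 as the request baseline, and derives the limit from p95 plus a headroom percentage. The 20% default headroom and the function names are illustrative assumptions, not recommendations from this page.

```python
def percentile(samples: list[int], pct: int) -> int:
    """Nearest-rank percentile of the samples, using integer arithmetic."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceil(pct / 100 * n)
    return ordered[rank - 1]


def suggest_memory(samples_mib: list[int], headroom_pct: int = 20) -> dict:
    """Suggest a memory request and limit in MiB from observed usage samples."""
    p50 = percentile(samples_mib, 50)
    p95 = percentile(samples_mib, 95)
    # Round the limit up so the headroom is never smaller than requested.
    limit = (p95 * (100 + headroom_pct) + 99) // 100
    return {"request_mib": p50, "limit_mib": limit}


# Hypothetical usage samples in MiB collected over a representative period.
samples = [410, 420, 430, 450, 455, 460, 470, 480, 520, 700]
print(suggest_memory(samples))  # {'request_mib': 455, 'limit_mib': 840}
```

The sample window should cover realistic peak traffic; otherwise p95 underestimates the memory the service needs.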

Relationship to autoscaling

HPA resource utilization targets are calculated as a percentage of the containers' requests, so bad request values produce bad scaling decisions.

Always tune requests before tuning autoscaler thresholds.
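The dependency on requests follows from the HPA scaling formula, in which the desired replica count is the current count multiplied by the ratio of current to target utilization, rounded up. The Python sketch below reproduces only that core formula with the default 10% tolerance; it omits min/max replica bounds, stabilization windows, and handling of unready pods. The function name and parameters are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    avg_usage_millicores: float,
    request_millicores: float,
    target_utilization_pct: float,
    tolerance: float = 0.1,
) -> int:
    """Core HPA calculation for a CPU utilization target."""
    # Utilization is measured against the request, not against the limit.
    utilization_pct = 100 * avg_usage_millicores / request_millicores
    ratio = utilization_pct / target_utilization_pct
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    return math.ceil(current_replicas * ratio)


# Same real usage (200m per pod) and target (70%), different requests.
print(desired_replicas(3, 200, 250, 70))   # 4
print(desired_replicas(3, 200, 100, 70))   # 9
print(desired_replicas(3, 200, 1000, 70))  # 1
```

With identical load, an undersized request makes the autoscaler overshoot to 9 replicas, and an oversized request makes it scale down to 1.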

Operational checks

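# Current CPU and memory usage per pod (requires metrics-server)
kubectl top pods -A
# Configured requests and limits, QoS class, and recent events
kubectl describe pod <pod-name>
# Last termination state per container, including the reason (for example OOMKilled)
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].lastState}'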

Look for containers whose last termination reason is OOMKilled, and for sustained CPU throttling in metrics such as cAdvisor's container_cpu_cfs_throttled_periods_total.
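The OOMKilled check can be automated against the JSON form of a pod. The sketch below assumes the input is the parsed output of `kubectl get pod <pod-name> -o json`; the function name `oom_killed_containers` and the sample data are hypothetical.

```python
def oom_killed_containers(pod: dict) -> list[str]:
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState is empty for containers that have never been restarted.
        terminated = status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(status["name"])
    return names


# Hypothetical pod status, trimmed to the fields this check reads.
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "api", "lastState": {"terminated": {"reason": "OOMKilled"}}},
            {"name": "sidecar", "lastState": {}},
        ]
    }
}
print(oom_killed_containers(sample_pod))  # ['api']
```

To use it against a live cluster, load the kubectl JSON output with `json.load` and pass the resulting dict to the function.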

Summary

Requests drive placement. Limits enforce runtime caps. Correct values improve stability, cost efficiency, and autoscaling quality.