Cilium

Cilium is a CNI plugin that implements Kubernetes networking, network policy, and observability using eBPF (extended Berkeley Packet Filter). It runs programs directly in the Linux kernel without kernel modules or sidecars, giving it performance characteristics that iptables-based CNIs cannot match at scale.

Why eBPF changes the networking model

Traditional Kubernetes networking routes service traffic through iptables chains built by kube-proxy. At scale (thousands of services, tens of thousands of endpoints), those chains grow into the hundreds of thousands of rules. Rule evaluation is linear, rule changes require a full table rewrite, and operators get little visibility into what the datapath is actually doing.

eBPF programs attach to kernel hooks and execute in a sandboxed JIT-compiled environment. Cilium attaches to the network interface XDP hook for the fastest path, tc (traffic control) ingress/egress hooks for policy, and cgroup socket hooks for load balancing. Maps (hash tables, LRUs) replace iptables rules and update atomically in O(1).
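
To see this concretely, the agent's embedded CLI can dump the maps it manages on a node. A quick sketch, assuming the default kube-system install and the DaemonSet name cilium:

# List the eBPF maps the agent maintains on this node
kubectl -n kube-system exec ds/cilium -- cilium map list
# Service-to-backend translations used for load balancing
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list
# Local endpoints known to the datapath
kubectl -n kube-system exec ds/cilium -- cilium bpf endpoint list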

Architecture

flowchart TD
    subgraph Node
        Kubelet --> CiliumAgent["cilium-agent<br/>(DaemonSet)"]
        CiliumAgent --> eBPF["eBPF programs<br/>loaded into kernel"]
        CiliumAgent --> Maps["eBPF maps<br/>(policy, endpoints,<br/>services, NAT)"]
        eBPF --> NIC[Network Interface]
    end
    subgraph Control
        CiliumOperator["cilium-operator<br/>(Deployment)"] --> IPAM["IPAM pool<br/>management"]
        CiliumOperator --> CRDs["CiliumNetworkPolicy<br/>CiliumClusterwideNetworkPolicy"]
    end
    CiliumAgent <-- KV --> etcd[("etcd / CRDs")]
    subgraph Hubble
        HubbleRelay[hubble-relay] --> HubbleUI[hubble-ui]
        CiliumAgent --> HubbleRelay
    end

cilium-agent runs on every node. It watches the Kubernetes API for endpoint, service, and policy changes, then compiles and loads eBPF programs and updates maps accordingly. No coordination with other nodes is needed for datapath decisions - the agent is fully self-contained.

cilium-operator handles cluster-wide operations: IPAM pool management, CRD cleanup, and leader-elected tasks that shouldn't run on every node.
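
A quick health check of both components, assuming the default Helm resource names (DaemonSet cilium, Deployment cilium-operator):

kubectl -n kube-system get ds cilium                 # one agent per node
kubectl -n kube-system get deploy cilium-operator    # leader-elected operator
kubectl -n kube-system exec ds/cilium -- cilium status --brief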

Installation

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.16.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=6443 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
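
After the install settles, the cilium CLI (if installed) can confirm the agent, operator, and datapath are healthy end to end:

cilium status --wait        # blocks until all components report ready
cilium connectivity test    # runs pod-to-pod, service, and policy checks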

kube-proxy replacement

Cilium can replace kube-proxy entirely. Service load balancing moves to eBPF socket hooks - the kernel rewrites the destination address at connect time, before a packet even enters the network stack. This removes iptables rule traversal and netfilter conntrack overhead from the service path, reduces latency, and scales to very large service and endpoint counts without linear rule evaluation.

# Verify kube-proxy is absent and Cilium handles services
cilium status --verbose
kubectl -n kube-system exec ds/cilium -- cilium service list

Network policy

Cilium enforces standard Kubernetes NetworkPolicy objects and extends them with CiliumNetworkPolicy for L3, L4, and L7 rules.

Standard NetworkPolicy (L3/L4)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  policyTypes:
    - Ingress

CiliumNetworkPolicy - L7 HTTP

CiliumNetworkPolicy can enforce HTTP methods and paths, DNS FQDNs, Kafka topics, and gRPC services:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/.*"
              - method: "POST"
                path: "/api/v1/orders"

DNS-based egress policy

Restrict egress by FQDN instead of IP (IPs rotate; FQDNs are stable). The policy also allows DNS with an L7 DNS rule, since Cilium has to observe the lookups to learn which IPs back the FQDN:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-stripe-egress
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment
  egress:
    # Allow DNS to kube-dns and let Cilium's DNS proxy observe the responses
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # Allow HTTPS only to the resolved IPs of api.stripe.com
    - toFQDNs:
        - matchName: "api.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP

Cilium intercepts DNS responses and populates an internal map of FQDN-to-IP. Policy enforcement uses that map rather than requiring operators to track IP ranges.
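
The learned mappings can be inspected directly on the agent, which is useful when an FQDN policy does not match the IPs you expect:

kubectl -n kube-system exec ds/cilium -- cilium fqdn cache list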

Hubble - network observability

Hubble is Cilium's observability layer. It provides flow-level visibility into all traffic, surfaced as logs, metrics, and a UI - without sidecar injection.

cilium hubble enable
hubble observe --namespace production --follow
hubble observe --pod frontend-7d9b84c4f-x2k4p --follow
hubble observe --verdict DROPPED --follow   # see what policy is blocking

Hubble exports Prometheus metrics for request rates, drop rates, and latency by namespace, pod, and destination. These are the same signals you'd get from a service mesh without the sidecar overhead.
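
Flow metrics are opt-in. One common set, enabled through the Helm chart (value names per the chart; adjust to your version):

helm upgrade cilium cilium/cilium --reuse-values \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,http}"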

# Find which flows are being dropped by policy
hubble observe --verdict DROPPED -o json | jq '{src: .source.namespace, dst: .destination.namespace, reason: .drop_reason_desc}'

Hubble UI provides a graphical service map that shows live traffic flows and policy decisions.

Encryption

Cilium supports transparent encryption of all node-to-node traffic with two options:

IPsec - kernel-native, highest compatibility, requires key rotation management:

helm upgrade cilium cilium/cilium --reuse-values \
  --set encryption.enabled=true \
  --set encryption.type=ipsec

WireGuard - simpler, better performance, requires kernel 5.6+:

helm upgrade cilium cilium/cilium --reuse-values \
  --set encryption.enabled=true \
  --set encryption.type=wireguard

Both modes encrypt all pod-to-pod traffic crossing node boundaries without application changes. WireGuard node keys are generated automatically by each agent, and session keys are rotated by the WireGuard protocol itself.
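
To confirm which mode is active, ask the agent (command names per the agent's embedded CLI; adjust to your version):

kubectl -n kube-system exec ds/cilium -- cilium status | grep -i encryption
kubectl -n kube-system exec ds/cilium -- cilium encrypt status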

Cluster Mesh

Cluster Mesh connects multiple Kubernetes clusters at the network level, allowing pods in one cluster to reach services in another using standard DNS and Kubernetes service names.

flowchart LR
    subgraph ClusterA["Cluster A"]
        PodA[Pod] --> SvcA[Service]
    end
    subgraph ClusterB["Cluster B"]
        SvcB["Global Service<br/>mirror in A"]
        PodB[Pod]
    end
    SvcA --> PodB

Enable and connect the mesh with the cilium CLI:

cilium clustermesh enable --context cluster-a
cilium clustermesh enable --context cluster-b
cilium clustermesh connect --destination-context cluster-b

Annotate a service as global:

metadata:
  annotations:
    service.cilium.io/global: "true"
    service.cilium.io/shared: "true"

Global services load-balance across healthy endpoints in all connected clusters. Combined with service.cilium.io/affinity: "local", you get prefer-local-cluster behavior with automatic failover.
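
Putting it together, a sketch of a global service; the name and ports are hypothetical, and a Service with the same name and namespace must exist in every cluster that should serve it:

apiVersion: v1
kind: Service
metadata:
  name: checkout
  namespace: production
  annotations:
    service.cilium.io/global: "true"    # merge endpoints across clusters
    service.cilium.io/shared: "true"    # export this cluster's endpoints
    service.cilium.io/affinity: "local" # prefer local endpoints, fail over remotely
spec:
  selector:
    app: checkout
  ports:
    - port: 80
      targetPort: 8080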

Cilium as a service mesh

Cilium's CiliumEnvoyConfig and Cilium Service Mesh mode provide L7 traffic management (retries, timeouts, header manipulation, traffic splitting) via Envoy deployed as a per-node DaemonSet - not per-pod sidecars. This gives service mesh capabilities with a much lower resource footprint.

helm upgrade cilium cilium/cilium --reuse-values \
  --set ingressController.enabled=true \
  --set ingressController.loadbalancerMode=dedicated
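
With the ingress controller enabled, standard Ingress objects route through Cilium's per-node Envoy by using the cilium ingress class; the backend service name and port below are hypothetical:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
spec:
  ingressClassName: cilium
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 8080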

BGP control plane

Cilium can advertise pod and service CIDRs to upstream BGP routers, enabling bare-metal clusters to expose LoadBalancer services without a cloud provider:

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack-bgp
spec:
  nodeSelector:
    matchLabels:
      rack: "1"
  virtualRouters:
    - localASN: 65001
      exportPodCIDR: true
      neighbors:
        - peerAddress: 10.0.0.1/32
          peerASN: 65000
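
For LoadBalancer services the BGP speakers need addresses to advertise; Cilium allocates them from a CiliumLoadBalancerIPPool (the BGP control plane itself is switched on with the bgpControlPlane.enabled Helm value). A minimal sketch with an example CIDR; field names vary slightly across versions:

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: rack-pool
spec:
  blocks:
    - cidr: 192.0.2.0/24   # addresses handed out to LoadBalancer services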

Operational patterns

Debug policy drops: hubble observe --verdict DROPPED is far faster than reading iptables logs. Each drop records the drop reason and direction, and policy-verdict events show which policy matched.

Identity-based policy: Cilium assigns a numeric identity to each endpoint based on its labels. Policy evaluation uses identities, not IPs. This means policy is stable across pod restarts and IP reassignments.
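
Identities are visible both as cluster-scoped CRDs and from the agent, which helps when a policy matches different endpoints than expected:

kubectl get ciliumidentities | head          # identity number -> security labels
kubectl -n kube-system exec ds/cilium -- cilium identity list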

Monitor eBPF map pressure: cilium bpf ct list global and cilium bpf nat list (run inside the agent pod) show the connection-tracking and NAT map contents, and the agent exports per-map fill-level metrics. At very high scale, tune map sizes before running into hard limits.
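
Map sizes are set through the Helm chart; a sketch of the relevant values (names per the chart, defaults differ by version):

helm upgrade cilium cilium/cilium --reuse-values \
  --set bpf.policyMapMax=65536 \
  --set bpf.mapDynamicSizeRatio=0.005   # size CT/NAT maps as a fraction of system memory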

Bandwidth management: Cilium supports egress bandwidth limits via eBPF traffic control, without requiring a separate rate-limiting sidecar.
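
Assuming the bandwidth manager is enabled (bandwidthManager.enabled in the Helm chart), pods opt in with a standard annotation; the pod name and image below are hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: uploader
  annotations:
    kubernetes.io/egress-bandwidth: "10M"   # cap this pod's egress at 10 Mbit/s
spec:
  containers:
    - name: app
      image: registry.example.com/uploader:latest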