LEARN · DEBUGGING GUIDE

Debugging Kubernetes Pod OOMKilled: Beyond the Memory Limit

When a pod gets OOMKilled, the obvious culprit is hitting the memory limit. But often the real cause is subtler: silent memory leaks, cgroup pressure from sidecars, or misconfigured limits.

Intermediate · Kubernetes · 8 min read

What this usually means

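The OOMKilled status means the Linux kernel's Out-Of-Memory (OOM) killer terminated a process in the container, usually because the container's cgroup reached the memory limit set in the pod spec. Less often, the node itself ran out of memory and the kernel picked a process in this container as its victim, even though the container was under its limit. Kubelet eviction under node memory pressure is a separate mechanism and reports the pod as Evicted rather than OOMKilled. The kernel tracks memory usage per cgroup; when a container's cgroup hits its limit and nothing more can be reclaimed, the OOM killer fires. The application is not always at fault: in a multi-container pod the sidecar proxy or log shipper may be the container that hits its limit, and shared memory (tmpfs) counts against the budget too. Memory limits set too low for bursty workloads also cause avoidable kills.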

( 01 ) Fast diagnosis

The first ten minutes — establish facts before touching code.

  • 1. Run 'kubectl describe pod <pod-name>' and check the 'Last State' section: look for 'Reason: OOMKilled' and 'Exit Code: 137'.
  • 2. Check current memory usage with 'kubectl top pod <pod-name> --containers', which shows per-container usage. If the pod has multiple containers, one may be hogging memory.
  • 3. Inspect the container's cgroup memory usage with 'kubectl exec <pod-name> -c <container> -- cat /sys/fs/cgroup/memory/memory.usage_in_bytes' on cgroup v1 nodes, or 'cat /sys/fs/cgroup/memory.current' on cgroup v2 nodes. Reading the container's own cgroup files normally needs no extra privileges, but the image must include 'cat'.
  • 4. Look at node memory pressure: 'kubectl describe node <node-name>' under 'Conditions'. If 'MemoryPressure' is True, the node itself is tight, and the failure may be a node-level eviction or OOM rather than a container hitting its limit.
  • 5. Review the application's memory profile: use a memory profiler (e.g., pprof for Go, heap dump for Java) to see if memory grows unbounded over time.
  • 6. Check if any container is using tmpfs (emptyDir with medium: Memory), which counts toward the container's memory limit.
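
Steps 1 and 2 can be scripted once there are many pods to check. The sketch below is a minimal example, assuming the pod has been fetched as JSON (for instance with 'kubectl get pod <pod-name> -o json') and loaded into a dict. The function name and the sample data are hypothetical; the field names follow the pod status schema.

```python
def oom_killed_containers(pod: dict) -> list:
    """Return one record per container whose last termination was an OOM kill."""
    hits = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for status in statuses:
        terminated = status.get("lastState", {}).get("terminated")
        if terminated and terminated.get("reason") == "OOMKilled":
            hits.append({
                "container": status["name"],
                "exit_code": terminated.get("exitCode"),
                "restarts": status.get("restartCount", 0),
            })
    return hits


# Illustrative pod status, trimmed to the fields the function reads.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "app",
                "restartCount": 4,
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "sidecar", "restartCount": 0, "lastState": {}},
        ]
    }
}

print(oom_killed_containers(sample_pod))
```

To run it against a live pod, replace the sample with json.load(sys.stdin) and pipe the kubectl output into the script.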

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • Pod description: 'kubectl describe pod <pod-name>'. Check 'Last State' and the 'Containers' section for resource limits and requests.
  • Container logs: 'kubectl logs --previous <pod-name> -c <container>'. Look for any crash logs before the OOM, but expect them to be incomplete because the process is killed without warning.
  • Node conditions: 'kubectl describe node <node-name>'. Check MemoryPressure, DiskPressure, and allocatable resources.
  • Metrics from Prometheus (if available): queries like 'container_memory_working_set_bytes{container=~"<container>"}' or 'kube_pod_container_resource_limits{resource="memory"}'.
  • Cgroup OOM events: on a cgroup v1 node, 'cat /sys/fs/cgroup/memory/kubepods/<podUID>/<containerUID>/memory.oom_control' shows oom_kill_disable, the under_oom flag, and (on newer kernels) an oom_kill counter. The exact path varies with the cgroup driver and QoS class. On cgroup v2 nodes, read the oom_kill counter from 'memory.events' in the container's cgroup directory instead.
  • Application heap dumps or profiler output: capture them before the crash if you can. For Java, -XX:+HeapDumpOnOutOfMemoryError writes a dump when the JVM throws OutOfMemoryError; it does not fire when the kernel kills the process.
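
The cgroup event files mentioned above use the kernel's "key value" line format, so the kill counter is easy to extract. This is a minimal sketch, assuming the file content has already been read into a string; the function names and sample text are hypothetical, and the sample is only shaped like a cgroup v2 memory.events file.

```python
def parse_flat_keyed(text: str) -> dict:
    """Parse 'key value' lines into a dict of integers, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def oom_kill_count(text: str) -> int:
    """Number of OOM kills recorded for the cgroup, or 0 if the field is absent."""
    return parse_flat_keyed(text).get("oom_kill", 0)


# Illustrative content, not taken from a real node.
sample_events = "low 0\nhigh 0\nmax 12\noom 2\noom_kill 2\n"
print(oom_kill_count(sample_events))  # 2
```

A counter that increases between two reads confirms that the kernel killed something inside this cgroup, independent of what the Kubernetes API reports.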

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Memory limit set too low for normal traffic spikes: the pod OOMs under load but would be fine with a higher limit.
  • Memory leak in the application code: the heap grows unbounded until it hits the limit. Garbage-collected languages such as Java, Go, and Python are not immune, because objects that are still referenced are never collected.
  • Sidecar container consuming memory, e.g., an Envoy proxy or a fluentd log shipper with buffer growth. Each container has its own limit, so the sidecar can be the one that gets OOMKilled, and its usage adds to the pod's footprint on the node.
  • tmpfs mounts (emptyDir with medium: Memory) counting toward container memory: a large temporary file can trigger OOM.
  • Node memory pressure: the kubelet evicts pods even if they haven't hit their limit. Evicted pods report 'Evicted', not OOMKilled, but if the node runs out of memory before the kubelet reacts, the kernel's node-level OOM killer can kill a container that is under its limit, and that one does report OOMKilled.
  • Requests set far below limits: the scheduler packs nodes by requests, so the node ends up overcommitted, and under memory pressure the pods using more than their requests are evicted first.

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Adjust memory limits and requests based on steady-state usage plus 20-30% headroom. Sample 'kubectl top pod' (or query Prometheus history) over at least a day to set appropriate values.
  • Fix memory leaks: profile the application, identify leak sources (e.g., growing caches, collections that are never pruned), and apply fixes or tune GC settings.
  • Separate sidecar memory limits: set individual memory limits per container in the pod spec so one container cannot starve others.
  • Avoid using tmpfs for large data: use emptyDir backed by disk (the default) or a PVC. If tmpfs is necessary, account for its size in the memory limit.
  • Set pod priority classes for critical pods, or adjust kubelet eviction thresholds to make node-level eviction less aggressive (not recommended generally).
  • Implement Horizontal Pod Autoscaler (HPA) based on memory usage to scale out before hitting limits. This helps when memory tracks load; it does not help with a leak.

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • After applying fixes, run load tests that reproduce the traffic spike and monitor memory usage: 'watch kubectl top pod <pod-name> --containers' should stay below the limit ('kubectl top' has no watch flag of its own).
  • Check that the pod restart count stops increasing: 'kubectl get pods -w | grep <pod-name>'.
  • Verify node memory pressure disappears: 'kubectl describe node <node-name>' shows MemoryPressure is False.
  • For memory leaks, run the application for 24+ hours and observe heap growth via metrics; use a diff of heap dumps to confirm the leak is gone.
  • If a sidecar was the issue, after limiting sidecar memory, verify that the main container has stable memory usage and no OOM kills.
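
"Observe heap growth" is easier to judge with a number than by eye. The sketch below fits a least-squares line through memory samples and estimates the time left before the limit is reached. It is a minimal example under stated assumptions: samples are (hours since start, MiB used) pairs exported from your metrics system, and both function names are hypothetical.

```python
def growth_per_hour(samples: list) -> float:
    """Least-squares slope of memory against time for (hours, mib) pairs."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("need samples at two or more distinct times")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


def hours_until_limit(current_mib: float, limit_mib: float, slope: float):
    """Projected hours until the limit is hit, or None if usage is not growing."""
    if slope <= 0:
        return None
    return (limit_mib - current_mib) / slope


# Illustrative samples: usage rising by 10 MiB every hour.
leaky = [(0, 300), (1, 310), (2, 320)]
slope = growth_per_hour(leaky)
print(slope, hours_until_limit(320, 512, slope))
```

A slope near zero over a 24-hour run supports the claim that the leak is fixed; a steady positive slope means the kill has only been postponed.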

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Setting a memory limit with no headroom above real peak usage: even tiny spikes then cause an OOM kill. Setting requests equal to limits is fine in itself (it gives the pod Guaranteed QoS), as long as that shared value covers the peaks.
  • Ignoring sidecar memory: many people only look at the main container and miss that the sidecar is the hog.
  • Running 'kubectl top pod' without '--containers': this gives the pod total, not per-container usage, hiding which container is problematic.
  • Assuming OOMKilled always means the app is at fault: a node-level OOM can kill a container that is under its own limit, and a sidecar may be the real culprit.
  • Adding more memory without profiling: this masks the leak and increases cost; fix the root cause instead.
  • Not setting resource limits at all: pods can consume node memory and cause other pods to be evicted.

( 07 ) War story

A Java Service OOMing Every Hour at Peak

Platform Engineer · Kubernetes 1.22, Java 11 (Spring Boot), Prometheus + Grafana, Istio sidecar

Timeline

  1. 13:30 PagerDuty alerts: payment-service pod in 'CrashLoopBackOff' with OOMKilled.
  2. 13:32 I run 'kubectl describe pod payment-service-7f8c9' and confirm OOMKilled, Exit Code 137.
  3. 13:35 Check 'kubectl top pod payment-service-7f8c9 --containers': main container using 450Mi, Istio-proxy using 150Mi; the main container's limit is 512Mi.
  4. 13:40 Look at Prometheus: container_memory_working_set_bytes shows the main container climbing from 300Mi to 480Mi in 5 minutes, then OOM.
  5. 13:45 I enable heap dump on OOM for Java: add -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump.hprof to JAVA_OPTS.
  6. 14:00 Pod OOMs again; I kubectl cp the heap dump and analyze with Eclipse MAT. Found a growing HashMap in the payment processing thread that never clears entries for abandoned transactions.
  7. 14:30 Hotfix: increase memory limit to 1Gi temporarily to stop the alerts; team writes a fix to clean up the HashMap periodically.
  8. 15:00 Deploy the fix with the original 512Mi limit; monitor for 2 hours with no OOM. Confirm via Grafana that memory stays below 400Mi.

It was 1:30 PM on a Tuesday when the alerts hit. Payment-service was restarting every few minutes, and the status was OOMKilled. My first instinct was to check the resource limits. The pod spec had a memory limit of 512Mi, and 'kubectl top' showed the main container peaking near 500Mi. It looked like a simple limit issue—just increase it. But I wanted to know why it suddenly started OOMing after weeks of stable running.

I dug into the metrics. The memory usage pattern showed a slow climb from 300Mi to 480Mi over 5 minutes, then the kill. That suggested a leak, not a traffic spike. I enabled Java heap dumps on OOM and waited for the next crash. When it hit, I copied the dump and opened it in Eclipse MAT. The leak was obvious: a HashMap in the payment processor that stored transaction contexts but never removed them for abandoned transactions. Over the course of an hour, it grew until the container hit the limit.

We deployed a temporary fix by doubling the memory limit to 1Gi to stop the immediate bleeding, but the real fix was adding a scheduled cleanup of stale entries. Once the code change was live, we reverted the limit back to 512Mi. The memory usage stabilized at around 350Mi. The lesson: increasing limits without understanding the memory behavior is a band-aid. Always profile first—the OOM killer is a symptom, not the cause.

Root cause

Memory leak in Java payment service: a HashMap in the payment processor grew unbounded due to abandoned transactions not being cleaned up.

The fix

Added a scheduled task to periodically remove abandoned transaction entries from the HashMap. The memory limit was raised to 1Gi as a stopgap until the fix shipped, then returned to 512Mi.

The lesson

Don't just increase memory limits; profile the application to find the leak. Use heap dumps on OOM to pinpoint the culprit.

( 08 ) How the OOM Killer Works in Kubernetes

The Linux kernel's OOM killer is invoked when a cgroup's memory usage hits its limit (memory.max in cgroup v2 or memory.limit_in_bytes in v1) and the kernel cannot reclaim enough memory to stay under it. Kubernetes sets these cgroup limits based on the container's resources.limits.memory. The kernel then kills the process with the highest badness score in that cgroup. When that is the container's main process, the container exits with code 137 (128 plus 9, the signal number of SIGKILL), and Kubernetes reports OOMKilled.

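Not all memory is equal: the cgroup tracks RSS, cache, swap (if enabled), and kernel memory. Heavy tmpfs use can push a container over the limit; clean page cache is usually reclaimed first, but dirty or actively used cache still counts. A node-level OOM is a separate event: if total usage exhausts the node's memory before the kubelet reacts, the kernel's global OOM killer picks a victim, guided by oom_score_adj values that Kubernetes derives from each pod's QoS class, and that container also reports OOMKilled. Kubelet eviction, by contrast, terminates whole pods under memory pressure and marks them Evicted.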
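
The exit code 137 follows the common convention that a process killed by signal N is reported as 128 + N. The sketch below decodes an exit code with Python's standard signal module; the function name is hypothetical, and signal names depend on the platform where the script runs (Linux is assumed here).

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"


print(describe_exit_code(137))
print(describe_exit_code(1))
```

Exit code 137 alone only says the process received SIGKILL. It is the 'OOMKilled' reason in the container status that attributes the kill to the OOM killer.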

( 09 ) Differentiating Pod vs Node OOM

Memory can take a pod down in two ways: (1) a container exceeds its own memory limit and the OOM killer fires inside that container's cgroup (container-level OOM), or (2) the node runs out of memory (node-level). Node-level trouble has two forms. Either the kubelet evicts the pod once an eviction threshold is crossed, which leaves the pod in phase Failed with reason Evicted, or the kernel's global OOM killer fires first and kills a process in some container, which reports OOMKilled even though that container was under its limit.

To differentiate, check 'kubectl describe pod' for the 'Reason' field under 'Last State'. If it says 'OOMKilled' with exit code 137 and usage was close to the container's limit, it is container-level. If the pod's status reason is 'Evicted', or the node reported MemoryPressure or a system OOM event around the same time, it is node-level. Also compare the memory committed on the node with its allocatable memory in 'kubectl describe node': a node running close to capacity makes node-level OOM more likely.
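
The same decision can be made from the pod's JSON. This is a rough sketch, assuming a pod object loaded from 'kubectl get pod <pod-name> -o json'; the function name and return strings are hypothetical. It cannot tell a container-level OOM from a node-level kernel OOM on its own, since both report OOMKilled, so node conditions still need a manual look.

```python
def classify_memory_failure(pod: dict) -> str:
    """Separate kubelet eviction from an OOM kill using pod status fields."""
    status = pod.get("status", {})
    # Kubelet eviction is recorded at the pod level.
    if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
        return "evicted by kubelet"
    # OOM kills are recorded per container, in the current or previous state.
    for cs in status.get("containerStatuses", []):
        for state_key in ("lastState", "state"):
            terminated = cs.get(state_key, {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return f"OOM kill in container {cs['name']}"
    return "no memory-related failure recorded"


# Illustrative pod objects, trimmed to the fields the function reads.
evicted = {"status": {"phase": "Failed", "reason": "Evicted"}}
oom = {"status": {"phase": "Running", "containerStatuses": [
    {"name": "app", "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}}
]}}

print(classify_memory_failure(evicted))
print(classify_memory_failure(oom))
```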

( 10 ) Memory Accounting: What Counts Toward the Limit

The container's memory limit includes all memory used by processes in the cgroup: anonymous memory (heap, stack), page cache (file I/O buffers), and kernel memory (slab). Shared memory (tmpfs) also counts. Some memory is not charged to the container: the kernel's own overhead and the memory used by the container runtime (containerd, dockerd) sit outside its cgroup.

A common pitfall is that writing to a file in an emptyDir volume backed by memory (medium: Memory) counts against the container's memory limit. If your app writes large temporary files to tmpfs, it can trigger OOM even if the app's heap is small. Use disk-backed emptyDir for large I/O, or size the memory limit to include the tmpfs data.
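
To see which of these categories is filling the budget, the cgroup's memory.stat file can be split into a few buckets. The sketch below is a minimal example assuming cgroup v2 field names (anon, file, shmem, slab), where the kernel documents "file" as including tmpfs and shared memory. The function name, bucket names, and sample values are hypothetical, and cgroup v1 uses different keys.

```python
def memory_breakdown(stat_text: str) -> dict:
    """Split cgroup v2 memory.stat content into heap-like, cache, tmpfs, and slab bytes."""
    stats = {}
    for line in stat_text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[key] = int(value)
    shmem = stats.get("shmem", 0)
    return {
        "anon": stats.get("anon", 0),
        # 'file' includes tmpfs/shm on cgroup v2, so subtract it to get plain page cache.
        "page_cache": stats.get("file", 0) - shmem,
        "tmpfs_and_shm": shmem,
        "slab": stats.get("slab", 0),
    }


# Illustrative values: 100 MiB anon, 300 MiB file (of which 250 MiB is tmpfs), 8 MiB slab.
sample_stat = "anon 104857600\nfile 314572800\nshmem 262144000\nslab 8388608\n"
print(memory_breakdown(sample_stat))
```

In this made-up sample the tmpfs bucket is larger than the application's anonymous memory, which is the pattern to look for when a small-heap app keeps getting killed.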

( 11 ) Profiling Memory Usage in Live Pods

To catch memory leaks, you need to examine memory usage over time. Use 'kubectl top pod --containers' to get recent per-container usage (the numbers come from the metrics pipeline, so they lag slightly). For deeper introspection, exec into the pod and use tools like 'ps aux --sort=-%mem' to see which process is consuming memory, or read cgroup stats: 'cat /sys/fs/cgroup/memory/memory.stat' on cgroup v1, or 'cat /sys/fs/cgroup/memory.stat' on cgroup v2, gives detailed breakdowns (cache, RSS, swap, etc.).

For Java applications, enable JMX or use 'jcmd' to trigger heap dumps. For Go, use pprof endpoints. For Node.js, use --inspect with Chrome DevTools. To automate heap dumps, set JVM flags like -XX:+HeapDumpOnOutOfMemoryError and save the dump to a PersistentVolumeClaim. The flag fires on a JVM OutOfMemoryError, not on a kernel OOM kill, so keep the maximum heap below the container limit for it to trigger.

( 12 ) Setting Proper Memory Requests and Limits

Memory requests are used by the scheduler to place pods; limits enforce hard caps. The golden rule: set requests based on the 95th percentile of steady-state usage, and limits 20-30% higher to allow for spikes. Setting requests equal to limits is not harmful in itself, since it gives the pod Guaranteed QoS; what causes unnecessary OOM kills is a limit with no headroom above real peak usage, whatever the request is. Use Vertical Pod Autoscaler (VPA) in recommendation mode to get suggested values from historical metrics.

For multi-container pods, set individual container limits. A container without a limit can grow until the node is under pressure and starve its neighbors. Also consider the pod's overall memory footprint: the scheduler places a pod only on a node whose allocatable memory covers the sum of the containers' requests, while the sum of limits may exceed it (overcommit). Guaranteed QoS (requests == limits for every container) puts the pod among the last candidates for eviction and for the node-level OOM killer, at the cost of reserving the full limit on the node.
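
The percentile-plus-headroom rule can be sketched as a small calculation. This is an illustration only, assuming usage samples in MiB exported from your metrics system, a nearest-rank 95th percentile, and a default headroom of 25% (the middle of the 20-30% range). The function and parameter names are hypothetical.

```python
import math


def suggest_memory(samples_mib: list, headroom: float = 0.25) -> dict:
    """Suggest a request (nearest-rank p95) and a limit (p95 plus headroom), in MiB."""
    if not samples_mib:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mib)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    return {"request_mib": p95, "limit_mib": math.ceil(p95 * (1 + headroom))}


# Illustrative samples: mostly 300 MiB with two bursts.
samples = [300] * 18 + [400, 520]
print(suggest_memory(samples))  # {'request_mib': 400, 'limit_mib': 500}
```

Note that the single 520 MiB burst in the sample would still exceed the suggested 500 MiB limit. A percentile-based rule deliberately ignores the rarest peaks, so check the maximum as well before applying a value.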
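
The QoS class follows mechanically from the requests and limits of every container, which makes it easy to check before deploying. The sketch below is a simplified version of that rule, assuming a list of container dicts shaped like the pod spec's 'containers' entries. The function name is hypothetical, and quantities are compared as plain strings, so "1Gi" and "1024Mi" would wrongly count as different.

```python
def qos_class(containers: list) -> str:
    """Approximate the pod QoS class from cpu/memory requests and limits."""
    any_set = False
    all_guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When a request is omitted, Kubernetes defaults it to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


# Illustrative container specs.
guaranteed = [{"resources": {"requests": {"cpu": "500m", "memory": "512Mi"},
                             "limits": {"cpu": "500m", "memory": "512Mi"}}}]
burstable = [{"resources": {"requests": {"memory": "256Mi"},
                            "limits": {"memory": "512Mi"}}}]

print(qos_class(guaranteed), qos_class(burstable), qos_class([{}]))
```

A pod with one fully specified container and one sidecar without any resources comes out as Burstable, which is a common surprise when a sidecar is injected automatically.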

Frequently asked questions

What is the difference between OOMKilled and CrashLoopBackOff?

OOMKilled is a specific termination reason meaning the process was killed by the kernel's OOM killer. CrashLoopBackOff is a state where the pod keeps crashing and restarting; the crash reason can be OOMKilled, but also other exit codes. Use 'kubectl describe pod' to see the last state's reason.

My pod has multiple containers; which container caused the OOM?

Check 'kubectl top pod <pod> --containers' to see per-container memory usage. The container with the highest usage relative to its limit is likely the culprit. Also, 'kubectl describe pod' shows the last state of each container—look for OOMKilled in one of them.

Can swap cause OOMKilled issues?

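When swap is enabled, the kernel can swap out memory pages, delaying OOM. Kubernetes has traditionally expected swap to be disabled on nodes, and the kubelet has historically refused to start on a node with swap enabled unless configured otherwise. Opt-in swap support for Linux nodes arrived as an alpha feature in Kubernetes 1.22 and has been maturing in later releases. If swap is on, the OOM killer can still fire when memory pressure is high and swap is saturated. Disable swap for predictable behavior.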

How do I set up a heap dump on OOM for Java in Kubernetes?

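Add JVM options: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump.hprof. The dump is written when the JVM throws java.lang.OutOfMemoryError, not when the kernel OOM killer sends SIGKILL, so cap the heap below the container limit (for example with -Xmx or -XX:MaxRAMPercentage) so that the JVM runs out of heap first. Then configure the pod to mount a PersistentVolumeClaim at /tmp (or use emptyDir with a size limit) so the file survives the container restart, and copy the dump out with 'kubectl cp'. Better: automate upload to S3 using a sidecar.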

Should I always increase memory limits when I see OOMKilled?

No. Increasing limits may mask a memory leak or misconfiguration. Always profile first: check if memory grows over time (leak) or spikes under load (needs headroom). If it's a leak, fix the code. If it's a spike, adjust limits but also consider HPA or better resource management.