LEARN · DEBUGGING GUIDE

Kubernetes pod CrashLoopBackOff: how to debug crashing pods

You deploy a new image. The pod enters CrashLoopBackOff. It starts for a few seconds, then dies. The logs show an error, or nothing at all. The pod will not stay up long enough to debug.

Advanced · Docker/deployment debugging

What this usually means

CrashLoopBackOff means the container's main process keeps exiting shortly after it starts, usually with a non-zero exit code, and Kubernetes restarts it with an increasing delay between attempts (10s, 20s, 40s, and so on, capped at five minutes). The cause is whatever made the process exit: a runtime error, a missing file, a failed dependency check, an OOM kill, or a misconfigured command or entrypoint.
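To make the back-off concrete, here is a minimal sketch that prints the approximate wait before each restart attempt. It assumes the documented kubelet defaults (10 seconds initially, doubling each time, capped at 300 seconds); the function name `backoff_schedule` is made up for illustration.

```shell
#!/bin/sh
# Print the approximate delay before each restart attempt.
# Assumption: kubelet defaults of 10s initial delay, doubling, 300s cap.
backoff_schedule() {
  attempts="$1"
  delay=10
  i=1
  while [ "$i" -le "$attempts" ]; do
    echo "restart $i: wait ${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
}
```

Calling `backoff_schedule 7` shows why a pod that has crashed a handful of times can sit idle for minutes: from the sixth restart onward the wait is pinned at the cap.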

( 01 ) Fast diagnosis

The first ten minutes: establish facts before touching code.

  1. Get the pod logs from the crashed container: kubectl logs <pod-name> --previous
  2. Check the exit code: kubectl describe pod <pod-name> shows it under Last State. Exit code 1 usually means an application error. 137 means the process was killed with SIGKILL (128 + 9), typically an OOM kill or a container that did not stop in time after a failed liveness probe.
  3. Check whether the container has resource limits. A Last State reason of OOMKilled means the container exceeded its memory limit.
  4. Check the startup command. kubectl describe pod shows the command and args. Confirm the binary exists in the image and the arguments are what you expect.
  5. Run the container image locally with the same command and environment variables to reproduce the crash.
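The first steps can be collected into one helper. This is a sketch, not a definitive tool. It assumes the crashing container is the first one in the pod and that the pod is in your current namespace. The function names `explain_exit_code` and `diagnose_pod` are hypothetical, and the exit-code meanings follow the common shell convention (codes above 128 mean "killed by signal code minus 128").

```shell
#!/bin/sh
# Map a container exit code to a likely meaning (shell conventions).
explain_exit_code() {
  code="$1"
  case "$code" in
    "")  echo "no previous termination recorded" ;;
    0)   echo "exited cleanly; the main process finished and was restarted" ;;
    1)   echo "application error; read the --previous logs" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found; check command and args" ;;
    137) echo "SIGKILL; check for OOMKilled or a failing liveness probe" ;;
    143) echo "SIGTERM; the container was asked to stop" ;;
    *)
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-specific exit code"
      fi
      ;;
  esac
}

# Print the previous logs, exit code, and reason for one pod.
# Assumption: the container of interest is containerStatuses[0].
diagnose_pod() {
  pod="$1"
  kubectl logs "$pod" --previous --tail=50
  code=$(kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
  reason=$(kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}')
  echo "last exit: code=${code:-none} reason=${reason:-none}"
  explain_exit_code "$code"
}
```

Run it as `diagnose_pod <pod-name>`. A reason of OOMKilled alongside code 137 points at the memory limit rather than the application logic.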

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • Pod logs from the previous (crashed) container instance
  • Pod description for Events, Last State, exit code, and reason
  • Container command and args in the Deployment or Pod spec
  • Resource limits for memory and CPU
  • Liveness and startup probe configuration
  • ConfigMap and Secret mounts: are the files present at the expected paths?
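Most of these places can be read straight from the API with kubectl. The sketch below prints them for one pod; `inspect_pod` is a hypothetical helper name, and it assumes the container of interest is the first one in the pod spec.

```shell
#!/bin/sh
# Print events, command, resources, probes, and mounts for one pod.
# Assumption: the container of interest is .spec.containers[0].
inspect_pod() {
  pod="$1"

  echo "== events =="
  kubectl get events --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp

  echo "== command and args =="
  kubectl get pod "$pod" \
    -o jsonpath='{.spec.containers[0].command}{" "}{.spec.containers[0].args}{"\n"}'

  echo "== resources =="
  kubectl get pod "$pod" \
    -o jsonpath='{.spec.containers[0].resources}{"\n"}'

  echo "== liveness and startup probes =="
  kubectl get pod "$pod" \
    -o jsonpath='{.spec.containers[0].livenessProbe}{"\n"}{.spec.containers[0].startupProbe}{"\n"}'

  echo "== volume mounts =="
  kubectl get pod "$pod" \
    -o jsonpath='{range .spec.containers[0].volumeMounts[*]}{.name}{" -> "}{.mountPath}{"\n"}{end}'
}
```

An empty line under a heading means the field is not set, which is itself a finding: no resources block means no limits, and no startup probe means liveness checks begin as soon as the container starts.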

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Application crashes on startup due to a runtime error or unhandled exception
  • Missing environment variable or config file: the app exits because it cannot configure itself
  • OOMKilled: the container exceeds its memory limit
  • Liveness probe fails because the app is not ready fast enough
  • File permission error: the container runs as a non-root user and cannot access a file
  • Dependency such as a database or API is not reachable at startup
  • Wrong entrypoint or command: the specified binary does not exist in the image

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Fix the application error shown in the pod logs
  • If OOMKilled, increase memory limits or reduce the application memory footprint
  • Increase initialDelaySeconds on liveness probes to give the app time to start
  • Add a startup probe that allows a longer startup period before liveness probes begin
  • Add a preflight check in the app that validates config and dependencies before starting
  • Test the container image locally to reproduce and fix the crash
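As one illustration of the preflight pattern, here is a minimal sketch of a check an entrypoint script could run before handing over to the app. `REQUIRED_VARS` and `REQUIRED_FILES` are hypothetical, space-separated lists you would define yourself. The sketch covers only environment variables and readable files, not dependency reachability.

```shell
#!/bin/sh
# Fail fast with a clear message instead of crashing later with a vague one.
# Assumption: REQUIRED_VARS and REQUIRED_FILES are space-separated lists
# that you set for your own application.
preflight() {
  status=0
  for var in $REQUIRED_VARS; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "preflight: missing environment variable $var" >&2
      status=1
    fi
  done
  for file in $REQUIRED_FILES; do
    if [ ! -r "$file" ]; then
      echo "preflight: cannot read $file" >&2
      status=1
    fi
  done
  return "$status"
}

# In an entrypoint script you might end with:
#   preflight || exit 1
#   exec "$@"
```

The point of the design is that the failure message lands in `kubectl logs <pod-name> --previous` and names the missing piece, so the next crash loop takes one look to diagnose rather than a debugging session.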

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • After fixing, deploy and watch the pod reach Running without restarting.
  • Check pod logs for successful startup messages.
  • Check that the restart count stays at 0.
  • Port-forward to the pod and verify the application responds correctly.
  • Run a load test against the pod to ensure it handles traffic without crashing.
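The rollout and restart-count checks can be scripted so the result is a pass or fail rather than an impression. This is a sketch under assumptions: `verify_pod` is a hypothetical helper, the 120-second timeout is arbitrary, and the container of interest is the first one in the pod.

```shell
#!/bin/sh
# Wait for the rollout, then confirm the pod has not restarted.
# Assumptions: 120s is long enough for your rollout; containerStatuses[0]
# is the container you care about.
verify_pod() {
  deploy="$1"
  pod="$2"
  kubectl rollout status "deployment/$deploy" --timeout=120s || return 1
  restarts=$(kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].restartCount}')
  if [ "${restarts:-1}" -ne 0 ]; then
    echo "FAIL: $pod has restarted ${restarts:-?} time(s)"
    return 1
  fi
  echo "OK: $pod is running with 0 restarts"
}
```

Run it a few minutes after the deploy as well, since a slow memory leak or a late liveness failure will not show up in the first seconds. For the port-forward check, `kubectl port-forward pod/<pod-name> 8080:8080` followed by a request to localhost works, with the port numbers replaced by whatever your app listens on.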

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Not checking --previous logs: the current container has no error because it just started
  • Setting memory limits without understanding the application's actual memory usage
  • Not configuring liveness probe initialDelaySeconds: the probe starts before the app is ready
  • Assuming the crash is a Kubernetes problem when it is an application error
  • Not running the container locally to reproduce the crash
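For the probe timing mistake, a startup probe is often a cleaner fix than a large initialDelaySeconds, because liveness checks are held off until the startup probe succeeds. The sketch below writes a strategic-merge patch and reports the startup budget, which is failureThreshold multiplied by periodSeconds. The container name `app`, the path `/healthz`, and port `8080` are hypothetical placeholders, as is the helper name `write_startup_probe_patch`.

```shell
#!/bin/sh
# Emit a patch that adds a startup probe to one container.
# Hypothetical values: container "app", path /healthz, port 8080.
write_startup_probe_patch() {
  period="$1"    # seconds between probe attempts
  failures="$2"  # failed attempts tolerated before the container is restarted
  cat <<EOF
spec:
  template:
    spec:
      containers:
      - name: app
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: $period
          failureThreshold: $failures
EOF
  # The app gets period * failures seconds to start before it is restarted.
  echo "# startup budget: $((period * failures))s" >&2
}

# Usage (not run here):
#   write_startup_probe_patch 5 30 > startup-probe.yaml
#   kubectl patch deployment <deployment-name> --patch-file startup-probe.yaml
```

Pick the budget from measured startup times rather than a guess. The same applies to memory limits: where metrics-server is installed, `kubectl top pod <pod-name> --containers` shows actual usage to size the limit against.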