What this usually means
CrashLoopBackOff means the container's main process keeps exiting shortly after it starts, and Kubernetes is waiting longer between each restart attempt (the delay grows exponentially, up to a default cap of five minutes). The exit code is usually non-zero, but a process that exits cleanly under restartPolicy: Always ends up in the same loop. The cause is whatever made the process exit: a runtime error, a missing file, a failed dependency check, an OOM kill, or a misconfigured command or entrypoint.
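To make the backoff concrete, here is a minimal sketch of the delay sequence, assuming the documented kubelet defaults (start at 10 seconds, double after each crash, cap at 300 seconds). The function name backoff_delays is made up for illustration, and clusters with non-default kubelet settings may behave differently.

```shell
# Print the first N restart delays in seconds.
# Assumed defaults: 10s initial delay, doubling each time, capped at 300s.
backoff_delays() {
  count=$1
  delay=10
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    out="${out:+$out }$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
  echo "$out"
}

backoff_delays 7   # prints: 10 20 40 80 160 300 300
```

This is why a pod that has been crashing for a while can sit idle for minutes between attempts: once the cap is reached, each retry waits the full five minutes.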
The first ten minutes: establish facts before touching code.
1. Get the logs from the crashed container instance: kubectl logs <pod-name> --previous
2. Check the exit code: kubectl describe pod <pod-name> shows it under Last State. Exit code 1 usually means an application error. 137 means the process received SIGKILL (128 + 9), typically from an OOM kill or a forced stop after a failed liveness probe.
3. Check whether the container has resource limits. A Last State reason of OOMKilled means the container exceeded its memory limit.
4. Check the startup command. kubectl describe pod <pod-name> shows the command and args. Do they match what the image expects?
5. Run the container image locally with the same command and environment variables to reproduce the crash.
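The exit-code step above can be sketched as a small helper. This is an illustrative mapping, not an exhaustive one: explain_exit_code is a hypothetical name, the 126 and 127 meanings follow shell convention, and the helper expects a numeric code. The commented kubectl lines assume the crashing container is the first one in the pod.

```shell
# Map a container exit code to its most common meaning.
# Codes above 128 are 128 plus the number of the signal that ended the process.
explain_exit_code() {
  code=$1
  case "$code" in
    0)   echo "exited cleanly; the process finished instead of staying up" ;;
    1)   echo "general application error; read the --previous logs" ;;
    126) echo "command found but not executable (shell convention)" ;;
    127) echo "command not found (shell convention); check command, args, and entrypoint" ;;
    137) echo "SIGKILL (128+9); often OOMKilled or a forced stop after a failed probe" ;;
    143) echo "SIGTERM (128+15); the container was asked to stop" ;;
    *)
      if [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-specific exit code"
      fi
      ;;
  esac
}

# Against a real cluster (POD is a placeholder for your pod name):
#   kubectl logs "$POD" --previous
#   code=$(kubectl get pod "$POD" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
#   kubectl get pod "$POD" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
#   explain_exit_code "$code"
```

The reason field is worth reading alongside the code: 137 with reason OOMKilled points at memory, while 137 with reason Error points at something else sending SIGKILL.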
The specific files, logs, configs, and dashboards that usually own this bug.
- Pod logs from the previous (crashed) container instance
- Pod description for Events, Last State, exit code, and reason
- Container command and args in the Deployment or Pod spec
- Resource limits for memory and CPU
- Liveness and startup probe configuration
- ConfigMap and Secret mounts: are files present at expected paths?
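Most of the places listed above can be read straight from the pod object. The following is a rough sketch, assuming the crashing container is the first one in the pod (index 0); inspect_pod_spec is a hypothetical helper name, and fields that are not set simply print as empty.

```shell
# Print the spec fields most often involved in a crash loop,
# followed by the events recorded for the pod.
inspect_pod_spec() {
  pod=$1
  for field in command args resources livenessProbe startupProbe volumeMounts; do
    printf '%s: ' "$field"
    kubectl get pod "$pod" -o jsonpath="{.spec.containers[0].$field}"
    echo
  done
  kubectl get events --field-selector "involvedObject.name=$pod" --sort-by=.lastTimestamp
}

# Usage against a real cluster (the pod name is a placeholder):
#   inspect_pod_spec my-app-5d9c7b6f4-abcde
```

For pods with sidecars or init containers, change the index or switch to kubectl describe pod, which lists every container.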
Practical causes, not theory. These are the things you will actually find.
- Application crashes on startup due to a runtime error or unhandled exception
- Missing environment variable or config file: the app exits because it cannot configure itself
- OOMKilled: the container exceeds its memory limit
- Liveness probe fails because the app is not ready fast enough
- File permission error: the container runs as a non-root user and cannot access a file
- Dependency like a database or API not reachable at startup
- Wrong entrypoint or command: the specified binary does not exist in the image
Concrete fix directions. Pick the one that matches your root cause.
- Fix the application error shown in the pod logs
- If OOMKilled, increase memory limits or reduce the application memory footprint
- Increase initialDelaySeconds on liveness probes to give the app time to start
- Add a startup probe that allows a longer startup period before liveness probes begin
- Add a preflight check in the app that validates config and dependencies before starting
- Test the container image locally to reproduce and fix the crash
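One way to approach the preflight check is an entrypoint wrapper that validates configuration before handing over to the real process. This is a minimal sketch under assumptions: it covers only environment variables and readable files (not network dependencies), the function name preflight is invented here, and DATABASE_URL, /etc/app/config.yaml, and /usr/local/bin/app are hypothetical examples.

```shell
# Fail fast with a clear message if required configuration is missing.
# Usage: preflight "VAR1 VAR2" "/path/one /path/two"
preflight() {
  required_vars=$1
  required_files=$2
  status=0
  for name in $required_vars; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "preflight: missing environment variable $name" >&2
      status=1
    fi
  done
  for path in $required_files; do
    if [ ! -r "$path" ]; then
      echo "preflight: cannot read $path" >&2
      status=1
    fi
  done
  return "$status"
}

# In an entrypoint script (all names below are hypothetical):
#   preflight "DATABASE_URL" "/etc/app/config.yaml" || exit 1
#   exec /usr/local/bin/app
```

The pod still crash-loops when the check fails, but the --previous logs then name the missing variable or file directly instead of showing a stack trace from deep inside the application.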
A fix you cannot prove is a guess. Close the loop.
- After fixing, deploy and watch the pod reach Running without restarting.
- Check pod logs for successful startup messages.
- Check that the restart count stays at 0.
- Port-forward to the pod and verify the application responds correctly.
- Run a load test against the pod to ensure it handles traffic without crashing.
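The restart-count check above can be automated so that "it looks fine" becomes an observation over time. The sketch below assumes a single-container pod (index 0); watch_restarts is a hypothetical helper, and the deployment, pod, and port names in the comments are placeholders.

```shell
# Succeed only if the pod's first container reports a restart count of 0
# on every check during the observation window.
# Usage: watch_restarts <pod> <checks> <seconds-between-checks>
watch_restarts() {
  pod=$1
  checks=$2
  interval=$3
  i=0
  while [ "$i" -lt "$checks" ]; do
    restarts=$(kubectl get pod "$pod" -o jsonpath='{.status.containerStatuses[0].restartCount}')
    if [ "$restarts" != "0" ]; then
      echo "pod $pod restart count is '$restarts', expected 0" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "pod $pod stayed at 0 restarts"
}

# Against a real cluster (names and ports are placeholders):
#   kubectl rollout status deployment/my-app --timeout=120s
#   watch_restarts my-app-pod 30 10
#   kubectl port-forward pod/my-app-pod 8080:8080
```

Watching for several minutes matters because a pod in backoff can show Running between crashes; a short glance can land in one of those windows.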
Things that make this bug worse or harder to find.
- Not checking --previous logs: the current container has no error because it just started
- Setting memory limits without understanding the application's actual memory usage
- Not configuring liveness probe initialDelaySeconds: the probe starts before the app is ready
- Assuming the crash is a Kubernetes problem when it is an application error
- Not running the container locally to reproduce the crash