LEARN · DEBUGGING GUIDE

Works locally but fails in production: how to debug it

Code runs fine on your laptop. Deploy it and it breaks. This is one of the most common production debugging patterns, and the cause is rarely the code itself.

Intermediate · Works locally, fails in production

What this usually means

The app is running in a different environment than your machine. The difference could be anything: missing environment variables, a different Node version, a filesystem permission gap, a missing build artefact, a database that behaves differently, or a network boundary your local setup never crosses. The code itself is usually fine; the environment around it is wrong.

( 01 ) Fast diagnosis

The first ten minutes: establish facts before touching code.

  1. Compare the runtime versions between local and production. Run `node --version` and `npm --version`, and check the language/runtime version in your deploy config or Dockerfile.
  2. List every environment variable the app reads at startup. Print them (excluding secrets) in a preflight log line. Check that production has every one.
  3. Read the first error in the production logs. Do not skip it. Do not guess. Copy the exact stack trace and trace backwards from the throw site.
  4. Diff your local `.env` (or config file) against the production config. One missing key, one wrong value, or one extra quote can cause the crash.
  5. Check whether your local dev uses a different database, cache, or queue than production. A local SQLite vs production Postgres gap is a common source of breakage that no code diff will reveal.
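
Steps 2 and 4 can be partly automated. The sketch below compares the key names in two `KEY=VALUE` config texts and reports which keys production lacks. It is a minimal illustration, not a full dotenv parser: it ignores multiline values and `export` prefixes, and the function names and sample contents are hypothetical.

```typescript
// Minimal KEY=VALUE parser: skips blank lines and `#` comments.
// Assumption: one assignment per line; this is not the full dotenv syntax.
function parseEnvKeys(text: string): string[] {
  const keys: string[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    if (keys.indexOf(key) === -1) keys.push(key);
  }
  return keys;
}

interface EnvKeyDiff {
  missingInProd: string[];
  onlyInProd: string[];
}

// Compares key names only, so no secret values are ever printed.
function diffEnvKeys(localText: string, prodText: string): EnvKeyDiff {
  const local = parseEnvKeys(localText);
  const prod = parseEnvKeys(prodText);
  return {
    missingInProd: local.filter((k) => prod.indexOf(k) === -1).sort(),
    onlyInProd: prod.filter((k) => local.indexOf(k) === -1).sort(),
  };
}

// Hypothetical sample contents, for illustration only.
const localEnvText =
  "DATABASE_URL=postgres://localhost/dev\nREDIS_URL=redis://localhost\n# a comment\nLOG_LEVEL=debug";
const prodEnvText =
  "DATABASE_URL=postgres://db.internal/app\nLOG_LEVEL=info\nSENTRY_DSN=placeholder";

const envDiff = diffEnvKeys(localEnvText, prodEnvText);
console.log(envDiff);
```

A key listed in `missingInProd` is the first thing to check. Wrong values and stray quotes still need a manual look, because this sketch deliberately never reads values.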

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • Runtime version logs (startup lines, Dockerfile `FROM`, `engines` in package.json)
  • Environment variable injection path (`.env`, secrets manager, CI/CD vars, Kubernetes ConfigMap)
  • Production logs: first error line and full stack trace
  • Build output vs runtime output comparison
  • Database connection strings and driver versions
  • Filesystem paths: `/tmp` vs `/var/tmp`, case sensitivity, permission bits
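
Once you have the version strings from the startup logs and from your own machine, classifying the gap is mechanical. The sketch below assumes strings shaped like the output of `node --version`. The function names and sample versions are illustrative, not part of any tool.

```typescript
type Drift = "same" | "minor-or-patch" | "major";

// Parses strings shaped like the output of `node --version`, e.g. "v20.11.1".
function parseVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version.trim());
  if (match === null) {
    throw new Error(`Unrecognised version string: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Classifies how far apart two runtime versions are.
function classifyDrift(localVersion: string, prodVersion: string): Drift {
  const local = parseVersion(localVersion);
  const prod = parseVersion(prodVersion);
  if (local[0] !== prod[0]) return "major";
  if (local[1] !== prod[1] || local[2] !== prod[2]) return "minor-or-patch";
  return "same";
}

// Sample values for illustration only.
console.log(classifyDrift("v20.11.1", "v18.19.0")); // "major"
console.log(classifyDrift("v20.11.1", "v20.9.0")); // "minor-or-patch"
```

A major-version difference is worth ruling out before reading any application code; smaller drift is less likely to be the cause but still worth noting.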

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Missing environment variable that defaults to a local value during dev
  • Production runs a different Node/Go/Python version than local
  • Production database is a different engine (Postgres vs SQLite) or version
  • Build step produces artefacts that are not carried into the runtime container
  • Filesystem path assumptions that work on macOS but fail on Linux
  • Network egress blocked in production (firewall, security group, VPC)
  • Memory or CPU limits in production that do not exist on a dev machine
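
One concrete form of the macOS-versus-Linux path problem is letter case: the default macOS filesystem is case-insensitive, while Linux filesystems are typically case-sensitive, so a path with the wrong case resolves locally and fails after deploy. The sketch below flags such paths. The function name, file listing, and import paths are hypothetical, and a real check would read them from disk and from your source files.

```typescript
// Returns referenced paths that match a real file only when case is ignored.
// Such paths resolve on a case-insensitive filesystem but fail on a
// case-sensitive one.
function findCaseMismatches(referenced: string[], actualFiles: string[]): string[] {
  const mismatches: string[] = [];
  for (const ref of referenced) {
    const exact = actualFiles.indexOf(ref) !== -1;
    const loose = actualFiles.some((f) => f.toLowerCase() === ref.toLowerCase());
    if (!exact && loose) mismatches.push(ref);
  }
  return mismatches;
}

// Hypothetical file listing and import paths, for illustration only.
const filesOnDisk = ["src/UserService.ts", "src/db.ts"];
const importedPaths = ["src/userService.ts", "src/db.ts", "src/missing.ts"];

const caseMismatches = findCaseMismatches(importedPaths, filesOnDisk);
console.log(caseMismatches); // ["src/userService.ts"]
```

Paths that do not exist under any casing, such as `src/missing.ts` here, are left out on purpose: they fail in both environments and so are not a local-versus-production gap.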

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Add a startup preflight that checks required env vars and logs runtime versions before serving traffic
  • Pin the runtime version in your Dockerfile and in CI, and use the same version for local dev
  • Run the same database engine and version as production in a staging environment that mirrors production
  • Run the production build command locally to catch artefact gaps before deploy
  • Add a health-check endpoint that verifies database, cache, and external dependency connectivity
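
A startup preflight can be very small. The sketch below takes the environment and runtime version as arguments so it can be tested in isolation; in a Node app these would typically be `process.env` and `process.version`. The variable names, values, and log format are assumptions for illustration.

```typescript
interface PreflightResult {
  ok: boolean;
  missing: string[];
  logLine: string;
}

// Checks that every required variable is set and non-empty, and builds
// a single log line recording the runtime version.
function preflight(
  required: string[],
  env: { [key: string]: string | undefined },
  runtimeVersion: string
): PreflightResult {
  const missing = required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
  // Log variable names only, never values, so secrets stay out of the logs.
  const logLine =
    `preflight runtime=${runtimeVersion} required=${required.length} ` +
    `missing=${missing.length === 0 ? "none" : missing.join(",")}`;
  return { ok: missing.length === 0, missing, logLine };
}

// Hypothetical variable names and values, for illustration only.
const preflightResult = preflight(
  ["DATABASE_URL", "REDIS_URL"],
  { DATABASE_URL: "postgres://db.internal/app", REDIS_URL: "" },
  "v20.11.1"
);
console.log(preflightResult.logLine);
// prints: preflight runtime=v20.11.1 required=2 missing=REDIS_URL
```

When `ok` is false, a service would normally log the line and exit before accepting traffic, so the gap shows up at deploy time rather than on the first failing request.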

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • Deploy to a staging environment that matches production. Run the same request that failed.
  • Check that the startup logs show the correct runtime version and that all required env vars are set.
  • Run the health-check endpoint after deploy and confirm it returns 200 with all dependency checks passing.
  • Reproduce the exact production error locally by matching the production environment (Docker, same DB, same config).
  • Set up a smoke test in CI that hits the deployed app and asserts expected behaviour.
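
The assertion half of a smoke test can be kept separate from the HTTP call, which makes it easy to reuse in CI. The sketch below assumes the health-check endpoint returns a JSON body with a per-dependency `checks` map. That response shape, the function name, and the dependency names are hypothetical; adapt them to whatever your endpoint actually returns.

```typescript
// Assumed response shape: HTTP status plus a map of dependency name to state.
interface HealthResponse {
  status: number;
  checks: { [dependency: string]: "ok" | "fail" | undefined };
}

// Returns human-readable failures; an empty list means the smoke test passes.
function smokeTestFailures(response: HealthResponse, expected: string[]): string[] {
  const failures: string[] = [];
  if (response.status !== 200) {
    failures.push(`expected HTTP 200, got ${response.status}`);
  }
  for (const name of expected) {
    const state = response.checks[name];
    if (state === undefined) {
      failures.push(`${name}: check missing from response`);
    } else if (state !== "ok") {
      failures.push(`${name}: ${state}`);
    }
  }
  return failures;
}

// Hypothetical response from a deployed app, for illustration only.
const smokeFailures = smokeTestFailures(
  { status: 503, checks: { database: "ok", cache: "fail" } },
  ["database", "cache", "queue"]
);
console.log(smokeFailures);
```

Treating a missing check as a failure is a deliberate choice here: a dependency that silently drops out of the health report should fail the smoke test rather than pass it.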

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Guessing the cause without reading the production error log first
  • Assuming local dev and production are the same because 'the code is the same'
  • Fixing the symptom (adding a try/catch) instead of fixing the environment gap
  • Deploying a hotfix without testing it in a staging environment first
  • Not adding a preflight check after the fix, which lets the same gap reappear a few months later