LEARN · DEBUGGING GUIDE

Observability missing logs: how to debug gaps in logging and monitoring

Users report an error. You check the logs. Nothing. The error happened but it was never recorded. Your observability pipeline has a gap.

Intermediate · Observability/performance debugging

What this usually means

Logs go missing for several reasons: the log level threshold is set too high (only ERROR is logged, but the useful context is at INFO), the log pipeline is overloaded and drops messages (full buffer, network congestion), logs are sampled and the critical event was not in the sample, or the logging library has a bug that swallows exceptions so the failed write leaves no trace. Structured logging can also fail silently when a record is malformed and the destination rejects it.

( 01 ) Fast diagnosis

The first ten minutes: establish facts before touching code.

  • 1. Check the log level configuration in production. Is it set to ERROR or WARN? Critical context is often at INFO.
  • 2. Check whether the logging library or SDK applies a sampling rate. Some setups keep only a small fraction of events (for example, 1%), and the default is not always "keep everything".
  • 3. Check the log pipeline health. Is the log agent (Fluentd, Logstash, Vector) running? Is it dropping logs?
  • 4. Check whether logs are buffered and the buffer has not been flushed. A crash before the flush loses whatever was buffered.
  • 5. Check whether structured log fields are being dropped because they do not match an index mapping or schema.
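
The first check can be scripted. The following is a minimal sketch, assuming a Python service that uses the standard `logging` module; the helper names `audit_log_levels` and `would_emit` and the logger name `payments` are hypothetical and only for illustration. It reports the effective level of each logger and of each attached handler, since either one can filter out a record.

```python
import logging


def audit_log_levels(names):
    """Report each logger's effective level and each attached handler's level."""
    report = {}
    for name in names:
        logger = logging.getLogger(name)
        report[name] = {
            # Effective level accounts for levels inherited from parent loggers.
            "effective_level": logging.getLevelName(logger.getEffectiveLevel()),
            # A handler has its own threshold, applied after the logger's.
            "handlers": [
                (type(h).__name__, logging.getLevelName(h.level))
                for h in logger.handlers
            ],
            "propagate": logger.propagate,
        }
    return report


def would_emit(name, level):
    """Return True if a record at `level` passes the logger's threshold."""
    return logging.getLogger(name).isEnabledFor(level)


if __name__ == "__main__":
    # "payments" is a placeholder logger name.
    logging.getLogger("payments").setLevel(logging.ERROR)
    print(audit_log_levels(["payments"]))
    print(would_emit("payments", logging.INFO))
```

Running something like this inside the production process (for example from a debug endpoint or a REPL attached to the service) shows the configuration the process is actually using, which may differ from what the config file suggests.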

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • Log level configuration: environment variable, config file, or code constant
  • Logging library configuration: sampling rate, buffering, transport settings
  • Log pipeline: agent status, network connectivity, destination health
  • Log destination: storage quota, retention policy, index mapping
  • Application code: try/catch blocks that log errors, structured log calls
  • Platform limits: some platforms throttle or truncate log output

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Log level is set to ERROR but the useful diagnostic information is at INFO or DEBUG
  • Log sampling is enabled and the error was not sampled
  • Log buffer is full and new logs are dropped
  • Log pipeline is down: agent crashed, network issue, or destination unreachable
  • Structured log fields exceed the destination's index mapping limits and are dropped
  • Logs are written asynchronously and a process crash loses the in-flight buffer
  • Platform log throttling drops logs under high volume
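
To make the "buffer is full" cause concrete, here is a small sketch in Python using the standard `logging.handlers.QueueHandler` with a bounded queue and no consumer. The class `CountingQueueHandler` and the function `demo_drops` are hypothetical names for this illustration. The handler counts every record it cannot enqueue, which is the kind of counter worth exporting as a metric.

```python
import logging
import logging.handlers
import queue


class CountingQueueHandler(logging.handlers.QueueHandler):
    """QueueHandler that counts records dropped because the queue is full."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def enqueue(self, record):
        try:
            self.queue.put_nowait(record)
        except queue.Full:
            # The record is lost here; count it so the loss is visible.
            self.dropped += 1


def demo_drops(capacity=3, messages=10):
    """Log `messages` records into a queue of size `capacity` with no consumer."""
    buffer = queue.Queue(maxsize=capacity)
    handler = CountingQueueHandler(buffer)
    logger = logging.getLogger("drop_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for i in range(messages):
            logger.info("event %d", i)
    finally:
        logger.removeHandler(handler)
    # (records still buffered, records dropped)
    return buffer.qsize(), handler.dropped


if __name__ == "__main__":
    print(demo_drops())
```

With the default arguments, three records fit in the buffer and the remaining seven are dropped. The application code that called `logger.info` sees no error either way, which is why drop counters matter.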

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Set the log level to INFO for application logs in production and to DEBUG in development
  • Disable sampling for error-level logs: every error must be captured
  • Add health checks for the log pipeline: monitor log throughput and error rates
  • Use a log buffer that persists to disk so logs survive process restarts
  • Add a fallback signal that fires when the logging system itself fails, for example a heartbeat log whose absence raises an alert
  • Centralise log configuration so it cannot be accidentally changed per service
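
As one possible shape for "sample everything except errors", the sketch below uses a standard `logging.Filter` in Python. The class name `SamplingFilter` and its parameters are assumptions made for this example, not part of any particular library. Records at or above `always_keep` always pass; lower-severity records pass with probability `rate`.

```python
import logging
import random


class SamplingFilter(logging.Filter):
    """Keep a fraction of low-severity records and every record at or above `always_keep`."""

    def __init__(self, rate, always_keep=logging.ERROR, rng=None):
        super().__init__()
        self.rate = rate                # fraction of low-severity records to keep
        self.always_keep = always_keep  # severity that bypasses sampling
        self.rng = rng or random.Random()

    def filter(self, record):
        if record.levelno >= self.always_keep:
            return True  # errors are never sampled out
        return self.rng.random() < self.rate


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(SamplingFilter(rate=0.01))
    logger = logging.getLogger("sampled")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.info("kept roughly 1% of the time")
    logger.error("always kept")
```

Attaching the filter to the handler rather than the logger keeps the sampling decision close to the transport, so a second, unsampled handler (for example a local file) can still receive everything.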

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • Trigger a known error in staging. Verify the log appears in the log destination.
  • Check log throughput: the number of logs in the destination should match what the application emitted over the same window.
  • Simulate a log pipeline failure and verify the application handles it gracefully.
  • Run a load test and verify log volume scales with traffic without drops.
  • Set up alerting that fires when log throughput drops below a threshold.
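
The first verification step can be automated as an end-to-end canary. This is a rough sketch in Python, under the assumption that the log destination can be queried somehow: `search` is a hypothetical callable supplied by the caller that returns True once the destination contains the marker, and `verify_log_delivery` is an illustrative name.

```python
import logging
import time
import uuid


def verify_log_delivery(logger, search, timeout=30.0, interval=1.0):
    """Emit a uniquely tagged error and poll the destination until it shows up.

    `search` is a hypothetical callable: it takes the marker string and
    returns True once the destination contains a matching log line.
    """
    marker = f"log-canary-{uuid.uuid4()}"
    logger.error("synthetic error for delivery check %s", marker)
    deadline = time.monotonic() + timeout
    while True:
        if search(marker):
            return True   # the record made it through the whole pipeline
        if time.monotonic() >= deadline:
            return False  # never arrived: a gap somewhere between app and destination
        time.sleep(interval)
```

Run after each deployment, a check like this catches pipeline breakage before a real incident does. The unique marker avoids false positives from older log lines.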

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Setting the production log level to ERROR and losing all context for debugging
  • Enabling log sampling without excluding errors from sampling
  • Not monitoring the log pipeline health
  • Using synchronous logging in performance-critical paths
  • Not testing that logs actually appear in the destination after deployment
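
Moving logging off the hot path while still flushing on shutdown could look like the sketch below, which uses Python's standard `QueueHandler` and `QueueListener`. The function name `install_async_logging` is an assumption for this example. Note the limitation: the `atexit` hook runs on a normal interpreter exit, not on a hard kill, so this narrows the window for losing in-flight records but does not close it.

```python
import atexit
import logging
import logging.handlers
import queue


def install_async_logging(logger, destination_handler):
    """Route `logger` through an in-memory queue drained by a background thread.

    Returns a function that drains the queue and stops the thread.
    """
    log_queue = queue.Queue(-1)  # unbounded, so the calling thread never blocks
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(
        log_queue, destination_handler, respect_handler_level=True
    )
    listener.start()

    stopped = False

    def flush_and_stop():
        # Guarded so a manual call followed by the atexit call is harmless.
        nonlocal stopped
        if not stopped:
            stopped = True
            listener.stop()  # processes queued records, then joins the thread

    atexit.register(flush_and_stop)
    return flush_and_stop
```

An unbounded queue trades memory for completeness. If memory is the tighter constraint, a bounded queue with a drop counter is the alternative, at the cost of losing records under load.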