LEARN · DEBUGGING GUIDE

Queue consumer not processing messages: how to debug it

You publish messages to a queue. The queue depth grows. The consumer is running. But no messages are being processed. Something is sitting between the messages and your handler.

Intermediate · Cache/queue/distributed bugs

What this usually means

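The consumer might be connected to the wrong queue, might be stuck on a poison message that crashes it before acknowledgement, or might have used up its prefetch limit on messages it never acknowledges. Brokers track progress differently. RabbitMQ and SQS hold a delivered message as unacked or in flight until the consumer acknowledges or deletes it; if the consumer crashes first, the message is requeued (RabbitMQ) or becomes visible again once the visibility timeout expires (SQS). Kafka does not track individual messages: a consumer that crashes before committing its offset re-reads the same records after it restarts. In every case, a message that crashes the consumer on each attempt is redelivered again and again. With a dead-letter queue and a delivery limit configured, it eventually lands in the DLQ; without them, it loops indefinitely. Either way, the consumer appears idle while it is actually crash-looping.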
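
The poison-message loop is easier to see in a toy model. The sketch below is not a real broker client: `run_consumer`, `max_deliveries`, and `max_steps` are hypothetical names, and the in-memory queue only imitates "requeue on failure" behaviour. It compares a queue with no delivery limit against one that dead-letters after three attempts.

```python
from collections import deque


def run_consumer(messages, handler, max_deliveries=None, max_steps=20):
    """Toy broker loop: deliver the head message, requeue it at the head on failure."""
    queue = deque(messages)
    deliveries = {}                      # message -> delivery attempts so far
    processed, dead_lettered = [], []
    steps = 0
    while queue and steps < max_steps:   # max_steps stops the endless loop in this demo
        steps += 1
        msg = queue.popleft()
        deliveries[msg] = deliveries.get(msg, 0) + 1
        try:
            handler(msg)
            processed.append(msg)        # ack: the message leaves the queue
        except Exception:
            if max_deliveries is not None and deliveries[msg] >= max_deliveries:
                dead_lettered.append(msg)    # give up and route to the DLQ
            else:
                queue.appendleft(msg)        # requeue: the same message comes back


    return processed, dead_lettered, list(queue)


def handler(msg):
    if msg == "poison":
        raise ValueError("cannot parse message")


msgs = ["ok-1", "poison", "ok-2", "ok-3"]
no_limit = run_consumer(msgs, handler)
with_limit = run_consumer(msgs, handler, max_deliveries=3)
print(no_limit)
print(with_limit)
```

Without a limit, only `ok-1` is processed and the other three messages are still queued when the demo stops. With a limit of three, the poison message is dead-lettered and everything behind it gets through.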

( 01 ) Fast diagnosis

The first ten minutes: establish facts before touching code.

  • 1. Check the queue dashboard (RabbitMQ Management, AWS SQS console, Kafka UI). What state are the messages in? Ready? Unacked? In flight?
  • 2. Check if the consumer process is actually running. `ps aux | grep consumer` or check your process manager / Kubernetes pod status.
  • 3. Check the consumer logs for the last processed message. If the logs stop right after one, the consumer may be stuck on the message that follows it.
  • 4. Check if the consumer is connected to the right queue, exchange, or topic. A misconfigured routing key or topic subscription means messages go elsewhere.
  • 5. Check for a dead-letter queue (DLQ). Are messages being routed there after repeated failures?
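
The checks above can be folded into a rough triage helper. This is a sketch built on heuristics, not a broker API: `QueueStats`, its field names, and the wording of each verdict are assumptions, and the numbers are meant to be copied by hand from whatever dashboard you use.

```python
from dataclasses import dataclass


@dataclass
class QueueStats:
    """Numbers read off the broker dashboard (field names are hypothetical)."""
    ready: int          # waiting in the queue, not yet delivered
    unacked: int        # delivered to a consumer, not yet acknowledged
    consumers: int      # consumers attached to this queue
    dlq_depth: int = 0  # messages sitting in the dead-letter queue


def triage(stats: QueueStats) -> str:
    """Map dashboard numbers to the first thing worth checking."""
    if stats.consumers == 0:
        return "no consumer attached: check process status and queue name"
    if stats.unacked > 0:
        return "delivered but never acked: look for a stuck or crashing handler"
    if stats.ready > 0:
        return "consumer attached but idle: check prefetch, connection state, crash loops"
    if stats.dlq_depth > 0:
        return "messages are failing into the DLQ: inspect a dead-lettered payload"
    return "queue is empty: check whether messages are routed to a different queue"


print(triage(QueueStats(ready=1200, unacked=10, consumers=1)))
```

The order of the checks matters: a queue with no consumers explains everything else, so it is ruled out first.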

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • Message broker dashboard: queue depth, message states, consumer count, DLQ depth
  • Consumer connection settings: host, port, queue name, routing key, consumer group
  • Consumer logs: last processed message timestamp, any error patterns
  • Consumer code: prefetch count, acknowledgement mode (auto vs manual), error handling
  • Message payload: is the consumer failing to parse the message body?
  • DLQ: are messages accumulating there?
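
A wrong queue name is often a one-character typo, so it can help to diff the settings mechanically instead of by eye. A minimal sketch, assuming you can dump the publisher and consumer settings into plain dicts; the key names and values here are made up for illustration.

```python
def config_mismatches(publisher: dict, consumer: dict,
                      keys=("host", "port", "queue", "routing_key")) -> dict:
    """Return {key: (publisher_value, consumer_value)} for every key that differs."""
    return {
        key: (publisher.get(key), consumer.get(key))
        for key in keys
        if publisher.get(key) != consumer.get(key)
    }


# Hypothetical settings: the consumer's queue name is missing an "s".
publisher = {"host": "mq.internal", "port": 5672,
             "queue": "orders", "routing_key": "orders.created"}
consumer = {"host": "mq.internal", "port": 5672,
            "queue": "order", "routing_key": "orders.created"}

print(config_mismatches(publisher, consumer))
```

Here the only mismatch reported is the `queue` key, with `orders` on the publisher side and `order` on the consumer side.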

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Consumer is connected to a different queue or topic than the one receiving messages
  • Consumer crashes on a poison message and restarts, creating a crash loop
  • Consumer has no error handling, so an uncaught exception kills the process
  • Prefetch count is very low (for example 1) and the messages already delivered are never acknowledged, so the broker sends nothing more. In RabbitMQ a prefetch count of 0 means no limit, not zero messages
  • Consumer is stuck waiting for an external resource (database, API) that is unavailable
  • For Kafka: a consumer group rebalance is in progress, or keeps repeating. With eager rebalancing, no consumer is assigned partitions until the rebalance completes
  • For SQS: visibility timeout is too long, so a message stays hidden for the whole timeout after a failed processing attempt before it can be retried
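
The SQS case is worth a quick calculation. As a rough estimate, assuming every attempt fails without deleting the message or shortening its visibility, a message stays hidden for one full visibility timeout per receive, so it reaches the DLQ after about `visibility timeout × maxReceiveCount`. The function name below is hypothetical; the 30-second default and 12-hour maximum visibility timeout are SQS's documented values.

```python
def seconds_until_dead_lettered(visibility_timeout_s: int, max_receive_count: int) -> int:
    """Approximate worst-case delay before a always-failing message reaches the DLQ."""
    return visibility_timeout_s * max_receive_count


default_case = seconds_until_dead_lettered(30, 5)            # SQS default timeout
extreme_case = seconds_until_dead_lettered(12 * 60 * 60, 5)  # SQS maximum timeout

print(default_case, "seconds")
print(extreme_case / 3600, "hours")
```

With the default timeout a failing message clears in a few minutes. At the 12-hour maximum, the same five attempts take about 60 hours, during which the message looks like it has vanished.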

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Add a health check endpoint to the consumer that reports last-processed timestamp and queue lag
  • Wrap message processing in a try/catch, and route messages that fail repeatedly to a dead-letter queue
  • Set a reasonable prefetch count (e.g. 10-50) so the consumer pulls multiple messages and one slow message does not stall delivery
  • Add structured logging: log message ID on receive, on processing start, on success, and on failure
  • For Kafka, increase `max.poll.interval.ms` if processing a polled batch takes longer than the default 5 minutes, or lower `max.poll.records` so each batch is smaller
  • For SQS, set the visibility timeout to exceed the maximum expected processing time
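
The try/catch, dead-letter, and logging patterns above combine naturally into one wrapper. The sketch below is broker-agnostic and makes several assumptions: `process_with_dlq`, `MAX_ATTEMPTS`, and the returned strings are hypothetical, the payload is assumed to be JSON, and the `dead_letters` list stands in for a real DLQ. In a real consumer you would map the return value to your client's ack, requeue, or reject call.

```python
import json
import logging

logger = logging.getLogger("consumer")

MAX_ATTEMPTS = 3  # assumed retry budget before a message is dead-lettered


def process_with_dlq(message_id, body, attempt, handler, dead_letters):
    """Run the handler and return "ack", "retry", or "dead-letter".

    Every log line carries the message ID so one message can be traced end to end.
    """
    logger.info("received id=%s attempt=%d", message_id, attempt)
    try:
        handler(json.loads(body))        # a parse failure is handled like any other error
    except Exception:
        logger.exception("failed id=%s attempt=%d", message_id, attempt)
        if attempt >= MAX_ATTEMPTS:
            dead_letters.append((message_id, body))  # stand-in for publishing to a DLQ
            return "dead-letter"
        return "retry"
    logger.info("processed id=%s", message_id)
    return "ack"


dead = []
print(process_with_dlq("m1", '{"order": 42}', 1, lambda payload: None, dead))
print(process_with_dlq("m2", "not json", 1, lambda payload: None, dead))
print(process_with_dlq("m2", "not json", 3, lambda payload: None, dead))
```

Catching the exception inside the wrapper is the point: the process survives the bad message, and the decision to retry or dead-letter is made explicitly instead of by a crash.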

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • Publish a test message and observe it flowing through to the consumer logs and being acknowledged.
  • Check the queue dashboard: the message should move from ready to unacked to gone.
  • Publish a message that the consumer would fail to process. Confirm it lands in the DLQ rather than staying stuck in the main queue.
  • Run the consumer locally against a development queue and verify processing.
  • Monitor queue depth over time: it should stabilise or decrease, not grow indefinitely.
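
The last check can be automated with a few depth samples. A minimal sketch, assuming you already collect queue depth at regular intervals; `depth_trend` and `tolerance` are hypothetical names, and comparing only the first and last sample is a deliberate simplification.

```python
def depth_trend(samples, tolerance=0):
    """Classify queue-depth samples (oldest first) as growing, draining, or stable."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    change = samples[-1] - samples[0]
    if change > tolerance:
        return "growing"
    if change < -tolerance:
        return "draining"
    return "stable"


print(depth_trend([100, 140, 190]))              # backlog keeps rising
print(depth_trend([190, 120, 40]))               # consumer is catching up
print(depth_trend([50, 52, 49], tolerance=5))    # noise within tolerance
```

A "growing" result after the fix means the consumer is still slower than the publisher, even if individual messages now get through.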

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Not setting up a dead-letter queue from the start
  • Using auto-ack mode without understanding its implications (message is lost on crash)
  • Not logging message IDs, which makes it impossible to trace a specific message through the system
  • Assuming the consumer is fine because the process is running
  • Deploying a consumer without monitoring queue depth and consumer lag