Worker Process Memory Leak — Debugging Guide | Buglyst Learn

What this usually means

Memory leaks in long-running processes happen when memory is allocated but never released. In Node.js, the garbage collector frees memory that is no longer referenced. A leak occurs when references to objects are kept alive unintentionally: a global array that grows without bound, an event listener that is never removed, a closure that holds a reference to a large object, or a stream that is never closed. Each job processed adds a little more to the memory footprint until the process exhausts available memory.

( 01 ) Fast diagnosis

The first ten minutes — establish facts before touching code.

1. Check the process memory over time. Log `process.memoryUsage()` periodically or use the platform's memory graph.
2. Look at the heap. If `heapUsed` grows without bound, the leak is in JavaScript objects. If `external` (or `arrayBuffers`) grows instead, the leak is in buffers, streams, or native addons.
3. Take a heap snapshot early and another after many jobs. Compare them in Chrome DevTools to see which objects are accumulating.
4. Check for global state that grows with job count: arrays that are pushed to but never cleared, Maps that accumulate keys.
5. Check for event listeners. If jobs add listeners to a shared EventEmitter without removing them, the emitter keeps every listener, and everything it closes over, reachable.
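
Steps 1 and 2 can be covered by a small periodic sampler. The following is a minimal sketch, not a definitive implementation: `sampleMemory`, `startMemoryLog`, and the 30-second default interval are illustrative names and values, not part of any library.

```javascript
// Convert bytes to mebibytes, rounded to one decimal place.
const toMiB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Take one reading of the fields that separate a heap leak from an external one.
function sampleMemory() {
  const { rss, heapUsed, external, arrayBuffers } = process.memoryUsage();
  return {
    rssMiB: toMiB(rss),                   // total resident memory of the process
    heapUsedMiB: toMiB(heapUsed),         // live JavaScript objects
    externalMiB: toMiB(external),         // memory held by C++ objects bound to JS
    arrayBuffersMiB: toMiB(arrayBuffers), // Buffers and ArrayBuffers
  };
}

// Log a sample every intervalMs (assumed default: 30 s). Returns a stop function.
function startMemoryLog(intervalMs = 30000, log = console.log) {
  const timer = setInterval(() => {
    log(JSON.stringify({ ts: new Date().toISOString(), ...sampleMemory() }));
  }, intervalMs);
  timer.unref(); // the logger alone should not keep the process alive
  return () => clearInterval(timer);
}
```

Calling `startMemoryLog()` once at worker start-up is usually enough. A steadily rising `heapUsedMiB` points at JavaScript objects, while flat `heapUsedMiB` with rising `externalMiB` or `arrayBuffersMiB` points at buffers or native code.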

( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

- Process memory metrics: `process.memoryUsage()`, platform memory graphs
- Heap snapshots: take with `--inspect` and Chrome DevTools, the built-in `v8.writeHeapSnapshot()`, or the `heapdump` module
- Global variables and module-level state: arrays, Maps, Sets that grow without bound
- Event listeners: `emitter.listenerCount(event)` to check for accumulation
- Streams and buffers: unclosed file handles, network sockets, database connections
- Third-party libraries: some have known memory leaks in certain versions
- Worker process code: the main job processing loop and per-job cleanup
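
For the heap snapshot item, a snapshot can also be written from inside the process with Node's built-in `v8` module. This is a sketch under assumptions: the `captureSnapshot` helper, the file naming scheme, and the choice of the OS temp directory are all hypothetical.

```javascript
const v8 = require('node:v8');
const os = require('node:os');
const path = require('node:path');

// Write a heap snapshot to the temp directory and return its path.
// The call is synchronous and blocks the event loop while the file is written.
function captureSnapshot(label) {
  const file = path.join(
    os.tmpdir(),
    `worker-${process.pid}-${label}.heapsnapshot`
  );
  return v8.writeHeapSnapshot(file);
}
```

One way to use it is to call `captureSnapshot('early')` after a handful of jobs and `captureSnapshot('late')` after many more, then load both files in the Memory tab of Chrome DevTools and use the comparison view to see which constructors gained instances.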

( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

- Global array or Map accumulates data from every job without ever being cleared
- Event listener added per job but never removed, so the EventEmitter holds a reference to the job's scope
- Database connection or HTTP agent creates a new connection pool per job without closing old ones
- Stream (file, network) is opened but not destroyed after the job completes
- Closure captures a large object that outlives the job
- Timer (setInterval, setTimeout) is set per job but never cleared
- Cache or memoisation layer grows without eviction policy
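
The listener and closure causes often appear together. The sketch below reproduces that pattern on purpose so it can be recognised in real code; `bus`, `leakyJob`, and the `shutdown` event name are hypothetical.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical emitter shared by all jobs and living as long as the process.
const bus = new EventEmitter();

// Deliberately leaky: each job registers a listener and never removes it.
function leakyJob(id) {
  const payload = Buffer.alloc(1024 * 1024); // 1 MiB captured by the closure
  bus.on('shutdown', () => {
    console.log(`job ${id} still holds ${payload.length} bytes`);
  });
}

for (let id = 0; id < 5; id++) leakyJob(id);

// One listener per job is still registered, so every payload stays reachable.
console.log(bus.listenerCount('shutdown')); // 5
```

By default Node.js also prints a `MaxListenersExceededWarning` once more than 10 listeners are registered for one event on the same emitter, which is often the first visible symptom of this cause.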

( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

- Add explicit cleanup per job: clear timers, close streams, remove listeners in a `finally` block
- Use `WeakMap` (object keys only) or `WeakRef` for caches that should not prevent garbage collection
- Set a max size on any in-memory cache with LRU eviction
- Restart worker processes after N jobs or above a memory threshold as a safety net (e.g. `pm2` with `--max-memory-restart`, which triggers on memory use rather than job count)
- Add memory monitoring and alerting: log `process.memoryUsage()` periodically and alert on upward trend
- Profile in production with `--inspect` (keep the inspector bound to localhost) and take heap snapshots periodically to track object count growth; each snapshot pauses the process while it is written
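
The per-job cleanup pattern can be sketched as a wrapper around the job handler. Everything named here is an assumption for illustration: the `shared` emitter, the `cancel` event, the one-second heartbeat, and the `runJob` signature.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical emitter that outlives individual jobs.
const shared = new EventEmitter();

// Run one job and release everything it registered, even if the handler throws.
async function runJob(job, handler) {
  const onCancel = () => { job.cancelled = true; };
  shared.on('cancel', onCancel);

  // Hypothetical per-job timer, e.g. a heartbeat.
  const heartbeat = setInterval(() => { job.lastBeat = Date.now(); }, 1000);

  try {
    return await handler(job);
  } finally {
    // Runs on success and on failure, so nothing survives the job.
    clearInterval(heartbeat);
    shared.off('cancel', onCancel);
  }
}
```

Keeping the registration and the cleanup in the same function makes it easy to check by eye that every `on` has a matching `off` and every `setInterval` a matching `clearInterval`.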

( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

- Run the worker with 1000 jobs and observe memory usage. It should stabilise, not grow indefinitely.
- Take heap snapshots at job 100 and job 1000 and compare them: no object type should grow proportionally to job count.
- Run the worker under a memory profiler and verify no objects are retained after each job completes.
- Set a memory limit in the test environment and verify the worker does not hit it over a long run.
- Add an automated test that processes N jobs and asserts memory is within expected bounds.
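
The automated test in the last item might be built on a helper like the one below. It is a rough sketch: `measureHeapGrowth` is a hypothetical name, and the measurement is only meaningful when Node.js is started with `--expose-gc` so that `global.gc` exists; without it the result includes uncollected garbage.

```javascript
// Run `job` jobCount times and return the change in heapUsed, in bytes.
// Forces a collection before and after when global.gc is available
// (requires starting Node.js with --expose-gc).
async function measureHeapGrowth(job, jobCount) {
  const collect = () => {
    if (typeof global.gc === 'function') global.gc();
  };

  collect();
  const before = process.memoryUsage().heapUsed;

  for (let i = 0; i < jobCount; i++) {
    await job(i);
  }

  collect();
  return process.memoryUsage().heapUsed - before;
}
```

A test would then await `measureHeapGrowth(processJob, 1000)` and assert that the result stays under a budget chosen for the workload. Both `processJob` and the budget are placeholders here, and the threshold needs some headroom because heap usage fluctuates between runs.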

( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

- Restarting the worker every hour as the only fix: the underlying leak still exists
- Not measuring memory usage in production: you will not know about the leak until it crashes
- Adding items to a global cache without an eviction policy
- Not cleaning up event listeners and timers in async job handlers
- Assuming the garbage collector will handle everything: it cannot free objects that are still referenced
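
The cache mistake is usually avoided by putting a hard bound on the number of entries. Below is a minimal LRU sketch that relies on `Map` preserving insertion order; the `LruCache` class and its API are illustrative, not taken from any particular library.

```javascript
// Small LRU cache: the Map's first key is always the least recently used.
class LruCache {
  constructor(maxEntries) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer');
    }
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry.
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
  }

  get size() {
    return this.map.size;
  }
}
```

With `maxEntries` set from a memory budget rather than left unbounded, the cache's footprint stops depending on how many jobs the worker has processed.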
