What this usually means
A classic deadlock occurs when two or more threads hold locks that the others need, forming a cycle. In Java, this is typically caused by synchronized blocks or explicit Lock objects where lock acquisition order is inconsistent. The JVM can detect deadlocks at runtime and reports them in thread dumps, but silent deadlocks (where threads are waiting forever on conditions that never become true) are also common. Non-obvious causes include using multiple locks from different layers, forgetting to release locks in finally blocks, or using thread pools with bounded queues that fill up.
The first ten minutes — establish facts before touching code.
1. Run `jstack -l <pid>` on the stuck Java process; look for 'Found one Java-level deadlock' in the output (the deadlock report is printed after the individual thread stacks)
2. If no deadlock is reported, grep for 'BLOCKED' and 'WAITING' to identify threads that are stuck
3. Check the stack traces of BLOCKED threads to see which lock they are waiting for (e.g., `- waiting to lock <0x...>`)
4. Identify the thread holding that lock (look for `- locked <0x...>` with the same address) and check its state
5. If the holder is also BLOCKED or WAITING, you may have a cycle; trace the chain until it returns to a thread you have already seen
6. For silent deadlocks (threads in WAITING state), look for signals like `parking to wait for <0x...>` and check whether the condition they wait on (e.g., a CountDownLatch) is ever counted down
The specific files, logs, configs, and dashboards that usually own this bug.
- Thread dumps captured with `jstack -l <pid>` (the `-l` flag adds the `java.util.concurrent` locks each thread owns)
- Application logs: look for repeated timeouts or 'stuck thread' warnings
- Health check endpoints: if they hang, that's a strong indicator
- JMC (Java Mission Control) or VisualVM for live thread monitoring
- Lock ownership per thread from `jcmd <pid> Thread.print -l` or `jstack -l <pid>`
- Custom thread pool rejection logs (e.g., ThreadPoolExecutor's RejectedExecutionHandler)
Practical causes, not theory. These are the things you will actually find.
- Circular lock dependency: thread A holds lock 1, waits for lock 2; thread B holds lock 2, waits for lock 1
- Nested synchronized blocks where locks are acquired in inconsistent order across code paths
- ReentrantReadWriteLock misuse: writers starve while read locks are held for a long time, and a thread that holds the read lock and then requests the write lock blocks forever (the lock does not support upgrading)
- Thread pool deadlock: a task waits for another task that is queued behind it in the same pool
- Forgotten unlock in exception paths (e.g., Lock.unlock() not in a finally block)
- Database connection pool exhaustion causing threads to wait forever for a connection
Concrete fix directions. Pick the one that matches your root cause.
- Enforce a global lock ordering: always acquire locks in the same order (e.g., a documented hierarchy, or `System.identityHashCode` with a tie-breaker lock for equal hashes)
- Use timed lock acquisition (`tryLock` with timeout) to break deadlocks: on timeout, release everything held, then retry or fail gracefully
- Replace synchronized blocks with higher-level concurrency utilities like `LinkedBlockingQueue` or `CompletableFuture`
- For thread pool deadlocks, run dependent tasks on a separate thread pool; increasing the pool size only helps while the number of waiting tasks stays below it
- Add explicit deadlock detection in code using `ThreadMXBean.findDeadlockedThreads()` and log an alert (`findMonitorDeadlockedThreads()` covers only `synchronized` monitors)
- Ensure all Lock.unlock() calls are in finally blocks; `java.util.concurrent` locks are not AutoCloseable, so try-with-resources needs a small wrapper
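As a rough illustration of the first fix direction, here is a minimal sketch of ordered acquisition for two monitors. The class name `OrderedLocking`, the method `withBothLocks`, and the tie-breaker object are hypothetical names chosen for this example; the ordering key is `System.identityHashCode`, which can collide, so a third lock is taken when the two hashes are equal.

```java
class OrderedLocking {
    // Tie-breaker, taken only in the rare case that both identity hashes are equal.
    private static final Object TIE_LOCK = new Object();

    /** Runs the action while holding both monitors, always acquired in the same global order. */
    static void withBothLocks(Object a, Object b, Runnable action) {
        int ha = System.identityHashCode(a);
        int hb = System.identityHashCode(b);
        if (ha < hb) {
            synchronized (a) {
                synchronized (b) {
                    action.run();
                }
            }
        } else if (ha > hb) {
            synchronized (b) {
                synchronized (a) {
                    action.run();
                }
            }
        } else {
            // Equal hashes: serialize through the tie-breaker so the inner order no longer matters.
            synchronized (TIE_LOCK) {
                synchronized (a) {
                    synchronized (b) {
                        action.run();
                    }
                }
            }
        }
    }
}
```

Callers can pass the two objects in either order, for example `OrderedLocking.withBothLocks(orderService, inventoryService, work)` from one code path and the reverse from another, and the monitors are still taken in one consistent order. If the objects already have a natural unique key (an ID, say), ordering by that key is usually easier to reason about than identity hashes.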
A fix you cannot prove is a guess. Close the loop.
- Reproduce the scenario with a controlled load test and capture thread dumps before and after the fix
- Run `jstack` multiple times to confirm that stuck threads are no longer present
- Verify that CPU usage returns to normal levels under load
- Check health endpoints respond within expected timeouts
- Monitor lock contention in JMC, and confirm with `jcmd <pid> Thread.print` that no threads stay BLOCKED on the same lock across dumps
- Add unit/integration tests that specifically test the locking order with multiple threads
Things that make this bug worse or harder to find.
- Only looking at the first thread dump; deadlocks may be intermittent, so take multiple dumps over time
- Ignoring threads in WAITING state (e.g., parked) that are not part of a classic deadlock but are stuck on conditions
- Killing the JVM without capturing a thread dump first: you lose the evidence
- Assuming that because `jstack` doesn't report a deadlock, there isn't one; silent deadlocks are common
- Using `kill -9` or restarting the application without investigating root cause, leading to recurrence
- Fixing symptoms (e.g., increasing thread pool size) instead of the actual lock ordering problem
Production Deadlock: Payment Service Stops Processing Orders
Timeline
- 14:22 Alert: Payment service health check timeout (30s)
- 14:24 SSH to server, run `top` -> CPU near 0%, 400 threads
- 14:26 Run `jstack -l <pid> > dump1.txt`
- 14:28 Read dump: 'Found one Java-level deadlock' between Thread-45 and Thread-102
- 14:30 Trace locks: Thread-45 holds lock on OrderService instance, waits for InventoryService; Thread-102 holds InventoryService, waits for OrderService
- 14:35 Identify code: two different methods that acquire locks in opposite order
- 14:40 Apply hotfix: change lock order in one method to match the other; restart
- 14:45 Health check passes, orders resume processing
It was a Tuesday afternoon, and the payment service had been running fine for weeks. Suddenly, the monitoring dashboard lit up: health check failures across all instances. I SSH'd into one box and saw CPU was nearly idle, but there were 400 threads, way more than usual. My first instinct was to grab a thread dump. I ran `jstack -l <pid>` and redirected it to a file. The output contained 'Found one Java-level deadlock', exactly what I feared.
I opened the dump and saw the cycle: Thread-45 was blocked trying to lock an InventoryService object, but Thread-102 held that lock. Meanwhile, Thread-102 was blocked trying to lock an OrderService object, which Thread-45 held. I searched the codebase for where these locks were acquired. Two methods—placeOrder and updateInventory—used synchronized blocks but in different orders. placeOrder synchronized on orderService then inventoryService; updateInventory did the opposite.
The fix was straightforward: I changed updateInventory to acquire locks in the same order as placeOrder. I deployed the fix to a canary, verified it passed health checks, then rolled out to all instances. Orders started processing again within minutes. The lesson: we needed a strict lock ordering policy and code reviews to catch such inconsistencies. I also added a scheduled thread dump collection to detect future deadlocks early.
Root cause
Inconsistent lock ordering in synchronized blocks: `placeOrder()` locked OrderService then InventoryService, while `updateInventory()` locked InventoryService then OrderService.
The fix
Changed `updateInventory()` to acquire locks in OrderService -> InventoryService order, ensuring consistent ordering across all code paths.
The lesson
Always enforce a global lock ordering, and use tools like ThreadMXBean to programmatically detect deadlocks in test environments.
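To make the lesson concrete, the following is a hedged sketch of how a test environment might reproduce this kind of opposite-order locking and confirm it with `ThreadMXBean`. It is not the payment service's code: the class `LockOrderRepro`, the two plain lock objects standing in for the services, and the latch that forces the bad interleaving are all assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderRepro {
    /** Starts two daemon threads that lock in opposite order; returns the deadlocked thread IDs. */
    static long[] reproduceAndDetect(long maxWaitMillis) throws InterruptedException {
        Object orderLock = new Object();      // stand-in for the OrderService monitor
        Object inventoryLock = new Object();  // stand-in for the InventoryService monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        worker("placeOrder", orderLock, inventoryLock, bothHoldFirst).start();
        worker("updateInventory", inventoryLock, orderLock, bothHoldFirst).start();

        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = bean.findDeadlockedThreads(); // null while no deadlock exists
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return new long[0];
    }

    private static Thread worker(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // proceed only once the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds this monitor and waits for ours.
                }
            }
        }, name);
        t.setDaemon(true); // the stuck threads must not keep the JVM alive
        return t;
    }
}
```

The latch only makes the interleaving deterministic; in production the same cycle depends on timing, which is why it can stay hidden for weeks. The two daemon threads remain deadlocked until the JVM exits, so a sketch like this belongs in an isolated test process.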
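Run `jstack -l <pid>` to get a thread dump with lock information. Monitor locks (`- locked` / `- waiting to lock`) appear in any dump; the `-l` flag additionally lists the `java.util.concurrent` locks each thread owns, which you need when the cycle involves `ReentrantLock` and friends. Look for a line like 'Found one Java-level deadlock', printed after the individual thread stacks. If present, it will list the threads involved and the locks they hold/wait for.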
If no deadlock is reported, examine threads in BLOCKED state. Example: `"Thread-1" #12 prio=5 os_prio=0 tid=0x00007f... nid=0x... waiting for monitor entry [0x...] java.lang.Thread.State: BLOCKED (on object monitor)` followed by `- waiting to lock <0x...> (a com.example.Service)`. Then find the thread holding that monitor: look for `- locked <0x...>`. If that thread is also BLOCKED, trace the chain. For WAITING threads, look for `parking to wait for <0x...>` and check the associated condition (e.g., `java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject`).
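The manual tracing described above can be approximated in code. The following is a minimal sketch, assuming a saved dump in the usual `jstack` text layout; the class `DumpWaitGraph` and its method are hypothetical names. It only understands monitor lines (`- waiting to lock` and `- locked`), not `parking to wait for`, and a thread sitting in `Object.wait()` can still show a `- locked` line for a monitor it has released, so treat the result as a hint to verify by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DumpWaitGraph {
    private static final Pattern THREAD = Pattern.compile("^\"([^\"]+)\"");
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    /** Maps each blocked thread name to the name of the thread that holds the monitor it wants. */
    static Map<String, String> waitsFor(String dump) {
        Map<String, String> wants = new LinkedHashMap<>();  // thread name -> monitor address
        Map<String, String> holder = new LinkedHashMap<>(); // monitor address -> thread name
        String current = null;
        for (String line : dump.split("\\R")) {
            Matcher m = THREAD.matcher(line);
            if (m.find()) {            // a header line such as "Thread-1" #12 prio=5 ...
                current = m.group(1);
                continue;
            }
            if (current == null) {
                continue;
            }
            m = WAITING.matcher(line);
            if (m.find()) {
                wants.put(current, m.group(1));
                continue;
            }
            m = LOCKED.matcher(line);
            if (m.find()) {
                holder.put(m.group(1), current);
            }
        }
        Map<String, String> edges = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : wants.entrySet()) {
            String owner = holder.get(e.getValue());
            if (owner != null) {
                edges.put(e.getKey(), owner);
            }
        }
        return edges;
    }
}
```

Following the returned edges from any thread until a name repeats is the same chain-tracing you would do by eye; a repeat means a cycle.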
Not all thread stalls are detected by the JVM. Classic deadlocks require a cycle of lock acquisitions; the JVM only reports those. But threads can be stuck in a state where they are waiting for a condition that will never be satisfied, like a CountDownLatch that is never counted down, or a task that is queued in a thread pool waiting for another task that is stuck in the same pool. These are sometimes called 'silent deadlocks' or, for the thread pool case, 'starvation deadlocks'. They are not livelocks: in a livelock the threads keep running and burning CPU without making progress, whereas here they simply wait.
To detect these, look for threads in WAITING or TIMED_WAITING state that are blocked on a specific condition (e.g., `parking to wait for <0x...>`). Check the stack trace to see what they are waiting for. For thread pool deadlocks, the pattern is: a task submits a subtask to the same pool and waits for its result, but the subtask is queued behind the original task, which is blocked waiting. The fix is to run the subtasks on a separate thread pool; increasing the pool size only moves the threshold at which the pool fills up with waiting tasks.
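The thread pool pattern is easy to demonstrate safely with a timed wait. This is a small sketch under stated assumptions: the class `PoolStarvationDemo` and method `runNested` are made up for the example, and a single-thread executor stands in for a saturated pool. With an untimed `get()` the first call would hang forever; the timeout is there only so the demo terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PoolStarvationDemo {
    /** Runs a parent task on `outer` that submits a child to `inner` and waits for it. */
    static String runNested(ExecutorService outer, ExecutorService inner, long waitMillis)
            throws Exception {
        Future<String> parent = outer.submit(() -> {
            Future<String> child = inner.submit(() -> "child done");
            try {
                return child.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                child.cancel(true); // the child never got a worker thread
                return "starved";
            }
        });
        return parent.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService single = Executors.newSingleThreadExecutor();
        ExecutorService separate = Executors.newSingleThreadExecutor();
        try {
            // Same pool: the only worker is busy running the parent, so the child stays queued.
            System.out.println(runNested(single, single, 200));   // prints "starved"
            // Separate pool: the child gets its own worker and completes.
            System.out.println(runNested(single, separate, 200)); // prints "child done"
        } finally {
            single.shutdownNow();
            separate.shutdownNow();
        }
    }
}
```

In a thread dump, the starved case shows the pool's worker threads parked inside `Future.get` while the queue holds the very tasks they are waiting for.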
You don't have to wait for a crisis to detect deadlocks. Java's `ThreadMXBean` provides `findMonitorDeadlockedThreads()` and `findDeadlockedThreads()` (the latter includes locks from `java.util.concurrent.locks`). Call this periodically (e.g., every 10 seconds) in a background thread and log the stack traces if any deadlock is detected. This can be a lifesaver in production.
Example code: `ThreadMXBean bean = ManagementFactory.getThreadMXBean(); long[] deadlockedThreadIds = bean.findDeadlockedThreads(); if (deadlockedThreadIds != null) { ThreadInfo[] infos = bean.getThreadInfo(deadlockedThreadIds, true, true); for (ThreadInfo info : infos) { log.error("Deadlocked thread: " + info); } }`. Integrate this into your health check or monitoring infrastructure.
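One possible way to run that check periodically is sketched below. The class name `DeadlockWatchdog`, the thread name, and the use of `System.err` in place of a real logger are assumptions for the example; wire it to whatever logging and alerting you already have.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DeadlockWatchdog {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Performs one check; returns an empty array when no deadlock is found. */
    static ThreadInfo[] findDeadlocks() {
        long[] ids = THREADS.findDeadlockedThreads(); // null when nothing is deadlocked
        if (ids == null) {
            return new ThreadInfo[0];
        }
        // true, true: include locked monitors and owned java.util.concurrent synchronizers
        return THREADS.getThreadInfo(ids, true, true);
    }

    /** Starts a daemon thread that repeats the check every periodSeconds. */
    static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "deadlock-watchdog");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                for (ThreadInfo info : findDeadlocks()) {
                    if (info != null) { // an entry is null if the thread has since died
                        System.err.println("Deadlocked thread: " + info);
                    }
                }
            } catch (RuntimeException e) {
                // An uncaught exception would silently cancel all future runs.
                System.err.println("Deadlock check failed: " + e);
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling `DeadlockWatchdog.start(10)` at application startup matches the 10-second interval suggested above. The watchdog uses its own thread on purpose: if it ran on an application pool, it could be stuck behind the very deadlock it is meant to report.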
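`jstack -F <pid>` forces a thread dump when the process is not responding. Use this if `jstack -l` hangs. Note that the `-F` option exists only in JDK 8 and earlier; on JDK 9 and later the equivalent is `jhsdb jstack --pid <pid>`. `jcmd <pid> Thread.print` is an alternative that also works well. For live monitoring, use VisualVM (shipped as `jvisualvm` with JDK 8 and earlier, a separate download since) or `jmc` to see thread states update in real time. On Kubernetes, you can get thread dumps via `kubectl exec <pod> -- jstack <pid>`.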
For deeper analysis, tools like `fastthread.io` or `spotify/threaddump-analyzer` can parse thread dumps and highlight patterns. However, I've found that manually reading the dump is often faster for pinpointing the exact cycle. Know how to read the raw output: the `#12` after the thread name is the Java thread ID, the `tid` is the address of the JVM's internal thread structure, and the `nid` is the native (OS) thread ID in hexadecimal. You can correlate `nid` with `top -H` to see native thread CPU usage; `top` shows thread IDs in decimal, so convert first.
The best fix is preventive: enforce a strict lock ordering. For example, if you have multiple locks, always acquire them in the same order (e.g., by a defined hierarchy, or by `System.identityHashCode` of the lock object with a tie-breaker lock for the rare case of equal hashes). Document this order and enforce it via code reviews. Where ordering cannot be guaranteed, use `tryLock` with a timeout instead of `lock()`: if the lock isn't acquired within a reasonable time, release all held locks and retry or fail.
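The timed-acquisition idea might look roughly like this for two locks. It is a sketch, not a drop-in utility: `TimedLocking` and `runWithBoth` are hypothetical names, and the caller decides what a `false` result means (retry, ideally after a short randomized pause so two threads do not collide again in lockstep, or fail the request).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

class TimedLocking {
    /**
     * Runs the action while holding both locks.
     * Returns false, holding nothing, if either lock could not be acquired in time.
     */
    static boolean runWithBoth(Lock first, Lock second, long timeoutMillis, Runnable action)
            throws InterruptedException {
        if (!first.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            if (!second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // the outer finally releases the first lock
            }
            try {
                action.run();
                return true;
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }
}
```

A timeout turns a permanent hang into a recoverable failure, but it does not remove the ordering problem; if timeouts fire regularly, the lock order still needs fixing.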
Higher-level abstractions like `LinkedBlockingQueue`, `CompletableFuture`, or reactive streams can eliminate explicit locking altogether. For thread pools, use separate pools for tasks that wait on each other, so a parent task can never occupy the thread its child needs. Also, consider using `StampedLock` for read-heavy workloads to reduce contention, keeping in mind that it is not reentrant. And always, always put unlock in finally blocks.
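For the unlock-in-finally rule, the plain idiom and a try-with-resources variant are sketched below. The `java.util.concurrent` locks do not implement `AutoCloseable`, so the wrapper here (`ScopedLocks.acquire` and the `Held` interface) is a hypothetical helper written for this example, not a JDK API.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class ScopedLocks {
    /** Handle for a held lock; close() releases it and throws no checked exception. */
    interface Held extends AutoCloseable {
        @Override
        void close();
    }

    /** Acquires the lock and returns a handle whose close() unlocks it. */
    static Held acquire(Lock lock) {
        lock.lock();
        return lock::unlock;
    }
}

class SafeCounter {
    private final Lock lock = new ReentrantLock();
    private int value;

    /** Classic idiom: lock() before the try, unlock() in finally. */
    int incrementClassic() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock();
        }
    }

    /** Same guarantee via try-with-resources and the hypothetical wrapper. */
    int incrementScoped() {
        try (ScopedLocks.Held held = ScopedLocks.acquire(lock)) {
            return ++value;
        }
    }
}
```

In both forms the lock is taken before the `try` begins, so a failed acquisition never triggers an unlock of a lock that was not held.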
Frequently asked questions
What does 'Found one Java-level deadlock' mean in jstack output?
It means the JVM detected a cycle where two or more threads are waiting for locks held by each other. The output will list the threads and the locks involved. This is a classic deadlock. If you see this, you have a lock ordering problem.
How do I get a thread dump if the JVM is unresponsive?
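Use `jstack -F <pid>` to force a dump on JDK 8 and earlier (`jhsdb jstack --pid <pid>` on JDK 9 and later), or `kill -3 <pid>` to send a SIGQUIT signal (the dump goes to the JVM's stdout and the process keeps running). If the process is completely hung, you might need to use `gcore` to create a core dump and then read the threads from it with `jstack <java-executable> <core>` on JDK 8 or `jhsdb jstack --exe <java-executable> --core <core>` on newer JDKs. `jhat` does not help here: it analyzes heap dumps, not thread state.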
Can deadlocks happen with java.util.concurrent locks like ReentrantLock?
Yes. Programmatically, `findDeadlockedThreads()` covers them, whereas `findMonitorDeadlockedThreads()` only sees `synchronized` monitors. The thread dump will show the threads in WAITING state with `parking to wait for <0x...>`, and `jstack -l` lists which thread owns each such lock. The fix is the same: ensure consistent lock ordering and use tryLock with timeouts.
My thread dump doesn't show a deadlock, but the application is stuck. What else could it be?
Check for threads in WAITING or TIMED_WAITING state that are waiting on conditions that may never be signaled (e.g., CountDownLatch, Condition, or a blocked I/O operation). Also check for thread pool deadlocks where tasks are queued waiting for each other. Use `jstack` multiple times to see if threads are progressing.
How do I prevent deadlocks in a large codebase?
Adopt a strict lock ordering policy and document it. Use static analysis tools like SpotBugs (the successor to FindBugs) to flag suspicious locking patterns. Write integration tests that exercise concurrent scenarios. Use `ThreadMXBean` in tests to assert no deadlocks. And prefer higher-level concurrency utilities that manage locking internally.