Docker · 12 min read

Debugging Inside Docker Containers: Tools, Techniques, and a War Story

Practical techniques for debugging inside Docker containers: exec, nsenter, strace, and a real incident that made me rethink debug images.

Docker, debugging, containers, strace, nsenter, debug-image

I've spent more hours than I'd like to admit debugging issues inside Docker containers. The first few times, I made the same mistakes: assuming the container had a shell, forgetting to install tools, or running a production image without debug symbols. This post is a collection of techniques I've settled on after those painful lessons.

We'll cover three levels of debugging: interactive sessions inside the container, host-level namespace inspection with nsenter, and advanced tracing with strace and tcpdump. I'll also walk through a real incident where a missing debug build cost our team an entire afternoon.

Level 1: Interactive Debugging with docker exec

The most straightforward way to debug a running container is to exec into it. This works if the container has a shell and the tools you need. But many production images are stripped down — Alpine-based images might not even have bash, only sh. Node.js and Go images often lack curl or ping.

Here's how I handle it: I keep a small script that tries bash, falls back to sh, and reminds me which install commands to run once I'm inside.

Simple script to exec into a container and install debug tools
#!/bin/bash
# usage: debug-docker <container-name>
CONTAINER=${1:?usage: debug-docker <container-name>}
# Try bash first, then fall back to sh for images that ship without it
docker exec -it "$CONTAINER" /bin/bash || docker exec -it "$CONTAINER" /bin/sh
# Then inside:
# apt-get update && apt-get install -y strace curl vim
# or for Alpine: apk add strace curl vim

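But there's a catch: these changes are ephemeral. They live in the container's writable layer, so they survive a plain restart but vanish as soon as the container is removed and recreated, which is what most deployments do. For a quick look, it's fine. For longer investigations, you want a dedicated debug container.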
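
The two install commands in the script differ only by distro, so the choice can be automated. What follows is a minimal sketch, assuming the image ships `/etc/os-release` and a `sh`; the function names `install_cmd_for` and `install_debug_tools` are made up for illustration.

```shell
#!/bin/sh
# Map an os-release ID to the command that installs the debug tools.
install_cmd_for() {
    case "$1" in
        debian|ubuntu) echo "apt-get update && apt-get install -y strace curl vim" ;;
        alpine)        echo "apk add --no-cache strace curl vim" ;;
        *)             return 1 ;;
    esac
}

# Hypothetical helper: read the distro ID inside the container,
# then run the matching install command there.
install_debug_tools() {
    container=$1
    os_id=$(docker exec "$container" sh -c '. /etc/os-release && echo "$ID"') || return 1
    cmd=$(install_cmd_for "$os_id") || {
        echo "unsupported distro: $os_id" >&2
        return 1
    }
    docker exec "$container" sh -c "$cmd"
}
```

Calling `install_debug_tools my-app` would then pick the right package manager. Images with no shell at all (scratch, distroless) fail at the first `docker exec`, which is the case the later sections deal with.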

Building a Debug Image

I maintain a Dockerfile for a debug image that I can spin up alongside any running container. It shares the same network and PID namespace so I can inspect the target's processes.

Minimal debug image with common tools
FROM alpine:latest
RUN apk add --no-cache strace curl vim htop tcpdump bash gdb
CMD ["/bin/bash"]

To use it, I run: `docker run -it --rm --pid=container:my-app --net=container:my-app --cap-add SYS_PTRACE my-debug-image`. This gives me full access to the target's processes and network.
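
Since that command is long and easy to mistype under pressure, it can be wrapped in a small function. This is only a sketch: `debug_attach` is a hypothetical name, and the default image name `my-debug-image` is taken from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper: start the debug image inside the PID and network
# namespaces of a running target container.
debug_attach() {
    target=$1
    image=${2:-my-debug-image}
    if [ -z "$target" ]; then
        echo "usage: debug_attach <target-container> [debug-image]" >&2
        return 2
    fi
    docker run -it --rm \
        --pid="container:$target" \
        --net="container:$target" \
        --cap-add SYS_PTRACE \
        "$image"
}
```

With `debug_attach my-app`, the session sees the target's processes. Its files are usually reachable under `/proc/1/root`, because PID 1 in the shared PID namespace is the target's main process.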

Level 2: Host-Level Debugging with nsenter

Sometimes the container is so broken that docker exec won't work — maybe the shell crashed, or the image doesn't have one. In that case, I use nsenter from the host. First, find the container's PID on the host:

Entering a container's namespaces from the host using nsenter
# docker inspect accepts the container name directly
HOST_PID=$(docker inspect -f '{{.State.Pid}}' my-app)
# No --mount here, so the shell and tools come from the host's filesystem
sudo nsenter --target "$HOST_PID" --uts --ipc --net --pid

Now you're inside the container's UTS, IPC, network, and PID namespaces, but you're still using the host's filesystem (which has all the tools). You can run strace, gdb, or anything else that's installed on the host. One quirk: /proc is still the host's, so ps shows host PIDs rather than the container's. This is my go-to when the container is completely unresponsive.

warning

nsenter requires root on the host. Also, be careful: you're sharing the container's namespaces but not its root filesystem. Any file you modify is a host file unless you also pass --mount, and with --mount every binary you run, including the shell nsenter starts, has to exist in the image.
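
For one-off checks, a full interactive session is often unnecessary. The sketch below runs a single host binary inside the container's network namespace. `run_in_netns` is a hypothetical helper, and it assumes it is called as root on a Linux host.

```shell
#!/bin/sh
# Hypothetical helper: run a host binary inside a container's network namespace.
# Needs root, because nsenter does.
run_in_netns() {
    container=$1
    shift
    pid=$(docker inspect -f '{{.State.Pid}}' "$container") || return 1
    # docker reports PID 0 for a container that is not running
    if [ "$pid" -eq 0 ]; then
        echo "container $container is not running" >&2
        return 1
    fi
    nsenter --target "$pid" --net -- "$@"
}
```

For example, `run_in_netns my-app ss -tlnp` would list the container's listening sockets using the host's copy of `ss`, even if the image has no shell.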

Level 3: Advanced Tracing with strace and tcpdump

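When the issue is about system calls or network behavior, I reach for strace or tcpdump. But they require capabilities. Specifically, strace needs SYS_PTRACE to attach to a running process, and tcpdump needs NET_RAW, which Docker includes in its default capability set. Some setups also call for NET_ADMIN.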
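
To check which of these capabilities a running container actually holds, the effective mask in `/proc` can be decoded by hand. This is a sketch under a few assumptions: a Linux host where container PIDs are visible in the host's `/proc`, and hypothetical helper names. The bit numbers come from `linux/capability.h`.

```shell
#!/bin/sh
# Capability bit numbers as defined in linux/capability.h.
CAP_NET_ADMIN=12
CAP_NET_RAW=13
CAP_SYS_PTRACE=19

# Succeed if the given bit is set in a hex capability mask
# (the format used by the CapEff line in /proc/<pid>/status).
has_cap() {
    mask=$1
    bit=$2
    [ $(( (0x$mask >> bit) & 1 )) -eq 1 ]
}

# Hypothetical helper: print the effective capability mask of a
# container's main process, read from the host's /proc.
container_capeff() {
    pid=$(docker inspect -f '{{.State.Pid}}' "$1") || return 1
    awk '/^CapEff:/ { print $2 }' "/proc/$pid/status"
}
```

A check such as `has_cap "$(container_capeff my-app)" "$CAP_SYS_PTRACE" && echo "ptrace capability present"` then shows whether attaching strace is likely to work before you try it.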

If you start the container with `--cap-add=SYS_PTRACE`, you can attach strace to any process. Here's an example of tracing all system calls from a Python app:

Running strace inside a container with SYS_PTRACE capability
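# app.py is assumed to be baked into the image or bind-mounted into it
docker run --cap-add=SYS_PTRACE -d --name myapp python:3.10 python app.py
# The python image does not ship strace, so install it (and ps) first
docker exec myapp sh -c 'apt-get update && apt-get install -y strace procps'
# Find the PID inside the container
docker exec myapp ps aux
# Attach strace to PID 1; -f follows child processes, -o writes the log inside the container
docker exec myapp strace -p 1 -f -o /tmp/strace.log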
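
A log produced with `-f -o` gets long quickly, so I find it useful to reduce it to the calls that failed. The function below is a rough sketch and not part of strace itself: `strace_errors` is a made-up name, and it assumes the default line format in which failures end in `= -1 ERRNO (...)`.

```shell
#!/bin/sh
# Count failing syscalls in an strace log, grouped by syscall name and errno.
strace_errors() {
    awk '
        /<\.\.\. / { next }                      # skip split "resumed" records for simplicity
        match($0, /= -1 E[A-Z0-9]+/) {
            errno = substr($0, RSTART + 5, RLENGTH - 5)
            call = $0
            sub(/^[0-9]+ +/, "", call)           # drop the PID column that -f adds
            sub(/\(.*/, "", call)                # keep only the syscall name
            count[call " " errno]++
        }
        END { for (key in count) print count[key], key }
    ' "$1" | sort -k1,1nr -k2
}
```

After copying the log out with `docker cp myapp:/tmp/strace.log .`, running `strace_errors strace.log` prints one line per syscall and errno pair, most frequent first.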

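For network debugging, tcpdump is invaluable. Passing NET_ADMIN and NET_RAW explicitly keeps it working even when the default capability set has been trimmed:

Running tcpdump inside a container
docker run --cap-add=NET_ADMIN --cap-add=NET_RAW -d --name myapp nginx
# The stock nginx image does not include tcpdump, so install it first
docker exec myapp sh -c 'apt-get update && apt-get install -y tcpdump'
docker exec myapp tcpdump -i eth0 -w /tmp/dump.pcap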
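
An alternative that leaves the target image untouched is to capture from a debug container that shares the target's network namespace and to stream the pcap to the host. This is a sketch under assumptions: `capture_traffic` is a hypothetical helper, `my-debug-image` is the debug image built earlier, and the interface is assumed to be `eth0`.

```shell
#!/bin/sh
# Hypothetical helper: capture packets from a target container's network
# namespace and write the pcap to a file on the host.
capture_traffic() {
    target=$1
    outfile=$2
    count=${3:-100}   # stop after this many packets
    docker run --rm \
        --net="container:$target" \
        --cap-add NET_RAW --cap-add NET_ADMIN \
        my-debug-image \
        tcpdump -i eth0 -c "$count" -w - > "$outfile"
}
```

`capture_traffic my-app dump.pcap 500` would leave a file you can open directly in Wireshark. `-w -` makes tcpdump write the capture to standard output, and the run deliberately omits `-t` so the binary stream is not mangled by a TTY.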

War Story: The Missing Debug Build

The 4-Hour Debug Symbol Hunt

  1. 14:00 Deploy new Go service to staging. Service crashes on startup with SIGSEGV.
  2. 14:10 Check logs: no stack trace, just 'fatal error: unexpected signal'.
  3. 14:30 Attempt to exec into container: /bin/sh not found (scratch image). Use nsenter.
  4. 14:45 Run strace on PID 1: see mmap failing with ENOMEM. Suspicious.
  5. 15:00 Notice the binary was compiled without debug symbols (-ldflags="-s -w"). No line numbers in core dumps.
  6. 15:30 Rebuild binary with debug symbols, deploy, get proper stack trace: null pointer in a CGO library.
  7. 18:00 Fix the bug: missing nil check after C call.

Lesson

Always keep debug symbols in your staging builds. You can strip them in production, but for debugging, you need line numbers. Also, if you use a scratch image, keep a debug image with busybox and strace ready.
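
One way to make that lesson hard to forget is to let the build script decide based on the target environment. The sketch below is an illustration, not our actual build: `build_service` and the environment names are assumptions. The flags themselves are standard Go linker flags: `-s` omits the symbol table and `-w` omits DWARF debug information.

```shell
#!/bin/sh
# Hypothetical build step: strip symbols only for production builds.
build_service() {
    env=$1
    out=$2
    case "$env" in
        production)
            # Smaller binary, but debuggers lose symbols and DWARF data
            go build -ldflags="-s -w" -o "$out" .
            ;;
        *)
            # Staging and everything else keep full debug information
            go build -o "$out" .
            ;;
    esac
}
```

`build_service staging ./app` then produces a binary that gdb can map back to source lines, while `build_service production ./app` keeps the stripped variant for release.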

Debug Containers as a Service

At scale, I've seen teams create a 'debug sidecar' pattern. They run a separate container in the same pod (or docker-compose service) that shares the same namespaces. This way, the debug tools are always available without modifying the target image.

For Kubernetes, you can use ephemeral containers (kubectl debug) which do exactly this. In Docker Compose, you can define a debug service that uses `network_mode: service:target` and `pid: service:target`.

Docker Compose with a debug sidecar sharing namespaces
services:
  app:
    image: myapp:latest
    cap_add:
      - SYS_PTRACE
  debug:
    image: debug-toolbox:latest
    pid: "service:app"
    network_mode: "service:app"
    cap_add:
      - SYS_PTRACE
    # Keep the shell alive so `docker compose exec debug bash` has a container to enter
    stdin_open: true
    tty: true
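
The Kubernetes counterpart mentioned above needs no manifest changes. Here is a sketch of a wrapper around `kubectl debug`, assuming the cluster supports ephemeral containers; `debug_pod` is a hypothetical name and `busybox` is just a placeholder default image.

```shell
#!/bin/sh
# Hypothetical wrapper: attach an ephemeral debug container to a pod,
# sharing the process namespace of one of its containers.
debug_pod() {
    pod=$1
    container=$2
    image=${3:-busybox}
    if [ -z "$pod" ] || [ -z "$container" ]; then
        echo "usage: debug_pod <pod> <target-container> [image]" >&2
        return 2
    fi
    kubectl debug -it "$pod" --image="$image" --target="$container"
}
```

`debug_pod my-app-pod app debug-toolbox` would open an interactive session next to the `app` container. `--target` is what lets the debug container see that container's processes.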

The best debug tool is the one you have ready before you need it.

Final Thoughts

Debugging inside containers is not fundamentally different from debugging on a VM or bare metal — the same tools work. The difference is that containers are ephemeral and often minimal. Plan ahead: build a debug image, grant capabilities in staging, and know how to use nsenter when exec fails.

Next time you hit a mysterious segfault in a scratch container, you'll know exactly what to do.

Frequently asked questions

How do I install debugging tools inside a container without modifying the image?

Use `docker exec -it <container> /bin/bash` (or `/bin/sh` on images without bash) and then run `apt-get update && apt-get install -y strace curl vim` (for Debian-based images). For Alpine, use `apk add strace curl`. This is ephemeral: the packages live in the container's writable layer and are gone once the container is removed and recreated.

What is the difference between docker exec and nsenter?

`docker exec` runs a process inside the container's namespaces via the Docker API. `nsenter` is a host-level tool that enters the container's namespaces (pid, net, mount) directly using the container's PID. nsenter works even if the container has no shell or the Docker daemon is unresponsive.

Can I run strace inside a container without --cap-add SYS_PTRACE?

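Usually not, at least for attaching to a process that is already running. By default, containers run with a restricted set of capabilities that does not include SYS_PTRACE, and older Docker releases also block the ptrace system call in their default seccomp profile. To attach with `strace -p`, start the container with `--cap-add SYS_PTRACE` or use `docker run --privileged` (not recommended).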

How do I create a reusable debug container for any running container?

Build a debug image (e.g., `debug-toolbox`) containing tools like strace, curl, nmap, htop, vim. Then, to debug a running container, use `docker run -it --rm --pid=container:<target> --net=container:<target> --cap-add SYS_PTRACE debug-toolbox bash`. This shares the target's namespaces.