What this usually means
t.Parallel() lets tests run concurrently, but if they share any mutable state (package-level variables, global config, test data structures, or even test-scoped variables passed by pointer), you get data races. The Go race detector catches these only if the race actually happens during the test run. Because scheduling is nondeterministic, the race may not trigger on every execution. CI environments with different CPU counts, load, or container limits often expose races that stay hidden on a developer's laptop.
The first ten minutes — establish facts before touching code.
1. Run `go test -race -count=1 ./...`. If it reports a race, fix that immediately.
2. Run `go test -race -count=10 -failfast ./...` to stress the race; note the failing test names.
3. Isolate the flaky test: `go test -race -run 'TestName' -count=100 .` to reproduce reliably.
4. Add `-v` and capture the exact ordering of t.Parallel() subtests; look for shared variables printed in logs.
5. Check for package-level variables or init() functions that mutate state across tests.
6. Review test helpers that write to shared slices or maps without synchronization.
The specific files, logs, configs, and dashboards that usually own this bug.
- All files in the test package with `t.Parallel()` calls
- Package-level `var` declarations that tests modify (a `const` cannot be reassigned, so it is never the culprit)
- init() functions that set global state
- Test helpers that append to or modify shared slices/maps
- The race detector output: `go test -race -v 2>&1 | grep "WARNING: DATA RACE"`
- Benchmark or integration test files that share a common setup function
- CI logs: compare the test output of successful vs failing runs for ordering differences
Practical causes, not theory. These are the things you will actually find.
- Package-level variables used as test fixtures without reset between parallel tests
- Shared test config struct passed by pointer to multiple parallel subtests
- t.Parallel() inside a for-loop that captures the loop variable (pre-Go 1.22)
- Test helper that writes to a global counter or map for tracking test progress
- Database/HTTP mock that is stateful and reused across parallel tests
- Subtests that modify the same file or environment variable
- Test cleanup (t.Cleanup) that races with other tests still running
Concrete fix directions. Pick the one that matches your root cause.
- Use t.Run() with t.Parallel() but pass copies of data, not pointers. Before Go 1.22, also shadow the loop variable with `tc := tc` inside the loop
- Guard state that must stay shared with a `sync.Mutex`, or use `sync.Map` for concurrent-safe accumulation
- Refactor shared state into per-test setup: move globals into test-local variables inside `t.Run()`
- Avoid `t.Parallel()` in subtests that mutate the same file or external resource; run those serially
- Replace mutable package-level variables with test-scoped ones; reserve `TestMain` setup and teardown for process-wide resources that tests only read
- Add a `sync.WaitGroup` to coordinate cleanup, but prefer `t.Cleanup` over a manual `defer`: a `defer` in the parent test fires when the parent function returns, which is before its paused parallel subtests resume
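The per-test setup direction can be sketched as follows. This is a minimal illustration, not a drop-in fix: the `config` type, the `newConfig` helper, and the test name are hypothetical, and the package name is a placeholder for your own test package. Each subtest builds its own fixture and writes to its own directory from `t.TempDir()`, so there is nothing left to synchronize.

```go
package main // placeholder; use your own test package name

import (
	"os"
	"path/filepath"
	"testing"
)

// config is a hypothetical fixture standing in for whatever your tests share.
type config struct {
	Retries int
	Tags    map[string]string
}

// newConfig returns a fresh fixture on every call, so no two subtests
// ever hold the same struct or the same map.
func newConfig() *config {
	return &config{Retries: 3, Tags: map[string]string{"env": "test"}}
}

func TestConfigIsolation(t *testing.T) {
	for _, name := range []string{"a", "b", "c"} {
		name := name // copy; only required before Go 1.22
		t.Run(name, func(t *testing.T) {
			t.Parallel()

			cfg := newConfig()      // private to this subtest
			cfg.Tags["case"] = name // safe: no other subtest can see this map

			// t.TempDir gives each subtest its own directory and removes it afterwards.
			path := filepath.Join(t.TempDir(), "state.txt")
			if err := os.WriteFile(path, []byte(name), 0o600); err != nil {
				t.Fatal(err)
			}
			if cfg.Tags["case"] != name {
				t.Errorf("tag = %q, want %q", cfg.Tags["case"], name)
			}
		})
	}
}
```

Saved in a `_test.go` file, this should stay quiet under `go test -race -count=100`, because every write lands in memory or on disk that only one subtest owns.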
A fix you cannot prove is a guess. Close the loop.
- Run `go test -race -count=100 ./flaky-package`: zero failures after the fix
- Toggle the `-parallel` flag: `go test -parallel 1` should always pass; `-parallel 8` should also pass
- Add `t.Parallel()` to every subtest and run with `-race`: no data race warnings
- Compare test logs before and after: no unexpected zero values or stale state
- Deploy to CI and observe 10 consecutive green runs on the flaky test suite
Things that make this bug worse or harder to find.
- Blindly removing t.Parallel(): you lose test speed and hide the real bug
- Adding `time.Sleep()` to work around races: it masks the issue and makes tests slower
- Using `-race` only once and assuming clean output means no race
- Ignoring loop variable capture in Go <1.22: always copy the variable
- Sharing a database transaction or connection between parallel tests without proper isolation
- Using `sync/atomic` without understanding memory ordering: it can still race if not used correctly
The Flaky CI Race in a Payment Gateway Test Suite
Timeline
- 09:15 CI fails on TestPaymentRetry, but only on the second run of the day.
- 09:30 Local `go test -race ./...` passes 10 times in a row.
- 09:45 I notice the failing test shares a package-level variable `processedTransactions` with other tests.
- 10:00 Run `go test -race -count=100 -run TestPaymentRetry` locally; it finally catches a race after 47 runs.
- 10:15 Race detector output shows concurrent write to `processedTransactions` map from two parallel subtests.
- 10:30 I check git blame: `processedTransactions` was added 6 months ago for logging, never meant to be thread-safe.
- 10:45 Fix: replace the global map with a test-local variable created inside `t.Run()`.
- 11:00 Re-run CI 10 times: all green. The race is gone.
The CI failure was intermittent: TestPaymentRetry would fail only on the second run of the day. Locally, I couldn't reproduce it. The test suite had 200+ tests, many using t.Parallel(). I spent two hours adding debug prints and running subsets, but the failure never showed up.
I finally ran `go test -race -count=100 -run TestPaymentRetry` and saw the race after 47 runs. The race detector pointed to a package-level map `processedTransactions` that was being written by multiple parallel subtests. This map was introduced months ago to track retries for monitoring—never intended for concurrent access.
The fix was simple: move the map inside the test function so each test gets its own instance. No mutex, no atomic—just no sharing. After the change, CI passed consistently. The lesson: package-level variables in parallel tests are landmines. If it doesn't need to be shared, don't share it.
Root cause
Package-level variable `processedTransactions` was written by multiple parallel subtests without synchronization.
The fix
Moved the map declaration inside the test function, making it local to each test execution.
The lesson
Never use package-level mutable state in test suites with t.Parallel(). Always allocate test-scoped data inside t.Run().
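The shape of that fix can be sketched like this. Only the names `processedTransactions` and `TestPaymentRetry` come from the incident; the `recordRetry` helper, the transaction IDs, and the package name are hypothetical stand-ins, since the real payment code is not shown here.

```go
package main // placeholder; use your own test package name

import (
	"fmt"
	"testing"
)

// Before (racy): one package-level map written by every parallel subtest.
//
//	var processedTransactions = map[string]int{}

// recordRetry is a hypothetical stand-in for the code under test.
// It writes into whichever map the caller hands it.
func recordRetry(processed map[string]int, txID string) {
	processed[txID]++
}

func TestPaymentRetry(t *testing.T) {
	for i := 0; i < 4; i++ {
		txID := fmt.Sprintf("tx-%d", i) // declared per iteration, so safe to capture
		t.Run(txID, func(t *testing.T) {
			t.Parallel()

			// After: the map is created inside the subtest, so nothing is shared.
			processedTransactions := make(map[string]int)
			recordRetry(processedTransactions, txID)
			recordRetry(processedTransactions, txID)

			if got := processedTransactions[txID]; got != 2 {
				t.Errorf("retries for %s = %d, want 2", txID, got)
			}
		})
	}
}
```

Passing the map into the helper, rather than letting the helper reach for a global, is what makes the per-test allocation possible in the first place.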
The Go race detector is a dynamic analysis tool: it only reports races that actually happen during execution. If the scheduler never interleaves the conflicting accesses in a way that triggers the race, the detector stays silent. This is why running tests with `-race` multiple times (e.g., `-count=100`) is essential for exposing flaky races.
CI environments often have different CPU counts, load, or container CPU limits, which change the scheduler behavior. A race that never triggers on a 4-core laptop might trigger every time on a 2-core CI runner. Always run `-race` with `-count=10` at minimum on the CI build to reduce false negatives.
A classic source of bugs in parallel subtests is capturing the loop variable in a closure. In Go versions before 1.22, the loop variable is reused across iterations. So `for _, tc := range tests { t.Run(tc.name, func(t *testing.T) { t.Parallel(); fmt.Println(tc.input) }) }` typically makes every subtest see the last value of `tc`: `t.Parallel()` pauses each subtest until the parent test function returns, by which time the loop has finished. Any goroutine that reads `tc` while the loop is still advancing races on the variable as well.
The fix is to shadow the variable: `tc := tc` inside the loop before the closure. Go 1.22 gives each iteration its own variable, but only for modules whose `go.mod` declares `go 1.22` or later, so if your project targets older versions you must always copy. Plain `go vet ./...` helps here: its `loopclosure` analyzer reports many loop variable captures, though it does not catch every form of the pattern.
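A table-driven test with the copy in place might look like the sketch below. The `normalize` function and its test cases are hypothetical and exist only to give the subtests something to check; the package name is a placeholder.

```go
package main // placeholder; use your own test package name

import (
	"strings"
	"testing"
)

// normalize is a hypothetical function under test.
func normalize(s string) string {
	return strings.ToUpper(strings.TrimSpace(s))
}

func TestNormalize(t *testing.T) {
	tests := []struct {
		name, input, want string
	}{
		{"padded", "  abc ", "ABC"},
		{"mixed", "aBc", "ABC"},
		{"empty", "", ""},
	}
	for _, tc := range tests {
		tc := tc // per-iteration copy; redundant but harmless on Go 1.22+
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel() // the subtest resumes after the loop ends, so it must own its tc
			if got := normalize(tc.input); got != tc.want {
				t.Errorf("normalize(%q) = %q, want %q", tc.input, got, tc.want)
			}
		})
	}
}
```

Without the `tc := tc` line, a pre-1.22 build would most likely run the "empty" case three times and report nothing wrong, which is why this bug tends to hide as missing coverage rather than as a failure.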
A systematic approach: 1) Run `go test -race -count=200 -failfast ./package` to find any race quickly. 2) If a race is found, isolate the failing test with `-run TestName`. 3) Vary the scheduling with `-parallel 8` or `GOMAXPROCS=2`; a lower `GOMAXPROCS` does not add parallelism, but it changes how goroutines interleave. 4) Use the `stress` tool from `golang.org/x/tools/cmd/stress` to run the test under heavy load: `stress -p 4 go test -race -run TestName`.
To avoid re-invoking `go test` on every iteration, you can also build the test binary once with `go test -race -c -o pkg.test .` and run `stress -p 4 ./pkg.test -test.run=TestName`. `stress` keeps looping until you interrupt it and reports failures as they occur, which can surface races that only appear under high concurrency. Document the exact command that reproduces the race so it can be used for regression testing.
Some engineers advocate removing t.Parallel() to avoid races entirely. That's a bad trade-off: you lose test speed and the race may still exist in production code. The goal should be to fix the race, not hide it. Parallel tests expose concurrency bugs that could also affect production. Use t.Parallel() as a tool to catch these bugs early.
However, if a test is inherently sequential (e.g., testing a global rate limiter), it's fine to omit t.Parallel(). But don't remove it just because it's convenient—fix the underlying data race. Your production code will thank you.
Frequently asked questions
Why does `go test -race` report a race only sometimes?
The race detector is a dynamic tool—it only catches races that actually occur during execution. Because goroutine scheduling is nondeterministic, the same test run may or may not trigger the race. To increase the chance of detection, run the test many times with `-count=100` and vary the parallelism with `-parallel` flag or `GOMAXPROCS`.
Is it safe to use `sync.Mutex` in test helpers called from parallel subtests?
Yes, but be careful: mutexes protect shared state, but if your test helper modifies global state under a mutex, you lose parallelism—only one subtest can run the helper at a time. That might be fine if the helper is fast. A better design is to avoid sharing state altogether: make the helper return a fresh copy and let each subtest have its own data.
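If the state really does have to be shared, a mutex-guarded helper could look like the following sketch. The `eventLog` type and its methods are hypothetical, as is the package name. The subtests sit inside a group because `t.Run` does not return until the group's parallel subtests have finished, which makes it safe to inspect the log afterwards.

```go
package main // placeholder; use your own test package name

import (
	"sync"
	"testing"
)

// eventLog is a hypothetical shared recorder guarded by a mutex.
type eventLog struct {
	mu     sync.Mutex
	events []string
}

// add appends one event; the lock serializes concurrent callers.
func (l *eventLog) add(e string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.events = append(l.events, e)
}

// snapshot returns a copy, so callers never alias the guarded slice.
func (l *eventLog) snapshot() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return append([]string(nil), l.events...)
}

func TestSharedLog(t *testing.T) {
	shared := &eventLog{}

	// The group's t.Run returns only after its parallel subtests complete.
	t.Run("group", func(t *testing.T) {
		for _, name := range []string{"a", "b", "c"} {
			name := name // copy; only required before Go 1.22
			t.Run(name, func(t *testing.T) {
				t.Parallel()
				shared.add(name)
			})
		}
	})

	if n := len(shared.snapshot()); n != 3 {
		t.Errorf("recorded %d events, want 3", n)
	}
}
```

The order of events in the log is not deterministic here, so the assertion checks only the count. A test that needs ordering is usually better served by per-subtest data.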
My test passes with `-race` locally but fails on CI. What should I do?
CI often has different CPU count, load, or container limits that affect goroutine scheduling. First, mimic CI's environment: set `GOMAXPROCS=2` or use Docker with the same CPU limit. Then run `go test -race -count=100` to stress the race. If that still doesn't reproduce, add logging of shared state to see if it's corrupted. The root cause is almost always a race that doesn't trigger locally due to different scheduling.
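One low-effort way to compare the two environments is to have the test binary print its own scheduling parameters. The sketch below assumes the package has no `TestMain` yet; the `schedulerInfo` helper and the package name are hypothetical.

```go
package main // placeholder; use your own test package name

import (
	"fmt"
	"os"
	"runtime"
	"testing"
)

// schedulerInfo reports the values that most directly shape goroutine scheduling.
func schedulerInfo() string {
	return fmt.Sprintf("GOMAXPROCS=%d NumCPU=%d GOOS=%s GOARCH=%s",
		runtime.GOMAXPROCS(0), // passing 0 reads the current setting without changing it
		runtime.NumCPU(),
		runtime.GOOS,
		runtime.GOARCH,
	)
}

// TestMain runs once per test binary, before any test in the package.
func TestMain(m *testing.M) {
	fmt.Fprintln(os.Stderr, schedulerInfo())
	os.Exit(m.Run())
}
```

If the line printed on CI differs from the one on your laptop, that difference is the first thing to reproduce locally, for example by exporting the same `GOMAXPROCS` value before running the tests.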
Can I use `t.Cleanup` safely with `t.Parallel()`?
Yes, `t.Cleanup` is safe to use with `t.Parallel()`. Cleanup functions run after the test function has returned and all of its subtests, including parallel ones, have completed, in last-registered, first-run order. That guarantee only covers the test that registered the cleanup: other parallel tests may still be running at that moment, so a cleanup that touches state shared with them is still a race. Keep the resources a cleanup releases private to its own test.
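A helper that pairs each resource with its own cleanup keeps that property by construction. In this sketch the `store` type, `openStore`, and `newStore` are hypothetical, and the package name is a placeholder.

```go
package main // placeholder; use your own test package name

import "testing"

// store is a hypothetical per-test resource.
type store struct {
	data   map[string]string
	closed bool
}

func openStore() *store { return &store{data: make(map[string]string)} }

// Close marks the store as released.
func (s *store) Close() { s.closed = true }

// newStore hands the caller a private store and registers its cleanup
// on the same test, so the cleanup can only ever touch that test's data.
func newStore(t testing.TB) *store {
	t.Helper()
	s := openStore()
	t.Cleanup(s.Close) // runs once this test and its subtests have completed
	return s
}

func TestStores(t *testing.T) {
	for _, name := range []string{"a", "b"} {
		name := name // copy; only required before Go 1.22
		t.Run(name, func(t *testing.T) {
			t.Parallel()
			s := newStore(t)
			s.data[name] = "value"
			if s.closed {
				t.Error("store was closed while the subtest was still running")
			}
		})
	}
}
```

Registering the cleanup on the subtest's own `t`, rather than on the parent, is the design choice that matters: the resource lives exactly as long as the subtest that uses it.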