LEARN · DEBUGGING GUIDE

GitLab CI Pipeline Stuck in Pending: Debugging Runner Availability and Tag Mismatches

When a GitLab CI pipeline hangs in 'pending', it's almost always a runner availability issue—either no runner matches the job's tags, runners are offline, or they've hit their concurrency ceiling.

Intermediate · CI/CD · 6 min read

What this usually means

The pending state in GitLab CI means the job has been created but no runner has picked it up. This is distinct from 'blocked' or 'waiting for manual action'. The core issue is a mismatch between what the job requires (tags, executor type, or runner version) and what the registered runners offer. Common causes include: no runners registered at all, runners with tags that don't match job tags, runners that are offline or unreachable, runners that have exhausted their concurrency limit, or a GitLab Runner manager that hasn't autoscaled new instances. It can also happen if the runner's token was revoked or expired.

( 01 ) Fast diagnosis

The first ten minutes: establish facts before touching code.

  1. Check runner status: `sudo gitlab-runner status` on the runner host
  2. List registered runners: `sudo gitlab-runner list` shows the runners defined in config.toml; compare their tags in the GitLab UI with the job's tags
  3. Check the GitLab UI: go to the project's Settings > CI/CD > Runners (or the Admin area's Runners page for instance-wide runners) and see if any runners are 'online'
  4. Verify runner logs: `sudo journalctl -u gitlab-runner -n 50` for errors like '403 Forbidden' or 'dial tcp'
  5. Test connectivity from runner to GitLab: `curl -v https://gitlab.example.com/api/v4/runners` (a 401 response without a token still proves the host is reachable)
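
The log check in step 4 can be made repeatable with a small filter. This is a minimal sketch: the function name `classify_runner_log` is hypothetical, and the patterns are only the error strings quoted in this guide, so real logs may need more.

```shell
#!/bin/sh
# classify_runner_log: read runner log lines on stdin and print one hint per
# failure category found. The function name is hypothetical; the patterns are
# the error strings quoted in this guide, not an exhaustive list.
classify_runner_log() {
  log=$(cat)
  found=0
  if printf '%s\n' "$log" | grep -q -E '401 Unauthorized|403 Forbidden'; then
    echo "auth: token rejected, check the runner token"
    found=1
  fi
  if printf '%s\n' "$log" | grep -q -E 'dial tcp|no such host|connection refused'; then
    echo "network: runner cannot reach GitLab"
    found=1
  fi
  if [ "$found" -eq 0 ]; then
    echo "no known error pattern found"
  fi
}
```

One way to use it on a systemd host: `sudo journalctl -u gitlab-runner -n 50 --no-pager | classify_runner_log`.
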
( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • GitLab UI: Project > Settings > CI/CD > Runners
  • Runner host: /etc/gitlab-runner/config.toml
  • Runner logs: /var/log/gitlab-runner/gitlab-runner.log or journalctl
  • GitLab server logs: /var/log/gitlab/gitlab-rails/production.log for runner registration errors
  • CI job trace: expand the job and check for 'Waiting for runner' or 'Stuck' messages
  • Runner token: verify in Admin > Runners that the runner's authentication token hasn't been reset and the runner itself hasn't been deleted
( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Job has tags (e.g., 'docker', 'linux') but no runner registered with matching tags
  • All runners are offline due to network issues or resource exhaustion
  • Runner concurrency limit reached (e.g., `concurrent = 1` in config.toml lets only one job run at a time, so every other job waits in pending)
  • Runner token expired or invalid after GitLab upgrade or token rotation
  • GitLab Runner version mismatch: an older runner may not support features that a newer GitLab instance expects
  • Autoscaling runner (e.g., Docker Machine, Kubernetes) failed to provision new instances
( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Make job and runner tags match, or remove tags from the job (use `tags: []` or remove the `tags` keyword); an untagged job still needs a runner that accepts untagged jobs
  • Restart runner: `sudo gitlab-runner restart` and check logs
  • Increase concurrency in config.toml: set the global `concurrent = 4` and restart runner
  • Re-register runner with new token: `sudo gitlab-runner register`
  • For autoscaled runners, check cloud provider quota and scaling configuration
  • Add a runner without tags (or with an empty tag list) to act as a fallback for untagged jobs
( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • Run a simple job with tags matching a known online runner and confirm it is picked up within seconds
  • Monitor runner logs in real time: `journalctl -u gitlab-runner -f` while triggering a pipeline
  • Check GitLab UI: after fix, runner should show 'online' and active jobs count should increment
  • Use the API to check runner status: `curl --header "PRIVATE-TOKEN: $TOKEN" "https://gitlab.com/api/v4/runners"`
  • Verify pipeline transitions from 'pending' to 'running' within seconds of triggering
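
The last check can be scripted as a polling loop instead of refreshing the UI. The sketch below is one possible shape, not a definitive tool: `wait_for_pickup` and `pipeline_status` are hypothetical names, `GITLAB_URL`, `PROJECT_ID`, `PIPELINE_ID` and `TOKEN` are placeholders you would set yourself, and `jq` is assumed to be installed.

```shell
#!/bin/sh
# pipeline_status: print the pipeline's status field from the GitLab API.
# GITLAB_URL, PROJECT_ID, PIPELINE_ID and TOKEN are placeholders (assumptions);
# jq is assumed to be available.
pipeline_status() {
  curl --silent --fail --header "PRIVATE-TOKEN: $TOKEN" \
    "$GITLAB_URL/api/v4/projects/$PROJECT_ID/pipelines/$PIPELINE_ID" |
    jq -r '.status'
}

# wait_for_pickup CMD [ATTEMPTS] [DELAY]: call CMD repeatedly until it prints
# a non-empty status other than "pending". Prints the last status seen and
# returns 1 if the pipeline never left pending (or no status could be read).
wait_for_pickup() {
  get_status=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  status=""
  while [ "$i" -lt "$attempts" ]; do
    status=$("$get_status")
    if [ -n "$status" ] && [ "$status" != "pending" ]; then
      echo "$status"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "${status:-unknown}"
  return 1
}
```

A typical call would be `wait_for_pickup pipeline_status 30 2`, which polls for roughly a minute. Passing the status command as an argument keeps the loop easy to try out with a stub function before pointing it at a real instance.
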
( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Removing all runners without registering new ones first: pipelines will never run
  • Setting `tags: []` on jobs that need specific environments (e.g., Windows, macOS) without having a matching runner
  • Assuming a config.toml change has taken effect without checking the runner logs or restarting the service
  • Reusing one runner token across several hosts (for example, by copying config.toml): GitLab then cannot tell the machines apart, which hides which one is unhealthy
  • Ignoring runner log errors like '403 Forbidden' which indicate token issues
  • Assuming all runners are healthy just because they appear online; check the last contact time
( 07 ) War story

Production Pipeline Stuck for Over 30 Minutes Due to Tag Mismatch

Senior Platform Engineer · GitLab EE 15.11, GitLab Runner 15.11, Docker executor on AWS EC2, Terraform-managed runners

Timeline

  1. 09:15 Alert: Pipeline for release/2.3.0 stuck in pending for 5 minutes
  2. 09:20 Checked GitLab Runners page: two runners listed, both 'online' but last contact 10 min ago
  3. 09:25 SSH into runner host: `sudo gitlab-runner status` shows 'running' but `sudo gitlab-runner list` shows no tags
  4. 09:30 Reviewed job YAML: deploy job has 'tags: production' but runners have no tags
  5. 09:35 Added tag 'production' to runner in config.toml and restarted runner
  6. 09:37 Pipeline still pending; deeper check reveals second runner with wrong tag 'prod'
  7. 09:40 Corrected second runner tag and restarted both runners
  8. 09:42 Pipeline transitions to running, deploy completes at 09:48

The alert fired at 09:15 for a release pipeline that had been pending for five minutes. I checked the GitLab Runners admin page and saw two runners for the project—both showed as online, but their last contact was ten minutes ago. That was odd; normally they poll every few seconds. I SSH'd into one runner host and ran `sudo gitlab-runner status`, which reported it as running. But `sudo gitlab-runner list` revealed the runner had no tags configured.

I then opened the pipeline's job YAML. The deploy job had `tags: production`. That was the smoking gun: the job required a tag that no runner had. I quickly edited the runner's `/etc/gitlab-runner/config.toml` and added `tags = ["production"]` under the runner section. After restarting the runner with `sudo gitlab-runner restart`, I expected the pipeline to pick up immediately. But it didn't.

I went back to the runners list and noticed the second runner, which at first glance also appeared to have no tags. But when I checked its config, someone had set `tags = ["prod"]` instead of `tags = ["production"]`. That abbreviated tag was the real culprit. I fixed the tag, restarted both runners, and within thirty seconds the pipeline started running. The deploy completed at 09:48, 33 minutes after the alert, all because of mismatched tags.

Root cause

Job tag 'production' did not match runner tags (one had no tags, the other had 'prod').

The fix

Corrected runner tags to 'production' in config.toml and restarted both runners.

The lesson

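Always validate that runner tags match job tags exactly, including case and spelling. Use consistent naming conventions and consider an automated check that compares job tags against registered runner tags; CI lint validates pipeline syntax, not whether any runner carries a given tag.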

( 08 ) Runner Registration and Token Lifecycle

Each runner authenticates with its own token, issued by GitLab at registration. The token is stored in the runner's config.toml and sent with every request for jobs. If the token is revoked or reset (e.g., after a GitLab admin rotates tokens), the runner will appear as 'offline' even though the process is still running; a runner whose token never worked shows as 'never contacted'.

To check token validity, look at the runner logs for '401 Unauthorized' or '403 Forbidden', or run `sudo gitlab-runner verify` on the runner host. You can also check the token directly against the API, which identifies the runner by its own token rather than by a personal access token or runner ID: `curl --request POST "https://gitlab.example.com/api/v4/runners/verify" --form "token=$RUNNER_TOKEN"`. If the token is invalid, re-register the runner with a fresh token from Admin > Runners.
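
The API check can be wrapped so that the HTTP status maps straight to a next step. This is a sketch under assumptions: the function names are hypothetical, the token is whatever value sits in the runner's config.toml, and newer GitLab versions may also expect a `system_id` parameter for some token types, so check the API documentation for your version.

```shell
#!/bin/sh
# verify_runner_token URL TOKEN: POST the runner's own token to the
# /runners/verify endpoint and print only the HTTP status code.
# (Function name is hypothetical; URL and TOKEN are supplied by the caller.)
verify_runner_token() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --request POST "$1/api/v4/runners/verify" --form "token=$2"
}

# explain_verify_status CODE: translate the status code into a next step.
explain_verify_status() {
  case "$1" in
    200) echo "token valid" ;;
    403) echo "token rejected: re-register the runner" ;;
    000) echo "no HTTP response: check network before blaming the token" ;;
    *)   echo "unexpected status $1" ;;
  esac
}
```

For example: `explain_verify_status "$(verify_runner_token https://gitlab.example.com "$RUNNER_TOKEN")"`. curl reports `000` when it never received an HTTP response, which is why that case points at the network rather than the token.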

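( 09 ) Concurrency and Queue Depth

The `concurrent` setting in config.toml is global: it limits how many jobs run simultaneously across all runners defined in that file (the per-runner cap is `limit`, set inside each `[[runners]]` section). The default is 1, meaning only one job runs at a time. If multiple pipelines are triggered, the extra jobs queue up and appear as pending. This is often mistaken for a stuck pipeline.

To diagnose, check how many jobs the runner is executing right now: the runner's page in the GitLab UI lists its jobs, and the API returns them with `curl --header "PRIVATE-TOKEN: $TOKEN" "https://gitlab.example.com/api/v4/runners/$RUNNER_ID/jobs?status=running"`. Note that `sudo gitlab-runner list` only shows the runners defined in config.toml, not their jobs. Increase concurrency by editing config.toml and setting `concurrent = 4` (or higher, based on available resources). Then restart the runner.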
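
To see which value is actually in effect, the global setting can be read straight from the file. The sketch below is illustrative: `get_concurrent` is a hypothetical helper, and the sample config it writes to a temp file uses made-up names and values.

```shell
#!/bin/sh
# get_concurrent FILE: print the global `concurrent` value from a config.toml.
# Prints nothing if the key is absent. (Hypothetical helper name.)
get_concurrent() {
  sed -n 's/^[[:space:]]*concurrent[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Illustrative config written to a temp file; names and values are made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
concurrent = 4          # global: max jobs across every runner in this file

[[runners]]
  name = "example-runner"
  executor = "docker"
  limit = 2             # per-runner cap; 0 means no per-runner limit
EOF

echo "concurrent = $(get_concurrent "$sample")"
rm -f "$sample"
```

On a real host the call would be `get_concurrent /etc/gitlab-runner/config.toml`, run as a user that can read the file.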

( 10 ) Autoscaling Runners and Resource Exhaustion

Autoscaled runners (using Docker Machine, Kubernetes, or custom executors) can fail to provision new instances due to cloud provider limits, quota exhaustion, or misconfigured scaling policies. When this happens, the runner manager remains online but no new runners are created, so jobs stay pending indefinitely.

Check the autoscaler logs (e.g., `/var/log/gitlab-runner/gitlab-runner.log`) for errors like 'failed to create machine' or 'insufficient capacity'. Also verify the scaling limits: for Docker Machine, `limit = 10` in the `[[runners]]` section caps how many machines the runner will create, while idle-capacity settings such as `IdleCount` live in `[runners.machine]`. Ensure the cloud provider has available resources (e.g., EC2 instance limit, GPU quota).

( 11 ) Network Connectivity and Firewall Rules

Runners communicate with the GitLab instance over HTTPS. If the runner cannot reach the GitLab server (DNS resolution failure, firewall blocking port 443, proxy misconfiguration), it will fail to fetch jobs. The runner logs will show 'dial tcp: connection refused' or 'no such host'.

Test connectivity from the runner host: `curl -v https://gitlab.example.com/api/v4/runners` (requires a valid token). Also check the runner's DNS resolution: `nslookup gitlab.example.com`. If using a proxy, ensure the `HTTP_PROXY` and `HTTPS_PROXY` environment variables are set correctly in the runner's systemd service file.
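
When the curl test fails, its exit code already distinguishes the failure modes described above. The helper below is a small sketch with a hypothetical name; the codes are curl's documented exit codes, and the messages are this guide's interpretation of them.

```shell
#!/bin/sh
# explain_curl_exit CODE: map common curl exit codes to the connectivity
# failures discussed in this section. (Hypothetical helper name.)
explain_curl_exit() {
  case "$1" in
    0)     echo "reachable" ;;
    5)     echo "could not resolve proxy" ;;
    6)     echo "DNS resolution failed" ;;
    7)     echo "connection failed (refused or blocked)" ;;
    28)    echo "timed out" ;;
    35|60) echo "TLS problem" ;;
    *)     echo "curl exit code $1" ;;
  esac
}

# Usage on the runner host (gitlab.example.com is a placeholder):
#   curl --silent --output /dev/null --max-time 5 https://gitlab.example.com/
#   explain_curl_exit "$?"
```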

( 12 ) CI Job Tags and Runner Selectors

Tags are the most common cause of pending jobs. If a job has no tags, only runners that are allowed to run untagged jobs can pick it up (a runner with no tags always is). If a job specifies tags, only runners that carry every one of those tags can execute it. A common mistake is adding tags to a job but forgetting to tag the runner.

To debug, compare job tags from the pipeline YAML with runner tags from the GitLab UI or the API. Use the API to list runners that carry a given tag: `curl --header "PRIVATE-TOKEN: $TOKEN" "https://gitlab.com/api/v4/runners?tag_list=production"`. If no runners match, either remove tags from the job or add tags to the runner.
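
The matching rule itself is a subset check: every job tag must appear in the runner's tag list. The sketch below models only that rule, with a hypothetical `tags_match` helper taking comma-separated lists without spaces; it does not model the separate "run untagged jobs" setting.

```shell
#!/bin/sh
# tags_match JOB_TAGS RUNNER_TAGS: both arguments are comma-separated lists
# with no spaces. Succeeds only if every job tag appears in the runner's list;
# otherwise prints the first missing tag and returns 1. Comparison is exact.
# An empty JOB_TAGS always succeeds: the "run untagged jobs" setting is not
# modeled here. (Hypothetical helper name.)
tags_match() {
  job_tags=$1
  runner_tags=$2
  old_ifs=$IFS
  IFS=,
  for tag in $job_tags; do
    case ",$runner_tags," in
      *",$tag,"*) ;;
      *) IFS=$old_ifs; echo "missing tag: $tag"; return 1 ;;
    esac
  done
  IFS=$old_ifs
  return 0
}
```

For instance, `tags_match "production" "prod,docker"` reports `missing tag: production`, which is exactly the kind of near-miss that leaves a job pending.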

Frequently asked questions

How do I check if a runner is online from the command line?

Use `sudo gitlab-runner status` to confirm the service is running and `sudo gitlab-runner verify` to confirm each registered runner can still authenticate against GitLab. For more detail, check the runner logs with `journalctl -u gitlab-runner -n 20`. The online/offline state and last contact timestamp are shown in the GitLab UI under Settings > CI/CD > Runners.

What does 'Runner has never contacted' mean?

It means the runner process has never successfully connected to the GitLab server after registration. Common causes include: network issues, invalid token, wrong GitLab URL, or the runner service not running. Verify connectivity with `curl` and check the runner config for the correct URL and token.

Can a pipeline be stuck due to a GitLab server issue?

Yes, but it is less common. If the GitLab server is overloaded or the Sidekiq queue is backed up, job state transitions can be delayed. Check GitLab's logs (e.g., /var/log/gitlab/gitlab-rails/production.log) for errors. Also verify that the GitLab instance itself is healthy by accessing its status page.

How do I increase the number of concurrent jobs on a runner?

Edit the runner's config.toml and set the global `concurrent` value at the top of the file, above any `[[runners]]` section (the per-runner cap is `limit`, set inside `[[runners]]`). For example: `concurrent = 4`. Then restart the runner: `sudo gitlab-runner restart`. Note that this increases resource usage; ensure the host has enough CPU/RAM.