LEARN · DEBUGGING GUIDE

Debugging Kubernetes Resource Quota Exceeded

When Kubernetes rejects your pod with a quota exceeded error, it's not always about actual usage. Here's how to find and fix the real cause quickly.

Intermediate · Kubernetes · 7 min read

What this usually means

A ResourceQuota object in the namespace has hard limits that the new resource would exceed, so the API server rejects it at admission. This is not the same as node-level resource pressure: the quota can be exceeded even when the cluster has plenty of CPU and memory, because it is enforced per namespace against requested resources, not actual usage. The error usually comes from a mismatch between the quota limits and the total the namespace requests. That total includes sidecars and, whenever an init container requests more than the app containers combined, the init container's request. Common non-obvious causes include quota planning that ignored init container requests, a scoped quota (e.g., BestEffort or NotBestEffort) applying alongside an unscoped one, and a quota on persistent volume claims (PVCs) that is exceeded by storage requests.

( 01 ) Fast diagnosis

The first ten minutes — establish facts before touching code.

  1. Run `kubectl get quota -n <namespace>` to list all ResourceQuota objects in the namespace
  2. Run `kubectl describe quota <quota-name> -n <namespace>` to see current usage vs hard limits
  3. Check the rejected pod's spec: `kubectl get pod <pod-name> -n <namespace> -o yaml` (if the pod was never created, read the pod template of the owning Deployment, StatefulSet, or Job instead) and total the resource requests, taking init containers into account
  4. Look at namespace events: `kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -10` for quota-related messages
  5. If the quota is on PVCs, run `kubectl get pvc -n <namespace>` and check total requested storage against quota limits
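
Steps 1 and 2 can be scripted when a namespace has several quotas. The following is a minimal sketch, assuming you have saved the output of `kubectl get quota -n <namespace> -o json` and that quantities use only the suffixes listed in the code. The function names and the sample data are hypothetical. It ranks every tracked resource by how full it is, so the one closest to its hard limit comes first.

```python
import json
from fractions import Fraction

# Subset of Kubernetes quantity suffixes; binary suffixes are listed first
# so that "Mi" is tried before "M".
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

def parse_quantity(text):
    """Convert a quantity string such as '9800m', '10' or '20Gi' to an exact number."""
    if text.endswith("m"):  # milli-units, typically CPU
        return Fraction(int(text[:-1]), 1000)
    for suffix, factor in SUFFIXES.items():
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * factor
    return Fraction(text)

def quota_headroom(quota_list):
    """Return (quota, resource, used, hard, ratio) rows, fullest first."""
    rows = []
    for quota in quota_list["items"]:
        status = quota.get("status", {})
        used = status.get("used", {})
        for resource, hard_text in status.get("hard", {}).items():
            used_text = used.get(resource, "0")
            hard_value = parse_quantity(hard_text)
            ratio = float(parse_quantity(used_text) / hard_value) if hard_value else 1.0
            rows.append((quota["metadata"]["name"], resource, used_text, hard_text, ratio))
    return sorted(rows, key=lambda row: row[4], reverse=True)

# Hypothetical, trimmed output of `kubectl get quota -n <namespace> -o json`.
sample = json.loads("""
{"items": [{"metadata": {"name": "compute-resources"},
            "status": {"hard": {"requests.cpu": "10", "requests.memory": "20Gi"},
                       "used": {"requests.cpu": "9800m", "requests.memory": "19Gi"}}}]}
""")

for name, resource, used, hard, ratio in quota_headroom(sample):
    print(f"{name}: {resource} {used}/{hard} ({ratio:.0%})")
```

Exact fractions are used instead of floats so that values such as 9800m compare cleanly against whole-number limits.
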
( 02 ) Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • ResourceQuota objects in the namespace: `kubectl get quota -n <namespace>`
  • Pod events and describe output: `kubectl describe pod <pod-name> -n <namespace>`
  • Namespace events: `kubectl get events -n <namespace> --field-selector reason=FailedCreate`
  • kube-apiserver audit logs if admission control details are needed
  • `kubectl top pods -n <namespace>` (for actual usage, though quota is about requests)
  • Deployment/StatefulSet configuration for resource requests and limits
  • ClusterResourceQuota if using OpenShift or multi-tenant clusters
( 03 ) Common root causes

Practical causes, not theory. These are the things you will actually find.

  • ResourceQuota hard limit is too low for the deployment's total requested resources
  • An init container requests more than the app containers combined, which raises each pod's effective request against the quota, but it was overlooked
  • Overlapping quotas (e.g., a BestEffort-scoped quota plus an unscoped one), so a pod must fit within every quota that matches it
  • Quota on persistent volume claims (storage requests) exceeded by new or existing PVCs
  • ResourceQuota object has a scope that doesn't match the pod's QoS class (e.g., BestEffort quota but pod is Burstable), so the rejection comes from a different quota than the one you are inspecting
  • Multiple ResourceQuota objects in the same namespace, each enforced independently, so the tightest one rejects the pod (on OpenShift a ClusterResourceQuota may also apply)
( 04 ) Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Increase the appropriate hard limit in the ResourceQuota: `kubectl edit quota <quota-name> -n <namespace>`
  • Reduce the resource requests of the pod (and init containers) to fit within the quota
  • If quota scopes mismatch, adjust the scope selector (e.g., change from BestEffort to NotBestEffort) or remove the quota
  • For storage quota, either increase the limit or reduce PVC size requests
  • Delete unnecessary PVCs or pods that are consuming quota but not needed
  • Use LimitRange to set default resource requests if the quota requires all pods to have requests set
( 05 ) How to verify

A fix you cannot prove is a guess. Close the loop.

  • After the fix, attempt to create a new pod: `kubectl run test-pod --image=nginx -n <namespace>` and check it runs (if the quota tracks requests or limits, the test pod needs them too, either set explicitly or defaulted by a LimitRange)
  • Run `kubectl describe quota -n <namespace>` and confirm usage is below hard limits
  • Check events: `kubectl get events -n <namespace>` should show no new quota-related failures
  • Verify the deployment scales: `kubectl scale deployment <name> --replicas=<desired> -n <namespace>`
  • For storage, create a test PVC and verify it binds successfully
( 06 ) Mistakes to avoid

Things that make this bug worse or harder to find.

  • Increasing quota without verifying actual usage, which can mask resource waste
  • Forgetting that 'cpu' and 'memory' quota entries apply to resource requests; limits are only capped when the quota sets 'limits.cpu' or 'limits.memory'
  • Ignoring init containers: a pod is charged the larger of its summed app container requests and its largest init container request
  • Assuming quota is the same across namespaces: each namespace has its own quota
  • Applying a quota change to production without testing in a staging environment
  • Deleting a quota without checking what depends on its guardrails (or whether the limit that matters is a cluster-scoped quota)
( 07 ) War story

Midnight Quota Fail: When Init Containers Eat Your Limits

Platform Engineer · Kubernetes 1.24 on AWS EKS

Timeline

  1. 23:15 PagerDuty alert: Pods in namespace 'checkout' are stuck Pending
  2. 23:18 kubectl get pods -n checkout shows 12 pods in Pending status
  3. 23:20 kubectl describe pod checkout-7d4f8b-5x2k9 -n checkout shows event: 'FailedCreate: quota exceeded'
  4. 23:22 kubectl get quota -n checkout shows 'compute-resources' quota with hard limits: cpu=10, memory=20Gi
  5. 23:25 kubectl describe quota compute-resources -n checkout shows usage: cpu=9.8, memory=19.8Gi
  6. 23:30 Check deployment spec: each pod requests cpu=0.5, memory=1Gi, but init container requests cpu=1, memory=2Gi
  7. 23:32 Effective request per pod is max(0.5, 1) = 1 cpu, so 12 pods need 12 cpu, but quota is 10. The quota was calculated without init containers
  8. 23:35 Temporarily increase quota cpu to 20, memory to 40Gi
  9. 23:37 Pods start running. Root cause: init container resource requests not accounted for in quota planning.

At 23:15, I got paged about pods stuck pending in the checkout namespace. We had just rolled a new deployment with an init container that downloads a large model. I checked the pods and saw FailedCreate events mentioning quota exceeded. My first instinct was to check the node resources, but nodes had plenty of capacity. Then I remembered we had a ResourceQuota.

I ran `kubectl describe quota` and saw we were at 9.8 CPU out of 10. That seemed close. But our pods only requested 0.5 CPU each. How could 12 pods be over quota? I calculated: 12 * 0.5 = 6, well under 10. Then I looked at the pod spec and noticed the init container requested 1 CPU. A pod is charged the larger of its app containers' total and its biggest init container, so each pod counted as 1 CPU, not 0.5. That's 12 * 1 = 12 CPU. Bingo. The quota was set before the init container was added.

I increased the quota to 20 CPU and 40Gi memory temporarily, and the pods started. Then I filed a ticket to reduce the init container's request to 0.5 CPU (it didn't need that much) and to update our quota review process to include init containers. The lesson: quota charges each pod its effective request, which is the larger of the summed app container requests and the biggest init container request. Always check init containers when sizing a quota.

Root cause

Init container resource requests were not included when calculating the required quota, causing the sum of pod resource requests to exceed the hard limit.

The fix

Temporarily increased the compute-resources quota hard limits from cpu=10 to 20 and memory=20Gi to 40Gi. Subsequently reduced the init container's CPU request from 1 to 0.5 and updated quota planning to include init containers.

The lesson

Always account for init containers when setting ResourceQuota limits. The quota applies to each pod's effective request: the higher of the sum of the main containers' requests and the largest single init container request.
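
The arithmetic behind this incident can be checked in a few lines. This is a sketch using the figures from the timeline, assuming one main container and one init container per pod. It applies the rule Kubernetes documents for effective pod requests: the larger of the app containers' total and the largest init container request. Variable names are illustrative only.

```python
PODS = 12

# Per-container requests from the timeline, in millicores and MiB.
app_cpu, app_mem = 500, 1024      # main container: cpu=0.5, memory=1Gi
init_cpu, init_mem = 1000, 2048   # init container: cpu=1, memory=2Gi

# Hard limits of the compute-resources quota before the temporary increase.
quota_cpu, quota_mem = 10_000, 20 * 1024

def effective(app_total, init_peak):
    """Effective pod request: app containers' total or the largest init container, whichever is higher."""
    return max(app_total, init_peak)

needed_cpu = PODS * effective(app_cpu, init_cpu)
needed_mem = PODS * effective(app_mem, init_mem)
print(needed_cpu, needed_cpu > quota_cpu)   # millicores needed, and whether that exceeds the quota
print(needed_mem, needed_mem > quota_mem)   # MiB needed, and whether that exceeds the quota

# After the follow-up fix lowers the init container's CPU request to 0.5.
fixed_cpu = PODS * effective(app_cpu, 500)
print(fixed_cpu, fixed_cpu <= quota_cpu)
```

With the init container at 1 CPU the twelve pods need 12 CPU and 24Gi, both above the original limits. With the init container reduced to 0.5 CPU, the CPU requirement drops back to 6.
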

( 08 ) Understanding ResourceQuota Scopes and QoS Classes

ResourceQuota can be restricted to a subset of pods using the 'scopes' field. The scopes include 'BestEffort', 'NotBestEffort', 'Terminating', 'NotTerminating', etc. If you create a quota with scope 'BestEffort', it only applies to pods with QoS class BestEffort (no requests/limits set), and it can only limit the number of such pods. Similarly, 'NotBestEffort' applies to all other pods.

A common pitfall is having overlapping scopes. For example, if you have two quotas, one with scope 'BestEffort' and another without a scope (which applies to all pods), a BestEffort pod is subject to both and must fit within each of them. Whichever quota is tighter rejects the pod, potentially causing unexpected denials. Always check for overlapping quotas using `kubectl get quota -n <namespace> -o yaml` to see the scopes.
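
To see which quotas a given pod is subject to, the scope rules can be sketched as follows. This is a simplified illustration, not the API server's implementation. It models only the four scopes named above (not PriorityClass or CrossNamespacePodAffinity), treats a pod as BestEffort when no container sets a CPU or memory request or limit, and treats a pod as Terminating when `spec.activeDeadlineSeconds` is set. The quota and pod names are hypothetical.

```python
def is_best_effort(pod):
    """True when no container (app or init) sets a CPU or memory request or limit."""
    spec = pod["spec"]
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for container in containers:
        resources = container.get("resources", {})
        for kind in ("requests", "limits"):
            values = resources.get(kind) or {}
            if "cpu" in values or "memory" in values:
                return False
    return True

def scope_matches(scope, pod):
    """Evaluate one quota scope against a pod; raises KeyError for scopes not modeled here."""
    best_effort = is_best_effort(pod)
    terminating = pod["spec"].get("activeDeadlineSeconds") is not None
    return {
        "BestEffort": best_effort,
        "NotBestEffort": not best_effort,
        "Terminating": terminating,
        "NotTerminating": not terminating,
    }[scope]

def matching_quotas(quotas, pod):
    """Names of quotas whose scopes all match the pod; a quota without scopes matches every pod."""
    return [
        quota["metadata"]["name"]
        for quota in quotas
        if all(scope_matches(scope, pod) for scope in quota.get("spec", {}).get("scopes", []))
    ]

quotas = [
    {"metadata": {"name": "best-effort-pods"}, "spec": {"scopes": ["BestEffort"]}},
    {"metadata": {"name": "namespace-total"}, "spec": {}},
]
bare_pod = {"spec": {"containers": [{"name": "app"}]}}
burstable_pod = {"spec": {"containers": [
    {"name": "app", "resources": {"requests": {"cpu": "500m"}}},
]}}

print(matching_quotas(quotas, bare_pod))       # subject to both quotas
print(matching_quotas(quotas, burstable_pod))  # only the unscoped quota applies
```

A pod that appears in more than one list entry has to fit within every one of those quotas, which is why the quota you happen to be inspecting may not be the one that rejected it.
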

( 09 ) Init Containers and Resource Quota: The Hidden Consumer

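Init containers run before the main containers, and their resource requests feed into the pod's effective request. Because init containers run one at a time and finish before the app containers start, Kubernetes takes the largest init container request and compares it with the sum of the app container requests; the pod is charged the higher of the two. This is a frequent source of quota exceeded errors because teams often forget to include init containers when sizing quotas.

When debugging, use `kubectl get pod <pod-name> -n <namespace> -o json | jq '.spec.initContainers[].resources.requests'` to see init container requests. Then compare the largest of them with the sum of the main container requests. You can also use `kubectl describe quota` to see current usage, which reflects the effective requests of all pods the quota tracks.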
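
The comparison can be automated against a pod's JSON. The sketch below covers CPU only and assumes regular init containers; it ignores pod overhead and sidecar-style init containers (those with `restartPolicy: Always` on newer clusters), which are accounted for differently. The helper names and the sample spec are hypothetical.

```python
from fractions import Fraction

def cpu_millicores(text):
    """Parse a CPU quantity such as '500m', '1' or '1.5' into millicores."""
    if text.endswith("m"):
        return int(text[:-1])
    return int(Fraction(text) * 1000)

def container_cpu(container):
    """CPU request of one container in millicores, 0 when unset."""
    requests = container.get("resources", {}).get("requests") or {}
    return cpu_millicores(requests.get("cpu", "0"))

def effective_cpu_request(pod_spec):
    """Larger of the app containers' summed requests and the biggest init container request."""
    app_total = sum(container_cpu(c) for c in pod_spec.get("containers", []))
    init_peak = max((container_cpu(c) for c in pod_spec.get("initContainers", [])), default=0)
    return max(app_total, init_peak)

# Hypothetical pod spec: two app containers and one heavy init container.
pod_spec = {
    "initContainers": [{"name": "fetch-model", "resources": {"requests": {"cpu": "1"}}}],
    "containers": [
        {"name": "app", "resources": {"requests": {"cpu": "500m"}}},
        {"name": "proxy", "resources": {"requests": {"cpu": "250m"}}},
    ],
}

print(effective_cpu_request(pod_spec))  # millicores charged against the quota for this pod
```

Here the app containers total 750m, but the init container asks for 1 CPU, so the pod is charged 1000m.
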

( 10 ) Storage Quotas: More Than Just PVC Count

ResourceQuota can also limit persistent volume claims (PVCs). The quota can specify 'requests.storage', 'persistentvolumeclaims', and storage class-specific limits. Exceeding these quotas can block PVC creation and cause pods to stay pending.

To debug storage quota issues, run `kubectl get quota -n <namespace> -o yaml` and look for 'persistentvolumeclaims' or 'requests.storage'. Then run `kubectl get pvc -n <namespace>` to see current claims and their sizes. The quota applies to the sum of requested storage, not the actual used capacity. Also note that if you delete a PVC, the quota may take some time to reflect the freed capacity.
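
The same check can be done offline. This is a minimal sketch, assuming the output of `kubectl get pvc -n <namespace> -o json` and a quota that sets 'requests.storage' and 'persistentvolumeclaims'. It handles only whole-number sizes with binary suffixes, and the function names and sample data are hypothetical.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def storage_bytes(text):
    """Parse a whole-number storage quantity such as '100Gi' into bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)

def storage_usage(pvc_list):
    """Return (number of claims, total requested bytes) for a PVC list."""
    claims = pvc_list["items"]
    total = sum(storage_bytes(c["spec"]["resources"]["requests"]["storage"]) for c in claims)
    return len(claims), total

def blocked_by(pvc_list, hard, new_claim_size):
    """List the quota entries a new claim of the given size would exceed."""
    count, total = storage_usage(pvc_list)
    problems = []
    if "persistentvolumeclaims" in hard and count + 1 > int(hard["persistentvolumeclaims"]):
        problems.append("persistentvolumeclaims")
    if "requests.storage" in hard:
        if total + storage_bytes(new_claim_size) > storage_bytes(hard["requests.storage"]):
            problems.append("requests.storage")
    return problems

# Hypothetical namespace: three existing claims totalling 170Gi.
pvcs = {"items": [
    {"spec": {"resources": {"requests": {"storage": size}}}}
    for size in ("100Gi", "50Gi", "20Gi")
]}
hard = {"requests.storage": "200Gi", "persistentvolumeclaims": "5"}

print(blocked_by(pvcs, hard, "50Gi"))  # entries that a new 50Gi claim would exceed
print(blocked_by(pvcs, hard, "20Gi"))  # an empty list means the claim fits
```

Because the quota counts requested sizes, a namespace can hit 'requests.storage' while the volumes themselves are nearly empty.
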

( 11 ) Quota Enforcement via Admission Control: The ApiServer Perspective

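ResourceQuota is enforced by an admission controller in the kube-apiserver. When a pod creation request arrives, the controller adds the new pod's effective requests (which take init containers into account) to the usage already recorded for each matching quota. If the result exceeds any hard limit, the request is rejected with a 403 Forbidden 'exceeded quota' error that names the quota and lists the requested, used, and limited amounts.

The admission controller runs synchronously, so the rejection is returned immediately to whoever sent the request. When a controller created the pod, the error surfaces as a FailedCreate event on the owning ReplicaSet, StatefulSet, or Job. To see the exact admission decision, you can enable audit logging on the apiserver with `--audit-log-path` (together with an audit policy via `--audit-policy-file`) and look for pod create requests whose response status code is 403; the accompanying message, when recorded, names the quota. This helps when the error message is vague.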
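
The decision itself is simple arithmetic. The sketch below is an illustration of the check for a single quota, not the admission plugin's code. Amounts are plain integers (millicores here), whereas the real message prints Kubernetes quantities such as '9800m'. The function name is hypothetical.

```python
def check_quota(quota_name, hard, used, requested):
    """Return None when the request fits, else a message shaped like the API server's error."""
    exceeded = [
        resource
        for resource, amount in requested.items()
        if resource in hard and used.get(resource, 0) + amount > hard[resource]
    ]
    if not exceeded:
        return None

    def fmt(values):
        return ",".join(f"{resource}={values.get(resource, 0)}" for resource in exceeded)

    return (
        f"exceeded quota: {quota_name}, requested: {fmt(requested)}, "
        f"used: {fmt(used)}, limited: {fmt(hard)}"
    )

hard = {"requests.cpu": 10_000}
used = {"requests.cpu": 9_800}

print(check_quota("compute-resources", hard, used, {"requests.cpu": 1_000}))  # rejected
print(check_quota("compute-resources", hard, used, {"requests.cpu": 200}))    # fits exactly, so None
```

Note that only resources the quota actually tracks are checked, and that a request landing exactly on the hard limit is still admitted.
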

( 12 ) ClusterResourceQuota: Multi-Tenant Quota Gotchas

In OpenShift, or in clusters that use custom operators, you might have a ClusterResourceQuota that applies across multiple namespaces. These quotas aggregate resource usage from the selected namespaces. If a pod fails with quota exceeded but the namespace-level quota seems fine, check whether a ClusterResourceQuota is in effect.

Use `oc get clusterresourcequota` (OpenShift) or check for custom resources. The key is to identify which quota object is rejecting the request. The event message usually includes the quota name. If it references a cluster-scoped quota, you need to edit that instead of namespace-level quota.

Frequently asked questions

Why does my pod fail with quota exceeded even though my namespace has free quota?

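Check whether the quota you are reading has scopes that match the pod's QoS class. For example, a quota scoped to 'BestEffort' ignores a pod that sets requests or limits (making it Burstable or Guaranteed), so its free capacity says nothing about that pod. Also make sure you're looking at the correct quota object: with several quotas in the namespace, the pod must fit within every one that matches it, and the error message names the one that rejected it. Finally, remember that a pod is charged its effective request, the larger of the summed app container requests and the largest init container request, so a single pod with a heavy init container might exceed the limit by itself.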

How do I see the current usage of a ResourceQuota?

Use `kubectl describe quota <quota-name> -n <namespace>`. The output is a table with 'Resource', 'Used', and 'Hard' columns. `kubectl get quota -n <namespace>` shows the same data in compact form, where for example 'requests.cpu: 5/10' means 5 CPUs requested out of a limit of 10. The usage is the sum of requests from all pods (and PVCs) in the namespace that match the quota's scope.

Can I set a ResourceQuota on GPU resources?

Yes, for extended resources such as 'nvidia.com/gpu'. Quota for extended resources supports only the 'requests.' prefix, so the quota key is 'requests.nvidia.com/gpu' rather than the bare name used in the pod spec. Otherwise the quota works the same way: it limits the sum of requests for that extended resource across all pods in the namespace. Make sure the resource is registered in the cluster (e.g., via device plugin).

What happens if I delete a ResourceQuota while pods are running?

Deleting a ResourceQuota removes the admission control, so new pods can be created without limit enforcement. Existing pods continue running. However, if there are multiple quotas, only the deleted quota's limits are removed. Be careful: deleting a quota could allow resource overcommitment. It's better to edit the quota to increase limits temporarily.

How do I debug quota exceeded for PVCs?

First, check the storage-related hard limits in the quota: 'requests.storage', 'persistentvolumeclaims', and storage class-specific limits. Then run `kubectl get pvc -n <namespace>` to see all PVCs and their requested storage sizes. The quota sums the requested sizes, not the actual used capacity. If you delete a PVC, the quota usage may take a few seconds to update.