Engineering process · 8 min read

REST API Design Mistakes That Inject Bugs Into Your System

Most API bugs aren't in the implementation—they're baked into the contract. Here are the design errors I've seen cause the most production incidents.

API design, REST, debugging, error handling, idempotency

I've spent the last six years building and debugging REST APIs for a SaaS platform that processes millions of requests a day. Over time, I've noticed a pattern: the hardest bugs to find aren't in the code—they're in the API contract itself. A poorly designed endpoint can inject subtle, non-deterministic bugs that only show up under load or during retries.

This article covers the specific design mistakes I've seen cause real production incidents, with concrete examples and fixes. If you're designing or maintaining a REST API, these are the traps to watch for.

1. Vague Error Responses That Hide the Real Problem

The single most frustrating API pattern is returning a 400 Bad Request with no details. I once debugged an integration where the API returned { "error": "Validation failed" } for every bad input. It took three days to realize the client was sending a string where an integer was expected—the error message told us nothing.

A good error response includes three things: a stable machine-readable error code, a human-readable message, and a list of field-level issues. Here's what I use now:

A structured error response that clients can parse and display directly.
HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "error": {
    "code": "INVALID_FIELD_TYPE",
    "message": "The 'quantity' field must be an integer.",
    "details": [
      {
        "field": "quantity",
        "value": "abc",
        "issue": "Expected integer, got string"
      }
    ]
  }
}

Define a standard error schema in your API spec (OpenAPI) and enforce it with tests. Your clients will thank you—and you'll spend less time in Slack debugging.
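
Here's a minimal Python sketch of how a server might build that error body from a single helper, so every endpoint emits the same shape. The names `FieldIssue`, `build_error`, and `validate_quantity` are hypothetical and only illustrate the idea; they are not from any particular framework.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class FieldIssue:
    """One field-level problem, matching an entry in the 'details' list."""
    field: str
    value: Any
    issue: str


def build_error(code: str, message: str, details: list[FieldIssue]) -> dict:
    """Return an error body with a stable code, a message, and field details."""
    return {
        "error": {
            "code": code,
            "message": message,
            "details": [asdict(d) for d in details],
        }
    }


def validate_quantity(payload: dict) -> dict | None:
    """Return an error body if 'quantity' is not an integer, else None."""
    value = payload.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return None
    issue = FieldIssue(
        field="quantity",
        value=value,
        issue=f"Expected integer, got {type(value).__name__}",
    )
    return build_error(
        "INVALID_FIELD_TYPE",
        "The 'quantity' field must be an integer.",
        [issue],
    )
```

Routing every validation failure through one builder like this is also what makes the schema easy to enforce in tests.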

2. Using 200 for Everything (Yes, Even Errors)

Some APIs return 200 OK for every request, including errors, and put the status inside the response body. This forces clients to inspect the body to determine success. The result: clients miss error conditions, retry logic breaks, and monitoring dashboards show green while users see failures.

I worked with a payment gateway that returned 200 with { "success": false, "error": "insufficient funds" }. Our monitoring only tracked HTTP 5xx, so we didn't detect a 30% failure rate for three hours. The fix was simple: use proper HTTP status codes. 402 for payment failures, 422 for validation, 409 for conflicts.

HTTP status codes exist for a reason. Use them. If you embed status in the body, you lose the ability to monitor at the infrastructure level.

The Right Way: Status Code + Structured Body

A proper 402 response that shows up in status-code monitoring and gives the client actionable info.
HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "error": {
    "code": "INSUFFICIENT_FUNDS",
    "message": "Account balance $5.00 is less than required $10.00."
  }
}
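
One way to keep handlers from drifting back to 200-for-everything is to translate results into status codes in a single place. The sketch below assumes a legacy result dict with `success`, `code`, and `error` keys; the error codes other than `INSUFFICIENT_FUNDS` and `INVALID_FIELD_TYPE`, and the function name, are hypothetical.

```python
from __future__ import annotations

# Hypothetical mapping from domain error codes to HTTP status codes,
# following the 402 / 422 / 409 split described above.
STATUS_BY_ERROR_CODE = {
    "INSUFFICIENT_FUNDS": 402,  # payment failure
    "INVALID_FIELD_TYPE": 422,  # validation failure
    "DUPLICATE_ORDER": 409,     # conflict
}


def to_http_response(result: dict) -> tuple[int, dict]:
    """Translate a {"success": ..., "error": ...} result into (status, body)."""
    if result.get("success"):
        return 200, result
    code = result.get("code", "UNKNOWN")
    # Unmapped failures become 500 so that 5xx-based monitoring still fires.
    status = STATUS_BY_ERROR_CODE.get(code, 500)
    body = {"error": {"code": code, "message": result.get("error", "")}}
    return status, body
```

With this in place, a failure can no longer leave the service as a 200 unless the result explicitly reports success.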

3. Non-Idempotent POST Endpoints That Double-Charge

The Double-Charge That Cost Us $50k

  1. 14:32: A network blip causes the client's POST /orders request to time out after 5 seconds.
  2. 14:33: Client retries the same request. Server creates a second order with a different ID.
  3. 14:34: Payment processor charges the customer twice. Customer notices immediately and escalates.
  4. 16:00: Engineering discovers the issue: no idempotency key on the orders endpoint.
  5. 22:00: Fix deployed: POST /orders now requires an Idempotency-Key header. Duplicate requests return the original order.

Lesson

Always make mutating POST endpoints idempotent using an idempotency key. This simple header prevents duplicates from network retries.

Idempotency is not optional for endpoints that create resources or trigger side effects. The client sends a unique key (like a UUID) in the Idempotency-Key header. The server stores the key and the response for the first request. On subsequent requests with the same key, the server returns the stored response without executing the operation again.

This is standard in payment APIs (Stripe, Square) but rare in internal services. Implement it, and you'll eliminate an entire class of bugs.
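
The store-and-replay flow described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a production design: the store is an in-memory dict (a real service would likely need a shared store with expiry), one lock serializes all requests, and `IdempotentHandler` and `create_order` are hypothetical names.

```python
from __future__ import annotations

import threading
import uuid
from typing import Callable


class IdempotentHandler:
    """Wraps a create operation so a repeated key replays the stored response."""

    def __init__(self, create: Callable[[dict], dict]) -> None:
        self._create = create
        self._responses: dict[str, dict] = {}  # idempotency key -> first response
        self._lock = threading.Lock()

    def handle(self, idempotency_key: str, payload: dict) -> dict:
        # The lock keeps two concurrent retries from both running the operation.
        with self._lock:
            stored = self._responses.get(idempotency_key)
            if stored is not None:
                return stored  # replay without executing the operation again
            response = self._create(payload)
            self._responses[idempotency_key] = response
            return response


def create_order(payload: dict) -> dict:
    """Stand-in for the real side effect (creating and charging an order)."""
    return {"id": f"order-{uuid.uuid4()}", "total": payload["total"]}


handler = IdempotentHandler(create_order)
key = str(uuid.uuid4())  # generated once by the client, reused on every retry
first = handler.handle(key, {"total": 29.99})
retry = handler.handle(key, {"total": 29.99})
```

Here `retry` is the same order as `first`, so a timed-out request can be resent without creating a second order. A fuller version would probably also compare the retried payload with the original and reject mismatches.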

4. Inconsistent Resource Structure Across Endpoints

If GET /users returns { id, name, email } but GET /users/{id}/orders returns { orderId, userId, total }, clients need to handle two different representations. This seems minor, but it causes bugs when developers assume field names are consistent.

I once saw a bug where the frontend used `user.id` from one endpoint and `user.userId` from another—when they merged the data, half the users had no ID. The fix: always return the same structure for the same entity. Use a consistent representation object (like `UserResponse` and `OrderResponse`) everywhere.

Consistent user representation across endpoints avoids client-side mapping bugs.
// GET /users/123
{
  "id": "123",
  "name": "Alice",
  "email": "alice@example.com"
}

// GET /users/123/orders
{
  "orders": [
    {
      "id": "order-456",
      "user": {
        "id": "123",
        "name": "Alice",
        "email": "alice@example.com"
      },
      "total": 29.99
    }
  ]
}
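
On the server side, the simplest way I know to keep representations consistent is to give each entity exactly one serializer and have every endpoint call it. A minimal Python sketch, with hypothetical `User`, `Order`, and handler names:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: str
    name: str
    email: str


@dataclass(frozen=True)
class Order:
    id: str
    user: User
    total: float


def user_response(user: User) -> dict:
    """The only place that decides how a user looks on the wire."""
    return {"id": user.id, "name": user.name, "email": user.email}


def order_response(order: Order) -> dict:
    """Orders embed the user through user_response, never a hand-rolled dict."""
    return {"id": order.id, "user": user_response(order.user), "total": order.total}


def get_user(user: User) -> dict:
    """Body for GET /users/{id}."""
    return user_response(user)


def get_user_orders(orders: list[Order]) -> dict:
    """Body for GET /users/{id}/orders."""
    return {"orders": [order_response(o) for o in orders]}
```

Because both handlers go through `user_response`, renaming a field changes it everywhere at once instead of in one endpoint only.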

5. Missing Rate Limit Headers That Crash the Service

If you don't tell clients their rate limit status, they will retry blindly—and when the limit is reached, they get a 429 with no Retry-After header. They guess a retry interval, which is usually too short, causing a retry storm that takes down the service.

I've seen this happen with a reporting API that had a 100 req/min limit. A client hit the limit, got 429, retried after 1 second (too soon), got another 429, retried again... The server spent all its resources sending 429s and collapsed. The fix: always include RateLimit-Remaining, RateLimit-Reset, and Retry-After headers.

3x: the increase in server load when clients retry without exponential backoff because rate limit headers are missing.

A rate limit response that includes Retry-After and the rate limit headers so clients can back off correctly.
HTTP/1.1 429 Too Many Requests
Retry-After: 60
RateLimit-Remaining: 0
RateLimit-Reset: 60
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded 100 requests per minute. Retry after 60 seconds."
  }
}
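
On the client side, honoring that header might look like the Python sketch below. It assumes the headers arrive as a plain dict keyed by `Retry-After` exactly (real HTTP libraries usually offer case-insensitive lookup), handles only the delay-in-seconds form of the header, and falls back to exponential backoff with jitter when the header is missing. The function name and defaults are hypothetical.

```python
from __future__ import annotations

import random


def retry_delay(
    headers: dict[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return how many seconds to wait before the next retry after a 429."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # The server said exactly how long to wait, so trust it.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; not handled in this sketch.
    # No usable header: exponential backoff with full jitter, capped at `cap`.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

The jitter matters as much as the exponent: it spreads retries out so that many clients hitting the limit together don't all come back at the same instant.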

6. Overloading Query Parameters With Undocumented Formats

Filtering by nested fields is common, but if you don't document the exact syntax, clients will guess. I've seen APIs that expect `filter=status:active` for simple filters and `filter=created_at gt 2023-01-01` for ranges—but the docs say nothing about the `gt` operator. Clients use `>` instead, which the server silently ignores, returning all records.

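The result: users get unfiltered results that include records they meant to exclude, and no one knows why. The fix: standardize on a documented filter syntax (OData's $filter is one example; JSON:API reserves the filter parameter but leaves the syntax up to you) and document every operator. Or, better, expose explicit query parameters like `status=active&created_at_gt=2023-01-01`.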

Silent fallback to default filters is a bug. If a client sends an invalid filter, return a 400 with details—don't ignore it.
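
A strict parser for the explicit-parameter style could look like this Python sketch. The parameter names `status` and `created_at_gt` come from the example above; the set of allowed status values and the function name are assumptions made for illustration.

```python
from __future__ import annotations

from datetime import date

ALLOWED_STATUSES = {"active", "inactive"}  # assumed values, for illustration only


def parse_filters(query: dict[str, str]) -> tuple[dict, list[dict]]:
    """Parse query parameters into filters, collecting an issue for anything invalid."""
    filters: dict = {}
    issues: list[dict] = []
    for name, raw in query.items():
        if name == "status":
            if raw in ALLOWED_STATUSES:
                filters[name] = raw
            else:
                issues.append({"field": name, "value": raw, "issue": "Unknown status"})
        elif name == "created_at_gt":
            try:
                filters[name] = date.fromisoformat(raw)
            except ValueError:
                issues.append({"field": name, "value": raw, "issue": "Expected YYYY-MM-DD"})
        else:
            # Unknown parameters are reported, never silently ignored.
            issues.append({"field": name, "value": raw, "issue": "Unknown filter parameter"})
    return filters, issues
```

If `issues` comes back non-empty, the handler returns a 400 with those entries as the field-level details instead of running the query with whatever filters happened to parse.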

Wrapping Up

These six mistakes are responsible for the majority of API-related bugs I've encountered. They don't require a massive refactor to fix—most are changes to the contract and a few lines of code. But the payoff is huge: fewer bugs, faster debugging, and happier clients.

Next time you design an endpoint, think about what happens when it's called under failure conditions. Because the bugs that hurt most are the ones that only happen when something already went wrong.

Frequently asked questions

What is the most common REST API design mistake that causes bugs?

Vague error responses—returning a 400 with { "error": "bad request" } tells the client nothing. The client can't fix the request, and developers waste hours guessing. Always include a machine-readable error code, a human-readable message, and a field-level detail list.

Should I use 200 or 201 for POST responses?

Use 201 Created when a resource is created. Using 200 for everything forces clients to parse the body to know what happened, which leads to missed creation events and duplicate resources on retry.

Why is idempotency important for REST APIs?

Without idempotency, network retries (which are inevitable) cause duplicate side effects—like charging a credit card twice. POST should accept an Idempotency-Key header so that the same request can be safely retried.

How do inconsistent resource structures cause bugs?

If GET /users returns { id, name } but GET /users/{id}/orders returns { userId, total }, clients need different parsing logic per endpoint. A single misspelled field or missing property leads to runtime errors that are hard to trace.