LEARN · DEBUGGING GUIDE

Python Generator Already Exhausted: What You're Actually Seeing

You iterate a generator once, then try again and get nothing. No exception, just silence — or a cryptic StopIteration. Here's exactly what happened and how to fix it.

Intermediate · Python · 7 min read

What this usually means

Generators are single-use iterators. Once you've iterated through them — by for-loop, conversion to list, or explicit next() calls — they are exhausted. They have no built-in reset mechanism. If you need to iterate the same data multiple times, you must either store the generated values (e.g., list) or recreate the generator. The tricky part is when a generator is passed around as an argument: one function consumes it, leaving the caller's reference empty. This is not a bug in the generator logic but in the code's assumption that generators behave like reusable containers.
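
A minimal sketch of the argument-passing case described above. The names `readings` and `total` are hypothetical and exist only for this illustration.

```python
def readings():
    """Hypothetical generator function standing in for any lazy data source."""
    yield from (3, 5, 8)


def total(values):
    # sum() iterates its argument to the end, so a generator passed in is drained here.
    return sum(values)


gen = readings()
first_total = total(gen)   # consumes every value
leftover = list(gen)       # same object, nothing left to yield

print(first_total)  # 16
print(leftover)     # []
```

The caller still holds a valid generator object after `total(gen)` returns; it simply has no values left, which is why nothing fails loudly.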

( 01 )Fast diagnosis

The first ten minutes — establish facts before touching code.

  1. Add a debug print before and after each iteration: print(f'Generator object: {gen!r}, __next__ exists: {hasattr(gen, "__next__")}'). If __next__ exists, you are holding a single-use iterator rather than a reusable container
  2. Wrap the generator in a small class that counts __next__ calls, for example a CountedGen whose __init__ stores the generator and a counter, whose __iter__ returns self, and whose __next__ increments the counter before returning next(self.gen)
  3. Check if the generator is passed to a function that consumes it (e.g., list(), sum(), any(), all()) before your code uses it
  4. Insert a breakpoint at the generator definition and trace the generator object id through each consumer: id(gen)
  5. If using pytest, check whether a fixture hands the same generator object to several tests: with scope='module' or scope='session' the object is created once and shared, whereas scope='function' (the default) rebuilds it for every test
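
As a complement to the checks above, the standard library can report a generator's state directly. This is a sketch that assumes the object is a true generator (created by a generator function or generator expression); `inspect.getgeneratorstate` does not work on arbitrary iterators. The function `numbers` is hypothetical.

```python
import inspect


def numbers():
    """Hypothetical generator function used only for this demonstration."""
    yield 1
    yield 2


gen = numbers()
state_new = inspect.getgeneratorstate(gen)      # created, not started yet
next(gen)
state_paused = inspect.getgeneratorstate(gen)   # suspended at a yield
list(gen)                                       # drain the rest
state_done = inspect.getgeneratorstate(gen)     # finished for good

print(state_new, state_paused, state_done)
# GEN_CREATED GEN_SUSPENDED GEN_CLOSED
```

Seeing `GEN_CLOSED` at the point where your code expects data is a strong sign that an earlier consumer already drained the object.
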
( 02 )Where to look

The specific files, logs, configs, and dashboards that usually own this bug.

  • All call sites of the generator function: grep for the function name that returns the generator
  • Any list(), tuple(), set(), or sorted() calls that receive the generator as argument
  • sum(), min(), max(), and functools.reduce(), which exhaust generators silently; any() and all() stop at the first decisive value, so they may leave a generator only partially consumed
  • Pytest conftest.py fixture definitions that provide a generator: check the scope parameter
  • Async generators (async for loops): they also exhaust after one iteration
  • Third-party libraries that accept iterables and may consume them (e.g., pandas.DataFrame.from_records)
( 03 )Common root causes

Practical causes, not theory. These are the things you will actually find.

  • Passing the same generator object to multiple consumers (e.g., two for loops, or list() then sum())
  • Caching a generator in a variable and reusing it, expecting it to restart
  • Using a generator expression (genexpr) and converting it to list twice without realising the first conversion exhausts it
  • Pytest fixture that provides a generator object with scope='module' or 'session': the object is shared across tests and exhausted after the first test that iterates it
  • Iterating over a generator inside a function that also returns it, leading to partial consumption
  • Async generator consumed by an async for loop: a subsequent async for yields nothing
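
The async case in the last item behaves the same way as the synchronous one. A small sketch, with `ticks` as a hypothetical async generator function:

```python
import asyncio


async def ticks(n):
    """Hypothetical async generator function for illustration."""
    for i in range(n):
        yield i


async def main():
    agen = ticks(3)
    first = [x async for x in agen]    # drains the async generator
    second = [x async for x in agen]   # same object: nothing left
    return first, second


first, second = asyncio.run(main())
print(first)   # [0, 1, 2]
print(second)  # []
```
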
( 04 )Fix patterns

Concrete fix directions. Pick the one that matches your root cause.

  • Store generator output in a list: data = list(gen), then use data multiple times
  • Recreate the generator each time it's needed: wrap creation in a function that returns a fresh generator
  • Use itertools.tee(gen, n) to create n independent iterators, but be mindful of memory if gen is large, and do not touch gen itself after the call
  • If only two consumers, use itertools.tee(gen, 2) and pass one copy to each consumer; if the first consumer finishes before the second starts, tee buffers everything anyway and list(gen) is simpler
  • For pytest fixtures, set scope='function' (default), or have a wider-scoped fixture provide a factory that builds a fresh generator on each call
  • Rewrite generators as iterable classes whose __iter__ returns a new iterator on every call (not self) if you need multiple iterations
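
A sketch of the iterable-class pattern from the last item. The class `Squares` is hypothetical. The important detail is that `__iter__` must hand back a new iterator on each call; returning `self` would make the object single-use again.

```python
class Squares:
    """Hypothetical re-iterable container: each __iter__ call starts a new pass."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # This method contains yield, so every call returns a fresh generator.
        for n in range(self.limit):
            yield n * n


squares = Squares(4)
first_pass = list(squares)
second_pass = list(squares)

print(first_pass)   # [0, 1, 4, 9]
print(second_pass)  # [0, 1, 4, 9]
```

The values are still produced lazily on each pass; only the recipe (here, `limit`) is stored, not the results.
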
( 05 )How to verify

A fix you cannot prove is a guess. Close the loop.

  • After fix, call the generator-consuming code twice with the same input and assert both calls produce identical results
  • Unit test: create a generator and call list() on it twice; assert the first call yields the expected items and the second yields an empty list, which documents the exhaustion behavior your fix has to handle
  • Print generator object id before and after each use: if the id changes, you're creating new generators
  • Use a debug wrapper that logs every __next__ call and raises if called after exhaustion
  • In pytest, run tests with --setup-show (optionally narrowed with -k <expression>) to see fixture setup/teardown order and ensure the fixture is recreated per test
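
The first two checks might look like the following pair of tests. `load_rows` and `summarize` are hypothetical stand-ins for your own generator function and consumer.

```python
def load_rows():
    """Hypothetical generator function standing in for the real data source."""
    for i in range(3):
        yield {"id": i}


def summarize(rows):
    # Materialize once inside the consumer so the data can be walked twice.
    rows = list(rows)
    return len(rows), [row["id"] for row in rows]


def test_summarize_is_repeatable():
    # A fresh generator per call: both calls must agree.
    first = summarize(load_rows())
    second = summarize(load_rows())
    assert first == second == (3, [0, 1, 2])


def test_reused_generator_is_empty():
    # Documents the single-use behavior the fix has to work around.
    gen = load_rows()
    assert len(list(gen)) == 3
    assert list(gen) == []
```
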
( 06 )Mistakes to avoid

Things that make this bug worse or harder to find.

  • Do not wrap the generator in a class that caches __next__ results and tries to replay: that breaks laziness and costs memory
  • Do not use except StopIteration to silently ignore exhaustion: that hides the bug
  • Do not rely on generator.send() to reset: it cannot reinitialize an exhausted generator
  • Do not assume itertools.chain will magically reset generators: it just chains iterables
  • Do not modify the generator function to return a list instead without checking data size: that may change memory/performance characteristics
( 07 )War story

Production Data Pipeline Skips Records on Second Run

Backend Engineer · Python 3.9, Celery, Redis, PostgreSQL, Docker

Timeline

  1. 09:15 Alert: daily ETL job reports 0 rows processed instead of 500k
  2. 09:18 Check Celery logs: no errors, just an empty result
  3. 09:22 Rerun job manually: still 0 rows
  4. 09:30 Look at main pipeline code: two functions call list() on the same generator
  5. 09:32 Add debug print: print('generator id:', id(gen)) in both functions
  6. 09:35 Rerun: same id in both functions, and the second list() is empty
  7. 09:40 Fix: change first consumer to iterate lazily and pass rows on, without converting to list
  8. 09:45 Rerun: 500k rows processed
  9. 09:50 Add unit test to catch generator reuse

Tuesday morning, 9:15 AM. The daily ETL alert fired: 0 rows processed. Our pipeline usually handles 500k records from a Postgres query streaming through a generator. No error, no crash — just silence. I checked Celery logs: task completed successfully. That empty result was suspicious.

I pulled the ETL code. The generator function db_rows() was called once, stored in a variable, then passed to two functions: first to transform_and_validate() which called list(gen) internally, then to write_to_target() which also called list(gen). The second call got nothing because the generator was exhausted. I had assumed generators were re-usable, but they aren't.

I fixed it by refactoring transform_and_validate() so that it iterates lazily and passes each row onward instead of calling list(), which means the source generator is drained only once, by the final writing step. Then I added a unit test that deliberately calls list() twice on the same generator object and asserts the second call returns empty. We also added a linting rule to flag generators passed to multiple consumers. Lesson learned: generators are single-pass. Never assume otherwise.

Root cause

Same generator object passed to two consumers; first consumer exhausted it via list() conversion.

The fix

Modified the first consumer to iterate directly without list() and hand its rows to the next step, so the source generator is consumed exactly once. Where two steps really need independent passes over the same data, use itertools.tee or recreate the generator.
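
The story does not show the actual code, so the following is only one plausible shape of such a fix. The function names come from the account above, while every function body, field name, and the validation rule are invented for illustration.

```python
def db_rows():
    """Stand-in for the streaming query; the real one read from PostgreSQL."""
    for i in range(5):
        yield {"id": i, "value": i * 10}


def transform_and_validate(rows):
    # Lazy stage: yields each cleaned row instead of building a list,
    # so it never drains the source ahead of the next stage.
    for row in rows:
        if row["value"] >= 0:          # hypothetical validation rule
            yield {**row, "value": row["value"] + 1}


def write_to_target(rows):
    # Final stage: the only place the stream is consumed to the end.
    written = 0
    for _ in rows:
        written += 1
    return written


written = write_to_target(transform_and_validate(db_rows()))
print(written)  # 5
```

Chaining the stages this way keeps memory flat, because each row flows through both steps before the next one is fetched.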

The lesson

Treat any generator as single-use. If multiple consumers need the data, either store it (list) or use itertools.tee. Never pass the same generator object to more than one consumer.

( 08 )The Lifecycle of a Generator

A generator function (containing yield) returns a generator object when called. This object is an iterator: it has a __next__ method that produces values until StopIteration is raised. Once StopIteration is raised, the generator is exhausted — subsequent calls to __next__ will keep raising StopIteration.

Important: There is no reset mechanism. The generator's internal state is gone. Even if you call .send(None) after exhaustion, you get StopIteration. The only way to 'reuse' is to call the generator function again to get a fresh generator object.
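
The lifecycle can be observed directly with explicit next() calls. `countdown` is a hypothetical generator function used only for this sketch.

```python
def countdown(n):
    """Hypothetical generator function for illustration."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)
values = [next(gen), next(gen)]   # [2, 1]

errors = 0
for _ in range(2):
    try:
        next(gen)                 # exhausted: raises on every further call
    except StopIteration:
        errors += 1

fresh = list(countdown(2))        # a new call returns a new generator object

print(values, errors, fresh)      # [2, 1] 2 [2, 1]
```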

( 09 )Where Exhaustion Hides: Common Functions that Consume

Many built-in functions accept an iterable and exhaust it completely: list(), tuple(), set(), sorted(), sum(), min(), max(), functools.reduce(). any() and all() stop as soon as the answer is known, so they may consume only part of the generator. Either way, once a generator has been passed to one of these, later uses of the same object see only what is left, which is often nothing.

More subtle: even a plain for loop exhausts the generator if it runs to completion. If you break out early, the generator is only partially consumed: it stays suspended at its last yield, and a later next() call or a second for loop resumes from that point rather than from the beginning. The common pitfall, though, is iterating fully (e.g., via for or list) and then trying to iterate again.
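
A short sketch of partial consumption, first with an early break and then with the short-circuiting any():

```python
gen = (n for n in range(5))

for n in gen:
    if n == 1:
        break                 # stop early; the generator is paused, not finished

resumed = next(gen)           # continues after the last value the loop pulled

flags = (n > 1 for n in range(5))
found = any(flags)            # stops at the first True, produced for n == 2
remaining = list(flags)       # results for n == 3 and n == 4 are still available

print(resumed)    # 2
print(found)      # True
print(remaining)  # [True, True]
```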

( 10 )Pytest Fixtures with yield: A Frequent Culprit

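Pytest fixtures that hand a generator object to tests can be tricky. The yield inside a fixture function is driven by pytest itself for setup and teardown; the problem is the value the fixture provides. If the fixture has scope='module' or 'session', pytest creates that value once, and the same generator object is passed to every test in the module or session. The first test that iterates it exhausts it, and subsequent tests get an empty generator. The fix is to use scope='function' (the default) so each test gets a fresh generator, or to have the fixture provide a factory function that builds a new generator on each call.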

Another pattern: fixture yields a generator and the test calls list() on it. The fixture cleanup code after yield runs after the test, but the generator is already exhausted. That's fine as long as the fixture doesn't try to iterate the generator again.

( 11 )itertools.tee: The Right Tool for Multiple Consumers

itertools.tee(iterable, n=2) creates n independent iterators from a single iterable. It pulls values from the source lazily and buffers every value that at least one of the returned iterators has not read yet. This is memory-intensive if one iterator falls far behind. Use tee when the consumers advance at roughly the same pace; if one consumer uses most or all of the data before another starts, list() is simpler and faster. After calling tee, do not use the original iterable again, because advancing it directly makes the tee'd iterators miss values.

Example: gen = my_generator(); a, b = itertools.tee(gen, 2); consumer1(a); consumer2(b). This works because both iterators read from a buffer that tee fills from gen. But note: if consumer1 runs to completion before consumer2 starts, that buffer holds every item by the time consumer2 begins. tee is not a silver bullet, and if the generator produces infinite values, tee will eventually exhaust memory if the iterators diverge too much.
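
A case where tee fits well is two iterators that advance in lockstep, such as pairing each value with its successor. `samples` is a hypothetical generator function.

```python
import itertools


def samples():
    """Hypothetical generator function for illustration."""
    yield from (10, 20, 30, 40)


current, upcoming = itertools.tee(samples(), 2)
next(upcoming, None)                   # advance one copy by a single step

# Both copies move forward together, so tee only has to buffer one item at a time.
pairs = list(zip(current, upcoming))

print(pairs)  # [(10, 20), (20, 30), (30, 40)]
```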

( 12 )Debugging with a Generator Wrapper

To catch exhaustions early, create a simple wrapper that logs every __next__ call and raises a custom exception if called after exhaustion. This turns silent failures into loud crashes during development.

Example:

    class LoggingGen:
        def __init__(self, gen):
            self.gen = gen
            self.exhausted = False

        def __iter__(self):
            return self

        def __next__(self):
            if self.exhausted:
                raise RuntimeError('Generator already exhausted!')
            try:
                return next(self.gen)
            except StopIteration:
                self.exhausted = True
                raise

Then wrap your generator: gen = LoggingGen(my_generator()). The first consumer that fully iterates will set the flag, and any subsequent attempt to iterate will raise RuntimeError.

Frequently asked questions

Can I reset a generator to start over?

No. Generators have no reset mechanism. The only way to iterate again is to call the generator function again to get a new generator object. If you need multiple iterations, either store the output (list) or use itertools.tee.

Why doesn't list() on a generator raise an error the second time?

list() simply iterates the generator until StopIteration. The first time, it collects all items. The second time, the generator immediately raises StopIteration, so list() returns an empty list. No error is raised because StopIteration is expected to signal the end of iteration.

How does itertools.tee work? Does it store everything in memory?

itertools.tee creates a buffer that stores items as they are produced by the original iterator. Each tee'd iterator reads from this buffer, and an item can be dropped once every iterator has read it. If one iterator falls behind, the buffer grows; if the original iterator is infinite and the iterators diverge, memory will fill up. tee is memory-efficient only when the iterators advance at roughly the same pace. If one consumer runs to completion before the other starts, the buffer ends up holding every item, and list() is the simpler choice.

What's the difference between a generator and an iterator?

Every generator is an iterator, but not every iterator is a generator. An iterator is any object with a __next__ method that raises StopIteration when done and an __iter__ method that returns the object itself. A generator is a specific type of iterator created by a generator function or generator expression. Both are single-use. The exhaustion problem applies to all iterators, but generators are the most common source because they are often created on the fly.
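
The distinction, and the shared single-use behavior, can be checked with the standard library:

```python
import inspect

list_iter = iter([1, 2, 3])            # an iterator, but not a generator
gen = (n for n in [1, 2, 3])           # a generator, and therefore an iterator

is_gen = (inspect.isgenerator(list_iter), inspect.isgenerator(gen))

drained = (list(list_iter), list(gen))   # first pass over each
after = (list(list_iter), list(gen))     # second pass: both are empty

print(is_gen)  # (False, True)
print(after)   # ([], [])
```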

Can a generator be used as a context manager?

Yes, using contextlib.contextmanager. But that pattern also yields once and cannot be reused: the context manager object returned by one call to the decorated function can be entered only once. Attempting to re-enter the same object will fail because the underlying generator is exhausted. Call the decorated function again to get a fresh context manager.
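
A sketch of that single-use behavior, with `session` as a hypothetical context manager. Which exception the second entry raises may depend on the Python version, so both likely candidates are caught here.

```python
from contextlib import contextmanager


@contextmanager
def session():
    """Hypothetical resource; code around the yield would be setup and teardown."""
    yield "connected"


manager = session()
with manager as value:
    first = value

try:
    with manager:                 # same manager object, second entry
        reused = True
except (RuntimeError, AttributeError):
    reused = False

with session() as value:          # calling the function again gives a fresh manager
    second = value

print(first, reused, second)
```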