What does the “yield” keyword do in Python?

Tagline Infotech LLP
8 min read · Jan 24, 2024

Introduction

In Python, a generator is a function that returns an iterator that produces a sequence of values when iterated over.

Generators are useful when you need to generate a large sequence of values, but you don’t want to store all of them in memory at once.

The key to creating a generator is the yield keyword. When a function contains a yield statement, it becomes a generator function. Calling a generator function does not execute its body immediately; instead, it returns a special iterator called a generator.

Here’s a simple example:

def my_generator():
    yield 1
    yield 2
    yield 3

g = my_generator()

When we call my_generator, it returns a generator object g. We can then iterate over g using a for loop to get the sequence of numbers:

for i in g:
    print(i)
# Prints 1, 2, 3

The yield statement pauses the function and saves the local state. On the next call to next() on the generator, the function resumes after the yield.

Generators produce values on-demand, allowing memory-efficient lazy generation of sequences. The yield keyword is what makes this possible in Python.

Allowing a Function to Pause Execution

The yield keyword in Python allows a function to pause execution and return a value, then continue where it left off on the next call.

When a function contains a yield statement, it becomes a generator function. Instead of returning once like regular functions, generator functions will pause after each yield and resume on the next call.

For example:

def my_generator():
    yield 1
    yield 2
    yield 3

g = my_generator()

When my_generator() is called, it returns a generator object g but does not run the function body; execution is suspended before the first line.

Then when we call next(g), it resumes my_generator(), executes until the first yield 1 statement, pauses again, and returns 1.

On the next call next(g), it continues from where it left off, executes the next line, pauses again, and returns 2. This allows the function to pause and resume between yields.

This pattern of pausing and resuming allows generator functions to produce a sequence of values over time, rather than computing an entire result at once like regular functions. The yield keyword is what enables this useful behavior in Python.
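To make the pause points visible, here is a small sketch; the print statements are added purely for illustration:

```python
def my_generator():
    print('before 1')  # runs on the first next() call
    yield 1
    print('before 2')  # runs only when the generator is resumed
    yield 2

g = my_generator()
first = next(g)   # prints "before 1", pauses at "yield 1", returns 1
second = next(g)  # resumes after "yield 1", prints "before 2", returns 2
```

Note that nothing is printed when my_generator() itself is called, which confirms the body only runs once iteration begins.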

Iterating Over a Generator

In Python, a generator object can be iterated over using a for loop or the next() function. Here’s an example:

def my_generator():
    yield 1
    yield 2
    yield 3

g = my_generator()

To iterate using a for loop:

for i in g:
    print(i)
# Outputs:
# 1
# 2
# 3

The for loop automatically calls next() on the generator object until it raises StopIteration.

We can also manually iterate over the generator using next():

g = my_generator()  # a fresh generator; the for loop above exhausted g
print(next(g))  # Outputs 1
print(next(g))  # Outputs 2
print(next(g))  # Outputs 3
print(next(g))  # Raises StopIteration

Calling next() repeatedly advances the generator and returns the next yielded value until there are no more values to yield.

So both for loops and next() can iterate over generators to consume their sequence of yielded values. The main difference is that the for loop handles StopIteration automatically while next() requires manually catching the exception.
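The for loop's behavior can be sketched by hand as a while loop that catches StopIteration, which is essentially what the loop does under the hood:

```python
def my_generator():
    yield 1
    yield 2
    yield 3

g = my_generator()
values = []
while True:
    try:
        values.append(next(g))  # advance the generator one step
    except StopIteration:
        break  # the generator is exhausted
# values is now [1, 2, 3]
```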

Generator Expressions

Generator expressions provide a concise way to create generators using syntax similar to list, dictionary, and set comprehensions. They allow you to generate sequences without having to explicitly create a function.

For example, this generator function:

def squares(n):
    for i in range(n):
        yield i**2

Could be rewritten as a generator expression like this:

squares = (i**2 for i in range(n))

Generator expressions create anonymous generators; like generator functions, their body does not execute until the generator is iterated. The main advantage they provide is syntactic compactness. However, they are limited to a single expression and cannot contain branches or multiple statements.

Overall, generator expressions are ideal for situations where you want to quickly create a generator in one line without defining an entire function. The syntax is intuitive for anyone familiar with list/set/dict comprehensions, making generator expressions easy to read and write. But for more complex logic that requires branching or multiple lines, defining a full generator function is likely better.
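As a quick sketch, a generator expression can be fed directly to a function like sum() without ever materializing a list (the numbers here are arbitrary):

```python
# Sum of squares without building an intermediate list in memory
total = sum(i**2 for i in range(10))  # the extra parentheses can be omitted here

# A generator expression is also a normal generator: next() works on it
squares = (i**2 for i in range(5))
first = next(squares)
```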

Yield From

The yield from expression was added in Python 3.3. It allows a generator to delegate part of its operations to another generator.

For example, you may have a generator that fetches data from an API. Rather than iterating over the raw response, you could define another generator that processes the raw data and yields cleaned-up results.

The outer generator can use yield from to delegate iteration to the inner data cleaning generator. This allows:

  • The outer generator abstracts away the raw data fetching
  • The inner generator focuses solely on data cleaning
  • The code can be split for readability while preserving the iteration interface

When the outer generator yields from the inner one, iteration control shifts to the inner generator until it terminates. Any values yielded from the inner generator are passed directly to the caller of the outer generator. Exceptions are also propagated automatically.

Overall, yield from provides a simple syntax to delegate iteration between generators. This enables better abstraction and encapsulation when working with complex generators and coroutines.
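A minimal sketch of the pattern described above; fetch_and_clean and clean_records are hypothetical names, and the hard-coded list stands in for a real API response:

```python
def clean_records(raw):
    # Inner generator: focuses solely on data cleaning
    for record in raw:
        yield record.strip().lower()

def fetch_and_clean():
    # Stand-in for fetching raw data from an API
    raw = ['  Alice ', ' BOB ', 'Carol  ']
    yield from clean_records(raw)  # delegate iteration to the inner generator

results = list(fetch_and_clean())
```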

Sending Values into a Generator

The yield keyword allows values to be sent into a generator when it is resumed. Here’s how it works:

When you call next() on a generator, execution resumes from the last yield statement and progresses until the next yield. However, send() allows resuming the generator and passing a value back into it at the same time. For example:

def my_generator():
    print('Starting')
    x = yield
    print('Received:', x)

g = my_generator()
next(g)     # Prints "Starting"
g.send(42)  # Prints "Received: 42", then raises StopIteration as the generator ends

In this example, the generator pauses at the yield statement when next() is called. When we later call send(42), it resumes and assigns the value 42 to x.

The send() method can be called multiple times to keep passing new values back into the generator each time it resumes. This creates a two-way communication between the generator code and the calling code.

Values that are yielded out of the generator can also be captured when using send():

def my_generator():
    yield 1
    print('Received:', (yield 2))

g = my_generator()
print(next(g))     # Prints 1
print(g.send(42))  # Prints 2

Here the generator first yields 1. Calling send(42) resumes it at that first yield; because yield 1 is a bare statement, the 42 is discarded, and the generator runs on to yield 2, which send() returns and we print. Only a further send() would reach the 'Received:' print (and then raise StopIteration as the generator finishes).

So send() provides a way to pass data into a generator as it runs, enabling communication in both directions.
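One common application of this two-way channel is a running-average coroutine; the sketch below assumes the caller primes the generator with next() before the first send():

```python
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # yield the current average, receive the next value
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)          # prime the coroutine: advance to the first yield
a1 = avg.send(10)  # 10.0
a2 = avg.send(20)  # 15.0
a3 = avg.send(30)  # 20.0
```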

Generator Pitfalls

Generators can be tricky to work with if you don’t fully understand their behavior. Here are some common pitfalls to watch out for:

  • Returning a value from a generator — Unlike normal functions, a return statement in a generator ends iteration instead of handing a value back to the caller. In Python 3 the returned value is attached to the resulting StopIteration exception (and is consumed by yield from); a plain for loop simply stops and discards it. For example:

    def bad_generator():
        yield 1
        return 2  # ends iteration; 2 becomes StopIteration.value
  • Losing generator state on exception — If an exception occurs inside the generator, the generator will be closed and lose its internal state. This can lead to unexpected behavior if you try to resume iteration after catching the exception.
  • Forgetting to iterate — Simply defining a generator function does not execute any code — you have to explicitly iterate over the generator using a for loop, next(), etc. It's easy to define a generator but forget to use it.
  • Not handling StopIteration correctly - When a generator reaches the end of iteration and stops, it raises StopIteration. Make sure you are properly catching this exception if you are iterating manually with next().
  • Unhandled generator exceptions — An exception raised inside a generator propagates out to the code that called next() or send(); once that happens, the generator is finished and cannot be resumed. Catch exceptions inside the generator if iteration needs to continue afterwards.

To avoid these issues, remember that generators stay paused until the next next() call, a return statement ends iteration rather than producing a value, and an unhandled exception permanently closes the generator. With some practice, generators can be powerful and efficient tools in Python. But watch out for the common traps when first working with them!
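To see how a return value in a generator surfaces in Python 3, here is a small sketch; the names are illustrative:

```python
def gen_with_return():
    yield 1
    return 42  # ends iteration; 42 becomes StopIteration.value

g = gen_with_return()
first = next(g)  # 1
try:
    next(g)
    return_value = None
except StopIteration as exc:
    return_value = exc.value  # 42

# A for loop (or list()) simply stops and discards the return value
consumed = list(gen_with_return())  # [1]
```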

Generator Advantages

One of the main benefits of using generators is lazy evaluation. When you call a regular function, it executes all of its code at once. But generators are lazy — they pause after each yield statement and resume from that point when next() is called. This allows programs to produce item-by-item results on demand, rather than computing everything upfront.

Generators are also very memory efficient. A regular function needs to store all of its results in memory at once before returning. But a generator can yield results one chunk at a time, avoiding storing everything in memory simultaneously. This makes generators useful for working with large datasets that don’t fit in memory.
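The memory difference is easy to observe with sys.getsizeof; exact byte counts vary by Python version, so treat the numbers as illustrative:

```python
import sys

n = 100_000
big_list = [i for i in range(n)]  # stores every element up front
big_gen = (i for i in range(n))   # stores only the generator's paused state

list_size = sys.getsizeof(big_list)  # hundreds of kilobytes of pointers
gen_size = sys.getsizeof(big_gen)    # a small, constant-size object
```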

Additionally, generators facilitate pipelines and producer/consumer patterns. The yield statement allows easily sending data from a producer to a consumer without needing shared memory between them. Generators give you iterator benefits like lazy evaluation and memory efficiency while also providing the ability to pass data in both directions between producers and consumers.

Overall, generators enable lazily evaluated iterable data pipelines. This provides powerful capabilities like:

  • Processing large datasets that don’t fit in memory
  • Implementing producer/consumer patterns and pipelines
  • Avoiding unnecessary upfront computation
  • Writing iterative algorithms that produce results on demand
  • Pausing and resuming execution at precise points

By taking advantage of these generator strengths, you can write efficient programs that elegantly handle large and streaming datasets.
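The pipeline idea can be sketched with a few chained generator stages; the stage names and the sample data are invented for illustration:

```python
def produce(lines):
    # Producer stage: emit raw items one at a time
    for line in lines:
        yield line

def strip_comments(lines):
    # Filter stage: drop comment lines
    for line in lines:
        if not line.startswith('#'):
            yield line

def to_upper(lines):
    # Transform stage: normalize the survivors
    for line in lines:
        yield line.upper()

raw = ['# header', 'alpha', 'beta', '# note', 'gamma']
pipeline = to_upper(strip_comments(produce(raw)))
result = list(pipeline)  # items flow through all stages one at a time
```

Because each stage is lazy, no intermediate lists are built; each item travels through the whole pipeline before the next one is produced.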

Use Cases

Generators are commonly used in Python for several useful purposes:

  • Implementing iterators — Generators provide an easy way to define iterators without having to implement the __iter__ and __next__ magic methods. This can simplify code when creating custom iterators.
  • Lazy evaluation — Generators allow deferring the evaluation of values until they are needed. This can improve performance for large datasets or infinite sequences. Values are produced on-demand rather than computed upfront.
  • Data streaming — Generators can act as data streams, yielding one item at a time instead of storing the entire sequence in memory. This enables processing big data and files line by line or chunk by chunk.
  • Concurrency and parallelism — Generator functions can yield control back to the event loop in asynchronous programming. This enables other tasks to run while waiting on long-running operations.
  • Pipelining generators — Generators can be composed into pipelines where each stage transforms data and passes it to the next generator. This allows efficient chaining of producer-consumer patterns.
  • Coroutines — Generators are the basis for coroutines, a generalized subroutine that can have multiple entry points for suspending and resuming execution. This is helpful for asynchronous programming and concurrent operations.

Some examples of good use cases for generators include processing large datasets, implementing state machines, asynchronous I/O, streaming data files, and creating pipelines for ETL (extract, transform, load) tasks. Overall, generators excel at iterative and incremental processing of data flows in memory-efficient ways.
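As one concrete illustration of the iterator and lazy-evaluation use cases, a generator can represent an infinite sequence that is consumed only as far as needed (naturals is an invented name):

```python
import itertools

def naturals():
    # Infinite sequence; values are produced only when requested
    n = 0
    while True:
        yield n
        n += 1

# Take just the first five values; the generator never runs past them
first_five = list(itertools.islice(naturals(), 5))
```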

Conclusion

The yield keyword in Python provides a powerful tool for creating generator functions and expressions. Generators allow pausing and resuming execution, iterating over data without storing everything in memory, and simplifying code for many use cases.

Some key points about generators and yield:

  • Calling a generator function returns a generator object, which can be iterated over. The function is paused until the next value is requested via next().
  • The yield keyword signals a value to send out of the generator, as well as where to pause execution.
  • Generator expressions provide a compact syntax similar to list comprehensions but return a generator object.
  • yield from simplifies yielding from sub-generators.
  • Values can be sent into a generator using .send().
  • Generators are useful for lazily producing data on demand, processing streams, implementing coroutines, and other tasks requiring pausing execution.
  • Compared to lists, generators save memory by not storing entire result sets. But they can only be iterated over once.

Overall, generators and yield provide powerful capabilities for iterators, data processing, and coroutines in Python. Understanding their usage unlocks simpler and more Pythonic code across many applications. I hope you enjoyed this blog and will read more of our technical posts.
