Today, I will be showing you what the yield keyword in Python does. The yield keyword can be tricky because it is similar to return but with some important differences. To understand what yield does, let’s have a look at iterables and generators.
Let’s say you want to create a program that prints out 100 numbers starting from 0 and incrementing by 1 each time. Normally you would create a sequence with the values 0 to 99 and loop over it, printing each element out. This is called iteration. Because you can iterate over each value in a list, a list is considered an iterable. Anything in Python that you can iterate over is considered an iterable. Examples include list, set, range, and tuple. Here is an example of how you would normally print each value in such a sequence.
num_list = range(100)
for i in num_list:
    print(i)
In the above code, we simply created a range of numbers (which behaves like a list here) and iterated over it, printing out each element.
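To illustrate that a for loop works the same way over any iterable, here is a minimal sketch. The collect helper is a made-up name for this example and is not part of Python itself.

```python
def collect(iterable):
    """Loop over any iterable and gather its items into a list."""
    items = []
    for item in iterable:
        items.append(item)
    return items

print(collect([0, 1, 2]))   # a list
print(collect((0, 1, 2)))   # a tuple
print(collect(range(3)))    # a range
print(collect("abc"))       # a string is iterable too
```

The same loop body handles all four inputs, because the for statement only needs something it can iterate over.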
Let’s have a look at another example. In this example, I want to create a list of 100 numbers, double each value, and then print each value. To do this you would use a list comprehension as follows:
num_list = [x * 2 for x in range(100)]
for i in num_list:
    print(i)
When we compute the multiplication for each element in the list, the entire result is stored in memory. For 100 numbers this is fine, but what happens if you need to store 10 billion numbers? Well, let’s find out!
C:\Users\conor\>python test.py
Traceback (most recent call last):
  File "C:\Users\conor\test.py", line 3, in <module>
    num_list = [x * 2 for x in range(10000000000)]
  File "C:\Users\conor\test.py", line 3, in <listcomp>
    num_list = [x * 2 for x in range(10000000000)]
MemoryError
You will notice that we have received a MemoryError. Essentially, this means that the operation is too large for my computer to handle: we ran out of memory trying to compute and store the entire result. We can use generators and the yield keyword to solve this problem.
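To get a rough feel for why the list version fails, here is a back-of-the-envelope sketch. It assumes CPython, where sys.getsizeof on a list reports only the list's own storage (its array of references) and not the integer objects it points to, so the real requirement would be higher still. The sample size of one million and the variable names are arbitrary choices for this example.

```python
import sys

SAMPLE_SIZE = 1_000_000          # small enough to build safely
TARGET_SIZE = 10_000_000_000     # the 10 billion elements from the example

# Build a manageable list and measure the list object itself.
sample = [x * 2 for x in range(SAMPLE_SIZE)]
bytes_per_element = sys.getsizeof(sample) / SAMPLE_SIZE

# Extrapolate linearly to the full size (a rough estimate, not a measurement).
estimated_gib = bytes_per_element * TARGET_SIZE / 2**30

print(f"about {bytes_per_element:.1f} bytes per element for the list itself")
print(f"about {estimated_gib:.0f} GiB for 10 billion elements, integers not included")
```

On a typical 64-bit machine the estimate lands in the tens of gigabytes, which is far more memory than most desktop computers have.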
Generators using yield
Generators are iterables, but you can only iterate over their values once. They are, in a sense, lazy iterators, as they do not store all of their values in memory. To write a generator function, we make use of the yield keyword. Let’s have a look at how we would print 100 numbers using a generator:
def my_generator():
    num_list = range(100)
    for i in num_list:
        yield i

my_gen = my_generator()
for i in my_gen:
    print(i)
In this example, the function my_generator creates the generator. It is very important to note that when we call the function and assign the result to my_gen, the function body does not actually run. Instead, the call returns a generator object. The body of my_generator will run for the first time when the generator is accessed from this for loop:
for i in my_gen:
    print(i)
When the loop asks for the first value, the function executes from the beginning until it hits the yield keyword. At this point it hands back the first value and stops. Each subsequent request resumes the function right after the yield and runs one more iteration of the loop until it hits yield again. Essentially, yield is like a ‘pause’ of execution.
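To make the ‘pause’ behaviour visible, here is a small sketch that records the order in which things happen. The events list and the traced_generator name are made up for this illustration.

```python
events = []  # hypothetical log, used only to record the order of execution

def traced_generator():
    events.append("started")
    for i in range(2):
        events.append(f"about to yield {i}")
        yield i
        events.append(f"resumed after {i}")
    events.append("finished")

gen = traced_generator()
events.append("generator object created")  # the function body has not run yet

first = next(gen)                           # runs up to the first yield
events.append(f"received {first}")

second = next(gen)                          # resumes after the first yield
events.append(f"received {second}")

for line in events:
    print(line)
```

Note that "generator object created" is logged before "started", and "resumed after 0" only appears once the second value is requested.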
Let’s go back to our previous example of doubling 10 billion numbers and printing them out. To double each number, we simply need to change yield i to yield i * 2 and increase the range:
def my_generator():
    num_list = range(10000000000)
    for i in num_list:
        yield i * 2

my_gen = my_generator()
for i in my_gen:
    print(i)
You will notice that when we run this code, it starts printing right away and we do not get a MemoryError (printing all 10 billion values would still take a long time, so feel free to stop it early). So what is the difference here? In the previous example, we computed the value of every element up front and stored them all in memory. When we use a generator and yield, each value is calculated only when it is needed (lazily), and only that one value is held at a time instead of every computed value.
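As a rough check of the memory claim, the sketch below compares object sizes with sys.getsizeof. It assumes CPython, and the doubled function and the list sizes are arbitrary choices for this example. The exact byte counts vary between Python versions, so none are shown here.

```python
import sys

def doubled(limit):
    """Yield the doubled values of 0 .. limit - 1, one at a time."""
    for i in range(limit):
        yield i * 2

small_gen = doubled(100)
huge_gen = doubled(10_000_000_000)   # creating it computes nothing yet

small_list = [x * 2 for x in range(100)]
big_list = [x * 2 for x in range(100_000)]

# The generator objects stay small no matter how large the range is,
# while the list grows with the number of elements it stores.
print("generators:", sys.getsizeof(small_gen), sys.getsizeof(huge_gen))
print("lists:     ", sys.getsizeof(small_list), sys.getsizeof(big_list))
```

The generator over 10 billion values should report a size well below that of the 100,000-element list.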
One key difference between a list and a generator is that a generator can only be iterated over once. After it has produced its last value, it is exhausted.
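Here is a short sketch of that one-time behaviour, using a smaller range than the earlier examples so the results are easy to read.

```python
def my_generator():
    for i in range(3):
        yield i * 2

my_gen = my_generator()
first_pass = list(my_gen)    # consumes every value
second_pass = list(my_gen)   # the generator is already exhausted

num_list = [0, 2, 4]
list_first = list(num_list)  # a list can be iterated as often as you like
list_second = list(num_list)

print(first_pass, second_pass)
print(list_first, list_second)
```

If you need the values again, call my_generator() a second time to get a fresh generator object.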
You don’t need a loop to access elements in a generator. You can access values sequentially using next(). If we only want to access the first 5 values, we can do this as follows:
def my_generator():
    num_list = range(10000000000)
    for i in num_list:
        yield i * 2

my_gen = my_generator()
print(next(my_gen))
print(next(my_gen))
print(next(my_gen))
print(next(my_gen))
print(next(my_gen))
0
2
4
6
8
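Repeating next() by hand gets tedious, so here is an alternative sketch that takes the first 5 values with itertools.islice from the standard library. It also shows what happens when next() is called on a generator that has run out of values. The short_generator function is a made-up example.

```python
from itertools import islice

def my_generator():
    for i in range(10000000000):
        yield i * 2

# islice stops after 5 values, so the rest of the range is never computed.
first_five = list(islice(my_generator(), 5))
print(first_five)

def short_generator():
    yield 1

gen = short_generator()
next(gen)                     # takes the only value
try:
    next(gen)                 # nothing left, so StopIteration is raised
except StopIteration:
    print("no more values")

# Passing a default to next() avoids the exception.
print(next(gen, None))
```

A for loop handles StopIteration for you, which is why you never see it when looping over a generator.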
Difference between Python yield and return
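The return statement ends a function and hands a single result back to the caller, after which the function's local state is discarded. The yield keyword is similar in that it also hands a value back, but it only pauses the generator function, so the next request resumes right after the yield with all local variables intact.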
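The difference is easiest to see side by side. In this sketch the two function names are made up for illustration, and both functions contain the same loop.

```python
def first_double_with_return(numbers):
    for n in numbers:
        return n * 2   # the function ends here, on the first iteration

def doubles_with_yield(numbers):
    for n in numbers:
        yield n * 2    # pause here and resume on the next request

print(first_double_with_return([1, 2, 3]))   # only the first value comes back
print(list(doubles_with_yield([1, 2, 3])))   # every value comes back, one at a time
```

With return the loop never reaches the second element, while with yield the loop continues each time another value is requested.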
This tutorial has demonstrated how to use an iterable, how to use a generator, and what the yield keyword in Python does. Using
yield is a great way to compute large sets of data with very little memory and overhead. It is very efficient, only computing and returning values when needed.
That’s all for What does the yield keyword do in Python! As always, if you have any questions or comments please feel free to post them below. Additionally, if you run into any issues please let me know.