I yield to Python

I’ve been using Python for a few years now and I am amazed that I got by all this time without using yield. I think this is a result of my coming from a C/Matlab background and teaching myself Python. Now of course I have the zeal of a recent convert.

I came to use ‘yield’ when I started to refactor my analysis code. I needed to repeat a computation many times (in a loop), in a slightly different way each time. Much of the code was very similar, but some parts at its heart were different.

I started out by passing functions into other functions: a basic function ran my computation loop, and I passed in parameters and functions to run inside the loop. But sometimes I needed to make subtle changes within the loop itself, or before the loop ran, and it started to get messy.
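For contrast, here is a rough sketch of that function-passing style, written for Python 3. The names run_computation and step are made up for illustration and are not from my analysis code; the point is only that the loop lives inside one function and everything that varies has to be squeezed in through its arguments.

```python
def run_computation(n_max, step):
    """Hypothetical loop runner: the loop is fixed, 'step' is the part that varies."""
    results = []
    for n in range(1, n_max + 1):
        # The caller can only customize what happens at this one spot.
        results.append(step(n))
    return results

# Two variations of the "same" computation, differing only in the callback.
squares = run_computation(4, lambda n: n * n)
doubled = run_computation(4, lambda n: 2 * n)
print(squares, doubled)
```

This works until a variation needs to change something other than the callback, such as setup before the loop or state carried between iterations. Then the argument list starts to grow.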

The code would get a lot cleaner and more understandable if I could refactor the concept of the loop out and reuse that. If only Python had something like that, if only there was some way …

So, here’s a short tutorial on Python’s yield from a recent convert using a contrived example.

Task: Given an integer N, take all the numbers 1…2*N, pair each odd number with the even number that follows it, i.e. (1,2), (3,4), …, (2*N-1,2*N), and print the sum of each pair. Do this for every value of N from 1 up to some maximum.

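def run_generator(Nmax):
    # Code before the first yield runs once per generator object,
    # when iteration starts (not when run_generator() is called).
    print("Any code up to the first yield statement runs only once, "
          "when the generator is first iterated")
    for n in range(1, Nmax + 1):
        yield n

def number_generator(N):
    # Yield the (odd, even) pairs (1, 2), (3, 4), ..., (2N-1, 2N).
    for n in range(1, N + 1):
        yield 2*n - 1, 2*n

for N in run_generator(10):
    for a, b in number_generator(N):
        print(a + b)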
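If you want to see for yourself when the code before the first yield runs, here is a small sketch, written for Python 3. It is a variation on run_generator that takes an extra log list (a hypothetical parameter added just for this demonstration) so we can record when the setup code runs instead of printing.

```python
def run_generator(n_max, log):
    # 'log' is a hypothetical extra argument used only to record
    # when the code before the first yield actually executes.
    log.append("setup")
    for n in range(1, n_max + 1):
        yield n

log = []
gen = run_generator(3, log)    # creates the generator; the body has not started
log_after_call = list(log)     # snapshot taken before any iteration

first = next(gen)              # runs the body up to the first yield
log_after_first = list(log)    # snapshot taken after the first value

rest = list(gen)               # resumes after the yield; setup is not repeated
print(log_after_call, first, log_after_first, rest)
```

Calling the generator function only builds the generator object. The setup code runs when the first value is requested, and it does not run again as the loop continues.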

Ah, you say, but I could have done that using just a set of explicit nested loops. Ah, but what if I come along and say, now I want to find the product of the number pairs AND a running total of their products? Before, you would be copying the code, pasting it and making some tweaks to the inner and outer loops. But now, because you are using generators:

total = 0  # running total of the products (avoids shadowing the built-in sum)
for N in run_generator(10):
    for a, b in number_generator(N):
        prod = a*b
        total += prod
        print(prod, total)

Some people will tell you that yield is really useful because it creates generators, which compute the next value just-in-time and so save memory (you don’t keep the entire list in memory at the same time). Pfft, memory schmemory, I find the real use is in keeping code elegant and reusable.
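Still, for the memory-minded, here is a minimal sketch of the just-in-time point, written for Python 3 (where range is itself lazy). pair_sums is a hypothetical helper I am adding for illustration; number_generator is repeated so the snippet stands on its own.

```python
import itertools

def number_generator(N):
    # Yield the (odd, even) pairs (1, 2), (3, 4), ..., (2N-1, 2N).
    for n in range(1, N + 1):
        yield 2*n - 1, 2*n

def pair_sums(pairs):
    # Hypothetical helper: consume any iterable of pairs, yield each sum.
    for a, b in pairs:
        yield a + b

# Ask for a billion pairs, but only pull the first three sums.
# Nothing beyond those three pairs is ever computed or stored.
first_three = list(itertools.islice(pair_sums(number_generator(10**9)), 3))
print(first_three)  # [3, 7, 11]
```

Notice that this is also a reuse story: pair_sums does not care where its pairs come from, so the same generator can sit on top of any other source of pairs.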

(I will not engage in lame wordplay for blog post titles. I will not engage in lame wordplay for blog post titles. I will not engage in lame wordplay for blog post titles. I will not engage in lame wordplay for blog post titles. I will not engage in lame wordplay for blog post titles. I will not engage in lame wordplay for blog post titles.)
