The magic of memoization

Project Euler problem #15 (This may be a spoiler BTW) can be solved via recursion, but because the tree is so wide it quickly blows up. More irritatingly, much like computing the Fibonacci sequence recursively, we keep doing the same computation over and over again. I knew how to solve this using imperative programming and I was trying to shoehorn that solution into Racket, but just thinking about it made me feel dirty. If only there was some way to retain the elegance of recursion but not have to redo computations …

Project Euler problem #15 poses the challenge: Starting in the top left corner of a 2×2 grid, and only being able to move to the right and down, there are exactly 6 routes to the bottom right corner. How many such routes are there through a 20×20 grid?

It turns out that there is an analytical solution to this problem involving combinatorials. I kind of got there while sketching out solutions, but fortunately for me, I decided to let the computer do all the work. Fortunately because I learned a cool new functional tool.

The straightforward way to solve this problem using functional programming paradigms is to use recursion. Just set off little machines that keep forking off copies off them selves chasing down alternate paths. If the machines go off the grid, discard them, and if they reach the exit corner, tell them to report back. Count up the reports and you are done!

#lang racket
(require rackunit)

; A 2D value
(struct pos (x y) #:inspector #f)

;; The two legal moves
(define (move-right p0)
  (struct-copy pos p0 [x (+ (pos-x p0) 1)])) 

(define (move-down p0)
  (struct-copy pos p0 [y (+ (pos-y p0) 1)])) 

;; Return True if we are at the grid boundary
(define (boundary-pos p0 grid-size)
  (if (or (= (pos-x grid-size) (pos-x p0)) (= (pos-y grid-size) (pos-y p0))) #t #f))

;; Use recursion to explore all possible moves. Our base case occurs when we
;; land on the grid boundary. We know that from then on there is only one
;; path to the exit - either straight right or straight down. Return 1 when that
;; happens
; p0 - initial position (pos struct)
; grid-size - size of grid (pos struct)
(define (count-paths grid-size [p0 (pos 0 0)])
    [(boundary-pos p0 grid-size) 1]
    [(+ (count-paths grid-size (move-right p0)) (count-paths grid-size (move-down p0)))])) 

(test-case "Recursive"
           (check-equal? (count-paths (pos 2 2)) 6)
           (check-equal? (count-paths (pos 1 3)) 4)

(time (let ([k (count-paths (pos 10 10))]) (display k)))
;-> 184756  cpu time: 88 real time: 88 gc time: 0
;; 20x20 grid does not finish and keeps asking for more memory

Now this thing blows up quickly because, as you can see, each node we visit spawns two machines (function calls) and pretty soon our grid is crawling with machines. On my machine I can’t even compute the 20×20 solution.

What is annoying is that it doesn’t have to be this way: we can see that the recursion repeatedly visits the same lattice nodes (after all there are not THAT many of them). An interesting property of this problem is the number of paths from a given node to the exit does not depend on how we got to that node. So we don’t really need to send machines down a node we have already visited.

The way I would have exploited this property in imperative programming is to build a two dimensional array representing the grid. I would have started to fill out this array by walking backwards from the exit and noting, in the array values, how many paths there are to the exit. The value of each node is the sum of the values of the two “children” node – nodes that can be reached by going down or by going right.

If you want keywords, this is a bottom up dynamic programming approach because we are taking small, individual pieces of the problem and working out the answer separately and then putting things together to solve bigger parts of the problem. (Wikipedia’s page on dynamic programming is pretty good)

I started to poke around Racket and look into matrices and arrays and mutability and I started to get a yucky feeling. It started to feel a lot like housekeeping and housekeeping is one thing functional programming promises to free us from, so we may concentrate on the fun stuff.

While doing various web searches related to “dynamic programming functional languages” and “update mutable array Racket” (yes, I was desperate) I kept hitting the word memoization. I’d come across this term before, but had not paid any attention to it. Then I found this great post on the Racket blog.

Before we go any further, take a look at the code below:

(require memoize)

(define/memo* (count-paths-m grid-size p0)
    [(boundary-pos p0 grid-size) 1]
    [(+ (count-paths-m grid-size (move-right p0)) (count-paths-m grid-size (move-down p0)))])) 

(test-case "Memoize"
           (check-equal? (count-paths-m (pos 2 2) (pos 0 0)) 6)
           (check-equal? (count-paths-m (pos 1 3) (pos 0 0)) 4)

(time (let ([k (count-paths-m (pos 20 20) (pos 0 0))]) (display k)))
; -> 137846528820  cpu time: 0 real time: 0 gc time: 0

;; Note that this last operation would basically run out of memory and crash using the naive recursive implementation

What just went down here? (Yes, I learned this catchy phrase from Tim Roughgarden’s lectures)

We only made one change – we used the memoize package and we defined the function as a memoized function. (We also had to remove the default value, apparently the memoization package does not support that)

What is this Memoization magic?

Recall that what I was looking for was a scheme that would prevent us having to recompute already known values. The approach I was familiar with from imperative programming was an explicit table store of known values that I could reference each time I went to do a computation.

Memoize is a very elegant implementation of a related idea. However, instead of putting the implementation (e.g. a matrix of known values) first, it puts the function first and says “What we really want is a way for a function to remember the values it computes”.

What the two lines of Racket above do (and note, really, that we did not have to touch the algorithm at ALL) is to say, here’s a function count-paths-m and I want to remember the answer to each call that is made to it, so if I ever call the function with the same arguments, instead of executing the body of the function, just return the stored value.

And THAT’s IT! No need to mess with dirty implementation details. Memoize gets to the heart of what we want to do.

Now, such memoizations, because they have to work with any function, are usually implemented as a hash (I know that the Racket one is – by the way, the Racket implementation is one short file). Performance freaks may hem and haw, since in this particular case, for example, rather than having a hash look up, it is faster and more compact to have a 2D array lookup. But, as you can see from the timing, it’s fast enough.

As a side note on Racket, I was very impressed by the smoothness of getting this package installed. I used the GUI to search for “memoize”, the manager searched in different repositories, found the github repo, downloaded the code and integrated the documentation with the rest, so now I can find the docs for memoize just like I would any of the pre-installed packages.

Finally, a small point specific to this Racket package: note I used define/memo*. I first used define/memo and got very unexpected results, which seemed a lot like memoziation wasn’t working. It turns out to be a detail of the package: I was using structs – not primitive types – as inputs to the function, and the *-d versions of the functions implement the proper comparisons to handle such derived data types.

UPDATE: I forgot to put this in: Note that now we have a kind of state in the function. It’s not a state that changes the input/output characteristics of the function, but rather an “implementation state” if you will. The state changes how the function is implemented based on the previous history of the program. This does not give us any problems when reasoning about what the function does (as befits a proper functional program) but it does make it hard to reason about the speed characteristics of the function in a way that mirrors difficulties in reasoning about algorithmic characteristics of an imperative program with mutable state. Funny world.


One Reply to “The magic of memoization”

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s