functools.partial

functools.partial is one of those things that you don’t need until you need it, and then you need it badly. Trying to create a series of lambda functions I discovered that the simplest answer that came to my mind did something counter intuitive, and functools.partial – which I had ignored so far, came to my rescue.

Effectively, I wanted to create a list of functions

f1(x), f2(x), f3(x) ...

where each function took the same input but did something different, for example

f1(x) = x + 1, f2(x) = x + 2, f3(x) = x + 3

What I tried was as follows

fl = [lambda x: x + i for i in range(4)]

It ran fine, but it gave me the following output

fl[0](1) -> 4
fl[1](1) -> 4
fl[2](1) -> 4
fl[3](1) -> 4

What was happening was that all the lambda functions were taking the last value of i (as a global), rather than the value of i at the time the function was created.

An object oriented person would probably do something like

class F:
  def __init__(self, i):
    self.i = i
  def f(self, x):
    return x + self.i

fl = [F(i) for i in range(4)]

But I enjoy Python because it borrows a lot of things from LISP and allows you to write a lot of code very succinctly, making it easier to grasp (if done well – it’s also possible to write Perl in Python)

Turns out the way you can capture the state of i in the lambda is to create a partial function, and functools.partial allows you to do this succinctly as

fl = [functools.partial(lambda x, j: x + j, j=i) for i in range(4)]

For those who are curious, WHY I had to do this – well, here is an analogy to my use case: I had data that was well expressed as a table of values. The data itself was stored as a list of dictionary which was sometimes nested – some of the values formed part of a hierarchy. For example, one row of the data could look like this:

row = {
 'name': 'Baker',
 'rank': 22,
 'score': {
   'raw': 43,
   'scaled': 25
 }
}

I wanted a table with columns

Last Name | Rank | raw score | scaled score |
----------|------|-----------|--------------|
    Baker |   22 |        43 |            25|

The actual table I was creating was meant to show ALL the data from an experiment with many variables and so had a lot of columns. I was wary of making a mistake in labeling the column relative to the value it had. It would be one of those silent and deadly errors.

If I had a simple flat structure, I might simply have changed the code such that the dictionary keys matched the column headers, weighing the fact that there was other code – already written and debugged – that used the existing keys and that the existing keys were short and succinct and I liked them.

My solution was to create a list of tuples

[(column header, function to extract column value from dictionary)]

And then simply loop over this list, first to create the header, and then for each row, to extract the data value into it’s correct place. Having the column header right next to the extraction function minimized the chances I would mess up somewhere, especially when adding or moving a column when editing or refactoring the code.

(time (apply max a))

A while ago I started up with Haskell in an effort to figure out what this functional programming was all about. One of the simple tests I did was try to find the maximum of a list of numbers and time it. This led me down an interesting rabbit hole, documented here.

I’ve now started with Racket in earnest (and am documenting it here). The corresponding test for Racket – the maximum of a list of a million numbers – is as follows:

#lang racket
(define a (build-list 1000000 values))
(time (apply max a))

And gives a respectable 92ms running time, 70ms of which is garbage collection. To recap, the corresponding performances for Python was 20ms and 2ms for C. Haskell’s performance was, well, confusing.

maximum [1..1e6]

I had heard about Haskell – the coolest language on the block – from several friends. After reading the chapter on typeclasses from Learn You A Haskell for Great Good my mind was blown away and I wanted to get cracking and learn this cool functional programming thing.

I’m a practical guy. As I followed the tutorial along defining toy functions, foraging for a suitable editor (lighttable works, but I’m still looking), figuring out why cabal was giving 502 errors (define a different remote, hackage goes down sometimes) I suddenly thought, how does Haskell stack up against Python in doing a simple thing like finding the maximum of a list of numbers?

In Python, this stuff is easy. You open up IPython and do something like this:

%%timeit x = xrange(int(1e6))
max(x)

And get something like this:

10 loops, best of 3: 22.4 ms per loop

In the interactive Haskell compiler ghci

:set +s

turns on profiling and one can do

maximum [1..1e6]

To get

(0.20 secs, 244690144 bytes)

Both of these numbers (time taken, memory used) seem a little high! 244 megabytes for this?

Scuttlebutt on the internet has it that the ghci runs slowly, so I wanted to time a compiled version of this code.

To start off, I wanted to see if the bare bones version of the code would compile and run:

main = do let z = maximum [1..1e6]
          putStrLn $ "The result is: " ++ show z

Which led to:

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

Uh? Mog no understand. How much stack do we need for this again? It’s just a million numbers. Python, even Matlab, eats this for breakfast everyday, as a starter, with the OJ.

Ok, let’s try RTS.

./max +RTS -K100M
max: Most RTS options are disabled. Link with -rtsopts to enable them.

Oh, alright. We’ll compile with

ghc -rtsopts  max.hs

(Don’t forget to delete the intermediate files, got tripped up there)

The result is: 1000000.0

real    0m0.262s
user    0m0.222s
sys 0m0.039s

Well, this isn’t THAT impressive.

What does a guy have to do round here to get some decent performance from this thing? (I also tried with pure ints, in case you were wondering, with the same results)

As a comparison, what does a simple C implementation of bubble max get us? (Please excuse my rusty C)

#include <stdio.h>

struct ListGenPtr {
  int current;
  int next() {
    if (current < 1000000)
      return ++current;
    else
      return -1;
  }
};

int bubble_max(ListGenPtr *list_gen_ptr) {
  int m0 = list_gen_ptr->next(),
      m1 = list_gen_ptr->next();
  while (m1 != -1) {
    if (m0 < m1) m0 = m1;
    m1 = list_gen_ptr->next();    
  }
  return m0;
}

int main() {
  ListGenPtr lgp = ListGenPtr();
  lgp.current = 0;
  printf("%d\n", bubble_max(&lgp));
}

Compiled with

g++ max.c -o maxc
time ./maxc
1000000

real    0m0.008s
user    0m0.006s
sys 0m0.002s

Which seems reasonable and three times faster than Python.

(We can push Python to the limit and do:

%%timeit x = xrange(int(1e9))
max(x)
   ....: 
1 loops, best of 3: 22.9 s per loop

And against C:

time ./maxc
1000000000

real    0m4.547s
user    0m4.540s
sys 0m0.005s

Which is 5 times faster.)

Hmm. How do we get some decent performance out of Haskell for the kind of datasets we expect to play with in the real world? It seems that to do this I will need to learn more about Haskell internals – things that will come under advanced topics – things that meddle with lazy evaluation and tail recursion.

So, right now, to me, Haskell is an academic language, in that with Python I can get up and running and write good code very concisely while with Haskell, even for a simple thing like the maximum of a list we have to start tinkering with non-trivial parts to get acceptable performance.