Some notes on using Python multiprocessing

This is a short note describing a common code pattern useful for parallelizing some computations using the Python multiprocessing module.

Many problems are of the embarrassingly parallel type, where the task consists of the same set of computations done independently on a large set of input data. In many languages such problems are solved using multi-threaded code.  Due to events too complicated and embarrassing to mention Python doesn’t do too well with multiple threads, and we need to use multiple processes via the multiprocessing module instead.

A common pattern is to set up a number of workers and use queues to pass data back-and -forth. The pattern looks like this:

multiprocessing-queues.png

The use of this pattern is relatively straight forward but there are some things to pay attention to.

Sharing read only data

The general topic of sharing data between processes is worthy of deeper discussion, and there are some pointers in the Python documentation but there is an easy way to share read only data between processes – use globals that are written to in the parent process BEFORE the child processes are started.

If you declare a global variable and then modify it after the child processes are started, or if you modify the global from the child process you’ll enter into the twilight zone. There will be no errors and from within the process where you do the writing everything will look consistent, but the rest of the universe will not pay any attention to your writes.

Queues have overhead – the object has to pickled by the sender and then un-pickled by the receiver, and while the object is in the queue, it’s taking up memory. If you have a large amount of read-only data that needs to be shared amongst child processes – like a lookup table of primes, for example – this pattern of using globals can be pragmatic.

 

Explicitly set Queue size

On Mac OS X the default size of the queue is  32767 items. On Linux, the default size is … INFINITY. I learned this the hard way. I usually prototype my code on my mac, and then deploy to a linux container that runs god knows where on data that is several orders of magnitude larger than my prototype set. On Linux things SEEMED to run fine but the size of the result set was abnormally small compared to the input. After a lot of poking around I discovered this property of the Python implementation and what was happening was that in Linux my queues got larger and larger, taking up more and more RAM and the OS started to kill off the child processes. Setting an explicit queue size allows the processes to block when the queue becomes full, capping memory usage.

Wrap the child process function using traceback

Error messages from child processes are messy and uninformative. Importantly, the friendly python stack trace print out is missing. A fairly convenient pattern is to wrap the actual function call within a try-except block in thin outer function, using the traceback module, like so:

def func_w(args):
  """A thin wrapper to allow proper tracebacks when things go wrong in a thread

  :param args:
  :return:
  """
  import traceback
  try:
    return func(**args)
  except Exception as e:
    traceback.print_exc()
    print('')
    raise e

Check for normal termination from the child

One of the reasons it took me so long to track down the above mentioned memory error was that when the child processes was being killed by the OS no Python error was raised. The principled way to check for normal termination in is to look at the exitcode value of the process.

When not to use this pattern

Passing data via queues can be expensive in terms of computation and memory. If your individual workers have to pass a lot of data back and forth – say large submatrices of an even larger matrix – it may be neater and faster to use shared memory objects as supplied by the multiprocessing module which takes care of semaphores for you.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s