Common Lisp accessors

The Common Lisp setf macro is deceptively humble looking. At the start I didn’t think much of it – except that it looked eccentric compared to the assignment syntax of other languages – but recently I’ve found out how powerful it is, though it is not quite as convenient as I would hope.

In most languages to perform a variable assignment one does something like

int a = 22;

which stores the value “22” in a memory location that can be accessed by writing “a” in the program.

In Common Lisp the corresponding syntax is

(setf a 22)

and the only thing weird about it is the everything-is-a-list pattern of Lisp.

I was blown away, though, when I saw code that looked like this

(setf x (list 1 2 3 4 5))  ; x -> (1 2 3 4 5)
(setf (rest x) (list 6 7 8))  ; x -> (1 6 7 8)

You can’t do this as transparently in C++ or Python. You can implement a function to do the equivalent thing, but you can’t assign a value to the return value of a function. This Lisp syntax makes writing code very compact and logical.
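
For contrast, here is a minimal Python sketch of the workaround (rest and set_rest are hypothetical helpers I made up for illustration): since rest(x) = [6, 7, 8] is a syntax error, the mutation has to go through an explicitly written setter.

def rest(lst):
  """Return everything after the first element (Lisp's rest)."""
  return lst[1:]

def set_rest(lst, new_tail):
  """Mutate lst in place; slice assignment is as close as Python gets."""
  lst[1:] = new_tail

x = [1, 2, 3, 4, 5]
set_rest(x, [6, 7, 8])
print(rest(x))  # [6, 7, 8]
print(x)        # [1, 6, 7, 8]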

However, I stumbled when I tried to use this in a function I wrote. My function started at the root of a complete binary tree and then snaked recursively through it to find and return a particular node.

;; Use the binary representation trick to thread through the
;; tree to a given node. No bounds checking is done here
; root - the root of the tree
; index - the index number of the node (1 = root)
(defun find-node (root index)
  (multiple-value-bind
   (n-idx rem) (floor index 2)
   (if (= n-idx 1)
     (find-node-pick-child root rem)
     (find-node-pick-child (find-node root n-idx) rem))))

(defun find-node-pick-child (nd rem)
  (if (= rem 0) (node-lchild nd) (node-rchild nd)))

When I tried to execute the statement (which I expected to attach new-node to the place returned by find-node)

(setf (find-node root 8) new-node)

I ran into compiler errors.

This led me on a very interesting chase all round the internet, with the phrase “generalized variables” turning up a lot. It turns out that (setf …) is a Lisp macro that can figure out how to get hold of the “place” its first argument refers to. If the first argument is just a variable, (setf …) acts just like the assignment statement in any other language. If the first argument is a function call, Lisp will (effectively) locate the place that call refers to and store the new value there. For this to work, however, the function has to come with a specially named companion, defined as (defun (setf name) ...), and such functions are called accessor functions.

Following advice from answers on comp.lang.lisp, I came up with the following code:

;; Use the binary representation trick to thread through the
;; tree to return a tree node and some auxiliary information
;; No bounds checking is done here
; root - the root of the tree
; index - the index number of the node (1 = root)
; Returns two values
; node - a node of the tree
; info - :me - this is the node we're looking for
;        :lchild - we want the left child
;        :rchild - we want the right child
(defun find-node-with-index (root index)
  (multiple-value-bind
   (new-idx rem) (floor index 2)
   (cond
     ((= new-idx 0) (values root :me))
     ((= new-idx 1) (values root (if (= rem 0) :lchild :rchild)))
     ;; Recurse to find the parent node (index new-idx) first, then take
     ;; one step down. This consumes the bits most-significant-first,
     ;; mirroring find-node above; descending on rem directly would walk
     ;; the path in reversed bit order and land on the wrong node
     (t (multiple-value-bind
         (nd info) (find-node-with-index root new-idx)
         (values
          (if (eq info :lchild) (node-lchild nd) (node-rchild nd))
          (if (= rem 0) :lchild :rchild)))))))

(defun node-with-index (bh index)
  (multiple-value-bind
   (nd info) (find-node-with-index (bheap-root bh) index)
   (case info
     (:me (bheap-root bh))  ; Special case - root of heap
     (:lchild (node-lchild nd))
     (:rchild (node-rchild nd)))))

(defun (setf node-with-index) (new-node bh index)
  (multiple-value-bind
   (nd info) (find-node-with-index (bheap-root bh) index)
   (case info
     (:me (setf (bheap-root bh) new-node))  ; Special case - root of heap
     (:lchild (setf (node-lchild nd) new-node))
     (:rchild (setf (node-rchild nd) new-node)))))

I like the DRY principle, primarily because it reduces bugs and the total amount of code. I tried to abstract as much of the core functionality into a single function as I could, but the getter and setter still look very much like duplicates of each other, and I don’t think I can reduce them further.

Thanks go to Kaz Kylheku and Barry Margolin on comp.lang.lisp for answering my questions on this topic.

Links-Rechts, Links-Rechts, Ahnentafel!

While looking into priority queues and binary heaps, I ran across two gems, one of which may not be so widely known. Consider a complete binary tree with nodes sequentially numbered:

[Figure: a complete binary tree with nodes numbered sequentially in level order, 1 at the root]

You will observe that every node’s parent is the node’s own number divided by two, using integer division. For example, the parent of 11 is 11/2 = 5 (integer division), the parent of 14 is 14/2 = 7, and so on. Way back in 1590 a genealogist used this numbering pattern to organize people’s ancestries, calling it an Ahnentafel. From the article:

The subject (proband or progenitor) of the ahnentafel is listed as No. 1, the subject’s father as No. 2 and the mother as No. 3, the paternal grandparents as No. 4 and No. 5 and the maternal grandparents as No. 6 and No. 7, and so on, back through the generations. Apart from No. 1, who can be male or female, all even-numbered persons are male, and all odd-numbered persons are female. In this schema, the number of any person’s father is double the person’s number, and a person’s mother is double the person’s number plus one. Using this definition of numeration, one can derive some basic information about individuals who are listed without additional research.

This construct displays a person’s genealogy compactly, without the need for a diagram such as a family tree. It is particularly useful in situations where one may be restricted to presenting a genealogy in plain text, for example, in e-mails or newsgroup articles. In effect, an ahnentafel is a method for storing a binary tree in an array by listing the nodes (individuals) in level-order (in generation order).
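
The ahnentafel relations are simple enough to state as code. A trivial Python sketch, with function names of my own choosing:

def father(n): return 2 * n       # a person's father is double their number
def mother(n): return 2 * n + 1   # and their mother is double plus one
def child(n): return n // 2       # back toward the proband (No. 1)

print(father(5), mother(5), child(5))  # 10 11 2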

Now, suppose we write out the exact same complete binary tree, with the sequential numbering in binary:

[Figure: the same complete binary tree with the node numbers written in binary]

Do you notice a striking pattern? The children of any given node are just the parent’s number shifted left by one bit, with a 0 or 1 in the new last bit depending on whether they are the left or right child. So, for example, the children of 101 are 1010 and 1011. This is the basis for the following neat algorithm (which I first came across here) for finding a given node in this tree.

To find the path from the root (here node #1) to some other node, say node #11, express the target node number in binary (1011) and discard the first digit, which is always 1, leaving 011. Each remaining digit successively indicates the child to pick (0 = left, 1 = right) as you traverse the path. So, to get to node #11 you would follow 011: left, right, right. (I read this twice before realizing the choice of eleven (11) is peculiarly bad in this context. Here’s the example using 12: 12 => 1100 => 100 => right, left, left.)
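
In Python this takes a couple of lines. A minimal sketch (path_to is a name I made up):

def path_to(index):
  """Left/right turns from the root (node 1) down to node `index`."""
  bits = bin(index)[2:]  # e.g. 12 -> '1100'
  return ['left' if b == '0' else 'right' for b in bits[1:]]  # drop the leading 1

print(path_to(12))  # ['right', 'left', 'left']
print(path_to(11))  # ['left', 'right', 'right']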

This algorithm is useful when using a pointer-based representation and, for example, inserting into or deleting from a binary heap. This pattern, as you will have guessed, arises because the number representation and the tree are both binary.

PS. Pardon my terrible title. The Ahnentafel put “German” in my head, traversing the tree put “left, right” in my head, and watching too many World War II movies in childhood did the rest.

PPS. Python code to generate the trees, not that you need it.

from matplotlib import pyplot as plt

def node_xy(k, n, dx):
  """Figure position for node n, which lives on level k (root level = 0)."""
  n0, n1 = 2**k, 2**(k+1) - 1
  center = (n0 + n1) / 2.0
  return (n - center) * dx / n0, 0.5 - k / 10.0

def plot(ax, max_k):
  dx = 1.0
  for k in range(max_k):
    for n in range(2**k, 2**(k+1)):
      x, y = node_xy(k, n, dx)
      # ax.text(x, y, str(n), ha='center', va='bottom')  # decimal labels
      ax.text(x, y, bin(n)[2:], ha='center', va='bottom')  # binary labels
      if k:  # draw an edge up to the parent, node n // 2
        px, py = node_xy(k - 1, n // 2, dx)
        ax.plot([px, x], [py, y + 0.025], 'k')

fig = plt.figure()
ax = plt.subplot(111)
plot(ax, 5)
plt.setp(ax, xlim=[-0.5, 0.5], ylim=[0.0, 0.6])
plt.axis('off')
plt.show()

Problem 11

I’m using the problems in the Advent of Code 2016 to motivate my study of Common Lisp though, sadly, I’ve not been devoting much time to it recently. Luckily several of my colleagues are also doing the challenge, and one of them came by and described what sounded like a particularly interesting problem, number 11, so I decided to try my hand at it in Lisp. It promises to be engaging and instructive.

I’m not going to repeat the problem here (but some of what I’ll write will only make sense in the context of its details); in the abstract it is a path-finding problem: the problem world is in some initial state, there is a set of intricate rules by which you can change the world state, and there is an end goal state. The task is to find the smallest number of steps to that goal.

For me this is a rich problem both in terms of algorithms and language. I think I’ll try and make short posts about the steps I take. My course of action will be (and these may change, of course):

  1. Figure out what algorithms apply to this problem
  2. Think of data structures that will help
  3. Implement this in Common Lisp, from scratch where it looks interesting
    1. I’ve had at least one case so far where I had a naive way of coding things and then I discovered a different Lisp idiom that made it better. I hope to show these examples too.

There are several tutorials and expositions on path finding. The one I like the best is from Red Blob Games. It has interactive visualizations of the algorithms which are very motivating and the tutorial itself starts slow and easy and makes nice progress to end at the A* algorithm. If you want a refresher, please refer to that. If you’ve read Problem 11 you’ll note that it looks different, but with the proper abstraction, this is what it is.

Now let’s take up where the tutorial leaves off. The A* algorithm seems to be the ticket and we want to figure out some implementation details. The interesting things the A* algorithm requires are a priority queue and a heuristic for determining how far a given state is from the end state.

Popular lore has it that a Fibonacci heap is a good implementation of the priority queue for this application, but The Algorithm Design Manual points us to rank-pairing heaps as something that does better in practice. I found this paper on rank-pairing heaps. The authors have written it very well: they start with a nice systematic introduction to heaps and then slowly introduce the modifications they make to existing structures and algorithms.

To start with, however, I’m going to implement a priority queue using a simple binary heap (there is a good illustrated guide here). At a later stage, when the whole thing is worked out, I’ll come back and implement rank-pairing heaps.

The data structure holding the world state can be separated from all this priority queue business, but it interfaces with the A* algorithm in two places: 1) the world state has to be amenable to hashing, so that we can put it in a map-like structure, and 2) the implementation of the rules for going from state to state, i.e. the manner in which we find the child nodes of a given parent node, can, we intuit, be made considerably easier or harder by our choice of world state representation.

This whole thing is going to go down as a series of blog posts and I’ll link to each one (creating a linked list) but also try and set them out here, so that this post serves as a table of contents.

  1. Binary heap priority queue
    1. Learned about accessor functions
    2. And print-object, which is what Python’s __repr__ is modeled on.
    3. and a diversion
  2. A* search framework
  3. Data structure and functions to handle world state
  4. Completion of problem 11
  5. (Bonus) Rank-pairing heap implementation
  6. A wrap up?

Holy $#!* THAT’s a Lisp macro!

Still struggling to understand what’s so special about Lisp, I came across a post linked on lobste.rs, innocuously titled “Hash Table Syntax in Common Lisp”. It had the arcane macro syntax I’d seen in some other examples and I was about to skim past it when I suddenly saw the expression {:key1 => "value1", :key2 => "value2"} being blithely evaluated by the REPL as if it were just any other Lisp expression. I sat there with my mouth slightly open.
 
Lisp, and Common Lisp is no exception, has a very basic syntax. Everything, and I mean everything, consists of function-style evaluations of the form (fun arg1 arg2 ...), and here was this expression full of braces and commas and odd operators being turned into a Common Lisp hash table.

There has to be some catch, I thought. It has to be invoked in a special way or something. Turns out, no. You can just paste the macro into the REPL and start using the curly braces notation.

I won’t claim that I “understand” Lisp. I won’t even claim I think Lisp is an amazing language – at this point I’m willing to be shown it is, but I still haven’t thought “Ah! I’m going to be using Lisp instead of Python (or C++) for my next project.” However, seeing this example I can attest that I had that religious feeling that some folks have claimed after studying Lisp. An Oh $#!@ moment, in today’s profaner terms.

And quite suddenly, I was taken back a decade and a half, to my first exposure to TeX and LaTeX. People have done crazy things using TeX, demonstrating that it is a programming language disguised as a typesetting system. The few times I dared to mess with TeX I was really modifying LaTeX macros to improve the style of some document I was working on – most probably procrastinating on my master’s thesis.

Both the effect of this Lisp macro and its general appearance remind me of the terse, somewhat arcane, alien form of LaTeX macros and styles. Both change the syntax of the language you are using to make what you are doing more convenient.

I suppose this is not an accident. Both Lisp and TeX come from an era when people fiddling with computers really liked to get into the guts of everything, understand it and change it to their own personal taste.

Further posters on lobste.rs pointed out to me that this is a “reader macro”, which operates at read (parse) time, rather than a regular macro, which operates at compile time. All in all, this was a very informative series of comments for me!

Sleep on it

I was pretty annoyed. I had installed a program into a docker container, invoked the container via our cloud computing platform, and it was garbling my output files – with the exact same inputs and parameters I had just tested on my local machine.

At first, I didn’t realize anything was wrong. The program completed without complaints, and the log file indicated a result of a certain size, corresponding to the result I had obtained locally. However, when I fed the output file into another tool it failed, with an esoteric and very uninformative error message (yay bioinformatics tools); having seen this message several times before, I’ve come to understand that it usually means the input is truncated.

Sure enough, when I downloaded the file from the platform onto my local machine and tried to gunzip it, gunzip ran for a while before complaining of a truncated file.

Now, I had just come out of dealing with an annoying issue related to running docker containers on our platform. When the containers ran, they ran using a non-interactive non-login shell. This leads to subtle differences in how many things work, which will take up a different post, but basically I was in a foul mood already.

The code I actually wanted to run looked like this

my-program \
  input-file \
  >(gzip > outputfile1) \
  >(gzip > outputfile2)

It turns out that the non-interactive shell on the Ubuntu base image I used for my tool image is dash, rather than bash, and dash doesn’t do process substitution; it complains sh: 1: Syntax error: "(" unexpected

To get around this expediently I wrapped my program in a bash script my-script.sh and invoked it on the platform as bash my-script.sh. This worked, but now my file was getting truncated.

Same code, same data, same docker image. The container executed correctly on my Mac and corrupted my file on the AWS Linux machine. I cross-checked that I was closing my file in my code, I read up a bit about flushing (close does it) and fsync (named pipes don’t have it – they never touch disk), and noted that Python will raise an exception if something goes wrong with a write.

After some more thrashing about like this, I suddenly wondered about the multiple processes involved. By doing process substitution, I had created two new processes – both running gzip – that were sucking up data from pipes connected to the process my main program was running in. It was quite possible that the two gzip processes were finishing slightly after my main process was done. Now suppose, just suppose, the original bash shell I was invoking to run my program terminated as soon as my main program was done, without waiting for the gzip processes? That would produce exactly this kind of file truncation.

I put in a wait in the shell script, but this did not make any difference. I wasn’t explicitly sending any processes to the background, so this was perhaps expected. Then I added a sleep 2s command to the script:

my-program \
  input-file \
  >(gzip > outputfile1) \
  >(gzip > outputfile2)

sleep 2s # <--- BINGO!

and indeed my files were no longer being truncated. So what was happening was that the parent shell had no way of ‘realizing’ that the two gzip subprocesses were still running: it exited, the docker container shut down, and my gzips were killed just before they were done writing the last bit of data.

Ah, the joys. The joys.

Some notes on using Python multiprocessing

This is a short note describing a common code pattern useful for parallelizing some computations using the Python multiprocessing module.

Many problems are of the embarrassingly parallel type, where the task consists of the same set of computations done independently on a large set of input data. In many languages such problems are solved using multi-threaded code. Due to events too complicated and embarrassing to mention, Python doesn’t do too well with multiple threads, and we need to use multiple processes via the multiprocessing module instead.

A common pattern is to set up a number of workers and use queues to pass data back and forth. The pattern looks like this:

[Figure: a parent process feeding work to several child processes through an input queue and collecting results through an output queue]

Using this pattern is relatively straightforward, but there are some things to pay attention to.
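
Here is a minimal sketch of the pattern; all the names (worker, the None stop sentinel, the queue bounds) are illustrative choices, not prescriptions:

import multiprocessing as mp

def worker(in_q, out_q):
  """Pull items until the None sentinel arrives; push results back."""
  for item in iter(in_q.get, None):
    out_q.put(item * item)  # stand-in for the real computation

if __name__ == '__main__':
  in_q, out_q = mp.Queue(maxsize=1000), mp.Queue(maxsize=1000)
  workers = [mp.Process(target=worker, args=(in_q, out_q)) for _ in range(4)]
  for p in workers:
    p.start()
  for n in range(100):
    in_q.put(n)
  for _ in workers:
    in_q.put(None)  # one stop sentinel per worker
  results = [out_q.get() for _ in range(100)]  # drain before joining;
  for p in workers:                            # arrives in completion order
    p.join()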

Sharing read-only data

The general topic of sharing data between processes is worthy of deeper discussion, and there are some pointers in the Python documentation, but there is an easy way to share read-only data between processes: use globals that are written in the parent process BEFORE the child processes are started.

If you declare a global variable and then modify it after the child processes are started, or if you modify the global from a child process, you’ll enter the twilight zone. There will be no errors, and from within the process doing the writing everything will look consistent, but the rest of the universe will pay no attention to your writes (each forked child works on its own copy of the parent’s memory, so the write is invisible everywhere else).

Queues have overhead – the object has to be pickled by the sender and then un-pickled by the receiver, and while the object is in the queue, it’s taking up memory. If you have a large amount of read-only data that needs to be shared amongst child processes – like a lookup table of primes, for example – this pattern of using globals can be pragmatic.
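
A minimal sketch of the globals trick, assuming a fork-based start method (under spawn the child re-imports the module and this trick does not apply); PRIMES and the function names are mine:

import multiprocessing as mp

PRIMES = None  # global lookup table, filled in by the parent before forking

def worker(n):
  return n, n in PRIMES  # children read the inherited global

if __name__ == '__main__':
  # mp.set_start_method('fork')  # uncomment if your platform defaults to spawn
  PRIMES = {2, 3, 5, 7, 11, 13}  # written BEFORE the children start
  with mp.Pool(2) as pool:
    print(pool.map(worker, range(10)))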

 

Explicitly set Queue size

On Mac OS X the default size of the queue is 32767 items. On Linux, the default size is … INFINITY. I learned this the hard way. I usually prototype my code on my Mac, and then deploy to a Linux container that runs god knows where, on data that is several orders of magnitude larger than my prototype set. On Linux things SEEMED to run fine, but the size of the result set was abnormally small compared to the input. After a lot of poking around I discovered this property of the Python implementation: on Linux my queues got larger and larger, taking up more and more RAM, until the OS started to kill off the child processes. Setting an explicit queue size makes the producers block when the queue becomes full, capping memory usage.
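
The fix is a one-liner; the bound (10000 here) is arbitrary and workload-dependent:

import multiprocessing as mp

q = mp.Queue(maxsize=10000)  # put() now blocks once 10000 items are waiting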

Wrap the child process function using traceback

Error messages from child processes are messy and uninformative. Importantly, the friendly Python stack trace printout is missing. A fairly convenient pattern is to wrap the actual function call in a try-except block in a thin outer function, using the traceback module, like so:

def func_w(args):
  """A thin wrapper to allow proper tracebacks when things go wrong in a child process

  :param args: dict of keyword arguments for func, the real worker function
  :return:
  """
  import traceback
  try:
    return func(**args)
  except Exception as e:
    traceback.print_exc()
    print('')
    raise e

Check for normal termination from the child

One of the reasons it took me so long to track down the above-mentioned memory error was that when the child processes were being killed by the OS, no Python error was raised. The principled way to check for normal termination is to look at the exitcode value of the process.
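
A minimal sketch of that check; work is a stand-in for the real child function:

import multiprocessing as mp

def work():
  pass  # stand-in for the real child function

if __name__ == '__main__':
  p = mp.Process(target=work)
  p.start()
  p.join()
  if p.exitcode != 0:  # a negative value means the child was killed by a signal
    raise RuntimeError('Child died abnormally, exitcode={}'.format(p.exitcode))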

When not to use this pattern

Passing data via queues can be expensive in terms of computation and memory. If your individual workers have to pass a lot of data back and forth – say large submatrices of an even larger matrix – it may be neater and faster to use the shared memory objects supplied by the multiprocessing module, which take care of the semaphores for you.
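
As a rough sketch of that alternative, again assuming a fork-based start method, with an array layout of my own invention:

import multiprocessing as mp

N_ROWS, N_COLS = 4, 3
shared = mp.Array('d', N_ROWS * N_COLS)  # flat array of doubles, with a lock

def scale_row(args):
  row, factor = args
  with shared.get_lock():  # guard concurrent writes
    for j in range(N_COLS):
      shared[row * N_COLS + j] *= factor

if __name__ == '__main__':
  for i in range(N_ROWS * N_COLS):
    shared[i] = i
  with mp.Pool(2) as pool:
    pool.map(scale_row, [(r, 2.0) for r in range(N_ROWS)])
  print(list(shared))  # the children's writes are visible here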

Named pipes, process substitution and tee

Most people using bash are familiar with piping: the ability to direct the output of one process to the input of another, allowing the second process to work on data as soon as it becomes available. What is not so well known is that piping is a particular case of a more general and powerful inter-process communication system that bash gives us.

The basic bash piping flow looks like:

[Figure: cmd1’s stdout flowing through a pipe into cmd2, whose output goes to a file]

In this setup cmd1 writes to stdout and the pipe operator (|) redirects this to cmd2, which is set up to accept input from stdin and which finally either writes directly to a file, or writes to stdout again, which is then redirected to a file. The bash command looks like

cmd1 | cmd2 > file

There are two great advantages to this kind of data flow. Firstly, we skip writing cmd1’s data to disk, which saves a potentially very expensive operation. Secondly, cmd2 runs in parallel with cmd1 and can start processing cmd1’s results as soon as they become available.

(This kind of structure is of course restricted to a data flow that can work in chunks, i.e. cmd2 does not need the whole output of cmd1 before starting work.)

Now suppose that our first program produces multiple files and so cannot simply write everything to stdout. Our second program, similarly, accepts two files as input and produces two files as output.

[Figure: cmd1 writing two output streams, which cmd2 reads as its two inputs, producing file1 and file2]

Here cmd1 expects to write two streams of data to two files, whose names are represented by the first and second arguments to cmd1. cmd2 expects to read data from two files and write data to two files, all represented by four positional arguments.

Can we still use pipes? Can we still get parallelization for free if cmd1 and cmd2 can work in chunks? It turns out that we can, using something called named pipes. The above data flow, in bash, can be executed as follows

mkfifo tf1
mkfifo tf2
cmd1 tf1 tf2 &
cmd2 tf1 tf2 file1 file2

The fun thing here is the mkfifo command, which creates named pipes, also called First In First Out (FIFO) structures. After executing the two mkfifo commands, if you do a directory listing you will see two odd files named tf1 and tf2. They have no size, and their permissions string starts with ‘p’.

ls -al 
prw-r--r--   1 kghose  staff      0 Oct 26 21:35 tf1
prw-r--r--   1 kghose  staff      0 Oct 26 21:35 tf2

Whenever a process asks to write to or read from these ‘files’, the OS sets up an in-memory buffer. The process reading from the buffer blocks until data is placed in it. The process writing to the buffer also blocks until it can be sure there is a receiving process at the other end. Once both ends are connected, an in-memory transfer of data occurs from one process to the other, never touching the disk.

To me the amazing thing is that the original programs cmd1 and cmd2 don’t know anything about “pipes”. They just expect to write to/read from files, and the OS presents an interface that looks just like a file. (There are differences, and it is possible to write programs that won’t work with pipes. Any program, for example, that needs to do random access on the file won’t work with this paradigm.)
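
The same behaviour is easy to play with from a language like Python. A small sketch, assuming a POSIX system (the FIFO path and the writer function are mine):

import os
import multiprocessing as mp

FIFO = '/tmp/demo_fifo'  # illustrative path

def writer():
  with open(FIFO, 'w') as f:  # blocks until a reader opens the other end
    f.write('hello through a named pipe\n')

if __name__ == '__main__':
  os.mkfifo(FIFO)
  p = mp.Process(target=writer)
  p.start()
  with open(FIFO) as f:  # blocks until the writer connects
    print(f.read())
  p.join()
  os.unlink(FIFO)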

Now, what if you want to maintain this data flow, but in addition, want to save the original data after passing it through another process, say a compression program?

[Figure: the same two-stream flow, with each stream additionally saved to disk through gzip via tee]

Now things get a little involved, and we’ll invoke two additional concepts: process substitution and the tee command.

Consider the command

cmd1 >(gzip > fileA) >(gzip > fileB) &

In the places where cmd1 expects to find file names we have supplied it with very strange looking constructs! The pattern >(...) is called process substitution. It tells Bash to construct a named pipe on the fly, hand its name to cmd1 as the argument, and connect it to the input of the command in the parens (in this case the gzip command). We do this twice because we want both outputs to be gzipped.

Symmetrically, we can do process substitution with the pattern <(...). Here we tell Bash to send the output of the process in the parens via a named pipe to the relevant input of the other process.

Now consider the command

cmd1 >(tee > tf1 >(gzip > fileA)) >(tee > tf2 >(gzip > fileB)) &

The new thing here is the tee command (plus the nesting of process substitutions, which is just neat). The tee command takes input from stdin and copies it to stdout as well as to any files supplied as arguments. In this case tee’s stdout is redirected to our named pipe (the > tf1 construct), and the ‘file’ it writes to is the nested process substitution >(gzip > fileA).

So, the data flow drawn above can be represented as:

mkfifo tf1
mkfifo tf2
cmd1 >(tee > tf1 >(gzip > fileA)) >(tee > tf2 >(gzip > fileB)) &
cmd2 tf1 tf2 file1 file2

Part of the subtle magic in all of this is that it’s a bit like functional programming, with parallelization managed by the OS. We’re crunching data in one process as soon as it has emerged from another process. We’ve managed to isolate our concerns but still parallelize some of the work! And if we take our programs to some non-fancy operating system that can’t do all these fireworks, we can still run the code just fine, except that we’ll be writing to disk files and operating serially.

Frankly, the people who developed *nix systems are geniuses.

(But a distant memory comes back to me, of sitting at an IBM PC in what we called “The Measurement and Automation Lab” at Jadavpur University in far, far Calcutta, and an excited professor explaining to me how a RAMdisk worked in PC DOS, though this is clearly not of the same calibre as the *nix features.)

A short rant on Darwin

So when I first tried this all out, the code did not work on my Mac. I was pretty sure I was doing things correctly but things just hung and I had to kill the processes. When I tried the exact same commands on a Linux box, things worked as expected.

Then I changed the way I was reading from the pipe to:

python cmd2.py <(cat < tf1) <(cat < tf2) fo1.txt fo2.txt

Using a – seemingly redundant – process substitution via cat to read the pipes made the code work on my Mac.

So we have here, again, a subtle way in which Darwin’s bash differs from Linux’s and lurks, waiting, to screw you up and waste your time. It’s a pity that the Mac has taken off among developers (myself included): it’s clear Darwin will never be a proper *nix, and the adoption of Macs by developers probably takes some energy away from getting Linux onto laptops.

Further reading