Links-Rechts, Links-Rechts, Ahnentafel!

While looking into priority queues and binary heaps, I ran across two gems, one of which may not be so widely known. Consider a complete binary tree with nodes sequentially numbered:

binary-tree

You will observe that the parent of every node is the integer division of itself by two. For example, the parent of 11 is 11/2 = 5 (integer division), the parent of 14 is 14/2 = 7 and so on. Way back in 1590 a genealogist used this numbering pattern to organize people’s ancestries, calling it an Ahnentafel. From the article:

The subject (proband or progenitor) of the ahnentafel is listed as No. 1, the subject’s father as No. 2 and the mother as No. 3, the paternal grandparents as No. 4 and No. 5 and the maternal grandparents as No. 6 and No. 7, and so on, back through the generations. Apart from No. 1, who can be male or female, all even-numbered persons are male, and all odd-numbered persons are female. In this schema, the number of any person’s father is double the person’s number, and a person’s mother is double the person’s number plus one. Using this definition of numeration, one can derive some basic information about individuals who are listed without additional research.

This construct displays a person’s genealogy compactly, without the need for a diagram such as a family tree. It is particularly useful in situations where one may be restricted to presenting a genealogy in plain text, for example, in e-mails or newsgroup articles. In effect, an ahnentafel is a method for storing a binary tree in an array by listing the nodes (individuals) in level-order (in generation order).

Now, suppose we see the exact same complete binary tree and sequential numbering in binary:

binary-tree-in-binary

Do you notice a striking pattern? The children of any given node are basically the binary of the parent left shifted by one with a 0 or 1 in the new last bit, depending upon whether they are the left or right child. So, for example, the children of 101 are 1010 and 1011. This is the basis for the following neat algorithm (which I first came across here) for finding a given node in this tree.

To find the path from the root (here node #1) to some other node, say (11) express the target node number in binary (1011) and discard the first digit – which is always 1 – (011). Each digit now successively indicates the child to pick (0 = left, 1= right) as you traverse the path. So, to get to node #11 you would take (011) left, right, right. (I read this twice before realizing the choice of eleven (11) is peculiarly bad in this context. Here’s the example using 12. 12 => 1100 => 100 => right, left, left)

This algorithm is useful when using a pointer-based representation and, for example, inserting or deleting from a binary heap. This pattern, as you have guessed, comes from the number representation and the tree being both binary based.

PS. Pardon my terrible title. The Ahnentafel put “German” in my head, traversing the tree put “left, right” in my head and watching too many world war II movies in childhood did the rest.

PPS. Python code to generate the trees, not that you need it.

from matplotlib import pyplot as plt

def node_xy(k, n, dx):
  n0, n1 = 2**k, 2**(k+1)-1
  center = (n0 + n1) / 2.0
  return (n - center) * dx / n0, 0.5 - k / 10.0

def plot(ax, max_k):
  dx = 1.0
  for k in range(max_k):
    for n in range(2**k, 2**(k+1)):
      x, y = node_xy(k, n, dx)
      # ax.text(x, y, str(n), ha='center', va='bottom')
      ax.text(x, y, bin(n)[2:], ha='center', va='bottom')
      if k:
        px, py = node_xy(k - 1, int(n/2), dx)
        plt.plot([px, x], [py, y + 0.025], 'k')

fig = plt.figure()
ax = plt.subplot(111)
plot(ax, 5)
plt.setp(ax, xlim=[-0.5, 0.5], ylim=[0.0, 0.6])
plt.axis('off')

Problem 11

I’m using problems in The Advent of Code 2016 to motivate my study of Common Lisp though, sadly, I’ve not been devoting time to it recently. Luckily several of my colleagues are also doing the challenge and one of them came by and described what sounded like a particularly interesting problem – number 11 – , and I decided to try my hand at it in Lisp. It promises to be engaging and instructive.

I’m not going to repeat the problem here (but some of the stuff I’ll write will only make sense in the context of the details of the problem); in the abstract it is a path finding problem: The problem world is in some initial state, there are set of intricate rules by which you can change the world state, and there is an end goal state. The task is to find the shortest number of steps to that end goal.

For me this is a rich problem both in terms of algorithms and language. I think I’ll try and make short posts about the steps I take. My course of action will be (and these may change, of course):

  1. Figure out what algorithms apply to this problem
  2. Think of data structures that will help
  3. Implement this in Common Lisp, from scratch where it looks interesting
    1. I’ve had at least one case so far where I had a naive way of coding things and then I discovered a different Lisp idiom that made it better. I hope to show these examples too.

There are several tutorials and expositions on path finding. The one I like the best is from Red Blob Games. It has interactive visualizations of the algorithms which are very motivating and the tutorial itself starts slow and easy and makes nice progress to end at the A* algorithm. If you want a refresher, please refer to that. If you’ve read Problem 11 you’ll note that it looks different, but with the proper abstraction, this is what it is.

Now let’s take up where the tutorial leaves off. The A* algorithm seems to be the ticket and we want to figure out some implementation details. The interesting things the A* algorithm requires are a priority queue and a heuristic for determining how far a given state is from the end state.

Popular lore has it that, for this application, a Fibonacci heap is a good implementation for the priority queue for this application, but The algorithm design manual points us to rank-paired heaps as something that is better in practice. I found this paper on rank-pairing heaps. The authors have written the paper very well: They start with a nice systematic introduction to heaps and then slowly introduce modifications they make to existing structures and algorithms.

To start with however, I’m going to implement a priority queue using a simple binary heap (there is a good illustrated guide here). At a later stage, when the whole thing is worked out, I’m going to come back and implement rank-pairing heaps.

The data structure for holding the world state can be separated out from all this priority queue business, but it does interface with the A* algorithm in two places 1) The world state has to be amenable to hashing, such that we can put it in a map-like structure and 2) The implementation of rules for going from state to state i.e. the manner in which we find child nodes of a given parent node can, we intuit, be made considerably easy or difficult based on our choice of world state representation.

This whole thing is going to go down as a series of blog posts and I’ll link to each one (creating a linked list) but also try and set them out here, so that this post serves as a table of contents.

  1. Binary heap priority queue
    1. Learned about accessor functions
    2. And print-object which is what Python’s __repr__ is modeled on.
    3. and a diversion
  2. A* search framework
  3. Data structure and functions to handle world state
  4. Completion of problem 11
  5. (Bonus) Rank-pairing heap implementation
  6. A wrap up?

Holy $#!* THAT’s a Lisp macro!

Still struggling to understand what’s so special about Lisp, I came across a post linked on lobste.rs, innocuously titled “Hash Table Syntax in Common Lisp“. It had the arcane macro syntax I’d seen in some other examples and I was about to skim past it when I suddenly saw the expression {:key1 => "value1", :key2 => "value2"} being blithely evaluated by the REPL as if it was just any other Lisp expression. I sat there with my mouth slightly open.
 
Lisp, Common Lisp no exception, has a very basic syntax. Everything, and I mean everything, consists of function evaluations, and the evaluations all have the format (fun arg1 arg2 ...) and here was this expression full of braces and commas and odd operators being turned into a common lisp hash table.

There has to be some catch, I thought. It has to be invoked in a special way or something. Turns out, no. You can just paste the macro into the REPL and start using the curly braces notation.

I won’t claim that I “understand” Lisp. I won’t even claim I think Lisp is an amazing language – at this point I’m willing to be shown it is, but I still haven’t thought “Ah! I’m going to be using Lisp instead of Python (or C++) for my next project.” However, seeing this example I can attest that I had that religious feeling that some folks have claimed after studying Lisp. An Oh $#!@ moment, in today’s profaner terms.

And quite suddenly, I was taken back a decade and a half, to my first exposure to TeX and LaTeX. People have done crazy things using TeX, demonstrating that it is a programming language disguised as a typesetting system. The few times I dared to mess with TeX I was really modifying LaTeX macros to improve the style of some document I was working on – most probably procrastinating on my master’s thesis.

Both the effect of this Lisp macro, and it’s general appearance remind me of the terse, somewhat arcane alien, form of LaTeX macros and styles. Both change the syntax of the language you are using to make what you are doing more convenient.

I suppose this is not an accident. Both Lisp and TeX come from an era when people fiddling with computers really liked to get into the guts of everything, understand it and change it to their own personal taste.

Further posters on lobste.rs point out to me that this is a “reader-macro” which operates during parse time, rather than a regular macro, which operates during compile time. All in all, this was a very informative series of comments for me!