Derived classes and std::vector

I want to talk about a fun corner of C++ I ran into recently. It has to do with how to handle calling member functions of a list of heterogeneous objects.

My use case was a bit like the following: say we have a bunch of different functions – a * x^2, b * x, c/x – and we want to apply them one after the other to a series of x values. Which functions are included, and the order in which they are applied, are user defined.

My solution was to have a base class
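
Something along these lines (a sketch – the names and coefficients here are made up):

class Func {
public:
  virtual ~Func() {}
  // dummy base implementation: just returns 0
  virtual double apply(double x) { return 0.0; }
};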

and three derived classes
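
Again a sketch, with one subclass per function, each overriding apply():

class Quadratic : public Func {   // a * x^2
public:
  explicit Quadratic(double a) : a(a) {}
  double apply(double x) override { return a * x * x; }
private:
  double a;
};

class Linear : public Func {      // b * x
public:
  explicit Linear(double b) : b(b) {}
  double apply(double x) override { return b * x; }
private:
  double b;
};

class Inverse : public Func {     // c / x
public:
  explicit Inverse(double c) : c(c) {}
  double apply(double x) override { return c / x; }
private:
  double c;
};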

But when I put them in a list and applied them:
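
Roughly like this – again a sketch rather than the exact code:

#include <iostream>
#include <vector>

int main() {
  std::vector<Func> pipeline;          // the container holds Func objects
  pipeline.push_back(Quadratic(2.0));
  pipeline.push_back(Linear(3.0));
  pipeline.push_back(Inverse(4.0));

  double xs[] = {1.0, 2.0, 3.0, 4.0};  // the series of x values
  for (double x : xs) {
    double y = x;
    for (auto &f : pipeline) y = f.apply(y);  // apply each function in turn
    std::cout << y << std::endl;
  }
  return 0;
}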

This is what I got:

g++ --std=c++14 main_wrong.cpp -o test
./test
0
0
0
0

Whaaaaa? Waah! Why is it calling the dummy base class function? Isn't the WHOLE PURPOSE of virtual functions to ensure that the function in the derived class gets called?

The core problem here is that std::vector makes a copy of the object when we use the push_back function. We've declared the container's element type as the base class, so when a derived object is passed to push_back it gets "sliced" down: the stored copy is just a base-class object, losing all its derived attributes. (I first ran into object slicing thanks to "Cubbi"'s answer here.)
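
The same thing happens with a plain assignment, no vector required (sketch):

Quadratic q(2.0);
Func f = q;               // copy-constructs only the Func part of q
double y = f.apply(3.0);  // calls Func::apply, so y == 0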

So what can we do? We could use a vector of pointers instead. But the internet seems to think this is an unsafe idea. UNSAFE!? We don’t write C++ to be SAFE!! Oh, alright. We’ll practice safe code. What can we do?

The safest route is to use std::unique_ptr:
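
A sketch of the working version, using the same made-up classes as before:

#include <iostream>
#include <memory>
#include <vector>

int main() {
  std::vector<std::unique_ptr<Func>> pipeline;  // the vector owns pointers, not objects
  pipeline.emplace_back(new Quadratic(2.0));    // the objects are never copied, so nothing gets sliced
  pipeline.emplace_back(new Linear(3.0));
  pipeline.emplace_back(new Inverse(4.0));

  double xs[] = {1.0, 2.0, 3.0, 4.0};
  for (double x : xs) {
    double y = x;
    for (auto &f : pipeline) y = f->apply(y);   // virtual dispatch now reaches the derived classes
    std::cout << y << std::endl;
  }
  return 0;
}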

I'm not fully clear on the use of emplace_back instead of push_back: it certainly isn't for efficiency reasons here – the code just won't compile using push_back. (As far as I can tell, this is because unique_ptr's constructor from a raw pointer is explicit, so push_back(new Quadratic(2.0)) has no implicit conversion to lean on, while emplace_back constructs the unique_ptr in place from the raw pointer. Handing push_back an actual unique_ptr, e.g. via std::make_unique, compiles fine.)

So there you have it.

Development environment

While it can be an avenue for procrastination, searching for a proper development environment often pays off with greatly improved productivity in the long term. I had some finicky requirements. I wanted to maintain my own CMakeLists file so that the project was IDE/editor/platform independent. I wanted to be able to just add files to the project from anywhere – from VI, by OS copy, by git pull (I set wildcards in the CMakeLists.txt file) – and definitely not have to fiddle in the IDE to add files. I wanted to edit markdown files, which I was using for documentation and placing in a separate docs folder. I wanted to see a decent preview of the markdown occasionally. Ideally, my environment would simply track my directory organization, including non-C++ files. However, I also wanted intelligent code completion, including for my own code.

At work, where a ton of my current stuff is in Python, I happily use PyCharm, which has been a very worthwhile investment. So CLion was a natural candidate to start with. I didn't get very far, because I balked at the license terms. It felt a little annoying to have to apply for an open source license and then renew every year and so on. When I want to play with red tape I talk to the government.

I completely understand needing to charge people for your software. It's a necessity for maintenance and continued development. But you cannot get money from people who have none. To me the rational model is to charge professional entities for the software, make it free for non-professional use, and just not bother with the small fry. Basically, have a smallish legal department that goes after a big player if they violate the terms. For companies and individual professionals a software license is the least of their costs (and probably the item with the highest ROI), and they will gladly pay to have someone to go to for support. Chasing down individual hobbyists just doesn't make sense. I'm sorry, where did that soapbox come from? I must have stepped on it by accident. Moving on to the next editor/IDE …

I tried XCode next. It was a medium-ish install, and I think I needed an iTunes account. I was encouraged to use it because CMake can generate XCode projects and many coders working on Macs swore by it. I have used it infrequently – mostly when I was editing a largish Objective-C project during my postdoc – and found it fine to use and debug with. But I disliked how CMake organized the XCode project's virtual directories, and I disliked having to manually add non-source and new files. And memories of the over-complicated build system started to come back to me.

I then found something called CodeLite. It was confusing because I spent some time trying to figure out if it was related to Code::Blocks. CodeLite looked kind of promising. There was a nice tutorial on how to integrate with CMake based projects. But it looked cumbersome, and I had to add files to the project through the interface. By now, I was willing to endure this but I couldn’t get over the fact that there was no preview for Markdown. I liked that they limited the number of plugins, but I really wanted markdown preview. I told you I was finicky.

Then, quite by accident, I started to poke around Visual Studio Code and downloaded it to try it out. In brief, it seems to satisfy all my finicky requirements. This choice rather surprised me, because it was just not on my radar. I also realize this does nothing to advance my attempt to get into the cool kids' programming club where people use vim or emacs, but that was a lost cause anyway. (I now recall that I initially avoided VSC because a colleague spoke out very strongly against it a few months ago. Goes to show, you need to try things out for yourself.)

Why would I ever write in C?

I’ve made computers do tricks in both C/C++ and Python for many years now. For the past few years I’ve written exclusively in Python for both work and play. I recently went back to a personal project to rewrite it in C++ and, for the first time in my life, thought, “Why would I ever write in C?”

I have two personal projects (A note taker/reference manager and a photo organizer) that I’ve been working on for over a decade now. I started both in C++ using the QT widget library. I moved the note taker to Ruby (on-rails, I had my browser-as-GUI phase) then to Python (bottle is an amazing and lightweight web framework, you should try it) and then I abandoned it in favor of paper and pencil (but I don’t do much manuscript writing anymore, which accounts for me not needing a reference manager and making do with paper notes).

The photo organizer I made two versions of in C++ using QT, and then I moved it to Python. I was quite happy with the Python version, but I decided I wanted some changes, and it seemed to be a new decade, so I started to itch to move it back into C++. QT had changed a lot since 2012 (which I think was the last time I messed with the C++/QT version of my photo organizer). There were a lot of new bells and whistles and I got excited and I wanted to sit down and see what I could do.

I wanted to start off with a library of core functions and an associated test suite. In Python, this kind of code is very easy to set up: I put the library in a source directory and the tests in a directory marked 'tests' and use nosetests to automatically discover and run them. The tests come about organically as short pieces of code I write in the REPL to test newly added features as I smoothly build up the module.

The first thing I realized is that I missed the REPL. For twenty years I had done without a REPL. I would code up a skeleton of an application, craft the makefile, have function stubs and then just make sure it compiled. I would then add functionality by fleshing out the stubs and adding new functions, refactoring into new files and so on. But now, just the thought of spending hours, perhaps days, setting everything up without any feedback, raised a barrier. It made things not fun. It was WORK! This is not a good thing for a project done in spare time, for enjoyment.

The next thing I missed was nosetests. The scene for unit tests is now much better for C++, and QT has its own little framework for writing and running tests. But they all look so CLUNKY! You have to declare a class, then include a header file, then make an application out of that. You have to tediously add every test separately, then add every function separately, then add every test case.

I guess, even with IDEs like QT Creator, there was too much boilerplate, too much gap between me writing some code and seeing what it did. The compile cycle didn’t help. Doxygen sounded OK, but there was no smile on my face when I went to download it and see how I would be using it in my project.

[Image: 8085 microprocessor kit]

My first introduction to computers was via BASIC, then Pascal. I then went to C and C++. I had a class in 8085 assembly (We used kits packaged in neat wooden boxes. I enjoyed keying in the program and the data and then hitting go. My favorite project had been to write an infinite precision division program. I had seen Apollo 13 a few years back, and those kits and their interfaces reminded me of the DSKY).

My introduction to proper C++ was in a data structures and algorithms class, and the height of my C++ programming adventures was in a computer graphics class where I made animations out of particles in OpenGL and a kind of fly-through of a complex scene. These were done on SGI O2 machines at the Indian Institute of Science's Super Computer Center, where I would retire early each evening after dinner, because the O2 machines were limited in number and the cubicles would be quickly occupied.

I say all this, perhaps in a subconscious attempt to convince you that I’m not totally inexperienced in programming and do indeed intrinsically enjoy programming computers and understanding the art of programming computers, and don’t view it merely as a means to an end, which is to make these amazing calculating machines do clever tricks.


But I think Python has spoiled me forever in terms of the manner in which I enjoy computers. Writing C/C++ has become work. It's something I would do wearing a suit and tie (not that I wear a suit and tie at work – which should tell you something). It's something that I would do with a serious demeanor and a feeling, I imagine, mental patients have when wearing a straitjacket.

Boilerplate, boilerplate everywhere I look. Don’t smile. The doctors will take you away to the padded room. Better to look serious and keep your head down.

Performance you say? Well, to be honest, there are very few parts of my code where I need “C like” speed. And where I do, well, I can use CFFI, or more likely, I use Cython. It looks a little ugly, but hey, I have to write so little of it. The only annoying thing is inspecting the generated C code to make sure it’s getting rid of the dynamic Python calls.

Oh, yeah, pip install has spoiled me too. I ain’t chasing down your exotic library to read EXIF tag data in photo files. I haven’t got that kind of time. But I do have time to type pip install.

So, perhaps the only caveat I would add, in closing, is that the full title of this post should be "Why would I ever write in C for fun?"

Get more out of Cython

Getting more out of your Cython code doesn’t have to be a black art. A key step is to peek into the cythonized .c file and note if inner loops are getting down to bare C without any of the overheads associated with Python’s dynamism.

In an earlier post I introduced Cython, which is a really cool language that straddles Python and C. Cython is essentially Python with some C type declarations that help a translator (Cython) convert the Python code into a .c file, which can then be compiled and imported as a module. I'm one of the many folks who rave about this system because, just by changing .py to .pyx and running the code through Cython, you can get a 2x speedup for your functions. The next stage is to add some type declarations to your code. These give the real speedups.

I became curious as to what the type declarations actually do, and how they can help. Beyond curiosity, I was hoping that understanding this process would help me write more efficient Cython code.

I was working on a simple piece of code to determine the transition matrix for a DNA sequence. The DNA sequence is a string of characters from the set {A,C,G,T,N} (The N represents an unknown character, or, sometimes, a don’t care character). The transition matrix simply represents the probabilities that, in the sequence, character X will be followed by character Y, for every possible transition. This means a 5×5 matrix.

My first pass looked like this:
actg.pyx

def compute_transition_matrix(bytes seq):
  # Given a string, computes the transition matrix description of that string
  cdef:
    unsigned long int t_mat[85][85]
    unsigned long int seq_len = len(seq), n, m

  for n in [65, 67, 71, 84, 78]:
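    # 65, 67, 71, 84, 78 are the byte values of 'A', 'C', 'G', 'T', 'N'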
    for m in [65, 67, 71, 84, 78]:
      t_mat[n][m] = 0

  for n in range(1, seq_len):
    t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1

  # Prefer to copy the data into a python list, rather than mess around with pointer conversion etc.
  return [[t_mat[n][m] for m in [65, 67, 71, 84, 78]] for n in [65, 67, 71, 84, 78]]

Inspecting the actg.c code generated by running cython on this file I found that my inner loop looked like this:

  /* "actg.pyx":17
 *       t_mat[n][m] = 0
 * 
 *   for n in range(1, seq_len):             # <<<<<<<<<<<<<<
 *     t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1
 * 
 */
  __pyx_t_4 = __pyx_v_seq_len;
  for (__pyx_t_7 = 1; __pyx_t_7 < __pyx_t_4; __pyx_t_7+=1) {
    __pyx_v_n = __pyx_t_7;

    /* "actg.pyx":18
 * 
 *   for n in range(1, seq_len):
 *     t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1             # <<<<<<<<<<<<<<
 * 
 *   # Prefer to copy the data into a python list, rather than mess around with pointer conversion etc.
 */
    if (unlikely(__pyx_v_seq == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      {__pyx_filename = __pyx_f[0]; __pyx_lineno = 18; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    }
    __pyx_t_8 = (__pyx_v_n - 1);
    __pyx_t_9 = __Pyx_PyBytes_GetItemInt(__pyx_v_seq, __pyx_t_8, 1); if (unlikely(__pyx_t_9 == ((char)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 18; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __pyx_t_10 = ((unsigned char)__pyx_t_9);
    if (unlikely(__pyx_v_seq == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      {__pyx_filename = __pyx_f[0]; __pyx_lineno = 18; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    }
    __pyx_t_9 = __Pyx_PyBytes_GetItemInt(__pyx_v_seq, __pyx_v_n, 1); if (unlikely(__pyx_t_9 == ((char)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 18; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __pyx_t_11 = ((unsigned char)__pyx_t_9);
    ((__pyx_v_t_mat[__pyx_t_10])[__pyx_t_11]) = (((__pyx_v_t_mat[__pyx_t_10])[__pyx_t_11]) + 1);
  }

I know the first thing you are thinking is, "Man, this looks butt-ugly. Also, gotos?!?!" Well, the code is not really meant for human consumption, AND there are some things in here that make it easy to find what you are looking for. The first is that the name mangling preserves the core of the function name, so you can search for the function in an editor. The second thing, which turns out to be very, very important, is that the original code is included as comments alongside the relevant C code. And for this I take my hat off to the Cython folks, because they did not have to do this, but they put it in anyway, which makes our inspection a LOT easier.

Now, you can see right away that the inner loop has a bunch of funny checks related to making sure variable types are correct and have appropriate attributes etc. And I thought, I don’t need this cruft, I KNOW this is a string of bytes – an array of chars.

My change was to explicitly tell Cython that the bytes string was really an array of char (note the char *seq = s line below; the parameter is now called s):

def compute_transition_matrix(bytes s):
  """Given a string, computes the transition matrix description of that string"""
  cdef:
    unsigned long int t_mat[85][85]
    unsigned long int seq_len = len(s), n, m
    char *seq = s

  for n in [65, 67, 71, 84, 78]:
    for m in [65, 67, 71, 84, 78]:
      t_mat[n][m] = 0

  for n in range(1, seq_len):
    t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1

  # Prefer to copy the data into a python list, rather than mess around with pointer conversion etc.
  return [[t_mat[n][m] for m in [65, 67, 71, 84, 78]] for n in [65, 67, 71, 84, 78]]

The inner loop that results from this code looks like:

  /* "actg.pyx":35
 *       t_mat[n][m] = 0
 * 
 *   for n in range(1, seq_len):             # <<<<<<<<<<<<<<
 *     t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1
 * 
 */
  __pyx_t_5 = __pyx_v_seq_len;
  for (__pyx_t_8 = 1; __pyx_t_8 < __pyx_t_5; __pyx_t_8+=1) {
    __pyx_v_n = __pyx_t_8;

    /* "actg.pyx":36
 * 
 *   for n in range(1, seq_len):
 *     t_mat[<unsigned char> seq[n - 1]][<unsigned char> seq[n]] += 1             # <<<<<<<<<<<<<<
 * 
 *   # Prefer to copy the data into a python list, rather than mess around with pointer conversion etc.
 */
    __pyx_t_9 = ((unsigned char)(__pyx_v_seq[(__pyx_v_n - 1)]));
    __pyx_t_10 = ((unsigned char)(__pyx_v_seq[__pyx_v_n]));
    ((__pyx_v_t_mat[__pyx_t_9])[__pyx_t_10]) = (((__pyx_v_t_mat[__pyx_t_9])[__pyx_t_10]) + 1);
  }

Which looks A LOT BETTER.

This is reflected in some simple %timeit-based benchmarking, running this code on human chromosome 1. The first version takes 1.4s to run while the second version takes 0.27s – a 5x speedup, which corresponds to all that dynamic cruft being taken out of the inner loop.

One thing to note is that the PROPER way to do all this is to profile your code first and find out where the bottlenecks are. Sometimes doing %timeit on individual functions is all that is needed. Sometimes you will need to do a proper profile, and Cython has directives that let you profile Cython functions, but these can slow the code down.

CFFI-ize!

Ok, so it doesn't have as nice a ring to it as "Cythonize!", but Python's C Foreign Function Interface (CFFI) blew me away. I had previously used Cython to write code that was a mind-altering mix of Python and C-like static type defs and so on. It was impressive how much speed you could gain by just telling the interpreter the types of a few of your variables.

Cython, however, leads to cluttered code, losing some of the charm of Python. You also have these funny .pyx and .pxd files starting to lie around. With CFFI you can write straight C code, compile it separately into a library (if you wish – you don't have to), use it in a pure C project, get proper syntax highlighting, run it through a debugger – whatever. And then, with CFFI, you can very, very easily integrate it with your Python code, write unit tests for it, and so on and so forth. The documentation page also has a simple recipe for how to write setup scripts that include the C sources.

Here’s our Mandelbrot set example again, this time rewritten via CFFI.

The plain Python version

def f(x, y, max_iter=1000):
  """
  z <- z^2 + c
  """
  c = x + y*1j
  z = (0 +0j)
  for n in range(max_iter):
    z = z**2 + c
    if abs(z) > 2:
      return 1.0 - float(n)/max_iter
  else:
    return 0.0

def mandelbrot(x0=-2.0, y0=-1.5, x1=1.0, y1=1.5, grid=50, max_iter=1000):
  dx, dy = float(x1 - x0)/grid, float(y1 - y0)/grid
  return [[f(x0 + n * dx, y0 + m * dy, max_iter) for n in range(grid)] for m in range(grid)]

def quick_plot(grid=100, max_iter=10):
  import pylab
  pylab.imshow(mandelbrot(grid=grid, max_iter=max_iter), cmap=pylab.cm.gray)
  pylab.show()

The CFFI-zed version

import numpy
from cffi import FFI

ffi = FFI()
ffi.cdef("""
float f(float, float, int);
void mandelbrot(float, float, float, float, int, int, float*);
""")
lib = ffi.verify("""#include <fmandel.h>""", include_dirs=['./'], sources=['fmandel.c'])
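# verify() compiles the cdef'd declarations, the inline #include and fmandel.c into an
# extension module the first time it runs, then reuses the cached build on later imports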

def mandelbrot(x0=-2.0, y0=-1.5, x1=1.0, y1=1.5, grid=50, max_iter=1000):
  g = numpy.empty((grid, grid), dtype=numpy.float32)
  lib.mandelbrot(x0, y0, x1, y1, grid, max_iter, ffi.cast("float*", g.ctypes.data))
  return g

def quick_plot(x0=-2.0, y0=-1.5, x1=1.0, y1=1.5, grid=100, max_iter=10):
  import pylab
  pylab.imshow(mandelbrot(x0, y0, x1, y1, grid=grid, max_iter=max_iter), cmap=pylab.cm.gray)
  pylab.show()

Where fmandel.h is:

#include <stdlib.h>

float f(float, float, int);
void mandelbrot(float, float, float, float, int, int, float*);

and fmandel.c is:

#include <stdlib.h>

float f(float x, float y, int max_iter) {
  float cr = x, ci = y, zr = 0, zi = 0;
  int n;
  for(n = 0; n < max_iter; n++) {
    float tzr = zr, tzi = zi;
    zr = tzr*tzr - tzi*tzi + cr;
    zi = 2*tzr*tzi + ci;
    if(zr*zr + zi*zi > 4.0) break;
  }
  return 1.0 - (float)n/(float)max_iter;
}

void mandelbrot(float x0, float y0, float x1, float y1, int grid, int max_iter, float* mv) {
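  /* mv must point to a caller-allocated block of grid*grid floats;
     it is filled in row-major order (n indexes y, m indexes x) */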
  float dx = (x1 - x0)/(float)grid, dy = (y1 - y0)/(float)grid;
  for(int n=0; n < grid; n++)
    for(int m=0; m < grid; m++)
      mv[n * grid + m] = f(x0 + (float)m * dx, y0 + (float)n * dy, max_iter);
}

The only tricky thing here, which is not well explained in the CFFI docs, is passing the array data between Python and C.

In the first version of the code I had malloc allocate a 2D block of memory in the C function, and returned this 2D pointer. CFFI has an excellent interface that understands 2D pointers; however, the resulting cdata object is not something matplotlib can use directly. At first I copied the CFFI cdata pointer element by element into a numpy array, but that was expensive. In addition, the malloc with no corresponding free of the memory is a bug (a leak).

I finally decided to let numpy handle the memory management: I first allocate memory using numpy.empty, which is fast (and does not waste time initializing it). I then use the array's ctypes.data attribute to get the underlying C pointer to this data and cast it to a float* using ffi.cast.

The one annoyance with CFFI is that it caches the compiled C code but does not know when the C sources have been altered. Only if you inline the C source (pass it to verify as a string) will CFFI notice changes and recompile. I had to resort to deleting the __pycache__ directory each time I changed the C source.

[Image: the resulting Mandelbrot set]