If you’ve grown fat and complacent using Python, try a little Cython. It’ll put hair on your chest and take some off your scalp.
Take the following Cython code I was writing (heavily paraphrased):
cdef:
    char *s = seq  # seq is passed in to the function
    unsigned char i, j
    unsigned char sub_mat[256]

for i, j in zip([65, 67, 71, 84], [84, 65, 67, 71]):
    sub_mat[i] = j
return [sub_mat[s[i]] for i in range(len(seq))]
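(For readers who don't speak Cython, here is a pure-Python sketch of what the snippet is trying to do. I'm assuming here that seq is a bytes object of DNA characters and sub_mat is a 256-entry substitution table; the paraphrased original leaves those details out.)

```python
def substitute(seq):
    """Map each byte of seq through a 256-entry substitution table."""
    sub_mat = bytearray(256)
    # 65, 67, 71, 84 are the ASCII codes for A, C, G, T.
    for i, j in zip([65, 67, 71, 84], [84, 65, 67, 71]):
        sub_mat[i] = j
    # In Python 3, iterating over bytes yields ints, so each b indexes the table.
    return [sub_mat[b] for b in seq]

substitute(b"ACGT")  # -> [84, 65, 67, 71]
```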
The Cython code compiles fine. It even runs fine on the few test cases I set up for it. But when I put it into a standard testing pipeline I use, it caused a different test to freeze up.
My first reaction was, I've lost a pointer somewhere. So I started to dig around. (That reminds me, there has got to be an easier way to hook a debugger up to Cython. The prescribed ways are so complicated.) I started to sprinkle print statements in the code. Nothing.
I then broke the final list comprehension out into a loop and put print statements in there. Suddenly I saw that my program wasn't going off the reservation by losing a pointer. It was actually looping over and over again at that point. And then the lightbulb went off. The loop variable i had been typed as an unsigned char (max value 255), and if len(seq) was ever greater than that, the loop would just go on forever as i overflowed and reset to 0 repeatedly.
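The wraparound is easy to reproduce in plain Python. A sketch, where masking with 0xFF simulates the modulo-256 arithmetic C performs on an unsigned char:

```python
def uchar_inc(i):
    """Increment an 8-bit unsigned counter the way C does: wrap at 256."""
    return (i + 1) & 0xFF

i = 254
i = uchar_inc(i)  # 255, the largest value an unsigned char can hold
i = uchar_inc(i)  # 0 again, so a condition like i < len(seq) stays true
```

With len(seq) > 255, i can never reach len(seq), and the loop never terminates.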
Of course, those of you programming in C will have immediately noticed this issue and asked, "Does len(seq) ever get larger than 255?" But I've grown fat and lazy on Python, which lets me do things like 1<<1024 without missing a beat (yes, that creates a 1025-bit integer. Suck it up, C programmer), so, on occasion, I have to lose some hair from the top of my head when I slum it with all you C-coders.
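For the record, a quick interpreter check, using nothing but builtins:

```python
# Python ints are arbitrary precision; shifting 1 left by 1024 places
# produces an exact integer that needs 1025 bits to represent.
x = 1 << 1024
print(x.bit_length())  # 1025
```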