A certain awkwardness with C++ inheritance

C++ is an object oriented language. So is Python. Python, in my opinion, handles most object oriented concepts much more elegantly than C++. Here is a small example.

(Well, both are sold as multi-paradigm languages, which means they do a bit of everything. C++ compilers can perform tail call optimization, and Python has lambdas and first-class functions. But I digress. Both certainly qualify as object oriented languages.)

A major aspect of object oriented design is inheritance: the idea that you can design a “family” of classes that are related to each other in functionality and structure, starting with some basic properties, and then “inheriting” them in child classes which add (or even remove) some functionality.

In practice, I find Python superior to C++ in this respect. Some people have said that the object oriented aspects of C++ have been “bolted on” to C, meaning things can be ugly and inconsistent. I will give an example below.

For purposes of another demonstration I wanted to create two Matrix classes. One would internally store its 2D data in row-major order and the other in column-major order. In all other (external) respects the Matrix class should behave identically.
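To make the two layouts concrete, here is a minimal sketch of the index arithmetic involved. The function names `row_major_index` and `col_major_index` are hypothetical helpers for illustration only; they are not part of the classes shown later.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, for illustration only.
// Flat offset of element (i, j) in an n-by-m matrix stored row by row.
std::size_t row_major_index(std::size_t i, std::size_t j, std::size_t n, std::size_t m)
{
    assert(i < n && j < m);
    return i * m + j;
}

// Flat offset of the same element when the matrix is stored column by column.
std::size_t col_major_index(std::size_t i, std::size_t j, std::size_t n, std::size_t m)
{
    assert(i < n && j < m);
    return i + j * n;
}
```

For a 2 x 4 matrix, element (0, 2) lands at offset 2 in row-major storage and at offset 4 in column-major storage.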

In Python, I can create this in a very pleasing, logical manner: I create an abstract base class that defines most of the stable Matrix interface. The only thing I leave undefined is the internal detail of how the 2D data is indexed into the 1D store. I then define two child classes which have the same interface, differing only in how the data are stored (runnable here):

from abc import ABC, abstractmethod

class Matrix(ABC):
  def __init__(self, n, m):
    self.n, self.m = n, m
    self.data = [0 for _ in range(n * m)]

  def __mul__(self, rhs):
    ans = type(self)(self.n, rhs.m)
    for i in range(self.n):
      for j in range(rhs.m):
        x = 0
        for k in range(self.m):
          x += self[i, k] * rhs[k, j]
        ans[i, j] = x
    return ans

  @abstractmethod
  def _index(self, key):
    pass

  def __getitem__(self, key):
    return self.data[self._index(key)]

  def __setitem__(self, key, value):
    self.data[self._index(key)] = value

  def __str__(self):
    s = []
    for i in range(self.n):
      s += [", ".join([str(self[i, j]) for j in range(self.m)])]
    return "\n".join(s)


class MatrixRowMajor(Matrix):
  def __init__(self, n, m):
    super().__init__(n, m)

  def _index(self, key):
    return key[0] * self.m + key[1]


class MatrixColMajor(Matrix):
  def __init__(self, n, m):
    super().__init__(n, m)

  def _index(self, key):
    return key[0] + key[1] * self.n

  
def main():
  
  # This correctly raises an exception
  # m = Matrix(2, 4)

  m = MatrixRowMajor(2, 4)
  m[0, 2] = 2  
  print(m.data)
  print(m)

  m = MatrixColMajor(2, 4)
  m[0, 2] = 2  
  print(m.data)
  print(m)

  m1 = MatrixRowMajor(2, 4)
  m1[0, 1] = 2
  print(m1)

  m2 = MatrixColMajor(4, 2)
  m2[1, 0] = 2
  print(m2)

  print((m1 * m2))

main()

In C++ I run into multiple problems which are only resolvable if I resort to using pointers.

The initial setup is smooth:

#include <cstddef>
#include <vector>

class Matrix {

public:
    Matrix(size_t n, size_t m)
        : n(n)
        , m(m)
    {
        data.resize(n * m);
    }

    size_t rows() const { return n; }
    size_t cols() const { return m; }

    double& get(size_t i, size_t j) { return data[_index(i, j)]; }
    double  get(size_t i, size_t j) const { return data[_index(i, j)]; }


protected:
    virtual size_t _index(size_t i, size_t j) const = 0;

    size_t n, m;

    std::vector<double> data;
};

class MatrixRowMajor : public Matrix {

public:
    MatrixRowMajor(size_t n, size_t m)
        : Matrix(n, m)
    {
    }

private:
    // Must be const to override the pure virtual in Matrix.
    size_t _index(size_t i, size_t j) const override { return i * m + j; }
};

class MatrixColMajor : public Matrix {

public:
    MatrixColMajor(size_t n, size_t m)
        : Matrix(n, m)
    {
    }

private:
    // Must be const to override the pure virtual in Matrix.
    size_t _index(size_t i, size_t j) const override { return i + j * n; }
};

The problem arises when we try to implement the generic multiplication operator override:

Matrix operator*(const Matrix& lhs, const Matrix& rhs)
{
    Matrix ans(lhs.rows(), rhs.cols()); // error: Matrix is abstract and cannot be instantiated
    for (size_t i = 0; i < lhs.rows(); i++) {
        for (size_t j = 0; j < rhs.cols(); j++) {
            double v = 0;
            for (size_t k = 0; k < lhs.cols(); k++) {
                v += lhs.get(i, k) * rhs.get(k, j);
            }
            ans.get(i, j) = v;
        }
    }
    return ans;
}

This does not work because Matrix is an abstract class. I can have references to Matrix, or pointers to Matrix, but, of course, I can’t instantiate it. Instead I have to pick one of the derived classes and write a multiplication operator that returns that.

void mul(const Matrix& lhs, const Matrix& rhs, Matrix& ans)
{
    for (size_t i = 0; i < lhs.rows(); i++) {
        for (size_t j = 0; j < rhs.cols(); j++) {
            double v = 0;
            for (size_t k = 0; k < lhs.cols(); k++) {
                v += lhs.get(i, k) * rhs.get(k, j);
            }
            ans.get(i, j) = v;
        }
    }
}

MatrixRowMajor operator*(const Matrix& lhs, const Matrix& rhs)
{
    MatrixRowMajor ans(lhs.rows(), rhs.cols());
    mul(lhs, rhs, ans);
    return ans;
}

OK, OK, you say, I complain too much and this is just a minor detail. Python’s duck typing also hides a major decision under the hood: the type of the lhs determines the return type of the multiplication, and this is not very well advertised.

But I do find these visible seams annoying, for example where inconsistencies between references and pointers show up.
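For what it's worth, here is a minimal sketch of what the pointer route could look like. It is not the code from the gist: the virtual `make` method is a hypothetical addition, and returning `std::unique_ptr<Matrix>` from `operator*` is one assumption among several possible designs (it also assumes C++14 or later for `std::make_unique`). The idea is to let the lhs pick the concrete result type at run time, as the Python version does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t n, std::size_t m) : n(n), m(m), data(n * m, 0.0) {}
    virtual ~Matrix() = default; // results are owned through base-class pointers

    std::size_t rows() const { return n; }
    std::size_t cols() const { return m; }

    double& get(std::size_t i, std::size_t j) { return data[_index(i, j)]; }
    double  get(std::size_t i, std::size_t j) const { return data[_index(i, j)]; }

    // Hypothetical addition: each derived class builds an empty
    // r-by-c matrix of its own concrete type.
    virtual std::unique_ptr<Matrix> make(std::size_t r, std::size_t c) const = 0;

protected:
    virtual std::size_t _index(std::size_t i, std::size_t j) const = 0;

    std::size_t n, m;
    std::vector<double> data;
};

class MatrixRowMajor : public Matrix {
public:
    using Matrix::Matrix; // inherit the (n, m) constructor
    std::unique_ptr<Matrix> make(std::size_t r, std::size_t c) const override
    {
        return std::make_unique<MatrixRowMajor>(r, c);
    }

private:
    std::size_t _index(std::size_t i, std::size_t j) const override { return i * m + j; }
};

class MatrixColMajor : public Matrix {
public:
    using Matrix::Matrix;
    std::unique_ptr<Matrix> make(std::size_t r, std::size_t c) const override
    {
        return std::make_unique<MatrixColMajor>(r, c);
    }

private:
    std::size_t _index(std::size_t i, std::size_t j) const override { return i + j * n; }
};

// The lhs decides the concrete type of the result at run time.
std::unique_ptr<Matrix> operator*(const Matrix& lhs, const Matrix& rhs)
{
    assert(lhs.cols() == rhs.rows());
    std::unique_ptr<Matrix> ans = lhs.make(lhs.rows(), rhs.cols());
    for (std::size_t i = 0; i < lhs.rows(); i++) {
        for (std::size_t j = 0; j < rhs.cols(); j++) {
            double v = 0;
            for (std::size_t k = 0; k < lhs.cols(); k++) {
                v += lhs.get(i, k) * rhs.get(k, j);
            }
            ans->get(i, j) = v;
        }
    }
    return ans;
}
```

With this, `a * b` yields a pointer to a `MatrixRowMajor` when `a` is row-major and to a `MatrixColMajor` when `a` is column-major. The price is that callers now write `c->get(i, j)` instead of `c.get(i, j)`, which is one such seam.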

The full code is available in this gist

3 thoughts on “A certain awkwardness with C++ inheritance”

    1. dlgbrdv, I can’t think of how to use templates. I’ve used templates in the past to apply the same computations + data structures to different data types. In this case I want to apply different computations to the same data type. How would I structure this?

  1. Kaushik, you can define multiplication method as a template. The caveat is that the type for _lhs_ should be known at compile time. If not, you still can use generic code to fill the matrix, but may need to virtualise the operator itself, so that the proper type is instantiated dynamically.
    ```
    template <typename T> T operator*(const T& lhs, const Matrix& rhs)
    {
        T ans(lhs.rows(), rhs.cols());
        //…
        return ans;
    }
    ```
    PS. A good insight into different OOP paradigms would be to compare C++ with Objective-C: both are bolted to the same basement, but in two distinct ways. Or look at Simula vs Smalltalk, if you really want to go to the roots.
