## Why expression templates matter ?

Imagine we are using a matrix library, with a naive implementation of +, that simply adds column and row-wise each coefficient. Thus, for a, m1, m2, m3 being for example square matrices of order n, if we want to compute the sum of m1, m2 and m3 and to put the result in a, corresponding to the following code :

a = m1 + m2 + m3

then the computation will look like the following tree :

Thus, there will be a first for loop to compute *m1 + m2* which will result in a matrix we’ll call *t*, then another one to compute *t + m3*. For 3×3 matrices, it’s ok. Imagine n is actually 40000. Doing two for loops of 40000 iterations where we would do a single one… quite annoying, isn’t it ? Actually, we would rather want an evaluation tree like the following :

Here comes expression templates ðŸ˜‰

It’ll need a bit of refactoring though. First, let’s say our matrix type is the following (we’ll only deal with float to avoid ambedding an additional typename template parameter representing the number type we use) :

template <unsigned int N> class matrix { float data[N*N]; public: float operator()(unsigned int row, unsigned int col) const { return data[row + N*col]; } float& operator()(unsigned int row, unsigned int col) { return data[row + N*col]; } }; template <unsigned int N> std::ostream & operator<<(std::ostream& o, const matrix<N>& m) { o << "["; for(unsigned int row = 0; row < N; row++) { for(unsigned int col = 0; col < N; col++) { o << m(row,col); if(col != N-1) o << ";"; } if(row != N-1) o << "\n"; } o << "]"; return o; }

Now, we’ll have to define a **Domain Specific Embedded Language** for matrix operations. Like in any language people design, we will have a tree representing what’s happening in the code. In our case, it’ll represent the evaluation tree of the matrix operations we’re dealing with. Thus, like in any expression tree, we need an *Expression* type. Ours will look like this :

template <typename LeftOperandType, typename OperationTag, typename RightOperandType> struct Expression { const LeftOperandType& l; const RightOperandType& r; Expression(const LeftOperandType& l_, const RightOperandType& r_) : l(l_), r(r_) { } float operator() (unsigned int row, unsigned int col) const { return OperationTag::apply(l(row, col), r(row,col)); } };

Ok, now, what should an *OperationTag* look like ? Well, we’ll implement the operation tag corresponding to additions :

struct plus { static float apply(float a, float b) { return a+b; } };

and the *+ operator* that’ll let us create an expression with a ‘plus’ operation.

template <typename L, typename R> Expression<L, plus, R> operator+(const L& l, const R& r) { return Expression<L, plus, R>(l, r); }

Thanks to the definition of *Expression*, we can embed matrix operations in Expressions, but for the moment we can’t do the reverse way, that is we can’t convert an *Expression* to a *matrix*. So let’s write an *operator=* in the *matrix* class.

// inside the matrix class template <typename ExprType> matrix<N>& operator=(const ExprType& e) { for(unsigned int row = 0; row < N; row++) { for(unsigned int col = 0; col < N; col++) { (*this)(row, col) = e(row, col); } } return (*this); }

So you see, this is the only moment where we call the *operator()(unsigned int, unsigned int)* of the *Expression* type. Thus, this is the only moment when the whole expression is being evaluated. Let’s study the following code.

#include <iostream> #include "matrix.h" #include "expression.h" int main() { matrix<2> m1; m1(0,0) = 1.0; m1(1,0) = 0.0; m1(0,1) = 4.0; m1(1,1) = 1.0; matrix<2> m2; m2(0,0) = 0.0; m2(1,0) = -1.0; m2(0,1) = 1.0; m2(1,1) = 2.0; matrix<2> m3; m3(0,0) = 1.0; m3(1,0) = -2.0; m3(0,1) = 3.0; m3(1,1) = 5.0; matrix<2> a; a = m1 + m2 + m3; std::cout << a << std::endl; }

Here, “m1 + m2 + m3” crates an

Expression< Expression<matrix<2>, plus, matrix<2> >, plus, matrix<2> >

instance. The coefficients aren’t computed until *operator=* is called. Indeed, we have the following expression tree.

That is, we know which operations we have to call, on which matrices, but nothing is evaluated. The only place where we call *operator() (unsigned int, unsigned int)* on a value of type *Expression* is in *matrix<N>::operator=*, and this call actually computes each coefficient, one by one, applying the whole computation tree (here, two calls to ‘+’) for each coefficient. This way, we only execute the two for loops once, and it’ll remain the same whatever the number of computations is. Moreover, the only change we had to make to our matrix class was to add an operator= to be able to assign an expression to a matrix.

By the way, the output of our main function is the following.

[2;8

-3;8]

I hope you enjoyed this post ðŸ˜‰

8comments