Alp Mestanogullari's Blog

Why expression templates matter ?

Posted in Uncategorized by alpmestan on 2009/11/01

Imagine we are using a matrix library, with a naive implementation of +, that simply adds column and row-wise each coefficient. Thus, for a, m1, m2, m3 being for example square matrices of order n, if we want to compute the sum of m1, m2 and m3 and to put the result in a, corresponding to the following code :

a = m1 + m2 + m3

then the computation will look like the following tree :

Evaluation tree of m1 + m2 + m3

Evaluation tree of m1 + m2 + m3

Thus, there will be a first for loop to compute m1 + m2 which will result in a matrix we’ll call t, then another one to compute t + m3. For 3×3 matrices, it’s ok. Imagine n is actually 40000. Doing two for loops of 40000 iterations where we would do a single one… quite annoying, isn’t it ? Actually, we would rather want an evaluation tree like the following :

The desired evaluation tree of m1 + m2 + m3

The desired evaluation tree of m1 + m2 + m3

Here comes expression templates ;)

It’ll need a bit of refactoring though. First, let’s say our matrix type is the following (we’ll only deal with float to avoid ambedding an additional typename template parameter representing the number type we use) :

template <unsigned int N>
class matrix
{
    float data[N*N];

public:
    float operator()(unsigned int row, unsigned int col) const
    {
        return data[row + N*col];
    }

    float& operator()(unsigned int row, unsigned int col)
    {
        return data[row + N*col];
    }
};

template <unsigned int N>
std::ostream & operator<<(std::ostream& o, const matrix<N>& m)
{
    o << "[";
    for(unsigned int row = 0; row < N; row++)
    {
        for(unsigned int col = 0; col < N; col++)
        {
            o << m(row,col);
            if(col != N-1) o << ";";
        }
        if(row != N-1) o << "\n";
    }
    o << "]";
    return o;
}

Now, we’ll have to define a Domain Specific Embedded Language for matrix operations. Like in any language people design, we will have a tree representing what’s happening in the code. In our case, it’ll represent the evaluation tree of the matrix operations we’re dealing with. Thus, like in any expression tree, we need an Expression type. Ours will look like this :

template <typename LeftOperandType, typename OperationTag, typename RightOperandType>
struct Expression
{
    const LeftOperandType& l;
    const RightOperandType& r;

    Expression(const LeftOperandType& l_, const RightOperandType& r_) : l(l_), r(r_) { }

    float operator() (unsigned int row, unsigned int col) const
    {
        return OperationTag::apply(l(row, col), r(row,col));
    }
};

Ok, now, what should an OperationTag look like ? Well, we’ll implement the operation tag corresponding to additions :

struct plus
{
    static float apply(float a, float b)
    {
        return a+b;
    }
};

and the + operator that’ll let us create an expression with a ‘plus’ operation.

template <typename L, typename R>
Expression<L, plus, R> operator+(const L& l, const R& r)
{
    return Expression<L, plus, R>(l, r);
}

Thanks to the definition of Expression, we can embed matrix operations in Expressions, but for the moment we can’t do the reverse way, that is we can’t convert an Expression to a matrix. So let’s write an operator= in the matrix class.

    // inside the matrix class
    template <typename ExprType>
    matrix<N>& operator=(const ExprType& e)
    {
        for(unsigned int row = 0; row < N; row++)
        {
            for(unsigned int col = 0; col < N; col++)
            {
                (*this)(row, col) = e(row, col);
            }
        }
        return (*this);
    }

So you see, this is the only moment where we call the operator()(unsigned int, unsigned int) of the Expression type. Thus, this is the only moment when the whole expression is being evaluated. Let’s study the following code.

#include <iostream>
#include "matrix.h"
#include "expression.h"

int main()
{
    matrix<2> m1;
    m1(0,0) = 1.0;
    m1(1,0) = 0.0;
    m1(0,1) = 4.0;
    m1(1,1) = 1.0;

    matrix<2> m2;
    m2(0,0) = 0.0;
    m2(1,0) = -1.0;
    m2(0,1) = 1.0;
    m2(1,1) = 2.0;

    matrix<2> m3;
    m3(0,0) = 1.0;
    m3(1,0) = -2.0;
    m3(0,1) = 3.0;
    m3(1,1) = 5.0;

    matrix<2> a;
    a = m1 + m2 + m3; 
    std::cout << a << std::endl;
}

Here, “m1 + m2 + m3″ crates an

Expression< Expression<matrix<2>, plus, matrix<2> >, plus, matrix<2> >

instance. The coefficients aren’t computed until operator= is called. Indeed, we have the following expression tree.

Expression tree

Expression tree

That is, we know which operations we have to call, on which matrices, but nothing is evaluated. The only place where we call operator() (unsigned int, unsigned int) on a value of type Expression is in matrix<N>::operator=, and this call actually computes each coefficient, one by one, applying the whole computation tree (here, two calls to ‘+’) for each coefficient. This way, we only execute the two for loops once, and it’ll remain the same whatever the number of computations is. Moreover, the only change we had to make to our matrix class was to add an operator= to be able to assign an expression to a matrix.

By the way, the output of our main function is the following.

[2;8
-3;8]

I hope you enjoyed this post ;)

About these ads

8 Responses

Subscribe to comments with RSS.

  1. rmn said, on 2009/11/01 at 8:56 pm

    First off, nice post – I enjoyed reading.

    This will work nicely for the operation you showed, but since you’re not copying the pbjects you’re operating on, if I write something like

    a = a*b; // or something else with the lhs in it

    I’ll get the wrong results because a is changed in the process.
    But then again, as an example for expressions templates, this is a good post.

  2. alpmestan said, on 2009/11/01 at 10:53 pm

    Yeah indeed. My code is just a beginning. We should have a temporary object created in the operator=, and then putting this one’s data in this’ data.

    I consider writing about expression templates / DSELs for a less common task, in this one I’ll take care of avoiding such “bugs”. By the way, if you have any idea for the “domain” of the DSEL, don’t hesitate. And thanks for reading & commenting !

  3. [...] Why expression templates matter ? « Alp Mestanogullari's Blog alpmestan.wordpress.com/2009/11/01/why-expression-templates-matter – view page – cached Imagine we are using a matrix library, with a naive implementation of +, that simply adds column and row-wise each coefficient. Thus, for a, m1, m2, m3 being for example square matrices of order n,… (Read more)Imagine we are using a matrix library, with a naive implementation of +, that simply adds column and row-wise each coefficient. Thus, for a, m1, m2, m3 being for example square matrices of order n, if we want to compute the sum of m1, m2 and m3 and to put the result in a, corresponding to the following code : a = m1 + m2 + m3 (Read less) — From the page [...]

  4. and said, on 2009/11/02 at 5:52 pm

    Awesome post! I have been using Eigen recently, a linear algebra library, and they stated support for expression templates. I didn’t know what that meant, but I assumed it was along the lines of this. It is nice to see a simple and clear implementation, thanks.

  5. uberVU - social comments said, on 2009/11/04 at 1:54 am

    Social comments and analytics for this post…

    This post was mentioned on Twitter by Alp Mestan: Why expression templates matter ?: http://wp.me/pEB41-W

  6. Gwenaël Dunand said, on 2009/11/23 at 8:55 pm

    Very nice blog dude.
    Keep writing !! :-)

  7. nicolas66 said, on 2009/12/18 at 9:10 pm

    Nice post. But a problem arise when you decide to do operations on generic types: how to choose the right type between matrix and matrix? Traits can bring an elegant solution to this problem but I don’t know if it’s possible to reduce the number of classes (e.g for symetric cases).

  8. alpmestan said, on 2009/12/18 at 10:32 pm

    You can create a matrix class being built combinating many “blocks” (the storage, etc), well something like a policy-based matrix, or policy/traits-based, something like that. But then we rather enter the generative programming world, totally, whereas here it was just about having an AST for a minimal DSEL for matrixes. Very minimal.

    But yeah, absolutely nicolas, you’re right.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 251 other followers

%d bloggers like this: