Sunday, March 15, 2009

C++ Visitors: Basic Implementation

In the last post, I described the pros and cons of the Visitor design pattern, and promised some sample C++. This post has a basic implementation of Visitor, to illustrate the concepts. In later posts, we'll build a more powerful and yet more user-friendly Visitor, and use some preprocessor tricks to partially foolproof the use of Visitor.

One place that I've found Visitor useful is in class hierarchies that describe a language -- in my case, usually a hardware description language, but it would apply equally to classes representing a programming or scripting language. In a language project, you often find yourself traversing a parse tree of the input code, performing functions which can't properly be said to be the responsibility of the language classes. Even if you could make a case for adding virtual functions to achieve the task, it makes sense to keep the related functionality together in one file. But you want to dispatch on the node type, so that your code doesn't devolve into a forest of switch statements. This is exactly the place to use Visitor.

Since language parse trees are a common place to apply Visitor, let's start our sample C++ code with a toy class hierarchy for expressions. There are only three concrete classes: Variable, Number, and Operation, all derived from abstract Expression. I kept these classes very simple for brevity's sake, hopefully more care would go into their design in real life:

class Expression {
public:
Expression();
virtual ~Expression() throw() = 0;
virtual void accept(Visitor &v) const = 0;
};

class Number : public Expression {
public:
explicit Number(int value);
virtual ~Number() throw();
int value() const;
virtual void accept(Visitor &v) const;

private:
int _value;
};

class Variable : public Expression {
public:
explicit Variable(const std::string &name);
virtual ~Variable() throw();
std::string name() const;
virtual void accept(Visitor &v) const;

private:
std::string _name;
};

class Operation : public Expression {
public:
enum Operator {
PLUS, MINUS, MAX_OPERATOR
};
Operation(const Operator &op,
Expression *lhsOrOnly,
Expression *rhs = NULL);
virtual ~Operation() throw();
Operator op() const;
const char *symbol() const;
const std::vector<Expression *> &operands() const;
virtual void accept(Visitor &v) const;

private:
Operator _op;
std::vector<Expression *> _operands;
};


Notice that Expression has a pure virtual accept() function. We need that in order to bounce into the correct function in Visitor classes -- we'll discuss that in a moment. Now let's see what the abstract base Visitor class looks like. It's just an interface, any real visiting work will be done by concrete visitors derived from this. As I said, we'll be making the interface more sophisticated in the next post, but here's our first crack at it:


class Visitor {
public:
Visitor();
virtual ~Visitor();

virtual void visitVariable(const Variable &);
virtual void visitNumber(const Number &);
virtual void visitOperation(const Operation &);
};


Our first implementation of accept() is quite simple, so I'll only show one example:


void Operation::accept(Visitor &v) const
{
v.visitOperation(*this);
}


C++ Visitors are coded with this accept/visitFoo idiom to work around C++'s lack of multiple dispatch. Calling visitor.visit(*expr) would pick visit()'s signature at compile time based on the static type of expr. So we have to call expr->accept(visitor), which picks the accept() function at run time based on expr's actual type. Inside accept(), *this has the correct static type and can go to the correct member of the Visitor.

Some things to note about this code:
  1. It's usually reasonable for accept() to be a const function and visit*() to take const references. But you may need it to be otherwise. Remember, Visitors are like external virtual functions. Will the virtual functions you'll add be const or non-const?
  2. On the other hand, you definitely want visit*() to be non-const, and for accept to take a non-const reference. Concrete Visitor classes may need to store some state.
  3. Make ~Visitor() virtual but not pure, and don't declare private copy/assign members (see 5.3.1) for Visitor, so that derived Visitors can rely on compiler defaults if they wish.
  4. For the same reason, it seems better to define no-op defaults for the visit*() functions instead of making them pure virtual.
  5. As written, Operation::accept() does not traverse the Expressions contained in the Operation. That means that concrete Visitors have to do so themselves if they want to. Really such traversal should be Operation's responsibility; we'll add that capability in our next pass.
So how do you use this thing? A simple example is class Print : public Visitor, which will print one of our Expressions. The interface is self-explanatory -- a constructor and inherited virtual visit*() functions -- so I'll just give the implementation:

Print::Print(std::ostream &str)
: Visitor()
, _str(str)
{
}

void Print::visitVariable(const Variable &v)
{
_str << v.name();
}

void Print::visitNumber(const Number &n)
{
_str << n.value();
}

void Print::visitOperation(const Operation &o)
{
_str << "(";
std::vector<Expression *>::const_iterator it =
o.operands().begin();

while (it != o.operands().end()) {
const Expression *e = *it;
++it;
if (it == o.operands().end()) {
// Print unary or binary operator.
_str << o.symbol();
}
e->accept(*this);
}
_str << ")";
}

The Print class is suitable for framing in an ostream shift operator:

std::ostream &operator<<(
std::ostream &str, const Expression &expr)
{
Print printer(str);
expr.accept(printer);
return str;
}

There you go. The basics of C++ Visitors. The toy example doesn't really do justice to how useful they are: Visitors are a big help when you have a big hierarchy to deal with. But the simple design presented here runs into some difficulties when you have a big hierarchy. Coming up, I'll show you some tricks to get the power of Visitors while softening some of the difficulties with using them.

Here is the complete list of posts in this series:

No comments: