Sunday, March 15, 2009

C++ Visitors: Basic Implementation

In the last post, I described the pros and cons of the Visitor design pattern, and promised some sample C++. This post has a basic implementation of Visitor, to illustrate the concepts. In later posts, we'll build a more powerful and yet more user-friendly Visitor, and use some preprocessor tricks to partially foolproof the use of Visitor.

One place that I've found Visitor useful is in class hierarchies that describe a language -- in my case, usually a hardware description language, but it would apply equally to classes representing a programming or scripting language. In a language project, you often find yourself traversing a parse tree of the input code, performing functions which can't properly be said to be the responsibility of the language classes. Even if you could make a case for adding virtual functions to achieve the task, it makes sense to keep the related functionality together in one file. But you want to dispatch on the node type, so that your code doesn't devolve into a forest of switch statements. This is exactly the place to use Visitor.

Since language parse trees are a common place to apply Visitor, let's start our sample C++ code with a toy class hierarchy for expressions. There are only three concrete classes: Variable, Number, and Operation, all derived from abstract Expression. I kept these classes very simple for brevity's sake, hopefully more care would go into their design in real life:

class Expression {
public:
Expression();
virtual ~Expression() throw() = 0;
virtual void accept(Visitor &v) const = 0;
};

class Number : public Expression {
public:
explicit Number(int value);
virtual ~Number() throw();
int value() const;
virtual void accept(Visitor &v) const;

private:
int _value;
};

class Variable : public Expression {
public:
explicit Variable(const std::string &name);
virtual ~Variable() throw();
std::string name() const;
virtual void accept(Visitor &v) const;

private:
std::string _name;
};

class Operation : public Expression {
public:
enum Operator {
PLUS, MINUS, MAX_OPERATOR
};
Operation(const Operator &op,
Expression *lhsOrOnly,
Expression *rhs = NULL);
virtual ~Operation() throw();
Operator op() const;
const char *symbol() const;
const std::vector<Expression *> &operands() const;
virtual void accept(Visitor &v) const;

private:
Operator _op;
std::vector<Expression *> _operands;
};


Notice that Expression has a pure virtual accept() function. We need that in order to bounce into the correct function in Visitor classes -- we'll discuss that in a moment. Now let's see what the abstract base Visitor class looks like. It's just an interface, any real visiting work will be done by concrete visitors derived from this. As I said, we'll be making the interface more sophisticated in the next post, but here's our first crack at it:


class Visitor {
public:
Visitor();
virtual ~Visitor();

virtual void visitVariable(const Variable &);
virtual void visitNumber(const Number &);
virtual void visitOperation(const Operation &);
};


Our first implementation of accept() is quite simple, so I'll only show one example:


void Operation::accept(Visitor &v) const
{
v.visitOperation(*this);
}


C++ Visitors are coded with this accept/visitFoo idiom to work around C++'s lack of multiple dispatch. Calling visitor.visit(*expr) would pick visit()'s signature at compile time based on the static type of expr. So we have to call expr->accept(visitor), which picks the accept() function at run time based on expr's actual type. Inside accept(), *this has the correct static type and can go to the correct member of the Visitor.

Some things to note about this code:
  1. It's usually reasonable for accept() to be a const function and visit*() to take const references. But you may need it to be otherwise. Remember, Visitors are like external virtual functions. Will the virtual functions you'll add be const or non-const?
  2. On the other hand, you definitely want visit*() to be non-const, and for accept to take a non-const reference. Concrete Visitor classes may need to store some state.
  3. Make ~Visitor() virtual but not pure, and don't declare private copy/assign members (see 5.3.1) for Visitor, so that derived Visitors can rely on compiler defaults if they wish.
  4. For the same reason, it seems better to define no-op defaults for the visit*() functions instead of making them pure virtual.
  5. As written, Operation::accept() does not traverse the Expressions contained in the Operation. That means that concrete Visitors have to do so themselves if they want to. Really such traversal should be Operation's responsibility; we'll add that capability in our next pass.
So how do you use this thing? A simple example is class Print : public Visitor, which will print one of our Expressions. The interface is self-explanatory -- a constructor and inherited virtual visit*() functions -- so I'll just give the implementation:

Print::Print(std::ostream &str)
: Visitor()
, _str(str)
{
}

void Print::visitVariable(const Variable &v)
{
_str << v.name();
}

void Print::visitNumber(const Number &n)
{
_str << n.value();
}

void Print::visitOperation(const Operation &o)
{
_str << "(";
std::vector<Expression *>::const_iterator it =
o.operands().begin();

while (it != o.operands().end()) {
const Expression *e = *it;
++it;
if (it == o.operands().end()) {
// Print unary or binary operator.
_str << o.symbol();
}
e->accept(*this);
}
_str << ")";
}

The Print class is suitable for framing in an ostream shift operator:

std::ostream &operator<<(
std::ostream &str, const Expression &expr)
{
Print printer(str);
expr.accept(printer);
return str;
}

There you go. The basics of C++ Visitors. The toy example doesn't really do justice to how useful they are: Visitors are a big help when you have a big hierarchy to deal with. But the simple design presented here runs into some difficulties when you have a big hierarchy. Coming up, I'll show you some tricks to get the power of Visitors while softening some of the difficulties with using them.

Here is the complete list of posts in this series:

Sunday, March 8, 2009

C++ Visitor Pattern: Motivation

The Visitor design pattern is perhaps not the best-loved of the GoF patterns. One indication of its unpopularity is its position as the last pattern in the GoF Design Patterns book. But I have found it useful in several projects, and would like to share some Visitor tricks I've learned along the way.

In C++, Visitor is an abstract base class, with a set of one-argument virtual functions -- one for each class in a forest of hierarchies -- thereby providing "multiple-dispatch" of some piece of functionality. Virtual functions are an example of single-dispatch: based on the type of an object, the correct virtual function is called. Multiple-dispatch means that a function is selected based on more than one type: in this case, the type of the visitor, and the type of the class it visits.

Have you ever been writing some code, and thought: "I wish I could add a virtual function to these classes, without changing the class interfaces"? If so, that is exactly what Visitor allows. Of course, you have to have designed the class to accept a Visitor first, but after that, you have this "external virtual function" capability.

Another way in which Visitor is helpful in big C++ projects, is that it allows you to have good source file hygiene -- as described in our coding standard -- but still keep related functionality in one place. An example is pretty-printing of an Expression hierarchy: to make consistency easier to maintain, you would like all the printing functions to be in a single file, which would violate our source file standards if implemented by virtual functions in the classes.

Sounds good, why is Visitor unpopular? There are a couple of valid complaints about the pattern. First, it can make for difficulties when adding a new class to a visited hierarchy. This is somewhat mitigated by the Default Visitor variant, and we can also use some preprocessor tricks to create a compile- or link-time error for Visitors which must be extended for some or all new classes. It's also worth mentioning that if the Visitor functionality were instead implemented as a pure virtual function, the same amount of work is required (except for tracking down all the Visitors).

Secondly, effective use of Visitors may imply a loss of encapsulation. If a Visitor is to do meaningful work on an object, it may need access to implementation details that -- all other things being equal -- we would prefer to hide inside the class.

Nevertheless, Visitor can be a very useful design pattern. In coming posts I will present some practical C++ examples for getting the most out of it.

Here is the complete list of posts in this series: