Showing posts with label visitor. Show all posts
Showing posts with label visitor. Show all posts

Sunday, May 3, 2009

Handy C++ Visitor Macros

This is the fifth post on the subject of the Visitor design pattern. First I tried to explain the motivation for Visitor, then presented the textbook implementation, what I consider to be some improvements on that implementation, and finally an example of the power of the Default Visitor variant of the design pattern. In this post, I want to show some preprocessor macros that make Visitor even handier. Brace yourself, this is going to be a long post. There's a lot to discuss.

The C preprocessor is a dangerous weapon when used indiscriminately. In the case of our C++ Visitors, however, some preprocessor magic will provide us with the following benefits:
  1. Compiler errors (not warnings) when a concrete Visitor class has not been defined correctly.
  2. Automatic generation of tedious parts of the Default Visitor implementation.
  3. Linker errors when Visitors which must not implement Default Visitor are not updated to visit a newly-added class.
We need to add the Statement hierarchy described in the last post to the Visitor class we declared earlier. Instead of hand-coding everything, we are going to write a header file that uses some preprocessor macros to describe the hierarchy of visited classes. If we also add a header file containing #include directives for each concrete visited class, then we won't have to edit Visitor.hpp or Visitor.cpp when we add classes to the hierarchy in the future. Nifty!

Let's call the first header file VisitorSpecs.hpp. It looks like this:

// NOTE: No include guard for this header!

#ifndef VISITED_ABSTRACT_CLASS
#define VISITED_ABSTRACT_CLASS(cls,parent)
#endif

#ifndef VISITED_ROOT_CLASS
#define VISITED_ROOT_CLASS(cls) \
VISITED_ABSTRACT_CLASS(cls,0)
#endif

VISITED_ROOT_CLASS(Expression)
VISITED_CLASS(Variable, Expression)
VISITED_CLASS(Number, Expression)
VISITED_CLASS(Operation, Expression)

VISITED_ROOT_CLASS(Statement)
VISITED_CLASS(Assignment, Statement)
VISITED_ABSTRACT_CLASS(MultiStatement, Statement)
VISITED_CLASS(Block, MultiStatement)
VISITED_CLASS(IfElse, MultiStatement)

#undef VISITED_ROOT_CLASS
#undef VISITED_ABSTRACT_CLASS
#undef VISITED_CLASS

Hang on a second... This defines a couple of macros as no-ops, and leaves another macro undefined. What gives? The answer is that Visitor.hpp defines the macros one way before including VisitorSpecs.hpp, and Visitor.cpp defines the macros another way, and also includes VisitorSpecs.hpp. That's why the header must not have an include guard: it will get included into Visitor.cpp multiple times. You may even find other consumers for VisitorSpecs.hpp, as you'll see a little later.

The macros are going to end up a little bit complicated, to provide some of the wonderful compiler fail-safes I've been hyping. But to make things clear, let's present the basic implementation first. Visitor.hpp has to include VisitorSpecs.hpp two separate times. First to declare the class names:

#define VISITED_CLASS(cls,parent) class cls;
#define VISITED_ABSTRACT_CLASS(cls, parent) \
VISITED_CLASS(cls,parent)
#include "VisitorSpecs.hpp"


Then to declare the member functions, Visitor.hpp defines the macros as follows:

#define VISITED_CLASS(cls,parent) \
virtual bool enter(const cls &); \
virtual void exit(const cls &);
#define VISITED_ABSTRACT_CLASS(cls, parent) \
VISITED_CLASS(cls,parent)
#include "VisitorSpecs.hpp"

Remember, the "root" classes (those with no parent) are defined in terms of abstract classes. So in fact, Visitor.hpp declares enter() and exit() for every visited class.

Visitor.cpp defines default implementations for the member functions by defining the macros like this:

// No parent default: don't recurse, do nothing.
#define VISITED_ROOT_CLASS(cls) \
bool Visitor::enter(const cls &) \
{ \
VISITOR_CATCH_MISSING_HEADER(cls); \
return false; \
} \
void Visitor::exit(const cls &) \
{ \
}

// Child classes: do what the parent would do.
#define VISITED_CLASS(cls,parent) \
bool Visitor::enter(const cls &c) \
{ \
VISITOR_CATCH_MISSING_HEADER(cls); \
const parent &p = c; \
return enter(p); \
} \
void Visitor::exit(const cls &c) \
{ \
const parent &p = c; \
exit(p); \
}

// Abstract classes: same as for concrete classes.
#define VISITED_ABSTRACT_CLASS(cls, parent) \
VISITED_CLASS(cls,parent)

#include "VisitorSpecs.hpp"

The macro definitions for Visitor.cpp implement the Default Visitor scheme: if a concrete Visitor doesn't define enter() for a given type, the default is picked up from the base class, and looks in the virtual table for the enter() belonging to the parent class. We have Visitor.cpp include the header file VisitedClasses.hpp, which includes the .hpp for every concrete visited class. Thus, when a new class is added to the hierarchy, Visitor.hpp and Visitor.cpp do not require any change whatsoever.

Notice the lines in the function definitions that say VISITOR_CATCH_MISSING_HEADER(cls). The reason for these lines, is that if you add a new class to the hierarchy, but forget to add an include to VisitedClasses.hpp, most compilers give you a confusing error message about initializing the variable p. On the versions of g++ and Solaris CC I've looked at, the message says nothing about an undefined class, let alone a missing header file. So, in Visitor.cpp we define a macro which will give you an error message that points in the direction of the problem, if you have forgotten that include. The macro is:

#ifndef NDEBUG
#define VISITOR_CATCH_MISSING_HEADER(cls) \
typedef cls INCLUDE_ ## cls ## \
_hpp_IN_VisitedClasses_hpp; \
struct Dummy : \
public INCLUDE_ ## cls ## \
_hpp_IN_VisitedClasses_hpp { \
Dummy(); \
}
#else
#define VISITOR_CATCH_MISSING_HEADER(cls) /* nothing */
#endif

We make the macro a no-op for non-debug builds, though it shouldn't really affect the code even if you enable it for release builds. Compare the messages in g++ 4.1.2. Suppose you omitted #include "IfElse.hpp" from VisitedClasses.hpp. Without the macro, the message says VisitorSpecs.hpp:38: error: invalid initialization of reference of type 'const MultiStatement&' from expression of type 'const IfElse'; with the macro, you still get that message, but before it you get: VisitorSpecs.hpp:38: error: invalid use of undefined type 'struct Visitor::enter(const IfElse&)::INCLUDE_IfElse_hpp_IN_VisitedClasses_hpp'. A strange message, but it gets the point across better.

Speaking of compiler errors, we will beef up the macro definitions in Visitor.hpp to catch a common coding error at compile time instead of having to debug strange behavior at runtime. The issue is this: since the enter()/exit() functions in our Visitor are not pure virtual, if you mistakenly add a concrete Visitor with one or more incorrect prototypes for those functions, the code will compile, link, and run -- silently skipping the functions with the incorrect prototypes, because the abstract base Visitor implementations will be found in the virtual table.

For example, if your concrete Visitor declared bool enter(const MultiStatement *) -- with a pointer argument instead of a reference -- then you wouldn't know anything was wrong until you ran the program. In the worst case, you would not even notice that whatever behavior was supposed to happen in that MultiStatement enter() didn't happen; even if you did notice it, you would have to step through in the debugger to watch the visit go into the wrong function. To preclude some of the more common silent-but-deadly typos, we bulk up the Visitor.hpp macros like so:

// The "real" prototypes.
#define VISITED_CLASS_ENTER_EXIT(cls,parent) \
virtual bool enter(const cls &); \
virtual void exit(const cls &);

// Fakes that turn typos into compile errors.
#define FORBID_PTR_ARG(cls,parent) \
virtual _USE_REFERENCE_ARG *enter(const cls *) \
{ return 0; } \
virtual _USE_REFERENCE_ARG *exit(const cls *) \
{ return 0; }
#define FORBID_NON_CONST_ARG(cls,parent) \
virtual _USE_CONST_REF *enter(cls &) \
{ return 0; } \
virtual _USE_CONST_REF *exit(cls &) \
{ return 0; }
#define FORBID_CONST_FUNC(cls,parent) \
virtual _USE_NON_CONST_FUNC *enter(const cls &) \
const { return 0; } \
virtual _USE_NON_CONST_FUNC *exit(const cls &) \
const { return 0; }

// Final macro definitions.
#ifdef NDEBUG
#define VISITED_CLASS(cls,parent) \
VISITED_CLASS_ENTER_EXIT(cls,parent)
#else
class _USE_REFERENCE_ARG;
class _USE_CONST_REF;
class _USE_NON_CONST_FUNC;
#define VISITED_CLASS(cls,parent) \
VISITED_CLASS_ENTER_EXIT(cls,parent) \
FORBID_PTR_ARG(cls,parent) \
FORBID_NON_CONST_ARG(cls,parent) \
FORBID_CONST_FUNC(cls,parent)
#endif

#define VISITED_ABSTRACT_CLASS(cls, parent) \
VISITED_CLASS(cls,parent)

#include "VisitorSpecs.hpp"

Without the FORBID* macros, the incorrect pointer-arg prototype above would prevent our FindLhsVars visitor in our previous example from finding assigned-to variables in complex statements such as blocks or if-else statements. With the macros in place, we get the following compile-time error:
VisitorSpecs.hpp:36: error: overriding 'virtual Visitor::_USE_REFERENCE_ARG* Visitor::enter(const MultiStatement*)'. That is money in the bank.

Are you still with me? Then there's one more feature of this setup that I'd like to point out. Notice that VISITED_CLASS and its related macros can be defined in any file that includes VisitorSpecs.hpp. As useful as Default Visitor is, there are some concrete Visitors that must be updated whenever a new class is added to the hierarchy -- for example, the pretty-printer Visitor from the earlier post. In that case, instead of hard-coding the enter() function declarations in the concrete Visitor class definition, do this:

#define VISITED_CLASS(cls,parent) \
virtual bool enter(const cls &);
#include "VisitorSpecs.hpp"

Now, if you forget to define the enter function for a newly-added class, you will get a link error when the linker tries to build the virtual table. From g++ 4.1.2: Print.cpp:61: undefined reference to `Print::enter(IfElse const&)'. Again, money in the bank. You might also find that there is some boilerplate code you would like to generate for every visited class -- a thought that comes to mind is an enum representing class types in the hierarchy. Simply extend the macro to include a third argument for that extra information -- if it can't be created with a '##' concatenation -- and include VisitorSpecs.hpp in the appropriate place.

Whew! This is a huge post, but when you apply these macro definitions to the improved Visitor in the earlier post, you have a very powerful idiom to work with. I can see a couple more posts coming up in this series: 1. a listing of the final Statement, Expression, and Visitor code presented here (I've been compiling it as I wrote these posts, to make sure it all worked), and an example main() to exercise it; and 2. discussion of a few Visitor issues I've glossed over to get this thing out.

Friday, May 1, 2009

The Default Visitor Idiom

The text of this post was originally in the next post, about our preprocessor framework for C++ Visitors. But I realized that it was a huge distraction from that material, and deserved its own post. To illustrate some points, it builds on the toy Expression hierarchy described here, but using the improved Visitor idiom described here.

Recall that in the Default Visitor idiom, concrete Visitor classes need not implement every enter() function. Whichever classes do not have an enter() will call the enter() of their parent class (the same goes for exit() of course). It's hard to imagine the usefulness in the toy Expression hierarchy of our earlier Visitor posts. To make the example a little more complicated, we'll add some simple Statement classes to our toy visited hierarchy, including an abstract class which is not entirely realistic: MultiStatement, representing a Statement that itself contains one or more other Statements.

Here are the definitions of our new Statement classes (as before, with the disclaimer that it's a toy and not too cleverly designed):

class Statement {
public:
Statement();
virtual ~Statement() throw() = 0;
virtual void accept(Visitor &v) const = 0;
};

class Assignment : public Statement {
public:
Assignment(Expression *lhs, Expression *rhs);
virtual ~Assignment() throw();

const Expression &lhs() const;
const Expression &rhs() const;
void accept(Visitor &v) const;

private:
Expression *_lhs;
Expression *_rhs;
};

class MultiStatement : public Statement {
public:
MultiStatement();
virtual ~MultiStatement() throw();
};

class Block : public MultiStatement {
public:
Block();
virtual ~Block() throw();

const std::vector<Statement *> &stmts() const;
void accept(Visitor &v) const;

void addStmt(Statement *stmt);

private:
std::vector<Statement *> _stmts;
};

class IfElse : public MultiStatement {
public:
IfElse(Expression *condition, Statement *ifBranch,
Statement *elseBranch = NULL);
virtual ~IfElse() throw();

const Expression &condition() const;
const Statement &ifBranch() const;
const Statement *elseBranch() const;
void accept(Visitor &v) const;

private:
Expression *_condition;
Statement *_ifBranch;
Statement *_elseBranch;
};

The Statement::accept() functions look like this:

void Assignment::accept(Visitor &v) const
{
if (v.enter(*this)) {
v.visit(_lhs);
v.visit(_rhs);
v.exit(*this);
}
}

void Block::accept(Visitor &v) const
{
if (v.enter(*this)) {
std::vector<Statement *>::const_iterator it;
for (it = stmts().begin();
it != stmts().end(); ++it) {
const Statement *s = *it;
v.visit(s);
}
v.exit(*this);
}
}

void IfElse::accept(Visitor &v) const
{
if (v.enter(*this)) {
v.visit(*_condition);
v.visit(*_ifBranch);
v.visit(_elseBranch);
v.exit(*this);
}
}

Note that Visitor pushes polymorphism a little farther than C++ inheritance does. As shown in IfElse::accept(), it can visit Statements or Expressions, despite the fact that they do not have a common ancestor! Depending on your project, you may find it convenient to group various independent hierarchies into common Visitor classes. Minor detail: we do need to add Visitor::visit() inline functions for Statement pointers and references, as we did for Expressions.

Here's an example that shows the expressive power of Default Visitor: a class that collects the set of Variables assigned to in a piece of code. For brevity I'll just show the class function definitions, not the class declaration:

FindLhsVars::FindLhsVars()
: Visitor(), _vars(), _lhs(false)
{
}

bool FindLhsVars::enter(const MultiStatement &)
{
return true;
}

bool FindLhsVars::enter(const Assignment &a)
{
_lhs = true;
visit(a.lhs());
_lhs = false;
return false;
}

bool FindLhsVars::enter(const Variable &v)
{
if (_lhs) {
_vars.insert(v.name());
}
return false;
}

const std::set<std::string> &FindLhsVars::
vars() const
{
return _vars;
}

It might not seem like much in our toy example, but consider if you had a realistic language hierarchy, and your Visitor was not a Default Visitor. You would have to tediously define an enter() function for every class in the hierarchy, instead of for three classes. Furthermore, unlike the textbook concrete Visitor, FindLhsVars is likely to still work correctly without any changes if you add a class to the hierarchy. I can't say enough good things about Default Visitor -- it takes one of the sourest Visitor lemons and makes it into a very sweet lemonade.

The next post provides some preprocessor tricks that take a lot of the grunt-work out of adding classes to a Visitor, while also bulletproofing your derived concrete Visitors.

Sunday, April 5, 2009

Easier-to-Use C++ Visitors

A recent post described the basic structure of the Visitor design pattern for C++. We used the visitFoo()/accept() idiom presented in the GoF book.

Now let's tweak the design a bit, to make things a little prettier, and to make it easier to derive classes from Visitor. After that, I'll need to do another post on preprocessor tricks that generate boilerplate code and set up compile-time errors for common Visitor coding mistakes.

Cosmetic Changes

It's a little awkward to invoke the Visitor code by calling a member function of the visited base class -- obj.accept(v) -- so let's add inline functions to Visitor to provide the more natural idiom of v.visit(obj):

void visit(const Expression &e) {
e.accept(*this);
}
void visit(const Expression *e) {
if (e) {
e->accept(*this);
}
}
Note: this means that Visitor.hpp includes Expression.hpp. That is the correct dependency -- Visitors know more about Expressions than vice versa. Thus, Expression.hpp will declare class Visitor instead of including Visitor.hpp.

Don't be tempted to use operator()(const Expression &e) instead of visit(). At first it sounds clever that you will be able to write Print print(stream); print(obj);, but it's bewildering to new readers of your code, and it will even make you scratch your own head when you revisit that code a year into the future. It's much easier to read Print printer(stream); printer.visit(obj);. Trust me, I've been there, trying to figure out what my own program was doing.

Next, I disagree with encoding the type names into the Visitor function names -- it sounds silly to read the declarations aloud: visitOperation(Operation &) -- is there an echo in here? C++ has function overloading -- let's use it, and give all the Visitor members the same short name. We just used up visit(), so we'll have to think of something else. For reasons that will make more sense in a moment, let's use the name enter():

virtual void enter(const Variable &);
virtual void enter(const Number &);
virtual void enter(const Operation &);
A pleasant side effect of this is that there is less room for error when you change the name of a class. If the type name is encoded in the function name, and you forget to change the function name in a concrete visitor, you have just disabled the visitor for that type.

Ease-of-use

Let's fix some usability defects in the basic Visitor implementation. The worst one is that a recursive operation has to know the structure of the visited objects -- see Print::visitOperation() in the earlier post. That is a breakdown of encapsulation, or responsibility, or both.

On the other hand, some concrete Visitors do not want to recurse into subobjects -- they just want to operate on the one visited object. To accomodate both kinds of Visitor, we move the recursiveness into accept() -- makes sense, it's the object's responsibility -- but change the signature of enter() to allow the derived Visitor to choose whether to recurse or not, by returning either true (recurse) or false (don't).

Now the enter() prototypes look like this:

virtual bool enter(const Variable &);
virtual bool enter(const Number &);
virtual bool enter(const Operation &);
The no-op defaults in the base class return false. I went around and around with myself on the question of whether to make recursion the default, but I think the answer is, If your Visitor works harder by recursing, it should require more code to get it done -- overriding the defaults to recurse where needed.

Now that responsibility for recursion has returned to the visited hierarchy, Operation::accept() looks like this:

void Operation::accept(Visitor &v) const
{
if (v.enter(*this)) {
std::vector<Expression *>::const_iterator it;
for (it = operands().begin();
it != operands().end();
++it) {
const Expression *e = *it;
v.visit(e);
}
v.exit(*this);
}
}
Handling recursion in the object does the trick most of the time, but the sad thing is, it still doesn't fix the encapsulation issue with our pretty-printer example. To print an infix expression, the Print Visitor has to know the structure of Operator. That's just a fact of life, but if we change the Visitor interface just a tiny bit, we can at least print expressions in prefix and postfix notations without loss of encapsulation. All that's needed is an exit() function to correspond to the enter() function, which will close parentheses for prefix expressions, or print the operator for postfix expressions.

As you can guess, the exit() function returns void, and the default implementation in abstract base Visitor does nothing. If you have sharp eyes, you'll notice that I already added a call to it in Operation::accept() above. It only gets called if the Visitor chose to recurse. Now a postfix pretty-printer is much simpler than the infix version presented in the earlier post:

bool Postfixer::enter(const Variable &v)
{
_str << v.name() << " ";
return false;
}

bool Postfixer::enter(const Number &n)
{
_str << n.value() << " ";
return false;
}

bool Postfixer::enter(const Operation &o)
{
return true;
}

void Postfixer::exit(const Operation &o)
{
_str << o.symbol() << " ";
}
Most importantly, it knows nothing about the structure of Operation.

This post has dragged on long enough, but there is one last ease-of-use improvement I want to make to our Visitor. It must implement the Default Visitor idiom, which makes the double-dispatch polymorphic in both parameters. In simpler terms, if the concrete visitor does not implement, say, enter(Foo &), the default action is not "do nothing", it is enter(ParentOfFoo &). It's hard to visualize the benefit of this in my tiny toy example, but in a realistic hierarchy -- for example, a language hierarchy with abstract Declarations, Statements, and Expressions -- it often saves a lot of coding. Note that in our example, it means we have to add a function to Visitor that hasn't been there before: enter(Expression &).

Next up, some preprocessor magic which generates a lot of the code for these Visitors -- including the Default Visitor fallbacks -- and which also gives you distant early warning of certain common typos.

Sunday, March 15, 2009

C++ Visitors: Basic Implementation

In the last post, I described the pros and cons of the Visitor design pattern, and promised some sample C++. This post has a basic implementation of Visitor, to illustrate the concepts. In later posts, we'll build a more powerful and yet more user-friendly Visitor, and use some preprocessor tricks to partially foolproof the use of Visitor.

One place that I've found Visitor useful is in class hierarchies that describe a language -- in my case, usually a hardware description language, but it would apply equally to classes representing a programming or scripting language. In a language project, you often find yourself traversing a parse tree of the input code, performing functions which can't properly be said to be the responsibility of the language classes. Even if you could make a case for adding virtual functions to achieve the task, it makes sense to keep the related functionality together in one file. But you want to dispatch on the node type, so that your code doesn't devolve into a forest of switch statements. This is exactly the place to use Visitor.

Since language parse trees are a common place to apply Visitor, let's start our sample C++ code with a toy class hierarchy for expressions. There are only three concrete classes: Variable, Number, and Operation, all derived from abstract Expression. I kept these classes very simple for brevity's sake, hopefully more care would go into their design in real life:

class Expression {
public:
Expression();
virtual ~Expression() throw() = 0;
virtual void accept(Visitor &v) const = 0;
};

class Number : public Expression {
public:
explicit Number(int value);
virtual ~Number() throw();
int value() const;
virtual void accept(Visitor &v) const;

private:
int _value;
};

class Variable : public Expression {
public:
explicit Variable(const std::string &name);
virtual ~Variable() throw();
std::string name() const;
virtual void accept(Visitor &v) const;

private:
std::string _name;
};

class Operation : public Expression {
public:
enum Operator {
PLUS, MINUS, MAX_OPERATOR
};
Operation(const Operator &op,
Expression *lhsOrOnly,
Expression *rhs = NULL);
virtual ~Operation() throw();
Operator op() const;
const char *symbol() const;
const std::vector<Expression *> &operands() const;
virtual void accept(Visitor &v) const;

private:
Operator _op;
std::vector<Expression *> _operands;
};


Notice that Expression has a pure virtual accept() function. We need that in order to bounce into the correct function in Visitor classes -- we'll discuss that in a moment. Now let's see what the abstract base Visitor class looks like. It's just an interface, any real visiting work will be done by concrete visitors derived from this. As I said, we'll be making the interface more sophisticated in the next post, but here's our first crack at it:


class Visitor {
public:
Visitor();
virtual ~Visitor();

virtual void visitVariable(const Variable &);
virtual void visitNumber(const Number &);
virtual void visitOperation(const Operation &);
};


Our first implementation of accept() is quite simple, so I'll only show one example:


void Operation::accept(Visitor &v) const
{
v.visitOperation(*this);
}


C++ Visitors are coded with this accept/visitFoo idiom to work around C++'s lack of multiple dispatch. Calling visitor.visit(*expr) would pick visit()'s signature at compile time based on the static type of expr. So we have to call expr->accept(visitor), which picks the accept() function at run time based on expr's actual type. Inside accept(), *this has the correct static type and can go to the correct member of the Visitor.

Some things to note about this code:
  1. It's usually reasonable for accept() to be a const function and visit*() to take const references. But you may need it to be otherwise. Remember, Visitors are like external virtual functions. Will the virtual functions you'll add be const or non-const?
  2. On the other hand, you definitely want visit*() to be non-const, and for accept to take a non-const reference. Concrete Visitor classes may need to store some state.
  3. Make ~Visitor() virtual but not pure, and don't declare private copy/assign members (see 5.3.1) for Visitor, so that derived Visitors can rely on compiler defaults if they wish.
  4. For the same reason, it seems better to define no-op defaults for the visit*() functions instead of making them pure virtual.
  5. As written, Operation::accept() does not traverse the Expressions contained in the Operation. That means that concrete Visitors have to do so themselves if they want to. Really such traversal should be Operation's responsibility; we'll add that capability in our next pass.
So how do you use this thing? A simple example is class Print : public Visitor, which will print one of our Expressions. The interface is self-explanatory -- a constructor and inherited virtual visit*() functions -- so I'll just give the implementation:

Print::Print(std::ostream &str)
: Visitor()
, _str(str)
{
}

void Print::visitVariable(const Variable &v)
{
_str << v.name();
}

void Print::visitNumber(const Number &n)
{
_str << n.value();
}

void Print::visitOperation(const Operation &o)
{
_str << "(";
std::vector<Expression *>::const_iterator it =
o.operands().begin();

while (it != o.operands().end()) {
const Expression *e = *it;
++it;
if (it == o.operands().end()) {
// Print unary or binary operator.
_str << o.symbol();
}
e->accept(*this);
}
_str << ")";
}

The Print class is suitable for framing in an ostream shift operator:

std::ostream &operator<<(
std::ostream &str, const Expression &expr)
{
Print printer(str);
expr.accept(printer);
return str;
}

There you go. The basics of C++ Visitors. The toy example doesn't really do justice to how useful they are: Visitors are a big help when you have a big hierarchy to deal with. But the simple design presented here runs into some difficulties when you have a big hierarchy. Coming up, I'll show you some tricks to get the power of Visitors while softening some of the difficulties with using them.

Here is the complete list of posts in this series:

Sunday, March 8, 2009

C++ Visitor Pattern: Motivation

The Visitor design pattern is perhaps not the best-loved of the GoF patterns. One indication of its unpopularity is its position as the last pattern in the GoF Design Patterns book. But I have found it useful in several projects, and would like to share some Visitor tricks I've learned along the way.

In C++, Visitor is an abstract base class, with a set of one-argument virtual functions -- one for each class in a forest of hierarchies -- thereby providing "multiple-dispatch" of some piece of functionality. Virtual functions are an example of single-dispatch: based on the type of an object, the correct virtual function is called. Multiple-dispatch means that a function is selected based on more than one type: in this case, the type of the visitor, and the type of the class it visits.

Have you ever been writing some code, and thought: "I wish I could add a virtual function to these classes, without changing the class interfaces"? If so, that is exactly what Visitor allows. Of course, you have to have designed the class to accept a Visitor first, but after that, you have this "external virtual function" capability.

Another way in which Visitor is helpful in big C++ projects, is that it allows you to have good source file hygiene -- as described in our coding standard -- but still keep related functionality in one place. An example is pretty-printing of an Expression hierarchy: to make consistency easier to maintain, you would like all the printing functions to be in a single file, which would violate our source file standards if implemented by virtual functions in the classes.

Sounds good, why is Visitor unpopular? There are a couple of valid complaints about the pattern. First, it can make for difficulties when adding a new class to a visited hierarchy. This is somewhat mitigated by the Default Visitor variant, and we can also use some preprocessor tricks to create a compile- or link-time error for Visitors which must be extended for some or all new classes. It's also worth mentioning that if the Visitor functionality were instead implemented as a pure virtual function, the same amount of work is required (except for tracking down all the Visitors).

Secondly, effective use of Visitors may imply a loss of encapsulation. If a Visitor is to do meaningful work on an object, it may need access to implementation details that -- all other things being equal -- we would prefer to hide inside the class.

Nevertheless, Visitor can be a very useful design pattern. In coming posts I will present some practical C++ examples for getting the most out of it.

Here is the complete list of posts in this series: