Wednesday, April 15, 2009

Gdb set input-radix

Today I messed up a debugging session I had going, when I forgot the gdb syntax for setting the value of a variable in the code. It wouldn't have been a problem for most variables, but I wanted to set the variable i. I typed:
  (gdb) set i 100
The surprising response to that -- from gdb version 6.3.0.0-1.132.EL3rh -- was:
  Input radix now set to decimal 100, hex 64, octal 144.
Say what? Of course, the correct syntax for setting a variable in the program being debugged is:
  (gdb) set var i = 100
You see, gdb was trying to be nice by interpreting "i" as shorthand for "input-radix", a sub-command of "set". If my variable had some other name that didn't trigger the shorthand, I would have gotten a friendly error message and no ill effects:
  (gdb) set flags 100
  A syntax error in expression, near `100'.
Having numbers you enter interpreted as base-100 is not very useful, and I ended up killing my session and restarting because I didn't figure out the trick for setting the input-radix back to 10 until I had gotten gdb into what seemed to be a useless state. The actual sequence of events was like this:
  (gdb) set i 100
  Input radix now set to decimal 100, hex 64, octal 144.
  (gdb) set i 10
  Input radix now set to decimal 100, hex 64, octal 144.
  (gdb) # Huh? Oh yeah, base100("10") == base10("100").
  (gdb) set i 1
  Nonsense input radix ``decimal 1''; input radix unchanged.
  (gdb) # D'oh! base100("1") isn't 10, it's 1.
  (gdb) set i 10
  Invalid number "10".
  (gdb) # Oops!
Notice that even though it said the radix was unchanged, gdb really did change the radix to base-1! At that point, I thought I was toast, since it couldn't recognize any numbers except 0. I killed my session, losing my breakpoints and history. Later I realized that there is a way out: use a hex number:
  (gdb) set i 0xa
  Input radix now set to decimal 10, hex a, octal 12.
Now, set input-radix could be a handy little thing, if it were restricted to values you might actually want to use as the base for numbers you type in to the debugger. Say, 2, 8, 10, and 16. But 100? You can't even enter digits with values from 16 through 99: it recognizes "F" as 15 -- as long as there's no variable called "F" in the debug context -- but it doesn't recognize "G" as 16. You can even set the input radix to 0! Watch this:
  (gdb) set i 0x0
  Input radix now set to decimal 4294967295, hex ffffffff, octal 37777777777.
The corresponding set output-radix command is restricted to 8, 10, and 16. Base 2 would have been nice, but whatever. It's ridiculous that set input-radix allows anything including 0 and 1.
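
For anyone who wants to replay that session on paper, here is a rough C++ model of the arithmetic. It is not gdb's parser, just a sketch that assumes what the transcript suggests: only the digits 0-9 and A-F are recognized, a digit must be smaller than the radix, and a leading "0x" forces base 16 no matter what the radix is. The names digitValue and parseInRadix are made up for this illustration, and overflow is ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one digit character, or -1 if it is not
// one of 0-9, a-f, A-F (the only digits the session above accepted).
int digitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  const int lower = std::tolower(static_cast<unsigned char>(c));
  if (lower >= 'a' && lower <= 'f') return lower - 'a' + 10;
  return -1;
}

// Hypothetical model of number entry under a given input radix.
// Returns false for an "Invalid number"; otherwise stores the value in out.
bool parseInRadix(const std::string &text, unsigned long radix,
                  unsigned long &out) {
  std::size_t pos = 0;
  // Assumption: a "0x" prefix means hex regardless of the current radix.
  if (text.size() > 2 && text[0] == '0' &&
      (text[1] == 'x' || text[1] == 'X')) {
    radix = 16;
    pos = 2;
  }
  if (pos == text.size()) return false;  // nothing to parse
  unsigned long value = 0;
  for (; pos < text.size(); ++pos) {
    const int d = digitValue(text[pos]);
    // Every digit must be strictly smaller than the radix.
    if (d < 0 || static_cast<unsigned long>(d) >= radix) return false;
    value = value * radix + static_cast<unsigned long>(d);
  }
  out = value;
  return true;
}
```

Under this model, "10" in radix 100 is 1*100 + 0 = 100, which is why the second set i 10 changed nothing; in radix 1 the only acceptable digit is 0, so "10" is rejected; and "0xa" comes out as 10 whatever the radix, which is the escape hatch.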

Wednesday, April 8, 2009

Extra Characters After Close-Brace

As noted previously, Tcl forces you to put open-braces "{" on the same line as the if or proc that they belong to, and forces you to use the atrocious "} else {" style.

That's appalling, but get this: it also requires a space to the right of the close-brace "}". That's just plain stupid.

Compounding the problem, the error message isn't something like "a space is required after a close-brace". Oh no. It says "extra characters after close-brace". And if the body of your if is very long, like this:
if {$tcl_is_stupid} {
  # ... many lines of code ...
  # Next line is the error
}else{
  i_eat_my_hat
}

then the error message is truncated so that you only see the part of the command that isn't broken, along with the line number of the "if", not the offending "else". As icing on the cake, in the case I was trying to help someone figure out, the error was at the top of one of Tcl's Icelandic Saga stacktraces, and the line number reported to be in error was with respect to the proc it was in, not the file.

Thankfully, Google led me to the answer, because Tcl sure didn't. I love the subtitle of that Tcl wiki page: Purpose: to discuss one of the few 'gotchas' in Tcl. Ha! What a crock!

Sunday, April 5, 2009

Easier-to-Use C++ Visitors

A recent post described the basic structure of the Visitor design pattern for C++. We used the visitFoo()/accept() idiom presented in the GoF book.

Now let's tweak the design a bit, to make things a little prettier, and to make it easier to derive classes from Visitor. After that, I'll need to do another post on preprocessor tricks that generate boilerplate code and set up compile-time errors for common Visitor coding mistakes.

Cosmetic Changes

It's a little awkward to invoke the Visitor code by calling a member function of the visited base class -- obj.accept(v) -- so let's add inline functions to Visitor to provide the more natural idiom of v.visit(obj):

void visit(const Expression &e) {
  e.accept(*this);
}
void visit(const Expression *e) {
  if (e) {
    e->accept(*this);
  }
}
Note: this means that Visitor.hpp includes Expression.hpp. That is the correct dependency -- Visitors know more about Expressions than vice versa. Thus, Expression.hpp will forward-declare class Visitor instead of including Visitor.hpp.

Don't be tempted to use operator()(const Expression &e) instead of visit(). At first it sounds clever that you will be able to write Print print(stream); print(obj);, but it's bewildering to new readers of your code, and it will even make you scratch your own head when you revisit that code a year into the future. It's much easier to read Print printer(stream); printer.visit(obj);. Trust me, I've been there, trying to figure out what my own program was doing.

Next, I disagree with encoding the type names into the Visitor function names -- it sounds silly to read the declarations aloud: visitOperation(Operation &) -- is there an echo in here? C++ has function overloading -- let's use it, and give all the Visitor members the same short name. We just used up visit(), so we'll have to think of something else. For reasons that will make more sense in a moment, let's use the name enter():

virtual void enter(const Variable &);
virtual void enter(const Number &);
virtual void enter(const Operation &);
A pleasant side effect of this is that there is less room for error when you change the name of a class. If the type name is encoded in the function name, and you forget to change the function name in a concrete visitor, you have just disabled the visitor for that type.

Ease-of-use

Let's fix some usability defects in the basic Visitor implementation. The worst one is that a recursive operation has to know the structure of the visited objects -- see Print::visitOperation() in the earlier post. That is a breakdown of encapsulation, or responsibility, or both.

On the other hand, some concrete Visitors do not want to recurse into subobjects -- they just want to operate on the one visited object. To accommodate both kinds of Visitor, we move the recursion into accept() -- makes sense, it's the object's responsibility -- but change the signature of enter() to allow the derived Visitor to choose whether to recurse or not, by returning either true (recurse) or false (don't).

Now the enter() prototypes look like this:

virtual bool enter(const Variable &);
virtual bool enter(const Number &);
virtual bool enter(const Operation &);
The no-op defaults in the base class return false. I went around and around with myself on the question of whether to make recursion the default, but I think the answer is this: if your Visitor works harder by recursing, it should require more code to get it done -- overriding the defaults to recurse where needed.

Now that responsibility for recursion has returned to the visited hierarchy, Operation::accept() looks like this:

void Operation::accept(Visitor &v) const
{
  if (v.enter(*this)) {
    std::vector<Expression *>::const_iterator it;
    for (it = operands().begin();
         it != operands().end();
         ++it) {
      const Expression *e = *it;
      v.visit(e);
    }
    v.exit(*this);
  }
}
Handling recursion in the object does the trick most of the time, but the sad thing is, it still doesn't fix the encapsulation issue with our pretty-printer example. To print an infix expression, the Print Visitor has to know the structure of Operation. That's just a fact of life, but if we change the Visitor interface just a tiny bit, we can at least print expressions in prefix and postfix notations without loss of encapsulation. All that's needed is an exit() function to correspond to the enter() function, which will close parentheses for prefix expressions, or print the operator for postfix expressions.

As you can guess, the exit() function returns void, and the default implementation in abstract base Visitor does nothing. If you have sharp eyes, you'll notice that I already added a call to it in Operation::accept() above. It only gets called if the Visitor chose to recurse. Now a postfix pretty-printer is much simpler than the infix version presented in the earlier post:

bool Postfixer::enter(const Variable &v)
{
  _str << v.name() << " ";
  return false;
}

bool Postfixer::enter(const Number &n)
{
  _str << n.value() << " ";
  return false;
}

bool Postfixer::enter(const Operation &o)
{
  return true;
}

void Postfixer::exit(const Operation &o)
{
  _str << o.symbol() << " ";
}
Most importantly, it knows nothing about the structure of Operation.
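
If you want to try the whole arrangement in one file, something like the following sketch should compile. It squeezes the pieces from this post together; the constructors, the int type for Number::value(), the char type for Operation::symbol(), the two-operand Operation, and the toPostfix() wrapper are all assumptions made for the sake of a small example, not part of the design above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

class Expression;
class Variable;
class Number;
class Operation;

// Visitor as described in this post: no-op defaults that decline to recurse.
class Visitor {
public:
  virtual ~Visitor() {}
  void visit(const Expression &e);
  void visit(const Expression *e);
  virtual bool enter(const Variable &) { return false; }
  virtual bool enter(const Number &) { return false; }
  virtual bool enter(const Operation &) { return false; }
  virtual void exit(const Operation &) {}
};

class Expression {
public:
  virtual ~Expression() {}
  virtual void accept(Visitor &v) const = 0;
};

inline void Visitor::visit(const Expression &e) { e.accept(*this); }
inline void Visitor::visit(const Expression *e) { if (e) e->accept(*this); }

// Leaf classes: constructors and member types are assumptions.
class Variable : public Expression {
public:
  explicit Variable(const std::string &name) : _name(name) {}
  const std::string &name() const { return _name; }
  void accept(Visitor &v) const { v.enter(*this); }
private:
  std::string _name;
};

class Number : public Expression {
public:
  explicit Number(int value) : _value(value) {}
  int value() const { return _value; }
  void accept(Visitor &v) const { v.enter(*this); }
private:
  int _value;
};

class Operation : public Expression {
public:
  Operation(char symbol, const Expression *lhs, const Expression *rhs)
      : _symbol(symbol) {
    _operands.push_back(lhs);
    _operands.push_back(rhs);
  }
  char symbol() const { return _symbol; }
  // The object, not the Visitor, walks its own operands.
  void accept(Visitor &v) const {
    if (v.enter(*this)) {
      for (std::size_t i = 0; i < _operands.size(); ++i) {
        v.visit(_operands[i]);
      }
      v.exit(*this);
    }
  }
private:
  char _symbol;
  std::vector<const Expression *> _operands;  // borrowed, not owned
};

class Postfixer : public Visitor {
public:
  explicit Postfixer(std::ostream &str) : _str(str) {}
  bool enter(const Variable &v) { _str << v.name() << " "; return false; }
  bool enter(const Number &n) { _str << n.value() << " "; return false; }
  bool enter(const Operation &) { return true; }
  void exit(const Operation &o) { _str << o.symbol() << " "; }
private:
  std::ostream &_str;
};

// Hypothetical convenience wrapper for trying the visitor out.
std::string toPostfix(const Expression &e) {
  std::ostringstream out;
  Postfixer postfixer(out);
  postfixer.visit(e);
  return out.str();
}
```

Building (a + 2) * b from stack objects and passing the outer Operation to toPostfix() should walk the operands left to right and emit each operator from exit(), after its operands.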

This post has dragged on long enough, but there is one last ease-of-use improvement I want to make to our Visitor. It must implement the Default Visitor idiom, which makes the double-dispatch polymorphic in both parameters. In simpler terms, if the concrete visitor does not implement, say, enter(const Foo &), the default action is not "do nothing", it is enter(const ParentOfFoo &). It's hard to visualize the benefit of this in my tiny toy example, but in a realistic hierarchy -- for example, a language hierarchy with abstract Declarations, Statements, and Expressions -- it often saves a lot of coding. Note that in our example, it means we have to add a function to Visitor that hasn't been there before: enter(const Expression &).
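
Here is one way the fallback chain might look, stripped down to two leaf classes with no recursion and no exit(). Treat it as a sketch of the idea only: the NumberCounter visitor and its public counters are invented for the illustration.

```cpp
class Expression;
class Variable;
class Number;

class Visitor {
public:
  virtual ~Visitor() {}
  // Root of the fallback chain: the only default that truly does nothing.
  virtual bool enter(const Expression &) { return false; }
  // Defaults for concrete types forward to the parent type's enter().
  virtual bool enter(const Variable &v);
  virtual bool enter(const Number &n);
};

class Expression {
public:
  virtual ~Expression() {}
  virtual void accept(Visitor &v) const = 0;
};

class Variable : public Expression {
public:
  void accept(Visitor &v) const { v.enter(*this); }
};

class Number : public Expression {
public:
  void accept(Visitor &v) const { v.enter(*this); }
};

// The cast selects the parent overload; the call is still virtual, so a
// concrete visitor's enter(const Expression &) is the one that runs.
inline bool Visitor::enter(const Variable &v) {
  return enter(static_cast<const Expression &>(v));
}
inline bool Visitor::enter(const Number &n) {
  return enter(static_cast<const Expression &>(n));
}

// Hypothetical concrete visitor: special handling for Number only,
// one generic handler for every other kind of Expression.
class NumberCounter : public Visitor {
public:
  NumberCounter() : numbers(0), others(0) {}
  bool enter(const Number &) { ++numbers; return false; }
  bool enter(const Expression &) { ++others; return false; }
  int numbers;
  int others;
};
```

NumberCounter never mentions Variable, yet a visited Variable still reaches its enter(const Expression &) through the forwarding default. Add a dozen more Expression subclasses and the concrete visitor stays the same size.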

Next up, some preprocessor magic which generates a lot of the code for these Visitors -- including the Default Visitor fallbacks -- and which also gives you distant early warning of certain common typos.

Friday, April 3, 2009

Last Element in a Tcl List

Oh, thank you Tcl, for providing the special string end. That way I can get the last element in a list with:

[lindex $mylist end]

What a relief! I was about to type:

[lindex $mylist [expr {[llength $mylist] - 1}]]

And if I had had to type that, my head would have exploded. Fortunately, I got away with just a pulsing vein on my temple at the thought of a pseudo-keyword like end.