Wednesday, July 15, 2009

Perl Cwd Performance

The Camel book says -- in the Efficiency section of Chapter 24 -- to use the Cwd module instead of calling pwd repeatedly.

Boy, it's not kidding. I wrote a test script that does almost nothing but change directories and print the current directory, repeating that about 7500 times. Using `pwd` to get the directory took an average wall-clock time of 11.7 seconds. If the script instead imports Cwd's version of chdir(), it can read the directory from $ENV{PWD}; that took an average wall-clock time of 0.3 seconds.

Oddly, using Cwd::cwd() took an average of 15.4 seconds. This wasn't a very scientific test, but those numbers held up under repeated tries (several runs in a row, throwing out the time for the first run). I used wall-clock time because the sys/user time reported by the Unix time command was quite a bit lower, even though the system was unloaded. That may be because the directories were NFS mounts.

So here's a good strategy for using Cwd:

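use Cwd qw(chdir);    # Cwd's chdir updates $ENV{PWD} on every call

# Return the current directory without running `pwd`.
sub cwd {
    return $ENV{PWD};
}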


Note: the Cwd documentation points out that $ENV{PWD} is only kept up to date if every module used in the script uses Cwd::chdir to change directories.

Monday, July 6, 2009

gcc Warning Options

A few years ago I was in a job interview, and the development manager was gloating because he had squashed all compiler warnings in the code, and then added the gcc -Werror switch to the build, so that warnings now caused the build to fail. At the time, I thought "That's annoying overkill". My feeling was that compiler warnings didn't live very long -- after someone saw one a few times, they would fix it.

But I've changed my mind. Where I work, the team had been doing without -Wall for many years and using -pedantic instead. In other words, the warnings were all noise and no signal, since the pedantic warnings don't tell you about potential bugs in your code, just about adherence to language standards. Moreover, the team was completely happy to ignore all compiler warnings -- even those for things so dangerous that gcc warns about them without any warning switches, like 64-bit size issues. I now understand why someone would insist on -Werror.

Really, the best set of warnings for g++ is -Wall -Wextra -Werror. [If you have an older gcc, like the 3.2.3 that's stock on RedHat 3, use -W instead of -Wextra.] When we switched these on recently, there were thousands of unique warnings. Many of them were harmless at the end of the day, but there were dozens that indicated serious bugs or crashes waiting to happen.
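
To give a feel for what these switches report, here is a minimal sketch. The function names are hypothetical and are not taken from the code base described above. Each function is written in its warning-free form, and the comment describes the variant that recent g++ releases report under -Wall.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper. Declaring the index as `int i` would compare a
// signed value against the unsigned result of size(); for C++, recent
// g++ releases report that under -Wall via -Wsign-compare.
std::size_t count_negative(const std::vector<int>& values) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper. Dropping the final `return 0;` would leave one
// path falling off the end of a non-void function, which -Wall reports
// via -Wreturn-type.
int sign_of(int value) {
    if (value > 0) {
        return 1;
    }
    if (value < 0) {
        return -1;
    }
    return 0;
}
```

With -Werror added, either of the variants described in the comments would stop the build instead of scrolling past.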

Of course, you can disable individual warnings, and not all of the noisy ones come from -Wextra. On our RedHat 3 build, we find it convenient to add these disabling flags:
  • -Wno-parentheses: too picky
  • -Wno-unused-parameter: provoked by <fstream>
  • -Wno-deprecated: provoked by <strstream>
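
For unused parameters in your own code, a narrower alternative to the global flag is to leave the parameter unnamed. This is only a sketch with a hypothetical function, and it does nothing for warnings that come out of a library header, which is the case described above.

```cpp
#include <string>

// Hypothetical callback: the signature is dictated by the caller, but
// this implementation has no use for the second argument. Leaving the
// parameter unnamed marks it as intentionally unused, so g++ does not
// report -Wunused-parameter for it.
std::string describe_event(int code, void* /*context*/) {
    return code == 0 ? "ok" : "error";
}
```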

In the best possible world, we would use <sstream> and wouldn't need -Wno-deprecated. But that's a battle for another day. For today, I'm just very happy that we have real warnings turned on, and that they get attention due to -Werror.