Sunday, October 5, 2008

Obfuscated Directory Names

I have to rant about some confusing directory/file/object names in a project I inherited at work. The guy who did it is a competent, experienced engineer. This isn't an attack on him or his work habits. If there's a moral to this rant, it's that good developers can make this kind of mistake, so the rest of us have to be extra careful not to make it.

Now, on to my rant. This C++ project -- for purposes of anonymity let's call it the "zoo" project -- interfaces with a Tcl interpreter. One confusing thing about the project organization is that it doesn't follow Rules 1.1 and 1.2 of my C++ coding standard: each source file might define many different classes. That compounds some other naming issues, such as classes with ridiculously similar names ("ZooObj" and "ZooObject"), or files named after an abstract base class but with the capitalization changed: zooObject.hpp.

But the real problem has to do with directory names that don't describe their contents well, and that all sound the same. Here's the directory structure:

...
zoo/
    tclApi/
    zooDb/
        zooDb/
        zooDbTcl/

The morpheme "db" here just means "I am defining a set of classes that model the zoo data". Well, no kidding, it's the zoo project. Things that go without saying should just not be said. In object-oriented design, responsibility for every function belongs to some class or another. You should never need special directories called "db" or "model" or "classes" or anything like that -- they don't add any information.

The next indication that something has gone wrong is that we have a directory called "zooDb/zooDb". That's one duplication that we could surely live without. You also have to wonder why there are two directories with "tcl" in the name. It turns out that one rationale for the directory structure is that there is a set of classes which will be compiled into a library with no dependencies on the larger project, so that it can provide a Tcl package to any interpreter. So maybe there's a method to this madness: we need the zoo/zooDb/zooDbTcl directory for that separate Tcl package code. Oops, turns out that's not the case; it's zoo/zooDb/zooDb that holds that. Arrrggggh!

What can we learn from this bad example? First, don't add directory structure where it's not needed. You probably need one directory for each object library you're building; any more than that just makes it confusing when someone is trying to walk through your code. Second, directory names and class names should not contain superfluous strings like "db", "model", or "class". Finally, don't add misleading strings -- like "tcl" in the example above -- that lead away from the directories or files that should be described with them.

If we apply all that, we end up with a much cleaner directory structure:

...
zoo/
zooTclPackage/

With that structure, when a new developer is looking for the classes that are in the dependency-free package, he knows just where to look. When he's looking for other zoo-project files, he knows where to look. It doesn't matter that some code interfacing with Tcl is mingled in a directory with stuff that knows nothing about Tcl -- like the food on your plate, it all gets mixed together when you eat it anyway. As an added benefit, the flat hierarchy, combined with putting every class in a separate header file, will hopefully cut down on the temptation to create evil twins like ZooObj and ZooObject.
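
The first two lessons above (no filler words, no pointless nesting) are mechanical enough that you could check for them with a few lines of code. Here is a rough sketch in C++. Everything in it is my own invention for illustration: the function names (toLower, splitPath, checkPath) are hypothetical, the filler list is just the three words mentioned in this post, and the matching is a crude substring test that will produce false positives ("feedback" contains "db", for instance). The third lesson, misleading names like the "tcl" one, needs a human.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lowercase copy of s, so "zooDb" and "ZooDB" compare the same.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Split "zoo/zooDb/zooDb" into {"zoo", "zooDb", "zooDb"}; empty parts are skipped.
std::vector<std::string> splitPath(const std::string& path) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : path) {
        if (c == '/') {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
}

// Returns one human-readable complaint per problem found in the path.
std::vector<std::string> checkPath(const std::string& path) {
    // Assumed filler words: only the examples named in the post.
    const std::vector<std::string> filler = {"db", "model", "class"};
    std::vector<std::string> complaints;
    const std::vector<std::string> parts = splitPath(path);
    for (std::size_t i = 0; i < parts.size(); ++i) {
        const std::string lower = toLower(parts[i]);
        // Lesson 2: names should not carry words that add no information.
        for (const std::string& word : filler) {
            if (lower.find(word) != std::string::npos) {
                complaints.push_back(parts[i] + ": contains filler \"" + word + "\"");
            }
        }
        // Lesson 1: a directory that repeats its parent's name is needless nesting.
        if (i > 0 && lower == toLower(parts[i - 1])) {
            complaints.push_back(parts[i] + ": repeats its parent directory name");
        }
    }
    return complaints;
}
```

Calling checkPath("zoo/zooDb/zooDb") yields three complaints (two for "db", one for the repeated name), while checkPath("zoo/zooTclPackage") yields none.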
