Software EngiSneering

Saturday, June 14, 2014

Prevent Emacs from Changing Case During Completions

People where I work have the annoying habit of naming files the same as the class they define, except that the filename is capitalized differently than the class. So the class ZooAnimal is in files zooAnimal.h and zooAnimal.cpp. Why? Why???

It's annoying enough when I'm trying to locate the file for a given class, but a more frequent annoyance is that when I hit M-/ to run dabbrev-expand, the out-of-the-box behavior is to change the case of what I typed to match the expansion that dabbrev found. Using the example above, if I type "ZooAn M-/", dabbrev notices #include "zooAnimal.h", and gives me the completion "zooAnimal". Grrr...

The dabbrev package is so fundamental that you don't need to specially enable it in your .emacs file. In fact, I didn't even realize that's what I was using, so in my first efforts to change the completion case-folding, I was searching for variables named "*complet*", and hit a dead end. Only on the next day did I think to C-h k M-/ to find out what function was case-folding the expansions, and that led me immediately to the variable dabbrev-case-fold-search. The default setting for that is the symbol 'case-fold-search', which means to use the same value as case-fold-search. Now, I do find case-folding useful on searches, either for laziness' sake, or because you don't always remember the exact capitalization of what you're looking for. So I have case-fold-search set to 't', meaning the default behavior of dabbrev-expand was very annoying.

Solution: either enter C-h v dabbrev-case-fold-search and click the customize link (and choose "off" for "case is significant" -- both of which are confusing descriptions in my opinion), or in your .emacs put "(setq dabbrev-case-fold-search nil)".

I purposely left dabbrev out of the post title, hoping people who are confused like me will be more likely to Google it.

Wednesday, April 10, 2013

Emacs tip: Use 'view-lossage' to answer the question "How did that happen?"

Does it ever happen that you are thumping along in emacs, and suddenly you are in a mode or buffer that you never knew about before? You must have unintentionally hit some keystroke that caused the change, but what was it?

You can find the answer by typing C-h l (that's a little "L"). This runs view-lossage, which shows your most recent keystrokes (the last 300 in recent Emacs versions).

For instance, this morning I was suddenly staring at a blank buffer called "ChangeLog". What is that thing? To find out, I typed C-h l. A *Help* buffer opened with a few lines of symbols. Here are the last couple lines:

C-s F I X M E C-a C-n C-n C-n C-n C-n C-n C-n
C-n C-x 2 C-x o C-x V a r C-x u C-h l

You start reading the lossage at the end. So, at the end I can see the C-h l from invoking view-lossage. Right before that, an undo (C-x u) to get rid of the typing I had just done into the unknown ChangeLog buffer. Right before that, C-x V a r. Ah, I meant to search for "Var", but I must have hit C-x instead of C-s. This is the culprit.

So what do those keys do? It opened a new buffer, but did it have some other harmful side-effect I want to know about? Hit C-h k to start the describe-key function, and start typing C-x, v, a. Before I could type the 'r', the *Help* buffer showed up and explained that I had entered the keystroke for vc-update-change-log, a function that seems like it was more useful back in the days of RCS than today, but which doesn't harm anything else. Anyway, mystery solved!

There tend to be two situations in which I call view-lossage. Number one is where I think, "Hmm, I don't know what I did, but it might be useful." For that case, view-lossage lets you explore that functionality. That was the case this morning with ChangeLog; I was curious about it but it turned out not to be useful to me. Number two is where I think "I don't ever want that to happen again". In that case, figure out the offending keystroke, and then disable it in your .emacs file, like this:

(global-unset-key "\C-[\C-[") ;; prefix-command
(global-unset-key "\C-x\C-p") ;; mark-page
(global-unset-key "\C-x\C-n") ;; set-goal-column

Those are typos I've made before that have annoying effects, so I just completely disable those keystrokes (mark-page is especially nasty: it sets the mark at the end of the file, and moves the point to the beginning of the file, so there's not always an easy way to jump back to where you were in the file).

Friday, January 25, 2013

C++ Private Inheritance

For a long time I believed private inheritance was such an arcane feature that it should be avoided. In my C++ Coding Standard, 8.2 stated "Use only public inheritance." Here was my rationale:

You might find private or protected inheritance useful to save a few lines of code somewhere. But I claim you should just duplicate the lines of code, rather than couple together classes which are so different conceptually that public inheritance can't be used. They might be structurally the same today, but if they are conceptually different, that structural similarity may change over time and give you a huge headache.

I think that rationale is pretty good, but now I've decided it might be worthwhile to use private inheritance in some cases. For instance, it might make sense to inherit from some STL container class, and promote some of its lookup functions or subtypes to "public" with "using" declarations. Like this:

  class Foo : private std::map<int, Bar *> {

  public:
    typedef std::map<int, Bar *> Base;
    using Base::key_type;
    using Base::mapped_type;  // the standard name; std::map has no data_type
    using Base::value_type;
    using Base::const_iterator;
    using Base::begin;
    using Base::end;
    using Base::empty;
    using Base::size;
    using Base::find;
    // ...
  };

The likelihood of the structural change mentioned above is very low in this case, and the "using" declarations -- which I'll admit I didn't know could be used that way when I first wrote the standard -- allow you to reuse worthwhile parts of the base class in a very clean and foolproof manner. Later if you realize there's another bit of functionality you want to expose publicly, you only have to write a one-liner, not a wrapper function. Meanwhile, things you might not want exposed -- like operator[]() -- stay out of the interface.

But the clincher for me is that you had better not inherit publicly from a class like std::map. Why? Because it doesn't have a virtual destructor. What if someone -- for whatever reason -- ends up with a pointer to a std::map which is actually one of your Foo objects, and wants to delete it? Only the map destructor will be called. If ~Foo() was supposed to delete its values, that won't happen. Similarly, public inheritance means that someone could take a pointer to a Foo, and then use map's copy constructor or assignment operator to make a simple pointer copy of the Foo, perhaps unintentionally. Then if the original was deleted, and deleted its values, the copy would hold stale pointers.
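Here is a minimal sketch of how that plays out, assuming a hypothetical Bar type and a hypothetical add() helper (neither comes from real code). Foo owns its values, so the destructor must run; private inheritance means nobody outside can get a std::map pointer to a Foo and delete or copy it behind Foo's back.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical value type; the post never defines Bar.
struct Bar {
    explicit Bar(int v) : value(v) {}
    int value;
};

// Foo owns its Bar values, so ~Foo() must always be the destructor that runs.
class Foo : private std::map<int, Bar *> {
public:
    typedef std::map<int, Bar *> Base;
    using Base::const_iterator;
    using Base::begin;
    using Base::end;
    using Base::size;
    using Base::find;

    Foo() {}
    ~Foo() {
        for (Base::iterator it = Base::begin(); it != Base::end(); ++it) {
            delete it->second;
        }
    }

    // Hypothetical insertion helper; operator[] itself stays hidden.
    void add(int key, int value) {
        Bar *&slot = Base::operator[](key);  // inserts a null pointer if absent
        delete slot;                         // deleting null is a no-op
        slot = new Bar(value);
    }

private:
    // Copying would duplicate raw pointers, so forbid it (C++98 style).
    Foo(const Foo &);
    Foo &operator=(const Foo &);
};

// Returns the stored value for key, or fallback if the key is absent.
int lookup(const Foo &foo, int key, int fallback) {
    Foo::const_iterator it = foo.find(key);
    return it == foo.end() ? fallback : it->second->value;
}
```

With this layout, a line like "std::map<int, Bar *> *p = &foo;" outside the class fails to compile, because the base class is inaccessible. That is exactly the protection public inheritance gives up.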

Interestingly, I didn't seem very worried about that situation when I annotated the standard. I wrote that you might sometimes break 8.4 -- "Destructors of base classes must be pure virtual (but implemented)." -- giving this bad advice:

If you have a memory-sensitive class where subclasses will not be used polymorphically (or do not require polymorphic destruction), you can disregard the rule and get rid of the virtual table pointer.

That seems ridiculous now. That is actually exactly the use case for private inheritance: you want to reuse some code, but not polymorphically.

This all came to the fore today when I considered making a class 'public std::pair', so it could model an Edge in a Boost Graph Library edge_list<>, which requires 'first' and 'second' members (at least in the iterator class you instantiate edge_list<> with). But recently a friend of mine had twitted me on another class I wrote that inherits publicly from an STL container, so I started to think about these issues.

I'm still not sure how I'll implement my Edge class -- except that it won't be public pair -- but I see that I need to change the coding standard someday soon, to reflect the reasonable usage of private inheritance.

Wednesday, September 19, 2012

Don't "Namespace" Filenames

When I first started this blog, I wrote up a rant about directory names that mislead about what their contents are. At the same time, I complained about adding unnecessary morphemes to directory names, things like "db" or "model" or such-like.

I've recently become annoyed with the habit of -- for want of a better term -- "namespacing" directory names and file names. I'm not saying not to use good namespace discipline in source code. What I mean is having a directory structure like this:

  zoo/
    zoo_iface/
      zoo_cmds.hpp
      zoo_cmds.cpp
      ...
    zoo_model/
      zoo_cage.hpp
      zoo_animal.hpp
      ...

Why does everything have to start with "zoo"? Look at it this way: if that was how we named people, then everyone's last name would be part of their first name and also part of their middle name. Instead of calling someone Mary Jane Smith, we would know her as MarySmith JaneSmith Smith. Some people make it even worse by prepending the full directory name onto the file name -- like "zoo_model_cage.hpp". Poor Mary Jane would be MaryJaneSmith JaneSmith Smith.

Of course, a source file name may be dictated by its contents -- you're following our C++ standard, right? -- so my ire is directed at redundant directory names more than filenames.

This principle doesn't have to stop at source files. It boggles my mind that raw bug testcases where I work start their life with names like this:

  /home/bugs/prod1_bugs/comp2_bugs/bug6003.tgz

For crying out loud, I know this is a directory full of bugs, can't we just call each bug something like:

  /home/bugs/prod1/comp2/6003.tgz

Tuesday, September 11, 2012

Disabling Auto-Indent in Emacs

I have to edit a C++ file at work whose indentation is so screwed up that any punctuation key I type ends up changing the indentation of the line I'm working on to something that is way out of line with the text right around it. For sanity's sake I have to M-x set-variable c-syntactic-indentation nil when I'm working on that file. The c-syntactic-indentation variable does not start out buffer-local, meaning that for the sake of the one stupid file, I have turned off the very useful feature of syntactic indentation for every file I have open in that session. So, if you are ever going to set it temporarily, make sure you have this line in your .emacs file:

  (make-variable-buffer-local 'c-syntactic-indentation)

While you're at it, don't you want a shortcut for the set-variable? I amused myself greatly this morning as I tried to think what shortcut I would assign this action to. The obvious mnemonics I polled with C-h k were taken: I have C-c TAB set to indent-region, C-c ; to comment-region, and C-c { set to a function to insert a skeleton class definition. What to do? I finally hit upon C-c (, and lo-and-behold I had already set that keystroke to toggle c-syntactic-indentation, sometime in the distant past! Here's the code:

  (global-set-key "\C-c("
                  (lambda ()
                    (interactive)
                    (setq c-syntactic-indentation (not c-syntactic-indentation))
                    (message "c-syntactic-indentation set to %s"
                             (if c-syntactic-indentation "t" "nil"))))

Friday, June 1, 2012

Emacs: Scroll Other Window Up

It often happens that I have two related files open in one emacs frame, and I want to scroll around comparing them. Of course to scroll back and forth in a single buffer, C-v scrolls you down, and M-v scrolls you up. (In a perverse bit of emacs jargon, C-v's function is called "scroll-up", since the text moves "up" relative to the fixed window -- even though every human wanting to look lower in the file thinks and says "scroll down, scroll down" -- and M-v's function is called "scroll-down".)

To keep two files synchronized as you peruse them, the power tool to use is M-x ediff-buffers, which sets up an interactive diff session to step through the diffs. Within ediff, a simple lowercase "v" scrolls lower in both files at once, and uppercase "V" scrolls higher. Don't forget M-x ediff-revision as a handy tool for comparing a version-controlled file to its latest revision, or for comparing two different revisions interactively.

Ediff is great, but sometimes it is too heavy-handed for the eyeballing I'm trying to do. For one thing, if one of the files contains a large section that is missing in the other, scrolling both files at once means striding through the section in one window while inching down a line at a time in the file that lacks it. Also, navigating ediff can be tedious if there are a great number of changes or reordered chunks, and once your diffs no longer line up sensibly, ediff's highlighting is simply a nuisance.

So there are the old standbys C-v and M-v, plus there is the handy M-C-v, which scrolls lower in the other window. That is, if I have a.txt and b.txt open in a single frame, and the cursor is in a.txt, then M-C-v will show a lower chunk of b.txt, to keep up with C-v in a.txt. Until today, whenever I wanted to move higher in b.txt, I didn't know of a single keystroke to do that, so I always just switched windows and used M-v. I could do C-u - M-C-v, but I never liked that.

Turns out there is a keystroke to scroll higher in b.txt without switching windows: M-C-S-v -- same keys as scrolling lower, plus the shift key. Now, M-C-v is a pretty ergonomic companion to C-v. I lean the base of my left hand on the Ctrl key, hit V with my left index finger, and keep my thumb poised over the Alt key to choose which window to scroll. The shift key just does not fit in with this scheme. With some effort, I can hold it with my left pinkie, but it's not comfortable. So my new solution is to tie that command to C-c M-C-v. True, it's not a single keystroke, but unlike the C-u solution above, it's all in the left hand, and close together without being cramped. The .emacs line is:

  (global-set-key "\C-c\M-\C-v" 'scroll-other-window-down)

Yes, the command to go higher in the file -- to scroll up as we humans say -- is "scroll-other-window-down".

Thursday, January 12, 2012

gcc error: invalid use of member (did you forget the `&' ?)

Every time I have gotten the above compilation error, it has nothing to do with a missing ampersand. Instead, it is missing function-call parentheses:

  if (you_go && use->yourMember) { // WRONG!
    this->isInvalid();
  }

  if (i_go && use->myMember()) { // Oh yeah, the parens.
    continue;
  }

So don't use your member invalidly.

Friday, March 11, 2011

Don't Avoid No-ops

A form of spurious case analysis that seems to come up a lot in the code I have to work on right now falls under the category of "avoiding a no-op". Well, don't add cruft to your code just to avoid a no-op.

Here's the one that I see hundreds of:

  if (ptr != NULL) {
    delete ptr;
  }

Instead of:

  delete ptr;

But the line that caused me to post this today was this:

  if (!isalnum(str[i]) && (str[i] != '_')) str[i] = '_';

Why complicate the if condition just to make sure and not overwrite an underscore with an underscore? In reading through this code I paused to consider when an underscore could occur, and whether something special had to happen, only to read a little further and see that someone was guarding against a no-op. Good lord, just perform the no-op so I can read this more easily:

  if (!isalnum(str[i])) str[i] = '_';
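For the record, here is a self-contained sketch of both cases with the guards removed. The function names sanitize and release are hypothetical, and the unsigned char cast is my addition so that isalnum stays well-defined for characters with negative values.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Replaces every non-alphanumeric character with '_'. An existing '_' is
// simply overwritten with itself, which is the harmless no-op.
std::string sanitize(std::string str) {
    for (std::size_t i = 0; i < str.size(); ++i) {
        // The cast keeps isalnum defined for chars with negative values.
        if (!std::isalnum(static_cast<unsigned char>(str[i]))) str[i] = '_';
    }
    return str;
}

// Deleting a null pointer is defined to do nothing, so no guard is needed.
void release(int *&ptr) {
    delete ptr;
    ptr = NULL;
}
```

Calling release on a pointer that is already NULL is fine, and sanitize("a-b_c d") leaves the underscore alone in effect while still touching it.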

Thursday, January 27, 2011

Filename Expansion in Tcl

Stupid, stupid, Tcl. If you want to run a shell command with a glob filename, like *.h, you can't just do:

exec ls -l *.h

You have to explicitly tell Tcl to expand the glob, because exec does not go through a shell and will pass *.h to ls literally. But you also can't do:

exec ls -l [glob *.h]

because it will pass the entire list to ls as a single argument (thanks, Tcl, that's useful). You get lucky and it works if there is exactly one file matched by the glob. Otherwise ls will complain that there's no file called "a.h b.h c.h".

So what do you do? Unfortunately, if you pull up the man page for exec on the internet, it tells you a way to do this that is only syntactically valid for Tcl 8.5 or greater. For those of us living in the past, the demonic incantation you must utter is:

eval [list exec ls -l] [glob *.h]

Don't use Tcl unless you have to.

Wednesday, January 26, 2011

C++ Static No Longer Deprecated

Wow, a long-held piece of geek trivia is no longer true.

This Stack Overflow article points out that in a recent draft of the upcoming C++0x revision to the C++ standard, file-scope static declarations are no longer deprecated. For, oh, 20 years, it's been one of those finer language points that you can whip out to show your sophistication and look down on the ignorant. But between the August 2010 draft (pdf) and the November 2010 draft (pdf), the section deprecating static has been struck!

If you're unfamiliar with the issue, it's that anonymous namespaces provide a more general solution to the multiply-defined symbol problem. Instead of:


  static int num = 0;
  static int read_num() { return num; }

the preferred C++ way to limit visibility of global objects to file scope was:


  namespace {

  int num = 0;
  int read_num() { return num; }

  } // anonymous namespace

The rationale is that you can put other symbols you wish to hide -- like class definitions -- in such a namespace, but static doesn't help you with those. As the most general solution to the problem, namespace wins and static loses. But I suppose that 20 years of failing to break people of the static habit led to a recent change of heart.
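A small sketch of the difference, using hypothetical names (Counter, counter, next_id). The type can only be hidden by the anonymous namespace; the function could go either way, so here it uses static.

```cpp
#include <cassert>

namespace {

// A file-local type: 'static' cannot hide a class definition like this.
struct Counter {
    Counter() : count(0) {}
    int next() { return ++count; }
    int count;
};

Counter counter;  // file-local object, invisible to other translation units

} // anonymous namespace

// A file-local function via 'static', which also gives internal linkage.
static int next_id() { return counter.next(); }
```

Both forms keep the names out of the way of other translation units; only the namespace form works for Counter itself.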

Suits me. I had been consistently using anonymous namespaces for quite a while, both to be modern and especially since it does frequently come up that I want a type that is only used in one file. But static declarations are better for self-documenting the code, so I am happy to welcome them back.

Only problem is, I need a new piece of C++ trivia to show off with.

Wednesday, November 17, 2010

Emacs: Copy Environment Variable from Shell

Sometimes I get annoyed when my emacs session has different environment variable settings than some shell buffer I have running in the session. The most painful is when a Makefile depends on environment settings that I don't have in my .profile. Command-name completion in a shell buffer can also be painful if you have changed your path. And of course I'd like gdb to start up with the correct environment every time -- you can set the variables inside gdb, but that gets old on those days when gdb itself crashes again and again.

I kept pasting export FOO=bar into the *scratch* buffer and editing it into (setenv "FOO" "bar") and eval'ing that. After the 1000th time of doing that, I decided to automate it. Turns out emacs already has an interactive function for copying an environment variable from the shell, but you have to type in the variable name. I decided to write a little function to look for the last export, or the last echo $FOO, and copy that variable:

(defun engisneering-shell-copy-env-var ()
  "Copy the variable from the last export, echo $VAR, or setenv into Emacs."
  (interactive)
  (let* ((expat "\\(export +\\([^=\n]+\\)=\\(.+\\)\\)")
         (echpat "\\(echo +\\$\\(.+\\)\n\\(.+\\)\\)")
         (cshpat "\\(setenv +\\([^ \n]+\\) +\\(.+\\)\\)")
         (patt (concat shell-prompt-pattern
                       "\\(" expat "\\|" echpat "\\|" cshpat "\\)")))
    (save-excursion
      ;; NOERROR is t, so a failed search returns nil instead of signaling.
      (when (re-search-backward patt nil t)
        ;; Group 3, 6, or 9 holds the variable name, depending on which
        ;; alternative matched.
        (let* ((m (or (and (match-beginning 2) 3)
                      (and (match-beginning 5) 6) 9))
               (var (buffer-substring (match-beginning m) (match-end m)))
               (oldval (or (getenv var) " ")))
          (shell-copy-environment-variable var)
          ;; Read the new value directly rather than setq-ing a global.
          (message "Old %s=%s; New %s=%s" var oldval var (getenv var)))))))

Run this function inside your shell buffer, and it will search backwards for the last environment variable action and bring that variable setting into the emacs session. It's also nice because it gives you a message in the minibuffer showing the change. Add a local-set-key -- I like C-c C-v -- in your shell-mode-hook, and you're good to go.

Wednesday, September 22, 2010

Let's Allocate Stuff!

Nobody's perfect, of course, but when there are so many things wrong with a small piece of C++ code, it's hard to be placid. How about:

  std::vector<std::string *> *v = new std::vector<std::string *>;
  // ...
    v->push_back(new std::string("..."));
  // ...
  std::ostream *str = get_me_a_stream(...);
  for (int i = 0; i < v->size(); i++) {
    *str << (*v)[i]->c_str();
  }
  delete str;
  // loop to delete v and its elements.

Ow, my head is spinning from all of these needless allocations. Let's not wear out new and delete.

Well, if you first learn C and then learn C++, your head might still be in Pointer Land. But the unnecessary call to c_str() makes it hard for me to get any work done.
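For contrast, here is roughly what I would rather read. This is a sketch under assumptions: write_all and demo are hypothetical names, and an std::ostringstream stands in for whatever get_me_a_stream(...) really returns.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes every string to the stream; nothing here is heap-allocated by hand.
void write_all(std::ostream &out, const std::vector<std::string> &v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        out << v[i];  // operator<< takes std::string directly; no c_str()
    }
}

// Hypothetical driver standing in for the elided parts of the original.
std::string demo() {
    std::vector<std::string> v;   // automatic storage, no new/delete
    v.push_back("alpha");
    v.push_back("beta");
    std::ostringstream str;       // stands in for get_me_a_stream(...)
    write_all(str, v);
    return str.str();
}
```

The vector, the strings, and the stream all clean up after themselves when they go out of scope, so the delete loop at the end disappears entirely.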

Saturday, August 14, 2010

gcc: error: expected `)' before '&' token

Duplicate post here just to get two related error messages into page titles. That's as much SEO as I know how to do.

If you get this error from g++, and everything looks correct to you, maybe you are defining a constructor without qualifying it with the class name. See the example on this post.

gcc: error: expected unqualified-id before ')' token

I was scratching my head for a few minutes over the g++ error message in the title. Google turned up a couple of red herrings, like this one at Stack Overflow where a clueless noob #defined the name of his class.

My mistake had been to define a constructor without putting the Class:: in front of it:

    Class() // WRONG!!!
    : Super(), _duper()
    {
    }

instead of

    Class::Class()
    : Super(), _duper()
    {
    }

It wasn't as stupid as it sounds: I was cutting and pasting inlined functions into the .cpp file.

Wednesday, July 14, 2010

Can't Locate Object Method via Package

I recently got a confusing Perl error message for a line of code that I thought was a simple assignment to a hash member:

Can't locate object method "patterns" via package "objdir(/[^ ]*)?" (perhaps you forgot to load "objdir(/[^ ]*)?"?) at ./filter.pl line 24.

A Google search turned up many forum entries asking about the message, but they all had to do with open-source projects where there was indeed a package method call that had gone wrong.

After I rubbed my eyes long enough, it was obvious what I had done: left the '$' off the front of the hash reference. I had typed:

    patterns{"objdir(/[^ ]*)?"} = "dirs/objs"; # WRONG!

instead of:

    $patterns{"objdir(/[^ ]*)?"} = "dirs/objs";

I even had use strict; use warnings; on, but only got the error message. Anyway, since Google didn't help me out, let's see if this page with the error text in the title floats to the top and helps someone else out someday.

Friday, June 18, 2010

Blessed Mother of Commented-out Code

It's the little things in life that give pleasure, isn't it? Like the lovely comment below, which has adorned a source file at my company for nearly sixteen (16) years now (meaning the file is at least 5 years older than the company).

/*****************************#if (0) //[
// Marked code has been deleted   -- XXXX (18 August, 1994)
// No need to check whether the value is X/X/X
// Only the size is to be checked which has been done
[[[ DELETED

      char  val;

      val=tolower(*((char*) var)->getValue());
      if ((val != 'X') && (val != 'X') && (val != 'X'))
          return 0;
]]]
   #endif //]
*************************************/

There are many facets to this beautiful gem. First you'll notice that there is a nice C++-style // comment -- complete with the date and the signature of Mr. XXXX -- itself commented out by a C-style /* comment. Too bad the dead code isn't commented with //, so that my grep for tolower would have stood out as unnecessary.

But look, really the /* is there to comment out an #if 0. The guy must have wanted emacs to colorize the dead code as a comment -- I don't think emacs had font-lock back then, but there was the hilit19 package.

Still, you can't be too careful, so let's also wrap that code up in a [[[DELETED ... ]]], in case there is some human that can't read the C comments and the if-0, but who will understand what "triple-bracket DELETED" means.

Finally, it appears the conditional was originally written as #if (0) [ ... ] -- was that ever legal C? -- but when that wouldn't compile, the brackets had to be commented out, because every character in this file is too precious to ever delete. This must be the bad code afterlife. You can check out anytime you like, but you can never leave.

How did all this happen? Is it an accumulation of different commenting-out strategies that occurred over time? Or is it just an unfortunate snapshot of one man's frenzied efforts to remove 3 lines of code without actually deleting anything? Sadly, we can only speculate on what happened, because these lines entered our repository in exactly this condition over 10 years ago.

I love deleting dead code, but I can't touch this one. It's an antique. And so ugly that it's beautiful.

Thursday, June 10, 2010

Csh Skips Last Line in Script

Oh, joy. I've found another thing to hate about csh. If the last line of your script doesn't have a newline after it, that line won't be executed. It will be silently ignored. That isn't very useful.

If emacs knows you are editing a shell script, it will automatically put a newline at the end if you didn't do it yourself. But if you have an ordinary text file that you will run with source, it had better have a newline at the end. You can get emacs to either ensure that or warn you about it by adding one of the following lines to your .emacs file:

  ;; Quietly add a newline if missing.
  (setq require-final-newline t)

  ;; Ask if you want a newline at end of file.
  (setq require-final-newline 'ask)

Googling for this problem, I found another nice csh rant.

Wednesday, April 28, 2010

Tcl Case Analysis Lunacy

This goes way back to the roots of this blog. Complaining about Tcl, and complaining about code that breaks down into unnecessary and confusing case analysis -- those topics were the first three posts here.

Unsurprisingly, the toy language that is Tcl is serving me up some ridiculous case analysis. It has to do with the C++ side of a Tcl integration. When you look at the internals of the language objects created by the Tcl interpreter, some things that are conceptually the same have different internal representations. It's bad enough that I have to succumb to case analysis to figure out what's what; but on the other side of that wall, Tcl had to do some case analysis to put them all in different representations! Bad, naughty Tcl.

Here's what's bugging me. When you look at a Tcl_Obj over in the C world, it has a typePtr member to distinguish different types of things:

A simple name like A has a NULL typePtr.
So does a list of things in braces, like { A B }.
A name with some special characters like A[0] has type "string".
So does a list of things in braces which extends over a few lines, like:
```
{ A \
      B }
```
An explicit list like [list A] has type "list".
A quoted string like "A B" has NULL. At least I think so. I lost track.

So obviously there are a bunch of if statements in the back-end tangling stuff up like this. But it's nearly impossible to untangle it on the C side.

Stupid Tcl. Stupid case analysis.

Wednesday, March 10, 2010

Don't Give Variables Negative Names

Sometimes programmers get caught up in the absence of something. Don't let that leak over into the name of a variable or data member:

bool Job::acceptable(const JobCandidate &applicant) const
{
  bool noFelonyConvictions = !findFelonyConvictions(applicant);
  return noFelonyConvictions && qualified(applicant);
}

The noFelonyConvictions variable documents the code in a readable way, but it gets silly if the code ever changes so that you have to initialize it, or if someone ever wants to know if there are felony convictions. Things become less readable:

  bool noFelonyConvictions = true;
  if (applicant.findRecords(FELONY)) {
    noFelonyConvictions = false;
  }
  return applicant.qualified(!noFelonyConvictions);

It all makes more sense if you name the variable felonyConvictions, which initializes naturally enough to false, can be set to true if one is found, and can be negated to show absence. Also, if you later become interested in the number of convictions, there will be fewer code changes if the variable changes from a bool to an int.
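A sketch of the positive-name version, with hypothetical stand-ins for JobCandidate and its members (the real classes are not shown here). It also takes the step from bool to a count, to show how little has to change.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical candidate type; the field names are assumptions.
struct JobCandidate {
    std::vector<std::string> felonies;
    bool hasSkills;
};

// Positive name: reports what is present, and zero falls out naturally.
std::size_t felonyConvictions(const JobCandidate &applicant) {
    return applicant.felonies.size();
}

// Absence is expressed by negating at the point of use, not in the name.
bool acceptable(const JobCandidate &applicant) {
    const bool hasFelonyConvictions = felonyConvictions(applicant) > 0;
    return !hasFelonyConvictions && applicant.hasSkills;
}
```

Anyone who later needs the number of convictions can call felonyConvictions directly, and nobody has to untangle a double negative.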

Whenever you notice yourself putting a "no" into a variable name, get rid of it.

Thursday, February 4, 2010

How to Undeclare a bash Function

It took me a few minutes to figure this out, so I thought I might as well pass it along.

If you've declared a bash function, and then decided you don't really want it, either because you went ahead and implemented it as a script somewhere, or you gave it a bad name or something, you undeclare it like this:

  $ unset -f myfunc