
Bug Free Programming!

Prof. Daniel Bernstein is known for his interesting programming style and his insistence that bug-free coding, even in C, is possible. Most people are familiar with his code (qmail, djbdns, cdb, and so on), and some with his $500 guarantees of buglessness, but fewer seem familiar with his actual style of coding.

The Method

I asked Aaron Swartz whether Prof. Bernstein had written anything about his methods of coding, wondering whether they're transferable to other people. I'd been working on a curses editor, and I wanted the foundations to be very solid, but I kept running into problems.

Aaron, a self-confessed Bernsteinaholic, pointed me to a USENET message from Prof. Bernstein back in 2001, in which he advises someone to "Double-check your typing. Triple-check your typing. Measure your error rate so that you can calculate the proper number of times to check." and to "Structure your code so that mistakes are detected by automated tests."

And Aaron himself phrased the general method as follows:

  1. Architect it beautifully
  2. Look at every single line of code and ask yourself "how could this go wrong?"
  3. Keep careful track of your screwups in step 2 and build systems to avoid them
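
To make the third point concrete, here's a minimal sketch of the habit it suggests, using a toy Cursor class of my own rather than anything from Prof. Bernstein's code or from my editor: every screwup you notice becomes an automated test, so the same mistake is caught mechanically the next time round.

   class Cursor:
      """A toy cursor over a list of lines, standing in for real editor state."""
      def __init__(self, lines):
         self.lines = lines
         self.y, self.x = 0, 0

      def move_left(self):
         if self.x:
            self.x -= 1
         elif self.y:
            # Wrap to the end of the previous line
            self.y -= 1
            self.x = len(self.lines[self.y])
         # At (0, 0) there's nowhere to go, so this is deliberately a no-op

   def test_move_left_at_document_start():
      # Encodes a screwup worth guarding against: moving left at (0, 0)
      # must do nothing, not raise an error or wrap around to line -1
      cursor = Cursor(["abc", "def"])
      cursor.move_left()
      assert (cursor.y, cursor.x) == (0, 0)

   def test_move_left_wraps_to_previous_line():
      cursor = Cursor(["abc", "def"])
      cursor.y = 1
      cursor.move_left()
      assert (cursor.y, cursor.x) == (0, 3)

   test_move_left_at_document_start()
   test_move_left_wraps_to_previous_line()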

I'd already come to some similar conclusions, though without voicing them as clearly; nonetheless, the very system I was building at the time, the one that prompted this enquiry, had already been a good demonstration of Aaron's first point.

Beautiful Code

One of the simplest things an editor must do is move the cursor about, but when you're dealing with a window, a cursor, and a text data structure behind them, it's very easy to get the rendering wrong, introduce fencepost errors, and so on.
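
For instance, consider which slice of a long line is visible and where the cursor falls within it. The names below are illustrative stand-ins, not the editor's own:

   width = 80                         # columns in the window
   line = "x" * 200                   # a document line wider than the window
   startchar = 120                    # first visible character of the line
   visible = line[startchar:startchar + width]

   # A cursor at document column p appears at screen column p - startchar,
   # but only while 0 <= p - startchar < width; both boundary checks are
   # classic places to be off by one.
   p = 130
   if 0 <= p - startchar < width:
      screen_x = p - startchar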

My code was purely structural: it moved the cursor around, checked the size of the window, checked the size of the lines. None of it made very much sense when you took a step back. Instead, I decided to cut two- or three-line chunks out of the code, put each into a new method, and give the new method a very long, descriptive name. I also thought and rethought the whole procedure involved in moving the cursor around.

Here's an example of the transformation, which was extremely beneficial: not only is the code now more readable, but I fixed numerous bugs in the process. One of the original methods:

   def moveLeft(self): 
      y, x, q, p = self.position()
      schar = self.editor.doc.getStartchar(q)
      if x: self.editor.moveCursor(y, x - 1)
      elif schar: 
         start = max(0, schar - self.editor.maxx - 5)
         self.editor.doc.setStartchar(q, start)
         self.drawline(y)
         self.editor.moveCursor(y, min(p - 1, self.editor.maxx))
      elif (y or self.editor.doc.startline): 
         self.moveUp()
         self.endOfLine()

And its replacement:

   def move_left(self): 
      screen, doc = self.context()

      if not screen.start_of_line(): 
         self.move_cursor_left()

      elif screen.start_of_line() and \
           not doc.first_line() and \
           not screen.hidden_left(): 
         self.move_up()
         self.end_of_line()

      elif screen.start_of_line() and \
           screen.hidden_left(): 
         self.show_left_section()
         self.end_of_screen_line()

Note that the old code had already abstracted a little, but it didn't go far enough. Note also how the order of the cases has changed, reflecting a much better design.

The basic principle is to make the code self-documenting by writing working pseudocode. This works especially well in Python, and points to why we're all using high-level languages now, moving further and further away from assembly code.
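
Those helper predicates aren't shown above, but each is meant to reduce to a single readable sentence. Sketched roughly, with simplified bookkeeping and illustrative attribute names, they have something like this shape:

   class ScreenState:
      """Hypothetical stand-in for the editor's screen bookkeeping."""
      def __init__(self, cursor_x, startchar):
         self.cursor_x = cursor_x     # the cursor's column within the window
         self.startchar = startchar   # first visible character of the line

      def start_of_line(self):
         # Is the cursor in the leftmost column of the window?
         return self.cursor_x == 0

      def hidden_left(self):
         # Is part of the current line scrolled off the left edge?
         return self.startchar > 0

   screen = ScreenState(cursor_x=0, startchar=40)
   assert screen.start_of_line() and screen.hidden_left()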

Some Sort of Conclusion

This piece is unfinished, but if it were finished (you can just make the rationale up in your head) the conclusions would be:

People designing programming systems, be they languages, frameworks, or IDEs, should aim to make sure that going high-level doesn't come at the cost of elegance or, especially, performance. In my code example above, I took a small performance hit, which I could afford because the program is interactive: microsecond-length lags are a worthwhile trade-off for fewer bugs in the code. For other applications, on the other hand, this clearly wouldn't be acceptable.
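
To put a rough number on that hit, the cost of routing a test through a descriptively named method, rather than writing it inline, can be measured directly. This is an illustrative sketch, not a benchmark of the editor itself:

   import timeit

   class Screen:
      def start_of_line(self, x):
         # The same test as the inline version, behind a descriptive name
         return x == 0

   screen = Screen()
   inline = timeit.timeit("x == 0", setup="x = 3", number=1000000)
   called = timeit.timeit("screen.start_of_line(3)",
                          setup="from __main__ import screen",
                          number=1000000)
   print("inline test: %.3fs per million" % inline)
   print("method call: %.3fs per million" % called)

On a typical machine the difference is a fraction of a microsecond per call: exactly the sort of cost an interactive program can absorb.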

Sean B. Palmer, 2006-11-21