Today will be the last Tech Tuesday for 2012, as I will be taking another one of my information slim-down breaks starting next week. In the review of programming topics covered so far, I pointed out that we have not yet looked at any of the questions that arise once we have given instructions to the computer. The biggest one is: does it all work? Or are there errors? How would we know, and what do we do about it?
Program errors are known as bugs, and finding and getting rid of them is known as debugging. It is sometimes claimed that this usage goes back to a story of Grace Hopper finding a moth in the relays of Harvard’s Mark II computer. That story may or may not be apocryphal, but the use of the word bug to describe a fault or error is definitely older than that. In reading up a bit for this post I discovered that Lady Ada is credited, including by Babbage himself, with having found the first error in a program for his Analytical Engine (Google had a Lady Ada Doodle yesterday in honor of her 197th birthday).
I first wrote about errors in the context of syntax. Syntax errors tend to be the easiest to find, as generally the program will either not compile or not execute. The compiler or interpreter will produce an error message that identifies the offending lines of code and often what the specific syntax error is. Even then some hunting around is sometimes required, for instance in the case of unbalanced blocks (meaning a missing closing element for a block, such as a missing “}”).
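To make this concrete, here is a minimal sketch in Python, where the unbalanced element is a missing “)” rather than a “}”. The snippet, the `total` function, and the file name `example.py` are made up for illustration. It hands the broken source to the built-in `compile` and prints where the interpreter says the problem is.

```python
# Hypothetical source with an unbalanced block: the "(" after sum is never closed.
source = """def total(prices):
    return sum(
        price for price in prices

print(total([1, 2, 3]))
"""

error = None
try:
    # compile() only parses the code; nothing in `source` is executed.
    compile(source, "example.py", "exec")
except SyntaxError as exc:
    error = exc
    # lineno and msg are what an interpreter would show in its error message.
    print(f"line {exc.lineno}: {exc.msg}")
```

Which line gets reported depends on the interpreter version: newer Python releases tend to point at the line with the unclosed “(”, while older ones complain about a later line where the parser finally gave up. That gap is the hunting around.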
Most of the time, though, the bugs we are dealing with are of a semantic nature. We think the program means one thing, but when the computer executes it, it really does something else. A lot of work has gone into trying to avoid this by formally verifying the correctness of software. In areas such as, say, the software for space rockets, it is easy to see why we would want to be able to do this before launching the rocket. There are, however, some obvious and some less obvious problems with being able to do this. For starters, how do you specify what the program is supposed to be doing other than by (drum roll) writing another program?
Because formal methods are laborious and limited, a lot of code does contain bugs. Some are even sufficiently famous to have their own names, such as the off-by-one error or OBOE (if you make seven cuts, how many slices of sausage do you have?). The question then is how and when do you best find bugs and get rid of them? The answer to this has changed over the years in important ways as we have moved from a waterfall model of development to the agile model.
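The sausage question can be answered by actually doing the cutting. The following is a small sketch in Python; the function `slice_sausage` and the 16-character "sausage" are invented for this illustration. The line marked in the comments is the one that tends to get forgotten, which is exactly how an off-by-one error creeps in.

```python
from typing import List


def slice_sausage(sausage: str, cut_points: List[int]) -> List[str]:
    """Split `sausage` at each index in `cut_points`.

    Assumes the cut points are sorted and lie strictly inside the string.
    """
    pieces = []
    start = 0
    for point in cut_points:
        pieces.append(sausage[start:point])
        start = point
    # The piece after the last cut. Leaving this line out is the classic
    # off-by-one: you would end up with as many slices as cuts.
    pieces.append(sausage[start:])
    return pieces


sausage = "=" * 16
cuts = [2, 4, 6, 8, 10, 12, 14]
pieces = slice_sausage(sausage, cuts)
print(len(cuts), "cuts ->", len(pieces), "slices")  # prints "7 cuts -> 8 slices"
```

Seven cuts give eight slices, because every cut adds one piece to the single piece you started with.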
As I pointed out yesterday, agile development is really the application of ideas from continuous improvement to software. One of the key ideas is to have no or minimal interim inventory, as it will hide production flaws. In agile we should have as little untested code (which could hide bugs) as possible. This is accomplished through techniques such as frequent code reviews, lots of unit testing, and automated black box testing (a subject of some future posts). Having fewer bugs and finding the ones that do exist more quickly helps reduce compounding effects and makes debugging a lot easier.
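For readers who have not seen one, here is roughly what a unit test looks like, sketched with Python's standard `unittest` module. The function under test, `count_slices`, is a hypothetical stand-in that answers the sausage question from earlier; a real project would test its own functions in the same way.

```python
import unittest


def count_slices(cuts):
    """Hypothetical function under test: slices produced by `cuts` cuts."""
    if cuts < 0:
        raise ValueError("cuts must be non-negative")
    return cuts + 1


class CountSlicesTest(unittest.TestCase):
    def test_no_cuts_leaves_one_piece(self):
        self.assertEqual(count_slices(0), 1)

    def test_seven_cuts_give_eight_slices(self):
        self.assertEqual(count_slices(7), 8)

    def test_negative_cuts_are_rejected(self):
        with self.assertRaises(ValueError):
            count_slices(-1)


# Load and run the tests explicitly so the result can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountSlicesTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Each test pins down one small expectation. If someone later changes `count_slices` to return `cuts`, the first two tests fail right away, long before the bug has had a chance to compound with others.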
Debugging is the process of tracking down and eliminating bugs, which are often tracked in a system such as Bugzilla or Jira. Some engineers absolutely hate it and others love it, since it can be a bit like playing detective (especially when debugging other people’s code). Even just reproducing a bug can sometimes be quite challenging. For instance, the bug may occur only in a very specific configuration of the program and as a result of interaction with either the underlying hardware or other software. In fact, looking for the bug may change the conditions enough for it not to occur, leading to a so-called Heisenbug.
Personally, I am in the camp of folks who find tracking down and fixing bugs quite satisfying. My first major debugging experience was finding bugs in an accounting system written in an early version of Basic that allowed only two-character variable names. The system consisted of 80+ separate program files. What was your most epic bug / debugging experience?