My feathers are ruffled each time I hear people in the software industry claim that producing bug-free or even bug-rare software is unrealistic or impossible. They’ll come up with technical arguments about the difficulty of writing quality software and how expensive it is to fix bugs. The usual conclusion is that customers should stop whining and get used to using buggy software, or go back to pencil and paper. Sure, very few pieces of software are bug-free when they’re first written, and debugging is a difficult job. But that’s irrelevant.
Bugs are a business decision. If a company ships software with bugs in it, that’s because somebody decided to stop testing and stop fixing bugs, and ship the software in whatever state it’s in. The time to make that decision is when more testing and bug fixing would cost more than the lost sales and support costs caused by the known and unknown bugs remaining in the software. Both these figures will be estimates, since the cost of fixing or not fixing unknown bugs can’t be calculated.
This business decision still does not mean all software has to contain bugs. The higher the cost of unfixed bugs, the more money a company will spend on fixing them.
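To make that trade-off concrete, here is a minimal sketch of the decision rule in Python. Everything in it is an assumption for illustration: the `Estimate` class, the `should_keep_fixing` function and all the dollar figures are made up, and in practice every input is a rough estimate at best.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical estimates for one more round of testing and bug fixing."""
    fix_cost: float              # cost of the extra testing and fixing
    lost_sales_avoided: float    # sales the remaining bugs would otherwise cost
    support_cost_avoided: float  # support effort the fixes would save


def should_keep_fixing(e: Estimate) -> bool:
    """Keep fixing while another round costs less than the damage it prevents."""
    return e.fix_cost < e.lost_sales_avoided + e.support_cost_avoided


# Made-up figures: the same fixing effort, two very different costs of failure.
low_stakes = Estimate(fix_cost=50_000, lost_sales_avoided=5_000,
                      support_cost_avoided=10_000)
high_stakes = Estimate(fix_cost=50_000, lost_sales_avoided=2_000_000,
                       support_cost_avoided=300_000)

print("low stakes, keep fixing:", should_keep_fixing(low_stakes))
print("high stakes, keep fixing:", should_keep_fixing(high_stakes))
```

The rule is the same in both cases. Only the estimated cost of leaving the bugs in changes, and that alone flips the decision.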
Last July I flew to Denver to attend the Shareware Industry Conference. I flew the leg from Taipei to Los Angeles on a Boeing 747 operated by China Airlines. This aircraft has two major software systems on board: the avionics software (flight computer), and the in-flight entertainment system. These two systems are completely independent of each other, developed by different companies, to different standards.
The avionics software is the software that flies the plane. No, the pilots don’t fly the plane, the flight computer does. How many bugs would you tolerate in the avionics software? How many do you think Boeing left unfixed? How many people have ever been killed by software bugs in modern airliners? Zero. A flawed flight computer would immediately ground all 747s worldwide. Boeing would not recover.
The in-flight entertainment system is a completely different story. It’s not essential to the plane. It only serves to make the passengers forget how uncomfortable those economy seats really are. If the entertainment system barfs all over itself, the cost is minimal. Passengers are already out of their money, and most will choose their next flight based on price and schedule rather than which movies are on those tiny screens, if any. I was actually quite pleased with China Airlines’ system, which offered economy passengers individual screens and a choice of a dozen or so on-demand movies (i.e. each passenger can start viewing any movie at any time, and even pause and rewind). That is, until the system started acting up. It locked up a few times, causing everybody’s movie to pause for several minutes. Once, the crew had to reboot the whole thing. That silly Linux penguin mocked me for several minutes while the boot messages crept by. X11 showed off its X-shaped cursor right in the middle of the screen even longer. Judging from the crew’s attitude about it, the reboot seemed like something that’s part of their training.
The 747 story clearly shows that bug-free software is perfectly possible, and that bugs are merely a matter of priority. Software developers usually have a reasonably good idea of what it would cost to get the bug count down to a certain level. But they often have only a vague idea of the costs of leaving bugs unfixed. These can be quite a bit higher than having your in-flight entertainment system ridiculed in an obscure blog.
An obvious cost, though often difficult to quantify, is lost sales. People who stumble upon annoying bugs when using the free trial version are far less likely to buy. People who hear from others how buggy the product is are far less likely to even download the trial. As the software market becomes more and more commoditized, this cost will only increase. The more alternatives customers have, and the easier it is to switch, the less they will put up with bugs in your product.
Another direct cost is tech support. If your software eats up a paying customer’s data, that customer is going to eat up quite a bit of your tech support department’s resources. Larger companies often cheat in this area, staffing their tech support department with low-wage workers who can’t do much more than send out canned scripts. But in a small company, let alone a one-person shop, every minute spent on tech support is often a minute taken away from development and marketing, not to mention fixing bugs. Fixing a bug fixes it for all customers. An email apologizing for it reaches only one person, while a forum or blog post still only reaches a small percentage of customers. It comes as no surprise that many smaller companies are renowned for their attention to the quality of their products.
A real killer is code reuse. Code reuse is often considered the holy grail of software development, because it allows you to add large chunks of functionality to an application almost for free. But any bugs in the reused code come along for the ride. Fixing bugs in reused code becomes even harder because each time code is reused, more code will become dependent on its behavior. And obviously, more customers will be exposed more often to those bugs.
Software quality matters and can be achieved with desktop software just as it can with mission-critical software. Recently, Microsoft has been criticized for delaying and cutting features from Windows Vista. But Microsoft knows exactly what they’re doing. Windows ME was a commercial failure because it was perceived to be less stable than Windows 98 SE. As a result, many system builders preloaded Windows 98 SE until Windows XP Home became available. Microsoft can ill afford to release Vista unless it’s at least as stable as XP. It would not only hurt upgrade sales, but could have Linux, Apple and/or Google jumping for joy.