Monday, 4 June 2007

If You Can't, It Doesn't

If you can't measure something, can you prove that it exists? Or at least that you understand it? If we're having a theological discussion, you might have one answer, but in the material world, and more specifically in technology development, the answer to both questions is "probably not". To quote what several nurses and doctors tell me was pounded into them during their training, "if you don't write it down, it didn't happen" — because there's no record that it happened the way you think you remember it. Memory's a funny thing — even about events that actually happened.

This less-than-breathtakingly original, but relevant, train of thought occurred to me after I spent an hour kicking the tires of the PHPUnit code coverage analysis features. PHP coders out there who aspire to professionalism will almost certainly find that rigorous, formal testing procedures save time and grief in the not-very-long run. If you've been "test infected" or you're working in an XP shop (the development method, not the massively defective software), you already know this; otherwise, you may be shocked by how good your code can become.

Automated unit tests (using tools such as JUnit for Java, NUnit for Microsoft .NET, and PHPUnit) are good for proving a) that your code works the way you expect and b) that it keeps working as you make changes to it. Automating the process (so that all your tests run whenever you make changes) ensures continuous feedback that everything still works, reducing the likelihood of hidden dependencies breaking things. Code coverage testing, as the phrase implies, tracks which lines of your code are actually exercised by your tests. Code that has been tested extensively and found to work as expected can then be differentiated from untested code and "dead" code. Dead code is code that can't be executed because no logical path through the program will ever reach it. A minimal number of lines will be marked as dead in many languages and test systems simply because of the textual structure of the language itself (e.g., PHPUnit marks a line consisting solely of the brace ending an if-block as dead code). Apart from that structural detritus, code identified as dead should be eliminated; why maintain code that will never be executed?
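
To make that concrete, here's a minimal sketch in the PHPUnit 3 style (the class names and the temperature example are mine, invented for illustration; newer PHPUnit releases use a namespaced PHPUnit\Framework\TestCase base class instead):

    <?php
    require_once 'PHPUnit/Framework.php';

    // Hypothetical class under test.
    class TemperatureConverter
    {
        public function celsiusToFahrenheit($celsius)
        {
            return $celsius * 9 / 5 + 32;
        }

        public function describe($celsius)
        {
            if ($celsius <= 0) {
                return 'freezing';
            }
            return 'above freezing';
            return 'unreachable'; // dead code: no path can ever reach this line
        }
    }

    class TemperatureConverterTest extends PHPUnit_Framework_TestCase
    {
        public function testCelsiusToFahrenheit()
        {
            $converter = new TemperatureConverter();
            $this->assertEquals(212, $converter->celsiusToFahrenheit(100));
        }

        public function testDescribeCoversBothBranches()
        {
            $converter = new TemperatureConverter();
            $this->assertEquals('freezing', $converter->describe(-5));
            $this->assertEquals('above freezing', $converter->describe(20));
        }
    }
    ?>

Run the tests and every reachable line of TemperatureConverter lights up in the coverage report; the deliberately unreachable line never will, which is exactly the kind of thing coverage analysis exists to point out.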

Many beginning (and not-so-beginning) developers tend to assume that their code works if the interpreter or compiler doesn't flag any obvious syntax errors. These language systems are (generally) reasonably competent at interpreting what you wrote; they are far less capable of discerning what you intended beyond what you wrote. Hence, just because the code "compiles cleanly" doesn't mean it is free of defects. Finding those is the job of testing, starting with you, the developer, running unit tests.
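
A hypothetical illustration of the difference (the function and its intent are invented for the example):

    <?php
    // Parses and runs without a peep from the interpreter, but the
    // (intended) inclusive range check is wrong at both boundaries.
    function isValidPercentage($value)
    {
        return $value > 0 && $value < 100; // silently rejects 0 and 100
    }

    var_dump(isValidPercentage(100)); // bool(false): a defect, not a syntax error
    ?>

A unit test asserting that isValidPercentage(100) returns true fails immediately; the interpreter, on its own, never complains.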

If you're doing unit tests without any specialised tools ("rolling your own" tests), or your testing tool does not provide code coverage analysis, it can be difficult to determine whether specific blocks of your code have been tested at all. Worse, the assumptions you made while writing the code are unlikely to be challenged during testing, because the same assumptions that guided the original coding will guide an ad hoc testing regime.
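
For the record, PHPUnit will do that bookkeeping for you: with the Xdebug extension loaded, recent releases can emit an HTML coverage report via a command along these lines (the exact option name has varied across PHPUnit versions, so check the manual for yours):

    phpunit --coverage-html ./coverage TemperatureConverterTest

Every line of the code under test is then marked as executed or not, so your assumptions get checked against a record rather than a recollection.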

One of the benefits of test-driven development is the ability to cut out unnecessary code and other development artifacts. The development team can be confident that exactly and only what belongs in the desired system is actually in that system, and can expand or refactor the code to adapt to changing requirements; "see a need, fill a need". You can get a lot more done, more quickly, when you have total confidence that your code will continue to work as features are added or changed, and that anything that breaks will be immediately and obviously detected. You can't really do that without code coverage analysis.
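
In the spirit of "see a need, fill a need", the test comes first. A hypothetical sketch (the slugify feature and all names are invented for illustration):

    <?php
    require_once 'PHPUnit/Framework.php';

    // Written first; it fails until slugify() exists and behaves.
    class SlugifyTest extends PHPUnit_Framework_TestCase
    {
        public function testSlugifyLowercasesAndHyphenates()
        {
            $this->assertEquals('if-you-cant-it-doesnt',
                                slugify("If You Can't, It Doesn't"));
        }
    }

    // The simplest implementation that makes the test pass; safe to
    // refactor later because the test will catch any regression.
    function slugify($title)
    {
        $slug = strtolower(str_replace("'", '', $title));
        $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
        return trim($slug, '-');
    }
    ?>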

For you plinkers out there who aren't doing sufficient (or sufficiently organised) testing yet: your competition is, and if you spend any time at all in this craft, you will bang up against "difficult" bugs that wouldn't have been so difficult with pervasive testing. Those of you who have been writing unit-test cases shouldn't get too comfortable, either... how do you know how much of your code is being tested? If you're testing the same block of code eight different ways while other sections of your code don't get tested at all, can you ship a quality product? Coverage analysis will save you time (by not writing redundant test cases), grief (by prodding you to test areas of code you thought you had tested but hadn't), and money (as a direct result of the first two). If you're trying to run a completely instrumented shop, where everything that can be measured for a lower cost than the failure of that thing is being measured, then I am (or should be) preaching to the choir.

To sum up:

  • If you don't write it down (in a way that "it" can be found again), it never happened. If you've never tested your code, it's broken until proven otherwise.
  • If you can't repeat the test at will, it hasn't been tested.
  • If you don't or can't know how much of your code has been tested, your users will.

We can't all be Microsoft and expect our paying customers to find our problems!