Tuesday 11 August 2009

Tutorials, best practices and staying current

A gent by the name of Brian Carey has written a very nice little tutorial on "Creating an Atom feed in PHP", and gotten it published on the IBM DeveloperWorks site. In the space of about ten pages, Brian gives a stratospheric overview of what Atom is and why PHP is a good language for developing Atom-aware apps, and then gets into the tutorial - defining a MySQL database table to hold the data used to 'feed' the Atom feed, and writing code to get the data out and put it into the form that a reader such as NetNewsWire expects an Atom feed to be in.

Now, to be fair, Brian describes himself as "an information systems consultant who specializes in the architecture, design, and implementation of Java enterprise applications", and the paper is clearly meant as a whirlwind tutorial, not to be taken by the careful/experienced reader as necessarily production-quality code. And if this had been published in, say, 2002 or so, I'd have thought it a great how-to for banging out some PHP 4 code. But this is 2009 PHP 4 is a historical artifact, and blogs and industry journals of seemingly every stripe are decrying the poor quality (security, maintainability, etc.) of PHP code... much of which is still written as if it were the turn of the century, ignoring PHP 5's numerous new features and the best practices that both spawned them and grew from them.

So what's really wrong with doing things like they did in Ye Olden Tymes™?

  • Procedural code makes it harder to make changes (fix bugs, add features, change to reflect new business rules) and be certain that those changes don't introduce new defects. This is largely because...
  • While it is possible to test procedural code using automatable tools like PHPUnit, it's a lot harder and more complex than testing a clean, object-oriented design.
  • Hard-coding everything, interspersing 'magic values' throughout code, is a major hindrance to future reuse - or present debugging;
  • Using quick-and-dirty, old-style database APIs exposes you to the risk of input data that is more 'dirty' than it should be - opening the door to SQL injection and other nastiness;
  • Not staying current exposes your code to the risk that it's either using features that have since been deprecated or removed entirely, or (arguably worse), the risk that new features (such as standard library additions) make some of your existing code redundant at best.
Each of these, to varying degrees, is true of much of the PHP code I've read in the last couple of years, including that in the DeveloperWorks paper. For example, the DW paper's code makes use of the old-style PHP4 mysql_* API rather than the more abstract/portable MDB2 database abstraction layer. And there's also the rather terse implementation of a date3339() function for converting a timestamp to RFC 3339 output format, that's now nicely handled through the standard PHP DateTime class.

How would I have done things instead? Read the next few posts in this blog to find out. And, of course, comments are always welcome.

No comments: