Wednesday 28 July 2010

Automating So You Don't Forget

This is a bit of an introductory-level post/rant/tutorial, but I've been peppered by enough "why on earth would you do this?" questions by various (seemingly experienced) project team members and on various mailing lists that I thought I'd just write my own take on this and point people to it when useful.


I'm pulling my (semihemidemiexistent) hair out on four different PHP projects at the moment. Not because they take all my time (they don't, unfortunately), but because three of them are in "maintenance" mode and express that in different ways. Take version control: one uses Mercurial (my favourite DVCS package); one uses Subversion (once "subversive," now the "safe" non-DVCS choice); and one (tragically) uses git. Each project has different coding standards. One of those is widely enough used that PHP CodeSniffer comes with support for it right in the tin. The others are relatively easy to code "sniffs" for. (Do remember that none of the three "maintenance" projects was using CodeSniffer (or equivalent), and all three have made only very sporadic use of their main VCS repositories.)

Wait a minute...now I've got to remember which standards go with which projects? And oh, yeah, it would be Really Nice™ to have any changes automatically saved in version control... if they're worthy.

What do I mean by "worthy?" Well, before I worry overmuch about how code is formatted, I should be able to prove that it works properly. After all, the most beautifully-formatted code that doesn't work is still (essentially) useless. This, of course, is where a tool like PHPUnit comes in; once you have sufficient coverage of your code with automatable tests, especially if you write the tests before you write (new) code, you can make changes confidently and quickly, because a) your tests prove that the code works as expected, and b) you're making sensible use of a (D)VCS, so that when your wonderful new code goes south and doesn't come back, you can follow your virtual-breadcrumb trail back up the face of the cliff. Only after PHPUnit blesses the code should CodeSniffer get a crack at it.

The new folks are scribbling away: "first test everything, then comply with standards, and then update version control." The rest of you are saying "hang on a minute; that problem's been sorted in any of several different ways."

Precisely. If you're developing in the Java world, you're spoilt for choice: you can do perfectly reasonable build/test/deploy automation using Ant, or if you want to keep a large number of people (allegedly) gainfully employed managing a J2EE-on-steroids project, you can go for Maven.

In the PHP world, we've got a nice "little" analogue to Ant called Phing. It will quickly become "dead-finger" technology; you'll wonder how (or why) you ever did a reasonably "serious" project without it. And yet, most of the open-source PHP projects I've seen (on Sourceforge and elsewhere) don't use such a tool; they rely on error-prone, manual steps. This manual process, with steps easily forgotten or mangled, is the source of many bugs in released software — in any language.

Enter Phing (or equivalent). You set up the moral equivalent of a makefile with the steps you want to have performed the same way in the same order, every time. Phing supports properties, which can be stored separately from the "master" build file that references them. This allows you to set up consistent process and policy (defined in the build file) and plug in the values for a specific project using the separate properties file.
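
As a minimal sketch of that split, a shared build file might pull in per-project values like so. The file name build.properties and the property names coding.standard and src.dir are placeholders of my own, not anything Phing requires:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Shared "master" build file: process and policy live here. -->
<project name="shared-process" default="show-settings">

    <!-- Per-project values come from a separate file of plain
         key=value lines (hypothetical names), for example:
             coding.standard=PEAR
             src.dir=src
    -->
    <property file="build.properties" />

    <target name="show-settings">
        <echo message="Coding standard: ${coding.standard}" />
        <echo message="Source directory: ${src.dir}" />
    </target>
</project>
```

The same build file can then be copied from project to project; only the properties file changes.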

So how much difference does all this make? Let's take an example set of steps, some variation of which I follow in my build files:

  1. First, clean out all the files created by steps that come later (like test reports);
  2. Then, run unit tests, displaying the output as they run. If tests fail, stop;
  3. Verify compliance with your chosen coding standards; if a problem is found, stop. Either fix the problem if it's in a file you've touched or add the file to the ignore list if it's a legacy file;
  4. I like to run PHPDocumentor to automatically generate developer documentation, from comments left in the code. CodeSniffer will check these, too, so by the time phpdoc gets its grubby virtual paws on your code, it shouldn't find any problems;
  5. If all is well, then it's on to version control. I have Phing show a "diff" report of what's changed since the last checkin, and then prompt me for a checkin comment. If I want to run the whole process but not check in to VCS (maybe I'm coming back to a project after a while away and just want to see the earlier steps run), I can hit the Return key, and my build file will skip the VCS checkin because I've supplied an empty comment (which it checks for).
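
The steps above can be sketched as a chain of dependent targets. This is an illustration under assumptions, not my actual build file: the property names (reports.dir, docs.dir, tests.dir, src.dir, coding.standard) are hypothetical, Mercurial and phpdoc are assumed to be on the PATH, and the task attributes are those of Phing 2.x as I remember them, so check them against your version's manual.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project name="demo-process" default="checkin">

    <!-- Hypothetical per-project values (plain key=value lines) -->
    <property file="build.properties" />

    <!-- Step 1: remove whatever the later steps generate -->
    <target name="clean">
        <delete dir="${reports.dir}" quiet="true" />
        <delete dir="${docs.dir}" quiet="true" />
        <mkdir dir="${reports.dir}" />
    </target>

    <!-- Step 2: run unit tests, echo results, stop on failure -->
    <target name="test" depends="clean">
        <phpunit haltonfailure="true" haltonerror="true" printsummary="true">
            <formatter type="plain" usefile="false" />
            <batchtest>
                <fileset dir="${tests.dir}">
                    <include name="**/*Test.php" />
                </fileset>
            </batchtest>
        </phpunit>
    </target>

    <!-- Step 3: coding standards; stop if a problem is found -->
    <target name="sniff" depends="test">
        <phpcodesniffer standard="${coding.standard}" haltonerror="true">
            <fileset dir="${src.dir}">
                <include name="**/*.php" />
            </fileset>
            <formatter type="full" usefile="false" />
        </phpcodesniffer>
    </target>

    <!-- Step 4: developer documentation via the phpdoc command line -->
    <target name="docs" depends="sniff">
        <exec command="phpdoc -d ${src.dir} -t ${docs.dir}"
              passthru="true" checkreturn="true" />
    </target>

    <!-- Step 5: show the diff, ask for a comment, skip checkin if empty -->
    <target name="checkin" depends="docs">
        <exec command="hg diff" passthru="true" />
        <input propertyname="commit.message"
               message="Checkin comment (just hit Return to skip)"
               defaultValue="" />
        <if>
            <and>
                <isset property="commit.message" />
                <not><equals arg1="${commit.message}" arg2="" /></not>
            </and>
            <then>
                <exec command="hg commit -m &quot;${commit.message}&quot;"
                      passthru="true" checkreturn="true" />
            </then>
            <else>
                <echo message="Empty comment; skipping VCS checkin." />
            </else>
        </if>
    </target>
</project>
```

Because each target depends on the one before it, a failure anywhere stops the run before the checkin step. A checkin comment containing double quotes would confuse the shell in this sketch; a real build file would want to deal with that.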

Great, so (since I've followed a few conventions), all I need to do is type "phing" at the command line and it's off to the races. Trivially easy to use and, much more importantly, proof against a very high level of idiocy.

What's that? You in the back... I'm putting the cart before the horse, you say? I shouldn't do a process that drives VCS checkin, but a VCS checkin "hook" that does the validation and so on instead?

To some degree, that's a matter of taste. From a very practical perspective, though, having your build-and-test automation drive VCS instead of the other way 'round means that you can use any VCS operable from a command line, with minimal pain moving between projects. Not every VCS implements a pre-commit hook in the same way; some apparently don't implement them at all. (Yes, we know they're toys, but they're "enterprisey" big-ticket toys. Some managers will buy anything.) So, by having a single-command process execution/enforcement tool, you'll generally find that the internal and external quality of your project improves considerably and quickly; you'll also find that the risk involved with sweeping changes or audacious new features drops to a more comfortably survivable level.
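
Here is a rough sketch of what "the build drives the VCS" can look like when the commands themselves are just properties. The property names vcs.diff and vcs.commit are made up for illustration, and the command attribute of the exec task is the Phing 2.x spelling:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project name="vcs-agnostic" default="checkin">

    <!-- Hypothetical per-project file of plain key=value lines.
         It would hold one of these pairs, depending on the project:
             vcs.diff=hg diff      vcs.commit=hg commit
             vcs.diff=svn diff     vcs.commit=svn commit
             vcs.diff=git diff     vcs.commit=git commit -a
    -->
    <property file="build.properties" />

    <target name="checkin">
        <!-- Show what has changed since the last checkin -->
        <exec command="${vcs.diff}" passthru="true" />

        <!-- Ask for the comment, then hand it to whichever VCS is configured -->
        <input propertyname="commit.message" message="Checkin comment" />
        <exec command="${vcs.commit} -m &quot;${commit.message}&quot;"
              passthru="true" checkreturn="true" />
    </target>
</project>
```

The build file never names a particular VCS; moving to another project means editing two lines in the properties file. The empty-comment check is left out here to keep the sketch short.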

And that's why, whenever I'm asked "What tools should I be using for my PHP development?", my answer always includes at least:

  • Your project's version control tool of choice (again, I recommend Mercurial);
  • Phing;
  • PHPUnit;
  • PHP CodeSniffer; and
  • PHPDocumentor.

Once we get people used to a core set of tools and practices, we can then go on to the thorny religious issues like, "which PHP framework should I use?"

Next question?

Wednesday 7 July 2010

Phing! It's a Dessert Topping! It's a Floor Wax! No, it's efPhing something!

I want my hour back. No, seriously; that's what happens when you don't touch a tool for a while, but you spend a lot of time with its "kissin' cousin."

Phing, if you're new-ish to serious PHP development, is your ultimate Swiss Army Ginsu Chainsaw™. It's a "build tool" that lets you automate pretty near anything, especially having to do with PHP. There are several dozen "tasks", like the PhpCodeSnifferTask, the IfTask, and of course the PhingTask. There is also good documentation on extending Phing by adding all sorts of new tasks and other bits, and people have run with the ball.

Put another way, if you're coming from the Java omniverse, Phing is analogous to Apache Ant, with at least as vibrant a community effort throwing stuff over the wall.

Sometimes the things that get thrown over the wall blow up, though. Sometimes that isn't Phing's phault, even if that's the fish-eye lens you're looking through as you grapple with the problem. For instance, the PEAR coding standards as interpreted by PHP CodeSniffer require, in the internal documentation you must produce to comply with the standard, tags that are not supported by the documentation generator they're ostensibly intended for. You'll see your Phing script break on that rock unless you "change the conditions of the test." Working out the (multiple) flags, option settings and occult incantations necessary to make things run smoothly will take you some time, unless you just did it last week and wrote copious notes in your wiki.

But the absolute break-down-and-laugh-until-you-cry moment came from the way Phing handles "custom properties," the symbols you can define to make your Phing life easier in various ways, like having common, shared policies and processes across projects, with specific values set on a per-project basis. You whip up a new "property file" as part of a new project, and reuse your existing XML "build file." Ah, but there is one sizable bump to trip the unwary or rushed: Properties are specified in a sensible XML format when included in the build file, but separate, included "property files" look like old-style Windows .ini files. And $DEITY help you if you forget, because Phing sure won't.

Let's say you have a "drop-dead simple" build.xml file, like this:

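<?xml version="1.0" encoding="UTF-8"?>
<project name="demo1" default="demo">

    <!-- Set usepropfile to false (e.g. phing -Dusepropfile=false)
         to use the inline value instead of the property file -->
    <if>
        <equals arg1="${usepropfile}" arg2="false" />
        <then>
            <property name="foo" value="baz" />
        </then>
        <else>
            <!-- Otherwise, load foo from build.properties -->
            <property file="build.properties" />
        </else>
    </if>

    <target name="demo">
        <echo message="Demo target; foo= ${foo}." />
    </target>
</project>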

If you screw up the build.properties file — say, by specifying your values in the same sort of XML format:

    <property name="foo" value="quuz" />

you'll see a most sublimely confarkled message:

    ....
     [echo] Demo target; foo= ${foo}.
    ....

At this point, you'll either have a D'oh! moment, or you'll start chasing your tail. Choose Door #2, and you could be at it a while.

The answer, as often, is RTFM. Appendix F (!) of the Phing manual defines the "Property File Format," which is our venerable foil the .ini file with a few twists (like being able to define properties that incorporate the values of other properties – just like in build.xml).
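
For the demo above, a working pairing might look like the sketch below. The key=value format and the ability to reference other properties are from that appendix; the bar property and the fail guard are my own additions for illustration, so treat them as assumptions rather than the one true way:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project name="demo1" default="demo">

    <!-- build.properties is plain text, NOT XML. For this demo it
         would contain lines like:
             foo=quuz
             bar=${foo}-and-friends
    -->
    <property file="build.properties" />

    <target name="demo">
        <!-- Guard (my addition): stop with a readable message if the
             property file did not actually define foo -->
        <fail unless="foo"
              message="foo is not set; is build.properties in key=value format?" />
        <echo message="Demo target; foo= ${foo}." />
    </target>
</project>
```

With a guard like this, an XML-flavoured properties file should produce a build failure that names the problem, instead of an echo with an unexpanded ${foo} in it.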

Would it really have been too complicated to use the same format (i.e., the XML tags) in the property file as in the build file? You could even deal with making it valid XML with minimal effort. But no...

I actually did learn this properly a couple of years ago, when Phing was new and kind of shiny. It's only become more powerful since then. Just don't hold three lighted M-80s in your hand.

Why rant about this? I've long firmly believed that one of the main duties of a master craftsman, in any craft – including software development – is to occasionally fall into the weeds, pick himself up, and remind the young journeymen and apprentices "don't do what I just did; you'll make yourself look silly, or worse." Better for one person to do it who can be reasonably expected to pick himself up, than for a dozen others to fall in with no idea how deep they're getting.