Tuesday, June 30, 2009

Another item from the "That's obvious - in hindsight" dept.


After upgrading to Safari 4 (why haven't you yet?), I ran into a problem with the single add-in that I've bothered keeping in Safari - PithHelmet. If you're familiar with AdBlock Plus on Firefox, you've got the basic idea: an add-in to your browser(s) of choice that lets you block advertising, annoying Flash, or pretty much anything else you can identify by file name (e.g., "*.swf") or by URL (e.g., "http://www.doubleclick.net"), and "magically" removes it from the final content displayed by your browser. This feature has gotten so popular that several browsers are building in more-or-less-competent versions of it by default.

Getting back to the problem... PithHelmet, the Safari ad blocker, was incompatible with Safari 4 because the framework it depended on, called SIMBL, had not had an update since October 2006 - which, as far as Safari or WebKit is concerned, equates to "forever". So I did a Google search for "greasekit safari 4", and then started whittling down the results (English language only, please, and only within the last year). Eventually, I ran across a forum post (which I have since lost) saying basically "Yes, PH crashes Safari 4 but has anybody else tried out Fanboy's AdBlock CSS sheet?"

Which, if you know anything about Web development, should have sent the palm of your hand rocketing toward your forehead in a major "D'oh!" moment. Of course! Why didn't I (or we all) think of that about 6 or 8 years back?

For those of you who aren't as knowledgeable about the detailed workings of your browser, let me explain. Every Web browser, probably since at least Netscape 1.0, has included support for "user style sheets", whereby individual users (or organisations of such users) can instruct their browsers to display specific content differently than was originally specified. For this to work, the user in question (or someone s/he depends on) has to be very literate in HTML and particularly CSS, the languages of Web pages. Simple blocking, like "block all SWF files", isn't hard with modern browsers, but the Web developers themselves can make life significantly easier by following modern "best practices". The practice particularly relevant here is "add 'id' attributes to all structural and semantic elements." (Hmm; what's the difference between that and "all elements"? Another blog post...) If the developer does that, including for the body tag, then it's very easy for even a neophyte user to start filtering just what's wanted... and learn something about how Web pages work into the bargain.
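To make that concrete, here is a minimal sketch of how such a user style sheet could be put together, written in Python only because that is what I had open. The blocklist entries and the function name are illustrative assumptions on my part, not taken from Fanboy's list or any other real filter list; the CSS it prints relies on the standard "attribute contains substring" selector.

```python
# Hypothetical blocklist; these fragments are illustrative assumptions,
# not copied from any real filter list.
BLOCKED_URL_FRAGMENTS = [".swf", "doubleclick.net"]

# Elements that pull in external content, and the attribute holding the URL.
URL_ATTRIBUTES = [
    ("img", "src"),
    ("iframe", "src"),
    ("embed", "src"),
    ("object", "data"),
    ("a", "href"),
]


def build_user_stylesheet(fragments):
    """Return CSS text hiding any element whose URL contains a fragment."""
    selectors = []
    for fragment in fragments:
        for tag, attribute in URL_ATTRIBUTES:
            # tag[attr*="text"] matches when the attribute contains "text".
            selectors.append('%s[%s*="%s"]' % (tag, attribute, fragment))
    # One rule for all selectors; !important lets the user rule win
    # over whatever the page author specified.
    return ",\n".join(selectors) + " {\n    display: none !important;\n}\n"


if __name__ == "__main__":
    print(build_user_stylesheet(BLOCKED_URL_FRAGMENTS))
```

Save the printed output as a .css file and point your browser's user-style-sheet preference at it. Rules keyed on 'id' attributes (the "best practice" above) would simply be more selectors in the same list.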

Thanks for reading - and commenting.

Saturday, June 27, 2009

Remember to test your testing tools!


I've been doing some PHP development lately that involves a lot of SPL (Standard PHP Library) exceptions. I do test-driven development for all the usual reasons, and so make heavy use of the PHPUnit framework. One great idea that the developer of PHPUnit had was to add a test-case method called setExpectedException(), which should eliminate the need for you (the person writing the test code) to write an explicit try/catch block yourself. Tell PHPUnit what you expect to see thrown in the very near future, and it will handle the details.

But, as the saying goes, every blessing comes with a curse (and vice versa). The architecture of PHPUnit pretty well seems to dictate that there can only be one such caught exception in a test method. In other words, you can't set up a loop that will repeatedly call a method and pass it parameters that you expect it to throw on; the first time PHPUnit's behind-the-scenes exception-catcher catches the exception you told it was coming, it terminates the test case.

Oops. But if you think about it, pretty expectable (pardon the pun). For PHPUnit to catch the exception, the exception has to get thrown and unwind the call stack past your test-case method. That makes it very difficult (read: probably impossible to do reliably inside PHPUnit's current architecture) to resume your test-case code after the call that caused the exception to be thrown - which is what you'd want if you were looping through these things.

This leaves you, of course, with the option of writing try/catch blocks yourself - which you were hoping to avoid but which still works precisely as expected.
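The same shape of problem, and the same hand-rolled fix, can be sketched outside PHP. What follows is a Python analogue rather than PHPUnit code: parse_port and the helper name are hypothetical, invented purely for illustration. The point is that the try/except sits inside the loop, so every bad input gets exercised instead of only the first.

```python
def parse_port(value):
    """Hypothetical function under test: accepts only ports 1..65535."""
    port = int(value)  # raises ValueError for non-numeric text
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %r" % (value,))
    return port


def inputs_that_failed_to_raise(func, bad_inputs, expected=ValueError):
    """Call func on every input; return the inputs that did NOT raise."""
    survivors = []
    for value in bad_inputs:
        try:
            func(value)
        except expected:
            continue  # the exception we wanted; carry on with the loop
        survivors.append(value)  # no exception: this input slipped through
    return survivors


BAD_PORTS = ["0", "-1", "65536", "http", ""]

# Every bad input is tried, and a failure names the inputs that got through.
assert inputs_that_failed_to_raise(parse_port, BAD_PORTS) == []
```

Collecting the survivors, rather than failing on the first one, also means a single test run reports every input that was wrongly accepted.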

Moral of the story: Beware magic bullets. They tend to blow up in your face when you least expect it.

Wednesday, May 27, 2009

News Flash: Microsoft Reinvents Eiffel, 18 Years On


One of the major influences on the middle third of my career thus far was Bertrand Meyer's Eiffel programming language and its concept of design by contract. With such tools, for the first time (at least as far as I was aware), entire classes of software defects could be reliably detected at run time (dynamic checking) and/or at compile time (static checking). I worked on a couple of significant project teams in the mid- to late '90s that used Eiffel quite successfully. Further, it impacted my working style in other languages; for several years, I had a reputation on C and C++ projects for putting in far more assert statements than was considered usual by my colleagues. More importantly, it made me start thinking in a different way about how to create working code. Later, as I became aware of automated testing, continuous integration and what is now called agile development, they were all logical extensions of the principles I had already adopted.
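For anyone who hasn't met design by contract, here is a rough sketch of the assert-heavy style it pushed me toward. It is written in Python rather than Eiffel or C, and the routine is my own toy example, not taken from any project mentioned here. The comments note the Eiffel keywords each assertion stands in for.

```python
def integer_sqrt(n):
    """Return the largest integer r such that r * r <= n."""
    # Precondition (Eiffel's "require"): the caller's side of the contract.
    assert isinstance(n, int) and n >= 0, "precondition: n must be an int >= 0"

    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition (Eiffel's "ensure"): the routine's side of the contract.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r
```

A caller who passes -1 is stopped at the door with a message naming the broken precondition, instead of getting a wrong answer three call levels later. Note that Python drops assert statements when run with -O, much as Eiffel lets you dial assertion monitoring down for production builds.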

This all happened over a period of 15 or so years, in a field where anyone with more than 2 or 3 years' experience is considered "senior". But with each of these changes, for me and for most other serious practitioners whom I knew and worked with, two to three years was really just about as long as it took to start answering more questions than we raised. That, in most crafts, is considered one of the signs of becoming a journeyman rather than a wet-behind-the-ears apprentice.

Then, a few hours ago, I was reading a blog entry by one David R. Heffelfinger which mentioned a project at Microsoft DevLabs called "SmallBasic". Another project that the same organization developed is called "Code Contracts"; there's a nice little set of tools (which will be built into the upcoming Visual Studio 2010 product), and a nice introductory video. Watch the video (you'll need Silverlight to view it), and then do some research on Eiffel and design-by-contract and so on, and it's very difficult not to see the similarities.

So, on the one hand, I'm glad that .NET developers will finally be getting support for ideas that will be 20 years old by the time significant numbers of developers use VS 2010 and .NET 4.0. Anything that helps improve the developer and user experiences on Windows is by definition a Good Thing™.

On the other hand, I see more evidence of Microsoft's historical Not Invented Here mentality; beating the drum for "new and wonderful ideas for Windows development" that developers on other platforms have been using effectively for some time. While the Code Contracts project indirectly credits Eiffel - the FAQ page links to Spec# at Microsoft Research, which lists Eiffel as one of its influences - it would have been nice to see acknowledgement and explanation of precursor techniques made more explicit. Failure to do so merely reinforces the wisdom of Santayana as applied to software: "Those who cannot remember the past are condemned to repeat it", as well as "Fanaticism consists in redoubling your efforts when you have forgotten your aim." This last is something that we who wish to improve our craft would do well to remember.

Comments, please.

Thursday, May 14, 2009

Jaw-Droppers - Blast from the Past


Just when you thought it was safe to forget that the 1970s ever existed... this gem shows up on the XML Daily Newslink, a mailing list I follow intermittently. (Actually, this was included in the XMLDN from Wed 11 Feb - an indication of how "closely" I've been following lately.)

Developing a CICS-Based Web Service
G. Subrahmanyam, G. Mokhasi, S. Kusumanchi; SOA World Magazine

Web services have opened opportunities to integrate the applications
at an enterprise level irrespective of the technology they have been
implemented in. IBM's CICS transaction server for z/OS v3.1 can support
web services. It can help expose existing applications as web services
or develop new functionality to invoke web services. One of the commonly
used protocols for CICS web services is SOAP for CICS. It enables the
communication of applications through XML. It supports as a service
provider and service consumer independent of platform and language.
SOAP for CICS enables CICS applications to be integrated with the
enterprise via web services as part of lowering the cost of integration
and retaining the value of the legacy application. SOAP for CICS also
comes along with the implementation encoder and decoder.


"SOAP for CICS"? Give it a REST, guys. On the other hand.... preserving this by-now more-solid-than-most-rocks 1969-era technology IS a signal achievement; an example of engineering stability on par with Soyuz.

Then again... support for legacy technologies like this looks set to become increasingly expensive and risky over time, since apparently:
  1. Numerous surveys published in the last few years indicate that the sizable majority of "old mainframe" techs will have retired by 2010 (do the math on the years), and

  2. partly as a result of (1), users risk becoming increasingly dependent on outsourced Indian providers - hardly conducive to effective project control.

In my own professional view, technological preservation activities like this are mainly useful for one thing: they can serve as the pattern against which a re-implementation of the business process using more currently-supportable tech can be verified. And once an organization has done this for its most valuable legacy systems (which wouldn't have been preserved this long if they weren't so valuable), the actual and perceived risk of migrating to new technologies as conditions warrant drops dramatically. After all, if you (or your internal colleagues) have successfully brought your line-of-business systems from CICS via REST to, say, PHP or Java, you're a lot more comfortable with the idea of migrating to whatever the mid-to-trailing technology is in ten years' time - while the support costs of the (then) existing system are still manageable.

Are any of you actually involved in any technological archaeology like this? When I was younger, I used to brag that half the systems I'd worked with were older than I was, but that stopped being true about 1995. (AFAIK, I was one of the last guys to professionally touch a working IBM 360 mainframe, in about 1989.)

(Original article at this entry's title link, which is also here.)

Wednesday, May 06, 2009

Professionalism, Web development, and giving oxy to morons

Whereas a poor craftsman will blame his tools, poor tools will handicap even the most skilled craftsman.


As I insinuated in my previous post, I'm getting up to speed on the Zend Framework, the "900-kg elephant" of PHP application frameworks.

One major bone I have to pick with the ZF team is with regard to documentation: each time I've checked the site in the last couple of months, there's been an apparently current HTML version (now clocking in at some 300 HTML pages). There is also a PDF version, the promise of which is used as an enticement to register for their content distribution network (and, presumably, marketing info). As of this moment, however, the framework is at version 1.8.0, but the PDF version of the programmer's reference manual only covers version 1.6.0 (from September, 2008); some 12 releases earlier. It no longer fully matches the actual code, to the point where it is not difficult for a new developer to get deeply confused.

After spending a half-hour browsing the HTML version of the document, I am unable to find any declaration as to which version of the Framework is documented. However, the README.TXT file included with the source distribution states that it covers the 1.8 release, revision 15226, released on April 30, 2009. Classes which are listed in the README as being new, such as Zend_Filter_Encrypt, are documented in the HTML programmer's guide. Establishing a match between the (HTML) doc and the current code is non-trivial, however. While it may be argued that people unfamiliar with browsing a Subversion repository are not likely to be common within Zend's target audience, I would indirectly refute that: a product release, particularly one with a strong industry following, should
  • be properly documented;
  • make it easy for a (prospective) user to verify that he has the complete package; and
  • have a definite, intuitive learning curve.
In my view, the Zend Framework fails on at least two of these points. The assertion within large segments of the PHP community that it is the "gold standard" of PHP application frameworks should be a disturbing, cautionary omen: if Web development, particularly PHP development, wishes to be taken seriously by the software industry at large, then some major improvements and attitude shifts need to occur quickly, publicly and effectively. It is still far too easy for potential developers outside the "early-adopter" leading edge to scoff that PHP development (and, by extension, Web development as a whole) is still far too immature and amateurish to be taken seriously. As someone who has developed professionally in PHP for some ten years now, I find that a disturbing state of affairs; one that I would love to see (and participate in) a free-ranging discussion of.

Comments, anyone?

Thursday, April 16, 2009

OMFG, or Holy Deforestation, Batman!


As some of you know, I'm working on a book on Web development, using off-the-shelf tools (frameworks, template engines, JavaScript libraries, etc.) to build semantic, standards-compliant, accessible, search-friendly Websites. (That's more a matter of adjusting your development philosophy and workflow than anything else, but I digress). As part of that, I've been doing a (reasonably) comprehensive review of PHP 5 application frameworks. You might have heard of ezComponents or CakePHP, but the 900-kg elephant in the room is definitely the Zend Framework. It ships with the Dojo JavaScript toolkit, but doesn't make it excessively difficult to mix and match others (Scriptaculous, Prototype, jQuery, etc.) if desired.

And here's another reason to call ZF the '900-kg elephant' -- the programmer's reference guide (for version 1.7) weighs in at a <sarcasm>svelte</sarcasm> 1170 pages. Don't print this at home, folks. Better yet, just don't print it... either browse it online or download the PDF. Save a forest or six. For you old-timers, this will remind you quite a bit of US DOD Standard 2167A, "fondly" remembered as "documentation by the boxcar load".

Tuesday, February 24, 2009

Mac OS X is BSD Unix. Except when it's Different.


One of the things that a BSD Unix admin learns to rely on is the "ports" collection, a cornucopia of packages that can be installed and managed by the particular BSD system's built-in package manager: pkgsrc for NetBSD, pkg_add for FreeBSD, and so on. In nearly all BSD systems, the port/package manager is part of the basic system (akin to APT under Debian Linux).

Mac OS X is BSD Unix "under the hood," specifically Darwin and, indirectly, FreeBSD.

This provides the Mac user who has significant BSD experience with a nice, comfy security blanket. This blanket has a few stray threads, however. One of these is the package-management system and ports.

Software is customarily installed on Mac OS X systems from disk images, or .dmg files. When opened, these files are mounted into the OS X filesystem and appear as volumes, equivalent to "real" disks. The window that the Finder opens for that volume customarily contains an icon for the application to be installed and a shortcut to the Applications folder. Installation usually consists of merely dragging the application icon onto the shortcut (or into any other desired folder). Under the hood, things are slightly more complex, but two points should be borne in mind.

First, there is no true Grand Unified Software Manager in Mac OS that is comparable to APT under Debian Linux, or even the "Add or Remove Programs" item in Microsoft Windows' Control Panel. Uninstalling a program ordinarily consists of dragging the program icon to the Trash or running a program-specific uninstaller (usually found on the installation disk image).

Second, while there is a "ports" implementation for Mac OS X (MacPorts), it isn't a truly native part of the operating system. More seriously, the versions of software ports which are maintained in its ports list are not always the most recent version available from their respective maintainer. Installing an application via MacPorts, installing a newer version through the customary method, and attempting to use MacPorts to maintain the conflated software can quite easily introduce confusing disparities into the system, with potentially destabilizing effect.

Go back and read that last paragraph again, especially the final sentence. Mac OS X, meet BSD Unix. Touch gloves, return to your corners, and wait for the bell.

Most users will never run into any problems, simply because most Mac users make little or no use of MacPorts (or any other command-line-oriented system management tool). MacPorts users are (almost by definition) likely to be experienced Unix admins who pine for the centralized simplicity of their workhorse software-management system. Informal research suggests that many, if not most, MacPorts users are active developers of Unix and/or Mac software. In other words, we're all expected to be grown-ups capable of managing our own systems, trading away the soft, easy-to-use graphical installation for a wider variety of nuts-and-bolts-level packages.

So why is any of this a problem for me? Why am I consuming your precious time (and mine) blathering on about details which most interested people already know, and most who don't, probably aren't? As a bit of a public mea culpa and a warning to others to pay attention when mixing installation models.

I had previously installed version 8.2 of the PostgreSQL database server on my Mac via MacPorts. Returning to it later, I realized that the installation had not completely succeeded: the server was not automatically starting when I booted the system, and the Postgres tools were not in my PATH. After a bit of Googling, I came across a few links recommending the PostgreSQL 8.3.6 disk images on the KyngChaos Wiki. I installed those, but the server still failed to start.

Remembering that I had previously installed 8.2 the "ports" way, I first uninstalled the newly-installed and -broken 8.3.6. I then ran port uninstall postgresql82 && port clean postgresql82 in an apparently successful attempt to clean up the preexisting mess, after which the KyngChaos disk images installed correctly and (thus far) work properly.

This once again points out the usefulness of keeping a personal system-management Wiki as a catchall for information like this - both to help diagnose future problems, and (especially) to avoid them altogether. These can be dirt simple; I use Dokuwiki, and have for over a year now. Forewarned (especially by yourself) is forearmed, after all.

Just thought I'd get this off my chest.

Tuesday, December 23, 2008

Maybe not eating 'crow', specifically, but..... DUCK!!!

As in, "bend over, here it comes again..."

One of the things I have greatly appreciated about the Mac, especially with OS X, is how simple and straightforward software management is, compared to Linux and especially Windows (where every system change is a death-defying adventure against great odds). Operating system or Apple-supplied apps need an update? Software Update is as painless as it gets: the defaults Just Work in proper Mac fashion, but you can set your own schedule, along with a few other options. There is a well-established convention for third-party apps to check for updates via a Web service "phoning home" at app startup; this has been very easy to deal with. Application and file layout is regular and sensible; libraries and resources are generally grouped in bundles at the system or user level. After a few years of DLL hell in Windows and library mix-and-match in Linux, this was shaping up to be a real pleasure.

Then, as some of you know, I updated Mac OS X on my iMac from 10.5.5 to 10.5.6. As expected, that apparently went as smooth as glass. I even blogged about it. XCode worked; MS Office 2008 for the Mac worked; Komodo Edit worked; all my IM clients worked; all seemed customarily wonderful in the omniverse. I even started up Mail; it opened normally and happily downloaded my regular mail and Google mail, just as it had done every day for months. (I didn't actually open any messages then; that will turn out to be important.) Satisfied that everything Just Worked as always, I went back to working on a project for a few hours before turning in for the night.

Next morning, I went through the usual routine. Wake the Mac from hibernation; log in; start Yahoo, MSN and Skype; start Mail; open Komodo; open Web browsers (Safari, Opera and Camino) and I'm ready to get started. First thing... here's an interesting-sounding email message; let's open that up and... *POOF* - Mail crashes.

WTF? It started up just fine; I even got the "Message for you, Sir" Monty Python WAV I'd set Mail to use as my new-mail-received notification. I start Mail again. Picking a different message, I double-click it in the inbox. A window frame opens with the message title, sits empty for a few hundred milliseconds, then Mail goes away again. Absolutely, totally repeatable. Reboot changes nothing. Safe Boot (the Mac equivalent of Windows' "safe mode") changes nothing. The cold fingers of panic stroke my ribs like Glenn Gould at the piano. On a bad-karma scale of 0 to 10, initial reaction is an "O my God"; we're not dead, but we're hurt bad; the karma has definitely run over the dogma. 

The next couple of days are spent using my ISP's Webmail service, and a set of Python scripts I'd previously written to search mailbox contents - Apple Mail, like any sensible email program, adheres to established standard formats. If I'd been using Microsoft Lookout! in a similar situation, I'd have been up the creek.

Finally, I come across some Web-forum items that indicate that GPGMail needs to be updated; if it's not, Mail will crash under OS X 10.5.6 - which is exactly what was happening. (If you're not using GPGMail, GNU Privacy Guard, or any of the various GPG interfaces for Windows such as Enigmail for Mozilla Thunderbird, you don't know how many people are recording and/or reading your email - but if it transits a server in the US or UK, it's guaranteed that it will be.)

Installing the upgraded GPGMail bundle was the work of less than two minutes (hint: remove or rename the old bundle before copying the new one over. You probably don't need the insurance, but consider how we got here...). Then start up Mail as usual. It should, once again, Just Work - complete with being able to read and reply to messages, with or without GPG signatures.

OK, so what lessons can we take away from this experience, both as users and developers?

Time Machine may well be the single most rave-worthy piece of software I've touched in 30 years, but it can't (obviously, easily) do everything, and in a crisis, even experienced users may well not want to risk bringing too much (or too little) "back from history". There's definitely a market for addons to TM to do things like "look in my user and system library directories, the Application directory structure, MacPorts, etc., and bring application Foo back to the state it was in last Tuesday morning, but leave my data files as they are." I almost certainly could do that with the bare interface -- but, especially since Mail was "broken" as part of an OS upgrade, I was (with the Windows/Linux experience fresh in mind) not comfortable exploring hidden dependencies... so I was without my main email system for three days. Sure, I had workarounds -- that I wouldn't have had if I'd been in a stock Windows situation -- but that's not really the point, is it?

Also, app developers (Mac or other), add this to your "best practices" list: If your software uses any sort of plug-in/add-on architecture, where modules are developed independently of the main app, then you can have dependency issues. The API you make available to your plugin developers will change over time (or your application will stagnate); if you make it easy for them (and your users) to deal with your latest update, you'll be more successful. There are (at least) two ways to go about doing this:

The traditional "brute force" approach. Have a call you can use to tell plugins what version of the app is running, and allow them to declare whether or not they're compatible with that version. Notify the user about any that don't like your new version. For examples of this, see the way Firefox and friends deal with their plugins. Yes, it works, but it's not very flexible; a new version may come out that doesn't in fact modify any of the APIs you care about - which means that the plugin should work even though it was developed against version 2.4 of your app and you're now on 4.2.

Alternatively, a more fine-grained approach. Group your API into smaller, functional service areas (such as, say, address-book interface or encryption services for an email program). Have your plug-in API support a conversational approach.

  1. The app calls into the plugin, asking it which services it needs and what versions of each it supports.

  2. The app parses the list it gets back from the plugin. If the app version is later than the supported range for a specific feature identified by the plugin, add that plugin to a "possibly unsupported" list. (If the app version is earlier than the range supported by the plugin, assume that the plugin is not supported and go on to check the next one.)

  3. If the "possibly unsupported" plugin list is empty, go ahead and continue bringing up the app, loading the plugins normally; you're done with this checklist.

  4. For each item in the "possibly unsupported" list, determine whether the API for each feature required for the plugin has changed since the plugin was explicitly supported. (This is how a plugin for an earlier release, say 2.4, could work just fine with a later version, like 4.2.) If there's no change in the APIs of each feature required by the plugin, remove that plugin from the "possibly unsupported" list.

  5. If any plugins remain in the list, check if there's an updated version of that plugin on the Net. This might be done using a simple web-service-to-database-query on your Web server. If your Web server knows of an update, ask the user for permission to install it. If the user declines, or no upgrade is available, unload the plugin. (You'll check again next time the app is started; maybe there's an update by then.)

  6. Once the status of each plugin has been established, and compatible plugins loaded, finish starting up your app.
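The checklist above can be sketched in a few dozen lines. Treat this as a thought experiment in Python and not anybody's real plugin API: the Plugin record, the API_CHANGES table, the version tuples and the find_update callback (standing in for the web-service lookup plus the "may I install it?" prompt) are all names I've made up for illustration.

```python
from collections import namedtuple

# Hypothetical plugin record: "requires" maps a service name to the
# inclusive (lowest, highest) range of app versions the plugin supports.
Plugin = namedtuple("Plugin", ["name", "requires"])

# Hypothetical change log: app versions at which each service's API changed.
API_CHANGES = {
    "address_book": [(1, 0), (3, 0)],
    "encryption": [(1, 0), (2, 0), (4, 1)],
}


def api_changed_since(service, last_supported, app_version):
    """True if the service's API changed after last_supported, up to now."""
    return any(last_supported < changed <= app_version
               for changed in API_CHANGES.get(service, []))


def negotiate(plugins, app_version, find_update):
    """Return (names_to_load, names_to_unload) following steps 1 to 6."""
    to_load, to_unload, possibly_unsupported = [], [], []
    # Steps 1-2: ask each plugin what it needs; compare version ranges.
    for plugin in plugins:
        ranges = list(plugin.requires.values())
        if any(app_version < lowest for lowest, _ in ranges):
            to_unload.append(plugin.name)  # app is older than plugin expects
        elif any(app_version > highest for _, highest in ranges):
            possibly_unsupported.append(plugin)
        else:
            to_load.append(plugin.name)
    # Steps 3-5: a plugin survives if no API it uses has changed, or if
    # an update can be found (and, in a real app, the user accepts it).
    for plugin in possibly_unsupported:
        stale = [service
                 for service, (_, highest) in plugin.requires.items()
                 if api_changed_since(service, highest, app_version)]
        if not stale or find_update(plugin.name):
            to_load.append(plugin.name)
        else:
            to_unload.append(plugin.name)
    # Step 6: the caller loads to_load and finishes starting the app.
    return to_load, to_unload


PLUGINS = [
    Plugin("gpg_mail", {"encryption": ((2, 0), (2, 4))}),
    Plugin("vcard_export", {"address_book": ((3, 0), (3, 5))}),
    Plugin("future_thing", {"encryption": ((5, 0), (5, 9))}),
]

loaded, unloaded = negotiate(PLUGINS, (4, 2), find_update=lambda name: False)
```

With the app at 4.2 and no updates available, vcard_export loads even though it was written against 3.5, because the address-book API hasn't changed since 3.0. The gpg_mail plugin is unloaded because the encryption API changed at 4.1, and future_thing is turned away because it expects a newer app. That is the whole advantage over the brute-force approach: the version number alone decides nothing.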

Of course, there are various obvious optimisations and convenience features that can be built into this. Any presentation to the user can and likely should be aggregated; "here's a list of the plugins that I wasn't able to load and couldn't find updates for." Firefox and friends are a good open-source example of this. The checks for plugin updates can also be scheduled, so as not to slow down every app startup. This might be daily, weekly, twice a month, whatever; the important thing is to let the user configure that schedule and view a list of plugins that are installed but not active.

As I started this post by saying, I've been very favorably impressed by Mac apps' ease of use (including installation and maintenance). Mail fell down and couldn't get up again without outside assistance; this is unusual. The fact that this was caused by a plugin and that Mail could not detect and work around the conflict just amazes me; I expect more from Apple. I'm not ready to decrease my use of the Mac because this happened - but I am going to pay more attention to how things work under the hood. The fact that I have to be aware of this at all -- when not having to be was one of the features that hitherto distinguished the Mac from the grubbier Windows and Linux alternatives -- is worrisome.

Again, your comments are welcome.

Wednesday, December 17, 2008

Things that make you go 'Hmmmm', continued

Very much picking up from the mindset expressed in my earlier post... with the knowledge that this could (and probably should) be broken up into at least three different rants...

I've been working heavily in Python for the past couple of months, regrettably letting some other projects slide a bit. Now done with that, I spent yesterday picking up where I'd left off in a moderately-sized, reasonably well-designed PHP 5.2 project. (Bear in mind that my PHP experience is easily 5 or 6 times as much as my Python.)

And... while I'm not ready to jump on the "PHP sucks" bandwagon, it does feel clunky. Occasionally obscure (though never up to the standards of obscurity a good Perl hacker deals with every day).

Why is this? Three years ago, Joel Spolsky wrote an excellent rant on The Perils of JavaSchools. His point essentially boils down to this: how you're trained (or "educated") as a developer shapes the way you look at problems; if all you know is a "Hammer", you try to visualize every problem as a "Nail" (even when it's a "Glass Figurine"). You may well have less-than-satisfying success with that view.

More importantly, the tools and techniques you know shape whether you can properly understand a problem at all. Not having certain features in a language (or not being knowledgeable in their use) means that you'll write clunky, hard-to-understand (and therefore -maintain) code to achieve the desired result...and wind up with (an attempt at) calculus using Roman numerals. Just as having a positional numeric system (e.g. Arabic or "modern" numerals) makes whole classes of problems possible that weren't otherwise, languages and their features make programming problems practical or more efficient.

What we don't want is one language that tries to do everything, in every way possible. We already have that; it's called Perl, and one ramification of its overriding philosophy, "There's more than one way to do it", is that there's always a better way to do it; the search for same can and often does suck in resources faster than a Sol-sized black hole. This also illustrates the failing of most of the more recent practitioners of software development that I've worked with. While those of us who've been working since before, oh, about 1988 or 1990 generally make a point of reading at least one new technical book a month, I recently led a group of about 20 young (less than 5 years' experience as of 2006) developers where not a single one admitted to reading more than one technical book a year since graduation. These people didn't know how to solve the problem we were working on because the two or three tools with which they were familiar encourage their users to think in ways that do not lead to effective solutions for this problem.

A language, any language, can really only do a limited number of things well. If you are fluent in more than one human language, say, English and Mandarin Chinese, think for a moment about concepts and sayings that are natural in one language but just don't work well in the other. Computer languages are like that, too, which is one reason why your computer's operating system is much less likely to be written in COBOL than your company's accounting program is (with benefits for all concerned).

Getting back to what started this rant....coming back to PHP after a sojourn in Python....

PHP gets the job done. Recent versions, particularly the current 5.2, are much more pleasant to work in than previous versions were for those of us who "think in objects". But it has taken years to get here.

As I look at one class in particular in this PHP project's code base, part of my mind is working on "if this were Python code, I'd write it like..." - and a two-hundred-line unit of code would be about 60 or 80, and much cleaner and easier to understand to boot. Why is that? Think "original intent".

PHP was originally developed as an adjunct to HTML for Web pages, to provide some simple dynamic content. It then "just growed", adding new features and capabilities (database access, object orientation) as it became used in a wider variety of problems. It is, quite simply, a tool that grew into a reasonably useful - if not quite general-purpose - language. There are a number of things it does quite well, especially since it can be used to do useful work without requiring a steep learning curve beforehand.

Python is different. A general-purpose language, with functional-programming features, it is useful for Web application development (e.g., with mod_python for the Apache Web server). Whereas Perl has the idea that "There's more than one way to do it" - and therefore no best way - Python argues that "there should be one—and preferably only one—obvious way to do it", whatever "it" is.

So am I suggesting that PHP developers ditch everything and go with Python? Of course not. What I am arguing is the seemingly quaint notion that developers, especially those who aspire to work in the craft for a living, should continually strive to learn new tools, techniques, languages and processes. Your abilities are like every other living thing; they're either growing, or they're dying.

Tuesday, December 16, 2008

Happy Updating....


If you're a Windows usee with a few years' experience, you've encountered the rare, monumental and monolithic Service Packs that Microsoft release on an intermittent basis (as one writer put it, "once every blue moon that falls on a Patch Tuesday"). They're almost always rollups of a large number of security patches, with more added besides. Rarely, with the notable (and very welcome at the time) exception of Windows XP Service Pack 2, is significant user-visible functionality added. Now that SP3 has been out for seven months or so, it's interesting to see how many individuals and businesses (especially SMEs) haven't updated to it yet. While I understand, from direct personal experience, the uncertainty of "do I trust this not to break anything major?" (that is, "anything I use and care about?"), I have always advised installing major updates (and all security updates) as quickly as practical. Given the fact that there will always be more gaping insecurities in Windows, closing all the barn doors that you can just seems the most prudent course of action.

I got to thinking about this a few minutes ago, while working merrily away on my iMac. Software Update, the Mac equivalent of Windows' Microsoft Update, popped up, notifying me that it had downloaded the update for Mac OS X 10.5.6, and did I want to install it now? I agreed, typed my password when requested (to accept that a potentially system-altering event was about to take place, and approve the action), and three minutes later, I was logged in and working again.

Why is this blogworthy? Let's go back and look at the comparison again. In effect, this was Service Pack 6 for Mac OS X 10.5. Bear in mind that 10.5.5 was released precisely three months before the latest update, and 10.5.0 was released on 26 October 2007, just under 14 months ago. "Switchers" from Windows to Mac quickly become accustomed to a more pro-active yet gentle and predictable update schedule than their Windows counterparts. The vast majority of Mac users whom I've spoken with share my experience of never having had an update visibly break a previously working system. This cannot be said for Redmond's consumers; witness the flurry of application and driver updates that directly follow Windows service packs. XP SP2, as necessary and useful as it was, broke more systems than I or several colleagues can remember any single service pack doing previously...by changing behavior that those programs had taken advantage of or worked around. Again, the typical Mac customer doesn't have that kind of experience. Things that work, just tend to stay working.

Contrast this with Linux systems, where almost every day seems to bring updates to one group of packages or another, and distributions vary wildly in the amount of attention paid to integrating the disparate packages, or at least ensuring that they don't step on each other. Some recent releases have greatly improved things, but that's another blog entry. Linux has historically assumed that there is reasonably competent management of an installed system, and offers resources sufficient for almost anyone to become so. Again, recent releases make this much easier.

Windows, on the other hand, essentially requires a knowledgeable, properly-equipped and -staffed support team to keep the system working with a minimum of trouble; the great marketing triumph of Microsoft has been to both convince consumers that "arcane" knowledge is unnecessary while simultaneously encouraging the "I'm too dumb to know anything about computers" mentality - from people who still pony up for the next hit on the crack pipe. Show me another consumer product that disrespects its paying customers to that degree without going belly-up faster than you can say "customer service". It's a regular software Stockholm syndrome.

The truth will set you free, an old saying tells us. Free Software proponents (contrast with open source software) like to talk about "free as in speech" and "free as in beer". Personally, after over ten years of Linux and twenty of Windows, I'm much more attracted by a different freedom: the freedom to use the computer as a tool to do interesting things and/or have interesting experiences, without having to worry overmuch about any runes and incantations needed to keep it that way.

Comments very welcome, as always.

Wednesday, December 03, 2008

Modern Tools and Archaic Practices Shouldn't Mix

Sun have released NetBeans 6.5, which, among many other (potentially) useful and interesting features, claims to officially support Web development using PHP. This is, on the face of things, a major improvement from the situation under NB 6.1 and prior, which treated PHP essentially as other unsupported languages were treated: you could do raw text editing, but the features that are the entire point of using an IDE - auto-completion, search/cross-reference, and so on - were completely absent. Not so in 6.5; at least minimal support for features like code completion, auto-display of PHPDoc during code entry, and so on can be found here. After a few minutes of poking around, I was starting to get optimistic; here was a decent, if somewhat more heavyweight, alternative to the Komodo Edit which I had been using for some months. Why look for alternatives when I was extremely happy with Komodo Edit for the Mac? Because, almost every day, I sat down in front of Komodo Edit for Linux, and became frustrated with the inconsistencies, limitations and general less-polished feel (Why can't ActiveState include KE for Mac key emulation along with vi and emacs?)

So, back to NetBeans and PHP. I spent a few minutes putting together toy code just to see how the editor felt. Then I created the really one-and-only sample PHP project that came with NB 6.5, a site for a fictional India-based budget airline. Go through the 'New Project' wizard, select the project type, the directory to be used to contain the entire thing (for development, at least), and hit The Magic "Finish" Button.

And, voilà, a new project is born:

At first blush, nothing too obviously catastrophic. Rather non-semantic names for the image files, and the files under 'include' generally presume that you'll only ever need one nav bar, for example, but hey, it's a sample project, I tell myself. It's not necessarily meant to be production-quality; it's supposed to give you a starting point to either see how to use NetBeans to work in the PHP you already know, or how to use this PHP that's all over the Web in the NetBeans you've been using earlier versions of.

And then I double-click on the index.php file in the Projects pane. And my jaw hits the floor as I see... 1996-ASP-style intermingling of PHP code and raw HTML. OK, the DTD is from 1999 and the PHP code uses superglobals, which date from 2001, but you get the idea.

We've spent the better part of a decade, as a craft, running screaming away from this style of work. No sane, experienced PHP developer would write code like this today; we may not have (quite) advanced to the point where "everybody" uses the same tools for similar projects, but separation of presentation (HTML) and logic (PHP) is pretty universally seen as not just a Good Thing® but a Necessary Thing® if the site is ever going to be debugged/maintained. There are just so many problems that commingled code and markup create, unnecessarily, in living code. I'm well aware that a 'toy' example for a general-purpose editor-with-benefits cannot (and arguably should not) try to teach tyros the basics of the language in question.

But is it really too much to ask that such an example be written in a reasonably modern and correct style, or at least carry big red (say, 144-point Comic Sans) warnings to the effect of "DANGER: If you don't know why this is horrible practice, please go buy a book! The job you save may well be your own."?

Still using the free Komodo Edit on the Mac, trying to justify shelling out for the "real" Komodo IDE... but that's a deliberation for another post.



(updated Wed 17 December) Somehow, I'd managed to post this without specifying a title. Embarrassing, but fixed.

Sunday, October 12, 2008

Things that make you go 'Hmmmm'

....or 'Blechhhh', as the case may be....

I've been using PHP since the relative Pleistocene (I recently found a PHP3 script I wrote in '99). I've been using and evangelising test-driven development (TDD) for about the last five years, usually with most such work being done in C++, Java, Python or other traditionally non-Web languages (with PHP really only being amenable to that since PHP 5 in 2004).

So here I am, puttering away on a smallish PHP project that I've decided to TDD from the very beginning. For one of the classes, I throw together a couple of simple constructor tests in PHPUnit, to start, such as:
require_once( 'PHPUnit/Framework.php' );

require_once( '../scripts/foo.php' );

class FooTest extends PHPUnit_Framework_TestCase
{
    public function testCanConstructBasic()
    {
        $Foo = new Foo( 'index.php' );
    }
    public function testCanConstructBasicWildcard()
    {
        $Foo = new Foo( '*.php' );
    }
}
And, as is right and proper, I code the minimal class necessary to make that pass:
class Foo {
}
That's it. That's really it. No declaration whatever for the constructor or any other methods in the class. Since it doesn't subclass something else, we can't just say "oh, there might be a constructor up the tree that matches the call semantics."  PHPUnit will take these two files and happily pass the tests.

WTF?!?

I understand what's really going on here - since the class is empty, you've just defined a name without defining any usage semantics (including construction). I would say fine; not a problem. But I would think that PHPUnit should, if not give an error, then at least have some sort of diagnostic saying "Hey, you're constructing this object, but there are no ctor semantics defined for the class." I can see people new to PHP and/or TDD, who are maybe just working through and mentally adapting an xUnit tutorial from somewhere, getting really confused by this. I know I did a double-take when I opened the source file to add a new method (to pass a test not shown above) and saw nothing between the curly braces. On one level, very cool stuff. On another, equally but not always obviously important level, more than enough rope for you to shoot yourself in the foot.

Or, to put it another way, even though I've been writing in dynamic languages off and on for ages, I still tend to think in incompletely dynamic ways. Sometimes this comes back and bites me.  Beware: here be (reasonably friendly, under the circumstances) dragons.
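
For comparison only, here is a minimal sketch of the same experiment in Python, the other dynamic language I keep reaching for. Foo is the same deliberately empty class; can_construct() is a helper name I made up for the illustration. The point is simply that "dynamic" doesn't have to mean "accepts any constructor call":

```python
class Foo:
    """Deliberately empty, like the PHP class above."""


def can_construct(*args):
    """Return True if Foo(*args) succeeds, False if Python rejects the call."""
    try:
        Foo(*args)
        return True
    except TypeError:
        # With no __init__ defined, Python refuses constructor arguments.
        return False


print(can_construct())             # no arguments at all
print(can_construct("index.php"))  # one argument, nothing to receive it
```

This prints True, then False: the empty class can be instantiated, but handing it an argument it never declared is an error rather than a silent pass. Whether that's friendlier or just differently opinionated is a matter of taste.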

Friday, August 15, 2008

( C/C++ != C) && (C/C++ != C++)

A thought which ran through my mind as I was browsing some job requirements recently...
Why are recruiters still hung up on "C/C++", years after even Microsoft got around to shipping a reasonably compliant compiler (depending on your prejudices and code needs, anywhere from Visual Studio 6 in 1998 to VS.NET 2003)?

"C/C++" started life (or zombiehood) as a Microsoft marketing term back in the late 1980s with the release of Version 7 of their C compiler, which included "some C++ features". MSC 7 wasn't a "real" C++ compiler, but companies such as Borland (now CodeGear), Watcom (now part of Sybase), IBM and others, were shipping compilers that implemented the bulk of the (then-) Draft Standard in a (largely) portable, consistent fashion, so Microsoft was able to muddy the waters by calling their product "C/C++", secure in the knowledge that many of their customers had too little C++ experience to see through the marketing.

Incidentally, this (non-Microsoft) competitive innovation spurred numerous advances, such as Alexander Stepanov's (of AT&T, later at HP) Standard Template Library (STL). Microsoft, in response, introduced a "Container Class Library" which was in practice quite inferior (since it required contained objects to be derived from the Microsoft Foundation Class library's CObject class and, if memory serves, did not support either multiple inheritance or thread safety). Since Microsoft's compilers at the time did not properly support important Standard C++ features such as templates and runtime type information (RTTI) that were needed for the STL, the compiler defects created market opportunities for companies like Rogue Wave and Dinkumware to create products with similar but not identical function.

Timewise, this was when Microsoft was really starting to push developer lock-in - the practice of introducing non-standard and/or proprietary "features" which were made central to the development process. Despite the existence of numerous superior (in design, function and in productivity) class libraries such as Borland's ObjectWindows Library, Inmark's zApp library, the previously-mentioned Rogue Wave toolkits, and others, Microsoft's MFC carved out huge market share and mindshare, largely because:
  • it was bundled with the Microsoft C ("C/C++") compiler;
  • its limitations and defects mapped most closely to those of the underlying compiler;
  • it came with a primitive but usable GUI builder, for "click-and-drool" development; and
  • it was relentlessly praised by the Microsoft-beholden tech press of the day.
That last point should never be underestimated; publishers of less-than-laudatory articles, such as C Users Journal and Will Zachmann (when he was writing for PC Magazine), would find themselves cut off from Microsoft's press briefings, rumor mill and other means of "keeping up with the competition". This was meant as punitive, to "hurt" the "offenders"...who promptly wrote up the entire sordid affair, built a certain amount of loyal sympathy from the industry grass-roots, and survived quite well, thank you very much.

Getting back to "C/C++"... the term was a marketing fix to a technical problem which rapidly gained "mindshare" with its intended audience: marginally to non-technical people (senior managers, HR people, etc.) who wanted or needed to sound technically knowledgeable. Microsoft was able to play on their lack of real language knowledge coupled with follow-the-herd instincts to help force adoption in enterprises, from the top down. While this helped to increase sales, and helped preserve Windows' market share and lock-in in the enterprise for nearly two decades, it seriously retarded the take-up of standard, portable C++ in the industry (as intended). It also gave companies like ParcPlace (with Smalltalk) and NeXT, later Apple (with Objective-C) incentives to use "alternative" languages, either to gain some "control over their own destiny" independent of a competitor, or simply because C++ at the time was not up to the tasks which they wanted to accomplish.

In any event, by around 2000 (plus or minus a half-decade), Microsoft had caught up with where the rest of the industry had been for a decade or so (bringing serious, proprietary backward-compatibility baggage along with them). The marketing need for the 'C/C++' Newspeak was gone - but the corporate world that had learned the newfangled technical language back in the day was still in place, bound only by the Peter Principle (whose bar, thanks to the new technology throughout the enterprise, had been set depressingly high). Consequently, you still run across job ads with text like this (from the Singapore Straits Times of 13 August 2008):

C/C++ EMBEDDED SOFTWARE Engr. Contract. Call 6xxx7085

Truly informative about the needs; at first blush, seemingly written by a completely non-technical HR person. (I didn't follow up the advertisement to actually verify this, however.)

What's the point of this whole rambling rant? To try to impress upon you, my half-dozen Loyal Readers, a technical truism that has been around as long as there have been technical gadgets: "80% of what you know will be obsolete in n months; the other 20% will never be obsolete. Using that 80% beyond its shelf life just makes you look silly." Or, if not 'silly', then at least 'locked in to an out-of-date technology or idea.' And that, with very high likelihood, does not deliver a competitive advantage to your organization.

Tuesday, August 12, 2008

Test Infection Lab Notes

In a continuing series...

As current and former colleagues and clients are well aware, I have been using and evangelizing test-driven development in one flavor or another since at least 2001 (the earliest notes I can find where I write about "100% test coverage" of code). To use the current Agile terminology, I've been "test-infected".

Since my main Web development language is PHP 5.2 (and I'm anxiously awaiting the goodness to come in 5.3), I use Sebastian Bergmann's excellent PHPUnit testing framework. PHPUnit uses a well-documented convention for naming test classes and methods. One mistake often made by people in a hurry (novices or otherwise) is to neglect those conventions and then wonder why "perfectly innocuous" tests break. I fell victim to this for about ten minutes tonight, flipping back and forth between test and subject classes to understand why PHPUnit was giving this complaint:
There was 1 failure:

1) Warning(PHPUnit_Framework_Warning)
No tests found in class "SSPFPageConfigurationTest".

FAILURES!
Tests: 1, Failures: 1.
about this code:
class SSPFPageConfigurationTest extends PHPUnit_Framework_TestCase
{
    public function canConstruct()
    {
        $Config = new SSPFPageConfiguration();
        $this->assertTrue( $Config instanceof SSPFPageConfiguration );
    }
};
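which was "obviously" too simple to fail. The culprit, of course, is the method name: PHPUnit only picks up public methods whose names begin with "test" (or that carry an @test annotation), so canConstruct() was never seen as a test at all. Rename it testCanConstruct() and the warning goes away.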
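
PHPUnit is hardly alone in leaning on a naming convention; the xUnit family generally does. As a rough sketch (in Python's unittest rather than PHPUnit, with PageConfiguration as a hypothetical stand-in for the class under test), here is how a test loader quietly ignores a method that lacks the "test" prefix:

```python
import unittest


class PageConfiguration:
    """Hypothetical stand-in for the class under test."""


class MisnamedTest(unittest.TestCase):
    def can_construct(self):  # no "test" prefix: the loader skips it
        self.assertIsInstance(PageConfiguration(), PageConfiguration)


class ProperlyNamedTest(unittest.TestCase):
    def test_can_construct(self):  # "test" prefix: collected and run
        self.assertIsInstance(PageConfiguration(), PageConfiguration)


def count_tests(case):
    """Number of test methods the default loader finds in a TestCase class."""
    return unittest.TestLoader().loadTestsFromTestCase(case).countTestCases()


print(count_tests(MisnamedTest))
print(count_tests(ProperlyNamedTest))
```

The first count is 0 and the second is 1. The assertion inside the misnamed method is perfectly fine; it just never gets the chance to run.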

The wise programmer is not afraid to admit his errors, particularly those arising from haste. The novice developer proceeds farther on the path to enlightenment; the sage chuckles in sympathy, thinking "been there, done that; nice to be reminded that other people have, too".

Wednesday, July 23, 2008

Differences that Make Differences Are Differences

(as opposed to the Scottish proverb, "a difference that makes no difference, is no difference")

This is a very long post. I'll likely come back and revisit it later, breaking it up into two or three smaller ones. But for now, please dip your oar in my stream of consciousness.

I was hanging around on the Freenode IRC network earlier this evening, in some of my usual channels, and witnessed a Windows zealot and an ABMer going at it. Now, ordinarily, this is as interesting as watching paint dry and as full of useful, current information as a 1954 edition of Правда. But there was one bit that caught my eye (nicknames modified for obfuscation):
FriendOfBill: Admit it; Microsoft can outmarket anybody.
MrABM: Sure. But marketing is not great software.
FriendOfBill: So?
MrABM: So... on Windows you pay for a system and apps that aren't worth the price, on Linux you have free apps that are either priceless or worth almost what you pay (but you can fix them if you want to), and on the Mac, you have a lot of inexpensive shareware that's generally at least pretty good, and commercial apps that are much better. THAT's why Microsoft is junk... they ship crap that can't be fixed by anyone else.
FriendOfBill: So you're saying that the Linux crap is good because it can be fixed, and the Mac being locked in is OK because it's great, but Windows is junk because it's neither great nor fixable?
MrABM: Exactly. Couldn't have said it better myself.
Now...that got me to thinking. Both of these guys were absolutely right, in my opinion. Microsoft is, without question, one of the greatest marketing phenomena in the history of software, if not of the world. But it is unoriginal crap. (Quick: Name one successful Microsoft product that wasn't bought or otherwise acquired from outside. Internet Explorer? Nope. PowerPoint? Try again.) Any software system that convinces otherwise ordinary people that they are "stupid" and "unable to get this 'computer' thing figured out" is not a net improvement in the world, in my view. I've been using and developing for Windows as long as there's been a 'Windows'; I think I've earned the opinion.

Linux? Sure, which one? As Grace Hopper famously might have said, "The wonderful thing about standards is that there are so many of them to choose from." (Relevant to The Other Side: "The most dangerous phrase in the language is, 'We've always done it this way.'") As can be easily demonstrated at the DistroWatch.com search page, there are literally hundreds of active "major" distributions; the nature of Free Software is such that nobody can ever know with certainty how many "minor" variants there are (the rabbits in Australia apparently served as inspiration here). Since every distribution has, by definition, some difference with others, it is sometimes difficult to guarantee that programs built on one Linux system will work properly on another. The traditional solution is to compile from source locally, with the help of ingenious tools like autoconf. Though this (usually) can be made to work, it disproportionately rewards deep system knowledge to solve problems. The "real" fix has been the coalescence of large ecosystems around a limited number of "base" systems (Debian/Ubuntu, Red Hat, Slackware) with businesses offering testing and certification services. Sure, it passes the "grandma test"....once it's set up and working.

The Macintosh is, and has been for some time, the easiest system for novice users to learn to use quickly. Part of that is due to Apple's legendary Human Interface Guidelines; paired with the tools and frameworks freely available, it is far easier for developers to comply with the Guidelines than to invent their own interface. The current generation of systems, Mac OS X, is based on industry-standard, highly-reliable core components (BSD Unix, the Mach microkernel, etc.) which underpin an extremely consistent yet powerful interface. A vast improvement over famously troubled earlier versions of the system, this has been proven in the field to be proof against most "grandmas".

A slight fugue here; I am active in the Singapore Linux Meetup Group. At our July meeting, there was an animated discussion concerning the upcoming annual Software Freedom Day events. The question before the group was how to organize a local event that would advance the event's purpose: promoting the use of free and open source software for both applications and systems. What I understood the consensus to be basically worked out as "let's show people all the cool stuff they can do, and especially let's show them how they can use free software, especially applications, to do all the stuff they do right now with Windows." The standard example is someone browsing the Web with Firefox instead of Internet Explorer; once he's happy with replacement apps running under Windows, it's easier to move to a non-Windows system (e.g., Linux) with the same apps and interface. That strategy has worked well, particularly in the last couple of years (look at Firefox itself and especially Ubuntu Linux as examples). The one fly in the ointment is that other parts of the system don't always feel the same. (Try watching a novice user set up a Winprinter or wireless networking on a laptop.) The system is free ("as in speech" and "as in beer") but it is most definitely not free in terms of the time needed to get things working sometimes... and that cannot always be predicted reliably.

The Mac, by comparison, is free in neither sense, even though the system software is based on open-source software, and many open-source applications (Firefox, the Apache Web server) run just fine. Apache, for instance, is already installed on every current Mac when you first start it up. But many of the truly "Mac-like" apps - games, the IRC program I use, a nifty note organizer, and so on - are either shareware or full commercial applications (like Adobe Photoshop CS3 or Microsoft Word:mac). You pay money for them, and you (usually) don't get the source code or the same rights that you do under licenses like the GNU GPL.

But you get something else, by and large: a piece of software that is far more likely to "just work" in an expectable, explorable fashion. Useful, interesting features, not always just more bloat to put a few more bullet items on the marketing slides. And that gives you a different kind of freedom, one summed up by an IT-support joke at a company I used to work for, more than ten years ago.
Q: What's the difference between a Windows usee and a Mac user?
A: The Windows usee talks about everything he had to do to get his work done. The Mac user...shows you all the great work she got done.
That freedom may be neither economic nor ideological. But, especially for those who feel that the "Open Source v. Free Software" dispute sounds like a less entertaining Miller Lite "Tastes Great/Less Filling" schtick, for those who realize that the hour they spend fixing a problem will never be lived again, this offers a different kind of freedom: the freedom to use the computer as an appliance for interesting, intellectually stimulating activity.

And having the freedom to choose between the other, seemingly competing freedoms... is the greatest of these.

Tuesday, July 22, 2008

Best Practices Alleged; Your Mileage May Vary

Yahoo! quite often releases interesting/useful/thought-provoking tools for people doing "serious" Web development. I add the modifier to specify that we're usually not talking about the Joe Leet three-page magnum oopus; a lot of what they do and talk about really only pays huge returns when you work with a site as large and complex as, well, Yahoo!.

Recently, they brought out a couple of nifty tools that integrate into the Firefox browser's Firebug Web-developer-Swiss-Army-knife extension. One of these, YSlow ("why [my site] slow?") does some interesting evaluations and calculations against whatever page (with secondary requests) you throw it at. Its "Performance" tab shows how your page matches up against Yahoo!'s new "Best Practices for Speeding Up Your Web Site." At first blush, a lot of these make perfect sense; "Avoid Redirects", "No 404s", and so on. YSlow, on the other hand, evaluates against a slightly different set of guidelines to those on the Best Practices Page:

1. Make fewer HTTP requests
2. Use a CDN
3. Add an Expires header
4. Gzip components
5. Put CSS at the top
6. Put JS at the bottom
7. Avoid CSS expressions
8. Make JS and CSS external
9. Reduce DNS lookups
10. Minify JS
11. Avoid redirects
12. Remove duplicate scripts
13. Configure ETags

"Huh?", our hypothetical Web pseudogod Mr Leet might well ask. "What the heck is an 'ETag'? Or a 'CDN'? Does any of this even apply to me?" Well, Joe, yes and no. For instance, content-delivery networks like Akamai or ATDN, as you might well know by hearing the names, scatter servers at strategic places around the planet with the aim of reducing the time it takes to get data from huge, media-content-heavy sites like CNN.com or the like, down to your browser at the end of a surprisingly long chain. Does everybody who puts a site up need something like this? Does the average small-to-midsize business? Usually not, unless you really are a Web Hype-Dot-Oh site that shoves exabytes out every day to wow the yokels or the investors. For the local pizza joint with a site containing maybe forty files, tops, with a couple of megabytes of images, a CDN is thermonuclear overkill. As many Web-development sites have pointed out for the last decade, there's quite a bit you can do to speed things up and lower bandwidth usage without spending the big bucks on this.

Why do I blather on about this when I started talking about best practices and YSlow? Because for practices to be "best", they first and foremost have to be appropriate for the use at hand. Buying a Lamborghini Countach to go down to the corner store for some sodas will quite likely get you yelled at by the Significant Other (followed by your bank). But if Lewis Hamilton showed up at pole position in a '72 Ford Pinto... you'd hear the laughter from St Paul to São Paulo.

Use the tools and techniques appropriate to the task at hand. There's a lot that small Website developers can learn from Yahoo! and the tools they publish. Getting an "A" score has a certain karmic appeal, and most of the optimizations required are straightforward anyway (tweaking how your Web server serves your data, for the most part). But is this worth all the geek love it's been getting?
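
To put some shape on "tweaking how your Web server serves your data", here is a small sketch of what three of the cheaper guidelines (gzip components, add an Expires header, configure ETags) amount to. It's an illustration, not a recipe: in practice you would switch these on in the server's configuration (Apache's mod_deflate and mod_expires, say) rather than write code. The cache_headers() helper, the 30-day lifetime, the hash-based ETag and the sample pizza-menu markup are all my own assumptions.

```python
import gzip
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(body: bytes, max_age_days: int = 30) -> dict:
    """Build illustrative response headers for a static file's bytes."""
    expires = datetime.now(timezone.utc) + timedelta(days=max_age_days)
    return {
        # Tells the browser the body was compressed before sending.
        "Content-Encoding": "gzip",
        # A content hash makes a simple validator: same bytes, same tag.
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
        # Far-future expiry lets the browser skip the request entirely.
        "Expires": format_datetime(expires, usegmt=True),
    }


# Hypothetical, highly repetitive markup of the kind a small site serves.
page = b"<li class='menu-item'>Margherita</li>\n" * 200
compressed = gzip.compress(page)

print(len(page), "bytes raw,", len(compressed), "bytes gzipped")
print(cache_headers(page))
```

Repetitive markup like this shrinks dramatically under gzip, which is exactly the sort of saving the local pizza joint can have without going anywhere near a CDN.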

Until someone with the developer credibility and experience of a Yahoo! stands up and explains a better set of practices for the SMB developer, the answer seems to be "yeah, probably". We who make our living (or our diversion) from the creation, care and feeding of Web sites are, for the most part, artisans posing as engineers, with inconsistent knowledge or practice of our craft; we dream of building the online equivalent of the Empire State Building but wind up with the Cologne Cathedral; a wonder, yes, but surely 600 years was well beyond the original estimated schedule! Agreed-upon standards (so that, say, a page appears identical in different browsers); a shared, common body of knowledge; even (gasp!) widespread, vendor-neutral certifications of professional competence will eventually become common in software (including Web) development for the same reasons as, say, in architecture. The artifacts involved (skyscrapers, Web sites) have important social and policy implications, and inconsistent competence in practice poses a real and serious danger to the public at large. Sooner or later, it's going to be uneconomic for the present ad hoc system to advance the state of the art, or to meet the needs placed upon its products.

Best practices are good; best practices that actually work for the stated purposes in a broad variety of praxis are much better. But to get there, we're going to need to collaborate and communicate effectively, and to do that, we're going to have to make sure everybody involved is speaking the same language to describe the same things. If we don't, we'll continue to be stuck in pretty much the same place we are now - with a bunch of shade-tree mechanics running around in the pits at the Monaco Grand Prix...only doing a lot more damage.

Comments are welcomed, as always.

Friday, July 11, 2008

Does anybody else have a problem with this?

If you've got an ssh connection to a Debian or Ubuntu Linux box handy, and you have sudo privileges on that box, try this little experiment:
  1. ssh to your box as an ordinary user;
  2. sudo su to get a root prompt (you should be asked for your password - this is important);
  3. as soon as you get the root prompt, exit back to normal user, then exit your ssh session entirely.
Now, here's the scary part:
  1. ssh to that same box again right away, as the same user;
  2. sudo su to get a root prompt again.
Why is this scary? Because the second time you ask for a root prompt, you're not prompted for a password. This means that, not only does the actual Linux box require access and user security appropriate to its function, but so does every device that can ssh into it with a rootable user!
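
What's at work here is most likely sudo's credential caching: after a successful password check, sudo remembers it for a while, and depending on the sudo version and how it is configured, that memory can be tied to the user rather than to the terminal session. Running `sudo -k` before logging out discards the cached credentials. For a blanket fix, here is a minimal sketch of the relevant sudoers settings. It assumes a stock sudo, and the values are an illustration rather than a recommendation for every box; edit the file with visudo, and check `man sudoers` for what your version supports.

```
# /etc/sudoers fragment (edit with visudo); illustrative values only.
# Either line changes the behavior described above; you would normally pick one.
Defaults timestamp_timeout=0   # 0 = ask for the password on every sudo invocation
Defaults tty_tickets           # cache credentials per terminal instead of per user
```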

I'm sure this isn't in any way new, but in 10+ years of using Linux, I just now encountered that scenario for the very first time. As Linux is becoming more popular, and more users are marching up the 'power user' scale, this is something that should be paid attention to - especially in a business environment. Yowza!

Thursday, July 10, 2008

Standard Standards Rant, Redux: Why the World-Wide Web Isn't "World-Wide" Any More

The "World Wide Web", to the degree that it was ever truly universal, has broken down dramatically over the last couple of years, and it's our mission as Web development professionals to stand up to the idiots that think that's a Good Thing. If they're inside our organization, either as managers or as non-(Web-)technical people, we should patiently explain why semantic markup, clean design, accessibility and (supporting all of the above) standards compliance are Good for Business. (As the mantra says, "Google is your most important blind customer," because your prospective customers who know what they're looking for but don't yet know who they're buying it from find you that way.) Modern design patterns also encourage more efficient use of bandwidth (that you're probably paying for), since there's less non-visible, non-semantic data in a properly designed nest of divs than in an equivalent TABLE structure. Modern design also encourages consistent design among related pages (one set of stylesheets for your entire site, one for your online product-brochure pages, and so on). Pages that look like they're related and are actually related reassure the user that he hasn't gotten lost in the bowels of your site (or strayed off into your competitor's). It's easier to make and test changes that affect a specified area within your site (and don't affect others). It's easier to add usability improvements (such as letting users control text size) when you've separated content (XHTML) from presentation (CSS and, in a pinch, JavaScript). Easier-to-use Web sites make happier users, who visit your site more often and for longer periods, and buy more of your stuff.

Experienced Web developers know all this, especially if they've been keeping up with the better design sites and blogs such as A List Apart. But marketing folks, (real) engineers and sales people don't, usually, and can't really be expected to -- any more than a typical Web guy knows about internal rate of return or plastic injection molding in manufacturing. But you should be able to have intelligent conversations with them, and show them why 1997 Web design isn't usually such a good idea any more. (For a quick Google-eye demo, try lynx).  Management, on the other hand, in the absence of PHBs and management by magazine, should at least be open to an elevator pitch. Make it a good one; use business value (that you can defend as needed after the pitch).

That's all fine, for dealing with entrenched obsolescence within your own organization. What about chauvinism outside - from sites you depend on professionally, socially or in some combination? For years, marginalized customers have quietly gone elsewhere, with at most a plaintive appeal to the offenders, pointing out that a good chunk of Windows users don't browse with Internet Explorer anymore (check out the linked article; a major business-tech Website from 2004(!!); the arguments are much stronger now). But some companies, particularly Microsoft-sensitive media sites like CNet and its subsidiary ZDNet, still don't work right when viewed with major non-Windows browsers (even when the same browser, such as Opera or Safari, works just fine with that site from Windows). And then there are the sites for whom their Web presence is the entire company, but they haven't yet invested the resources into competent design required to take their site construction from a point-and-drool interface virtually incapable of producing standards-compliant work, and instead present a site that a) actively checks for IE and snarls at you if you're using anything else, and b) has their design so badly broken and inaccessible that people stay away in droves. (Yes, I'm looking at you - every click opens a new window).

When we encounter Web poison like this, we should take the following actions:
  • Notify the site owner that you will use a better (compatible, accessible, etc.) site, with sufficient details that your problem can be reproduced (flamemail that just says "Teh site sux0rs, d00d!" is virtually guaranteed to be counterproductive);
  • When you find an acceptable substitute, let that site's owners know how they earned your patronage. Send a brief thank-you note to one or two of their large advertisers (if any), as well as to the advertisers on the site you've left (if you know any). Politely thank them for supporting good Web sites, or remind them why their advertising won't be reaching you anymore (as appropriate);
  • Finally, there really ought to be a site (if there isn't already) where people can leave categorized works/doesn't-work-for-me notes about sites they've visited. This sounds an awful lot like the original argument for Yahoo!; I can see where such a review site would either die of starvation or grow to consume massive resources. But praise and shame are powerful inducements in the offline world; it's long past time to wield them effectively online.
I'm sure that there are literally millions of sites with Web poison out there, and likely several "beware" sites as well. For the record, the two that wasted enough of my week this week to deserve special dishonor are ZDNet and JobStreet. Guys, even Microsoft doesn't lock people out and lock browsers up the way you do; I can browse MSDN and Hotmail just fine on my Mac, on an old PC with Linux, or on an Asus Eee. And if you need help, I and several thousand others like me are just an email away. :-)

Wednesday, July 02, 2008

It's easy to think there's a war going on...

(playing softly, in the background of my mind, The Beatles' Revolution)

....between the Web developers promoting nice, clean development with RESTful, semantic (X)HTML judiciously enhanced with CSS and JavaScript (henceforth often referred to as the "Army of Light") and those using "popular", "mainstream" frameworks such as CakePHP and the Zend Framework, who route everything through a Front Controller of some sort, and often seem to be in the dubious company of WS-Whatever Web services (which, I am reliably told, provide ample amounts of The Wrong Kind of job security - they know more about your app than you do, and aren't telling what they know). The sides would seem to be pretty cut-and-dried, judging from a lot of the blog activity (Google REST XML-RPC PHP to get a million and a half or so hits of light reading material). Except...

Briefly skimming through the Zend Framework documentation, for instance, and looking at the QuickStart and tutorials reinforces the idea that URL handling is routed through a front controller to an application-specific action controller, which is the C in the notorious (and some say overused) MVC (model-view-controller) framework. Originally developed to help improve desktop-application development, particularly in languages like Java and Smalltalk, it became popular for Web development because.... it seemed like a good idea at the time. Actually, for Web development in the Pleistocene (say, late-1990s), it was a good idea. Anything that cut through the estimated 27.612 interconnected details that needed to be simultaneously mastered to get a "Hello, World" EJB up and happy was, by its very existence, a Very Good Thing. And so, when shops moved to more productive, less pathologically irrational development systems than J2EE, the models and design patterns that had saved their bacon were brought over into the New World, to maintain conceptual touchstones that helped Useful Work Get Done. Happiness abounded throughout the realm, until apps started outgrowing the meager bounds of static HTML and became "Rich Internet Applications". (To the tune of "Lions and Tigers and Bears, Oh My!", you hear faint murmurs of "AJAX and WS-* and REST, Oh My!") And, to pile on the snowclones, there really be dragons there.

'Dragons' in the form of falling into a GET-centric, action-oriented, everything-just-a-click-away world of convoluted Web apps with limited (re)usability and even less understandability to those who haven't swum there in some time. The entire promise of REST is simple: by centering applications around resources, rather than actions (through the use of URIs, Uniform Resource Identifiers) and following the eminently sensible notion of not putting kilobytes of state information into those URIs (necessary information is POSTed along with the URI request), many problems that become painfully visible in large systems simply go away. (Try sending a link to a cool book you found on Amazon over an instant messenger chat.)

But a typical, outsourced-development, haven't-really-used-this-tool-and-you-want-it-WHEN?!? developer isn't going to think of those things. He's going to grab a tool that has promising-sounding Google hits, run through a tutorial or two, and then plunge into the Son of the Enhancement of the Rewrite of yehey.com, with the customer sending him an "is it done yet?" email every six minutes. Clean design? What's that? Well-guarded state transitions? Who's got time to even understand that, let alone implement it? If we don't get it done, the customer's going to pull the project and send it to Vietnam or somewhere...

Just to make one point absolutely clear: I don't mean to be picking on Zend and CakePHP as being more than simply representative of widely-used, well-reputed tools that can be used to get the unwary, rushed developer (are there any other kind earning a paycheck?) into trouble. While it is entirely practical to write semantic, RESTful Web applications in both frameworks (and both document how to do so), it's like, say, RPG; a fantastic tool for solving problems in a well-defined domain, usable with significant effort outside that domain, and Zeus help you if you use it to write an MMORPG.

The real point of this rant, if it hasn't hit you like a Muhammad Ali speed-anchor punch, is another pout over how we've allowed the once-honorable craft of software development (of which Web development is but a specific case) to degenerate into absolute bollocks. We've allowed the pay-any-price-to-cut-costs, pinch-a-penny-until-you-can-hear-it-scream-from-Boise-to-Bangalore idiots to pervert us from Muhammad Ali (or at least Sonny Liston) into Herschel Shmoikel Pinkus Yerucham Krustofski. A plurality, if not yet an overwhelming majority, of those who call themselves "software development 'engineers'" have been given neither sufficient formal training in their craft nor the resources (time, money, support, etc.) to continue learning as they go. "If you can spell EJB and ERP, you're the guy for us - as long as you're young and dirt cheap. And when you're done with that, we've got some BASIC code we want in Java instead."

So at the unique moment in history when ephemeral intellectual artifacts have assumed primacy in a wide range of human affairs, the humans whose intellect is responsible for their creation and correct functioning have progressively less ability to do the job properly. The way that, if they sit back and think for a moment, they know should be possible, has to be possible in any sort of rational omniverse whatsoever. But few, if any, ever get that chance for reflection. Fewer still, having reflected, researched and enlightened themselves, are welcomed back into the paying ranks who toil away at this once-noble craft.

And my Zend Framework code still feels slimy. It's not Zend's fault, at least, not entirely. Front controllers are good; front controllers are your friends; front controllers are.... *crunch!*

Saturday, June 28, 2008

It's Time to Grow Up

Adrian Kingsley-Hughes over at ZDNet has a very interesting post up, titled "Sticking with XP / Upgrading to Vista / Waiting for Windows 7 / Switching to Mac or Linux - There’s no single right answer". He puts forward the blitheringly-obvious elephant-in-the-room answer to the perennial food-fight question, "Which system is best?" Namely, "use what works for you." As I read through the post and the comments that had been added to it, I thought about all the man-centuries (if not -millennia) that had been "invested" in the topic. Naturally, I had my own two rupiah worth to say on the topic. Following is the text of the comment I left at approximately 1645 GMT on Friday 27 June. Let me know what you think. (There aren't any links in the post reprinted below; to the best of my knowledge, ZDNet's commenting software hates links, and it definitely hates Macs; every single comment I've posted has given me the error "You must enter the text to post" - on a clean, empty comment form - AFTER I've hit the "Add your opinion" button. Fix it, guys!)


"Use what you like, and like what you use."

That's excellent advice for those of us who've been bouncing around in the funhouse for a while, who know which mirrors make us look weird (or, worse, are broken and likely to cut us if we're not careful)...and, granted, there are blessed few truly "new" users now; statistically, nearly everybody's used a PC with one form or another of Windows, and increasing numbers of us have used Mac and/or Linux, but...

We still do have the FNG syndrome with folks who haven't upgraded for a while, and finally they get tired of their molasses-slow Win98 box when they see this zippy new PC or Mac they've been handed at work. "Gee, we use XP at work, but I heard Microsoft isn't going to sell it anymore... what should I use?" Many of us, professionally or otherwise, are tasked with advising those people. Too often, the advice becomes "this is what I use; try it" without really understanding the (often vast) difference between the adviser and the user in question. And when "advisers" try and hash things out among themselves, it almost universally degenerates into an Animal House food fight scene - which doesn't bring any value to the discussion and actually makes us LESS able to give good advice.

Mental hands up: How many of you reading this have ever spent a month using Leopard? Vista? At least two of Ubuntu, Fedora or SuSE Linux? How many raised your hand all three times? Yeah, I see you, way back in the back.... but blessed few others.

In any other endeavour that dared call itself a craft, let alone aspire to an engineering discipline, this would be malfeasance if not negligence; you just DO NOT give advice on matters in which you are not qualified - and if you don't have experience and/or training with Technology X, you're NOT qualified to present advice as being any more valuable than used toilet paper.

In that light, Adrian has performed a major public service here. Given the reality that most people whose job relies on using and/or developing for one of the major platforms are quite unlikely to be as current and proficient on any of the others, this is the best advice that will just let us get along with our jobs without pissing in each other's lemonade each and every single day (Mike Cox and No_Axe, you know who you are).

But in the increasingly unlikely event that we're ever to make something professional out of this hobby that we are lucky enough to get paid for, the fact that this "solution" is seen as viable to any degree, let alone the MOST viable solution, is absolutely, reprehensibly unacceptable. And since computers and software have become absolutely central to nearly everything in modern life, including not least public policy, if we don't get our house in order under our own power, sooner or later some governmental organization or group thereof is going to step in and exert adult supervision. Is that what we want?

The days when Windows geeks and Mac users and Linux hackers could happily putter away, each in their own walled garden with tactical nuclear landmines guarding against any encroachment by reality, are gone as surely as the clipper ship. In the world of the Internet, where information is what's important and how it's processed/generated/visualized/stored is at best secondary, we're faced with the same choice as every biological or cultural organism at an evolutionary shift: adapt or die. Keep the lemonade clean, or drink the purple Kool-Aid. Our choice. Each and every one of us.

Tuesday, June 24, 2008

Browser Support: Why "Internet Explorer 6" Really Is A Typo

(Experienced Web developers know that the correct name for the program is Microsoft Internet Exploder - especially for version 6.)

Case in point: I was browsing the daringfireball.net RSS feed and came across an article on the 37signals blog talking about Apple's new MobileMe service dropping support for IE6. The blog is mostly geared towards 37signals' current and potential clients who, if not Web developers themselves are at least familiar with the major technical issues involved. Not surprisingly, virtually every one of the 65 comments left between 9 and 13 June was enthusiastic in support for the move; not because the commenters necessarily favor Apple (though, clearly, many do), but because anybody who's ever cared about Web standards knows that IE6 is an antediluvian, defiantly defective middle finger thrust violently up the nostril of the Web development community; the technological equivalent of the Chevrolet Corvair: unsafe at any speed.

The degree to which this is true, and to which this truth continues to plague the Web developer and user communities, was brought into sharp focus by three of the comments on the post. The first, from 37signals' Jason Fried, estimates that 38% of their total traffic is IE, of which 31% is IE 6.0 (giving a grand total of 11.8% of total traffic - not huge, but significant). The second is from Josh Nichols, who points out that Microsoft published a patch to solve the problem with IE6 in January, 2007; he notes, however, that an unknowable number of users may not have applied that patch. Finally, Michael Geary points out that later versions of Internet Explorer (version 7 and possibly the now-in-beta Version 8) also have the problem of not being able to "set cookies for a 2×2 domain, i.e. a two-letter second level domain in a two-letter top level domain with no further subdomain below that," (including his own mg.to blog domain). The fact that relatively few domains fall into that category can be argued to be part of the problem; users running IE, particularly an out-of-date version of IE, are likely to be less experienced, and more likely to blame it on "something wrong with the Internet" than to recognize and solve the problem correctly. For those people and companies who've paid for those perfectly legitimate domains, the negligence and/or incompetence of the browser supplier and/or user mean that they're not getting their money's worth. And ICANN, the bureaucracy "managing" the domain-name system, is now "fast-tracking" a proposal to increase the number of top-level domain names (TLDs) used. (In time-honored ICANN custom, the press release is dated 22 June 2008 and "welcome[s]" "Public Comments" "by 23 June 2008." Nothing like transparency and responsiveness in governance, eh?)
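
To make Geary's "2×2 domain" definition concrete, here is a small shell sketch that tests a host name against it. The function name `is_2x2_domain` is made up for illustration, and the check only looks at the shape of the name (two characters, a dot, two letters, nothing else); it knows nothing about which TLDs actually exist or about how IE decides to reject a cookie.

```shell
# Succeeds (exit status 0) when the host name is a two-character label
# directly under a two-letter TLD, with no further subdomain: e.g. mg.to.
# Hypothetical helper, written only from the definition quoted above.
is_2x2_domain() {
  case "$1" in
    [A-Za-z0-9][A-Za-z0-9].[A-Za-z][A-Za-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: report on a few names.
for host in mg.to www.mg.to example.com; do
  if is_2x2_domain "$host"; then
    echo "$host: 2x2 domain"
  else
    echo "$host: not a 2x2 domain"
  fi
done
```

Of the three names in the loop, only mg.to fits the pattern; adding a subdomain or a longer label takes a name out of the category.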

Thursday, June 19, 2008

Good Things are good things....aren't they?

Anybody who's worked with me over the last 20 years or so knows that I generally evangelize conforming to standards when they exist, are relevant and widely agreed on. As the famous quote from Andrew Tanenbaum (in Computer Networks, 2/e, p. 254) reminds us, "The nice thing about standards is that you have so many to choose from." When "standards" are used to promote vendor agendas (e.g., Microsoft force-feeding OOXML to a hapless ISO), or when they go against the common sense built up through hard-won experience by practitioners, or when multiple standards for a product or activity exist, and those standards are each widely used by various users (who could have chosen other alternatives), and those standards conflict with each other in important ways that can't be amicably resolved, then those "standards" cause reasonable people to question not merely their validity but, too often, the entire concept of "standards".

As any developer who's worked in more than one shop, or sometimes even on more than one project in a shop, knows, coding standards are sometimes arbitrary, often the prizes and products of epic bureaucratic struggle, and (in the absence of automated enforcement such as PHP CodeSniffer) often honored more in the breach than the compliance. What makes things even more "fun" is conflicting standards. It's not all that unusual for a company to contract out for development work, specifying that their coding standards be complied with (since they're the customer and they're going to maintain, or control maintenance of, the code). If the contractor has their own set of standards that conflict with the customer's, then problems arise with internal process compliance, customer involvement and final delivery. It can be - and too often is - a sorry mess. Simple code reformatting problems can be taken care of with a pretty-printer program; oftentimes, though, one sees entire programs (which have to be debugged, documented and maintained) developed just to "translate" one format to another. Many shops just give up, declare the project to be an exception or exemption from their own internal standards and processes, and try to conform to the customer's demands. "Try to", since their developers, both writing and reviewing the code, are going to be fighting against it tooth and nail because it just "feels wrong".

This whole rant was inspired by reading through yet another coding-standard document; this one the Zend Framework PHP Coding Standard. One item in particular struck me as counter-intuitive. In Item B.2.1.1, PHP File Formatting - General, it says:

For files that contain only PHP code, the closing tag ("?>") is never permitted. It is not required by PHP. Not including it prevents trailing whitespace from being accidentally injected into the output.

Experienced PHP developers are quite likely to have problems with this, not least because it conflicts with earlier behavior of the PHP interpreter and with tools that expect well-formed code. This is one of the oddities which tools like the aforementioned PHP CodeSniffer need to take into account. (There are other, more blatant "yellow flags"
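
As a rough illustration of what a checker has to do for the rule quoted above, this shell sketch reports whether a file's last non-whitespace characters are the closing tag. It is not how PHP CodeSniffer implements its sniffs; the function name `ends_with_close_tag` and the file names are invented for the example.

```shell
# Succeeds when the last two non-whitespace bytes of the file are "?>".
# Hypothetical helper; it ignores edge cases such as a "?>" that sits
# inside a trailing comment or string.
ends_with_close_tag() {
  [ "$(tr -d ' \t\r\n' < "$1" | tail -c 2)" = "?>" ]
}

# Usage: two throwaway files, one per convention.
tmpdir=$(mktemp -d)
printf '<?php\n$x = 1;\n?>\n\n' > "$tmpdir/with_tag.php"      # closing tag plus a stray blank line
printf '<?php\n$x = 1;\n'       > "$tmpdir/without_tag.php"   # no closing tag

for f in "$tmpdir"/*.php; do
  if ends_with_close_tag "$f"; then
    echo "$(basename "$f"): has closing tag"
  else
    echo "$(basename "$f"): no closing tag"
  fi
done
rm -r "$tmpdir"
```

The stray blank line after the tag in the first file is the "trailing whitespace" the Zend rule is worried about: anything outside the PHP tags is sent to the output as-is.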

If you're in a shop which takes standards seriously, uses PEAR code and uses the Zend Framework, your code review meetings are likely quite interesting.
  • "OK, we're going to look at foodb.class.php first, and then the others I mentioned in the email yesterday."
  • "Which standard does it use?"
  • "Well, it ties in with PDO, so it ought to follow the PEAR standard, right?"
  • "OK, that sounds reasonable."
As the meeting continues...
  • "Hey, what's this at the end of omnibar.class.php? There's no 'close-PHP' tag! If we start using the code-search Wiki plugin that the Bronx Project folks keep raving about, it's not going to like that...."
  • "Oh, yeah, but that's because it uses all this Zend Framework stuff, so we use Zend's coding conventions... see that comment at the top about how to run CodeSniffer?"
  • "Riiiiight...."
and so on. Weren't process and standards supposed to make development easier and more reliable?

I agree with the sentiment, apocryphally attributed to one or another of numerous software gurus, that, in the presence of otherwise adequate and sufficient standards, we shouldn't be so "egotistical" as to think developing a "better" standard than others already have is worth our time; take what's already out there, adapt as necessary, and move forward. The trick, of course, is in evaluating that condition, "otherwise adequate and sufficient." Also, since our craft is (hopefully) continuing to advance and adopt standard patterns for things done before, striking out on your own (after careful consideration) demands that the question be revisited from time to time. What are other development groups using (broadly) similar techniques to solve (broadly) similar problems using? Is a consensus forming, and do we have anything useful to say about it? Or has a single standard already taken hold, and we can take advantage of it (at least for new or reworked code)?

Code analyzers like lint and PHP CodeSniffer can be amazingly useful. But for them to function as standard/policy enforcement tools, there must be a standard, or a small group of similar standards for them to enforce. When development teams have to juggle between incompatible standards, it discourages them from following any standards. And in that direction lie... the 1970s.

Monday, June 16, 2008

g++ != gcc (arrrrrrgh!)

Coming back up to speed on Mac programming, now that I've finally got a shiny new iMac. Their Xcode IDE looks like a great tool (Objective C, C++, C, etc., etc.), but I was hacking around building some simple test code from the command line. With Mac OS X being a fully-certified Unix operating system, of course that's an easy way to get something done while minimizing the number of known and unknown unknowns that need to be dealt with.

This mostly transpired between 2300 Sunday and 0130 Monday (last night/this morning). I installed CppUnit (Boost was already on the system), and kept running into the same problem.
jeffs-imac:Foo jeff$ gcc -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L /usr/local/lib -lcppunit -o foo
ld: in /usr/local/lib, can't map file, errno=22
collect2: ld returned 1 exit status

Hmmm. Maybe the library that's on there is screwed up somehow? Go to Sourceforge, pull down the library source, build it, install it, and try again.

Same thing. Fiddle with the code, fiddle with the command line, nothing fixes it. Go out and Google for help. The very first hit, from MacOSXHints, had a silly-sounding but tantalizing "clue":

You're right. I figured out the problem is that I was missing the -c switch when building the .o file with gcc. For some reason the linker doesn't complain about it, but when I try to link the shared lib with my main program I get the obscure can't map file error. Now it is working. Thanks.

Hmmm again. Go fiddle some more, this time compiling and linking my trivial proof-of-concept in separate gcc command lines. Still no joy.

I go back and play with building libcppunit again, wondering if I've missed some funky option to configure. Nope. It's pushing 0130, I need to be up at 6-something, and my brain is fried, so I shut down for the night.

(Later) in the morning, something's niggling at the back of my mind, saying I missed something while watching libcppunit compile, so I do it again. Yep, cue the "you dumb palooka!" moment: it's not using gcc to compile; it's using g++. For those who've never had the (dubious) pleasure - gcc is the "general-purpose" front-end to the GNU Compiler Collection, a set of (numerous) language systems, of which g++ is the C++-specific toolchain (and standalone front end). All compilers in the Collection produce object code in compatible format (the same back-end is used for almost everything), so usually all you have to do is invoke the One Command to automagically compile your Ada, FORTRAN, Java, PL/I, whatever. And, to be honest, it had been a few months since I'd dealt with C++ on gcc/g++ from the command line. (Thank you, Eclipse!) But I remembered a bit of wisdom lost in the Mists of Time...

They can both compile your C++ (in a single step). But: They're. Not. The. Same.

So, I re-do my (now-) two-step process, substituting g++ for gcc:
jeffs-imac:Foo jeff$ g++ -c -I /opt/local/include/ -I /usr/local/include/cppunit/ foo.cpp
jeffs-imac:Foo jeff$ g++ foo.o -L/opt/local/lib -lcppunit -o foo
jeffs-imac:Foo jeff$
Ta-daaaaa!  OK, can we go back to a single step? That is, compiling and linking (using g++) in one go:
jeffs-imac:Foo jeff$ g++ -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L /usr/local/lib -lcppunit -o foo
ld: in /usr/local/lib, can't map file, errno=22
collect2: ld returned 1 exit status
jeffs-imac:Foo jeff$
Nope. This is where the MacOSXHints commenter was right on the money. But why? I used to be knee-deep in the (FORTRAN-specific) code, a feeling akin to being knee-deep in the dead on occasion, and the answer doesn't come immediately to mind. Any ideas?
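
One detail in the transcripts above may be worth a second look: both failing command lines shown pass the library directory as `-L /usr/local/lib`, with a space, while the one that links cleanly uses `-L/opt/local/lib`, with none. The error itself says ld tried to map `/usr/local/lib` as a file, which is what you would expect if the directory name reached the linker as a separate input rather than as part of the `-L` option. That is a guess from the transcripts, not a confirmed diagnosis; if it holds, the one-step-versus-two-step difference is not what causes this particular error. (Using g++ rather than gcc for the link still matters for a separate reason: g++ adds the C++ standard library automatically, gcc does not.) The hypothetical helper below only makes the difference between the command lines visible; the retry it points to is left as a comment.

```shell
# Print each argument that directly follows a bare "-L" on a command line.
# Hypothetical helper for illustration only; it does not run the compiler.
detached_lib_dirs() {
  prev=""
  for arg in "$@"; do
    if [ "$prev" = "-L" ]; then
      printf '%s\n' "$arg"
    fi
    prev=$arg
  done
}

# The failing one-step command from above: prints /usr/local/lib
detached_lib_dirs g++ -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L /usr/local/lib -lcppunit -o foo

# The working link step from above: prints nothing
detached_lib_dirs g++ foo.o -L/opt/local/lib -lcppunit -o foo

# Untested suggestion: retry the one-step build with the directory attached.
# g++ -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L/usr/local/lib -lcppunit -o foo
```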

Saturday, May 10, 2008

ANFSD: starting a series to scratch an itch

(And Now For Something Different, for the 5LA-challenged amongst you...)

I've made my living, for about half my career, on the proposition that if I stayed (at least) three to six months ahead of (what would become) the popular mean in software technology, I'd be well-positioned to help out when Joe Businessman or Acme Corporation came along and hit the same technology - with the effect of "Refrigerator" Perry hitting a reinforced-concrete wall. This went reasonably well as "the market" started using PCs, then GUIs, then object-oriented programming, and then "that Internet thingy" (Shameless plug: résumé here or in PDF format).

In other ways, I've been a staunch traditionalist. I've used IDEs from time to time, because I was working as part of a team that had a standard tool set, or because I was programming for Microsoft Windows and the Collective essentially requires that that be done in their (seventh-rate) IDE unless you want to decrease productivity by several dozen orders of magnitude.

Otherwise, just give me KATE or BBEdit and a command-line compiler and I'm happy. This continued for a significant chunk of the history of PCs, until I decided that, for the Java work I was doing, I really needed some of the whiz-bang refactoring and other tie-ins supported by Eclipse and NetBeans. Then I started hacking around on a couple of open-source C++ packages and thought I'd give the Eclipse C/C++ Development Tooling a try. Now I'm coming up to speed on wxWidgets development in C++.

During this learning-curve week, I spent a lot of time browsing the Web for samples, tutorials and so on. To call most of them execrable is to give them unwarranted praise. Having recently resumed work on a Web development book dealing with useful standards and helpful process, and since I've been doing C++ off and on since the mid-80s, I thought I'd start a series of blog entries that would:
  • Document some of the traps and tricks I hit to get a simple wxWidgets program into Eclipse;
  • Illustrate some early, very simple refactoring of the simple program to get a bit more sanity;
  • Get Subversion and Eclipse playing well together;
  • Explain why I think parts of the Agile method are simultaneously nothing new and the best new idea to hit development in a very long time;
  • Start using an automated-testing tool to build confidence during debugging and refactoring; and
  • Use a code-documentation tool in the spirit of JavaDoc to produce nice technical/API docs.
At the end of the series, you'll have a pretty good idea of how I feel most projects (regardless of underlying technology and specific tools) "should" be done.
You'll have seen a very simple walk-through of the process, demonstrated using Linux, Eclipse, C++ and wxWidgets, but actually quite broadly applicable well beyond those bounds.

Please send comments, reactions, job offers, etc., to my email. Death threats, religious pamphlets, and other ignorance can, as always, go to /dev/null. Thanks!

Tuesday, May 06, 2008

Oh. Mah. Gawwwwwwwwd.

You no longer need to reboot a running Linux system to apply security patches to it.

Check it out.

Excuse me whilst I pick my jaw up from the sub-sub-sub-basement floor. If this checks out in the field, on multiple distros, then a lot of sysadmins are going to be able to get a lot more sleep at night. And the comment by one David Pottage:
This should be good for distro kernels.

Just think if you can prepare a special kernel module that will apply security patches to a running kernel, then so can your favorite distro. In future when there is a security update, instead of downloading a ~20Mb kernel package from security.debian.org or the like, and then waiting until a suitable time to install it and reboot the system, you can download a small package containing patching modules for the standard kernels from that distro, and install it immediately...
The mind reels a bit. Security patches are the most gotta-do-it-right-NOW things that come down the pipe for any system. Open-source systems that are widely audited, like Linux, tend to get patches a lot quicker than Windows (which had attacks in the wild with no fixes available for some 271 days in 2007), or even Mac OS X. Any closed system that depends on a single organization to secure it will always have a slower reaction time than an open system with enough (mutually independent, distributed) resources to throw at it. As Eric S. Raymond wrote in The Cathedral and the Bazaar, "given enough eyeballs, all bugs are shallow". As long as there is some meritocratic control over the "official" patch-submission process - and there is - it's now easier than ever to keep critical systems up and secure, in ways and at speeds that simply can't be matched in the Microsoft world. Remember, even if your uptime is 99.999% (the so-called "five nines gold standard"), you're still down five minutes and fifteen seconds every year. Murphy's Law says at least five minutes of that time will be when it was really, truly important that the system not be down.

You can't repeal Murphy's Law, but I think it does give us a big step towards an insurance policy.
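
For anyone who wants to check the five-nines figure quoted above, the arithmetic is short. Here is a minimal sketch in Python; the function name downtime_per_year is my own invention for illustration, and I'm assuming an average year of 365.25 days.

```python
# Assumption: an "average" year of 365.25 days, leap days included.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_per_year(uptime_percent):
    """Return (whole minutes, leftover seconds) of downtime per year."""
    down_seconds = SECONDS_PER_YEAR * (100.0 - uptime_percent) / 100.0
    minutes, seconds = divmod(down_seconds, 60)
    return int(minutes), seconds


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        m, s = downtime_per_year(pct)
        print(f"{pct}% uptime -> {m} min {s:.0f} s down per year")
```

Each extra nine cuts the allowance by a factor of ten, which is why a reboot for a kernel patch can eat a whole year's budget in one go.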

Saturday, May 03, 2008

Rant - can people at least engage brains before asking stupid questions? (Or: Paging Andy Rooney)


I read several different development blogs and message boards, such as those associated with phpclasses.org, codeproject.com, IBM DeveloperWorks, and so on. They're usually pretty useful both for learning new techniques and keeping an eye on what other people are doing. Several of the boards, particularly on CodeProject, have been getting hot and bothered lately about the declining quality of questions being asked on the public lists; one large category of these is "affectionately" known as "homework questions". These usually aren't for actual classwork. A more typical scenario seems to go like this: Sanjay is brought into an outsourced project because his agency assures the client that he's a hotshot - fully qualified in J2EE, PHP, XML, RPC and LSMFT (obviously a key qualification, but since the client HR person is neither technical nor over 40, it just appears to be "tech jargon.")

The client manager thinks, "I'll have to let a couple of my other guys go to stay in budget, but if this guy can save our bacon, it's worth it." Sanjay starts one bright Monday morning at 9.00, gets the usual here's-where-things-are spiel, sits around (billing time) waiting for his computer to be set up and connected to the network (these things almost never happen before the warm body shows up), and by 1 PM is browsing the codebase for the project. By 1.30, he's on the Web, posting questions on sites that make it absolutely, crystal clear that he wouldn't know his ear from a hole in the ground if you gave him a flashlight, a map and six hours' head start.

The most extreme example of this I've personally witnessed was when I was part of a team consulting to a major Southeast Asian telecoms firm, working on integrating the homegrown billing system that one division used into the (telecom-)industry-standard one used by most of the rest of the company. This benefited greatly from knowledge of Java, of both Linux and Windows, and especially of the commercial system which was the target for the migration. The firm which produces this system, in true Java-ecosystem fashion, offers their own training which leads to certification in various aspects of the system. The team had a good mix of knowledge and experience, with the single yellow flag being that the major new-system expert was a member of the client's staff. After being encouraged to bring our own domain expert in (apparently so the client could reassign the existing one if desired), our headquarters (in India) found the "perfect guy". He had Java and J2EE certificates. He had a BSCS from "one of the top universities" in India. He had all the certifications the vendor offered. Oops, they were all for the previous version of the system....but this guy could, on paper, walk on water as far as our project's needs were concerned. So we flew him out from Bangalore. We had, as I recall, six weeks before major schedule slips would hit the fan.

A month later, my team manager and I sat watching this guy type in sample programs from the manual, try to build them, watch them fail, erase everything and start over. The manager said to me, "well, why don't you and (the junior guy on the team) start trying to pick up the pieces? Maybe you can pull something out." Then we noticed something else that was interesting. The client manager kept saying nice things about the "expert", how diligent and hardworking he was. He started taking the expert out for lunch and so on. My manager and I were in shock: this guy had already cut into the project for thousands of dollars and had yet to produce a single project artifact. Worse, in our view, his "experience" and "qualifications" were obviously complete fabrications. Headquarters (back in India), however, wouldn't send out a replacement because they had been told that the client was happy with the guy. We started looking around for brown-projectile-proof mackintoshes, anticipating the storm sure to come. And it did, eventually - shortly after the "expert" failed to follow explicit, idiot-proof instructions on how to extend his visa so he wouldn't have to go back home. Instead of buying a round-trip bus ticket to Singapore, he bought a one-way ticket, later claiming he wasn't sure which bus he'd want to take back up. The Singapore immigration officials blocked him from entering Singapore as they had no reasonable assurance of when he'd leave. The Malaysians wouldn't let him back in because his (tourist) visa had expired that day and he'd need to enter another country before reentering Malaysia. After several frantic midnight telephone calls and (I was later told) negotiation and pleas from our firm and the client, he was unceremoniously dumped on a plane back home, and our firm was billed for a full-fare coach ticket. The client manager became upset because his buddy, the "expert" was nowhere to be found and on no notice, at that.

Why am I blathering about all this now, several years after the event? My ire was raised by one of these "homework questions" I'd mentioned earlier, this time posted on the WeberDev general PHP forum. Entitled "Which is compatible PHP or Java with SQL SERVER 2005", the writer, using the nick "kumarsudu", starts out with...
Hi All,

I would like to know which one is have compatible

1. PHP with SQL SERVER 2005
2. Java with SQL SERVER 2005

as i have to develop a project, ...
As another popular forum's members often ask, "how many WTFs are there in this message?" The mind reels. Asking totally nonsensical, absolutely open-ended questions on technical sites and lists is now at a flood stage not seen since the AOL infection of the Internet back in the early 1990s. The writer, besides the sterling specificity of the question as mentioned earlier, shows little knowledge of or interest in proper use of the English language (despite ample materials and information available online as well as off).

It pains me that:
  • a person of such deliberate ignorance and lack of brilliance would not only choose to waste my and other readers' time with the request to do his work for him;
  • any contracting agency would have such miserably low non-standards that this guy would even get the time of day, let alone a job for which there is an ample supply of other (by definition more qualified) candidates (granted, the agency may have to raise its pay to a level higher than that of a fast-food burger-flipper);
  • any client company would not only pay money for such an individual, but continue to do business with the "recruiter" that brought him in;
  • there are no project-local, competent individuals whom this writer could ask for help, who would apply appropriate informational and organizational responses to the question; and finally
  • that the prevailing business contempt of software development has sunk so low that this type of thing is more unusual for the fact that anyone bothered to notice it than that it happens at all. Would you want to fly with an airline whose pilots were the cheapest available, using forged certificates and qualifications to highlight (non-existent) experience? If your answer to that question is anything less emphatic than "hell no!," please inform me of your flying plans so that I may ensure that I am neither flying nor anywhere on the ground along your route during such a flight.
The "project" mentioned by the initial writer, if performed by individuals of this apparent calibre, is highly likely to fail - wasting the client's time and money, leaving the client with a problem that still needs to be solved, and continuing to erode the opinions of the client and the client's associates of the business value of software, since, obviously, "IT projects always fail."

Saturday, December 01, 2007

Buzz About the Code Buzzards

Scott Hackett over at SlickEdit has a blog entry where he talks about code scavenging as "a new software development methodology." The point being, of course, that code scavenging isn't new at all; it's at least as old as the UNIVAC I. Every new, and not so new, software developer has used it at one time or another, usually for quick bootstraps to get a specific part of the software under development working in spite of limited knowledge of the language and/or domain in question by the programmer, or limited resources (time/budget). Most times, it's at least felt to be some combination of the two.

What's changed recently, and what makes five-finger coding a screaming, begging candidate for formalization, as Mr. Hackett notes, are two distinct phenomena. First, the rise of new repositories on Web sites like Koders, Krugle and Google Code Search, among others, at least holds the promise of finding "what you need right now", right now. Second, development professionals face ever-increasing resource constraints ("can you get that for us yesterday?") in a craft that is continually expanding in scope and detail.

What I found interesting, having visited sites like Krugle many times in the past, was the effect that open-source licenses like the GNU General Public License and the BSD License are having on one of the main limiting factors that Mr. Hackett identifies: trust. Many developers, for various reasons, don't trust the work of someone they don't know when their own jobs or reputations are on the line. What open source and the new search engines bring to the table is the idea that now, you don't always have to just accept a code fragment "on faith". Many of the more experienced developers have a track record of other code that they've written, often accessible via the same search engines that helped you find the one you're looking at. Open source and free licensing mean that you can look at other stuff that the guy in question has written, get a feel for how his style and competence match your own, and use that information as an additional basis for evaluating the code fragment you're scavenging. The trick, obviously, is to keep the time and effort required for all this significantly below what it would take for you to write your own implementation from a clean sheet of paper. That's why search engines and good indexing are important.

Of course, none of this will help you make the changes that always have to be made to put a piece of "foreign" code into your handiwork. That's still going to take some work; hopefully not on the order of putting a Porsche engine into your Ford pick-up. We'll have to wait for - and work for - serious improvements in the state of the software development craft to have any effect on that. Development methods, languages, and so on are still in the cathedral at Rheims. I've been waiting 30 years for the binary Renaissance to hit my professional life.... anybody need a virtual stonemason?

Tuesday, September 18, 2007

Improvement as opposed to Change


A link on the Agile CMMi blog had some very interesting things to say about a gentleman named Brian Lyons, the CEO (as well as CTO and founder) of Number Six, a Vienna, Virginia (Washington, DC area) software technology services/business consulting firm. Number Six quite obviously 'get' the concepts behind Agile CMMi, Hillel Glazer of the blog wrote. Interested, I flipped over to their site to hopefully learn a bit more, maybe drop off a CV.

Mr. Glazer wrote the blog entry in question on 25 July. The first thing you notice when viewing the Number Six site is the home-page obituary of and tribute to Brian Lyons, who died on (Monday) 3 September 2007 in a motorcycle accident. There are various links to family and tribute sites, a press release, and a request that "in lieu of flowers, the family has requested that donations be made to" a scholarship fund at the University of Maryland (excellent, particularly for those too far or too late to attend the funeral). I wish the family and company all the best in their times of grief and trouble.

The point I originally started writing this entry to make, however, was inspired by one of the bullets on Number Six's careers page, under the heading "Why Six?": "We are committed to consistent improvement, not continual drastic change."

Think about that distinction for a moment. Many of us, particularly those in the software-geek persuasion, start our careers trying to remake everything; not necessarily because it's broken, but so that we can do it (whatever it is), make it ours, introduce new techniques or technologies, and hopefully (but improbably) provide a better solution to the problem at hand than what was there before.

The forces arrayed against that impulse are formidable. They are led by those playing the role of traditionalists (and often traditionalists in actual fact): people who are by nature suspicious of any radical approach to a problem and who, in the eyes of the wet-behind-the-ears Young Turk, too often dismiss out of hand any alleged improvement or innovation without "giving it a fair chance". Occasionally, of course, improvements and innovations too beneficial to ignore do come out of this process.

As the developer matures (which may take days, decades or eternities), the Young Turk recognizes that not all of the hindrances were traditionalist per se; rather, they were operating from a different (usually business-oriented) set of priorities. Learning to understand and appreciate those priorities is one of the fundamental aspects of any successful transformation from ivory-tower Geek to journeyman software craftsman (or -woman), able to make a contribution to customers and to the craft of software development.

What becomes obvious to all thoughtful participants and stakeholders in the process of developing and maintaining any software system is that a natural tension exists between three apparently conflicting ideas:
  • Don't change what works well enough when there are more pressing needs to attend to;
  • New capabilities, chosen and implemented properly, can have strongly beneficial effects (efficiency, productivity, wider customer scope, 'better' according to various aspects of quality);
  • Any change to an existing system involves direct and indirect costs which must be carefully evaluated and compared against the expected improvements.
Balancing and managing the conflict between those concepts throughout the lifetime of a piece of software is an art that has yet to be mastered with a level of reliability and cost widely acceptable to business customers. Several philosophies and techniques (often marketed as technologies) have been developed as (attempted) solutions to that problem. Three that I have found extremely useful, in complementary ways, over the last few decades are agile development, refactoring and the CMMI. (Those already familiar with the concepts may skip down to the paragraph which begins "As argued by practitioners such as Hillel...".)

The CMMI (previously more widely and less precisely known as the CMM, for Capability Maturity Model) has been around for quite some time as a formal means of evaluating the maturity and coherence of an organization's development efforts (among other activities). It requires its users (organizations) to acquire and document ever-increasing detail of precisely how they go about the process of development, and to be able to prove (in an internal or external audit) that those processes are being followed and any discrepancies noted and accounted for. As a standalone policy framework for software development, this suffers on two grounds:
  • the "what" and "why" of development are completely ignored. CMMI fundamentally doesn't care if you're building a pile of junk that has serious practical problems, as long as you do it using the process you've defined for yourself.
  • The CMMI, along with US DOD Standard 2167A, has been blamed for the logging of more trees (to make paper) than virtually any other industry business practice. This is largely because both are seen (in CMMI's case, somewhat unfairly) as pushing "hidden mandates" for the rigid, never-far-from-obsolete "waterfall" model of development.
Agile development, on the other hand, is a survival response both to the (largely management-driven) mantra that "Real Artists Ship" and to the seemingly interminable periods between "milestones" (particularly in the waterfall model) where nothing of intrinsic value is visible to an outside observer (e.g., management or customers). The main criticism from its opponents is that it produces too little documentation (the product is the documentation, to a large degree).

An aspect of, or adjunct to, agile development is refactoring, pioneered and popularized in Martin Fowler's book Refactoring. Refactoring is all about how you can make (sometimes radical) changes to the internals of a software system, provided you keep the external (customer-facing, revenue-generating) interfaces constant. Wrapping one's mind around this as a formalized process, as opposed to the "don't-break-what-works" mentality common to experienced developers (and managers), is a threshold experience in a software craftsman's progress.
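
To make "change the internals, keep the interface" concrete, here is a toy sketch in Python. Everything in it (the invoice functions, the bulk-discount rule, the sample data) is made up for illustration and comes from no real project; prices are whole cents so the arithmetic stays exact.

```python
def invoice_total_before(items):
    """Original version: one loop with the discount rule buried inside it."""
    total = 0
    for _name, cents, qty in items:
        line = cents * qty
        if qty >= 10:
            line = line - line * 5 // 100  # magic numbers: 5% off for 10 or more
        total = total + line
    return total


# Refactored version: the magic numbers get names and the rule gets its own
# function, but callers still pass the same items and get the same total back.
BULK_THRESHOLD = 10
BULK_DISCOUNT_PERCENT = 5


def _line_total(cents, qty):
    """Price of one invoice line in cents, with any bulk discount applied."""
    gross = cents * qty
    discount = gross * BULK_DISCOUNT_PERCENT // 100 if qty >= BULK_THRESHOLD else 0
    return gross - discount


def invoice_total_after(items):
    """Same signature and same results as invoice_total_before."""
    return sum(_line_total(cents, qty) for _name, cents, qty in items)


if __name__ == "__main__":
    sample = [("widget", 250, 4), ("gadget", 1000, 12)]
    assert invoice_total_before(sample) == invoice_total_after(sample)
    print(invoice_total_after(sample), "cents")
```

The assertion at the bottom is the important part: it is what lets you rip up the inside with some confidence that the outside hasn't moved.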

As argued by practitioners such as Hillel, the combination of the process-centric CMMI and the solution-centric, visible-progress Agile set of processes (including, particularly, refactoring) gives the "best of both worlds" for development - a focus on artifacts and products created through a survivable process, matched with a demonstrated and documented knowledge of exactly what is meant by "that process" under current circumstances. This also implies demonstrating understanding of the specifics of "current circumstances" that drive the decision-making and artifact-development processes - which brings us back full circle to understanding why Agile is a good fit to begin with.

Unlike the Rational Unified Process (RUP), which grew out of a waterfall-driven management-oriented mentality (and the opportunity to sell software tools and consulting services to any shop using the heavily-marketed process), Agile and CMMI development can be successfully started after passing around a few books among the development and management staff. Tools and training, chosen thoughtfully, are extremely helpful, and the appraisal process within CMMI (formally SCAMPI, the well-known "Level 1" to "Level 5" assessment of process maturity) does eventually require an outside assessor, or registrar, to formally certify the organization. But to take a typical small development group from the customary initial chaos to a finely-tuned, market-leading, customer-satisfying machine, it has been my experience and observation that it is far easier (and more cost-effective) to implement CMMI in an Agile fashion than to go down the (vendor-specific) RUP route. Imposing either on a large organization would require an unlikely combination of managerial brilliance, sadism and masochism; revolutions boiling up from below and winning eventual management sanction have proven much more likely to be successful in that type of environment.

So what's a Young Turk (or a more seasoned craftsman) to make of all this blather? The simple point: Consistent improvement is practical, and without life- or career-threatening implications, achievable on a repeatable basis. Continual drastic change, in contrast and almost by definition, will lead the development group to many sleepless nights trying to get that last show-stopper bug fixed, management to wonder why those bozos in development can't be trusted to ship anything on time, and customers to wonder what they're really getting for their money beyond the hype and bling of the marketing materials. Which team would you rather be on?

Saturday, September 15, 2007

Crap Is Not a Professional Goal


I recently, very briefly, worked with a Web startup based in Beijing (the "Startup"). The CEO of this company, an apparently very intelligent, focussed individual with great talent motivating sales and marketing people, takes as his guiding principle a quote from Guy Kawasaki, "Don't worry, be crappy".

The problem with that approach for the Startup, as I see it, is two-fold. To start, Kawasaki makes clear in his commentary that he's referring strictly to products that are truly innovative breakthroughs, exemplars of a whole new way of looking at some part of the world. Very few products or companies meet that standard (and even if yours does, Kawasaki declares, you should eliminate the crappiness with all possible speed). No matter how simultaneously useful and geeky the service offered by the Startup's site is, it is, at best, a novel and useful twist resting on several existing, innovative-in-their-day technologies. (A friend who I explained the company to commented that "it sounds as innovative as if Amazon.com only sold romance novels within the city limits of Boston" - hardly a breakthrough concept). Indeed, Kawasaki makes clear that he's talking about order-of-magnitude innovation; the examples he cites are the jump from daisy-wheel to laser printing and the Apple Macintosh.

The second, more insidious, problem with the approach, and the trap that many early-1990s Silicon Valley startups fell into, is that you take crappiness as a given, without even trying to deliver the one-two punch of true innovation and a sublimely well-engineered product that immediately raises the bar for would-be "me-too" copycats. (Sony, for instance, has traditionally excelled at this, as has the Apple iPod, learning from the mistakes Kawasaki cites for the earliest Macintosh.) Deliver crap, and anybody can compete with you once they understand the basics of your product. The wireless mouse is a good example of this.

If you tell yourself from the get-go that you'll be satisfied if you ship a 70%-quality product, what will happen is that, as time goes by, that magical 70% becomes 50%, then 30%, then whatever it takes to meet the date you told the investors. And if management doesn't trust engineering to give honest, realistic estimates (as is typical in software and pandemic in startups), you have a recipe for disaster: engineering takes a month to come back with an estimate that development will take 12-18 months; management hears '12' and automatically cuts that to 6 and pushes to have a "beta" out in 4. The problem is that, if you're dealing with an even marginally innovative product, things are not cut-and-dried; the engineers will have misunderstood some aspects of the situation, underestimated certain risks, and been completely blind to others. This was pithily summed up, in another field entirely, by Donald Rumsfeld:
There are known "knowns." There are things we know that we know. There are known unknowns. That is to say there are things that we now know we don't know. But there are also unknown unknowns. There are things we don't know we don't know. So when we do the best we can and we pull all this information together, and we then say well that's basically what we see as the situation, that is really only the known knowns and the known unknowns. And each year, we discover a few more of those unknown unknowns.
Companies that simultaneously forget the ramifications of this while taking too puffed-up a view of themselves are leaving themselves vulnerable to delivering nothing more useful or profitable than the old pets.com (not the current PetSmart) sock puppet - and probably nothing that memorable, either.

And that further assumes that they don't fall prey to preventable disasters like losing the only hard drive running a production server built with non-production, undocumented software. If Web presence defines a "Web 2.0" company in the eyes of its customers and investors, going dark could be very costly indeed.

Friday, August 24, 2007

Back in the Saddle, Again

....with abject apologies to Gene Autry...

I haven't posted here for a while (about three months - gak!). For the six or eight of you still hanging on, humble apologies and my deepest appreciation (sympathies?). Some have publicly wondered (offline) whether I am merely offline or have flatlined. Actually, I've been in hospital twice during that time, and my personal and professional lives have undergone more than the usual random fluctuations. Be that as it may...

As with roughly half of the Linux-aware folks out there, I've been playing with Ubuntu Linux for a while now. The job I just started - as Principal Technologist and alleged future CTO for FoneVillage.com in Beijing - is with an Ubuntu shop, so that's one motivation. I've been a Debian evangelist for a few years now - formerly a Kanotix (now Sidux) refugee, now with Ubuntu and Mepis installed and happy on laptops and a desktop (and lusting after a Mac Pro, but that's another blog entry)...

Half of me LOVES Ubuntu. Point-and-click everything; all the applications (except mainstream violent Windows games) that a user could want immediately available; name-brand Big Applications for the enterprise; more-solid-than-most-rocks Debian under the hood; regularly updated Live CDs (but get the better Live DVD) - what's not to like?

The other half of me, the guy who's been intimate with the care and feeding of Unix systems for almost 30 years, has an easy answer for that: sudo (as superuser/SystemGod, do) everything, but in particular, sudo bash (as superuser, open up a shell [terminal] and let me run arbitrary commands with no restrictions). If you read Ubuntu guides and Web pages, almost everything a user does from a command shell that affects the system is done as sudo command, while logged in as an ordinary user. A bit of poking around with Google led me to a page on About.com's Ubuntu Desktop Guide that put things into better perspective:

The first user account you created on your system during installation will, by default, have access to sudo. You can restrict and enable sudo access to users with the Users and Groups application.

My knee-jerk reaction having subsided, I'm back to generally liking what I see in Ubuntu. It's intended to achieve - and generally succeeds at - being "easy enough for anybody to use", not just "geeks", as Linux has heretofore been viewed by Windows usees. It's another answer to the classic "what's the difference between a Windows usee and a Mac user?" question: The Windows usee talks about everything he had to do to get his work done; the Mac user (or, generally, the Ubuntu user) talks about all the great work she got done.

For the technophobes out there who still want to join the modern world, definitely worth a spin.

Monday, June 04, 2007

If You Can't, It Doesn't


If you can't measure something, can you prove that it exists? Or at least that you understand it? If we're having a theological discussion, you might have one answer, but in the material world, and more specifically in technology development, the answer to both questions is "probably not". To quote what several nurses and doctors tell me was pounded into them during their training, "if you don't write it down, it didn't happen" - because there's no record that it happened the way you think you remember it. Memory's a funny thing - even about events that actually happened.

This less-than-breathtakingly original, but relevant, train of thought occurred to me after I spent an hour kicking the tires of the PHPUnit code coverage analysis features. For you PHP coders out there who aspire to professionalism, you will almost certainly find that rigorous, formal testing procedures will save you time and grief in the not-very-long run. If you've been "test infected" or you're working in an XP shop (the development method, not the massively defective software), you already know this; otherwise, you're likely to be shocked by how good your code is likely to become.

Automated unit tests (using tools such as JUnit (for Java), NUnit (for Microsoft .NET) and PHPUnit) are good for proving a) that your code works the way you expect and b) that it keeps working as you make changes to it. Automating the process (so that it runs all your tests whenever you make changes) ensures that you get continuous feedback that all the tests still work (reducing the likelihood of hidden dependencies breaking). Code coverage testing, as the phrase implies, tracks which lines of your code are exercised. Code that has been tested extensively and found to work as expected can be differentiated from untested code and "dead" code. Dead code is code that can't be executed because no logical path through the program exists that will execute that code. A minimal number of lines of code will be marked as dead code in many languages and test systems due to the textual structure of the language itself (e.g., PHPUnit marks a line of code consisting solely of the brace ending an if-block as dead code). Code identified as dead should be eliminated except for this structural detritus; why maintain code that will never be executed?
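
Here is a minimal sketch of what that looks like in practice. I'm using Python's built-in unittest module rather than PHPUnit purely to keep the example self-contained; the grade function and its tests are made up for illustration.

```python
import unittest


def grade(score):
    """Return "pass" for scores of 50 or more, otherwise "fail"."""
    if score >= 50:
        return "pass"
    else:
        return "fail"
    return "unknown"  # dead code: both branches above have already returned


class GradeTest(unittest.TestCase):
    """Unit tests that pin down the behaviour we expect from grade()."""

    def test_passing_score(self):
        self.assertEqual(grade(75), "pass")

    def test_boundary_score_passes(self):
        self.assertEqual(grade(50), "pass")

    def test_failing_score(self):
        self.assertEqual(grade(49), "fail")


if __name__ == "__main__":
    unittest.main()
```

All three tests pass, and between them they exercise every line that can run. The final return line never runs no matter what you feed the function, which is exactly the kind of thing a coverage report points at and you then delete.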

Many beginning developers (and not so beginning) tend to assume that their code works if it doesn't contain any obvious syntax errors flagged by the interpreter or compiler. These language systems are (generally) reasonably competent at interpreting what you wrote; they have significant constraints in discerning what you intended beyond what you wrote. Hence, just because the code "compiles cleanly" doesn't mean it is free from defects. That's the job of testing - starting with you, the developer, running unit tests.

If you're doing unit tests without any specialised tools ("rolling your own" tests), or your testing tool does not provide code coverage analysis, it may often be difficult to determine whether specific blocks of your code have or have not been tested successfully. Thus, the assumptions that you made while writing the code are unlikely to be challenged during testing - the same assumptions will guide an ad hoc testing regime as the original coding.

One of the benefits of test-driven development is the ability to cut out unnecessary code and other development artifacts; the development team is very confident that exactly and only what is part of the desired system is actually in that system and can expand/refactor the code to adapt to changing requirements; "see a need, fill a need". You can get a lot more done, more quickly, when you have total confidence that your code will continue to work as features are added or changed, and that anything that breaks will be immediately and obviously detected. You can't do that - really - without code coverage analysis.

For you plinkers out there who aren't doing sufficient (or sufficiently organised) testing yet - your competition is, and if you spend any time at all in this craft, you will bang up against "difficult" bugs that wouldn't have been so difficult with pervasive testing. Those of you who have been writing unit-test cases shouldn't automatically get too comfortable, however... how do you know how much of your code is being tested? If you're testing the same block of code eight different ways and other sections of your code don't get tested at all, can you ship a quality product? Coverage analysis will save you time (by not writing redundant test cases), grief (by prodding you to test areas of code you thought you tested but hadn't), and money (from #1 and #2). If you're trying to run a completely instrumented shop - where everything that can be measured for less than the cost of its failure is being measured - then I am (or should be) preaching to the choir.
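To make "how much of your code is being tested?" concrete, here is a toy line-coverage tracker. It is a sketch only, built on Python's sys.settrace hook; real coverage tools do far more (branches, reports, multiple files), and the sign function and the lines_executed helper are made up for the example.

```python
import sys


def lines_executed(func, *args):
    """Run func(*args); return the line offsets (relative to the def line) that executed."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Only follow frames that are running the function we care about.
        if frame.f_code is code:
            if event == "line":
                hit.add(frame.f_lineno - code.co_firstlineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was installed before
    return hit


def sign(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


# Two "tests" that feel thorough but never exercise the n == 0 branch.
covered = lines_executed(sign, 5) | lines_executed(sign, -1)
all_lines = {1, 2, 3, 4, 5}          # offsets of the five statements in sign()
untested = all_lines - covered       # what the existing tests never touched
```

Here untested ends up holding the offset of the "zero" return: exactly the kind of gap you would not notice by eyeballing the test list.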

To sum up:
  • If you don't write it down (in a way that "it" can be found again), it never happened. If you've never tested your code, it's broken until proven otherwise.
  • If you can't repeat the test at will, it hasn't been tested.
  • If you don't or can't know how much of your code has been tested, your users will.
We can't all be Microsoft and expect our paying customers to find our problems!

Tuesday, May 08, 2007

XHTML Is (Nearly) Useless

If you've written any Web pages in the last five years (at least), you've at some point bumped into the difference (schism?) between "original" HTML and "new, improved - now based on XML!" XHTML. If you don't write Web "content" (thanks for reading my blog, but why are you here?), or deal professionally with those who do, you may not know the difference, or care that there is a difference. There is, and people should care about it if they care about the Web.

(Briefly, for those who care but don't know; the rest of you can skip this and the next paragraph.) HTML is often known to developers as "tag soup", because very, very many sites don't follow the strict interpretation of the standard, and are "broken" in all sorts of ways. This was initially justified as working around the myriad bugs in grossly defective browsers such as Microsoft Internet Explorer. XHTML was different and better because it was HTML reformulated as XML, which could then be "validated" (checked) by any validating XML parser. HTML-as-XML also (should have) driven the development and use of all sorts of nifty techniques and tools that are only practical when assumptions can safely be made about the structure and format of the document - which would be true in XML/XHTML but not necessarily in "classic" HTML.

The problem, of course, is Microsoft's Internet Explorer browser, affectionately known to Web professionals as "Internet Exploder". Among the many "quirks" (defects) that have afflicted unknowing usees of that browser, all versions up to and including the current Version 7 fail to understand XHTML as XHTML. The "conversation" that takes place between a browser and a server when the browser requests a Web page is defined by the open standard known as HyperText Transfer Protocol, or HTTP. Part of that conversation involves the server informing the browser what type of data it will be sending. This is done using what is called a "content type" "header".

All together now? Good. When a server wants to send a browser a page of "tag soup" HTML, the correct content type is "text/html". A properly-formatted and -served XHTML page will instead use "application/xhtml+xml". This will inform the browser that, in fact, the page being transferred is a proper XHTML page (per the open standard defining it), so the browser will kick in the assumptions and processing that work for XHTML but not for "tag soup".

Of course, Internet Explorer is now the only major browser that gets this wrong (as indicated by this vintage-2005 blog entry). As far as I am aware, every other major graphical browser in the world - Firefox, Opera, Konqueror, Galeon and the rest - all support The Right Thing. Unfortunately, IE is still the 300-pound gorilla in the china shop; the majority of Windows usees still browse the Web using IE, and though the trend is improving steadily, that will likely continue to be true for the next couple of years (say, 2009-2010 barring unforeseen circumstances).
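One workaround in common use is for the server to look at the Accept header the browser sends and only serve "application/xhtml+xml" to browsers that ask for it. The sketch below shows the idea in Python; the function name is my own, and the parsing is deliberately simplified (it is not a complete implementation of HTTP content negotiation).

```python
def choose_content_type(accept_header):
    """Pick the Content-Type for an XHTML page from the browser's Accept header.

    Simplified: serves XHTML only if the browser lists it explicitly with q > 0.
    """
    xhtml = "application/xhtml+xml"
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type == xhtml and quality > 0:
            return xhtml
    # Anything else, including a bare "*/*", gets tag-soup HTML.
    return "text/html"


# Accept header of the style sent by Firefox-family browsers
standards_browser = choose_content_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
)
# A browser that only sends a wildcard never names XHTML explicitly
wildcard_browser = choose_content_type("*/*")
```

The design choice worth noting is the fallback: when in doubt, serve "text/html", because a browser that doesn't understand the XHTML type will typically offer the page as a download instead of rendering it.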

What kinds of things would a properly XHTML 1.0-compliant browser let us do with our site? One trivial example: let's say you're writing a political-commentary site that is geared towards an upcoming election, and you want to consistently name your candidate as "The Honorable Senator Francis X. Snort (email senator@senatorsnort.org)". When your guy goes down to defeat (one too many campaign-finance scandals, mayhap) you want to change the blurb to "The Honorable former Senator F. X. Snort (email snort@somefreemail.com)". Trivial to do with whatever CMS or scripting system you're using, right? But by using an XML entity, you can simply say "&snort;" in your document, and an entity declaration in your document's header will tell the parser what you really mean. Change the declaration, and every instance of that entity expands to your new meaning. People who use other XML-based markup systems, such as DocBook, have been using this technique for years. Using XML entities in pages shown in correct (non-IE) browsers will do exactly what you tell it to. In IE, or, to be fair, several text-based browsers, the entity name will be displayed exactly as it is in the document - in our case, as &snort;. This is unlikely to have the desired effects on the folks "back home" for the Senator.
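Here is the Senator Snort trick in miniature, as a sketch using the XML parser in Python's standard library rather than a browser. The document template and the render helper are my own invention; the point is only that the entity is declared once and expanded wherever "&snort;" appears.

```python
import xml.etree.ElementTree as ET

# The entity is declared once, in the document's internal DTD subset.
TEMPLATE = """<!DOCTYPE html [
  <!ENTITY snort "{blurb}">
]>
<html><body><p>Vote for &snort;!</p><p>&snort; thanks you.</p></body></html>
"""


def render(blurb):
    """Parse the page with the given entity value; return the text of each paragraph."""
    root = ET.fromstring(TEMPLATE.format(blurb=blurb))
    return [p.text for p in root.iter("p")]


before = render(
    "The Honorable Senator Francis X. Snort (email senator@senatorsnort.org)"
)
after = render(
    "The Honorable former Senator F. X. Snort (email snort@somefreemail.com)"
)
```

Only the declaration changed between the two calls; every use of the entity in the body picked up the new text without being touched.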

Web developers have, as I mentioned, several well-known workarounds for this type of thing, using their authoring tools rather than the document itself. It is, however, a reasonably easy example for people to understand. Given the increasing popularity of systems such as PHP Smarty that let you use large chunks of "raw" (X)HTML along with the scripting goodies, it would come in handy too.

So how does all of this make XHTML "nearly useless?" Because most developers developing pages for the general public (as opposed to corporate intranets), knowing that Microsoft IE doesn't support the correct content type, will either "not bother" developing "correct" XHTML or at best will serve it to all comers as "tag soup" HTML.

This also has the "benefit" of completely stifling further innovation (as far as the end user is concerned) based on XHTML. All of the comments I've made so far are only germane to the initial version of XHTML, designated 1.0. The newer versions, XHTML 1.1 and XHTML 2.0, provide new features and support new technologies that greatly expand the usefulness of the Web - or would, if Microsoft weren't, as usual, dragging the Web down for competitive lock-in purposes. By doing everything in their considerable power to ensure that IE browsers and sites aren't fully, completely interoperable with other browsers, they discourage Windows usees from using "rival" browsers to browse sites labeled "Best viewed with Microsoft Internet Explorer". There's nothing preventing Web designers from writing standards-compliant sites that also work well with IE; in a well-designed site, it's not particularly onerous to support both standards and Microsoft. If you're using Microsoft tools, of course, it will take quite a bit more work and knowledge to create valid sites. It can be done - several sites and mailing lists describe the techniques and mind-set required - but Microsoft do not go out of their way to make it easy to do so.

Of course, this also applies only to the public Internet. If you're fortunate enough to be developing "real Web apps" for your company's intranet, and your company understands the value of open standards, then you're not going to be subjugating yourself to IE and none of this really applies to you. Go enjoy all the things that new tech lets you implement that can really stomp on your non-standards-using competition!

For the rest of us, until the Web gets out of this proprietary funk it's in now, and IE either falls into a long-deserved oblivion (improving Windows security dramatically, but that's another post) or actually complies with the same standards every other serious browser in the world does, we're going to have problems. One of the more annoying and frustrating ones, as we've discussed, is that XHTML is (nearly) useless. So much for innovation.

Monday, May 07, 2007

It's the End of the Net as we know it, and we feel fine....


John C. Dvorak has an interesting post on his pcmag.com column blog, entitled "Will the Internet Collapse?" He doesn't think it will, obviously, and he's got some pretty impressive trends to back up his contention. Example: 140,000 terabytes of backbone traffic in 2002 - at a "conservative" 60% annual growth through 2007, that's roughly 245 MB for each of the six billion or so people on the planet. Most of whom (still) wouldn't know what a byte was if it bit them; they've got more pressing concerns, like safe food, clean water, housing... But I digress.

I don't think the Net per se will "collapse", either. What's going to happen -- what's already happening -- is both more subtle and more dangerous. The "Internet craze" that gave rise to Bubbles 1.0 (1990s) and 2.0 (now) and has driven the Net from a quirky research project into a cultural touchstone, has done two things that would make an every-Friday-from-4-to-10-PM crash seem benign in comparison (and "4-to-10-PM" where? On the Internet, it's always "now".)

The first problem is the Baby's Spoon in the Waterfall. There's so much information (wrapped up in even more "content", which isn't the same thing) that no person, government, entity or corporation can ever comprehend it all. People who spend large amounts of time surfing the Web and using various tools to pull information off the Net in other ways, soon exhibit a behavior akin to being "punch drunk". Late in the 12th round, The Champ has connected so many times with Joe Palooka's jaw that we in the crowd can see Joe staggering around, unsure of even from which direction the merciless pummeling is coming, let alone able to control the situation. The Champ, in this analogy, is the onslaught of data/information/"content" from the Net, primarily email and the Web; Joe is standing in for the typical, non-technical ("you mean Yahoo and the Web aren't synonyms?") user. As the user's eyes glaze over and the cognitive mind enters vapor-lock, he is essentially unable (and psychologically unwilling) to refine his usage patterns or seek out new experiences that he wouldn't find in "offline" life (what in an earlier age was called the "You Are There" effect). So, for instance, the stereotypical North American user goes back to the "safe, familiar" online equivalents of his offline television shows - the "news" sites owned by the same multinational corporations that own American media, and YouTube, which can be viewed as a worldwide online version of "America's 'Funniest' Home Videos": another vehicle for peddling the same tired corporate products in the commercials.

The other problem, of course, is that organizing all this "stuff" has become more difficult, and the rate at which it becomes more difficult is at least as rapid as the rate of growth itself. While the Net, and the Web in particular, have enabled new ways to express individual personalities (e.g., MySpace) and allowed ordinary citizens of many countries to amass much more detailed information about what their government is doing, for them or to them (e.g., YouGov and Thomas), if you don't know about YouGov or Thomas (or any similar site set up by your own country's government), then the old Bruce Springsteen song, "57 Channels and Nothing's On" seems quaint and manageable in comparison. People know there's all sorts of stuff out there - they can Google for it, "it must be real" - but, unable to come to grips with how things are organized (they aren't, on purpose) or how to use the available information to achieve a personally important goal, they fall back on the sites that organize and package and sanitize the content, accepting loss of control as the price of freedom from thinking too much. (E.g., AOL - a subsidiary of Time-Warner, and Fox "News".com, a wholly-owned subsidiary of AIPAC.) As people sink safely back into their easy chairs, content to absorb the anti-intellectual pablum that bombards them, they lose touch with the idea, let alone the possible reality, of an energized populace using the new, revolutionary technology at its disposal to improve their own lot in life and that of the world at large. Instead of a medium which challenges the status quo, the Net has devolved into a tool which reinforces it.

A collapse of the Internet? You're right, John; it will never happen. But a collapse of the promise and meaning of the Internet? It's already here, folks; we're just standing around watching streaming video of the rubble bouncing.

Monday, August 21, 2006

Projects and Data Formats, or Scratching a Standard Itch



Fair warning: This post was written in bits and pieces over a week that I spent mostly on my back in bed; it hits two or three hot-button issues that I've been running up against. In the fullness of time, I may come back and break it up, or write follow-on entries pontificating on one point or another, but for the nonce, your patience - and comments! - are appreciated.

Real standards happen in one of two ways. One way involves an organisation like the World Wide Web Consortium (or W3C as it is commonly known) putting together different committees and working groups, and over the course of various meetings, seminars, forums, and other corporate expense-account sinkholes, massive sets of documents are ratified; if we're lucky, somewhere within that will be nuggets of information and wisdom around which useful things can be accomplished. Successful examples of this include standards such as HTML, XHTML and CSS 2. Less successful examples include efforts such as WCAG 2. While it may safely be assumed that nothing in the new standard will disrupt the existing order of the Internet, the flip side of this is that there may be no actual working implementation of the new standard (to prove that such is practical), and it may well be that the new standard is not the most efficient or elegant solution to the problem. This may be described as the "top-down" approach.

The other way that standards happen in the real world is for a developer, or typically a small group of developers, to come up with something that works for them, open it up to community/public comment and collaboration, and eventually submit the standard definition (which by then has several working implementations) to standards bodies like the W3C or the Internet Engineering Task Force. This may be seen as the "bottom-up" approach. Its success is largely tied to how effectively it solves what it sets out to, and equally critically, whether it does so in a manner that doesn't convey inherent advantage to a subset of its audience (such as the company employing the creators of the standard). Successful examples of this include vCard and its successor hCard.

I stumbled across the description of hCard (from An Angry Fix by Jeffrey Zeldman, a well-known figure in the online Web-design industry and community) after I had been giving some thought to a problem I had been having with contact information in various formats. Namely, that the information was in various formats, for my (ancient) PalmPilot, each of two different Nokia phones, my email package (Mozilla Thunderbird), and so on, and so on.... Keeping everything synchronised - the mundane necessity of ensuring that any given contact was in each of the needed places with the most recently updated information - is a burden sufficient to preclude any further effort, such as actually communicating anything useful or interesting to those contacts. (Maybe they read this blog...)

What's needed is a free, open source bit of software to take these various directories in varyingly historical formats, apply updates and changes to a single, current-technology directory around something like vCard (or, better, hCard), and then to spit out various dumps of this data to suit the different devices and their differing format requirements. If you think about this for a while, you can think of all sorts of ways that synchronisation could be a real pain...which update gets applied if you enter the same information two different ways on two devices? Suppose that I get energetic and add data to the "Custom Fields" in my Palm to represent data that has specific fields for the phone or Thunderbird - but since I add different data at different times, it's not always consistent? And on and on...
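To show how quickly the policy questions bite, here is a sketch of the simplest possible merge rule: for each field of each contact, the most recently updated value wins. Everything here (the FieldValue record, the per-device dictionaries, the sample contact) is a hypothetical data model of my own, not any device's real export format, and "newest wins" is only one of several defensible answers to the conflicts described above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FieldValue:
    """One field of one contact, stamped with when it was last edited."""
    value: str
    updated: datetime


def merge_contacts(*sources):
    """Merge per-device directories of the form {name: {field: FieldValue}}.

    For every field, the candidate with the latest 'updated' stamp wins.
    """
    merged = {}
    for source in sources:
        for name, fields in source.items():
            target = merged.setdefault(name, {})
            for key, candidate in fields.items():
                current = target.get(key)
                if current is None or candidate.updated > current.updated:
                    target[key] = candidate
    return merged


# Hypothetical dumps from two devices holding the same contact
palm = {"Pat Example": {
    "tel": FieldValue("555-0100", datetime(2005, 3, 1)),
    "email": FieldValue("pat@example.org", datetime(2006, 8, 1)),
}}
phone = {"Pat Example": {
    "tel": FieldValue("555-0199", datetime(2006, 7, 15)),
}}

directory = merge_contacts(palm, phone)
```

Note what this does not solve: two devices with wrong clocks, or the same fact entered in two different fields, will still produce a confidently wrong master directory.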

I'm going to keep one eye open over the next few weeks or months for something that does this relatively painlessly (and, of course, if anybody knows of any, please let me know). Otherwise, it's likely to become Item 374 in my medium-priority queue for Tools I Intend To Write (Someday).

Implicit in the first paragraph, and alluded to more directly in the third (see Zeldman et al) is the fact that the W3C has spent the last 2-3 years making abundantly clear who its customers and stakeholders are, and telling those of us who are professionally tied to standard technologies but who are not ourselves multinational corporations flush with cash for endless junkets (and patent payoffs), to take a long walk off the shortest pier available. While this may be seen by some as an efficient use of resources, addressing the corporate sponsors who are the titans of the marketplace anyway, somebody made a good point along the way: the Microsofts and H-Ps and IBMs and so on of the world started out as small shops that nobody had ever heard of. Had the standards of the day been defined less for what made sense from an engineering perspective than a lock-out-the-small-guys marketing directive, the world would be a very different - and likely less advanced - place today. What goes around, comes around - and the W3C in particular is building up a lot of bad blood with the vitally "interested parties" who don't happen to (presently) be among the 200 or so largest corporations on the planet.

What will happen? On the one hand, we'll likely wind up with lots of easily available but proprietary "standards" like Adobe PDF; the word processors I've used for the last four years have supported publishing in PDF without Adobe asking for a dime. On the other hand, we'll have highly marketed, widely Diggable, proprietary-means-you-only-get-it-from-us packages. These may have lively add-on Astroturfed communities, but they won't deliver the business benefits of truly open software; you can't fork the product, you can't completely support yourself, and every use you make of the product or technology, in perpetuity, will be subject to the dictates and whims of the company that owns the product. Well and good, you say; they do, in fact, own their product, and have a right to do whatever they like with it. True, but where does that leave customers who incorporate that product into critical business processes? A year or so ago, an American friend of mine told me of one of his clients, who had a hard drive in their accounting server fail. They swapped out the drive, restored from backups, and found that reinstalling the order-management package they used - to generate and track every single order from the day the company was founded right up to the guy who just got off the phone - needed a license key. Fine; they call the vendor's toll-free phone number, expecting to be back in business (literally) in a few minutes. Oops. The vendor was bought out by a much larger firm; their version is now three versions out of date and the (new) vendor requires them to buy an upgrade - at retail - to access the data they've just restored from tape.

When people ask me what the business benefits of open systems are, they don't want to hear a Stallmanesque sermon on the virtues of individual liberties, real though they may be, or the geek chic of cool code, or the cheapskate appeal to "it doesn't cost a thing". It does - in time and effort to convert and adopt within the enterprise. But what you get from it at the end of the day is control over your own business processes; you can keep running a ten-year-old word processor if you choose to, or have your accounting package customised just so, or whatever you can create a business justification for - and it's going to be much easier to cost-justify relatively audacious projects because there are no hidden surprises. Transparency, auditability, control, economy -- those may not be terribly high on the Digg word list; they may not have dozens of ...For Dummies-style books in your local chain bookshop, but people who make their living, and their employees' living, by making the numbers come out right every quarter should understand what I'm talking about. It's about time.

A side note: Anyone who is considering setting up business in Malaysia rather than other nearby countries (Thailand, Vietnam) may well want to consider the level of technical efficiency, customer support, and attitude towards service of the local telecom quasi-monopoly. For most people and businesses, Telekom Malaysia is the only game in town. As one of the subscribers/victims of their Streamyx ADSL "service" for the last three years, I have watched connection speed and reliability plummet as thousands of new subscribers are pushed onto steadily lower and lower tiers of service. I believe, for example, that they now offer a "broadband" connection at 128 Kbps; twice as fast as a standard dialup modem. I am paying for a 2 Mbps - 2,000 Kbps - connection; in the last two months, I have never witnessed transfer rates higher than 400 Kbps, and for the last week never higher than 80 Kbps. If I were living in a capitalist system with competitive markets, I would have choices. In a functionally Stalinist economy where competition against government-linked companies is tightly controlled, I have no usable choices. It has taken me well over two weeks of trying to post this blog entry. Selamat datang ke Malaysia! (Welcome to Malaysia!)

Wednesday, June 28, 2006

Promises Kept, Credibility Gaps, and Microsoft: Are we Customers or Consumers?


As reported on Slashdot, quoting Quentin Clark's WinFS team blog (which spun the item mercilessly), and commented on widely, particularly by rjdohnert and Kamal:

WinFS is dead. What has been understood for a decade or so to refer to a "Windows File System" was recently rechristened in Microsoftspeak as "Windows Future Storage" (to imply a lack of commitment to a product or in fact anything specific at all). In any form recognisable as the product/technology that has been hyped unrelentingly by Microsoft when they needed something to keep users (and developers) committed to the Next Windows Version, the plug has been pulled for what promises to be the very last time. This could be viewed in a number of ways; the least uncharitable explanation that conceivably touches upon our shared reality is the subject of the remainder of this item.

Yet another case of Microsoft overpromising and underdelivering? Since they really don't care about providing great software to consumers - either end users or developers, there is no real penalty for failing to keep promises (though they do, in true Rove/O'Reilly fashion, try to spin the sucker positive as hard as they can, just to keep the yokels giving the slack-jawed "wow....they say it's cool" and, as Michalski originally wrote, crapping cash).

There is absolutely no reason to keep waiting for a relational file store in Windows or any product except SQL Server (and possibly some future version of Office that requires SQL Server). There is no reason whatever to believe Microsoft will keep ANY promise made to developers or end users, now or in future. There is absolutely no reason to believe that any gee-whiz "technology preview" given by Microsoft will ever turn into a real, stable, usable product unless that product is announced (with a ship date) at the show or conference where the demo is made. Stability and usability of said product will, as with all previous Microsoft releases, have to wait for the second service pack.

What this boils down to, in other words, is a matter of trust, and commitment, and honesty, and all the values that a company which values its customers (and workers) is expected to incorporate into its ethos. That Microsoft deliberately chooses not to do this, as it has proven on numerous occasions, shows its complete and consistent contempt for those poor schmucks it sees as consumers, not customers.

We, as developers and users, have two choices. We can either continue to prove Microsoft right, gulping whatever product they deign to deliver, crapping out whatever cash they choose to take, abjectly powerless to exert any change over their behaviour. Or, we can refuse to play their game any more. There are other tools to develop products for Windows. Most of these have the additional benefit of being cross-platform.

"Cross-platform". There's a quaintly radical word in these times. The idea that people could use a variety of systems, tools, applications, to get their work done. Companies don't have to pay US$600 to buy an office "suite" with a heavy-duty word processor, spreadsheet, and yadda yadda for a manager whose work is primarily limited to short memos? Revolutionary. Selecting tools based on the needs of the user rather than the "default" "choice" for the entire organisation? If one choice of office layout doesn't fit everybody from the managing director to the secretarial pool, then by what logic should they use the same software tools to do their work? How many users of, say, Microsoft Word use more than a tiny percentage (say, 5%) of the "features" in the product? (According to surveys dating back to 2000, roughly 5%.) Why not look at the situation as a need to give each user tools appropriate for the task at hand, rather than imposing a uniform "solution" and adapting the task to the "solution"?

This whole WinFS affair is yet another bit of weight pushing the Good Ship Microsoft towards (or past, in some opinions) the tipping point. Those already on board might do well to examine their options; those considering extending their 'booking' may wish to reconsider. The main forces arguing that no 'realistic' options exist have been marketing-driven, rather than technically- or business-driven. Consumers blindly take whatever they're given; customers demand products that meet their needs. It is high time that those who purchase and use business computer software systems, and the tools to work with them, avail themselves of their options.

Monday, May 22, 2006

Who Needs Privacy without Liberty?


I was originally going to call this post "Pretty Good Astroturf - What Happened to PGP at the Grass Roots?"

The Register has a good piece on Whatever Happened to PGP? As a PGP (now GnuPG) user for at least ten years, I was immediately interested.

PGP, for those of you who might not remember, stands for "Pretty Good Privacy". It was arguably the first widely-deployed, open, cross-platform public key cryptography (encryption and electronic signatures) software system. At one time, the growth in usage looked like China's economic output - respectable transitioning to breathtaking, with people confidently forecasting 'incredible' within the near future. Then a funny thing happened.

People - ordinary individuals, what politicos call the "grass roots" - stopped being so interested in PGP, and PKI in general. It turned out that people were willing to be sold on the idea that the only thing they needed encryption for was to work with a "secure Web page" in their browser, so they could order stuff using a credit card. The idea that people might want to keep their personal communication private, or be able to make messages and files that they create tamper-proof, just went completely below the radar. This "just happened" to "coincide" with the increasingly shrill jingoistic/"security" propaganda being drummed into the skulls of ordinary Americans; security and identity management were no longer something that many ordinary people could use and control without feeling it all either a bit ridiculous or seditious, depending on one's politics. Still, public discussion and enthusiasm - at least among "mainstream" Americans - seemed to diminish from about 2001 onwards. The travails that PGP went through didn't help grassroots individual use - first with the US government trying to crush Phil Zimmermann, the original developer, and then the soap-operatic sagas by which Network Associates, Inc. acquired and then almost literally threw away the original PGP code base.

But, as the Register article points out, there was one very significant group of users who jumped on PGP. Since PGP depends on a "web of trust" - A trusts C because A knows and trusts B and B asserts his trust for C - the use of PGP within widespread organisations, where some central IT or other department can certify (and possibly issue) PGP keys, is seen as a natural solution to business problems of identity management. Where in the early days, a PGP user might send an encrypted message from his office email account, comfortable in the belief that his corporate masters would be none the wiser, now the corporation is including PGP in its infrastructure.
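The "A trusts C because B vouches for C" idea can be sketched as a search through a graph of who has vouched for whom. This is a deliberately stripped-down illustration: the trust_path function, the single-letter names and the hop limit are my own, and real PGP implementations weigh things this ignores (degrees of trust in each introducer, how many signatures a key carries, and so on).

```python
from collections import deque


def trust_path(vouches, truster, target, max_hops=2):
    """Find a chain of introducers from truster to target, or return None.

    vouches maps each person to the set of people whose keys they have certified.
    max_hops limits how long a chain of introductions we are willing to accept.
    """
    queue = deque([[truster]])
    seen = {truster}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # chain already as long as we allow; don't extend it
        for introduced in sorted(vouches.get(path[-1], ())):
            if introduced not in seen:
                seen.add(introduced)
                queue.append(path + [introduced])
    return None


# A has certified B's key, B has certified C's, C has certified D's.
vouches = {"A": {"B"}, "B": {"C"}, "C": {"D"}}

a_to_c = trust_path(vouches, "A", "C")   # one introducer (B) in between
a_to_d = trust_path(vouches, "A", "D")   # needs three hops; default limit is two
```

In the corporate setting described above, the graph collapses to a star: the central IT department certifies everyone, so every path is exactly two hops long and runs through the company.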

Grass roots, meet AstroTurf.

Some might see the tone of the Register article as "how can we solve this problem?" But which problem?

Popular use of PGP, or other public-key crypto, would be desirable in a libertarian culture where people valued and guarded their privacy and identity, particularly against encroachment and/or usurpation by a less-than-trusted corporation or the overweening State. While the justification for this exists in the current American social and political system, more than ever before in living memory.... the social impetus doesn't really exist anymore. An educated, informed, watchful and skeptical American population has largely forgotten how to think for itself, delegating that once-vibrant activity to the likes of Faux "News" and the Lobby.

Corporate use, on the other hand, is proceeding apace; and those users would argue that there is no real problem: a business need has been identified, a tool selected that addresses the problem, yielding a solved problem. What's not to like? Errr....yes, well, it does depend on your viewpoint. Was that the original intention that Zimmermann had in writing PGP? Almost certainly not. Does that make the use of PGP in a business environment any less "right" or "proper"? Not if it is to remain "free" as in speech; anybody can use PGP, as any free software, for any purpose permitted by the license.

What's "wrong" isn't the way that the use of PGP is growing, even though that isn't in a way that necessarily enhances human freedom or liberty, or enhances the security and privacy of individual citizens, as originally intended. Rather, it is that the political and social culture has changed, to where the values of freedom and liberty are no longer widely seen as individually attainable or discernible; rather, people believe themselves to be as free as they are told that they are - and see no need for independent evaluation or confirmation. Technology can be used to aid the solution of social and political problems; it cannot, however, be a "solution" in itself. Just as the old saying goes, "you can lead a horse to water, but you can't make him drink", you can provide the people of the world, whatever their present situation, with tools to enhance that freedom and liberty - but people will only use the tool if they care about such things. If Huxley's observation is accurate, that "The victim of mind-manipulation does not know that he is a victim. To him the walls of his prison are invisible, and he believes himself to be free. That he is not free is apparent only to other people. His servitude is strictly objective" -- then the tools available don't matter. A key is useless to one who does not see the shackles on his own wrists. That, I fear, is the level that far too many Americans - and others - have fallen to.

What happened to PGP? It got better, and became as obsolete as freedom.

Sunday, March 26, 2006

On the importance of keeping current

Now that PHP 6 is in the works, there is even less excuse than existed previously for Web sites (hosting providers in particular) not migrating to PHP 5 from PHP 4. We are faced with the unpleasant possibility for tool and library developers of having to support three major, necessarily incompatible, versions of PHP.

I am not yet up to speed on what PHP 6 is going to bring to the table, but PHP 5 (which will be two years old on 13 July 2006) makes PHP a much more pleasant, usable language for projects large and small. With a true object model, access control, exception handling, improved database support, improved XML support, proper security design concepts, and so on, it's a far cry from the revised-nearly-to-the-point-of-absurdity PHP 4.

Another great thing about PHP 5, if not strictly part of it, is the PHPUnit unit testing framework (see also the distribution blog). This is a wonderful tool for unit testing, refactoring, and continuous automated verification of your codebase. It will strongly encourage you to make your development process more agile, using a test first/test everything/test always mindset that, once you have crossed the chasm, will benefit a small one- or two-man shop at least as much as the large, battalion-strength corporate development teams that have to date been its most enthusiastic audience.
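For readers who have not seen the test-first style, here is a minimal sketch. It uses Python's standard unittest module rather than PHPUnit (both belong to the family of frameworks modelled on JUnit), and the slugify function and the behaviour expected of it are purely hypothetical, invented for illustration. The idea is that the test class is written first and the function is then filled in until every test passes.

```python
import re
import unittest


def slugify(title):
    """Turn a post title into a URL-friendly slug (hypothetical example)."""
    # Lower-case the title, then collapse every run of characters that are
    # not a-z or 0-9 into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Trim any hyphen left over at either end.
    return slug.strip("-")


class SlugifyTest(unittest.TestCase):
    """Written before slugify() existed; each test pins down one behaviour."""

    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("On the Importance of Keeping Current"),
                         "on-the-importance-of-keeping-current")

    def test_strips_punctuation(self):
        self.assertEqual(slugify("Craft, culture & communication!"),
                         "craft-culture-communication")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved in a file such as test_slugify.py (the file name is an assumption), the tests can be run with "python -m unittest". Rerunning them after every change is what makes refactoring safe.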

I have so far used this tool and technique for three customer projects: the first was delivered (admittedly barely) on time, the second was actually deliverable less than 2/3 of the scheduled calendar time into the project (allowing for further refactoring to improve performance) and delivered on time, and the third was delivered 10% ahead of time, with no heroic kill-the-last-bug all-night sessions required.

Discussing the technique with other developers regarding its use in PHP and other languages (such as Python, Ruby, C++ and of course Java; the seminal "JUnit" testing framework was written for Java), gives the impression that this experience is by no means unique or extreme (nor did I expect it to be). Given that two of my three major career interests for the last couple of decades have been rapid development of high-quality code and the advancement of practices and techniques to help our software-development craft evolve towards a true engineering discipline, this would seem a natural thing for me to get excited and evangelical about. (The third, in case you're wondering, is the pervasive use of open standards and non-proprietary technologies to help focus efforts on true innovation).

All of this may seem a truly geeky thing to rave about, and to a certain degree, I plead guilty to that. But it should also be important, or at least noteworthy, to anybody whose business or casual interests involve the use of software or software-controlled artifacts like elevators and TiVo. By understanding a little bit about how process and quality interact, clients, customers and the general-user public can help prod the industry towards continuous improvement.

Because, after all, "blue screens" don't "just happen".

Once more into the breach, dear friends; once more....

To the half-dozen or so of you reading this blog, thank you; and for those of you who wondered what's happened to me and this blog over the last several months, the answer is both "a great deal" and "not much at all".

I had been ill for a couple of months, with what the doctors insisted was just an ordinary flu, and then a cold, and then an ordinary (NOT H5N1, thank you very much) flu that has kept my close friends busy trying to spy the license tag number of the lorry that keeps running me down. I am better now, thank you.

I have also changed hosting providers for my professional Web site and email hosting; the new crew look to be a good outfit so far:

  • they understand the value of responding quickly to customer enquiries, no matter how harebrained;

  • they understand Linux and Apache, and are (or at least give a convincing appearance of being) not just a "me-too" offering;
  • their people know their way around their system (see the first comment);

  • they have sensibly large limits on disk space and bandwidth; which means that

  • they allow you to host a lot of tools and libraries and addons that you can manage yourself (think PEAR for you PHP types) without having to rely on the (necessarily limited) knowledge of a central administrator who may not be quite as up to scratch on version X of the FooBar publishing framework as you are.


In short, as I said, off to a good start. After getting some minor details worked out, and being on my feet again, the all-new seven-sigma.com Web site should be up within the next couple of days.

Sunday, February 19, 2006

If you can't extend it, is it really an eXtensible HTML?

Arrrrrrrrrrrrrggggggghhhhhhhhhhhhhhh!!!!! So much for consistency....

As anybody who has worked with me in the last 3-4 years well knows, I have been an enthusiastic advocate of DocBook as a documentation markup vocabulary for various purposes, and by extension, XML-based tools for all manner of things (Apache Ant and so on).

One feature I use regularly in DocBook XML source documents is the internal subset, which lets you define entities, and include files defining entities, that are not part of the original DTD. So, for instance, my standard software.ent file has an entry (all on one line) of
<!ENTITY mswindows '<ulink url="http://www.apple.com/switch">Microsoft Windows</ulink>'>
This way, anywhere in a DocBook file that includes that entity definition, I can type &mswindows; and, when the document is transformed (into XHTML, PDF, RTF or whatever), the desired link and text will appear in place of the entity. This is an obvious lifesaver when you want to include, for instance, links to glossary definitions for unfamiliar terms scattered through a document.
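To see the mechanism outside a DocBook toolchain, here is a small sketch in Python. It uses only the standard library's xml.etree.ElementTree, which is my choice for illustration and not part of any DocBook tooling. The sample sentence and the DOCUMENT name are made up; the entity is the mswindows entry described above, declared directly in the internal subset rather than pulled in from a separate file.

```python
import xml.etree.ElementTree as ET

# A tiny DocBook-like document. The internal subset (the part between
# "[" and "]" in the DOCTYPE) declares the entity; the body uses it by name.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE para [
  <!ENTITY mswindows '<ulink url="http://www.apple.com/switch">Microsoft Windows</ulink>'>
]>
<para>This was tested on &mswindows; only.</para>
"""

# The parser expands &mswindows; while reading, so the resulting tree
# contains a real <ulink> element where the entity reference stood.
root = ET.fromstring(DOCUMENT)
link = root.find("ulink")

print(link.get("url"))            # the link target carried by the entity
print(link.text)                  # the link text carried by the entity
print("".join(root.itertext()))   # the paragraph as a reader would see it
```

Nothing in the body of the document spells out the link; it comes entirely from the one declaration, so changing the declaration changes every place the entity is used.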

Fine. But the current state of XHTML (the XML-based successor to HTML) simply doesn't support it. It does not appear to be possible to have an XHTML document with an internal subset parsed correctly by any current major browser on Windows, Linux or the Macintosh. Various Google searches such as this one produce links to pages that say, with varying levels of emphasis and literal wording, "you can't use internal subsets for XHTML that is to be rendered by a browser". It seems that in the force-fit of XML to HTML that produced XHTML, the concept of different "streams", or purposes, for documents was introduced. XHTML which is to be rendered in a Web browser has one set of limitations (including the internal subset); whereas XHTML conformant to the same definition documents which is to be processed as "pure XML" has another.

To say that this sucks is to use that colloquialism as an extreme understatement, akin to saying that tsunamis are wet. This limitation closes off an entire range of applications that would use dynamically-generated XHTML as browser-viewable data in the same spirit as XML generally (without writing an otherwise redundant app to parse and reformat the data). The benefits — and they are significant — of an XML-based browser markup language are (in my view) seriously degraded by foolishness like this.

Of course, several of you are already thinking, I could just use XSLT instead. I had previously wondered why PHP and other Web scripting languages included support for XSLT processing. Now I know, I guess.

If anybody has any corrections or other good ideas, please let me know. You can find a dirt-simple example of what I'm trying to do here.

Friday, December 02, 2005

What You've Known Since You Were Six: Sharing Helps Everyone


By the mid-1980s, it became virtually impossible to write a technically and economically interesting, large-scale application in a market-friendly period of time by a single developer. Likewise, by the mid-1990s, it became unusual to find application-level software which did not make use of some sort of database, both as part of the application itself and (for relatively mature development shops) as an integral part of development tools (configuration management, defect tracking and so on).

I would argue that, by 2004, it became infeasible to do development work -- commercial or otherwise -- in a craftsmanlike manner without the pervasive use of collaboration software. These systems -- wikis, blogs, message boards/discussion software, and so on -- dramatically improve the capability and effectiveness of a development team. (All of these applications, incidentally, make heavy use of databases.)

Any craft, arguably especially the craft of software development, lives by the traditional medical manifesto, "if you don't write it down, it never happened". Too often, however, information is written down (captured), but there is no means for organising and retrieving that information effectively. These collaboration tools each address that need for capture, (flexible) organisation and presentation of information in subtly, but importantly, different ways.

Wikis are great for organising documents in a way that they can be shared, collaborated on, retrieved, and massively hyperlinked, with attachments, comments and so on. Blogs (also look at blogger.com) are where individuals can write about anything and everything, with links to outside pages and other resources of interest. Most blogs, including this one, support readers posting comments to blog entries, or to other comments. (So please let me know what you think!) This differs from a wiki in a manner similar to the way that newspaper columns differ from journal articles or books; mainly in the semantics and scope and style of information being presented. Also, while blogs can be edited after the fact, they rarely are, whereas a wiki with a good community around it has its content change regularly.

Finally, we come to discussion software, sometimes referred to as message boards or bulletin boards. These are the latest form of a general system as old as networked computing itself. Systems like phorum, phpBB and vBulletin are all Web-based systems which allow users to post messages in forums, which contain discussions of a particular subject. These can be searched, attachments can be made, and so on. If a group needs to have a discussion, come to a consensus, and see how it got there later, this is a better tool for that sort of thing than a wiki or a blog would be. (Documents which are created in response to whatever decision was taken can be collaborated on in a wiki; individuals can expound on related opinions or useful information in their blogs.)

Another bonus to all of this is that all of these applications involve databases, with most of the information being text or (hopefully) relatively small binary files. The means to do backups and restores, as well as formal version control, for each of these is well known. Often such facilities are supported directly by the administrative interface for the software. Thus, development organisations can record, preserve and recall vital information without having it locked up in people's heads, or (possibly worse) written down haphazardly on random bits of paper and requiring significant decoding effort by people other than the author who wish to read them.

These tools also, combined with email and instant messaging, almost completely remove the need (or even benefit) for teams to be located in the same physical location, working at the same time. Development teams now have the tools to be highly effective regardless of the physical location or time-zone differences of the team members. Indeed, this writer has participated in such distributed teams, developing both commercial and non-commercial software systems, and can enthusiastically vouch for their effectiveness.

In short, within a very short period of time, we can expect the adoption of collaborative tools by software development teams of all environments and domains to increase dramatically. Use of these tools is, or shortly will be, an effective discriminator for success, especially in explicitly competitive environments. If you are participating in a team developing software, Web sites, or similar systems and you're not using these tools, why not? Your competition probably is.

Thursday, December 01, 2005

Religious Icons and Text....Editors; Some People Get Really Attached to Their Tools

For several years now, when I'm editing text files (program source code, documentation, blog entries, whatever) under Microsoft Windows, I've used the free Crimson Editor. At Version 3.70 (which can be downloaded here) since 22 September 2004, it is a perfectly reasonable general-purpose text/source code editor. It does the basic things most people expect: syntax highlighting for different languages, macro recording and editing, support for calling external tools and integrating their output into the file being edited, and so on. But I have gradually become less than thrilled with it for a few (possibly quirky) reasons:
  • While available without charge (economically free), the source code is not available (freedom of action is restricted). Not being an open source product, it is not open to real customisation beyond simple keystroke macros and language syntax definitions.
  • Most of my Web development for the last few years has revolved around the PHP scripting language. More recently, this has been supplemented by Python. Crimson lacks any of the features that other, more specific editors like NuSphere PhpED or ActiveState Komodo offer (although, granted, at a price).
  • What I have been doing even more of lately, however, has been document creation and editing using Docbook XML, from which I can create HTML, PDF, text and RTF word-processing files from a single set of sources. Very cool stuff. Until I get into Crimson and start pounding the keyboard - in frustration; the editor recognises that it's dealing with an XML file and can colour-code tags, but there isn't a whole lot else it knows about. I eventually got tired of doing basic scut-work and memory exercises over and over again, and went looking for something else. (At least a lighter hammer to hit myself in the head with).
As always, my first two stops were SourceForge and Google. You know Google. You may not know SourceForge, but you should. If you're looking for software -- to do anything, on anything, in anything -- this is the place to look first. A clearinghouse of open source projects, as I write these words their website reports 106,861 registered projects and 1,186,755 registered users. Those range from small scratch-an-itch projects on up to very complex, very complete line-of-business systems like ERP and CRM. They currently have 2,274 projects listed in the Text Editors category. Obviously, I'm not going to look at all of these, but there are nice ways to whittle down the list.

I had defined for myself a half-dozen usage modes that I was going to use to evaluate each editor and compare it against Crimson Editor. Now, to a programmer, or a writer, editors are like the old advertising line for a brand of potato crisps, "bet you can't have just one". I now have, temporarily, mind you, eight different editors on my Windows Start menu, with another 3 or 4 whose installation procedure was so basic that it didn't even install any menu items.

These basically fell into three groups. First are the demo or 'lite' versions of commercial products, which invariably annoyed me with the artificial limitations intended to induce you to ante up for the 'real' paid-for version. These were quickly discarded. Second were the well-meaning but immature/incomplete 'freeware' packages (such as, to some degree, Crimson). These invited a certain amount of experimentation with their source code to tweak things up a bit before being abandoned. While some of these (such as Programmers Notepad) are clearly on the right track, they just didn't "feel" right -- and we are talking about the most subjective of tools. Just as a carpenter may have a favourite hammer, or an electrician a meter that works just the way he likes it, a writer has to feel "comfortable" with the editor and other tools he uses. (Also seen were several packages so lacking in completeness and/or competence that it is fervently hoped that they were pure larks; exploratory projects to teach their writers something. Humility should come along for the ride; several comments on support forums were, quite deservedly, scathing.)

Now I am in the midst of conversion from Crimson Editor to JEdit. While it has a few quirks and oddities, likely due both to the fact that I am using a beta (prerelease) version and to the peculiarities of Java on Microsoft Windows, it does most of the mundane things quite well and some greatly appreciated extra features uniquely well.
  • As a Java program, it runs not only on Windows but on Linux, the Apple Macintosh, BSD Unix, your pocket supercomputer, whatever.
  • Being an open source project, you have total control. Think some features are just bloated junk and want to get rid of them? Go right ahead. Thought of a cool new feature that would fit right in with what's already there? Fire up your inner coder and go write it. Think something's pretty cool, but think you can make it faster/better? No problem -- and when you're done, you can contribute the changes back to the original project, start distributing your modified version on your own, or just hold onto the goodies for yourself. That's the freedom (of action) that open source gives you.
  • Since it's hosted on SourceForge, you don't need to worry about the original team getting bored and walking away, leaving the code on a site that eventually goes dark. Even if JEdit does go dormant for a time (a highly unlikely scenario, apparently), SourceForge makes it easy for some fresh talent to pick things up and carry forward at a later time.
  • If you're working in Docbook XML, it has some really nice features: not only does it support autocomplete for tags, it also autocompletes entities like &eacute; for é. Better still, for any entity that you define -- any entity which would be known to a parser reading your document -- autocompletion is supported. To show a specific example: with my standard "data dictionary" entities file (what I use to reuse standard phrases and URLs in all my work), I can type &ap and the list of candidates has apache-httpd as the third item in a dropdown list. Two taps on the down-arrow key and I've saved eight keystrokes. If you're authoring in Docbook XML, you should be making pervasive use of entities to aid in consistency and reuse. This makes it lots easier.
I'll likely come back and update this blog entry as I gain more experience with JEdit, and as it matures into an "official" release of Version 4.3. If anybody with writing or coding experience really is reading this, I'd appreciate it if you'd leave comments describing your experiences. Thanks.

Thursday, October 27, 2005

Craft, culture and communication

This is a very hard post for me to write. I've been wrestling with it for the last two days, and yes, the timestamp is accurate. If I offend you with what I say here, please understand that it is not meant to be personal. Rather, it probably means you may want to pay close attention.

When I was in university, back in the Pleistocene, I had a linguistics professor who went around saying that
A language is the definition of a specific culture, at a specific place, at a specific time. Change the culture, the place or the time, and the language changes - and if the language changes, it means that something else has, too.
Why is this relevant to the craft of software development?

Last weekend, I picked up a great book, Agile Java™: Crafting Code with Test-Driven Development, at the Kinokuniya bookstore at KLCC. There are maybe half a dozen books that any serious developer recognises as landmark events in the advancement of her or his craft. This, ladies and gentlemen, is one of them. If you are at all interested in Java, in high-quality software development, or in managing a group of software developers under seemingly impossible schedules, and if you are fully literate in the English language as a means of technical communication, then bookmark this page, go grab yourself a copy, read it, come back, and reread it tomorrow. It's not perfect (I would have liked to see the author use TestNG as the test framework rather than its predecessor, JUnit), but that is more a stylistic quibble than a matter of substance; if you go through the lessons in this book, you will have some necessary tools to improve your mastery of the craft of software development, specifically using the Java language and platform.

I immediately started talking up the book to some of my projectmates at Cilix, saying "You gotta learn this".

And then I stopped and thought about it some more. And pretty much gave up the idea of evangelising the book - even though I do intend to lead the group into the use of test-driven development. It is the logical extension of the way I have been taught (by individuals and experience) to do software development for nearly three decades now. It completely blew away several of the premises I was building a couple of white papers on - and replaced them with better ones (yes, Linda, it's coming Real Soon Now). TDD may not solve all your project problems, cure world poverty or grow hair on a billiard ball, but it will significantly change the way you think about - and practise - the craft of software development.

If you understand the material, that is.

There are really only three (human) languages that matter for engineering and for software: English, Russian and (Mandarin) Chinese, pretty much in that order. Solid literacy and fluency in Business Standard English and Technical English will enable you to read, comprehend and learn from the majority of technical communication outside Eastern Europe and China (and the former Soviet-bloc engineers who don't already know English are learning it as fast as they can). China was largely self-reliant in terms of technology for some time, for ideological and economic reasons; there's an amazing (to a Westerner) amount of technical information available in Chinese - but English is gaining ground there too, if initially often imperfect in its usage.

Coming back to why my initial enthusiasm about the book has cooled, for those of you who aren't actually from my company, I work at an engineering firm in Kuala Lumpur, Malaysia called Cilix. We do a lot of (Malaysian) government contract work in various technical areas, but we are also trying to grow a commercial-software (including Web applications) development group. Until recently, I managed that group; after top management came to its senses, I am now in an internal-consulting role. As Principal Technologist, I see my charter as consulting to the various groups within the Company on (primarily) software-related and development-related technologies, techniques, tools and processes, with a view to making our small group more effective at competition with organisations hundreds of times our size.

Up to now, we've been in what a Western software veteran would recognise as "classic startup mode": minimal process, chaotic attempts at organisation, with project successes attained through the heroic efforts of specific, talented individuals. My job is, in part, to help change that: to help us work smarter, not harder. Enter test-driven development, configuration management, quality engineering, and documentation.

Documentation. Hmmm. Oops.

One senior manager in the company recently remarked that there are perhaps five or six individuals in the entire company with the technical abilities, experience and communication abilities to help pull off the type of endeavour - both in terms of the project and how we go about it. Two or at most three of those individuals, to my knowledge, are attached to the project, and one of these is less than sanguine about the currency of technical knowledge and experience being brought to bear.

Since arriving on the project, I have handed two books to specific individuals, with instructions to at least skim them heavily and be able to engage in a discussion of the concepts presented in a week to ten days' time. Despite repeated prodding, neither of those individuals appeared to make that level of effort. This is not to complain specifically about the individuals; informally asking developers within the group how many technical books they had read in the last 18 months produced an average solidly in the single digits. A similar survey taken in comparable groups at Microsoft, Borland, Siemens Rolm or Weyerhaeuser - all companies where I have worked previously - would likely average in the mid-twenties at least. So too, I suspect, would surveys at Wipro, Infosys or PricewaterhouseCoopers, some of our current and potential competitors.

While American technical people are rightly famous for living inside their own technical world and not getting out often enough, that provides only limited coverage as an excuse. In a craft whose very raison d'être is information, an oft-repeated truism (first attributed, to my knowledge, to Grace Hopper) holds that "90% of what you know will be obsolete in six months; 10% of what you know will never be obsolete. Make sure you get the full ten percent." If you don't read -- both books and online materials -- how can you, as a software (or Web) developer, have any credible hope of remaining current or even competent at your craft?

That principle extends to organisations. If the individual developers do not exert continuous efforts to maintain their skills (technical and linguistic) at a sufficiently high level, and their employer similarly chooses not to do so, how can that organisation remain competitive over the long term, when competitiveness may be directly linked to the efficiency and effectiveness with which that organisation acquires, utilises and expands upon information - predominantly in English? How can smaller organisations compete against larger ones which are more likely to have the raw manpower to scrape together a team to accomplish a difficult, leading-edge project? "Learn continuously or you're gone" was an oft-repeated mantra from business and industry participants in a recent Software Development Conference and Expo, an important industry conference. What of the individuals or organisations who choose not to do so?

Those of us involved in the craft of software and Web development have an obvious economic and professional obligation to our own careers to keep our own skills current. We also have an ethical, moral (and in some jurisdictions, fiduciary, legal) obligation to encourage our employers or other professional organisations to do so. There is no way of knowing whether, or how successfully, any given technology, language or practice will still be in use in ten years' time, or even five. How many times has the IT industry been rocked by sudden paradigm shifts -- the personal computer, the World Wide Web -- which not only created large new areas of opportunity, but severely constrained growth in previously lucrative areas? I came into this industry at a time when (seemingly) several million mainframe COBOL programmers were watching their jobs go away as business moved first to minicomputers, then to PCs. History repeated itself with the shift to graphical systems like the Apple Macintosh and Microsoft Windows, and again with the World Wide Web and the other Internet-related technologies, and yet again with the offshoring craze of the last five years. What developer, or manager, or director, has the hubris to in effect declare that it won't happen again, that there won't be a new, disruptive technology shift that obsoletes skills and capabilities?

But whatever shift there is, whatever new technology comes along that turns college dropouts into megabillionaires, that changes the professional lives of millions of craftspeople... it will almost certainly be documented in English.

Tuesday, October 18, 2005

About me and my work at Cilix

I'm working on a lot of things for my work at Cilix, an engineering firm in Kuala Lumpur, Malaysia. First off, let me be clear on one thing: this blog is not officially sanctioned in any way by Cilix; this is 'just me'.

We call ourselves "A Knowledge Company". What that means, at least in my understanding, is that we apply professional knowledge and experience, augmented heavily by technology, to solve customers' knowledge-management and IT challenges. As such, we do a lot of writing - documents, Web pages, software, ad (nearly) infinitum.

We're a small shop as these things go, and our competition comes from much larger organisations with instant multinational name-brand recognition. Like any small firm, we have to win our first projects with a given client by promising - and delivering - a better value proposition than our competition. Where we get repeat business - again, like any similar firm - is by being agile, efficient, and above all, competent to the point of being unquestionably the least risky vendor for a particular solution.

Those attributes, in turn, lead us to consider issues like process, quality, and superlative knowledge of everything we are about. These issues, and how we as an organisation work through them, were what originally attracted me to the Company when I was approached and offered a position here. These issues are also the foci of what I expect to accomplish with this blog and the related collaboration tools (such as the Wiki).



I am also trying to evangelise and lead the implementation of open documentation and data-format standards at Cilix. This involves, among other things, migrating away from proprietary, binary formats like Microsoft Office documents to open, preferably text-based formats. As it happens, many of these open, text-based formats are based on XML vocabularies, such as Docbook and SVG.

Why are text-based formats preferable? Lots of reasons:
  • They are usually much more compact (and compressible) than comparable binary formats. Converting mostly-text Microsoft Word documents to Docbook equivalents often yields size reductions of 80% or more (think how much more convenient email attachments would be);
  • They are usable with a wider variety of tools. I can throw a text file on my PalmPilot and fiddle with it far easier than a Microsoft Word document, for instance;
  • They are more amenable to most version-control systems, particularly cvs and subversion. Instead of making copies of each version of a binary file, all that is required is to take the difference between two different versions of a text file - a much easier and more reliable operation. I have seen version control systems of all flavours - SourceSafe, cvs, Atria/Rational ClearCase - irretrievably corrupt binary files when insufficient care was taken by the configuration manager in dealing with binary files;
  • They are more amenable to being stored in databases. Many databases (such as MySQL) can return result sets packaged as XML fragments; this, combined with an XSLT processor and stylesheet, opens the door to some truly compelling presentation capabilities.
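The version-control point is easy to demonstrate. The sketch below uses Python's standard difflib module to compute the difference between two revisions of a small XML file. The file name, the revision labels and the content are all invented for illustration, and cvs and subversion use their own delta machinery rather than difflib; the principle of storing only what changed is the same.

```python
import difflib


def text_delta(old_text, new_text, name):
    """Return a unified diff between two revisions of a text file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=name + " (r1)",
                                     tofile=name + " (r2)"))


# Two hypothetical revisions of the same document; only one line differs.
REVISION_1 = """<para>Cilix is an engineering firm.</para>
<para>We do a lot of writing.</para>
"""
REVISION_2 = """<para>Cilix is an engineering firm in Kuala Lumpur.</para>
<para>We do a lot of writing.</para>
"""

delta = text_delta(REVISION_1, REVISION_2, "about.xml")
print("".join(delta))
```

The printed delta names the one line that was removed and the one that replaced it, with the unchanged line shown only as context. No comparable line-by-line description exists for an opaque binary file, which is why such files tend to be stored as whole copies.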
By taking advantage of these capabilities, we should be able to create better products with more predictable (and shorter) schedules without either greatly expanding the development team or pushing the present staff to the point of burnout. There is a saying in Silicon Valley in California, only partly tongue-in-cheek:
It isn't a startup until somebody dies
Here's hoping that's one "tradition" that's not exported anywhere outside the Valley.

The Vision, Forward through the Rear-View Mirror

The vision I'm trying to promote here, which has been used successfully many times before, is that of a very flexible, highly iterative, highly automated development process, where a small team (like ours) can produce high-quality code rapidly and reliably, without burning anybody out in the process. (Think Agile, as a pervasive, commoditized process.) Having just returned (17 October) from being in hospital due to a series of small strokes, I'm rather highly motivated to do this personally. It's also the best way I can see for our small team to honour our commitments.

To do this, we need to be able to:
  • have development artifacts (code/Web pages) integrated into their own documentation, using something like Javadoc;
  • have automated build tools like Ant regularly pull all development artifacts from our software configuration management tool (which for now is subversion);
  • run an automated testing system (like TestNG) on the newly built artifacts;
  • add issue reports documenting any failed tests to our issue tracking system, which then crunches various reports and
  • automatically emails relevant reports to the various stakeholders.
The whole point of this is to have everybody be able to come into work in the morning and/or back from lunch in the afternoon and know exactly what the status of development is, and to be able to track that over time. This can dramatically reduce delays and bottlenecks from traditional flailing about in a more ad hoc development style.

Obviously, one interest is test-driven development, where, as in most so-called Extreme Programming methods, all development artifacts (such as code) are fully tested at least as often as the state of the system changes. What this means in practice is that a developer would integrate the code for testing each artifact with the artifact itself. Then, an automated test tool would run those tests and report the results to the Quality Engineering team. This would not eliminate the need for a QE team; it would make the team more effective by helping to separate the things which need further exploration from the things that are, provably, working properly.
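
To make the "tests live with the artifact, and a tool runs them and reports" idea concrete, here is a minimal sketch in plain Java. It deliberately avoids any real framework so that it runs on its own; in practice a tool like TestNG would play the runner's part. Every name in it (TddSketch, TinyTestRunner, PriceCalculator) is hypothetical and exists only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Entry point: builds the suite, runs it, and prints a one-line report.
final class TddSketch {
    // The tests that travel with the (hypothetical) PriceCalculator artifact.
    static TinyTestRunner suite() {
        return new TinyTestRunner()
            .add("total multiplies price by quantity",
                 () -> PriceCalculator.totalInCents(250, 4) == 1000)
            .add("zero quantity costs nothing",
                 () -> PriceCalculator.totalInCents(250, 0) == 0)
            .add("negative quantity is rejected", () -> {
                try {
                    PriceCalculator.totalInCents(250, -1);
                    return false; // no exception means the check failed
                } catch (IllegalArgumentException expected) {
                    return true;
                }
            });
    }

    public static void main(String[] args) {
        List<String> failures = suite().runAndCollectFailures();
        System.out.println(failures.isEmpty()
            ? "all tests passed"
            : "failed: " + failures);
    }
}

// Hypothetical artifact under test; the behaviour is illustrative only.
final class PriceCalculator {
    static long totalInCents(long unitPriceInCents, int quantity) {
        if (quantity < 0) {
            throw new IllegalArgumentException("quantity must not be negative");
        }
        return unitPriceInCents * quantity;
    }
}

// Stand-in for a real test tool: runs named checks and collects the failures.
final class TinyTestRunner {
    private final Map<String, BooleanSupplier> tests = new LinkedHashMap<>();

    TinyTestRunner add(String name, BooleanSupplier test) {
        tests.put(name, test);
        return this;
    }

    // A check fails if it returns false or throws a runtime exception.
    List<String> runAndCollectFailures() {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> entry : tests.entrySet()) {
            boolean passed;
            try {
                passed = entry.getValue().getAsBoolean();
            } catch (RuntimeException ex) {
                passed = false;
            }
            if (!passed) {
                failures.add(entry.getKey());
            }
        }
        return failures;
    }
}
```

The list of failed test names is the part a QE team would care about: an empty list covers the things that are working, and anything in it marks where further exploration is needed.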

Why does this matter? For example, there was an article on test-driven development in the September 2005 issue of IEEE Computer (reported on here) that showed one of three development groups reducing defect density by 50% after adopting TDD, and another similar group enjoying a 40% improvement.

All this becomes interesting to us at Cilix when we start looking at tools like:
  • Cobertura for evaluating test coverage (the percentage of code accessed by tests)
  • TestNG, one successor to the venerable JUnit automated Java testing framework. TestNG is an improvement for a whole variety of reasons, including being less intrusive in the code under test and having multiple ways to group tests in a way that makes it much harder for you to forget to test something;
  • Ant, the Apache-developed, Java-based build tool. Being Java-based and not interactive per se, it is easy to automate;
  • and so on, as mentioned earlier.

A Lever, or a toothpick?

Any modern development effort which is complex enough to be commercially and/or technically interesting requires active, continuous collaboration between professionals and craftsfolk of various disciplines and specialisations. For instance, most organisations developing computer software have, in addition to the designers and coders of the software itself, several other interested stakeholders: quality engineers, documentation authors and editors, sales and marketing specialists, and various flavours of managers. Each of these individuals and groups has different capabilities and roles with regard to the project being developed, and each has different perspectives and different needs - but one need that all share, knowingly or not, is the ability to communicate effectively and efficiently with each other. This involves the creation or acquisition of information, its refinement, analysis, discussion and use within the context of the project. The end goal, of course, is the completion and delivery of some sort of artifact that meets the needs of the organisation and delights that product's customers, without sending the development organisation on a death march in the process.

"But wait", you might reasonably say, "we already do this. We have meetings, minutes are taken, transcribed and emailed about, lots of other emails get sent back and forth, we have documents like functional specifications and design documents and whatnot to keep ourselves organised - what do we need all this gimcrackery for?" All of which is perfectly true, jus as you can put harness and bit on your horse, hitch up a carriage, and travel from Kuala Lumpur to Singapore - or you could catch an airline flight instead. There are countless organisations, and a depressingly high proportion of smaller ones, who continue to solve earl-21st-century problems with early-20th-century tools and practices. We know better. One of the points that many leading authorities, such as Steve Maguire in his excellent book Debugging the Development Process, make is that for each hour of meetings which a knowledge worker attends, it takes at least another hour for him or her to regain the level of productivity in work product creation which would have been effected had the meeting not taken place. So, for a typical large-corporate developer who spends an hour every day in meetings that could have their purpose accomplished through less intrusive means, the company is taking a 25% hit in productivity for that individual. Take 1/4 of the payroll of, say, Maxis, or even my own company Cilix, and that starts to add up.

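For anyone who wants to check the 25% figure, or plug in their own numbers, the arithmetic can be sketched in a few lines of Java. The eight-hour working day is my assumption here, and the class and method names are hypothetical, chosen only for this illustration.

```java
// Rough sketch of the meeting-cost arithmetic; all names are illustrative.
final class MeetingCost {
    // Fraction of the working day lost to meetings plus recovery time.
    // recoveryHoursPerMeetingHour = 1.0 models "at least another hour"
    // of recovery for each hour spent in a meeting.
    static double lostFraction(double meetingHours,
                               double recoveryHoursPerMeetingHour,
                               double workdayHours) {
        double lostHours = meetingHours * (1.0 + recoveryHoursPerMeetingHour);
        // Nobody can lose more than the whole day.
        return Math.min(1.0, lostHours / workdayHours);
    }

    public static void main(String[] args) {
        // One hour of meetings, one hour of recovery, eight-hour day (assumed).
        double fraction = lostFraction(1.0, 1.0, 8.0);
        System.out.printf("%.0f%%%n", 100 * fraction); // prints 25%
    }
}
```

Since the recovery time is "at least" an hour, the 25% is a lower bound under these assumptions; a shorter working day or a longer recovery only makes the figure worse.
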
The Lever That Rocks Our World

The ancient Greek philosopher Archimedes is quoted as having said:

Give me a place to stand and with a lever I will move the whole world.


We're not here to attempt anything on that scale, but being involved in IT often feels that way.

This blog is one of the vehicles which I intend to use to record, discuss and build upon various projects, ideas and memes which I believe important in the context of a modern software-development organisation. By the term "software-development organisation", I am also including groups which create Web applications and content, podcasts, or similar "media" artifacts which rely on some form of computer or other electronic technology for distribution.