Tuesday, October 20, 2009

Time v. Money v. Risk v. Frustration

I just spent three hours beating my head against The Joys of Wi-Fi Networking. No doubt because it saved $50 or so, the builders of the HDB public-housing flat I live in here in Singapore (such flats are "affectionately" known as 'chicken coops' for their quality and structural integrity) didn't put telephone jacks in every room. Nor did they run conduit between rooms so that suchlike could be added later. The upshot of this is that my DSL modem, in the living/dining room, is 15 meters and two concrete-slab walls away from my study/office, in a bedroom (with no phone jack and two power outlets).

Up to now, this hasn't been an insurmountable problem, because the DSL modem/802.11g router the government-linked telco sells you when you open a DSL subscription could punch a (barely) usable signal through those two concrete walls; desktop and laptop computers strewn around the study were on the LAN, with quite serviceable WPA2 encryption to keep private bits private. The router (in the living room) claimed to have a 14 Mbps connection to The Net; the aggregate total bandwidth available in the study ranged from 3 to 5 Mbps; not fantastic but usable. The world was in a survivable state of chaotic flux.

Then, yesterday morning, FedEx delivered a bright, shiny (actually used, in a Target bag rather than original box, no longer in the Polycom US catalog but still serviceable) Polycom SoundPoint 501 SIP phone, courtesy of my newest client (thanks, Nathan, Matt and Margaret - I'm not kvetching, honest!). An hour of work with jackhammer and flamethrower cleared sufficient space on the desk for the new trophy. Now to plug it in and fire it up.

Plug it in? To what? My ear?

Just to make sure I had my head straight, I looked at what it would take to run line from point P to R. Two solid concrete/rebar walls, routing around various doors/windows, across the span of the living room.... no, that wasn't going to happen, certainly not by Thursday. Put the phone in the living room? Not a chance. Wait a minute - they have these things called wireless Ethernet bridges; I should be able to get one of those, stick it in the study, connect the phone to it and it to the existing wireless LAN, and I'm good to go.

I went out to the local "IT super-mall", Funan. It's the best place in Singapore to buy electronic gear of whatever variety from reasonably reputable dealers. After visiting some 25 different shops, large and small, I was firmly reacquainted with one of the basic facts of Singapore life.

This is a firmly Stalinist economy in the areas that matter. All imports (which is to say, anything of value) are brought in through a small number of generally well-connected exclusive distributors. When one shop says "finished, already!" (the de facto motto of the city), everybody is going to say the same thing. That is, if you're fortunate enough to even run into some salesperson who actually understands what you're looking for. I could find things easier (and often cheaper!) in Vietnam.

Then I happened to pop into one hole-in-the-wall one-man shop, Crystal Systems. The "one man" said "say, I understand what you're trying to do; I did exactly that for a firm here recently; all you need are two routers." The archetypal light-bulb moment. (Remember, I'm a software craftsman, not a hardware/network engineer.) This is where the "time v. money v. ..." of the title comes in. What I should have done was to buy two identical brand-new 802.11n routers. That would have cost me about S$190 (~US$136 or so), and would have given me a perfect excuse to replace my aging 2Wire 2700HGV-2. But this has already been a bit of an expensive month - new hard drive, various software - so after some consideration I decided to try to salvage the existing router (apparent POS though it may be) and just spend the minimum necessary. So I went home with one S$69 TP-LINK TL-WR340G router.

It features the now-customary browser-based setup - just remember to use a hard-wired link rather than wireless (d'oh!), and it's almost as easy as falling off your seat. Bridging mode is pretty obvious; the TP-LINK wants to know the MAC address of the router you're connecting to, along with its encryption setup.

In bridge mode, the TL-WR340G only supports the obsolescent WEP encryption, not the current and generally superior WPA2. The 2Wire - along with every device currently connected to it, obviously - uses WPA2, and does not support a mixed WPA2/WEP environment.

Back to the store I go tomorrow; the only question now is: one new Netgear N router (which can fall back to 802.11g) or two?

Tuesday, August 18, 2009

Responding while Forbidden; Gender and Other Issues in OSS and PHP

This post is what would have been a response to a post on Elizabeth Naramore's blog, which she titled Gender in IT, OSS and PHP, and How it Affects Us *All*. Quite a good post, actually, with a long and often thoughtful (but as often thoughtless) comment thread following. I was hoping to respond to a comment on the post, but that apparently is no longer allowed, even though the "Leave a Reply" form at the bottom of the page is functional. There's also a mysterious "Login/Password" pair on the page, but no indication of which ID one uses, or how to go about getting one.

Following is the content of the reply-that-wasn't: I (perhaps unreasonably) think this has some points worth pondering. Please do read the original post first and then come back here - where the "comment" feature definitely does work.


@A Girl: Great that you're doing AP CompSci next year. As someone who's been in the Craft for 30 years, I have a great sentimental attachment to your idea that "teachers and professors have the chance to shape the mindsets of their students towards women in the industry". If we were a true profession, where essentially all practitioners have a certain common level and content of educational background combined with qualifying experience (e.g., apprenticeship, internship), I'd agree wholeheartedly.

The fact is that many if not most of the people in the industry - both the "coders in the trenches" and the ex-coders who got promoted into management ("because they were such great coders" - thereby removing two qualified people from an organization)... far too many of these people have no formal education in CS (or, often, anything else). And by the time they realize how important that might be, they're old enough that they're facing ageism in the workplace already - they're not confident enough to put the "big hole in the middle of [their] career" and "go back" to school. It doesn't help that schools the world over do such a lousy job of outreach and marketing to those potential students - they're focused on the Executive MBAs and other graduate-level returning students, who can have their pricey programs paid for by their employer. Joe or Jane Schmuck trying to keep head above water in the face of cut-throat competition from planeloads of new arrivals with mimeographed certifications, who've been taught their entire lives to never think out of the box to begin with... things start getting really tough out there. I'm not surprised that enrollment in CS programs is down. I'm amazed beyond words that it's still as high as it is; a less starry-eyed observer might expect the number of CS majors to closely track, say, majors in Phoenician economics.

A lot of the new entrants into CS and IT over the last ten years or so have degrees - they're just not in the "obvious" field. Someone, and I wish I could find the original, wrote an article in one of the industry mags (like C/C++ Users Journal, not IEEE Computer) that, to be good at software in the modern era, one needed to have exposure to "behavioral science, psychology, linguistics, human factors, sociology, philosophy, rhetoric, ethnology, ethnography, information theory, economics, organizational politics, and a dozen other things - and please, please learn to write competently in English!" - I've had that taped above my display for years now. So it's not that we're not educated; the problem - and it is a problem - is that there is no universal common body of knowledge for software "engineering" - which is one of the necessary precursors of any true profession. We're not going to have a CBOK without either a broad consensus within the industry, or imposition of a system from outside (and very narrowly-focused) forces in government or the larger economy. Given the prevailing social and political attitudes of current practitioners ("herding libertarian-poseur cats" is a phrase not infrequently heard), that would seriously disrupt the Craft and, by extension, any industry or field dependent upon software (which by now is pretty much everything).

How to solve the problem - and, in so doing, help redress the pandemic sexism, racism and ageism (in huge parts of the world, recruiting with explicit age limits is perfectly legal, and here in South Asia, you're old for coding at 28)? I've got no idea. When I first started doing this, I thought that within the next thirty years or so (from 1979), we'd be able to turn this informal craftwork, which had taken the industry away from the "educated CS types", into a real profession. Now? I'd say we're 30 to 50 years away from now - unless we have a software equivalent of the New London School explosion and a "solution" gets imposed from outside. We need to grow up, and quickly.

Monday, August 17, 2009

We Interrupt This Program...

We interrupt this tutorial to interject an observation about goals, methods and promises. Goals we have for ourselves as people and as professionals; the method we use to pursue those dreams; perhaps most importantly, the promises we make, both to ourselves and to our customers, about what we're doing and why.

I consider this after reading the Website for some consultants who've done some (relatively nice, relatively low-key) link-dropping on LinkedIn. I'm not naming them here, not because I don't want to draw attention to them (their own site is very clean and well-done), but because the point that I'm going to be making here isn't just limited to them - we, as a craft and an industry of Web development, have some Serious Problems™.

The "problem" is nicely summarized by this group's mission statement:

Our mission is to produce the perfect implementation of your dreams.

What could possibly be wrong with that?

As a goal, implied but left unspoken, absolutely nothing; both as practitioners and as clients, we tend to set ever-higher goals for ourselves. Indeed, that's the only way the "state of the art" - any "art" - can advance. But we who practice the art and craft of software (including Web) development (as opposed to the engineering discipline of hardware development) have a history of slashed-beyond-reality schedules and budgets coupled with a tendency for stakeholders not to hear "if all goes well" as a condition to our latest schedule estimate. We have a history, perceived and actual, of promising more than we can deliver. Far more attention is paid by non-technical people to the "failures" and "broken promises" of software than to things done right. For a craft whose work is accruing increasing public-policy and -safety implications, the effect of unrealistic expectations, brought about by poor communication and technical decisions being made by people who aren't just technically ignorant but proud of the fact, is disturbing. What started as a slow-motion train wreck has now achieved hypersonic speeds, and represents a clear and present danger to the organisational health and safety of all stakeholders.

I don't mean to say that projects always fail, but an alarming number of them do. If, say, dams or aircraft were built with the same overall lack of care and measurable engineering precision that is the norm in commercial software development, we'd have a lot more catastrophic floods, and a lot fewer survivors fleeing the deluge by air. When I entered this craft thirty years ago (last May), I was soon led to believe that we were thirty to fifty years away from seeing a true profession of "software engineering". As a time frame beginning now, in 2009, I now think that is almost laughably optimistic.

Why have things gotten worse when we as a tool-building and -using society need them to get better? Some people blame "The Microsoft Effect" - shipping software of usually-dubious quality to consumers (as opposed to 'customers') who have bought into the (false) idea that they have no realistic choice.

It's more pervasive than that; commercial software development merely reflects the fashion of the business "community" that supports it, which has bought into one of the mantras of Guy Kawasaki's "The Art of Innovation", namely "don't worry, be crappy." Not that Kawasaki is giving bad advice, but his precondition is being ignored just as those of other software people have been: the key sentence in his "don't worry, be crappy" paragraph is "An innovator doesn't worry about shipping an innovative product with elements of crappiness if it's truly innovative" (emphasis mine). In other words, if you really are going to change the world, nobody will notice if your Deus ex Machina 1.0 has clay feet as long as you follow up quickly with a 1.1 that doesn't...and follow that with a 2.0 that changes the game again. But that space between 1.0 and 1.1 has to be fast, Kawasaki argues (in the next paragraph, titled "Churn, Baby, Churn"), and the version after that has to come along before people (like possible competitors) start saying things like "well, he just brought out 1.1 to fix the clay feet in 1.0." If the customers see that you're bringing out new versions as fast as they can adapt to the previous ones, but that each new version is a vastly superior, revelatory experience compared to the earlier release that they were already delighted by, they'll keep giving you enough money for you to finish scaling the "revolutionary" cliff and take a (brief) rest with "evolutionary" versions. Business has not only forgotten how important that whole process is to their continued survival, but they've removed the capability for their bespoke software (and Web) infrastructure to use and reuse that model. All that remains is "it's ok if we ship crap; so does everybody else." That's the kind of thinking that made General Motors the world-bestriding Goliath it is today - as opposed to the wimpy also-ran it was (emphatically NOT) half a century ago. 
We really don't need any more businesses going over that sort of cliff.

What we do need, and urgently, are two complementary, mutually dependent things. We need a sea change in the attitude of (most) businesses, even technology businesses, towards software - to realise and acknowledge that the Pointy-Haired Boss is not merely a common occurrence in the way business manages software, but actively threatens the success of any project (and business) so infested. Just as businesses at some point realise that "paying any price to cut costs" is an active threat to their own survival, they need to apply that reality to their view of and dealings with the technical infrastructure that increasingly enables their business to function at all.

Both dependent on that and as an enabler of that change, the software and Web development industry really needs to get its house in order. We need to get away from the haphazard by-guess-and-by-golly estimation and monitoring procedures in use by the majority of projects (whose elaborate Microsoft Project plans and PowerPoint decks bear less and less resemblance to reality as the project progresses) and enforce use of the tools and techniques that have been proven to work, and have an organised, structured quest to research improvements and New Things. Despite what millions of business cards and thousands of job advertisements the world over proclaim, there is no true discipline of "software engineering", any more than there was "oilfield engineering" in widespread use before the New London School explosion of 1937. Over 295 people died in that blast; we have software-controlled systems that, should they fail, could in fact hurt or kill many more - or cause significant, company- or industry-ruinous physical damages. We should not wait for such an event before "someone" (at that point, almost certainly an outside governmental or trans-governmental entity) says "These are the rules." While I understand and agree with the widespread assertion that certification tests in their present form merely demonstrate an individual's capability to do well on such tests, we do need a practical, experiential system - probably one modelled on the existing systems for engineering, law or medicine. Not that people should work 72-hour shifts; there's enough of that already. But rather that there should be a progression of steps from raw beginner to fully-trusted professional, with a mix of educational and experiential ingredients to ascend that progression, and continuing educational and certificating processes throughout one's entire career.
The cost for this is going to have to be accepted as "part of the system" by business; if business wants properly competent engineers, and not just the latest boatload of unknowns with mimeographed vendor certs, then they're going to have to realize that that benefit does not come without cost to all sides. The free ride is over - for all the stakeholders at the table.

Tuesday, August 11, 2009

Flying to Pieces, a Tutorial View

(with apologies to Dean Ing)

PHP 5 (really, 5.2) is the first version of PHP with decent support for object-oriented development. By "decent support", I mean that it supports virtually all of the feature goodness that programmers experienced in other object-oriented languages would expect.

One thing that comes from this, of course, is that applications can be developed incrementally, with well-defined interfaces that govern communication between various components, layers, and the classes that they contain. It also lets us make effective use of tools like PHPUnit, the PHP implementation of the well-known xUnit. Most people who start really using tools like this become very uncomfortable on projects that don't use them.

Getting back to the problem at hand...

We're developing a PHP application which will "syndicate the forum so that people can read synopses of the fishing reports with their feed readers," according to the original use case in Brian Carey's original DeveloperWorks paper. We can conceive of the application being made up of the following major "pieces":
  • a data source; an internal API the app uses to connect to and retrieve data from storage (e.g., a MySQL database);
  • an output generator, to generate the XML for the Atom feed using data from the data source; and
  • some high-level control logic to tie the first two pieces together.
If you've encountered the model-view-controller (MVC) architectural model, this will sound very familiar. We will not, however, be formally emphasizing MVC in this program. (There are numerous "application frameworks" for PHP which are built around MVC and related concepts; these are beyond the scope of this tutorial.)

Let's look at the output generator first, since it's already fairly well defined (it's an Atom feed). We can build test scaffolding using PHPUnit to prove that we're producing correct output by exercising the output generator interface - without touching a 'real' database. This will also give us confidence that we fully understand what the control logic needs to look like (suck data in, format it appropriately and output it). We can then fully implement that control logic, and wind up with implementing the "real" database access for the data-source component of the application. These three "pieces", or components, will be developed over the course of the next three posts in this series.
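
To make that idea a bit more concrete, here's a minimal sketch of how a canned data source lets us exercise an output generator with no database in sight. Every name in it (ReportSource, StubReportSource, TitleListGenerator, the 'title' key) is a placeholder of mine rather than anything from the original paper, and the generator emits only <title> elements to keep things short. In the series proper, a check like the one at the bottom would live in a PHPUnit test case rather than a bare script.

```php
<?php
// Sketch only: every interface, class and array key below is hypothetical.

// The "data source" piece: anything that can hand back report rows.
interface ReportSource
{
    public function fetchReports();
}

// A stand-in used only for testing; it serves canned rows instead of
// querying a real database.
class StubReportSource implements ReportSource
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetchReports()
    {
        return $this->rows;
    }
}

// The "output generator" piece, reduced here to entry titles only.
class TitleListGenerator
{
    public function generate(ReportSource $source)
    {
        $xml = '';
        foreach ($source->fetchReports() as $row) {
            // Escape the text so that characters like & and < stay valid XML.
            $xml .= '<title>'
                . htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8')
                . "</title>\n";
        }
        return $xml;
    }
}

// The "control logic" piece: wire a source to the generator and emit output.
$source = new StubReportSource(array(
    array('title' => 'Speckled Trout In Old River'),
));
$generator = new TitleListGenerator();
echo $generator->generate($source);
```

Because the generator depends only on the ReportSource interface, swapping the stub for a database-backed class later should not require touching the generator or its tests.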

I'll close this entry by pointing you once again to PHPUnit. If you browse the PHPUnit site, or do a Google search for phpunit tutorial, you'll find plenty of help if you're not already up to speed on it. It will become, if it is not already, as indispensable a part of your development kit as your text editor of choice.

In the Beginning Was the End

Or, to put it another way: if you don't know what you're trying to accomplish, you're quite unlikely to get there very efficiently, even if you're in a very Agile environment that assumes (and plans for) significant change along the way.

This is the third in a series of blog entries where I take an existing, interesting PHP tutorial (Brian Carey's paper in IBM DeveloperWorks, Creating an Atom feed in PHP) and rework it along current PHP 5 lines. If you haven't read that material, or my previous entries in this series, I suggest that you go back and refresh your memory before continuing here.

Everybody all caught up? Right, then; onward!

What we're trying to do, of course, is to create an Atom feed, which presents information regarding the items available in the feed within a specialised XML format. Structurally, this consists of a "preamble", describing and identifying the feed, its content and its origination; a sequence of entry items, describing each entry in the feed, and terminated by a closing </feed> tag (matching the tag that opened the preamble).

Continuing with the scenario and data presented in Brian Carey's paper, our 'preamble' should appear similar to the following:
    <?xml version='1.0' encoding='iso-8859-1' ?>
    <feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
      <title>Fishing Reports</title>
      <subtitle>The latest reports from fishinhole.com</subtitle>
      <link href="http://www.fishinhole.com/reports" rel="self"/>
      <updated>2009-05-03T16:19:54-05:00</updated>
      <author>
        <name>NameOfYourBoss</name>
        <email>nameofyourboss@fishinhole.com</email>
      </author>
      <id>tag:fishinhole.com,2008:http://www.fishinhole.com/reports</id>
Next will be a series of <entry>...</entry> entries, such as
      <entry>
        <title>Speckled Trout In Old River</title>
        <link type='text/html' href='http://www.fishinhole.com/reports/report.php?id=4'/>
        <id>tag:fishinhole.com,2008:http://www.fishinhole.com/reports/report.php?id=4</id>
        <updated>2009-05-03T04:59:00-05:00</updated>
        <author>
          <name>ReelHooked</name>
        </author>
        <summary>Limited out by noon</summary>
      </entry>
followed by a final </feed> tag closing the feed data stream (string or file).

Using that information, as well as the database layout and sample data from the original paper, we have a very clear definition of what our output should look like.
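
As a rough preview of where this is heading, here's one way a single <entry> like the one above might be produced with PHP's XMLWriter extension. This is a sketch under my own assumptions, not the code from the paper or from later posts: the function name buildEntryXml() and the array keys are invented for illustration, and it assumes the xmlwriter extension is enabled.

```php
<?php
// Sketch only: buildEntryXml() and the array keys are hypothetical names.
function buildEntryXml(array $report)
{
    $writer = new XMLWriter();
    $writer->openMemory();      // build the XML in memory, not in a file
    $writer->setIndent(true);   // one element per line, for readability

    $writer->startElement('entry');
    $writer->writeElement('title', $report['title']);

    $writer->startElement('link');
    $writer->writeAttribute('type', 'text/html');
    $writer->writeAttribute('href', $report['url']);
    $writer->endElement();      // closes <link>, written as a self-closing tag

    // The tag: prefix mirrors the sample feed shown above.
    $writer->writeElement('id', 'tag:fishinhole.com,2008:' . $report['url']);
    $writer->writeElement('updated', $report['updated']);

    $writer->startElement('author');
    $writer->writeElement('name', $report['author']);
    $writer->endElement();      // closes <author>

    $writer->writeElement('summary', $report['summary']);
    $writer->endElement();      // closes <entry>

    return $writer->outputMemory();
}

echo buildEntryXml(array(
    'title'   => 'Speckled Trout In Old River',
    'url'     => 'http://www.fishinhole.com/reports/report.php?id=4',
    'updated' => '2009-05-03T04:59:00-05:00',
    'author'  => 'ReelHooked',
    'summary' => 'Limited out by noon',
));
```

Letting XMLWriter handle escaping and tag closing means the code never has to glue angle brackets together by hand, which is where a lot of malformed feeds come from.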

The next posts in this series will have us thinking about how we're going to organize solving this problem, and then get us started with PHPUnit, the standard unit-test and TDD tool for PHP. We can write tests to verify that our program is working properly - and we haven't even started to write the program yet!

You want to start a tutorial, well, you know....

Not as catchy as the Beatles' Revolution, even if the meter works.... oh well....

Continuing from my previous post. What do I think is important when starting to demonstrate some code? As with most writing, it depends on the audience. For the purpose of this series of posts, I'm assuming that you fit comfortably in or near the following:
  • You're comfortable with HTML and XML doesn't make you run screaming from the room;
  • You have a basic understanding of databases; you've run across SQL before and understand the basic concepts;
  • You understand PHP; you've written some code before;
  • You understand the concepts of "object-oriented development", "patterns", "best practices" and ideally "test-driven development" (usually abbreviated as "TDD"), even though you may not have loads of experience (yet) with them; and crucially
  • You want to improve your ability to write code that you can refine and possibly reuse over time.
The assumption that you know or at least are interested in PHP is a given, since that's the language we'll be using here.

What will you need to have installed and available to follow along?
  1. Access to a system with PHP 5.2 or higher, available both from the command line and the Web server (via a module or CGI);
  2. The PHPUnit and MDB2_Driver_mysql modules installed and available;
  3. A text editor of your choice;
  4. The ability to create PHP scripts and HTML files and have those accessible from the Web server as well as the command line.
These should all be pretty obvious to more experienced PHP developers, but making sure that we're both operating from the same set of assumptions - and no others - greatly reduces the likelihood of confusion and breakage along the way. Many of you haven't yet dealt much with unit tests using PHPUnit or similar systems; that's going to be a starting point for us.

Tutorials, best practices and staying current

A gent by the name of Brian Carey has written a very nice little tutorial on "Creating an Atom feed in PHP", and gotten it published on the IBM DeveloperWorks site. In the space of about ten pages, Brian gives a stratospheric overview of what Atom is and why PHP is a good language for developing Atom-aware apps, and then gets into the tutorial - defining a MySQL database table to hold the data used to 'feed' the Atom feed, and writing code to get the data out and put it into the form that a reader such as NetNewsWire expects an Atom feed to be in.

Now, to be fair, Brian describes himself as "an information systems consultant who specializes in the architecture, design, and implementation of Java enterprise applications", and the paper is clearly meant as a whirlwind tutorial, not to be taken by the careful/experienced reader as necessarily production-quality code. And if this had been published in, say, 2002 or so, I'd have thought it a great how-to for banging out some PHP 4 code. But this is 2009; PHP 4 is a historical artifact, and blogs and industry journals of seemingly every stripe are decrying the poor quality (security, maintainability, etc.) of PHP code... much of which is still written as if it were the turn of the century, ignoring PHP 5's numerous new features and the best practices that both spawned them and grew from them.

So what's really wrong with doing things like they did in Ye Olden Tymes™?
  • Procedural code makes it harder to make changes (fix bugs, add features, change to reflect new business rules) and be certain that those changes don't introduce new defects. This is largely because...
  • While it is possible to test procedural code using automatable tools like PHPUnit, it's a lot harder and more complex than testing a clean, object-oriented design.
  • Hard-coding everything, interspersing 'magic values' throughout code, is a major hindrance to future reuse - or present debugging;
  • Using quick-and-dirty, old-style database APIs exposes you to the risk of input data that is more 'dirty' than it should be - opening the door to SQL injection and other nastiness;
  • Not staying current exposes your code to the risk that it's either using features that have since been deprecated or removed entirely, or (arguably worse), the risk that new features (such as standard library additions) make some of your existing code redundant at best.
Each of these, to varying degrees, is true of much of the PHP code I've read in the last couple of years, including that in the DeveloperWorks paper. For example, the DW paper's code makes use of the old-style PHP4 mysql_* API rather than the more abstract/portable MDB2 database abstraction layer. And there's also the rather terse implementation of a date3339() function for converting a timestamp to RFC 3339 output format, that's now nicely handled through the standard PHP DateTime class.
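
To illustrate that last point, here's a minimal sketch of RFC 3339 formatting using nothing but the standard DateTime class. The function name formatRfc3339() and its parameters are my own placeholders, not the paper's date3339(), and the sketch assumes PHP 5.2 or later with its bundled timezone database.

```php
<?php
// Sketch only: formatRfc3339() is a hypothetical helper name.
function formatRfc3339($timestamp, $timezoneName = 'UTC')
{
    // '@' followed by a Unix timestamp creates a DateTime in UTC.
    $date = new DateTime('@' . $timestamp);

    // Shift to the zone the feed should report its times in.
    $date->setTimezone(new DateTimeZone($timezoneName));

    // DateTime::RFC3339 is the built-in format string 'Y-m-d\TH:i:sP'.
    return $date->format(DateTime::RFC3339);
}

echo formatRfc3339(0), "\n";                              // 1970-01-01T00:00:00+00:00
echo formatRfc3339(1241385594, 'America/Chicago'), "\n";  // 2009-05-03T16:19:54-05:00
```

The second call reproduces the <updated> value used in the paper's sample feed, with the UTC offset worked out by the library rather than by hand.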

How would I have done things instead? Read the next few posts in this blog to find out. And, of course, comments are always welcome.

Thursday, August 06, 2009

Smokin' Linux? Roll Your Own!

As people who've encountered the "business end" of Linux have known for some time, the system (in whichever distribution you prefer) greatly rewards (some would say 'requires') tinkering and customisation. This can be done with any Linux system, really; some distros, like LinuxFromScratch and, to a lesser degree, Gentoo and its derivatives, explicitly assume that you will be customizing their base system according to your specific needs.

Now the major distros are getting into it. There have been various Ubuntu and Fedora customisation kits on the Net, but none as far as I can tell that are as directly supported (or easy to use) as the one from OpenSUSE, the "community-supported" offering from Novell (who also offer SUSE Linux Enterprise Desktop and Server).

Visit the OpenSUSE site, and prominently visible is a link to the OpenSUSE Build Service, which "allows developers to package software for all major Linux distributions", or at least those that use rpm packaging, the packaging system used by Red Hat, Mandriva, CentOS, and other similar systems. But that's not all...

SUSE now have a new service, SUSE Studio, which allows users to create highly customized systems based on either the community (OpenSUSE) or enterprise versions of SUSE Linux. These "appliances" can be put together on the basis of "patterns", such as lamp_server (LAMP, or Linux/Apache/MySQL/PHP Web server) or technical_writing (which includes numerous tools like Docbook). You can even supply your own (either self-built or acquired elsewhere) RPM packages to include in the appliance you're building, and SUSE Studio will deal with the dependency matching (warning you if packages are required that aren't either among its standard set or uploaded by you).

Startup scripts, networking, basically anything that is usually handled through the basic installation or post-installation configuration - all can be configured within the SUSE Studio Web interface.

And then, when you've got your system just the way you want it, you can build it as either an ISO (CD/DVD) image to be downloaded and burned onto disc, or as a VM image for two of the most popular VM systems (VMWare and Xen).

But wait, there's more...

Using a Flash-enabled browser, you can even "test drive" your appliance, testing it while running (transparently) in an appropriate VM hosted within the SUSE Studio infrastructure. Especially if you have a relatively slow connection, this will let you do preliminary "smoke testing" without having to download the actual image to your local system. Once you're ready to do so, of course, downloading is very nearly a single-click affair. Oh, and you're given (presently) 15 GB of storage for your various builds - so you can easily do comparative testing.

What don't I like about it? In the couple of hours I've been messing around with it today, there's really only one nagging quibble: When you do the "test drive" of your new creation, the page you're running it in is a standard, non-secure http Web page. The page warns you that any data and keystrokes sent will not be encrypted, and recommends the use of ssh if that is a concern (by which most people will think https). But there's no obvious way to switch, and shutting down the running appliance (which starts by the time you read the warning) involves keystrokes and so on...

In fairness, this is still very clearly a by-invitation beta offering (but you can ask for an invite), and some rough edges are to be expected. I'm sure I'll run into another one or two as things go on. I'm equally certain that all the major problems will be smoothed out before SUSE Studio goes into general public availability.

So, besides the obvious compulsive hackers and the people building single-purpose appliance-type systems, who would really make use of this?

One obvious use case, which the SUSE Studio site describes, is as a canned demo of a software system. If you're an ISV, you can add your software or Web app to a SUSE Studio appliance, lock down the OS image to suit (encrypting file systems and so on), and hand out your discs at your next trade show (or have them downloadable from your Website). No worries about installing or uninstalling from prospective customers' systems; boot from the CD (or load it into a VM) and they're good to go.

Another thought that hit me this morning was for use as an interview filter. This can be in either of two distinct modes. First, you might be looking for people who are really familiar with how Linux works. Write up the specs of a SUSE Studio appliance (obviously more demanding than just the click-and-drool interface) and use an app of your own devising to validate the submitted entries. This validation could be automated in any of several ways.

The second possible interview filter would be as a programming/Web dev system. As a variation on the "ISV" example above, you load up an appliance with a set of tools and/or source files, ready to be completed and/or fixed by your candidates. They start up the appliance (either a live CD or VM), go through your instructions for the test, and then submit results (probably an encrypted [for authentication] archive of all the files they've touched, as determined by the base system tools) via email or FTP. On your end, you have a script that unpacks the submission into a VM and uses the appropriate automated testing tools to validate it. I can even see this as a business model for someone who offers this capability as a service to companies wishing to have a better filter for prospective candidates than resume-keyword matching - which as we all know is practically useless due to the high number of both false negatives and false positives.
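As a rough illustration of the receiving end, here is a minimal Python sketch of such a script. It assumes the submission arrives as a plain tar archive (decryption and signature checking are left out), and the names validate_submission and checks are hypothetical, not part of any SUSE Studio tooling. A real setup would run the candidate's work inside a throwaway VM rather than on the host.

```python
import tarfile
import tempfile
from pathlib import Path


def validate_submission(archive_path, checks):
    """Unpack a candidate's tar archive into a scratch directory and run checks.

    `checks` maps a check name to a function that takes the unpacked
    directory (a Path) and returns True or False. Both names are
    illustrative assumptions.
    """
    results = {}
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch).resolve()
        with tarfile.open(archive_path) as archive:
            for member in archive.getmembers():
                # Refuse links and entries that would land outside the scratch directory.
                target = (root / member.name).resolve()
                if member.issym() or member.islnk() or root not in (target, *target.parents):
                    raise ValueError(f"unsafe archive entry: {member.name}")
            archive.extractall(root)
        # Run every check while the unpacked files still exist.
        for name, check in checks.items():
            results[name] = bool(check(root))
    return results
```

Each check is just a function, so the same harness could run unit tests, diff output files, or simply confirm that the expected files exist.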

What do you all think?

Tuesday, August 04, 2009

The Debate between Adequacy and Excellence

I was clicking through my various feeds hooked into NetNewsWire, in this case The Apple Core column on ZDNet, when I came across this item, where the writer nicely summed up the perfectly understandable strategy Microsoft have always chosen and compared that with Apple and the Mac. Go read the original article (on Better Living without MS Office) and then read the comment.

As I've commented numerous times in this blog and elsewhere (notably here), I'm very partial both to open standards (meaning open data formats, but usually expressed in open source implementations) and to the Apple Mac. As I've said before, and as the experience of many, many users I've supported on all three platforms bears out, the Mac lets you get more done, with less effort and irritation along the way, than either Windows or Linux as both are presently constructed.

But the first two paragraphs of this guy's comment (and I'm sorry that the antispam measures on ZDNet apparently don't permit me to credit the author properly) made me sit up and take notice, because they are a great summation of how I currently feel about the competing systems:

The Macs vs. PC debate has been going on for about 25 years or so, but the underlying debate is much older. What we are really discussing is the difference between adequacy and excellence. While I doubt I would want to be friends with Frank Lloyd Wright or Steve Jobs, both represent the exciting belief in what is possible. While Bill Gates and Steve Ballmer rake in billions, their relative impact on the world of ideas is miniscule.

Bill Gates understands that business managers are on the whole are a practical, albeit uninspired and short-sighted bunch. By positioning Microsoft early on to ride into the enterprise with the implicit endorsement of one of the biggest, longest-lived, and influential suppliers of business equipment, Gates was able to secure Microsoft's future. Microsoft's goal has never seemed to me to be to change the world, only to provide a service that adequately meets business needs. Microsoft has also shown from early on a keen awareness that once you get people to use your product, your primary goal is not to innovate to keep your customers, but, rather to make leaving seem painful and even scary. Many companies do this, but Microsoft has refined this practice into an art.

He then expands on this theme for four more paragraphs, closing with
Practically speaking Microsoft is here to stay. But I am glad that Apple is still around to keep the computer from becoming dreary, to inspire people to take creative risks, to express themselves, and to embrace the idea that every day objects, even appliances like the computer, can be more than just the sum of their functions.

Aux barricades! it may or may not be, depending on your existing preferences and prejudices. But it does nicely sum up, more effectively and efficiently than I have been able to of late, the reasons why Apple is important as a force in the technology business. Not that Microsoft is under imminent threat of losing their lifeblood to Apple; their different ways of looking at the world and at the marketplace work against that more effectively than any regulator could. But the idea that excellence is and should be a goal in and of itself, that humanity has a moral obligation to "continually [reach] well past our grasp", should stir passion in anyone with a functioning imagination. Sure, Microsoft have a commanding lead in businesses, especially larger ones - though Apple's value proposition has become much better there in the last ten years or so; it's hard to fight the installed base, especially with an entrenched herd mentality among managers. But, I would argue, that does not mean that Apple have failed, any more than the small number of buildings designed by Frank Lloyd Wright and his direct professional disciples argue for his irrelevance in architecture. If nobody pushes the envelope, if nobody makes a habit of reaching beyond his grasp, how will the human condition ever improve? For as Shaw wrote,
The reasonable man adapts himself to the world. The unreasonable man persists in trying to adapt the world to himself. All progress, therefore, depends upon the unreasonable man.

And that has been one of my favourite quotes for many years now.

Friday, July 31, 2009

Web Standards DO Save, Then and Now

It's been known in the Web-development community for several years now that well-designed, semantic, standards-compliant Websites use dramatically fewer resources (such as bandwidth) than 1997-era tangles of nested tables and invalid HTML. But it's still refreshing to read confirmation that that truth is pretty universally applicable - especially when that reading doesn't depend on what we now know as the "latest and greatest" Web browsers.

Take, for example, this five-year-old article by Jim Ramsey, then-Webmaster of the San Francisco Examiner newspaper's Website. In the section titled Simplify, Man!, he writes:

This is what a basic link in our navigation looked like late last year, before standards:

<tr>
<td class="navmenu" height="18"
onClick="javascript:rolloutNav(this);document.location='/home/index.cfm'"
onMouseOver="javascript:rolloverNav(this);"
onMouseOut="javascript:rolloutNav(this); " colspan="2">
<a href="/home/index.cfm" class="nav">HOME</a></td> </tr><tr>
<td bgcolor="#EEEEEE" class="navmenuspacer" colspan="2">
<img src="../site_images/spacer.gif" width="1" height="2"></td> </tr>
Now take a look at what an Examiner navigation link looks like now:
<li><a href="/home/">Home</a></li>
That’s a big difference. In fact, the first one is so bad, I’m almost embarrassed to include it here. And what did I get for all that extra stuff? Basically, nothing. The JavaScript triggers the rollover effect and the table cells control the spacing. All of that can be done using styles.

Let’s take another example. Here’s how a link to a story in the Arts section looked before standards:
<img src="../site_images/sfex/homekickerarrow.gif" width="6" height="8">
<span class="kicker">Movie Review: Dickie Roberts<br></span>
<a href="../templates/story.cfm?displaystory=1&storyname=090503a_dickie"
class="headlinesm">Problem 'Child'</a>
<hr noshade size="1" color="#EEEEEE">
Here’s the same thing following standards:
<h5>Movie Review: Hero</h5>
<h4><a href="/article/index.cfm/i/082704a_hero">Holding out for a 'Hero'</a></h4>
Again, once it is styled, the second version can be made to look identical to the first. When you can simplify markup in this way, it starts to make a big difference in bandwidth.

Comparing last year’s table-based site to our new standards-based one, the amount of information on our homepage is strikingly similar. Both contain basically the same elements and yet the HTML is 13K smaller on the CSS-version at 19.6K.

As a result, even though our traffic was about 40% higher in July 2004 than in September 2003, our bandwidth was almost exactly the same for those two months.
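Those figures can be sanity-checked with a little arithmetic. The Python sketch below uses only the numbers quoted above (19.6K after the redesign, 13K saved, traffic up about 40%) and assumes, for simplicity, that bandwidth scales with page views times HTML size. Images, stylesheets and caching are ignored, so this estimates the HTML share only.

```python
new_html_kb = 19.6          # homepage HTML after the redesign (from the article)
saved_kb = 13.0             # reduction quoted in the article
old_html_kb = new_html_kb + saved_kb

traffic_growth = 1.40       # "about 40% higher" traffic

# Fraction of the old page size that still has to be sent per view.
size_ratio = new_html_kb / old_html_kb
# Assumption: HTML bandwidth is proportional to views times page size.
html_bandwidth_ratio = traffic_growth * size_ratio

print(f"Old homepage HTML: {old_html_kb:.1f}K")
print(f"New page is {size_ratio:.0%} of the old size")
print(f"HTML bandwidth after redesign: {html_bandwidth_ratio:.0%} of before")
```

Under those assumptions the homepage HTML alone would cost roughly 84% of what it did before, even with the extra visitors, which fits comfortably with the article's "almost exactly the same" total.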

"[E]ven though our traffic was about 40% higher...our bandwidth was almost exactly the same..." What other development practice allows you to simultaneously:
  • boost your traffic by nearly half without similar increases in bandwidth costs;
  • improve your search-engine results without expensive, error-prone twiddling;
  • open your site to a potentially much wider audience, by not limiting what platform or browser your audience uses; and
  • significantly reduce the cost and complexity of maintaining your site?
As one colleague put it in an IM discussion, "people who develop like it's 1997 shouldn't be surprised if their revenues and page views don't exceed 1997 levels. Especially if they weren't around in 1997."

I'm not (quite) to the point of some of the more, um, evangelical developers out there in equating broken, invalid sites with actionable incompetence - but if our craft has serious hopes of making it out of "hobbyist" status in the eyes of non-technical business people, to where they treat practitioners as members of a profession, then we need to get some meaningful, practical, well-defined standards. This (standards-compliant, semantic development) is at or very near the top of my list of such standards.

Final note: I find it deeply ironic that Google have apparently redesigned the posting interface for Blogger. Whereas previously I have been able to post from any Linux, Mac or iPhone browser, switch between HTML editing and WYSIWYG view, and use all the other goodness...the new CSS and JavaScript seem to fail in any of the half-dozen Mac browsers I've tested today. One step forward, two steps back is especially painful when one started by walking directly away from the very edge of a cliff.

Monday, July 20, 2009

Tools, Continued

This blog, fairly obviously, is published on blogger.com, which is now owned by Google. Blogger.com is geared primarily towards people who want to write but don't want to have to worry about the nitty-gritty technical details involved (such as HTML, CSS and so on). Sign up, click the 'New Post' button, and off you go...almost easier than falling out of your chair.

For the Web wizards among you, you can get into the guts of how your blog and posts are laid out and formatted; there are several in-the-tin and numerous third-party templates that can be used to style your blog any way you like it, and those can then be hand-tweaked by you to get things just so.

As with any click-and-go interface, the Blogger new-post window (what I'm typing in right now) has a "Preview" button (or actually, link); click on it and you'll see a more-or-less-reasonable facsimile of what your breathless prose will look like to the next visitor who stumbles across your blog. I did say "more-or-less-reasonable"...

One thing the preview area doesn't do - sensibly in hindsight - is to apply your template's formatting to the post being previewed. So that, for instance, if your template specifies that you want 9-point Gentium Book Basic, Times New Roman or Times (in that order) for your body, you won't see it that way in the preview.

What you also won't see - and this is what tripped me up for the longest time - is other styling for the post body, particularly justification. You may notice that all my posts are displayed with justified left and right margins; the default, and most commonly used setting, is for a justified left and ragged right margin, which I find unattractive. I've tweaked the template I use several times over the years to try and get the effect I was looking for. Each time, the post-preview showed no changes to the text formatting, and so I undid the change without viewing the entire blog normally. I have been instead hand-wrapping the content of an entire post in a <div>...</div> element pair whose only reason for existence was to set text-align: justify;.

This isn't the first time I've blogged about my learning-experiences-that-shouldn't-have-been, and just to be perfectly clear about this, I'm not trying to dis Blogger.com about this. The preview-while-composing feature is merely to let you see the content of the post you're working on without all the editing framework around it. It won't, and arguably shouldn't, try to render that content in its final form. You do, after all, have the ability to go back and edit posts you've already published.

This is, at its heart, a cautionary tale for those of us tasked with making the Web easier to use for a wide variety of users. Be careful especially with interface design. Recognise that, barring explicit cues to the contrary, people's assumptions about How Things Work on your site may have only a nodding acquaintance with your own - but if you have some appropriate description in the right place, users are often happy to adjust. But there has to be a balance, or your 'power users' will feel like they're being stifled. How you achieve that balance is, of course, up to you to find out. Good luck.

Sunday, July 19, 2009

Misadventures with Customer "Service": With Service like This, Why Bother?

Doing a job search using various Websites became really popular about ten years ago as an easy way for candidates and companies to find each other without any (visibly obvious) middlemen directly involved. I understand that in some geographic and employment areas, it's still a useful tool. I've had myself registered on fewer than half a dozen for several years, even though my last 6 jobs were all found through other means.

Job sites aren't typical consumer Websites in that they're really trying to target two distinct groups of customers: companies with jobs on offer (who generally pay the bills) and jobseekers (who generally don't). Even though jobseekers don't have any up-front money dangling in front of the site operator, wise (or experienced) operators of such systems know that they should "take care" of these site users well, since:
  • Companies will be less likely to pay to advertise on general-audience sites with few users, and
  • Users who have negative experiences (on any site) are often motivated to "spread the word".
Any business which is heavily dependent on its Website for revenue or for customer service knows (or quickly learns) that locking actual or potential customers out of their site through defective, easily-correctable site implementation has a direct and negative impact. The pathologically clueless operators don't seem to know or care why this happens; they just notice their monthly traffic figures dwindling over time, with revenues to match.

This even applies, although to a lesser extent, when the market being served by the site encourages a lack of competition. This is the case quite often here in Southeast Asia, and notably in Singapore. Only two media companies serve the English-language market here, both with apparent Government ties. Broadcast television and radio are similarly throttled. This becomes apparent to anyone who spends more than a few days here. Individuals are engaged in frenetic, all-consuming competition with each other, while favoured companies and industries don't (visibly) engage in such grubbiness.

All of this went through my mind again over the last few days, after I'd had a series of consistently unpleasant experiences with one particular Website operator. Necessary background: I have been essentially Windows-free (and thus virus-free and botnet-free) for some five years now, initially using Linux systems like Ubuntu and then the Apple Mac. What has made this change practical, for me and for tens of millions of other people, has been the emergence of open standards in desktop computing, particularly with regard to open data formats. A classic example with which anyone reading this is familiar is HTTP. Any device, from large mainframes to appliances to mobile phones, can be used to produce or access information via HTTP, most commonly in the form of Web pages. Other "standards", like MP3 for audio or Microsoft Office document formats, are proprietary to a company or organisation but widely implemented by competing or collaborating systems. This reliance on actual or de facto standards, as much as any other single factor, is what has enabled much of modern technology, particularly the Internet (which many think of as synonymous with the World Wide Web).

Not everybody has always "played fair" with these standards, however. Microsoft, in particular, have been infamous for their commonly-used practice of "embrace, extend and extinguish", which has led to several instances where they support an existing standard (such as HTML for Web pages or Kerberos for network authentication) and then introduce either incompatible features or defective versions of standard features into their implementation. This has the effect of "locking in" users who support Microsoft's implementation rather than the actual standard. These usees, as they are often called, would face significant actual or perceived costs and difficulties were they to switch away from their now-Microsoft-specific infrastructure. Microsoft have by no means been the only ones to do this sort of thing; they are, however, often quite brazen about it - and their usees don't always have an accurate understanding of the true costs of their "investment."

In particular, during the "dark ages" of Web development in the mid- to late 1990s, it was common for sites to have text on each page proclaiming "Best viewed with Microsoft Internet Explorer"; the tools which they were using to create the Web pages (often Microsoft FrontPage) did not always (or easily) generate valid HTML, CSS or standard JavaScript - all of which are needed to make interactive, "dynamic" Web sites work. As the relevant standards have become better-known to amateur and semiprofessional Web developers, the tools used by professional developers have matured, and those using Web browsers have become more aware of the importance of these standards, we have seen an increasing Web "Enlightenment", where once again, anybody (or anything, like a Google search crawler) can access all content for a site.

There are exceptions to this, however, just as there are trailing-edge sluggards in any social or technical change. These can give excuses about lack of resources, perceived lack of need ("We're doing fine the way things are") and so on - which quite often attempt to mask either ignorance of the issues involved or, at a fundamental level, disrespect for their current and potential customers. When a well-equipped, motivated (outside) organisation is in a position to gain from that ignorance or contempt - as Microsoft for so long was with regard to Web development and standards - those "sluggards" will continue to pat themselves on the back for their "prudence" and "conservatism" - right up to and often past the point at which the business dies.

One such organisation appears to be the company which owns and operates the JobsDB.com group of employment Web sites. With "specialised" sites for Singapore, Thailand, the Philippines, Australia and other countries, and an apparently large advertising and promotion budget, it's safe to speculate that millions of people have at least seen their advertisements, if not actually made use of their site. Those who actually do try to browse for a job there, however, can face significant problems - if they're using anything other than Internet Explorer on Windows as their browser. (And the message seems to be finally getting through to many users - Microsoft Internet Explorer and Windows itself cause numerous security and stability problems that are comparatively unheard of on any other system.)

I browsed jobsdb.com.sg, the Singapore-themed version of their online job service. I got as far as entering my details in an application form for a position, when I ran into insurmountable difficulties. Numerous pages display poorly in any of the browsers I tried. "Obviously using archaic design," I thought, "but as long as the function isn't too badly broken, I can still deal with this." I eventually got to a page with a form to be filled in by those wishing to apply for a specific job (at SingTel) and ran into fatal problems. I clicked on the link to report a problem with the site, and included the following description:


SingTel application page misbehaves horribly with Safari 4.0.2/Mac. An ASP alert box, (URL omitted for this blog post), is displayed in a new tab, taking the entire browser window; the message "Please correct the field(s) with red exclamation mark(s) (!). You can click on the exclamation mark for instruction and input relevant information." indicates that there was an error on the initial page (which is still open in its original tab), but no red exclamation marks appear.

Very 1998ish.



Sixteen hours after filing the initial report, and getting an auto-generated email indicating that it had been received and entered into their ticketing system, I got the following email from "sg-cshelpdesk@jobsdb.com.sg":

Dear Sir / Madam,

Please clear the cache of your browser by deleting the cookies and temporary internet files, enable the java and javascript, disable the popup blocker, open a new browser and try from there.

Otherwise, please try other browsers like firefox or Internet Explorer 6.0 & above to access the Career Portal and verify that your Internet Settings are configured correctly against http://sg.career.jobsdb.com/faq/documents/Internet_Explorer_Settings.pdf. Then open a new browser and submit from there.

Best Regards,
Customer Service



Absolutely no problem-specific content whatsoever. This could easily have been (and in my opinion probably was) an automatically-generated message that was sent when no relevant response would be forthcoming within 24 hours after filing the problem (the '16:44' in the timestamp, or 4:44 PM local time, sounds suspiciously like "it's the end of the day, nobody got to this, blow him off and hope he goes away.") There continues to be an explicit assumption within the fluff mail that I'm using Internet Exploder; even though the second paragraph mentions "firefox" (sic), the PDF link supplied says that it relates to IE.

I'm getting used to companies, particularly the all-pervasive oligopolies, here in Singapore treating the customer with contempt. Knowing that no living human would probably ever read it, however, I did send one last complaint email.

Dear Customer “Service”,

Did you actually read the report I filed? Here are a few Statistically Improbable Phrases to clue you in:

  • “standards-compliant (non-IE) browsers”. I tested your site with ten different browsers on four different operating systems. In no case did it operate correctly. By the way, Microsoft does not make a version of Internet Exploder for non-Windows systems.
  • “Safari” - Safari is the standard, comes-with-the-system browser for Macs and iPhones, and is available for those wishing to upgrade their Internet experience on Windows as well.
  • “an ASP alert box...is displayed in a new tab” - obviously not what was intended by the coder; however, competent JS wouldn’t do this.
I am a Web developer with nearly 15 years experience. I have been teaching Web development for ten years. If a student of mine from my very first class, back in 1999, had submitted work with such grossly defective JavaScript, he would have been required to redo it. You see, even since the early days of the World Wide Web, there have been standards that even Microsoft can comply with – though they also “support” their own proprietary, non-standard version of HTML and JavaScript and have encouraged careless or inexperienced “developers” to lock themselves in, breaking sites for anyone using a non-IE browser. That includes half the smartphones on the planet, every single non-Windows desktop or laptop system, and search engines like Google.

You should take http://www.amazon.com/Novelty-Sign-Brain-before-engaging/dp/B000K62V92 as a gentle but firm recommendation.

Sincerely,


No, I don't expect a relevant reply. Yes, I do feel better. No, I won't be using JobsDB or any related site, and will happily explain to any who ask my opinion precisely why.

A final note: There is a nice little site called yougetsignal.com which offers a nice set of network (Internet) diagnostic and information tools. One page on the site, http://www.yougetsignal.com/tools/web-sites-on-web-server/, does what's called a "reverse IP address lookup". I used this page to find out what other sites are hosted by the same server as jobsdb.com.sg. Try it yourself; the results included apparently all "country-specific" jobsdb.com sites - as well as the site for Target, an American retailer. Interesting.
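The principle behind such a lookup can be sketched in a few lines of Python. This is not how yougetsignal.com does it (a real service needs a large database of known hostnames); it only shows that several names can resolve to the same address. The function name group_by_address and the sample hostnames and addresses are made up for illustration.

```python
import socket
from collections import defaultdict


def group_by_address(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by the IPv4 address they resolve to.

    `resolve` defaults to a real DNS lookup; pass a stub to try it offline.
    """
    groups = defaultdict(list)
    for name in hostnames:
        try:
            groups[resolve(name)].append(name)
        except OSError:
            # Name did not resolve; leave it out of the result.
            continue
    return dict(groups)


if __name__ == "__main__":
    # Offline demonstration with a stub resolver and invented addresses.
    fake_dns = {
        "jobs.example": "192.0.2.10",
        "shop.example": "192.0.2.10",
        "blog.example": "192.0.2.20",
    }
    print(group_by_address(fake_dns, resolve=fake_dns.__getitem__))
```

Any address that ends up with more than one name attached is, in effect, a shared server.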

Not that I expect it, but if any of the parties mentioned here were to reply to what I've written here, I'd be happy to publish their response.

Tuesday, July 14, 2009

Expanding the Omniverse

Anybody who's worked with me in about the last 25 years knows that I've been preaching the idea that software craftfolk should never stop learning. Further, I've always believed that learning new languages or tools is one of the easiest ways to accomplish this, keeping the mind supple and open to new ways of doing things. And by and large, I've kept this up, learning enough of a new or long-neglected language to at least be able to read and patch code every two or three months. (You do the math.)

One of the languages that I learned rather early on in its lifecycle is Ruby. (Wikipedia has a rather good article summarizing the history, if you're unfamiliar with it.) Early releases (1.0-1.4 or thereabouts) were interesting - they showed the power and promise of the concepts that Ruby is built on, without the bloated inscrutability of, say, Perl. As so often has happened, I'd learned just enough to be dangerous, and then got sucked back into the charnel house that is Microsoft Windows development. By the time I had the time and inclination to start messing with Ruby again, an unfortunate thing happened.

Ruby on Rails.

Not that Rails isn't a great tool for building 37signals-type Websites; it clearly is. But it became a victim of its own hype and started being used for everything imaginable - famously including Twitter (a hype explosion in its own right). Rails was enough to push me - and apparently a good number of other folks - away from Ruby and onto other languages, notably Python. And so I spent the bulk of the next couple of years in PHP, Python, C++, Objective-C, D (another bit of unsung genius), and managed to keep busy.

We developers have a well-known cliché for what drives us to do new things or participate in development projects: "scratching an itch". To scratch an itch in this context is to solve a problem that we ourselves are facing, or to do something that otherwise interests us. What got me motivated to get back up to speed in Ruby wasn't the Rails hype, or even seeing all the nice APIs that Twitter and Repertoire had made available. My problem was a bit simpler and more immediate.

Apple's Mail app started crashing under the load I was giving it. For about a year, I'd had a mail store that averaged about 2 GB and I was getting on the order of 600 to 800 emails a day. I don't mean to be critical of the app; it just wasn't designed to do what I was asking it to, certainly not when sharing 2 GB of RAM with the rest of the system. By the time I migrated away from Mail, I had over 400 filtering rules defined, to slice and dice incoming emails into appropriate folders where I could deal with them as I chose.

In late May, 2009, I up and migrated my email from Apple Mail to Microsoft Entourage. For those of you whose only exposure to Microsoft email software has been Exchange or Lookout! ("Outlook") Express, you're in for a pleasant shock. Entourage runs quite happily in the system as it is (even if I can't run certain other apps at the same time without upgrading RAM), and doesn't give me the maddening ten-minute freezes that were common with Apple Mail as it tried to figure out what to do next. Importing my existing mail store was a breeze. The filtering rules even came along for the ride, and Microsoft's rules editor is a real treat. There was, however, one small problem.

Somehow, the ordering of rules had become scrambled during the import, and after a few weeks of hand-editing to fix the biggest problems, I started looking for a program that would let me import, export, reorder and bulk-edit Entourage's rules. So far, I haven't been able to find one. (If anybody knows of anything useful, please add a comment or email me.) OK, I thought, no problem; nearly everything on the Mac is scriptable. I should just need to learn how to access Entourage's rule set from AppleScript or something similar, then I can write the tool I want. Not trivial, but certainly something that seemed conceptually quite practical.

So I started learning AppleScript, and casting about for tools and sample code that talked to Entourage from AppleScript. While searching, I ran across Matt Neuburg's site. Dr. Neuburg wrote the definitive guide on AppleScript - and then found something that worked better for him: rb-appscript. He's written an online book about it (eventually to be published on dead trees). At this point, I said "ok, let's get started with Ruby again and see what we can do."

If anybody has any pointers or suggestions, please comment. Thanks for reading.

Friday, July 10, 2009

"You can have my...when you pry it out of my cold, dead hands"

One of the, shall we say, unusual things about being in this line of work is that you develop stronger-than-is-healthy bonds to particular bits and pieces of technology, both hardware and software. For example, I think that anybody who's worked on a Mac as their main system for a year or so would take a catastrophic productivity hit if they were required to work in a Windows-only environment. Further, I assert, based on experience and observation, that this hit would actually increase in severity commensurate with previous Windows experience, as the nature of the problems and hassles encountered on a continuous basis would be that much more familiar.

But, believe it or not, this isn't a hit piece on Windows. If anything, Mac OS X takes the brunt here, in more or less a continuation of a previous post.

Don't get me wrong. The Mac itself is still one of the very few things I'd plug into that quote in the title. But the differences between it and the BSD Unix heritage it's based on, at an architecture-implementation level, sometimes drive me up the wall; it's as though I'm caught in the old Saturday Night Live "It's a floor wax AND a dessert topping!" skit.

Case in point: the Web developer's Swiss Army Ginsu Knife, PHP. On Unix and Linux systems, it's very straightforward to build a scripting engine that contains just the features and extensions needed, with security and debugging features added in to taste. The last few releases have even made that process humanly feasible for Windows usees. On Mac OS X, however, it's a different story entirely. (Item: As of Friday 10 July, the search "mac php configuration" returned some 5.22 million hits. Applying Sturgeon's Second Law to this leaves us with at least half a million pleas for and/or offers of help.)

The practical result is that, on Mac OS X, one uses one of the various prebuilt binaries for PHP (from Apple, MacPorts, MAMP or similar) - or goes without. If you're interested enough in PHP to want it to work on a Mac, you probably have some urgent deadline breathing down your neck; you don't have time to figure out how/why there are unique runes and incantations involved in making it work. This is the dark flip side of "almost everything Just Works"; the things that don't Just Work tend to average the entire experience out.

So, what's the most efficient solution for the reasonably serious Mac-loving Web developer? Simple: max out the memory in your Mac (and I do mean "as much as Steve&Co let you put in and not one byte less, blastit!"). Then, add one of the software tools that's going to be on that Mac somebody pries out of my "cold, dead hands": VMware Fusion. Once you have Fusion installed, grab an ISO image of your BSD version or Linux distro of choice, install it, and use that for your PHP and other web-dev activities.

Better still, you can still use your fave Mac editor, browser and so on during development in the VM; you'll want to set up ssh on your VM, and then use sshfs to mount your Linux/BSD filesystem as a disk volume in your Mac. From that point - a file is a file is a file; you're just taking advantage of the *ix VM's tools. This is how I do my PHP 5.3 and PHP 6 testing now.

And, after going through all this extra effort, you may well pray that the new version of Mac OS X enables a better native solution. I know I do.

Nothing is perfect in this world, and all technologies have speed bumps in some fashion. The good bad thing about OS X is that there aren't nearly as many as in some other systems I could name - but when you hit the ones that are there, you hit them hard.

Wednesday, July 08, 2009

The Best Tool for the Job

One of the nice things about growing up around (almost exclusively) men who were master mechanics, carpenters or other such highly skilled tradesmen was that I developed an appreciation both for "the best tool for the job at hand" and "making do with what's available" - and whichever of these applied, accomplishing the task at hand to the best of anyone's ability.

As I've progressed through my software and Web career, I've become highly opinionated about the tools I use, just like any other experienced software craftsperson I've ever known. You and I might use different tools to accomplish what functionally is the same task, but so long as we each have practical, experiential bases for those preferences, we should just go ahead and get what needs doing done. (There's an argument in there for open standards as a requisite for that to happen, but that's another post.)

Too many people who should know better have religious-level devotion to or hostility towards certain companies and/or products. Yes, that includes me; I know I've said some pretty inflammatory things, usually when I felt someone was expressing a religious belief masked as a technical opinion. No doubt they've felt the same about me and any others who were incautious enough to oppose their evangelism (or reactionism, depending on the circumstances). In general, it should be pretty evident to everyone with a personal or professional involvement in IT or personal electronics that trends are driven as much by "what I say three times is true!" as what actually can be shown to be true. That's how mediocre-at-best products become "industry Leaders"; inertia and close-mindedness set in, reinforced by a well-funded, continuous and strident marketing/branding campaign.

I was having a discussion about this online recently, with a former associate who's long had me pegged as an ABMer ("Anything but Microsoft"). I can understand how he formed that opinion; I've long complained about the (innumerable) defects in the "market-leading" operating system, and about how slowly progress has been made in cleaning up the most egregious faults (such as security). But I've also worked at Microsoft in Redmond - three different times - and I've always been impressed by the number of truly gifted people working there. They've had their triumphs and tragedies (anyone used Microsoft Bob lately?). They've had to deal with widely differing process and management effectiveness as they transfer between or liaise with different groups. They've ignored a lot of what has been done outside the company, but they've also created some amazing things inside; too many of which unfortunately never make it into public products.

And the quality of their work product varies as much as any of the factors that go into it. Cases in point: compare, say, Windows Vista with Windows Mobile or the Xbox; compare Microsoft Outlook (forever known as "Lookout!" to security/admin people) with Entourage; compare Word for Windows to Word for the Mac - what I understand is a completely different code base (and visibly so) that "just happens" to be able to flawlessly read and write documents shared with Word for Windows.

I also reread a blog post I wrote last December where I detailed the issues I was starting to have with Apple's own Mail app for the Mac. I have a mail store that's hovered somewhere above 2 GB for the last year. I receive 100-200 legitimate emails per day (and up to 700 spams). I presently have over 230 filtering rules defined for how to handle all that mail. Those rules have been built up over the last five years or so - first using Mozilla Thunderbird, then Apple's Mail.app, and now a new system; a progression that also speaks eloquently about the value of open standards. I have never, to my knowledge, lost a saved message whilst transferring from one package to its successor. The few hiccups each transition has had with filtering rules have all been relatively easy to find and fix, with the newest app making that process breathtakingly simple.

The new mail app? As you've no doubt guessed, Microsoft Entourage. It, like every other Mac app I've ever used, Just Works as expected (at least until you get out to the far, bleeding edges). If Microsoft made Windows and Office for Windows as well as they make Entourage (and the rest of their Office:Mac products), they really wouldn't have to worry about competition - and they'd richly deserve that. The market-friendly price for their Mac product (where their major, worthy competitor sells for US$79) is just icing on the cake.

I don't hate Microsoft. I just wish they would stick to what they do as well or better than anyone else, and leave the crappy products that can never be anything but hypersonic train wrecks - like Windows and Internet Exploder. I wish that ever more fervently every time I'm asked to help some hapless Windows usee fix "why my computer doesn't work". That would also make Microsoft's long-suffering stockholders - including current employees, former employees and myself, among others - feel a lot better.

Tuesday, June 30, 2009

Another item from the "That's obvious - in hindsight" dept.

Since upgrading to Safari 4 (why haven't you yet?), I ran into a problem with the single add-in that I've bothered keeping in Safari - PithHelmet. If you're familiar with AdBlock Plus on Firefox, you've got the basic idea; an add-in to your browser(s) of choice that lets you block advertising, annoying Flash, or pretty much anything else you can identify by file name (e.g., "*.swf") or by URL (e.g., "http://www.doubleclick.net"), and "magically" removes it from the final content displayed by your browser. This feature has gotten so popular that several browsers are building in more-or-less-competent versions of it by default.

Getting back to the problem... PithHelmet, the Safari ad blocker, was incompatible with Safari 4 because the framework it depended on, called SIMBL, has not had an update since October 2006 - which, as far as Safari or WebKit is concerned, equates to "forever". So I did a Google search for "greasekit safari 4", and then started whittling down the results (English language only, please, and only within the last year). Eventually, I ran across a forum post (which I have since lost) saying basically "Yes, PH crashes Safari 4 but has anybody else tried out Fanboy's AdBlock CSS sheet?"

Which, if you know anything about Web development, should have sent the palm of your hand rocketing toward your forehead in a major "D'oh!" moment. Of course! Why didn't I (or we all) think of that about 6 or 8 years back?

For those of you who aren't as knowledgeable about the detailed workings of your browser, let me explain. Every Web browser, probably since at least Netscape 1.0, has included support for "user style sheets"; where individual users (or organisations of such users) can choose to instruct their browsers to display certain specific content differently than it was originally specified. For this to work, the user in question (or someone s/he depends on) has to be very literate in HTML and particularly CSS, the languages of Web pages. To do simple blocking, like "block all SWF files", isn't hard with modern browsers, but the Web developers themselves can make life significantly easier by following modern "best practices". The "practice" particularly relevant here is "add 'id' attributes to all structural and semantic elements" (hmm; what's the difference between that and "all elements"? Another blog post...). If the developer does that, including for the body tag, then it's very easy for even a neophyte user to start filtering just what's wanted.... and learn something about how Web pages work into the bargain.

Thanks for reading - and commenting.

Saturday, June 27, 2009

Remember to test your testing tools!

I've been doing some PHP development lately that involves a lot of SPL, or Standard PHP Library exceptions. I do test-driven development for all the usual reasons, and so make heavy use of the PHPUnit framework. One great idea that the developer of PHPUnit had was to add a test-case method called setExpectedException(), which should eliminate the need for you (the person writing the test code) to do an explicit try/catch block yourself. Tell PHPUnit what you expect to see thrown in the very near future, and it will handle the details.

But, as the saying says, every blessing comes with a curse (and vice versa). The architecture of PHPUnit pretty well seems to dictate that there can only be one such caught exception in a test method. In other words, you can't set up a loop that will repeatedly call a method and pass it parameters that you expect it to throw on; the first time PHPUnit's behind-the-scenes exception-catcher catches the exception you told it was coming, it terminates the test case.

Oops. But if you think about it, pretty expectable (pardon the pun). For PHPUnit to catch the exception, the exception has to get thrown and unwind the call stack past your test-case method. That makes it very difficult (read: probably impossible to do reliably inside PHPUnit's current architecture) to resume your test-case code after the call that caused the exception to be thrown - which is what you'd want if you were looping through these things.

This leaves you, of course, with the option of writing try/catch blocks yourself - which you were hoping to avoid but which still works precisely as expected.
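
For what it's worth, the hand-rolled version is short. Here is a minimal sketch of the pattern, written in Python's unittest rather than PHP and PHPUnit purely for illustration; parse_port and its list of bad inputs are hypothetical stand-ins for whatever method you're actually testing.

```python
import unittest


def parse_port(value):
    """Hypothetical function under test: rejects anything outside 1..65535."""
    port = int(value)  # raises ValueError for non-numeric strings
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %r" % (value,))
    return port


class ParsePortTest(unittest.TestCase):
    def test_rejects_every_bad_input(self):
        bad_inputs = ["0", "-1", "65536", "http", ""]
        for value in bad_inputs:
            # Explicit try/except: the exception is caught inside the loop,
            # so the stack never unwinds past the test method and the loop
            # carries on with the next input.
            try:
                parse_port(value)
            except ValueError:
                continue
            self.fail("no exception for input %r" % (value,))
```

The point is simply that the catch happens inside your own loop body, so one test method can cover every bad input and still report exactly which one slipped through.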

Moral of the story: Beware magic bullets. They tend to blow up in your face when you least expect it.

Wednesday, May 27, 2009

News Flash: Microsoft Reinvents Eiffel, 18 Years On

One of the major influences on the middle third of my career thus far was Bertrand Meyer's Eiffel programming language and its concept of design by contract. With such tools, for the first time (at least as far as I was aware), entire classes of software defects could be reliably detected at run time (dynamic checking) and/or at compile time (static checking). I worked on a couple of significant project teams in the mid- to late '90s that used Eiffel quite successfully. Further, it impacted my working style in other languages; for several years, I had a reputation on C and C++ projects for putting far more assert statements than was considered usual by my colleagues. More importantly, it made me start thinking in a different way about how to create working code. Later, as I became aware of automated testing, continuous integration and what is now called agile development, they were all logical extensions of the principles I had already adopted.
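
For anyone who hasn't met design by contract, here's a rough sketch of the assert-heavy style I'm describing, written in Python purely for illustration (Eiffel builds the same ideas into the language itself with require and ensure clauses and class invariants). The Account class and its rules are hypothetical.

```python
class Account:
    """Hypothetical bank account with its contract spelled out as asserts."""

    def __init__(self, balance=0):
        assert balance >= 0, "precondition: opening balance must not be negative"
        self.balance = balance

    def withdraw(self, amount):
        # Preconditions: what the caller must guarantee.
        assert amount > 0, "precondition: amount must be positive"
        assert amount <= self.balance, "precondition: insufficient funds"

        old_balance = self.balance
        self.balance -= amount

        # Postcondition: what the method guarantees in return.
        assert self.balance == old_balance - amount, "postcondition: balance reduced by amount"
        # Invariant: must hold before and after every public call.
        assert self.balance >= 0, "invariant: balance never negative"
        return self.balance
```

A caller who breaks the contract (say, withdrawing more than the balance) gets stopped at the precondition, right where the mistake was made, rather than three modules later.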

This all happened over a period of 15 or so years, in a field where anyone with more than 2 or 3 years' experience is considered "senior". But for me, and most other serious practitioners who I knew and worked with, two to three years was really just about as long as it took to answer more questions than we raised. That, in most crafts, is considered one of the signs of becoming a journeyman rather than a wet-behind-the-ears apprentice.

Then, a few hours ago, I was reading a blog entry by one David R. Heffelfinger which mentioned a project at Microsoft DevLabs called "SmallBasic". Another project that the same organization developed is called "Code Contracts"; there's a nice little set of tools (which will be built into the upcoming Visual Studio 2010 product), and a nice introductory video. Watch the video (you'll need Silverlight to view it), and then do some research on Eiffel and design-by-contract and so on, and it's very difficult not to see the similarities.

So, on the one hand, I'm glad that .NET developers will finally be getting support for 20-year-old concepts (by the time significant numbers of developers use VS 2010 and .NET 4.0). Anything that helps improve the developer and user experiences on Windows (or, in fact, any platform) is by definition a Good Thing™.

On the other hand, I see more evidence of Microsoft's historical Not Invented Here mentality; beating the drum for "new and wonderful ideas for Windows development" that developers on other platforms have been using effectively for some time. While the Code Contracts project indirectly credits Eiffel - the FAQ page links to Spec# at Microsoft Research, which lists Eiffel as one of its influences - it would have been nice to see acknowledgement and explanation of precursor techniques be made more explicitly. Failure to do so merely reinforces the wisdom of Santayana as applied to software: "Those who cannot remember the past are condemned to repeat it", as well as "Fanaticism consists in redoubling your efforts when you have forgotten your aim." This last is something that we who wish to improve our craft would do well to remember.

What do you all think?

Thursday, May 14, 2009

Jaw-Droppers - Blast from the Past

Just when you thought it was safe to forget that the 1970s ever existed... this gem shows up on the XML Daily Newslink, a mailing list I follow intermittently. (Actually, this was included in the XMLDN from Wed 11 Feb - an indication of how "closely" I've been following lately.)

Developing a CICS-Based Web Service
G. Subrahmanyam, G. Mokhasi, S. Kusumanchi; SOA World Magazine

Web services have opened opportunities to integrate the applications
at an enterprise level irrespective of the technology they have been
implemented in. IBM's CICS transaction server for z/OS v3.1 can support
web services. It can help expose existing applications as web services
or develop new functionality to invoke web services. One of the commonly
used protocols for CICS web services is SOAP for CICS. It enables the
communication of applications through XML. It supports as a service
provider and service consumer independent of platform and language.
SOAP for CICS enables CICS applications to be integrated with the
enterprise via web services as part of lowering the cost of integration
and retaining the value of the legacy application. SOAP for CICS also
comes along with the implementation encoder and decoder.


"SOAP for CICS"? Give it a REST, guys. On the other hand.... preserving this by-now more-solid-than-most-rocks 1969-era technology IS a signal achievement; an example of engineering stability on par with Soyuz.

On the other hand... support for legacy technologies like this looks set to become increasingly expensive and risky over time, since apparently:
  1. Numerous surveys published in the last few years indicate that the sizable majority of "old mainframe" tech will have retired by 2010 (do the math on the years), and

  2. partly as a result of (1), users risk becoming increasingly dependent on outsourced Indian providers - hardly conducive to effective project control.

In my own professional view, technological preservation activities like this are mainly useful for one thing: they can serve as the pattern against which a re-implementation of the business process using more currently-supportable tech can be verified. And once an organization has done this for its most valuable legacy systems (which wouldn't have been preserved this long if they weren't so valuable), the actual and perceived risk of migrating to new technologies as conditions warrant drops dramatically. After all, if you (or your internal colleagues) have successfully brought your line-of-business systems from CICS via REST to, say, PHP or Java, you're a lot more comfortable with the idea of migrating to whatever the mid-to-trailing technology is in ten years' time - while the support costs of the (then) existing system are still manageable.

Are any of you actually involved in any technological archaeology like this? When I was younger, I used to brag that half the systems I'd worked with were older than I was, but that stopped being true about 1995. (AFAIK, I was one of the last guys to professionally touch a working IBM 360 mainframe, in about 1989.)

(Original article at this entry's title link, which is also here.)

Wednesday, May 06, 2009

Professionalism, Web development, and giving oxy to morons

Whereas a poor craftsman will blame his tools, poor tools will handicap even the most skilled craftsman.


As I insinuated in my previous post, I'm getting up to speed on the Zend Framework, the "900-kg elephant" of PHP application frameworks.

One major bone I have to pick with the ZF team is with regard to documentation: each time I've checked the site in the last couple of months, there's been an apparently current HTML version (now clocking in at some 300 HTML pages). There is also a PDF version, the promise of which is used as an enticement to register for their content distribution network (and, presumably, marketing info). As of this moment, however, the framework is at version 1.8.0, but the PDF version of the programmer's reference manual only covers version 1.6.0 (from September, 2008); some 12 releases earlier. It no longer fully matches the actual code, to the point where it is not difficult for a new developer to get deeply confused.

After spending a half-hour browsing the HTML version of the document, I am unable to find any declaration as to which version of the Framework is documented. However, the README.TXT file included with the source distribution states that it covers the 1.8 release, revision 15226, released on April 30, 2009. Classes which are listed in the README as being new, such as Zend_Filter_Encrypt, are documented in the HTML programmer's guide. Establishing a match between the (HTML) doc and the current code is non-trivial, however. While it may be argued that people unfamiliar with browsing a Subversion repository are not likely to be common within Zend's target audience, I would indirectly refute that: a product release, particularly one with a strong industry following, should be
  • properly documented;
  • easy for a (prospective) user to verify that he has the complete package; and
  • with a definite, intuitive learning curve.
In my view, the Zend Framework fails on at least two of these points. The assertion within large segments of the PHP community that it is the "gold standard" of PHP application frameworks should be a disturbing, cautionary omen: if Web development, particularly PHP development, wishes to be taken seriously by the software industry at large, then some major improvements and attitude shifts need to occur quickly, publicly and effectively. It is still far too easy for potential developers outside the "early-adopter" leading edge to scoff that PHP development (and, by extension, Web development as a whole) is still far too immature and amateurish to be taken seriously. As someone who has developed professionally in PHP for some ten years now, that is a disturbing state of affairs; one that I would love to see (and participate in) a free-ranging discussion of.

Thursday, April 16, 2009

OMFG, or Holy Deforestation, Batman!

As some of you know, I'm working on a book on Web development, using off-the-shelf tools (frameworks, template engines, JavaScript libraries, etc.) to leverage semantic, standards-compliant, accessible, search-friendly Websites. (That's more a matter of adjusting your development philosophy and workflow than anything else, but I digress). As part of that, I've been doing a (reasonably) comprehensive review of PHP 5 application frameworks. You might have heard of ezComponents or CakePHP, but the 900-kg elephant in the room is definitely the Zend Framework. It ships with the Dojo JavaScript toolkit, but doesn't make it excessively difficult to mix and match others (Scriptaculous, Prototype, jQuery, etc.) if desired.

And here's another reason to call ZF the '900-kg elephant' -- the programmer's reference guide (for version 1.7) weighs in at a <sarcasm>svelte</sarcasm> 1170 pages. Don't print this at home, folks. Better yet, just don't print it... either browse it online or download the PDF. Save a forest or six. For you old-timers, this will remind you quite a bit of US DOD Standard 2167A, "fondly" remembered as "documentation by the boxcar load".

Tuesday, February 24, 2009

Mac OS X is BSD Unix. Except when it's Different.

One of the things that a BSD Unix admin learns to rely on is the "ports" collection, a cornucopia of packages that can be installed and managed by the particular BSD system's built-in package manager: pkgsrc for NetBSD, pkg_add for FreeBSD, and so on. In nearly all BSD systems, the port/package manager is part of the basic system (akin to APT under Debian Linux).

Mac OS X is BSD Unix "under the hood," specifically Darwin and, indirectly, FreeBSD.

This provides the Mac user who has significant BSD experience with a nice, comfy security blanket. This blanket has a few stray threads, however. One of these is the package-management system and ports.

Software is customarily installed on Mac OS X systems from disk images, or .dmg files. When opened, these files are mounted into the OS X filesystem and appear as volumes, equivalent to "real" disks. The window that the Finder opens for that volume customarily contains an icon for the application to be installed and a shortcut to the Applications folder. Installation usually consists of merely dragging the application icon onto the shortcut (or into any other desired folder). Under the hood, things are slightly more complex, but two points should be borne in mind.

First, there is no true Grand Unified Software Manager in Mac OS that is comparable to APT under Debian Linux, or even the "Add or Remove Programs" item in Microsoft Windows' Control Panel. Uninstalling a program ordinarily consists of dragging the program icon to the Trash or running a program-specific uninstaller (usually found on the installation disk image).

Second, while there is a "ports" implementation for Mac OS X (MacPorts), it isn't a truly native part of the operating system. More seriously, the versions of software ports which are maintained in its ports list are not always the most recent version available from their respective maintainer. Installing an application via MacPorts, installing a newer version through the customary method, and attempting to use MacPorts to maintain the conflated software can quite easily introduce confusing disparities into the system, with potentially destabilizing effect.

Go back and read that last paragraph again, especially the final sentence. Mac OS X, meet BSD Unix. Touch gloves, return to your corners, and wait for the bell.

Most users will never run into any problems, simply because most Mac users make little or no use of MacPorts (or any other command-line-oriented system management tool). MacPorts users are (almost by definition) likely to be experienced Unix admins who pine for the centralized simplicity of their workhorse software-management system. Informal research suggests that many, if not most, MacPorts users are active developers of Unix and/or Mac software. In other words, we're all expected to be grown-ups capable of managing our own systems, trading away the soft, easy-to-use graphical installation for a wider variety of nuts-and-bolts-level packages.

So why is any of this a problem for me? Why am I consuming your precious time (and mine) blathering on about details which most interested people already know, and most who don't, probably aren't? As a bit of a public mea culpa and a warning to others to pay attention when mixing installation models.

I had previously installed version 8.2 of the PostgreSQL database server on my Mac via MacPorts. Returning to it later, I realized that the installation had not completely succeeded: the server was not automatically starting when I booted the system, and the Postgres tools were not in my PATH. After a bit of Googling, I came across a few links recommending the PostgreSQL 8.3.6 disk images on the KyngChaos Wiki. I installed those, but the server failed to start.

Remembering that I had previously installed 8.2 the "ports" way, I first uninstalled the newly-installed and -broken 8.3.6. I then ran port uninstall postgresql82 && port clean postgresql82 in an apparently successful attempt to clean up the preexisting mess, after which the KyngChaos disk images installed correctly and (thus far) work properly.

This once again points out the usefulness of keeping a personal system-management Wiki as a catchall for information like this - both to help diagnose future problems, and (especially) to avoid them altogether. These can be dirt simple; I use Dokuwiki, and have for over a year now. Forewarned (especially by yourself) is forearmed, after all.

Just thought I'd get this off my chest.

Tuesday, December 23, 2008

Maybe not eating 'crow', specifically, but..... DUCK!!!

As in, "bend over, here it comes again..."

One of the things I have greatly appreciated about the Mac, especially with OS X, is how simple and straightforward software management is, compared to Linux and especially Windows (where every system change is a death-defying adventure against great odds). Operating system or Apple-supplied apps need an update? Software Update is as painless as it gets: the defaults Just Work in proper Mac fashion, but you can set your own schedule, along with a few other options. There is a well-established convention for third-party apps to check for updates via a Web service "phoning home" at app startup; this has been very easy to deal with. Application and file layout is regular and sensible; libraries and resources are generally grouped in bundles at the system or user level. After a few years of DLL hell in Windows and library mix-and-match in Linux, this was shaping up to be a real pleasure.

Then, as some of you know, I updated Mac OS X on my iMac from 10.5.5 to 10.5.6. As expected, that apparently went as smooth as glass. I even blogged about it. Xcode worked; MS Office 2008 for the Mac worked; Komodo Edit worked; all my IM clients worked; all seemed customarily wonderful in the omniverse. I even started up Mail; it opened normally and happily downloaded my regular mail and Google mail, just as it had done every day for months. (I didn't actually open any messages then; that will turn out to be important.) Satisfied that everything Just Worked as always, I went back to working on a project for a few hours before turning in for the night.

Next morning, I went through the usual routine. Awake the Mac from hibernation; log in; start Yahoo, MSN and Skype; start Mail; open Komodo; open Web browsers (Safari, Opera and Camino) and I'm ready to get started. First thing...here's an interesting-sounding email message; let's open that up and... *POOF* - Mail crashes.

WTF? It started up just fine; I even got the "Message for you, Sir" Monty Python WAV I'd set Mail to use as my new-mail-received notification. I start Mail again. Picking a different message, I double-click it in the inbox. A window frame opens with the message title, sits empty for a few hundred milliseconds, then Mail goes away again. Absolutely, totally repeatable. Reboot changes nothing. Safe Boot (the Mac equivalent of Windows' "safe mode") changes nothing. The cold fingers of panic stroke my ribs like Glenn Gould at the piano. On a bad-karma scale of 0 to 10, initial reaction is an "O my God"; we're not dead, but we're hurt bad; the karma has definitely run over the dogma. 

The next couple of days are spent using my ISP's Webmail service, and a set of Python scripts I'd previously written to search mailbox contents - Apple Mail, like any sensible email program, adheres to established standard formats. If I'd been using Microsoft Lookout! in a similar situation, I'd have been up the creek.

Finally, I come across some Web-forum items that indicate that GPGMail needs to be updated; if it's not, Mail will crash under OS X 10.5.6 - which is exactly what was happening. (If you're not using GPGMail, GNU Privacy Guard, or any of the various GPG interfaces for Windows such as Enigmail for Mozilla Thunderbird, you don't know how many people are recording and/or reading your email - but if it transits a server in the US or UK, it's guaranteed that it will be.)

Installing the upgraded GPGMail bundle was the work of less than two minutes (hint: remove or rename the old bundle before copying the new one over. You probably don't need the insurance, but consider how we got here...). Then start up Mail as usual. It should, once again, Just Work - complete with being able to read and reply to messages, with or without GPG signatures.

OK, so what lessons can we take away from this experience, both as users and developers?

Time Machine may well be the single most rave-worthy piece of software I've touched in 30 years, but it can't (obviously, easily) do everything, and in a crisis, even experienced users may well not want to risk bringing too much (or too little) "back from history". There's definitely a market for addons to TM to do things like "look in my user and system library directories, the Application directory structure, MacPorts, etc., and bring application Foo back to the state it was in last Tuesday morning, but leave my data files as they are." I almost certainly could do that with the bare interface -- but, especially since it was "broken" as part of an OS upgrade, and (with the Windows/Linux experience fresh in mind) not comfortable exploring hidden dependencies... I was without my main email system for three days. Sure, I had workarounds -- that I wouldn't have had if I'd been in a stock Windows situation -- but that's not really the point, is it?

Also, app developers (Mac or other), add this to your "best practices" list: If your software uses any sort of plug-in/add-on architecture, where modules are developed independently of the main app, then you can have dependency issues. The API you make available to your plugin developers will change over time (or your application will stagnate); if you make it easy for them (and your users) to deal with your latest update, you'll be more successful. There are (at least) two ways to go about doing this:

The traditional "brute force" approach. Have a call you can use to tell plugins what version of the app is running, and allow them to declare whether or not they're compatible with that version. Notify the user about any that don't like your new version. For examples of this, see the way Firefox and friends deal with their plugins. Yes, it works, but it's not very flexible; a new version may come out that doesn't in fact modify any of the APIs you care about - which means that the plugin should work even though it was developed against version 2.4 of your app and you're now on 4.2.
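
A minimal sketch of that brute-force check might look like the following, in Python for illustration. The Plugin class, its fields and the dotted version strings are all hypothetical; this is not how Firefox or any other real application does it.

```python
def parse_version(text):
    """Turn '2.4' into (2, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


class Plugin:
    """Hypothetical plugin that declares the app versions it was built for."""

    def __init__(self, name, min_app, max_app):
        self.name = name
        self.min_app = parse_version(min_app)
        self.max_app = parse_version(max_app)

    def is_compatible_with(self, app_version):
        return self.min_app <= parse_version(app_version) <= self.max_app


def incompatible_plugins(plugins, app_version):
    """Return the names of the plugins to warn the user about."""
    return [p.name for p in plugins if not p.is_compatible_with(app_version)]
```

Note what happens to a plugin declared for versions 2.0 through 2.4 when the app reaches 4.2: it gets flagged no matter what, because a single version range is all the app has to go on.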

Alternatively, a more fine-grained approach. Group your API into smaller, functional service areas (such as, say, address-book interface or encryption services for an email program). Have your plug-in API support a conversational approach.

  1. The app calls into the plugin, asking it which services it needs and what versions of each it supports.

  2. The app parses the list it gets back from the plugin. If the app version is later than the supported range for a specific feature identified by the plugin, add that plugin to a "possibly unsupported" list. (If the app version is earlier than the range supported by the plugin, assume that it's not supported and go on to check the next one.)

  3. If the "possibly unsupported" plugin list is empty, go ahead and continue bringing up the app, loading the plugins normally; you're done with this checklist.

  4. For each item in the "possibly unsupported" list, determine whether the API for each feature required for the plugin has changed since the plugin was explicitly supported. (This is how a plugin for an earlier release, say 2.4, could work just fine with a later version, like 4.2.) If there's no change in the APIs of each feature required by the plugin, remove that plugin from the "possibly unsupported" list.

  5. If any plugins remain in the list, check if there's an updated version of that plugin on the Net. This might be done using a simple web-service-to-database-query on your Web server. If your Web server knows of an update, ask the user for permission to install it. If the user declines, or no upgrade is available, unload the plugin. (You'll check again next time the app is started; maybe there's an update by then.)

  6. Once the status of each plugin has been established, and compatible plugins loaded, finish starting up your app.
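The checklist above can be sketched in a few lines. This is one reading of it under stated assumptions, not a definitive implementation: `Plugin`, `Status`, `classify`, the `api_last_changed` table (service name to the app version in which that service's API last changed) and the `update_available` callback are all hypothetical names made up for the example. A plugin that needs a newer app than the one running is treated here as incompatible and sent to the update check.

```python
from dataclasses import dataclass
from enum import Enum

Version = tuple[int, int]  # (major, minor), e.g. (4, 2)


class Status(Enum):
    LOAD = "load"
    NEEDS_UPDATE = "needs update"
    UNLOAD = "unload"


@dataclass
class Plugin:
    name: str
    # Step 1: service name -> (oldest, newest) app version the plugin supports.
    requires: dict[str, tuple[Version, Version]]


def classify(plugin, app_version, api_last_changed, update_available):
    """Decide what to do with one plugin (steps 2 to 5 of the checklist)."""
    incompatible = False
    for service, (oldest, newest) in plugin.requires.items():
        if app_version < oldest:
            # Step 2: the app is older than anything the plugin supports.
            incompatible = True
        elif app_version > newest:
            # Steps 2 and 4: "possibly unsupported"; fine only if the
            # service's API has not changed since the plugin's newest version.
            # An unknown service is assumed to have changed.
            if api_last_changed.get(service, app_version) > newest:
                incompatible = True
    if not incompatible:
        return Status.LOAD
    # Step 5: offer an update if one exists, otherwise leave the plugin out.
    return Status.NEEDS_UPDATE if update_available(plugin.name) else Status.UNLOAD
```

With this bookkeeping, a plugin written against 2.4 loads cleanly on 4.2 as long as every service it asks for was last changed at or before 2.4. The cost is that the app has to record, per service, when its API last changed.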

Of course, there are various obvious optimisations and convenience features that can be built into this. Any presentation to the user can and likely should be aggregated; "here's a list of the plugins that I wasn't able to load and couldn't find updates for." Firefox and friends are a good open-source example of this. The checks for plugin updates can also be scheduled, so as not to slow down every app startup. This might be daily, weekly, twice a month, whatever; the important thing is to let the user configure that schedule and view a list of plugins that are installed but not active.

As I started this post by saying, I've been very favorably impressed by Mac apps' ease of use (including installation and maintenance). Mail fell down and couldn't get up again without outside assistance; this is unusual. The fact that this was caused by a plugin and that Mail could not detect and work around the conflict just amazes me; I expect more from Apple. I'm not ready to decrease my use of the Mac because this happened - but I am going to pay more attention to how things work under the hood. The fact that I have to even be aware of this -- which is one of the features that hitherto distinguished the Mac from the grubbier Windows and Linux alternatives -- is worrisome.

Again, your comments are welcome.

Wednesday, December 17, 2008

Things that make you go 'Hmmmm', continued

Very much picking up from the mindset expressed in my earlier post... with the knowledge that this could (and probably should) be broken up into at least three different rants...

I've been working heavily in Python for the past couple of months, regrettably letting some other projects slide a bit. Now done with that, I spent yesterday picking up where I'd left off in a moderately-sized, reasonably well-designed PHP 5.2 project. (Bear in mind that my PHP experience is easily 5 or 6 times as much as my Python.)

And... while I'm not ready to jump on the "PHP sucks" bandwagon, it does feel clunky. Occasionally obscure (though never up to the standards of obscurity a good Perl hacker deals with every day).

Why is this? Three years ago, Joel Spolsky wrote an excellent rant on The Perils of JavaSchools. His point essentially boils down to this: how you're trained (or "educated") as a developer shapes the way you look at problems; if all you know is a "Hammer", you try to visualize every problem as a "Nail" (even when it's a "Glass Figurine"). You may well have less-than-satisfying success with that view.

More importantly, the tools and techniques you know shape whether you can properly understand a problem at all. Not having certain features in a language (or not being knowledgeable in their use) means that you'll write clunky, hard-to-understand (and therefore -maintain) code to achieve the desired result...and wind up with (an attempt at) calculus using Roman numerals. Just as having a positional numeric system (e.g. Arabic or "modern" numerals) makes whole classes of problems possible that weren't otherwise, languages and their features make programming problems practical or more efficient.

What we don't want is one language that tries to do everything, in every way possible. We already have that; it's called Perl, and one ramification of its overriding philosophy, "There's more than one way to do it", is that there's always a better way to do it; the search for same can and often does suck in resources faster than a Sol-sized black hole. This also illustrates the failing of most of the more recent practitioners of software development that I've worked with. While those of us who've been working since before oh, about 1988 or 1990 generally make a point of reading at least one new technical book a month, I recently led a group of about 20 young (less than 5 years experience as of 2006) developers where not a single one admitted to reading more than one technical book a year since graduation. These people didn't know how to solve the problem we were working on because the two or three tools which they were familiar with encourage their users to think in ways that do not lead to effective solutions for this problem.

A language, any language, can really only do a limited number of things well. If you are fluent in more than one human language, say, English and Mandarin Chinese, think for a moment about concepts and sayings that are natural in one language but just don't work well in the other. Computer languages are like that, too, which is one reason why your computer's operating system is much less likely to be written in COBOL than your company's accounting program is (with benefits for all concerned).

Getting back to what started this rant....coming back to PHP after a sojourn in Python....

PHP gets the job done. Recent versions, particularly the current 5.2, are much more pleasant to work in than previous versions were for those of us who "think in objects". But it has taken years to get here.

As I look at one class in particular in this PHP project's code base, part of my mind is working on "if this were Python code, I'd write it like..." - and a two-hundred-line unit of code would be about 60 or 80, and much cleaner and easier to understand to boot. Why is that? Think "original intent".

PHP was originally developed as an adjunct to HTML for Web pages, to provide some simple dynamic content. It then "just growed", adding new features and capabilities (database access, object orientation) as it became used in a wider variety of problems. It is, quite simply, a tool that grew into a reasonably useful - if not quite general-purpose - language. There are a number of things it does quite well, especially since it can be used to do useful work without requiring a steep learning curve beforehand.

Python is different. A general-purpose language, with functional-programming features, it is useful for Web application development (e.g., with mod_python for the Apache Web server). Whereas Perl has the idea that "There's more than one way to do it" - and therefore no best way - Python argues that "there should be one—and preferably only one—obvious way to do it", whatever "it" is.

So am I suggesting that PHP developers ditch everything and go with Python? Of course not. What I am arguing is the seemingly quaint notion that developers, especially those who aspire to work in the craft for a living, should continually strive to learn new tools, techniques, languages and processes. Your abilities are like every other living thing; they're either growing, or they're dying.

Tuesday, December 16, 2008

Happy Updating....

If you're a Windows usee with a few years' experience, you've encountered the rare, monumental and monolithic Service Packs that Microsoft release on an intermittent basis (as one writer put it, "once every blue moon that falls on a Patch Tuesday"). They're almost always rollups of a large number of security patches, with more added besides. Rarely, with the notable (and very welcome at the time) exception of Windows XP Service Pack 2, is significant user-visible functionality added. Now that SP3 has been out for seven months or so, it's interesting to see how many individuals and businesses (especially SMEs) haven't updated to it yet. While I understand, from direct personal experience, the uncertainty of "do I trust this not to break anything major?" (that is, "anything I use and care about?"), I have always advised installing major updates (and all security updates) as quickly as practical. Given the fact that there will always be more gaping insecurities in Windows, closing all the barn doors that you can just seems the most prudent course of action.

I got to thinking about this a few minutes ago, while working merrily away on my iMac. Software Update, the Mac equivalent of Windows' Microsoft Update, popped up, notifying me that it had downloaded the update for Mac OS X 10.5.6, and did I want to install it now? I agreed, typed my password when requested (to accept that a potentially system-altering event was about to take place, and approve the action), and three minutes later, I was logged in and working again.

Why is this blogworthy? Let's go back and look at the comparison again. In effect, this was Service Pack 6 for Mac OS X 10.5. Bear in mind that 10.5.5 was released precisely three months before the latest update, and 10.5.0 was released on 26 October 2007, just under 14 months ago. "Switchers" from Windows to Mac quickly become accustomed to a more pro-active yet gentle and predictable update schedule than their Windows counterparts. The vast majority of Mac users whom I've spoken with share my experience of never having had an update visibly break a previously working system. This cannot be said for Redmond's consumers; witness the flurry of application and driver updates that directly follow Windows service packs. XP SP2, as necessary and useful as it was, broke more systems than I or several colleagues can remember any single service pack doing previously...by changing behavior that those programs had taken advantage of or worked around. Again, the typical Mac customer doesn't have that kind of experience. Things that work, just tend to stay working.

Contrast this with Linux systems, where almost every day seems to bring updates to one group of packages or another, and distributions vary wildly in the amount of attention paid to integrating the disparate packages, or at least ensuring that they don't step on each other. Some recent releases have greatly improved things, but that's another blog entry. Linux has historically assumed that there is reasonably competent management of an installed system, and offers resources sufficient for almost anyone to become so. Again, recent releases make this much easier.

Windows, on the other hand, essentially requires a knowledgeable, properly-equipped and -staffed support team to keep the system working with a minimum of trouble; the great marketing triumph of Microsoft has been to both convince consumers that "arcane" knowledge is unnecessary while simultaneously encouraging the "I'm too dumb to know anything about computers" mentality - from people who still pony up for the next hit on the crack pipe. Show me another consumer product that disrespects its paying customers to that degree without going belly-up faster than you can say "customer service". It's a regular software Stockholm syndrome.

The truth will set you free, an old saying tells us. Free Software proponents (contrast with open source software) like to talk about "free as in speech" and "free as in beer". Personally, after over ten years of Linux and twenty of Windows, I'm much more attracted by a different freedom: the freedom to use the computer as a tool to do interesting things and/or have interesting experiences, without having to worry overmuch about any runes and incantations needed to keep it that way.

Wednesday, December 03, 2008

Modern Tools and Archaic Practices Shouldn't Mix

Sun have released NetBeans 6.5, which, among many other (potentially) useful and interesting features, claims to officially support Web development using PHP. This is, on the face of things, a major improvement from the situation under NB 6.1 and prior, which treated PHP essentially as other unsupported languages were treated: you could do raw text editing, but the features that are the entire point of using an IDE - auto-completion, search/cross-reference, and so on - were completely absent. Not so in 6.5; at least minimal support for features like code completion, auto-display of PHPDoc during code entry, and so on can be found here. After a few minutes of poking around, I was starting to get optimistic; here was a decent, if somewhat more heavyweight, alternative to the Komodo Edit which I had been using for some months. Why look for alternatives when I was extremely happy with Komodo Edit for the Mac? Because, almost every day, I sat down in front of Komodo Edit for Linux, and became frustrated with the inconsistencies, limitations and general less-polished feel (Why can't ActiveState include KE for Mac key emulation along with vi and emacs?)

So, back to NetBeans and PHP. I spent a few minutes putting together toy code just to see how the editor felt. Then I created the really one-and-only sample PHP project that came with NB 6.5, a site for a fictional India-based budget airline. Go through the 'New Project' wizard, select the project type, the directory to be used to contain the entire thing (for development, at least), and hit The Magic "Finish" Button.

And, voilà, a new project is born:

At first blush, nothing too obviously catastrophic. Rather non-semantic names for the image files, and the files under 'include' generally presume that you'll only ever need one nav bar, for example, but hey, it's a sample project, I tell myself. It's not necessarily meant to be production-quality; it's supposed to give you a starting point to either see how to use NetBeans to work in the PHP you already know, or how to use this PHP that's all over the Web in the NetBeans you've been using earlier versions of.

And then I double-click on the index.php file in the Projects pane. And my jaw hits the floor as I see... 1996-ASP-style intermingling of PHP code and raw HTML. OK, the DTD is from 1999 and the PHP code uses superglobals, which date from 2001, but you get the idea.

We've spent the better part of a decade, as a craft, running screaming away from this style of work. No sane, experienced PHP developer would write code like this today; we may not have (quite) advanced to the point where "everybody" uses the same tools for similar projects, but separation of presentation (HTML) and logic (PHP) is pretty universally seen as not just a Good Thing® but a Necessary Thing® if the site is ever going to be debugged/maintained. There are just so many problems that commingled code and markup create, unnecessarily, in living code. I'm well aware that a 'toy' example for a general-purpose editor-with-benefits cannot (and arguably should not) try to teach tyros the basics of the language in question.

But is it really too much to ask that such an example be written in a reasonably modern and correct style, or at least carry big red (say, 144-point Comic Sans) warnings to the effect of "DANGER: If you don't know why this is horrible practice, please go buy a book! The job you save may well be your own"?

Still using the free Komodo Edit on the Mac, trying to justify shelling out for the "real" Komodo IDE... but that's a deliberation for another post.

Sunday, October 12, 2008

Things that make you go 'Hmmmm'

...or 'Blechhhh', as the case may be...

I've been using PHP since the relative Pleistocene (I recently found a PHP3 script I wrote in '99). I've been using and evangelising test-driven development (TDD) for about the last five years, usually with most such work being done in C++, Java, Python or other traditionally non-Web languages (with PHP really only being amenable to that since PHP 5 in 2004).

So here I am, puttering away on a smallish PHP project that I've decided to TDD from the very beginning. For one of the classes, I throw together a couple of simple constructor tests in PHPUnit, to start, such as:
require_once( 'PHPUnit/Framework.php' );

require_once( '../scripts/foo.php' );

class FooTest extends PHPUnit_Framework_TestCase
{
    // Construct with a plain file name.
    public function testCanConstructBasic()
    {
        $Foo = new Foo( 'index.php' );
    }

    // Construct with a wildcard pattern.
    public function testCanConstructBasicWildcard()
    {
        $Foo = new Foo( '*.php' );
    }
};
And, as is right and proper, I code the minimal class necessary to make that pass:
class Foo
{
};
That's it. That's really it. No declaration whatever for the constructor or any other methods in the class. Since it doesn't subclass something else, we can't just say "oh, there might be a constructor up the tree that matches the call semantics."  PHPUnit will take these two files and happily pass the tests.

I understand what's really going on here - since the class is empty, you've just defined a name without defining any usage semantics (including construction). I would say fine; not a problem. But I would think that PHPUnit should, if not give an error, then at least have some sort of diagnostic saying "Hey, you're constructing this object, but there are no ctor semantics defined for the class." I can see people new to PHP and/or TDD, who are maybe just working through and mentally adapting an xUnit tutorial from somewhere, getting really confused by this. I know I did a double-take when I opened the source file to add a new method (to pass a test not shown above) and saw nothing between the curly braces. On one level, very cool stuff. On another, equally but not always obviously important level, more than enough rope for you to shoot yourself in the foot.

Or, to put it another way, even though I've been writing in dynamic languages off and on for ages, I still tend to think in incompletely dynamic ways. Sometimes this comes back and bites me.  Beware: here be (reasonably friendly, under the circumstances) dragons.
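For contrast, here is the same experiment as a minimal sketch in Python rather than PHP (the helper `try_construct` is a hypothetical name for this example). An equally empty Python class accepts a no-argument construction but rejects one with arguments, which is roughly the diagnostic wished for above.

```python
class Foo:
    """Deliberately empty, like the PHP class in the post."""


def try_construct(*args):
    """Return the new Foo, or the TypeError raised while constructing it."""
    try:
        return Foo(*args)
    except TypeError as error:
        # An empty class inherits object's constructor, which takes no arguments.
        return error
```

Calling `try_construct()` hands back a `Foo`; calling `try_construct('index.php')` hands back the `TypeError` instead, so a constructor test like the ones above would fail until the class actually declared a constructor.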

Friday, August 15, 2008

( C/C++ != C) && (C/C++ != C++)

A thought which ran through my mind as I was browsing some job requirements recently...

Why are recruiters still hung up on "C/C++", years after even Microsoft got around to shipping a reasonably compliant compiler (depending on your prejudices and code needs, anywhere from Visual Studio 6 in 1998 to VS.NET 2003)?

"C/C++" started life (or zombiehood) as a Microsoft marketing term back in the late 1980s with the release of Version 7 of their C compiler, which included "some C++ features". MSC 7 wasn't a "real" C++ compiler, but companies such as Borland (now CodeGear), Watcom (now part of Sybase), IBM and others, were shipping compilers that implemented the bulk of the (then-) Draft Standard in a (largely) portable, consistent fashion, so Microsoft was able to muddy the waters by calling their product "C/C++", secure in the knowledge that many of their customers had too little C++ experience to see through the marketing.

Incidentally, this (non-Microsoft) competitive innovation spurred numerous advances, such as Alexander Stepanov's (of AT&T, later at HP) Standard Template Library (STL). Microsoft, in response, introduced a "Container Class Library" which was in practice quite inferior (since it required contained objects to be derived from the Microsoft Foundation Class library's CObject class and, if memory serves, did not support either multiple inheritance or thread safety). Since Microsoft's compilers at the time did not properly support important Standard C++ features such as templates and runtime type information (RTTI) that were needed for the STL, the compiler defects created market opportunities for companies like Rogue Wave and Dinkumware to create products with similar but not identical function.

Timewise, this was when Microsoft was starting to really push developer lock-in - the practice of introducing non-standard and/or proprietary "features" which were made central to the development process. Despite the existence of numerous superior (in design, function and in productivity) class libraries such as Borland's ObjectWindows Library, Inmark's zApp library, the previously-mentioned Rogue Wave toolkits, and others, Microsoft's MFC carved out huge market share and mindshare, largely because:
  • it was bundled with the Microsoft C ("C/C++") compiler;
  • its limitations and defects mapped most closely to those of the underlying compiler;
  • it came with a primitive but usable GUI builder, for "click-and-drool" development; and
  • it was relentlessly praised by the Microsoft-beholden tech press of the day.
That last point should never be underestimated; publishers of less-than-laudatory articles, such as C Users Journal and Will Zachmann (when he was writing for PC Magazine) would find themselves cut off from Microsoft's press briefings, rumor mill and other means of "keeping up with the competition". This was meant as punitive, to "hurt" the "offenders"...who promptly wrote up the entire sordid affair, built a certain amount of loyal sympathy from the industry grass-roots, and survived quite well, thank you very much.

Getting back to "C/C++"... the term was a marketing fix to a technical problem which rapidly gained "mindshare" with its intended audience: marginally to non-technical people (senior managers, HR people, etc.) who wanted or needed to sound technically knowledgeable. Microsoft was able to play on their lack of real language knowledge coupled with follow-the-herd instincts to help force adoption in enterprises, from the top down. While this helped to increase sales, and helped preserve Windows' market share and lock-in in the enterprise for nearly two decades, it seriously retarded the take-up of standard, portable C++ in the industry (as intended). It also gave companies like ParcPlace (with Smalltalk) and NeXT, later Apple (with Objective-C) incentives to use "alternative" languages, either to gain some "control over their own destiny" independent of a competitor, or simply because C++ at the time was not up to the tasks which they wanted to accomplish.

In any event, by around 2000 (plus or minus a half-decade), Microsoft had caught up with where the rest of the industry had been for a decade or so (bringing serious, proprietary backward-compatibility baggage along with them). The marketing need for the 'C/C++' Newspeak was gone - but the corporate world that had learned the newfangled technical language back in the day was still in place, bound only by the Peter Principle (whose bar, thanks to the new technology throughout the enterprise, had been set depressingly high). Consequently, you still run across job ads with text like this (from the Singapore Straits Times of 13 August 2008):

C/C++ EMBEDDED SOFTWARE Engr. Contract. Call 6xxx7085

Truly informative about the needs; at first blush, seemingly written by a completely non-technical HR person. (I didn't follow up the advertisement to actually verify this, however).

What's the point of this whole rambling rant? To try to impress upon you, my half-dozen Loyal Readers, a technical truism that has been around as long as there have been technical gadgets: "80% of what you know will be obsolete in n months; the other 20% will never be obsolete. Using that 80% beyond its shelf life just makes you look silly." Or, if not 'silly', then at least 'locked in to an out-of-date technology or idea.' And that, with very high likelihood, does not deliver a competitive advantage to your organization.

Tuesday, August 12, 2008

Test Infection Lab Notes

In a continuing series...

As current and former colleagues and clients are well aware, I have been using and evangelizing test-driven development in one flavor or another since at least 2001 (the earliest notes I can find where I write about "100% test coverage" of code). To use the current Agile terminology, I've been "test-infected".

My main Web development language is PHP 5.2 (and anxiously awaiting the goodness to come in 5.3), using Sebastian Bergmann's excellent PHPUnit testing framework. PHPUnit uses a well-documented convention for naming test classes and methods. One mistake often made by people in a hurry (novices or otherwise) is to neglect those conventions and then wonder why "perfectly innocuous" tests break. I fell victim to this for about ten minutes tonight, flipping back and forth between test and subject classes to understand why PHPUnit was giving this complaint:
There was 1 failure:
1) Warning(PHPUnit_Framework_Warning)
No tests found in class "SSPFPageConfigurationTest".

FAILURES!
Tests: 1, Failures: 1.
about this code:
class SSPFPageConfigurationTest extends PHPUnit_Framework_TestCase
{
    public function canConstruct()
    {
        $Config = new SSPFPageConfiguration();
        $this->assertTrue( $Config instanceof SSPFPageConfiguration );
    }
};
which was "obviously" too simple to fail.

The wise programmer is not afraid to admit his errors, particularly those arising from haste. The novice developer proceeds farther on the path to enlightenment; the sage chuckles in sympathy, thinking "been there, done that; nice to be reminded that other people have, too".

May you do a better job of keeping your koans in a nice, neat cone.
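For anyone who wants the punchline spelled out: xUnit-style frameworks find tests by name, and PHPUnit only collects methods whose names start with "test" (or that carry an @test annotation), so canConstruct() above is never run. Python's unittest follows the same prefix convention, which makes for a short sketch of the trap. `PageConfiguration` and `discovered_test_names` are hypothetical stand-ins for this example, not code from the project.

```python
import unittest


class PageConfiguration:
    """Hypothetical stand-in for the class under test."""


class PageConfigurationTest(unittest.TestCase):
    # Not collected: the name lacks the "test" prefix the loader looks for.
    def can_construct(self):
        self.assertIsInstance(PageConfiguration(), PageConfiguration)

    # Collected: same body, conventional name.
    def test_can_construct(self):
        self.assertIsInstance(PageConfiguration(), PageConfiguration)


def discovered_test_names(case_class):
    """Method names that unittest's default loader would run for this class."""
    return list(unittest.TestLoader().getTestCaseNames(case_class))
```

Here `discovered_test_names(PageConfigurationTest)` lists only `test_can_construct`; the misnamed method is silently skipped, much as PHPUnit reported no tests in the class.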

Wednesday, July 23, 2008

Differences that Make Differences Are Differences

(as opposed to the Scottish proverb, "a difference that makes no difference, is no difference")

This is a very long post. I'll likely come back and revisit it later, breaking it up into two or three smaller ones. But for now, please dip your oar in my stream of consciousness.

I was hanging around on the Freenode IRC network earlier this evening, in some of my usual channels, and witnessed a Windows zealot and an ABMer going at it. Now, ordinarily, this is as interesting as watching paint dry and as full of useful, current information as a 1954 edition of Правда. But there was one bit that caught my eye (nicknames modified for obfuscation):
FriendOfBill: Admit it; Microsoft can outmarket anybody.
MrABM: Sure. But marketing is not great software.
FriendOfBill: So?
MrABM: So... on Windows you pay for a system and apps that aren't worth the price, on Linux you have free apps that are either priceless or worth almost what you pay (but you can fix them if you want to), and on the Mac, you have a lot of inexpensive shareware that's generally at least pretty good, and commercial apps that are much better. THAT's why Microsoft is junk... they ship crap that can't be fixed by anyone else.
FriendOfBill: So you're saying that the Linux crap is good because it can be fixed, and the Mac being locked in is OK because it's great, but Windows is junk because it's neither great nor fixable?
MrABM: Exactly. Couldn't have said it better myself.


Now...that got me to thinking. Both of these guys were absolutely right, in my opinion. Microsoft is, without question, one of the greatest marketing phenomena in the history of software, if not of the world. But it is unoriginal crap. (Quick: Name one successful Microsoft product that wasn't bought or otherwise acquired from outside. Internet Explorer? Nope. PowerPoint? Try again.) Any software system that convinces otherwise ordinary people that they are "stupid" and "unable to get this 'computer' thing figured out" is not a net improvement in the world, in my view. I've been using and developing for Windows as long as there's been a 'Windows'; I think I've earned the opinion.

Linux? Sure, which one? As Grace Hopper famously might have said, "The wonderful thing about standards is that there are so many of them to choose from." (Relevant to The Other Side: "The most dangerous phrase in the language is, 'We've always done it this way.'") As can be easily demonstrated at the DistroWatch.com search page, there are literally hundreds of active "major" distributions; the nature of Free Software is such that nobody can ever know with certainty how many "minor" variants there are (the rabbits in Australia apparently served as inspiration here). Since every distribution has, by definition, some difference with others, it is sometimes difficult to guarantee that programs built on one Linux system will work properly on another. The traditional solution is to compile from source locally with the help of ingenious tools like autoconf. Though this (usually) can be made to work, it disproportionately rewards deep system knowledge to solve problems. The "real" fix has been the coalescence of large ecosystems around a limited number of "base" systems (Debian/Ubuntu, Red Hat, Slackware) with businesses offering testing and certification services. Sure, it passes the "grandma test"....once it's set up and working.

The Macintosh is, and has been for many years, the easiest system for novice users to learn to use quickly. Part of that is due to Apple's legendary Human Interface Guidelines; paired with the tools and frameworks freely available, it is far easier for developers to comply with the Guidelines than to invent their own interface. The current generation of systems, Mac OS X, is based on industry-standard, highly-reliable core components (BSD Unix, the Mach microkernel, etc.) which underpin an extremely consistent yet powerful interface. A vast improvement over famously troubled earlier versions of the system, this has been proven in the field to be proof against most "grandmas".

A slight fugue here; I am active in the Singapore Linux Meetup Group. At our July meeting, there was an animated discussion concerning the upcoming annual Software Freedom Day events. The question before the group was how to organize a local event that would advance the event's purpose: promoting the use of free and open source software for both applications and systems. What I understood the consensus to be basically worked out as "let's show people all the cool stuff they can do, and especially let's show them how they can use free software, especially applications, to do all the stuff they do right now with Windows." The standard example is someone browsing the Web with Firefox instead of Internet Explorer; once he's happy with replacement apps running under Windows, it's easier to move to a non-Windows system (e.g., Linux) with the same apps and interface. That strategy has worked well, particularly in the last couple of years (look at Firefox itself and especially Ubuntu Linux as examples). The one fly in the ointment is that other parts of the system don't always feel the same. (Try watching a novice user set up a Winprinter or wireless networking on a laptop.) The system is free ("as in speech" and "as in beer") but it is most definitely not free in terms of the time needed to get things working sometimes... and that cannot always be predicted reliably.

The Mac, by comparison, is free in neither sense, even though the system software is based on open-source software, and many open-source applications (Firefox, the Apache Web server) run just fine. Apache, for instance, is already installed on every current Mac when you first start it up. But many of the truly "Mac-like" apps (games, the IRC program I use, a nifty note organizer, and so on) are either shareware or full commercial applications (like Adobe Photoshop CS3 or Microsoft Word:mac). You pay money for them, and you (usually) don't get the source code or the same rights that you do under licenses like the GNU GPL.

But you get something else, by and large: a piece of software that is far more likely to "just work" in an expectable, explorable fashion. Useful, interesting features, not always just more bloat to put a few more bullet items on the marketing slides. And that gives you a different kind of freedom, one summed up by an IT-support joke at a company I used to work for, more than ten years ago.
Q: What's the difference between a Windows usee and a Mac user?
A: The Windows usee talks about everything he had to do to get his work done. The Mac user...shows you all the great work she got done.
That freedom may be neither economic nor ideological. But, especially for those who feel that the "Open Source v. Free Software" dispute sounds like a less entertaining Miller Lite "Tastes Great/Less Filling" schtick, for those who realize that the hour they spend fixing a problem will never be lived again, this offers a different kind of freedom: the freedom to use the computer as an appliance for interesting, intellectually stimulating activity.

And having the freedom to choose between the other, seemingly competing freedoms... is the greatest of these.

Tuesday, July 22, 2008

Best Practices Alleged; Your Mileage May Vary

Yahoo! quite often releases interesting/useful/thought-provoking tools for people doing "serious" Web development. I add the modifier to specify that we're usually not talking about the Joe Leet three-page magnum oopus; a lot of what they do and talk about really only pays huge returns when you work with a site as large and complex as, well, Yahoo!.

Recently, they brought out a couple of nifty tools that integrate into the Firefox browser's Firebug Web-developer-Swiss-Army-knife extension. One of these, YSlow ("why [my site] slow?") does some interesting evaluations and calculations against whatever page (with secondary requests) you throw it at. Its "Performance" tab shows how your page matches up against Yahoo!'s new "Best Practices for Speeding Up Your Web Site." At first blush, a lot of these make perfect sense; "Avoid Redirects", "No 404s", and so on. YSlow, on the other hand, evaluates against a slightly different set of guidelines to those on the Best Practices Page:

1. Make fewer HTTP requests
2. Use a CDN
3. Add an Expires header
4. Gzip components
5. Put CSS at the top
6. Put JS at the bottom
7. Avoid CSS expressions
8. Make JS and CSS external
9. Reduce DNS lookups
10. Minify JS
11. Avoid redirects
12. Remove duplicate scripts
13. Configure ETags

"Huh?", our hypothetical Web pseudogod Mr Leet might well ask. "What the heck is an 'ETag'? Or a 'CDN'? Does any of this even apply to me?" Well, Joe, yes and no. For instance, content-delivery networks like Akamai or ATDN, as you might well know by hearing the names, scatter servers at strategic places around the planet with the aim of reducing the time it takes to get data from huge, media-content-heavy sites like CNN.com or the like, down to your browser at the end of a surprisingly long chain. Does everybody who puts a site up need something like this? Does the average small-to-midsize business? Usually not, unless you really are a Web Hype-Dot-Oh site that shoves exabytes out every day to wow the yokels or the investors. For the local pizza joint with a site containing maybe forty files, tops, with a couple of megabytes of images, a CDN is thermonuclear overkill. As many Web-development sites have pointed out for the last decade, there's quite a bit you can do to speed things up and lower bandwidth usage without spending the big bucks on this.
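To put a rough shape on "quite a bit you can do": here's a minimal sketch, in Python, of what two of the cheaper items on that list (rules 3 and 4) amount to. Nothing here comes from YSlow or Yahoo!; the function names, the 30-day lifetime and the sample stylesheet are all my own assumptions for illustration, and in real life you'd normally set these in your Web server's configuration rather than in code.

```python
import gzip
import time
from email.utils import formatdate


def expires_header(days_ahead):
    """Build an Expires header value `days_ahead` days from now (HTTP-date, GMT)."""
    return formatdate(time.time() + days_ahead * 86400, usegmt=True)


def gzip_savings(body):
    """Return (original_size, compressed_size) in bytes for a response body."""
    return len(body), len(gzip.compress(body))


# Stand-in for a small site's stylesheet; the content is made up.
css = b"body { margin: 0; padding: 0; font-family: sans-serif; }\n" * 200

print("Expires:", expires_header(30))
print("bytes before/after gzip:", gzip_savings(css))
```

Repetitive text like CSS, HTML and JavaScript tends to compress very well; already-compressed images generally don't, which is why the pizza joint's JPEGs are better served with a far-future Expires header than with gzip.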

Why do I blather on about this when I started talking about best practices and YSlow? Because for practices to be "best", they first and foremost have to be appropriate for the use at hand. Buying a Lamborghini Countach to go down to the corner store for some sodas will quite likely get you yelled at by the Significant Other (followed by your bank). But if Lewis Hamilton showed up at pole position in a '72 Ford Pinto... you'd hear the laughter from St Paul to São Paulo.

Use the tools and techniques appropriate to the task at hand. There's a lot that small Website developers can learn from Yahoo! and the tools they publish. Getting an "A" score has a certain karmic appeal, and most of the optimizations required are straightforward anyway (tweaking how your Web server serves your data, for the most part). But is this worth all the geek love it's been getting?

Until someone with the developer credibility and experience of a Yahoo! stands up and explains a better set of practices for the SMB developer, the answer seems to be "yeah, probably". We who make our living (or our diversion) from the creation, care and feeding of Web sites are, for the most part, artisans posing as engineers, with inconsistent knowledge or practice of our craft; we dream of building the online equivalent of the Empire State Building but wind up with the Cologne Cathedral; a wonder, yes, but surely 600 years was well beyond the original estimated schedule! Agreed-upon standards (so that, say, a page appears identical in different browsers); a shared, common body of knowledge; even (gasp!) widespread, vendor-neutral certifications of professional competence will eventually become common in software (including Web) development for the same reasons as, say, in architecture. The artifacts involved (skyscrapers, Web sites) have important social and policy implications, and inconsistent competence in practice poses a real and serious danger to the public at large. Sooner or later, it's going to be uneconomic for the present ad hoc system to advance the state of the art, or to meet the needs placed upon its products.

Best practices are good; best practices that actually work for the stated purposes in a broad variety of praxis are much better. But to get there, we're going to need to collaborate and communicate effectively, and to do that, we're going to have to make sure everybody involved is speaking the same language to describe the same things. If we don't, we'll continue to be stuck in pretty much the same place we are now - with a bunch of shade-tree mechanics running around in the pits at the Monaco Grand Prix...only doing a lot more damage.

Comments are welcomed, as always.

Friday, July 11, 2008

Does anybody else have a problem with this?

If you've got an ssh connection to a Debian or Ubuntu Linux box handy, and you have sudo privileges on that box, try this little experiment:
  1. ssh to your box as an ordinary user;
  2. sudo su to get a root prompt (you should be asked for your password - this is important);
  3. as soon as you get the root prompt, exit back to normal user, then exit your ssh session entirely.
Now, here's the scary part:
  1. ssh to that same box again right away, as the same user;
  2. sudo su to get a root prompt again.
Why is this scary? Because the second time you ask for a root prompt, you're not prompted for a password. (sudo remembers a successful authentication for a few minutes, and that memory evidently outlives the ssh session it was made in.) This means that, not only does the actual Linux box require access and user security appropriate to its function, but so does every device that can ssh into it with a rootable user!

I'm sure this isn't in any way new, but in 10+ years of using Linux, I just now encountered that scenario for the very first time. As Linux is becoming more popular, and more users are marching up the 'power user' scale, this is something that should be paid attention to - especially in a business environment. Yowza!

Thursday, July 10, 2008

Standard Standards Rant, Redux: Why the World-Wide Web Isn't "World-Wide" Any More

The "World Wide Web", to the degree that it was ever truly universal, has broken down dramatically over the last couple of years, and it's our mission as Web development professionals to stand up to the idiots that think that's a Good Thing. If they're inside our organization, either as managers or as non-(Web-)technical people, we should patiently explain why semantic markup, clean design, accessibility and (supporting all of the above) standards compliance are Good for Business. (As the mantra says, "Google is your most important blind customer," because your prospective customers who know what they're looking for but don't yet know who they're buying it from find you that way.) Modern design patterns also encourage more efficient use of bandwidth (that you're probably paying for), since there's less non-visible, non-semantic data in a properly designed nest of divs than in an equivalent TABLE structure. Modern design also encourages consistent design among related pages (one set of stylesheets for your entire site, one for your online product-brochure pages, and so on). Pages that look like they're related and are actually related reassure the user that he hasn't gotten lost in the bowels of your site (or strayed off into your competitor's). It's easier to make and test changes that affect a specified area within your site (and don't affect others). It's easier to add usability improvements (such as letting users control text size) when you've separated content (XHTML) from presentation (CSS and, in a pinch, JavaScript). Easier-to-use Web sites make happier users, who visit your site more often and for longer periods, and buy more of your stuff.

Experienced Web developers know all this, especially if they've been keeping up with the better design sites and blogs such as A List Apart. But marketing folks, (real) engineers and sales people don't, usually, and can't really be expected to -- any more than a typical Web guy knows about internal rate of return or plastic injection molding in manufacturing. But you should be able to have intelligent conversations with them, and show them why 1997 Web design isn't usually such a good idea any more. (For a quick Google-eye demo, try lynx.) Management, on the other hand, in the absence of PHBs and management by magazine, should at least be open to an elevator pitch. Make it a good one; use business value (that you can defend as needed after the pitch).

That's all fine, for dealing with entrenched obsolescence within your own organization. What about chauvinism outside - from sites you depend on professionally, socially or in some combination? For years, marginalized customers have quietly gone elsewhere, with at most a plaintive appeal to the offenders, pointing out that a good chunk of Windows usees don't browse with Internet Explorer anymore (check out the linked article; a major business-tech Website from 2004(!!); the arguments are much stronger now). But some companies, particularly Microsoft-sensitive media sites like CNet and its subsidiary ZDNet, still don't work right when viewed with major non-Windows browsers (even when the same browser, such as Opera or Safari, works just fine with that site from Windows). And then there are the sites for whom their Web presence is the entire company, but they haven't yet invested the resources into competent design required to take their site construction from a point-and-drool interface virtually incapable of producing standards-compliant work, and instead present a site that a) actively checks for IE and snarls at you if you're using anything else, and b) has their design so badly broken and inaccessible that people stay away in droves. (Yes, I'm looking at you - every click opens a new window).

When we encounter Web poison like this, we should take the following actions:
  • Notify the site owner that you will be using a better (compatible, accessible, etc.) site, with sufficient details that your problem can be reproduced (flamemail that just says "Teh site sux0rs, d00d!" is virtually guaranteed to be counterproductive);
  • When you find an acceptable substitute, let that site's owners know how they earned your patronage. Send a brief thank-you note to one or two of their large advertisers (if any), as well as to the advertisers on the site you've left (if you know any). Politely thank them for supporting good Web sites, or remind them why their advertising won't be reaching you anymore (as appropriate);
  • Finally, there really ought to be a site (if there isn't already) where people can leave categorized works/doesn't-work-for-me notes about sites they've visited. This sounds an awful lot like the original argument for Yahoo!; I can see where such a review site would either die of starvation or grow to consume massive resources. But praise and shame are powerful inducements in the offline world; it's long past time to wield them effectively online.
I'm sure that there are literally millions of sites with Web poison out there, and likely several "beware" sites as well. For the record, the two that wasted enough of my week this week to deserve special dishonor are ZDNet and JobStreet. Guys, even Microsoft doesn't lock people out and lock browsers up the way you do; I can browse MSDN and Hotmail just fine on my Mac, on an old PC with Linux, or on an Asus Eee. And if you need help, I and several thousand others like me are just an email away. :-)

Wednesday, July 02, 2008

It's easy to think there's a war going on...

(playing softly, in the background of my mind, The Beatles' Revolution)

....between the Web developers promoting nice, clean development with RESTful, semantic (X)HTML judiciously enhanced with CSS and JavaScript (henceforth often referred to as the "Army of Light") and those using "popular", "mainstream" frameworks such as CakePHP and the Zend Framework, who route everything through a Front Controller of some sort, and often seem to be in the dubious company of WS-Whatever Web services (which, I am reliably told, provide ample amounts of The Wrong Kind of job security - they know more about your app than you do, and aren't telling what they know). The sides would seem to be pretty cut-and-dried, judging from a lot of the blog activity (Google REST XML-RPC PHP to get a million and a half or so hits of light reading material). Except...

Briefly skimming through the Zend Framework documentation, for instance, and looking at the QuickStart and tutorials reinforces the idea that URL handling is routed through a front controller to an application-specific action controller, which is the C in the notorious (and some say overused) MVC (model-view-controller) framework. Originally developed to help improve desktop-application development, particularly in languages like Java and Smalltalk, it became popular for Web development because.... it seemed like a good idea at the time. Actually, for Web development in the Pleistocene (say, late-1990s), it was a good idea. Anything that cut through the estimated 27.612 interconnected details that needed to be simultaneously mastered to get a "Hello, World" EJB up and happy was, by its very existence, a Very Good Thing. And so, when shops moved to more productive, less pathologically irrational development systems than J2EE, the models and design patterns that had saved their bacon were brought over into the New World, to maintain conceptual touchstones that helped Useful Work Get Done. Happiness abounded throughout the realm, until apps started outgrowing the meager bounds of static HTML and became "Rich Internet Applications". (To the tune of "Lions and Tigers and Bears, Oh My!", you hear faint murmurs of "AJAX and WS-* and REST, Oh My!") And, to pile on the snowclones, there really be dragons there.

'Dragons' in the form of falling into a GET-centric, action-oriented, everything-just-a-click-away world of convoluted Web apps with limited (re)usability and even less understandability to those who haven't swum there in some time. The entire promise of REST is simple: by centering applications around resources rather than actions (through the use of URIs, Uniform Resource Identifiers) and following the eminently sensible notion of not putting kilobytes of state information into those URIs (necessary information is POSTed along with the URI request), many problems that become painfully visible in large systems simply go away. (Try sending a link to a cool book you found on Amazon over an instant messenger chat.)
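A small sketch of the difference, with the caveat that both URLs are invented (shop.example isn't anybody's real site) and the parameter names are just the sort of thing you see in the wild: pull each URL apart with Python's standard library and look at what it names versus what state it drags along.

```python
from urllib.parse import urlsplit, parse_qs

# Both URLs are hypothetical, made up purely for illustration.
ACTION_STYLE = ("http://shop.example/index.php?action=showBook&id=42"
                "&session=ab12cd34&ref=search_results&page=3")
RESOURCE_STYLE = "http://shop.example/books/42"


def describe(url):
    """Return (path, sorted query keys): what the URL names, and what rides along."""
    parts = urlsplit(url)
    return parts.path, sorted(parse_qs(parts.query))


for url in (ACTION_STYLE, RESOURCE_STYLE):
    path, state_keys = describe(url)
    print(path, state_keys)
```

The first URL names a script and leaves everything else, including somebody's session, in the query string; the second names the book and nothing more, which is the one you can safely paste into a chat window.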

But a typical, outsourced-development, haven't-really-used-this-tool-and-you-want-it-WHEN?!? developer isn't going to think of those things. He's going to grab a tool that has promising-sounding Google hits, run through a tutorial or two, and then plunge into the Son of the Enhancement of the Rewrite of yehey.com, with the customer sending him an "is it done yet?" email every six minutes. Clean design? What's that? Well-guarded state transitions? Who's got time to even understand that, let alone implement it? If we don't get it done, the customer's going to pull the project and send it to Vietnam or somewhere...

Just to make one point absolutely clear: I don't mean to be picking on Zend and CakePHP as being more than simply representative of widely-used, well-reputed tools that can get the unwary, rushed developer (is there any other kind earning a paycheck?) into trouble. While it is entirely practical to write semantic, RESTful Web applications in both frameworks (and both document how to do so), it's like, say, RPG; a fantastic tool for solving problems in a well-defined domain, usable with significant effort outside that domain, and Zeus help you if you use it to write an MMORPG.

The real point of this rant, if it hasn't hit you like a Muhammad Ali speed-anchor punch, is another pout over the state into which we've allowed the once-honorable craft of software development (of which Web development is but a specific case) to sink: absolute bollocks. We've allowed the pay-any-price-to-cut-costs, pinch-a-penny-until-you-can-hear-it-scream-from-Boise-to-Bangalore idiots to pervert us from Muhammad Ali (or at least Sonny Liston) into Herschel Shmoikel Pinkus Yerucham Krustofski. A plurality, if not yet an overwhelming majority, of those who call themselves "software development 'engineers'" have been given neither sufficient formal training in their craft nor the resources (time, money, support, etc.) to continue learning as they go. "If you can spell EJB and ERP, you're the guy for us - as long as you're young and dirt cheap. And when you're done with that, we've got some BASIC code we want in Java instead."

So at the unique moment in history when ephemeral intellectual artifacts have assumed primacy in a wide range of human affairs, the humans whose intellect is responsible for their creation and correct functioning have progressively less ability to do the job properly. The way that, if they sit back and think for a moment, they know should be possible, has to be possible in any sort of rational omniverse whatsoever. But few, if any, ever get that chance for reflection. Fewer still, having reflected, researched and enlightened themselves, are welcomed back into the paying ranks who toil away at this once-noble craft.

And my Zend Framework code still feels slimy. It's not Zend's fault, at least, not entirely. Front controllers are good; front controllers are your friends; front controllers are.... *crunch!*

Saturday, June 28, 2008

It's Time to Grow Up

Adrian Kingsley-Hughes over at ZDNet has a very interesting post up, titled "Sticking with XP / Upgrading to Vista / Waiting for Windows 7 / Switching to Mac or Linux - There’s no single right answer". He puts forward the blitheringly-obvious elephant-in-the-room answer to the perennial food-fight question, "Which system is best?" Namely, "use what works for you." As I read through the post and the comments that had been added to it, I thought about all the man-centuries (if not -millennia) that had been "invested" in the topic. Naturally, I had my own two rupiah worth to say on the topic. Following is the text of the comment I left at approximately 1645 GMT on Friday 27 June. Let me know what you think. (There aren't any links in the post reprinted below; to the best of my knowledge, ZDNet's commenting software hates links, and it definitely hates Macs; every single comment I've posted has given me the error "You must enter the text to post" - on a clean, empty comment form - AFTER I've hit the "Add your opinion" button. Fix it, guys!)


"Use what you like, and like what you use."

That's excellent advice for those of us who've been bouncing around in the funhouse for a while, who know which mirrors make us look weird (or, worse, are broken and likely to cut us if we're not careful)...and, granted, there are blessed few truly "new" users now; statistically, nearly everybody's used a PC with one form or another of Windows, and increasing numbers of us have used Mac and/or Linux, but...

We still do have the FNG syndrome with folks who haven't upgraded for a while, and finally they get tired of their molasses-slow Win98 box when they see this zippy new PC or Mac they've been handed at work. "Gee, we use XP at work, but I heard Microsoft isn't going to sell it anymore... what should I use?" Many of us, professionally or otherwise, are tasked with advising those people. Too often, the advice becomes "this is what I use; try it" without really understanding the (often vast) difference between the adviser and the user in question. And when "advisers" try and hash things out among themselves, it almost universally degenerates into an Animal House food fight scene - which doesn't bring any value to the discussion and actually makes us LESS able to give good advice.

Mental hands up: How many of you reading this have ever spent a month using Leopard? Vista? At least two of Ubuntu, Fedora or SuSE Linux? How many raised your hand all three times? Yeah, I see you, way back in the back.... but blessed few others.

In any other endeavour that dared call itself a craft, let alone aspire to an engineering discipline, this would be malfeasance if not negligence; you just DO NOT give advice on matters in which you are not qualified - and if you don't have experience and/or training with Technology X, you're NOT qualified to present advice as being any more valuable than used toilet paper.

In that light, Adrian has performed a major public service here. Given the reality that most people whose job relies on using and/or developing for one of the major platforms are quite unlikely to be as current and proficient on any of the others, this is the best advice that will just let us get along with our jobs without pissing in each other's lemonade each and every single day (Mike Cox and No_Axe, you know who you are).

But in the increasingly unlikely event that we're ever to make something professional out of this hobby that we are lucky enough to get paid for, the fact that this "solution" is seen as viable to any degree, let alone the MOST viable solution, is absolutely, reprehensibly unacceptable. And since computers and software have become absolutely central to nearly everything in modern life, including not least public policy, if we don't get our house in order under our own power, sooner or later some governmental organization or group thereof is going to step in and exert adult supervision. Is that what we want?

The days when Windows geeks and Mac users and Linux hackers could happily putter away, each in their own walled garden with tactical nuclear landmines guarding against any encroachment by reality, are gone as surely as the clipper ship. In the world of the Internet, where information is what's important and how it's processed/generated/visualized/stored is at best secondary, we're faced with the same choice as every biological or cultural organism at an evolutionary shift: adapt or die. Keep the lemonade clean, or drink the purple Kool-Aid. Our choice. Each and every one of us.

Tuesday, June 24, 2008

Browser Support: Why "Internet Explorer 6" Really Is A Typo

(Experienced Web developers know that the correct name for the program is Microsoft Internet Exploder - especially for version 6.)

Case in point: I was browsing the daringfireball.net RSS feed and came across an article on the 37signals blog talking about Apple's new MobileMe service dropping support for IE6. The blog is mostly geared towards 37signals' current and potential clients who, if not Web developers themselves, are at least familiar with the major technical issues involved. Not surprisingly, virtually every one of the 65 comments left between 9 and 13 June was enthusiastic in support for the move; not because the commenters necessarily favor Apple (though, clearly, many do), but because anybody who's ever cared about Web standards knows that IE6 is an antediluvian, defiantly defective middle finger thrust violently up the nostril of the Web development community; the technological equivalent of the Chevrolet Corvair: unsafe at any speed.

The degree to which this is true, and to which this truth continues to plague the Web developer and user communities, was brought into sharp focus by three of the comments on the post. The first, from 37signals' Jason Fried, estimates that 38% of their total traffic is IE, of which 31% is IE 6.0 (giving a grand total of 11.8% of total traffic - not huge, but significant). The second is from Josh Nichols, who points out that Microsoft published a patch to solve the problem with IE6 in January, 2007; he notes, however, that an unknowable number of users may not have applied that patch. Finally, Michael Geary points out that later versions of Internet Explorer (version 7 and possibly the now-in-beta Version 8) also have the problem of not being able to "set cookies for a 2×2 domain, i.e. a two-letter second level domain in a two-letter top level domain with no further subdomain below that," (including his own mg.to blog domain). The fact that relatively few domains fall into that category can be argued to be part of the problem; users running IE, particularly an out-of-date version of IE, are likely to be less experienced, and more likely to blame it on "something wrong with the Internet" than to recognize and solve the problem correctly. For those people and companies who've paid for those perfectly legitimate domains, the negligence and/or incompetence of the browser supplier and/or user mean that they're not getting their money's worth. And ICANN, the bureaucracy "managing" the domain-name system, is now "fast-tracking" a proposal to increase the number of top-level domain names (TLDs) used. (In time-honored ICANN custom, the press release is dated 22 June 2008 and "welcome[s]" "Public Comments" "by 23 June 2008." Nothing like transparency and responsiveness in governance, eh?)
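For the curious, Michael Geary's "2×2" description is easy enough to pin down in code. This is only a sketch of the definition as quoted above - the function name is made up, and it says nothing about how any version of IE actually decides whether to accept a cookie.

```python
def is_two_by_two(hostname):
    """True for a bare two-letter name under a two-letter TLD (e.g. "mg.to")."""
    # Ignore case and a trailing root dot, then split into DNS labels.
    labels = hostname.lower().rstrip(".").split(".")
    # Exactly two labels, each exactly two ASCII letters, and no subdomain.
    return len(labels) == 2 and all(
        len(label) == 2 and label.isascii() and label.isalpha()
        for label in labels
    )


for host in ("mg.to", "www.mg.to", "example.com", "bbc.co.uk"):
    print(host, is_two_by_two(host))
```

Only the first of those four hosts fits the description; add a "www." in front and it no longer counts, which matches the "no further subdomain below that" part of the quote.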

Thursday, June 19, 2008

Good Things are good things....aren't they?

Anybody who's worked with me over the last 20 years or so knows that I generally evangelize conforming to standards when they exist, are relevant and widely agreed on. As the famous quote from Andrew Tanenbaum (in Computer Networks, 2/e, p. 254) reminds us, "The nice thing about standards is that you have so many to choose from." The trouble starts when "standards" are used to promote vendor agendas (e.g., Microsoft force-feeding OOXML to a hapless ISO) or when they go against the common sense built up through hard-won experience by practitioners. And when multiple standards for a product or activity exist, and those standards are each widely used by various users (who could have chosen other alternatives), and when those standards conflict with each other in important ways that can't be amicably resolved, then those "standards" cause reasonable people to not merely question their validity, but, too often, the entire concept of "standards".

As any developer who's worked in more than one shop, or sometimes even on more than one project in a shop, knows, coding standards are sometimes arbitrary, often the prizes and products of epic bureaucratic struggle, and (in the absence of automated enforcement such as PHP CodeSniffer) often honored more in the breach than the compliance. What makes things even more "fun" is conflicting standards. It's not all that unusual for a company to contract out for development work, specifying that their coding standards be complied with (since they're the customer and they're going to maintain, or control maintenance of, the code). If the contractor has their own set of standards that conflict with the customer's, then problems arise with internal process compliance, customer involvement and final delivery. It can be - and too often is - a sorry mess. Simple code reformatting problems can be taken care of with a pretty-printer program; oftentimes, though, one sees entire programs (which have to be debugged, documented and maintained) developed just to "translate" one format to another. Many shops just give up, declare the project to be an exception or exemption from their own internal standards and processes, and try to conform to the customer's demands. "Try to", since their developers, both writing and reviewing the code, are going to be fighting against it tooth and nail because it just "feels wrong".

This whole rant was inspired by reading through yet another coding-standard document; this one the Zend Framework PHP Coding Standard. One item in particular struck me as counter-intuitive. In Item B.2.1.1, PHP File Formatting - General, it says:

For files that contain only PHP code, the closing tag ("?>") is never permitted. It is not required by PHP. Not including it prevents trailing whitespace from being accidentally injected into the output.

Experienced PHP developers are quite likely to have problems with this, not least because it conflicts with earlier behavior of the PHP interpreter and with tools that expect well-formed code. This is one of the oddities which tools like the aforementioned PHP CodeSniffer need to take into account. (There are other, more blatant "yellow flags"


If you're in a shop which takes standards seriously, uses PEAR code and uses the Zend Framework, your code review meetings are likely quite interesting.

  • "OK, we're going to look at foodb.class.php first, and then the others I mentioned in the email yesterday."
  • "Which standard does it use?"
  • "Well, it ties in with PDO, so it ought to follow the PEAR standard, right?"
  • "OK, that sounds reasonable."
As the meeting continues...
  • "Hey, what's this at the end of omnibar.class.php? There's no 'close-PHP' tag! If we start using the code-search Wiki plugin that the Bronx Project folks keep raving about, it's not going to like that...."
  • "Oh, yeah, but that's because it uses all this Zend Framework stuff, so we use Zend's coding conventions... see that comment at the top about how to run CodeSniffer?"
  • "Riiiiight...."
and so on. Weren't process and standards supposed to make development easier and more reliable?

I agree with the sentiment, apocryphally attributed to one or another of numerous software gurus, that, in the presence of otherwise adequate and sufficient standards, we shouldn't be so "egotistical" as to think developing a "better" standard than others already have is worth our time; take what's already out there, adapt as necessary, and move forward. The trick, of course, is in evaluating that condition, "otherwise adequate and sufficient." Also, since our craft is (hopefully) continuing to advance and adopt standard patterns for things done before, striking out on your own (after careful consideration) demands that the question be revisited from time to time. What are other development groups, applying (broadly) similar techniques to (broadly) similar problems, using? Is a consensus forming, and do we have anything useful to say about it? Or has a single standard already taken hold, and we can take advantage of it (at least for new or reworked code)?

Code analyzers like lint and PHP CodeSniffer can be amazingly useful. But for them to function as standard/policy enforcement tools, there must be a standard, or a small group of similar standards for them to enforce. When development teams have to juggle between incompatible standards, it discourages them from following any standards. And in that direction lie... the 1970s.
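To make that concrete, here's a toy version of the kind of check involved in the closing-tag rule quoted earlier. It's a sketch in Python, not how PHP CodeSniffer actually implements anything; the function name is hypothetical, and it assumes a file containing only PHP code (a "?>" inside a string or comment would fool it). It relies on PHP swallowing a single line break directly after the closing tag, so anything beyond that is what would leak into the output.

```python
def trailing_output(php_source):
    """Return the text that would follow the last closing tag into the output.

    Assumes a pure-PHP file; returns "" when there is no closing tag at all.
    """
    tag = "?>"
    pos = php_source.rfind(tag)
    if pos == -1:
        return ""  # no closing tag, so nothing can leak after it
    rest = php_source[pos + len(tag):]
    # One line break immediately after the closing tag is swallowed by PHP.
    if rest.startswith("\r\n"):
        rest = rest[2:]
    elif rest.startswith("\n"):
        rest = rest[1:]
    return rest


samples = {
    "no closing tag": "<?php echo 1;\n",
    "tag plus one newline": "<?php echo 1;\n?>\n",
    "tag plus blank line": "<?php echo 1;\n?>\n\n",
}
for name, source in samples.items():
    print(name, repr(trailing_output(source)))
```

A team following the Zend convention would flag any file where a closing tag is found at all; a team following a convention that requires the tag would flag only the files where the function returns something non-empty. Same file, two "standards", two different verdicts.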

Monday, June 16, 2008

g++ != gcc (arrrrrrgh!)

Coming back up to speed on Mac programming, now that I've finally got a shiny new iMac. Their Xcode IDE looks like a great tool (Objective-C, C++, C, etc., etc.), but I was hacking around building some simple test code from the command line. Mac OS X being a fully-certified Unix operating system, of course that's an easy way to get something done while minimizing the number of known and unknown unknowns that need to be dealt with.

This mostly transpired between 2300 Sunday and 0130 Monday (last night/this morning). I installed CppUnit (Boost was already on the system), and kept running into the same problem.
jeffs-imac:Foo jeff$ gcc -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L /usr/local/lib  -lcppunit  -o foo
ld: in /usr/local/lib, can't map file, errno=22
collect2: ld returned 1 exit status
Hmmm. Maybe the library that's on there is screwed up somehow? Go to Sourceforge, pull down the library source, build it, install it, and try again.

Same thing. Fiddle with the code, fiddle with the command line, nothing fixes it. Go out and Google for help. The very first hit, from MacOSXHints, had a silly-sounding but tantalizing "clue":

You're right. I figured out the problem is that I was missing the -c switch when building the .o file with gcc. For some reason the linker doesn't complain about it, but when I try to link the shared lib with my main program I get the obscure can't map file error. Now it is working. Thanks.

Hmmm again. Go fiddle some more, this time compiling and linking my trivial proof-of-concept in separate gcc command lines. Still no joy.

I go back and play with building libcppunit again, wondering if I've missed some funky option to configure. Nope. It's pushing 0130, I need to be up at 6-something, and my brain is fried, so I shut down for the night.

(Later) in the morning, something's niggling at the back of my mind, saying I missed something while watching libcppunit compile, so I do it again. Yep, cue the "you dumb palooka!" moment: it's not using gcc to compile; it's using g++. For those who've never had the (dubious) pleasure - gcc is the "general-purpose" front-end to the GNU Compiler Collection, a set of (numerous) language systems, of which g++ is the C++-specific toolchain (and standalone front end). All compilers in the Collection produce object code in compatible format (the same back-end is used for almost everything), so usually all you have to do is invoke the One Command to automagically compile your Ada, FORTRAN, Java, PL/I, whatever. And, to be honest, it had been a few months since I'd dealt with C++ on gcc/g++ from the command line. (Thank you, Eclipse!) But I remembered a bit of wisdom lost in the Mists of Time...

They can both compile your C++ (in a single step). But: They're. Not. The. Same.

So, I re-do my (now-) two-step process, substituting g++ for gcc:
jeffs-imac:Foo jeff$ g++ -c -I /opt/local/include/ -I /usr/local/include/cppunit/ foo.cpp
jeffs-imac:Foo jeff$ g++ foo.o -L/opt/local/lib -lcppunit -o foo
jeffs-imac:Foo jeff$
Ta-daaaaa!  OK, can we go back to a single step? That is, compiling and linking (using g++) in one go:
jeffs-imac:Foo jeff$ g++ -I /opt/local/include -I /opt/local/include/cppunit foo.cpp -L /usr/local/lib -lcppunit  -o foo
ld: in /usr/local/lib, can't map file, errno=22
collect2: ld returned 1 exit status
jeffs-imac:Foo jeff$
Nope. This is where the MacOSXHints commenter was right on the money. But why? I used to be knee-deep in the (FORTRAN-specific) code, a feeling akin to being knee-deep in the dead on occasion, and the answer doesn't come immediately to mind. Any ideas?
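One difference is visible in the transcripts above: both failing commands pass the library directory as "-L /usr/local/lib" (flag and path separated by a space), the working link step passes "-L/opt/local/lib" (joined), and the error names /usr/local/lib as the thing ld couldn't map. Whether that spacing is really the cause on this toolchain is a guess, not something verified here. A minimal Python sketch that flags a detached -L in a command line (the function name detached_library_dirs is made up for illustration):

```python
import shlex


def detached_library_dirs(command_line):
    """Return paths that follow a bare -L as a separate token.

    Assumption (not confirmed by the post): a linker driver may treat
    such a path as an input file rather than as a search directory.
    """
    tokens = shlex.split(command_line)
    return [
        following
        for flag, following in zip(tokens, tokens[1:])
        if flag == "-L"
    ]


# Command lines copied from the transcripts above.
failing = ("g++ -I /opt/local/include -I /opt/local/include/cppunit "
           "foo.cpp -L /usr/local/lib -lcppunit -o foo")
working = "g++ foo.o -L/opt/local/lib -lcppunit -o foo"

if __name__ == "__main__":
    print(detached_library_dirs(failing))
    print(detached_library_dirs(working))
```

If the guess holds, rerunning the one-step command with the path joined to the flag would be the quick experiment to try.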

Saturday, May 10, 2008

ANFSD: starting a series to scratch an itch

(And Now For Something Different, for the 5LA-challenged amongst you...)

I've made my living, for about half my career, on the proposition that if I stayed (at least) three to six months ahead of (what would become) the popular mean in software technology, I'd be well-positioned to help out when Joe Businessman or Acme Corporation came along and hit the same technology - with the effect of "Refrigerator" Perry hitting a reinforced-concrete wall. This went reasonably well as "the market" started using PCs, then GUIs, then object-oriented programming, and then "that Internet thingy" (Shameless plug: résumé here or in PDF format).

In other ways, I've been a staunch traditionalist. I've used IDEs from time to time, because I was working as part of a team that had a standard tool set, or because I was programming for Microsoft Windows and the Collective essentially requires that that be done in their (seventh-rate) IDE unless you want to decrease productivity by several dozen orders of magnitude.

Otherwise, just give me KATE or BBEdit and a command-line compiler and I'm happy. This continued for a significant chunk of the history of PCs, until I decided that, for the Java work I was doing, I really needed some of the whiz-bang refactoring and other tie-ins supported by Eclipse and NetBeans. Then I started hacking around on a couple of open-source C++ packages and thought I'd give the Eclipse C/C++ Development Tooling a try. Now I'm coming up to speed on wxWidgets development in C++.

During this learning-curve week, I spent a lot of time browsing the Web for samples, tutorials and so on. To call most of them execrable is to give them unwarranted praise. Having recently resumed work on a Web development book dealing with useful standards and helpful process, and since I've been doing C++ off and on since the mid-80s, I thought I'd start a series of blog entries that would:
  • Document some of the traps and tricks I hit to get a simple wxWidgets program into Eclipse;
  • Illustrate some early, very simple refactoring of the simple program to get a bit more sanity;
  • Get Subversion and Eclipse playing well together;
  • Explain why I think parts of the Agile method are simultaneously nothing new and the best new idea to hit development in a very long time;
  • Start using an automated-testing tool to build confidence during debugging and refactoring; and
  • Use a code-documentation tool in the spirit of JavaDoc to produce nice technical/API docs.
At the end of the series, you'll have a pretty good idea of how I feel most projects (regardless of underlying technology and specific tools) "should" be done.
You'll have seen a very simple walk-through of the process, demonstrated using Linux, Eclipse, C++ and wxWidgets, but actually quite broadly applicable well beyond those bounds.

Please send comments, reactions, job offers, etc., to my email. Death threats, religious pamphlets, and other ignorance can, as always, go to /dev/null. Thanks!

Tuesday, May 06, 2008

Oh. Mah. Gawwwwwwwwd.

You no longer need to reboot a running Linux system to apply security patches to it.

Check it out.

Excuse me whilst I pick my jaw up from the sub-sub-sub-basement floor. If this checks out in the field, on multiple distros, then a lot of sysadmins are going to be able to get a lot more sleep at night. And the comment by one David Pottage:
This should be good for distro kernels.

Just think if you can prepare a special kernel module that will apply security patches to a running kernel, then so can your favorite distro. In future when there is a security update, instead of downloading a ~20Mb kernel package from security.debian.org or the like, and then waiting until a suitable time to install it and reboot the system, you can download a small package containing patching modules for the standard kernels from that distro, and install it immediately...
The mind reels a bit. Security patches are the most gotta-do-it-right-NOW things that come down the pipe for any system. Open-source systems that are widely audited, like Linux, tend to get patches a lot quicker than Windows (which had attacks in the wild with no fixes available for some 271 days in 2007), or even Mac OS X. Any closed system that depends on a single organization to secure it will always have slower reaction time than an open system with enough (mutually independent, distributed) resources to throw at it. As Eric S. Raymond wrote in The Cathedral and The Bazaar, "given enough eyeballs, all bugs are shallow". As long as there is some meritocratic control over the "official" patch-submission process - and there is - it's now easier than ever to keep critical systems up and secure, in ways and at speeds that simply can't be matched in the Microsoft world. Remember, even if your uptime is 99.999% (the so-called "five nines gold standard"), you're still down five minutes and fifteen seconds every year. Murphy's Law says at least five minutes of that time will be when it was really, truly important that the system not be down.
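The five-nines arithmetic is easy to check. A minimal Python sketch, assuming a 365-day year (the names are illustrative only):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def yearly_downtime_seconds(nines):
    """Allowed downtime per year for an availability of N nines.

    Five nines means 99.999% uptime, i.e. an unavailable
    fraction of 1 / 10**5.
    """
    return SECONDS_PER_YEAR / 10 ** nines


def as_minutes_and_seconds(seconds):
    """Split a duration into (minutes, seconds), rounded to whole seconds."""
    return divmod(round(seconds), 60)


if __name__ == "__main__":
    minutes, seconds = as_minutes_and_seconds(yearly_downtime_seconds(5))
    print(f"five nines: {minutes} min {seconds} s of downtime per year")
```

Each additional nine divides the allowance by ten, so four nines already permits a bit under an hour a year.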

You can't repeal Murphy's Law, but I think this does give us a big step towards an insurance policy.

Saturday, May 03, 2008

Rant - can people at least engage brains before asking stupid questions? (Or: Paging Andy Rooney)

I read several different development blogs and message boards, such as those associated with phpclasses.org, codeproject.com, IBM DeveloperWorks, and so on. Usually pretty useful both for learning new techniques and keeping an eye on what other people are doing. Several of the boards, particularly on CodeProject, have been getting hot and bothered lately about the declining quality of questions being asked on the public lists; one large category of these is "affectionately" known as "homework questions". These usually aren't for actual classwork. A more typical scenario seems to go like this: Sanjay is brought into an outsourced project because his agency assures the client that he's a hotshot - fully qualified in J2EE, PHP, XML, RPC and LSMFT (obviously a key qualification, but since the client HR person is neither technical nor over 40, it just appears to be "tech jargon.")

The client manager thinks, "I'll have to let a couple of my other guys go to stay in budget, but if this guy can save our bacon, it's worth it." Sanjay starts one bright Monday morning at 9.00, gets the usual here's-where-things-are spiel, sits around (billing time) waiting for his computer to be set up and connected to the network (these things almost never happen before the warm body shows up), and by 1 PM is browsing the codebase for the project. By 1.30, he's on the Web, posting questions on sites that make it absolutely, crystal clear that he wouldn't know his ear from a hole in the ground if you gave him a flashlight, a map and six hours' head start.

The most extreme example of this I've personally witnessed was when I was part of a team consulting to a major Southeast Asian telecoms firm, working on integrating the homegrown billing system that one division used into the (telecom-)industry-standard one used by most of the rest of the company. This benefited greatly from knowledge of Java, of both Linux and Windows, and especially of the commercial system which was the target for the migration. The firm which produces this system, in true Java-ecosystem fashion, offers their own training which leads to certification in various aspects of the system. The team had a good mix of knowledge and experience, with the single yellow flag being that the major new-system expert was a member of the client's staff. After being encouraged to bring our own domain expert in (apparently so the client could reassign the existing one if desired), our headquarters (in India) found the "perfect guy". He had Java and J2EE certificates. He had a BSCS from "one of the top universities" in India. He had all the certifications the vendor offered. Oops, they were all for the previous version of the system....but this guy could, on paper, walk on water as far as our project's needs were concerned. So we flew him out from Bangalore. We had, as I recall, six weeks before major schedule slips would hit the fan.

A month later, my team manager and I sat watching this guy type in sample programs from the manual, try to build them, watch them fail, erase everything and start over. The manager said to me, "well, why don't you and (the junior guy on the team) start trying to pick up the pieces? Maybe you can pull something out." Then we noticed something else that was interesting. The client manager kept saying nice things about the "expert", how diligent and hardworking he was. He started taking the expert out for lunch and so on. My manager and I were in shock: this guy had already cut into the project for thousands of dollars and had yet to produce a single project artifact. Worse, in our view, his "experience" and "qualifications" were obviously complete fabrications. Headquarters (back in India), however, wouldn't send out a replacement because they had been told that the client was happy with the guy. We started looking around for brown-projectile-proof mackintoshes, anticipating the storm sure to come. And it did, eventually - shortly after the "expert" failed to follow explicit, idiot-proof instructions on how to extend his visa so he wouldn't have to go back home. Instead of buying a round-trip bus ticket to Singapore, he bought a one-way ticket, later claiming he wasn't sure which bus he'd want to take back up. The Singapore immigration officials blocked him from entering Singapore as they had no reasonable assurance of when he'd leave. The Malaysians wouldn't let him back in because his (tourist) visa had expired that day and he'd need to enter another country before reentering Malaysia. After several frantic midnight telephone calls and (I was later told) negotiation and pleas from our firm and the client, he was unceremoniously dumped on a plane back home, and our firm was billed for a full-fare coach ticket. The client manager became upset because his buddy, the "expert", was nowhere to be found and on no notice, at that.

Why am I blathering about all this now, several years after the event? My ire was raised by one of these "homework questions" I'd mentioned earlier, this time posted on the WeberDev general PHP forum. Entitled "Which is compatible PHP or Java with SQL SERVER 2005", the writer, using the nick "kumarsudu", starts out with...
Hi All,

I would like to know which one is have compatible

1. PHP with SQL SERVER 2005
2. Java with SQL SERVER 2005

as i have to develop a project, ...
As another popular forum's members often ask, "how many WTFs are there in this message?" The mind reels. Asking totally nonsensical, absolutely open-ended questions on technical sites and lists is now at a flood stage not seen since the AOL infection of the Internet back in the early 1990s. The writer, besides the sterling specificity of the question as mentioned earlier, shows little knowledge of or interest in proper use of the English language (despite ample materials and information available online as well as off).

It pains me that:
  • a person of such deliberate ignorance and lack of brilliance would not only choose to waste my and other readers' time with the request to do his work for him;
  • any contracting agency would have such miserably low non-standards that this guy would even get the time of day, let alone a job for which there is an ample supply of other (by definition more qualified) candidates (granted, they may have to raise their pay to a level higher than a fast-food burger-flipper's);
  • any client company would not only pay money for such an individual, but continue to do business with the "recruiter" that brought him in;
  • there are no project-local, competent individuals whom this writer could ask for help, who would apply appropriate informational and organizational responses to the question; and finally
  • that the prevailing business contempt of software development has sunk so low that this type of thing is more unusual for the fact that anyone bothered to notice it than that it happens at all. Would you want to fly with an airline whose pilots were the cheapest available, using forged certificates and qualifications to highlight (non-existent) experience? If your answer to that question is anything less emphatic than "hell no!," please inform me of your flying plans so that I may ensure that I am neither flying nor anywhere on the ground along your route during such a flight.
The "project" mentioned by the initial writer, if performed by individuals of this apparent calibre, is highly likely to fail - wasting the client's time and money, leaving the client with a problem that still needs to be solved, and continuing to erode the opinions of the client and the client's associates of the business value of software, since, obviously, "IT projects always fail."

Saturday, December 01, 2007

Buzz About the Code Buzzards

Scott Hackett over at SlickEdit has a blog entry where he talks about code scavenging as "a new software development methodology." The point being, of course, that code scavenging isn't new at all; it's at least as old as the UNIVAC I. Every new, and not so new, software developer has used it at one time or another, usually for quick bootstraps to get a specific part of the software under development working in spite of limited knowledge of the language and/or domain in question by the programmer, or limited resources (time/budget). Most times, it's at least felt to be some combination of the two.

What's changed recently, and what makes five-finger coding a screaming, begging candidate for formalization, as Mr. Hackett notes, are two distinct phenomena: the rise of new repositories on Web sites like Koders, Krugle, Google Code Search, among others, that at least hold the promise of finding "what you need right now", right now. Also, development professionals face ever-increasing resource constraints ("can you get that for us yesterday?") in a craft that is continually expanding in scope and detail.

What I found interesting, having visited sites like Krugle many times in the past, was the effect that open-source licensing like the GNU General Public License and the BSD License is having on one of the main limiting factors that Mr. Hackett identifies: trust. Many developers, for various reasons, don't trust the work of someone they don't know when their own jobs or reputations are on the line. What open source and the new search engines bring to the table is the idea that now, you don't always have to just accept a code fragment "on faith". Many of the more experienced developers have a track record of other code that they've written, often accessible via the same search engines that helped you find the one you're looking at. Open source and free licensing mean that you can look at other stuff that the guy in question has written, get a feel for how his style and competence match your own, and use that information as additional basis for evaluating the code fragment you're scavenging. The trick, obviously, is to keep the time and effort required for all this significantly below what it would take for you to write your own implementation from a clean sheet of paper. That's why search engines and good indexing are important.

Of course, none of this will help you make the changes that always have to be made to put a piece of "foreign" code into your handiwork. That's still going to take some work; hopefully not on the order of putting a Porsche engine into your Ford pick-up. We'll have to wait for - and work for - serious improvements in the state of the software development craft to have any effect on that. Development methods, languages, and so on are still in the cathedral at Rheims. I've been waiting 30 years for the binary Renaissance to hit my professional life.... anybody need a virtual stonemason?

Tuesday, September 18, 2007

Improvement as opposed to Change

A link on the Agile CMMi blog had some very interesting things to say about a gentleman named Brian Lyons, the CEO (as well as CTO and founder) of Number Six, a Vienna, Virginia (Washington, DC area) software technology services/business consulting firm. Number Six quite obviously 'get' the concepts behind Agile CMMi, Hillel Glazer of the blog wrote. Interested, I flipped over to their site to hopefully learn a bit more, maybe drop off a CV.

Mr. Glazer wrote the blog entry in question on 25 July. The first thing you notice when viewing the Number Six site is the home-page obituary of and tribute to Brian Lyons, who died on (Monday) 3 September 2007 in a motorcycle accident. There are various links to family and tribute sites, a press release, and a request that "in lieu of flowers, the family has requested that donations be made to" a scholarship fund at the University of Maryland (excellent, particularly for those too far or too late to attend the funeral). I wish the family and company all the best in their times of grief and trouble.

The point I originally started writing this entry to make, however, was inspired by one of the bullets on Number Six' careers page, under the heading "Why Six?": "We are committed to consistent improvement, not continual drastic change."

Think about that distinction for a moment. Many of us, particularly those in the software-geek persuasion, start our careers trying to remake everything; not necessarily because it's broken, but so that we can do it (whatever it is), make it ours, introduce new techniques or technologies, and hopefully (but improbably) provide a better solution to the problem at hand than what was there before.

The forces arrayed against that impulse are formidable. Chief among them are those who are, by role and often by nature, suspicious of any radical approach to a problem and who, in the eyes of the wet-behind-the-ears Young Turk, too often dismiss out of hand any alleged improvement or innovation without "giving it a fair chance". Occasionally, of course, improvements and innovations too beneficial to ignore do come out of this process.

As the developer matures (which may take days, decades or eternities), the Young Turk recognizes that not all of the hindrances were traditionalist per se; rather, they were operating from a different (usually business-oriented) set of priorities. Learning to understand and appreciate those priorities is one of the fundamental aspects of any successful transformation from ivory-tower Geek to journeyman software craftsman (or -woman), able to make a contribution to customers and to the craft of software development.

What becomes obvious to all thoughtful participants and stakeholders in the process of developing and maintaining any software system is that a natural tension exists between three apparently conflicting ideas:
  • Don't change what works well enough when there are more pressing needs to attend to;
  • New capabilities, chosen and implemented properly, can have strongly beneficial effects (efficiency, productivity, wider customer scope, 'better' according to various aspects of quality);
  • Any change to an existing system involves direct and indirect costs which must be carefully evaluated and compared against the expected improvements.
Balancing and managing the conflict between those concepts throughout the lifetime of a piece of software is an art that has yet to be mastered with a level of reliability and cost widely acceptable to business customers. Several philosophies and techniques (often marketed as technologies) have been developed as (attempted) solutions to that problem. Three that I have found extremely useful, in complementary ways, over the last few decades are agile development, refactoring and the CMMI. (Those already familiar with the concepts may skip down to the paragraph which begins "As argued by practitioners such as Hillel...".)

The CMMI (previously more widely and less precisely known as the CMM, for Capability Maturity Model) has been around for quite some time as a formal means of evaluating the maturity and coherence of an organization's development efforts (among other activities). It requires its users (organizations) to acquire and document ever-increasing detail of precisely how they go about the process of development, and to be able to prove (in an internal or external audit) that those processes are being followed and any discrepancies noted and accounted for. This suffers as a standalone policy framework for software development on two grounds:
  • The "what" and "why" of development are completely ignored. CMMI fundamentally doesn't care if you're building a pile of junk that has serious practical problems, as long as you do it using the process you've defined for yourself.
  • The CMMI, along with US DOD Standard 2167A, has been blamed for the logging of more trees (to make paper) than virtually any other industry business practice. This is largely because both are seen (in CMMI's case, somewhat unfairly) as pushing "hidden mandates" for the rigid, never-far-from-obsolete "waterfall" model of development.
Agile development, on the other hand, is a survival response both to the (largely management-driven) mantra that "Real Artists Ship" and to seemingly interminable periods between "milestones" (particularly in the waterfall model) where nothing of intrinsic value is visible to an outside observer (e.g., management or customers). Its main criticism from opponents is that it produces too little documentation (the product is the documentation, to a large degree).

An aspect of, or adjunct to, agile development is refactoring, pioneered and popularized in the Martin Fowler-written book. Refactoring is all about how you can make (sometimes radical) changes to the internals of a software system, provided you keep the external (customer-facing, revenue-generating) interfaces constant. Wrapping one's mind around this as a formalized process, as opposed to the "don't-break-what-works" mentality common to experienced developers (and managers), is a threshold experience in a software craftsman's progress.
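To make the "internals change, interface stays put" idea concrete, here is a minimal Python sketch. Everything in it (the invoice example, the function names, the 7% tax figure) is invented for illustration and is not taken from the Fowler book; the point is only that callers see the same results before and after.

```python
def invoice_total_before(lines):
    """Original version: one loop with the tax rule buried inline.

    Each line is (quantity, unit_price_in_cents, taxable).
    """
    total = 0
    for qty, unit_cents, taxable in lines:
        if taxable:
            total = total + qty * unit_cents + (qty * unit_cents * 7) // 100
        else:
            total = total + qty * unit_cents
    return total


TAX_PERCENT = 7  # hypothetical rate, pulled out of the loop and given a name


def line_amount(qty, unit_cents, taxable):
    """Amount for one line in cents, including tax where applicable."""
    net = qty * unit_cents
    tax = net * TAX_PERCENT // 100 if taxable else 0
    return net + tax


def invoice_total_after(lines):
    """Refactored version: same inputs, same result, clearer internals."""
    return sum(line_amount(*line) for line in lines)


SAMPLE = [(2, 1000, True), (1, 550, False), (3, 333, True)]

if __name__ == "__main__":
    # The external behaviour is the contract; both versions must agree.
    assert invoice_total_before(SAMPLE) == invoice_total_after(SAMPLE)
    print(invoice_total_after(SAMPLE))
```

The comparison at the bottom is the important part: a check like it, run after every small step, is what turns "don't break what works" from a prohibition into a safety net.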

As argued by practitioners such as Hillel, the combination of the process-centric CMMI and the solution-centric, visible-progress Agile set of processes (including, particularly, refactoring) gives the "best of both worlds" for development - a focus on artifacts and products created through a survivable process, matched with a demonstrated and documented knowledge of exactly what is meant by "that process" under current circumstances. This also implies demonstrating understanding of the specifics of "current circumstances" that drive the decision-making and artifact-development processes - which brings us back full circle to understanding why Agile is a good fit to begin with.

Unlike the Rational Unified Process (RUP), which grew out of a waterfall-driven management-oriented mentality (and the opportunity to sell software tools and consulting services to any shop using the heavily-marketed process), Agile and CMMI development can be successfully started after passing around a few books among the development and management staff. Tools and training, chosen thoughtfully, are extremely helpful, and the appraisal process within CMMI (formally SCAMPI, the well-known "Level 1" to "Level 5" assessment of process maturity) does eventually require an outside assessor, or registrar, to formally certify the organization. But to take a typical small development group from the customary initial chaos to a finely-tuned, market-leading, customer-satisfying machine, it has been my experience and observation that it is far easier (and more cost-effective) to implement CMMI in an Agile fashion than to go down the (vendor-specific) RUP route. Imposing either on a large organization would require an unlikely combination of managerial brilliance, sadism and masochism; revolutions boiling up from below and winning eventual management sanction have proven much more likely to be successful in that type of environment.

So what's a Young Turk (or a more seasoned craftsman) to make of all this blather? The simple point: Consistent improvement is practical, and without life- or career-threatening implications, achievable on a repeatable basis. Continual drastic change, in contrast and almost by definition, will lead the development group to many sleepless nights trying to get that last show-stopper bug fixed, management to wonder why those bozos in development can't be trusted to ship anything on time, and customers to wonder what they're really getting for their money beyond the hype and bling of the marketing materials. Which team would you rather be on?

Saturday, September 15, 2007

Crap Is Not a Professional Goal

I recently, very briefly, worked with a Web startup based in Beijing (the "Startup"). The CEO of this company, an apparently very intelligent, focussed individual with great talent motivating sales and marketing people, takes as his guiding principle a quote from Guy Kawasaki, "Don't worry, be crappy".

The problem with that approach for the Startup, as I see it, is two-fold. To start, Kawasaki makes clear in his commentary that he's referring strictly to products that are truly innovative breakthroughs, exemplars of a whole new way of looking at some part of the world. Very few products or companies meet that standard (and even if yours does, Kawasaki declares, you should eliminate the crappiness with all possible speed). No matter how simultaneously useful and geeky the service offered by the Startup's site is, it is, at best, a novel and useful twist resting on several existing, innovative-in-their-day technologies. (A friend who I explained the company to commented that "it sounds as innovative as if Amazon.com only sold romance novels within the city limits of Boston" - hardly a breakthrough concept). Indeed, Kawasaki makes clear that he's talking about order-of-magnitude innovation; the examples he cites are the jump from daisy-wheel to laser printing and the Apple Macintosh.

The second, more insidious, problem with the approach, and the trap that many early-1990s Silicon Valley startups fell into, is that you take crappiness as a given, without even trying to deliver the one-two punch of true innovation and a sublimely well-engineered product that immediately raises the bar for would-be "me-too" copycats. (Sony, for instance, has traditionally excelled at this, as has the Apple iPod (learning from the mistakes Kawasaki cites for the earliest Macintosh).) Deliver crap, and anybody can compete with you once they understand the basics of your product. The wireless mouse is a good example of this.

If you tell yourself from the get-go that you'll be satisfied if you ship a 70%-quality product, what will happen is that, as time goes by, that magical 70% becomes 50%, then 30%, then whatever it takes to meet the date you told the investors. And if management doesn't trust engineering to give honest, realistic estimates (as is typical in software and pandemic in startups), you have a recipe for disaster: engineering takes a month to come back with an estimate that development will take 12-18 months; management hears '12' and automatically cuts that to 6 and pushes to have a "beta" out in 4. The problem is that, if you're dealing with an even marginally innovative product, things are not cut-and-dried; the engineers will have misunderstood some aspects of the situation, underestimated certain risks, and been completely blind to others. This was pithily summed up, in another field entirely, by Donald Rumsfeld:
There are known "knowns." There are things we know that we know. There are known unknowns. That is to say there are things that we now know we don't know. But there are also unknown unknowns. There are things we don't know we don't know. So when we do the best we can and we pull all this information together, and we then say well that's basically what we see as the situation, that is really only the known knowns and the known unknowns. And each year, we discover a few more of those unknown unknowns.
Companies that simultaneously forget the ramifications of this while taking too puffed-up a view of themselves are leaving themselves vulnerable to delivering nothing more useful or profitable than the old pets.com (not the current PetSmart) sock puppet - and probably nothing that memorable, either.

And that further assumes that they don't fall prey to preventable disasters like losing the only hard drive running a production server built with non-production, undocumented software. If Web presence defines a "Web 2.0" company in the eyes of its customers and investors, going dark could be very costly indeed.

Friday, August 24, 2007

Back in the Saddle, Again

....with abject apologies to Gene Autry...

I haven't posted here for a while (about three months - gak!). For the six or eight of you still hanging on, humble apologies and my deepest appreciation (sympathies?). Some have publicly wondered (offline) whether I am merely offline or have flatlined. Actually, I've been in hospital twice during that time, and my personal and professional lives have undergone more than the usual random fluctuations. Be that as it may...

As with roughly half of the Linux-aware folks out there, I've been playing with Ubuntu Linux for a while now. The job I just started - as Principal Technologist and alleged future CTO for FoneVillage.com in Beijing - is with an Ubuntu shop, so that's one motivation. I've been a Debian evangelist for a few years now - formerly a Kanotix (now Sidux) refugee, now with Ubuntu and Mepis installed and happy on laptops and a desktop (and lusting after a Mac Pro, but that's another blog entry)...

Half of me LOVES Ubuntu. Point-and-click everything; all the applications (except mainstream violent Windows games) that a user could want immediately available; name-brand Big Applications for the enterprise; more-solid-than-most-rocks Debian under the hood; regularly updated Live CDs (but get the better Live DVD); what's not to like?

The other half of me, the guy who's been intimate with the care and feeding of Unix systems for almost 30 years, has an easy answer for that: sudo (as superuser/SystemGod, do) everything, but in particular, sudo bash (as superuser, open up a shell [terminal] and let me run arbitrary commands with no restrictions). If you read Ubuntu guides and Web pages, almost everything a user does from a command shell that affects the system is done as sudo command, while logged in as an ordinary user. A bit of poking around with Google led me to a page on About.com's Ubuntu Desktop Guide that put things into better perspective:

The first user account you created on your system during installation will, by default, have access to sudo. You can restrict and enable sudo access to users with the Users and Groups application.

My knee-jerk reaction having subsided, I'm back to generally liking what I see in Ubuntu. It's intended to achieve - and generally succeeds at - being "easy enough for anybody to use", not just "geeks", as Linux has heretofore been viewed by Windows usees. It's another answer to the classic "what's the difference between a Windows usee and a Mac user?" question: The Windows usee talks about everything he had to do to get his work done; the Mac user (or, generally, the Ubuntu user) talks about all the great work she got done.

For the technophobes out there who still want to join the modern world, definitely worth a spin.

Monday, June 04, 2007

If You Can't, It Doesn't

If you can't measure something, can you prove that it exists? Or at least that you understand it? If we're having a theological discussion, you might have one answer, but in the material world, and more specifically in technology development, the answer to both questions is "probably not". To quote what several nurses and doctors tell me was pounded into them during their training, "if you don't write it down, it didn't happen" - because there's no record that it happened the way you think you remember it. Memory's a funny thing - even about events that actually happened.

This less-than-breathtakingly original, but relevant, train of thought occurred to me after I spent an hour kicking the tires of the PHPUnit code coverage analysis features. For you PHP coders out there who aspire to professionalism, you will almost certainly find that rigorous, formal testing procedures will save you time and grief in the not-very-long run. If you've been "test infected" or you're working in an XP (the development method, not the massively defective software) shop, you already know this; otherwise, you're likely to be shocked by how good your code is likely to become.

Automated unit tests (using tools such as JUnit (for Java), NUnit (for Microsoft .NET) and PHPUnit) are good for proving a) that your code works the way you expect and b) that it keeps working as you make changes to it. Automating the process (so that it runs all your tests whenever you make changes) ensures that you get continuous feedback that all the tests still work (reducing the likelihood of hidden dependencies breaking). Code coverage testing, as the phrase implies, tracks which lines of your code are exercised. Code that has been tested extensively and found to work as expected can be differentiated from untested code and "dead" code. Dead code is code that can't be executed because no logical path through the program exists that will execute that code. A minimal number of lines of code will be marked as dead code in many languages and test systems due to the textual structure of the language itself (e.g., PHPUnit marks a line of code consisting solely of the brace ending an if-block as dead code). Code identified as dead should be eliminated except for this structural detritus; why maintain code that will never be executed?
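To make the coverage idea concrete, here is a minimal sketch, written in Python rather than PHP purely for illustration. The names classify and lines_executed are hypothetical, and the tracer is a toy stand-in for a real coverage tool such as the one in PHPUnit: it only records which lines of one function run during one call. That is still enough to show how a single test can leave a whole branch unexercised.

```python
import sys


def classify(n):
    """Toy function under test: two branches, so two paths to cover."""
    if n < 0:
        return "negative"
    return "non-negative"


def lines_executed(func, *args):
    """Return the line offsets (relative to the def line) run by one call."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Record 'line' events only for the function we are measuring.
        if frame.f_code is code and event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Always restore whatever tracer was installed before us.
        sys.settrace(previous)
    return hit


# One "test" exercises only one branch; adding a second covers more lines.
one_test = lines_executed(classify, 5)
two_tests = one_test | lines_executed(classify, -5)
print(len(one_test), "lines hit by one test;", len(two_tests), "hit by two")
```

The point is not the mechanism but the report: until the second call is added, nothing has ever shown that the "negative" branch works.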

Many beginning developers (and not so beginning) tend to assume that their code works if it doesn't contain any obvious syntax errors flagged by the interpreter or compiler. These language systems are (generally) reasonably competent at interpreting what you wrote; they have significant constraints in discerning what you intended beyond what you wrote. Hence, just because the code "compiles cleanly" doesn't mean it is free from defects. That's the job of testing - starting with you, the developer, running unit tests.

If you're doing unit tests without any specialised tools ("rolling your own" tests), or your testing tool does not provide code coverage analysis, it may often be difficult to determine whether specific blocks of your code have or have not been tested successfully. Thus, the assumptions that you made while writing the code are unlikely to be challenged during testing - the same assumptions will guide an ad hoc testing regime as the original coding.

One of the benefits of test-driven development is the ability to cut out unnecessary code and other development artifacts; the development team is very confident that exactly and only what is part of the desired system is actually in that system and can expand/refactor the code to adapt to changing requirements; "see a need, fill a need". You can get a lot more done, more quickly, when you have total confidence that your code will continue to work as features are added or changed, and that anything that breaks will be immediately and obviously detected. You can't do that - really - without code coverage analysis.

For you plinkers out there who aren't doing sufficient (or sufficiently organised) testing yet - your competition is, and if you spend any time at all in this craft, you will bang up against "difficult" bugs that wouldn't have been so difficult with pervasive testing. Those of you who have been writing unit-test cases shouldn't automatically get too comfortable, however... how do you know how much of your code is being tested? If you're testing the same block of code eight different ways and other sections of your code don't get tested at all, can you ship a quality product? Coverage analysis will save you time (by not writing redundant test cases), grief (by prodding you to test areas of code you thought you tested but hadn't), and money (from #1 and #2). If you're trying to run a completely instrumented shop - where everything that can be measured for a lower cost than the failure of that thing, is being measured - then I (should be) preaching to the choir.

To sum up:
  • If you don't write it down (in a way that "it" can be found again), it never happened. If you've never tested your code, it's broken until proven otherwise.
  • If you can't repeat the test at will, it hasn't been tested.
  • If you don't or can't know how much of your code has been tested, your users will.
We can't all be Microsoft and expect our paying customers to find our problems!

Tuesday, May 08, 2007

XHTML Is (Nearly) Useless

If you've written any Web pages in the last five years (at least), you've at some point bumped into the difference (schism?) between "original" HTML and "new, improved - now based on XML!" XHTML. If you don't write Web "content" (thanks for reading my blog, but why are you here?), or deal professionally with those who do, you may not know the difference, or care that there is a difference. There is, and people should care about it if they care about the Web.

(Briefly, for those who care but don't know; the rest of you can skip this and the next paragraph.) HTML is often known to developers as "tag soup", because very, very many sites don't follow the strict interpretation of the standard, and are "broken" in all sorts of ways. This was initially justified as working around the myriad bugs in grossly defective browsers such as Microsoft Internet Explorer. XHTML was different and better because it was HTML reformulated as XML, which could then be "validated" (checked) by any validating XML parser. HTML-as-XML also (should have) driven the development and use of all sorts of nifty techniques and tools that are only practical when assumptions can safely be made about the structure and format of the document - which would be true in XML/XHTML but not necessarily in "classic" HTML.

The problem, of course, is Microsoft's Internet Explorer browser, affectionately known to Web professionals as "Internet Exploder". Among the many "quirks" (defects) that have unknowingly afflicted usees of that browser, all versions up to and including the current Version 7 fail to understand XHTML as XHTML. The "conversation" that takes place between a browser and a server when the browser requests a Web page is defined by the open standard known as HyperText Transfer Protocol, or HTTP. Part of that conversation involves the server informing the browser what type of data it will be sending. This is done using what is called a "content type" "header".

All together now? Good. When a server wants to send a browser a page of "tag soup" HTML, the correct content type is "text/html". A properly-formatted and -served XHTML page will instead use "application/xhtml+xml". This will inform the browser that, in fact, the page being transferred is a proper XHTML page (per the open standard defining it), so the browser will kick in the assumptions and processing that work for XHTML but not for "tag soup".
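One common way of coping is to look at the Accept header the browser sends with its request, and to serve the XHTML content type only to browsers that claim to understand it. A minimal sketch in Python, assuming a hypothetical helper named choose_content_type; it ignores the finer points of HTTP negotiation (such as ranking the alternatives by q-value) and treats anything unlisted as "tag soup":

```python
def choose_content_type(accept_header):
    """Pick a Content-Type for an XHTML page from the request's Accept header."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # HTTP default quality when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml"
    # Browsers that never mention XHTML get the page as plain HTML.
    return "text/html"


# Illustrative Accept headers, not captured from real browsers.
print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("image/gif, image/jpeg, */*"))
```

A browser that advertises only "*/*" falls through to "text/html", which is exactly the compromise described above: the page may be valid XHTML, but that browser is handed it as tag soup.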

Of course, Internet Explorer is now the only major browser that gets this wrong (as indicated by this vintage-2005 blog entry). As far as I am aware, every other major graphical browser in the world - Firefox, Opera, Konqueror, Galeon and the rest - all support The Right Thing. Unfortunately, IE is still the 300-pound gorilla in the china shop; the majority of Windows usees still browse the Web using IE, and though the trend is improving steadily, that will likely continue to be true for the next couple of years (say, 2009-2010 barring unforeseen circumstances).

What kinds of things would a properly XHTML 1.0-compliant browser let us do with our site? One trivial example: let's say you're writing a political-commentary site that is geared towards an upcoming election, and you want to consistently name your candidate as "The Honorable Senator Francis X. Snort (email senator@senatorsnort.org)". When your guy goes down to defeat (one too many campaign-finance scandals, mayhap) you want to change the blurb to "The Honorable former Senator F. X. Snort (email snort@somefreemail.com)". Trivial to do with whatever CMS or scripting system you're using, right? But by using an XML entity, you can simply say "&snort;" in your document, and an entity declaration in your document's header will tell the parser what you really mean. Change the declaration, and every instance of that entity expands to your new meaning. People who use other XML-based markup systems, such as DocBook, have been using this technique for years. Using XML entities in pages shown in correct (non-IE) browsers will do exactly what you tell it to. In IE, or, to be fair, several text-based browsers, the entity name will be displayed exactly as it is in the document - in our case, as &snort;. This is unlikely to have the desired effects on the folks "back home" for the Senator.
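For the curious, the entity trick can be tried outside a browser. This is a minimal sketch using the XML parser in Python's standard library; the page element, the PAGE_TEMPLATE text and the render helper are made up for illustration and are not real XHTML. The only thing that changes between the two calls is the entity declaration; the body of the document still just says &snort;.

```python
import xml.etree.ElementTree as ET

# A tiny XML document whose internal DTD subset declares the entity.
PAGE_TEMPLATE = """<!DOCTYPE page [
  <!ENTITY snort "{blurb}">
]>
<page><p>A message from &snort;.</p></page>"""


def render(blurb):
    """Parse the page with the given entity text; return the expanded paragraph."""
    root = ET.fromstring(PAGE_TEMPLATE.format(blurb=blurb))
    return root.find("p").text


before = render("The Honorable Senator Francis X. Snort (email senator@senatorsnort.org)")
after = render("The Honorable former Senator F. X. Snort (email snort@somefreemail.com)")
print(before)
print(after)
```

A parser that honours the declaration expands the entity everywhere it appears; one that doesn't is left showing the raw name, which is the failure described above.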

Web developers have, as I mentioned, several well-known workarounds for this type of thing, using their authoring tools rather than the document itself. It is, however, a reasonably easy example for people to understand. Given the increasing popularity of systems such as PHP Smarty that let you use large chunks of "raw" (X)HTML along with the scripting goodies, it would come in handy too.

So how does all of this make XHTML "nearly useless?" Because most developers developing pages for the general public (as opposed to corporate intranets), knowing that Microsoft IE doesn't support the correct content type, will either "not bother" developing "correct" XHTML or at best will serve it to all comers as "tag soup" HTML.

This also has the "benefit" of completely stifling further innovation (as far as the end user is concerned) based on XHTML. All of the comments I've made so far are only germane to the initial version of XHTML, designated 1.0. The newer versions, XHTML 1.1 and XHTML 2.0, provide new features and support new technologies that greatly expand the usefulness of the Web - or would, if Microsoft weren't, as usual, dragging the Web down for competitive lock-in purposes. By doing everything in their considerable power to ensure that IE browsers and sites aren't fully, completely interoperable with other browsers, they discourage Windows usees from using "rival" browsers to browse sites labeled "Best viewed with Microsoft Internet Explorer". There's nothing preventing Web designers from writing standards-compliant sites that also work well with IE; in a well-designed site, it's not particularly onerous to support both standards and Microsoft. If you're using Microsoft tools, of course, it will take quite a bit more work and knowledge to create valid sites. It can be done - several sites and mailing lists describe the techniques and mind-set required - but Microsoft do not go out of their way to make it easy to do so.

Of course, this also applies only to the public Internet. If you're fortunate enough to be developing "real Web apps" for your company's intranet, and your company understands the value of open standards, then you're not going to be subjugating yourself to IE and none of this really applies to you. Go enjoy all the things that new tech lets you implement that can really stomp on your non-standards-using competition!

For the rest of us, until the Web gets out of this proprietary funk it's in now, and IE either falls into a long-deserved oblivion (improving Windows security dramatically, but that's another post) or actually complies with the same standards every other serious browser in the world does, we're going to have problems. One of the more annoying and frustrating ones, as we've discussed, is that XHTML is (nearly) useless. So much for innovation.

Monday, May 07, 2007

It's the End of the Net as we know it, and we feel fine....

John C. Dvorak has an interesting post on his pcmag.com column blog, entitled "Will the Internet Collapse?" He doesn't think it will, obviously, and he's got some pretty impressive trends to back up his contention. Example: 140,000 terabytes of backbone traffic in 2002 - at a "conservative" 60% annual growth through 2007, that's roughly 245 MB for each of the six billion or so people on the planet. Most of whom (still) wouldn't know what a byte was if it bit them; they've got more pressing concerns, like safe food, clean water, housing... But I digress.

I don't think the Net per se will "collapse", either. What's going to happen -- what's already happening -- is both more subtle and dangerous. The "Internet craze" that gave rise to Bubbles 1.0 (1990s) and 2.0 (now) and has driven the Net from a quirky research project into a cultural touchstone, has done two things that would make an every-Friday-from-4-to-10-PM crash seem benign in comparison (and "4-to-10-PM" where? On the Internet, it's always "now".)

The first problem is the Baby's Spoon in the Waterfall. There's so much information (wrapped up in even more "content", which isn't the same thing) that no person, government, entity or corporation can ever comprehend it. People who spend large amounts of time surfing the Web and using various tools to pull information off the Net in other ways, soon exhibit a behavior akin to being "punch drunk". Late in the 12th round, The Champ has connected so many times with Joe Palooka's jaw, and we in the crowd can see Joe staggering around, unsure of even from which direction the merciless pummeling is coming, let alone able to control the situation. The Champ, in this analogy, is the onslaught of data/information/"content" from the Net, primarily email and the Web; Joe is standing in for the typical, non-technical ("you mean Yahoo and the Web aren't synonyms?") user. As the user's eyes glaze over and the cognitive mind enters vapor-lock, he is essentially unable (and psychologically unwilling) to refine his usage patterns or seek out new experiences that he wouldn't find in "offline" life (what in an earlier age was called the "You Are There" effect). So, for instance, the stereotypical North American user goes back to the "safe, familiar" online equivalents of his offline television shows - the "news" sites owned by the same multinational corporations that own American media, and YouTube, which can be viewed as a worldwide online version of "America's 'Funniest' Home Videos": another vehicle for peddling the same tired corporate products in the commercials.

The other problem, of course, is that organizing all this "stuff" has become more difficult, and the rate at which it becomes more difficult is at least as rapid as the rate of growth itself. While the Net, and the Web in particular, have enabled new ways to express individual personalities (e.g., MySpace) and allowed ordinary citizens of many countries to amass much more detailed information about what their government is doing, for them or to them (e.g., YouGov and Thomas), if you don't know about YouGov or Thomas (or any similar site set up by your own country's government), then the old Bruce Springsteen song, "57 Channels and Nothing's On" seems quaint and manageable in comparison. People know there's all sorts of stuff out there - they can Google for it, "it must be real" - but, unable to come to grips with how things are organized (they aren't, on purpose) or how to use the available information to achieve a personally important goal, they fall back on the sites that organize and package and sanitize the content, accepting loss of control as the price of freedom from thinking too much. (E.g., AOL - a subsidiary of Time-Warner, and Fox "News".com, a wholly-owned subsidiary of AIPAC.) As people sink safely back into their easy chairs, content to absorb the anti-intellectual pablum that bombards them, they lose touch with the idea, let alone the possible reality, of an energized populace using the new, revolutionary technology at its disposal to improve their own lot in life and that of the world at large. Instead of a medium which challenges the status quo, the Net has devolved into a tool which reinforces it.

A collapse of the Internet? You're right, John; it will never happen. But a collapse of the promise and meaning of the Internet? It's already here, folks; we're just standing around watching streaming video of the rubble bouncing.

Monday, August 21, 2006

Projects and Data Formats, or Scratching a Standard Itch


Fair warning: This post was written in bits and pieces over a week that I spent mostly on my back in bed; it hits two or three hot-button issues that I've been running up against. In the fullness of time, I may come back and break it up, or write follow-on entries pontificating on one point or another, but for the nonce, your patience - and comments! - are appreciated.

Real standards happen in one of two ways. One way involves an organisation like the World Wide Web Consortium (or W3C as it is commonly known) putting together different committees and working groups, and over the course of various meetings, seminars, forums, and other corporate expense-account sinkholes, massive sets of documents are ratified; if we're lucky, somewhere within that will be nuggets of information and wisdom around which useful things can be accomplished. Successful examples of this include standards such as HTML, XHTML and CSS 2. Less successful examples include efforts such as WCAG 2. While it may safely be assumed that nothing in the new standard will disrupt the existing order of the Internet, the flip side of this is that there may be no actual working implementation of the new standard (to prove that such is practical), and it may well be that the new standard is not the most efficient or elegant solution to the problem. This may be described as the "top-down" approach.

The other way that standards happen in the real world is for a developer, or typically a small group of developers, to come up with something that works for them, open it up to community/public comment and collaboration, and eventually submit the standard definition (which by then has several working implementations) to standards bodies like the W3C or the Internet Engineering Task Force. This may be seen as the "bottom-up" approach. Its success is largely tied to how effectively it solves what it sets out to, and equally critically, whether it does so in a manner that doesn't convey inherent advantage to a subset of its audience (such as the company employing the creators of the standard). Successful examples of this include vCard and its successor hCard.

I stumbled across the description of hCard (from An Angry Fix by Jeffrey Zeldman, a well-known figure in the online Web-design industry and community) after I had been giving some thought to a problem I had been having with contact information in various formats. Namely, that the information was in various formats, for my (ancient) PalmPilot, each of two different Nokia phones, my email package (Mozilla Thunderbird), and so on, and so on.... Keeping everything synchronised - the mundane necessity of ensuring that any given contact was in each of the needed places with the most recently updated information - is a burden sufficient to preclude any further effort, such as actually communicating anything useful or interesting to those contacts. (Maybe they read this blog...)

What's needed is a free, open source bit of software to take these various directories in varyingly historical formats, apply updates and changes to a single, current-technology directory around something like vCard (or, better, hCard), and then to spit out various dumps of this data to suit the different devices and their differing format requirements. If you think about this for a while, you can think of all sorts of ways that synchronisation could be a real pain...which update gets applied if you enter the same information two different ways on two devices? Suppose that I get energetic and add data to the "Custom Fields" in my Palm to represent data that has specific fields for the phone or Thunderbird - but since I add different data at different times, it's not always consistent? And on and on...
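To give a flavour of the core of such a tool, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the merge_contacts name, the record layout (a name, an "updated" date and a dictionary of fields), and above all the rule that the most recently updated value of a field wins. It reads no real Palm, Nokia or Thunderbird format and writes no vCard or hCard; it only shows one possible answer to the "which update gets applied?" question.

```python
def merge_contacts(*sources):
    """Merge per-device contact lists; the newest value of each field wins."""
    merged = {}
    for contacts in sources:
        for contact in contacts:
            entry = merged.setdefault(contact["name"], {})
            for field, value in contact["fields"].items():
                current = entry.get(field)
                # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
                if current is None or contact["updated"] > current[1]:
                    entry[field] = (value, contact["updated"])
    # Drop the timestamps once every source has been applied.
    return {
        name: {field: value for field, (value, _) in fields.items()}
        for name, fields in merged.items()
    }


# Hypothetical dumps from two devices.
palm = [{"name": "Ann Lee", "updated": "2006-08-01",
         "fields": {"phone": "555-0100", "email": "ann@example.org"}}]
phone = [{"name": "Ann Lee", "updated": "2006-08-15",
          "fields": {"phone": "555-0199"}}]
print(merge_contacts(palm, phone))
```

Even this toy version shows where the pain comes from: it assumes every device records when a contact was last changed and that both devices spell the contact's name identically, and neither assumption is safe in practice.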

I'm going to keep one eye open over the next few weeks or months for something that does this relatively painlessly (and, of course, if anybody knows of any, please let me know). Otherwise, it's likely to become Item 374 in my medium-priority queue for Tools I Intend To Write (Someday).

Implicit in the first paragraph, and alluded to more directly in the third (see Zeldman et al) is the fact that the W3C has spent the last 2-3 years making abundantly clear who its customers and stakeholders are, and telling those of us who are professionally tied to standard technologies but who are not ourselves multinational corporations flush with cash for endless junkets (and patent payoffs), to take a long walk off the shortest pier available. While this may be seen by some as an efficient use of resources, addressing the corporate sponsors who are the titans of the marketplace anyway, somebody made a good point along the way: the Microsofts and H-Ps and IBMs and so on of the world started out as small shops that nobody had ever heard of. Had the standards of the day been defined less for what made sense from an engineering perspective than a lock-out-the-small-guys marketing directive, the world would be a very different - and likely less advanced - place today. What goes around, comes around - and the W3C in particular is building up a lot of bad blood with the vitally "interested parties" who don't happen to (presently) be among the 200 or so largest corporations on the planet.

What will happen? On the one hand, we'll likely wind up with lots of easily available but proprietary "standards" like Adobe PDF; the word processors I've used for the last four years have supported publishing in PDF without Adobe asking for a dime. On the other hand, we'll have highly marketed, widely Diggable, proprietary-means-you-only-get-it-from-us packages. These may have lively add-on Astroturfed communities, but they won't deliver the business benefits of truly open software; you can't fork the product, you can't completely support yourself, and every use you make of the product or technology, in perpetuity, will be subject to the dictates and whims of the company that owns the product. Well and good, you say; they do, in fact, own their product, and have a right to do whatever they like with it. True, but where does that leave customers who incorporate that product into critical business processes? A year or so ago, an American friend of mine told me of one of his clients, who had a hard drive in their accounting server fail. They swapped out the drive, restored from backups, and found that the order-management package they used - to generate and track every single order from the day the company was founded right up to the guy who just got off the phone - had to be reinstalled, and that the reinstall needed a license key. Fine; they call the vendor's toll-free phone number, expecting to be back in business (literally) in a few minutes. Oops. The vendor was bought out by a much larger firm; their version is now three versions out of date and the (new) vendor requires them to buy an upgrade - at retail - to access the data they've just restored from tape.

When people ask me what the business benefits of open systems are, they don't want to hear a Stallmanesque sermon on the virtues of individual liberties, real though they may be, or the geek chic of cool code, or the cheapskate appeal to "it doesn't cost a thing". It does - in time and effort to convert and adopt within the enterprise. But what you get from it at the end of the day is control over your own business processes; you can keep running a ten-year-old word processor if you choose to, or have your accounting package customised just so, or whatever you can create a business justification for - and it's going to be much easier to cost-justify relatively audacious projects because there are no hidden surprises. Transparency, auditability, control, economy -- those may not be terribly high on the Digg word list; they may not have dozens of ...For Dummies-style books in your local chain bookshop, but people who make their living, and their employees' living, by making the numbers come out right every quarter should understand what I'm talking about. It's about time.

A side note: Anyone who is considering setting up business in Malaysia rather than other nearby countries (Thailand, Vietnam) may well want to consider the level of technical efficiency, customer support, and attitude towards service of the local telecom quasi-monopoly. For most people and businesses, Telekom Malaysia is the only game in town. As one of the subscribers/victims of their Streamyx ADSL "service" for the last three years, I have watched connection speed and reliability plummet as thousands of new subscribers are pushed onto steadily lower and lower tiers of service. I believe, for example, that they now offer a "broadband" connection at 128 Kbps; twice as fast as a standard dialup modem. I am paying for a 2 Mbps - 2,000 Kbps - connection; in the last two months, I have never witnessed transfer rates higher than 400 Kbps, and for the last week never higher than 80 Kbps. If I were living in a capitalist system with competitive markets, I would have choices. In a functionally Stalinist economy where competition against government-linked companies is tightly controlled, I have no usable choices. It has taken me well over two weeks of trying to post this blog entry. Selamat datang ke Malaysia! (Welcome to Malaysia!)

Wednesday, June 28, 2006

Promises Kept, Credibility Gaps, and Microsoft: Are we Customers or Consumers?

As reported on Slashdot, quoting Quentin Clark's WinFS team blog (which spun the item mercilessly), and commented on widely, particularly by rjdohnert and Kamal:

WinFS is dead. What has been understood for a decade or so to refer to a "Windows File System", and was recently rechristened in Microsoftspeak as "Windows Future Storage" (to imply a lack of commitment to a product or in fact anything specific at all), has had the plug pulled for what promises to be the very last time, at least in any form recognisable as the product/technology that Microsoft hyped unrelentingly whenever they needed something to keep users (and developers) committed to the Next Windows Version. This could be viewed in a number of ways; the least uncharitable explanation that conceivably touches upon our shared reality is the subject of the remainder of this item.

Yet another case of Microsoft overpromising and underdelivering? Since they really don't care about providing great software to consumers - either end users or developers - there is no real penalty for failing to keep promises (though they do, in true Rove/O'Reilly fashion, try to spin the sucker positive as hard as they can, just to keep the yokels giving the slack-jawed "wow....they say it's cool" and, as Michalski originally wrote, crapping cash).

There is absolutely no reason to keep waiting for a relational file store in Windows or any product except SQL Server (and possibly some future version of Office that requires SQL Server). There is no reason whatever to believe Microsoft will keep ANY promise made to developers or end users, now or in future. There is absolutely no reason to believe that any gee-whiz "technology preview" given by Microsoft will ever turn into a real, stable, usable product unless that product is announced (with a ship date) at the show or conference where the demo is made. Stability and usability of said product will, as with all previous Microsoft releases, have to wait for the second service pack.

What this boils down to, in other words, is a matter of trust, and commitment, and honesty, and all the values that a company which values its customers (and workers) is expected to incorporate into its ethos. That Microsoft deliberately chooses not to do this, as it has proven on numerous occasions, shows its complete and consistent contempt for those poor schmucks it sees as consumers, not customers.

We, as developers and users, have two choices. We can either continue to prove Microsoft right, gulping whatever product they deign to deliver, crapping out whatever cash they choose to take, abjectly powerless to exert any change over their behaviour. Or, we can refuse to play their game any more. There are other tools to develop products for Windows. Most of these have the additional benefit of being cross-platform.

"Cross-platform". There's a quaintly radical word in these times. The idea that people could use a variety of systems, tools, applications, to get their work done. Companies don't have to pay US$600 to buy an office "suite" with a heavy-duty word processor, spreadsheet, and yadda yadda for a manager whose work is primarily limited to short memos? Revolutionary. Selecting tools based on the needs of the user rather than the "default" "choice" for the entire organisation? If one choice of office layout doesn't fit everybody from the managing director to the secretarial pool, then by what logic should they use the same software tools to do their work? How many users of, say, Microsoft Word use more than a tiny percentage (say, 5%) of the "features" in the product? (According to surveys dating back to 2000, roughly 5%). By looking at the situation as a need to give each user tools appropriate for the task at hand, rather than imposing a uniform "solution" and adapting the task to the "solution"?

This whole WinFS affair is yet another bit of weight pushing the Good Ship Microsoft towards (or past, in some opinions) the tipping point. Those already on board might do well to examine their options; those considering extending their 'booking' may wish to reconsider. The main forces arguing that no 'realistic' options exist have been marketing-driven, rather than technically- or business-driven. Consumers blindly take whatever they're given; customers demand products that meet their needs. It is high time that those who purchase and use business computer software systems, and the tools to work with them, avail themselves of their options.

Monday, May 22, 2006

Who Needs Privacy without Liberty?

I was originally going to call this post "Pretty Good Astroturf - What Happened to PGP at the Grass Roots?"

The Register has a good piece on Whatever Happened to PGP? As a PGP (now GnuPG) user for at least ten years, I was immediately interested.

PGP, for those of you who might not remember, stands for "Pretty Good Privacy". It was arguably the first widely-deployed, open, cross-platform public key cryptography (encryption and electronic signatures) software system. At one time, the growth in usage looked like China's economic output - respectable transitioning to breathtaking, with people confidently forecasting 'incredible' within the near future. Then a funny thing happened.

People - ordinary individuals, what politicos call the "grass roots" - stopped being so interested in PGP, and PKI in general. It turned out that people were willing to be sold on the idea that the only thing they needed encryption for was to work with a "secure Web page" in their browser, so they could order stuff using a credit card. The idea that people might want to keep their personal communication private, or be able to make messages and files that they create tamper-proof, just went completely below the radar. This "just happened" to "coincide" with the increasingly shrill jingoistic/"security" propaganda being drummed into the skulls of ordinary Americans; security and identity management were no longer something that many ordinary people could use and control without feeling it all either a bit ridiculous or seditious, depending on one's politics. Still, public discussion and enthusiasm - at least among "mainstream" Americans - seemed to diminish from about 2001 onwards. The travails that PGP went through didn't help grassroots individual use - first with the US government trying to crush Phil Zimmermann, the original developer, and then the soap-operatic sagas by which Network Associates, Inc. acquired and then almost literally threw away the original PGP code base.

But, as the Register article points out, there was one very significant group of users who jumped on PGP. Since PGP depends on a "web of trust" - A trusts C because A knows and trusts B and B asserts his trust for C - the use of PGP within widespread organisations, where some central IT or other department can certify (and possibly issue) PGP keys, is seen as a natural solution to business problems of identity management. Where in the early days, a PGP user might send an encrypted message from his office email account, comfortable in the belief that his corporate masters would be none the wiser, now the corporation is including PGP in its infrastructure.
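
The "A trusts C because A trusts B and B vouches for C" idea can be sketched as a walk over a graph of who directly trusts whom. This is only an illustration of the transitive principle, not how PGP or GnuPG actually compute key validity (real implementations work with signatures on keys, graded trust levels and configurable depth limits). The names, the DIRECT_TRUST table, the trusts() helper and the max_hops limit are all invented for the example.

```python
from collections import deque

# Hypothetical data: each person maps to the set of people they trust directly.
DIRECT_TRUST = {
    "A": {"B"},
    "B": {"C"},
    "C": set(),
    "D": {"A"},
}


def trusts(graph, source, target, max_hops=2):
    """Return True if source reaches target via at most max_hops direct-trust links."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        person, hops = queue.popleft()
        if person == target and hops > 0:
            return True
        if hops == max_hops:
            # Do not extend trust beyond the allowed chain length.
            continue
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return False


print(trusts(DIRECT_TRUST, "A", "C"))  # A -> B -> C is two hops
print(trusts(DIRECT_TRUST, "C", "A"))  # trust is not symmetric
```

In the corporate setting described above, the central IT department plays the part of "B" for everybody, which is exactly what makes the scheme attractive to an organisation and less so to the individual.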

Grass roots, meet AstroTurf.

Some might see the tone of the Register article as "how can we solve this problem?" But which problem?

Popular use of PGP, or other public-key crypto, would be desirable in a libertarian culture where people valued and guarded their privacy and identity, particularly against encroachment and/or usurpation by a less-than-trusted corporation or the overweening State. While the justification for this exists in the current American social and political system, more than ever before in living memory.... the social impetus doesn't really exist anymore. An educated, informed, watchful and skeptical American population has largely forgotten how to think for itself, delegating that once-vibrant activity to the likes of Faux "News" and the Lobby.

Corporate use, on the other hand, is proceeding apace; and those users would argue that there is no real problem: a business need has been identified, a tool selected that addresses the problem, yielding a solved problem. What's not to like? Errr....yes, well, it does depend on your viewpoint. Was that the original intention that Zimmermann had in writing PGP? Almost certainly not. Does that make the use of PGP in a business environment any less "right" or "proper"? Not if it is to remain "free" as in speech; anybody can use PGP, as any free software, for any purpose permitted by the license.

What's "wrong" isn't the way that the use of PGP is growing, even though that isn't in a way that necessarily enhances human freedom or liberty, or enhances the security and privacy of individual citizens, as originally intended. Rather, it is that the political and social culture has changed, to where the values of freedom and liberty are no longer widely seen as individually attainable or discernable; rather, people believe themselves to be as free as they are told that they are - and see no need for independent evaluation or confirmation. Technology can be used to aid the solution of social and political problems; it cannot, however, be a "solution" in itself. Just as the old saying goes, "you can lead a horse to water, but you can't make him drink", you can provide the people of the world, whatever their present situation, with tools to enhance that freedom and liberty - but people will only use the tool if they care about such things. If Huxley's observation is accurate, that "The victim of mind-manipulation does not know that he is a victim. To him the walls of his prison are invisible, and he believes himself to be free. That he is not free is apparent only to other people. His servitude is strictly objective" -- then the tools available don't matter. A key is useless to one who does not see the shackles on his own wrists. That, I fear, is the level that far too many Americans - and others - have fallen to.

What happened to PGP? It got better, and became as obsolete as freedom.

Sunday, March 26, 2006

On the importance of keeping current

Now that PHP 6 is in the works, there is even less excuse than existed previously for Web sites (hosting providers in particular) not migrating to PHP 5 from PHP 4. We are faced with the unpleasant possibility for tool and library developers of having to support three major, necessarily incompatible, versions of PHP.

I am not yet up to speed on what PHP 6 is going to bring to the table, but PHP 5 (which will be two years old on 13 July 2006) makes PHP a much more pleasant, usable language for projects large and small. With a true object model, access control, exception handling, improved database support, improved XML support, proper security design concepts, and so on, it's a far cry from the revised-nearly-to-the-point-of-absurdity PHP 4.

Another great thing about PHP 5, if not strictly part of it, is the PHPUnit unit testing framework (see also the distribution blog). This is a wonderful tool for unit testing, refactoring, and continuous automated verification of your codebase. It will strongly encourage you to make your development process more agile, using a test first/test everything/test always mindset that, once you have crossed the chasm, will benefit a small one- or two-man shop at least as much as the large, battalion-strength corporate development teams that have to date been its most enthusiastic audience.
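
For readers who have not seen the xUnit style before, here is a minimal sketch of the test-first habit. It uses Python's standard unittest module rather than PHPUnit itself, purely because the structure is the same across the xUnit family; the slugify() function and its test cases are hypothetical, made up for illustration. In practice the tests are written first, fail, and then the function is written until they pass.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: turn a post title into a URL slug."""
    # Replace anything that is not a letter or digit with a space, then join the words.
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return "-".join(cleaned.split())


class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Keeping Current"), "keeping-current")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("Time v. Money v. Risk"), "time-v-money-v-risk")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")


# Run the suite programmatically so the result object can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once a suite like this exists, every refactoring can be checked by rerunning it, which is where the "continuous automated verification" comes from.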

I have so far used this tool and technique for three customer projects: the first was delivered (admittedly barely) on time, the second was actually deliverable less than 2/3 of the scheduled calendar time into the project (allowing for further refactoring to improve performance) and delivered on time, and the third was delivered 10% ahead of time, with no heroic kill-the-last-bug all-night sessions required.

Discussing the technique with other developers regarding its use in PHP and other languages (such as Python, Ruby, C++ and of course Java; the seminal "JUnit" testing framework was written for Java) gives the impression that this experience is by no means unique or extreme (nor did I expect it to be). Given that two of my three major career interests for the last couple of decades have been rapid development of high-quality code and the advancement of practices and techniques to help our software-development craft evolve towards a true engineering discipline, this would seem a natural thing for me to get excited and evangelical about. (The third, in case you're wondering, is the pervasive use of open standards and non-proprietary technologies to help focus efforts on true innovation).

All of this may seem a truly geeky thing to rave about, and to a certain degree, I plead guilty to that. But it should also be important, or at least noteworthy, to anybody whose business or casual interests involve the use of software or software-controlled artifacts like elevators and TiVo. By understanding a little bit about how process and quality interact, clients, customers and the general-user public can help prod the industry towards continuous improvement.

Because, after all, "blue screens" don't "just happen".

Once more into the breach, dear friends; once more....

To the half-dozen or so of you reading this blog, thank you; and for those of you who wondered what's happened to me and this blog over the last several months, the answer is both "a great deal" and "not much at all".

I had been ill for a couple of months, with what the doctors insisted was just an ordinary flu, and then a cold, and then an ordinary (NOT H5N1, thank you very much) flu that has kept my close friends busy trying to spy the license tag number of the lorry that keeps running me down. I am better now, thank you.

I have also changed hosting providers for my professional Web site and email hosting; the new crew look to be a good outfit so far:

  • they understand the value of responding quickly to customer enquiries, no matter how harebrained;

  • they understand Linux and Apache, and offer (or at least give a convincing appearance of offering) more than just a "me-too" service;
  • their people know their way around their system (see the first comment);

  • they have sensibly large limits on disk space and bandwidth; which means that

  • they allow you to host a lot of tools and libraries and addons that you can manage yourself (think PEAR for you PHP types) without having to rely on the (necessarily limited) knowledge of a central administrator who may not be quite as up to scratch on version X of the FooBar publishing framework as you are.


In short, as I said, off to a good start. After getting some minor details worked out, and being on my feet again, the all-new seven-sigma.com Web site should be up within the next couple of days.

Sunday, February 19, 2006

If you can't extend it, is it really an eXtensible HTML?

Arrrrrrrrrrrrrggggggghhhhhhhhhhhhhhh!!!!! So much for consistency....

As anybody who has worked with me in the last 3-4 years well knows, I have been an enthusiastic advocate of DocBook as a documentation markup vocabulary for various purposes, and by extension, XML-based tools for all manner of things (Apache Ant and so on).

One feature I use regularly in DocBook XML source documents is the internal subset, which lets you define entities and include files defining entities not part of the original DTD. So, for instance, my standard software.ent file has an entry (all on one line) of
<!ENTITY mswindows '<ulink url="http://www.apple.com/switch">Microsoft Windows</ulink>'>
This way, anywhere in a DocBook file that includes that entity definition, I can type &mswindows; and, when the document is transformed (into XHTML, PDF, RTF or whatever), the desired link and text will appear in place of the entity. This is an obvious lifesaver when you want to include, for instance, links to glossary definitions for unfamiliar terms scattered through a document.
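
To see the expansion happen outside a full DocBook toolchain, here is a small sketch using Python's standard xml.etree.ElementTree parser. The one-paragraph document is made up for the example, and the entity is declared directly in the internal subset rather than pulled in from software.ent, since fetching an external file is a separate matter that depends on the parser.

```python
import xml.etree.ElementTree as ET

# A made-up document whose internal subset declares the entity inline.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE para [
  <!ENTITY mswindows '<ulink url="http://www.apple.com/switch">Microsoft Windows</ulink>'>
]>
<para>This was written on &mswindows; last week.</para>"""

root = ET.fromstring(DOCUMENT)

# After parsing, the entity reference has become a real <ulink> child element.
link = root.find("ulink")
print(link.get("url"))
print("".join(root.itertext()))
```

The parser does the substitution before any stylesheet sees the document, which is why the same source can feed the XHTML, PDF and RTF transformations alike.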

Fine. But the current state of XHTML (the XML-based successor to HTML) simply doesn't support it. It does not appear to be possible to have an XHTML document with an internal subset parsed correctly by any current major browser on Windows, Linux or the Macintosh. Various Google searches such as this one produce links to pages that say, with varying levels of emphasis and literal wording, "you can't use internal subsets for XHTML that is to be rendered by a browser". It seems that in the force-fit of XML to HTML that produced XHTML, the concept of different "streams", or purposes, for documents was introduced. XHTML which is to be rendered in a Web browser has one set of limitations (including the internal subset); whereas XHTML conformant to the same definition documents which is to be processed as "pure XML" has another.

To say that this sucks is to use that colloquialism as an extreme understatement, akin to saying that tsunamis are wet. This limitation closes off an entire range of applications that would use dynamically-generated XHTML as browser-viewable data in the same spirit as XML generally (without writing an otherwise redundant app to parse and reformat the data). The benefits — and they are significant — of an XML-based browser markup language are (in my view) seriously degraded by foolishness like this.

Of course, several of you are already thinking, I could just use XSLT instead. I had previously wondered why PHP and other Web scripting languages included support for XSLT processing. Now I know, I guess.

If anybody has any corrections or other good ideas, please let me know.

Friday, December 02, 2005

What You've Known Since You Were Six: Sharing Helps Everyone

By the mid-1980s, it had become virtually impossible for a single developer to write a technically and economically interesting, large-scale application in a market-friendly period of time. Likewise, by the mid-1990s, it had become unusual to find application-level software which did not make use of some sort of database, both as part of the application itself and (for relatively mature development shops) as an integral part of development tools (configuration management, defect tracking and so on).

I would argue that, by 2004, it became infeasible to do development work -- commercial or otherwise -- in a craftsmanlike manner without the pervasive use of collaboration software. These systems -- wikis, blogs, message boards/discussion software, and so on -- dramatically improve the capability and effectiveness of a development team. (All of these applications, incidentally, make heavy use of databases.)

Any craft, arguably especially the craft of software development, lives by the traditional medical manifesto, "if you don't write it down, it never happened". Too often, however, information is written down (captured), but there is no means for organising and retrieving that information effectively. These collaboration tools each address that need for capture, (flexible) organisation and presentation of information in subtly, but importantly, different ways.

Wikis are great for organising documents in a way that they can be shared, collaborated on, retrieved, and massively hyperlinked, with attachments, comments and so on. Blogs (also look at blogger.com) are where individuals can write about anything and everything, with links to outside pages and other resources of interest. Most blogs, including this one, support readers posting comments to blog entries, or to other comments. (So please let me know what you think!) A blog differs from a wiki in a manner similar to the way that newspaper columns differ from journal articles or books; mainly in the semantics and scope and style of information being presented. Also, while blogs can be edited after the fact, they rarely are, whereas a wiki with a good community around it has its content change regularly.

Finally, we come to discussion software, sometimes referred to as message boards or bulletin boards. These are the latest form of a general system as old as networked computing itself. Systems like phorum, phpBB and vBulletin are all Web-based systems which allow users to post messages in forums, which contain discussions of a particular subject. These can be searched, attachments can be made, and so on. If a group needs to have a discussion, come to a consensus, and see how it got there later, this is a better tool for that sort of thing than a wiki or a blog would be. (Documents which are created in response to whatever decision was taken can be collaborated on in a wiki; individuals can expound on related opinions or useful information in their blogs.)

Another bonus to all of this is that all of these applications involve databases, with most of the information being text or (hopefully) relatively small binary files. The means to do backups and restores, as well as formal version control, for each of these is well known. Often such facilities are supported directly by the administrative interface for the software. Thus, development organisations can record, preserve and recall vital information without having it locked up in people's heads, or (possibly worse) written down haphazardly on random bits of paper and requiring significant decoding effort by people other than the author who wish to read them.

These tools also, combined with email and instant messaging, almost completely remove the need (or even benefit) for teams to be located in the same physical location, working at the same time. Development teams now have the tools to be highly effective regardless of the physical location or time-zone differences of the team members. Indeed, this writer has participated in such distributed teams, developing both commercial and non-commercial software systems, and can enthusiastically vouch for their effectiveness.

In short, within a very short period of time, we can expect the adoption of collaborative tools by software development teams of all environments and domains to increase dramatically. Use of these tools is, or shortly will be, an effective discriminator for success, especially in explicitly competitive environments. If you are participating in a team developing software, Web sites, or similar systems and you're not using these tools, why not? Your competition probably is.

Thursday, December 01, 2005

Religious Icons and Text....Editors; Some People Get Really Attached to Their Tools

For several years now, when I'm editing text files (program source code, documentation, blog entries, whatever) under Microsoft Windows, I've used the free Crimson Editor. At Version 3.70 (which can be downloaded here) since 22 September 2004, it is a perfectly reasonable general-purpose text/source code editor. It does the basic things most people expect: syntax highlighting for different languages, macro recording and editing, support for calling external tools and integrating their output into the file being edited, and so on. But I have gradually become less than thrilled with it for a few (possibly quirky) reasons:
  • While available without charge (economically free), the source code is not available (freedom of action is restricted). Not being an open source product, it is not open to real customisation beyond simple keystroke macros and language syntax definitions.
  • Most of my Web development for the last few years has revolved around the PHP scripting language. More recently, this has been supplemented by Python. Crimson lacks any of the features that other, more specific editors like NuSphere PhpED or ActiveState Komodo offer (although, granted, at a price).
  • What I have been doing even more of lately, however, has been document creation and editing using DocBook XML, from which I can create HTML, PDF, text and RTF word-processing files from a single set of sources. Very cool stuff. Until I get into Crimson and start pounding the keyboard - in frustration; the editor recognises that it's dealing with an XML file and can colour-code tags, but there isn't a whole lot else it knows about. I eventually got tired of doing basic scut-work and memory exercises over and over again, and went looking for something else. (At least a lighter hammer to hit myself in the head with).
As always, my first two stops were SourceForge and Google. You know Google. You may not know SourceForge, but you should. If you're looking for software -- to do anything, on anything, in anything -- this is the place to look first. A clearinghouse of open source projects, as I write these words their website reports 106,861 registered projects and 1,186,755 registered users. Those range from small scratch-an-itch projects on up to very complex, very complete line-of-business systems like ERP and CRM. They currently have 2,274 projects listed in the Text Editors category. Obviously, I'm not going to look at all of these, but there are nice ways to whittle down the list.

I had defined for myself a half-dozen usage modes that I was going to use to evaluate each editor and compare it against Crimson Editor. Now, to a programmer, or a writer, editors are like the old advertising line for a brand of potato crisps, "bet you can't have just one". I now have, temporarily, mind you, eight different editors on my Windows Start menu, with another 3 or 4 whose installation procedure was so basic that it didn't even install any menu items.

These basically fell into three groups. First are the demo or 'lite' versions of commercial products, which invariably annoyed me with the artificial limitations intended to induce you to ante up for the 'real' paid-for version. These were quickly discarded. Second were the well-meaning but immature/incomplete 'freeware' packages (such as, to some degree, Crimson). These invited a certain amount of experimentation with their source code to tweak things up a bit before being abandoned. While some of these (such as Programmers Notepad) are clearly on the right track, they just didn't "feel" right -- and we are talking about the most subjective of tools. Just as a carpenter may have a favourite hammer, or an electrician a meter that works just the way he likes it, a writer has to feel "comfortable" with the editor and other tools he uses. (Also seen were several packages so lacking in completeness and/or competence that it is fervently hoped that they were pure larks; exploratory projects to teach their writers something. Humility should come along for the ride; several comments on support forums were, quite deservedly, scathing.)

Now I am in the midst of conversion from Crimson Editor to JEdit. While it has a few quirks and oddities, likely due both to the fact that I am using a beta (prerelease) version and to the peculiarities of Java on Microsoft Windows, it does most of the mundane things quite well and some greatly appreciated extra features uniquely well.
  • As a Java program, it runs not only on Windows but on Linux, the Apple Macintosh, BSD Unix, your pocket supercomputer, whatever.
  • Being an open source project, you have total control. Think some features are just bloated junk and want to get rid of them? Go right ahead. Thought of a cool new feature that would fit right in with what's already there? Fire up your inner coder and go write it. Think something's pretty cool, but think you can make it faster/better? No problem -- and when you're done, you can contribute the changes back to the original project, start distributing your modified version on your own, or just hold onto the goodies for yourself. That's the freedom (of action) that open source gives you.
  • Since it's hosted on SourceForge, you don't need to worry about the original team getting bored and walking away, leaving the code on a site that eventually goes dark. Even if JEdit does go dormant for a time (a highly unlikely scenario, apparently), SourceForge makes it easy for some fresh talent to pick things up and carry forward at a later time.
  • If you're working in DocBook XML, it has some really nice features: not only does it support autocomplete for tags, it also autocompletes entities like &eacute; for é. Better still, for any entity that you define -- any entity which would be known to a parser reading your document -- autocompletion is supported. To show a specific example: with my standard "data dictionary" entities file (what I use to reuse standard phrases and URLs in all my work), I can type &ap and the list of candidates has apache-httpd as the third item in a dropdown list. Two taps on the down-arrow key and I've saved eight keystrokes. If you're authoring in DocBook XML, you should be making pervasive use of entities to aid in consistency and reuse. This makes it lots easier.
I'll likely come back and update this blog entry as I gain more experience with JEdit, and as it matures into an "official" release of Version 4.3. If anybody with writing or coding experience really is reading this, I'd appreciate it if you'd leave comments describing your experiences. Thanks.

Thursday, October 27, 2005

Craft, culture and communication

This is a very hard post for me to write. I've been wrestling with it for the last two days, and yes, the timestamp is accurate. If I offend you with what I say here, please understand that it is not meant to be personal. Rather, it probably means you may want to pay close attention.

When I was in university, back in the Pleistocene, I had a linguistics professor who went around saying that
A language is the definition of a specific culture, at a specific place, at a specific time. Change the culture, the place or the time, and the language changes - and if the language changes, it means that something else has, too.
Why is this relevant to the craft of software development?

Last weekend, I picked up a great book, Agile Java™: Crafting Code with Test-Driven Development, at the Kinokuniya bookstore at KLCC. There are maybe half a dozen books that any serious developer recognises as landmark events in the advancement of her or his craft. This, ladies and gentlemen, is one of them. If you are at all interested in Java, in high-quality software development, or in managing a group of software developers under seemingly impossible schedules, and if you are fully literate in the English language as a means of technical communication, then bookmark this page, go grab yourself a copy, read it, come back, and reread it tomorrow. It's not perfect (I would have liked to see the author use TestNG as the test framework rather than its predecessor, JUnit), but those are more stylistic quibbles than substance; if you go through the lessons in this book, you will have some necessary tools to improve your mastery of the craft of software development, specifically using the Java language and platform.

I immediately started talking up the book to some of my projectmates at Cilix, saying "You gotta learn this".

And then I stopped and thought about it some more. And pretty much gave up the idea of evangelising the book - even though I do intend to lead the group into the use of test-driven development. It is the logical extension of the way I have been taught (by individuals and experience) to do software development for nearly three decades now. It blew several of the premises I was building a couple of white papers on completely away - and replaced them with better ones (yes, Linda, it's coming Real Soon Now). TDD may not solve all your project problems, cure world poverty or grow hair on a billiard ball, but it will significantly change the way you think about - and practise - the craft of software development.

If you understand the material, that is.

There are really only three (human) languages that matter for engineering and for software: English, Russian and (Mandarin) Chinese, pretty much in that order. Solid literacy and fluency in Business Standard English and Technical English will enable you to read, comprehend and learn from the majority of technical communication outside Eastern Europe and China (and the former Soviet-bloc engineers who don't already know English are learning it as fast as they can). China was largely self-reliant in terms of technology for some time, for ideological and economic reasons; there's an amazing (to a Westerner) amount of technical information available in Chinese - but English is gaining ground there too, if initially often imperfect in its usage.

Coming back to why my initial enthusiasm about the book has cooled, for those of you who aren't actually from my company, I work at an engineering firm in Kuala Lumpur, Malaysia called Cilix. We do a lot of (Malaysian) government contract work in various technical areas, but we are also trying to grow a commercial-software (including Web applications) development group. Until recently, I managed that group; after top management came to its senses, I am now in an internal-consulting role. As Principal Technologist, I see my charter as consulting to the various groups within the Company on (primarily) software-related and development-related technologies, techniques, tools and processes, with a view to make our small group more effective at competition with organisations hundreds of times our size.

Up to now, we've been in what a Western software veteran would recognise as "classic startup mode": minimal process, chaotic attempts at organisation, with project successes attained through the heroic efforts of specific, talented individuals. My job is, in part, to help change that: to help us work smarter, not harder. Enter test-driven development, configuration management, quality engineering, and documentation.

Documentation. Hmmm. Oops.

One senior manager in the company recently remarked that there are perhaps five or six individuals in the entire company with the technical abilities, experience and communication abilities to help pull off the type of endeavour - both in terms of the project and how we go about it. Two or at most three of those individuals, to my knowledge, are attached to the project, and one of these is less than sanguine about the currency of technical knowledge and experience being brought to bear.

Since arriving on the project, I have handed two books to specific individuals, with instructions to at least skim them heavily and be able to engage in a discussion of the concepts presented in a week to ten days' time. Despite repeated prodding, neither of those individuals appeared to make that level of effort. This is not to complain specifically about the individuals; when developers within the group were informally asked how many technical books they had read in the last 18 months, the answers averaged solidly in the single digits. A similar survey taken in comparable groups at Microsoft, Borland, Siemens Rolm or Weyerhaeuser - all companies where I have worked previously - would likely average in the mid-twenties at least. So too, I suspect, would surveys at Wipro, Infosys or PricewaterhouseCoopers, some of our current and potential competitors.

While American technical people are rightly famous for living inside their own technical world and not getting out often enough, that provides only limited coverage as an excuse. In a craft whose very raison d'être is information, there is an oft-repeated truism (first attributed, to my knowledge, to Grace Hopper): "90% of what you know will be obsolete in six months; 10% of what you know will never be obsolete. Make sure you get the full ten percent." If you don't read -- both books and online materials -- how can you, as a software (or Web) developer, have any credible hope of remaining current or even competent at your craft?

That principle extends to organisations. If the individual developers do not exert continuous efforts to maintain their skills (technical and linguistic) at a sufficiently high level, and their employer similarly chooses not to do so, how can that organisation remain competitive over the long term, when competitiveness may be directly linked to the efficiency and effectiveness with which that organisation acquires, utilises and expands upon information - predominantly in English? How can smaller organisations compete against larger ones which are more likely to have the raw manpower to scrape together a team to accomplish a difficult, leading-edge project? "Learn continuously or you're gone" was an oft-repeated mantra from business and industry participants in a recent Software Development Conference and Expo, an important industry conference. What of the individuals or organisations who choose not to do so?

Those of us involved in the craft of software and Web development have an obvious economic and professional obligation to our own careers to keep our own skills current. We also have an ethical, moral (and in some jurisdictions, fiduciary, legal) obligation to encourage our employers or other professional organisations to do so. There is no way of knowing whether, or how successfully, any given technology, language or practice will still be in use in ten years' time, or even five. How many times has the IT industry been rocked by sudden paradigm shifts -- the personal computer, the World Wide Web -- which not only created large new areas of opportunity, but severely constrained growth in previously lucrative areas? I came into this industry at a time when (seemingly) several million mainframe COBOL programmers were watching their jobs go away as business moved first to minicomputers, then to PCs. History repeated itself with the shift to graphical systems like the Apple Macintosh and Microsoft Windows, and again with the World Wide Web and the other Internet-related technologies, and yet again with the offshoring craze of the last five years. What developer, or manager, or director, has the hubris to in effect declare that it won't happen again, that there won't be a new, disruptive technology shift that obsoletes skills and capabilities?

But whatever shift there is, whatever new technology comes along that turns college dropouts into megabillionaires, that changes the professional lives of millions of craftspeople... it will almost certainly be documented in English.

Tuesday, October 18, 2005

About me and my work at Cilix

I'm working on a lot of things for my work at Cilix, an engineering firm in Kuala Lumpur, Malaysia. First off, let me be clear on one thing: this blog is not officially sanctioned in any way by Cilix; this is 'just me'.

We call ourselves "A Knowledge Company". What that means, at least in my understanding, is that we apply professional knowledge and experience, augmented heavily by technology, to solve customers' knowledge-management and IT challenges. As such, we do a lot of writing - documents, Web pages, software, ad (nearly) infinitum.

We're a small shop as these things go, and our competition comes from much larger organisations with instant multinational name-brand recognition. Like any small firm, we have to win our first projects with a given client by promising - and delivering - a better value proposition than our competition. Where we get repeat business - again, like any similar firm - is by being agile, efficient, and above all, competent to the point of being unquestionably the least risky vendor for a particular solution.

Those attributes, in turn, lead us to consider issues like process, quality, and superlative knowledge of everything we are about. These issues, and how we as an organisation work through them, were what originally attracted me to the Company when I was approached and offered a position here. These issues are also the foci of what I expect to accomplish with this blog and the related collaboration tools (such as the Wiki).



I am also trying to evangelise and lead the implementation of open documentation and data-format standards at Cilix. This involves, among other things, migrating away from proprietary, binary formats like Microsoft Office documents to open, preferably text-based formats. As it happens, many of these open, text-based formats are based on XML vocabularies, such as DocBook and SVG.

Why are text-based formats preferable? Lots of reasons:
  • They are usually much more compact (and compressible) than comparable binary formats. Converting mostly-text Microsoft Word documents to DocBook equivalents often yields size reductions of 80% or more (think how much more convenient email attachments would be);
  • They are usable with a wider variety of tools. I can throw a text file on my PalmPilot and fiddle with it far more easily than a Microsoft Word document, for instance;
  • They are more amenable to most version-control systems, particularly CVS and Subversion. Instead of making copies of each version of a binary file, all that is required is to take the difference between two different versions of a text file - a much easier and more reliable operation. I have seen version control systems of all flavours - SourceSafe, CVS, Atria/Rational ClearCase - irretrievably corrupt binary files when insufficient care was taken by the configuration manager in dealing with binary files;
  • They are more amenable to being stored in databases. Many databases (such as MySQL) can return result sets packaged as XML fragments; this, combined with an XSLT processor and stylesheet, opens the door to some truly compelling presentation capabilities.
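To make the version-control point concrete, here is a minimal sketch in Python (my choice for illustration; nothing above depends on it) of taking the difference between two revisions of a text file. The file names and the two tiny DocBook-style revisions are invented for the example; real tools such as CVS or Subversion store and present deltas in their own ways.

```python
import difflib

# Two hypothetical revisions of a small DocBook-style text file.
old_revision = (
    "<para>We call ourselves a Knowledge Company.</para>\n"
    "<para>We do a lot of writing.</para>\n"
).splitlines(keepends=True)

new_revision = (
    "<para>We call ourselves a Knowledge Company.</para>\n"
    "<para>We do a lot of writing: documents, Web pages, software.</para>\n"
).splitlines(keepends=True)


def text_delta(old_lines, new_lines):
    """Return the unified diff between two revisions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="r1/intro.xml",  # hypothetical path, for labelling only
            tofile="r2/intro.xml",    # hypothetical path, for labelling only
        )
    )


delta = text_delta(old_revision, new_revision)
print("".join(delta))
```

Only the changed line shows up in the delta, marked with "-" and "+"; the unchanged paragraph appears as context. Nothing comparably readable is available for an opaque binary document.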
By taking advantage of these capabilities, we should be able to create better products with more predictable (and shorter) schedules without either greatly expanding the development team or pushing the present staff to the point of burnout. There is a saying in Silicon Valley in California, only partly tongue-in-cheek:
It isn't a startup until somebody dies
Here's hoping that's one "tradition" that's not exported anywhere outside the Valley.

The Vision, Forward through the Rear-View Mirror

The vision I'm trying to promote here, which has been used successfully many times before, is that of a very flexible, highly iterative, highly automated development process, where a small team (like ours) can produce high-quality code rapidly and reliably, without burning anybody out in the process. (Think Agile, as a pervasive, commoditized process.) Having just returned (17 October) from being in hospital due to a series of small strokes, I'm rather highly motivated to do this personally. It's also the best way I can see for our small team to honour our commitments.

To do this, we need to be able to:
  • have development artifacts (code/Web pages) integrated into their own documentation, using something like Javadoc;
  • have automated build tools like Ant regularly pull all development artifacts from our software configuration management tool (which for now is Subversion);
  • run an automated testing system (like TestNG) on the newly built artifacts;
  • add issue reports documenting any failed tests to our issue tracking system, which then crunches various reports and
  • automatically emails relevant reports to the various stakeholders.
The whole point of this is to have everybody be able to come into work in the morning and/or back from lunch in the afternoon and know exactly what the status of development is, and to be able to track that over time. This can dramatically reduce delays and bottlenecks from traditional flailing about in a more ad hoc development style.
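The cycle described above can be sketched roughly as follows. This is an illustration in Python under loud assumptions, not our actual setup: every step (checkout, build, test run, issue filing, emailing) is passed in as a hypothetical function, and the stand-ins at the bottom are fakes that touch no real Subversion, Ant, TestNG or issue-tracker API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BuildReport:
    """What the stakeholders see after one automated cycle."""
    revision: str
    failed_tests: List[str] = field(default_factory=list)
    issues_filed: List[str] = field(default_factory=list)
    emails_sent: List[str] = field(default_factory=list)


def run_cycle(
    checkout: Callable[[], str],                      # pull artifacts, return a revision id
    build: Callable[[str], bool],                     # True if the build succeeded
    run_tests: Callable[[str], Dict[str, bool]],      # test name -> passed?
    file_issue: Callable[[str, str], str],            # (revision, test) -> issue id
    send_report: Callable[[str, BuildReport], None],  # (recipient, report)
    stakeholders: List[str],
) -> BuildReport:
    """Run one checkout/build/test/report cycle and return its report."""
    revision = checkout()
    report = BuildReport(revision=revision)

    if not build(revision):
        # A broken build is recorded as a single failure; no tests are run.
        report.failed_tests.append("<build>")
        report.issues_filed.append(file_issue(revision, "<build>"))
    else:
        for name, passed in sorted(run_tests(revision).items()):
            if not passed:
                report.failed_tests.append(name)
                report.issues_filed.append(file_issue(revision, name))

    for person in stakeholders:
        send_report(person, report)
        report.emails_sent.append(person)
    return report


# Fake steps, purely for demonstration.
outbox = []
report = run_cycle(
    checkout=lambda: "r1234",
    build=lambda rev: True,
    run_tests=lambda rev: {"test_login": True, "test_export": False},
    file_issue=lambda rev, test: f"ISSUE: {test} failed at {rev}",
    send_report=lambda who, rep: outbox.append((who, len(rep.failed_tests))),
    stakeholders=["dev-lead@example.com", "qe@example.com"],
)
print(report)
```

Passing the steps in as functions keeps the cycle itself independent of any particular tool, so swapping one build or test tool for another would not change the loop.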

Obviously, one interest is test-driven development, where, as in most so-called Extreme Programming methods, all development artifacts (such as code) are fully tested at least as often as the state of the system changes. What this means in practice is that a developer would integrate with each artifact the code for testing that artifact. Then, an automated test tool would run those tests and report any results to the Quality Engineering team. This would not eliminate the need for a QE team; it would make the team more effective by helping to separate the things which need further exploration from the things that are, provably, working properly.
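As a small illustration of tests living alongside the artifact they exercise, here is a sketch using Python's standard unittest module rather than the Java tools discussed in this post. The word_count function and the shape of the summary handed to QE are hypothetical, chosen only to keep the example short.

```python
import unittest


def word_count(text):
    """Artifact under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    """Tests kept right next to the artifact they exercise."""

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_counts_words_separated_by_mixed_whitespace(self):
        self.assertEqual(word_count("a  knowledge\tcompany\n"), 3)


def run_and_summarise(test_case):
    """Run one TestCase class and return a small summary for the QE team."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return {
        "run": result.testsRun,
        "failed": [test.id() for test, _ in result.failures + result.errors],
    }


summary = run_and_summarise(WordCountTest)
print(summary)
```

An empty "failed" list is the "provably working" pile; anything that lands in it is what QE would explore further.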

Why does this matter? For example, there was an article on test-driven development in the September 2005 issue of IEEE Computer (reported on here) that showed one of three development groups reducing defect density by 50% after adopting TDD, and another similar group enjoying a 40% improvement.

All this becomes interesting to us at Cilix when we start looking at tools like:
  • Cobertura for evaluating test coverage (the percentage of code accessed by tests);
  • TestNG, one successor to the venerable JUnit automated Java testing framework. TestNG is an improvement for a whole variety of reasons, including being less intrusive in the code under test and having multiple ways to group tests in a way that makes it much harder for you to forget to test something;
  • Ant, the Apache-developed, Java-based build tool. This is, being Java-based and not interactive per se, easy to automate;
  • and so on, as mentioned earlier.

A Lever, or a toothpick?

Any modern development effort which is complex enough to be commercially and/or technically interesting requires active, continuous collaboration between professionals and craftsfolk of various disciplines and specialisations. For instance, most organisations developing computer software have, in addition to the designers and coders of the software itself, several other interested stakeholders: quality engineers, documentation authors and editors, sales and marketing specialists, and various flavours of managers. Each of these individuals and groups has different capabilities and roles with regard to the project being developed, and each has different perspectives and different needs - but one need that all share, knowingly or not, is the ability to communicate effectively and efficiently with each other. This involves the creation or acquisition of information, its refinement, analysis, discussion and use within the context of the project. The end goal, of course, is the completion and delivery of some sort of artifact that meets the needs of the organisation and delights that product's customers, without sending the development organisation on a death march in the process.

"But wait", you might reasonably say, "we already do this. We have meetings, minutes are taken, transcribed and emailed about, lots of other emails get sent back and forth, we have documents like functional specifications and design documents and whatnot to keep ourselves organised - what do we need all this gimcrackery for?" All of which is perfectly true, jus as you can put harness and bit on your horse, hitch up a carriage, and travel from Kuala Lumpur to Singapore - or you could catch an airline flight instead. There are countless organisations, and a depressingly high proportion of smaller ones, who continue to solve earl-21st-century problems with early-20th-century tools and practices. We know better. One of the points that many leading authorities, such as Steve Maguire in his excellent book Debugging the Development Process, make is that for each hour of meetings which a knowledge worker attends, it takes at least another hour for him or her to regain the level of productivity in work product creation which would have been effected had the meeting not taken place. So, for a typical large-corporate developer who spends an hour every day in meetings that could have their purpose accomplished through less intrusive means, the company is taking a 25% hit in productivity for that individual. Take 1/4 of the payroll of, say, Maxis, or even my own company Cilix, and that starts to add up.

The Lever That Rocks Our World

The ancient Greek philosopher Archimedes is quoted as having said:

Give me a place to stand and with a lever I will move the whole world.


We're not here to attempt anything on that scale, but being involved in IT often feels that way.

This blog is one of the vehicles which I intend to use to record, discuss and build upon various projects, ideas and memes which I believe important in the context of a modern software-development organisation. By the term "software-development organisation", I am also including groups which create Web applications and content, podcasts, or similar "media" artifacts which rely on some form of computer or other electronic technology for distribution.