
Tuesday, 28 August 2012

Even Typos Can Teach You Something: BE CAREFUL!

CoffeeScript is an interesting language, in both senses of that word. Whereas it tries to (and largely succeeds in) insulating the hapless coder from what Douglas Crockford calls "the bad parts" of JavaScript, it does not filter out all of the mind-blowing bits of JavaScript. Arrays are a good example.

Think back to your CS 101 class (or your K&R for you self-taught folks). What is an array?

In most languages, an array is a numbered sequence of elements, using (usually) sequential non-negative integers for identifying a specific element. Many statically-typed languages (such as C and Java) require that array elements each be of the same type; dynamic languages such as Python relax that to one degree or another.

But CoffeeScript and its underlying JavaScript (which I collectively call just Script) suck that up and then do some anatomically improbable things with/to it. Consider this bit of CoffeeScript interaction from a terminal (in Mac OS X 10.8, CoffeeScript version 1.2.0):

Jeffs-iMac:tmp jeffdickey$ coffee
coffee> foo = []                      # declare an empty array
[]
coffee> foo[4] = 'a'                  # assign offset 4; 0-3 have undefined values
'a'
coffee> foo[2.718281828459045] = 'e'  # Floating-point array index? Why not?
'e'
coffee> foo[-2] = 'b'                 # Negative numbers are just fine, too
'b'
coffee> foo['c'] = 27.4               # A string can be an index. It's still an array.
27.4
coffee> foo                           # What do we have now?
[ ,
  ,
  ,
  ,
  'a',
  '2.718281828459045': 'e',           # Non-sequential offsets look a lot like object fields
  '-2': 'b',
  c: 27.4 ]
coffee> foo.length
5
coffee>                               # Control-D to get out of the REPL
Jeffs-iMac:tmp jeffdickey$

So? What's the upshot?

Well, one thing many languages do to/for you is to throw an exception when your program gets too fast and loose with its indexing into an array. In Script, that doesn't happen; any scalar is a usable index, and if you pass something in that isn't a scalar (like an object), it'll have its toString() method called to generate an index value. And did I mention that this is all case-sensitive? Your typo opportunities are limited only by your imagination and by the accuracy of your fingers…

In case it wasn't obvious from the preceding incredulous rant, I really wonder why this "feature" is in the language; would it have really been that coercive to say, "I'm sorry, Dave; I'm afraid I can't do that. Perhaps you'd rather use an object hash instead?"

Thursday, 19 July 2012

An Immodest Proposal: Show Me the Code

I've been doing a lot of CoffeeScript work lately, along with Ruby using Rails and Sinatra. Especially with regard to my CoffeeScript, I've moved away from my recent Ruby-coloured "your code should be all the documentation you need" philosophy.

I've found all kinds of uses for Markdown, particularly including docco.coffee, written by the developer of CoffeeScript itself. Docco "is a quick-and-dirty, hundred-line-long, literate-programming-style documentation generator. It produces HTML that displays your comments alongside your code. Comments are passed through Markdown, and code is passed through Pygments syntax highlighting." It may not be the greatest thing since sliced bread, but the output does make complex yet deliberately-written code much clearer.

The downside to well-commented code, of course, is that it gets bulky. One of my larger CoffeeScript files has ~140 lines of actual code, engulfed in a source file that currently tops 330 lines. Ouch.

I was working on it just now, and an idea popped into my head: the "proposal" of the title.

Comments should be selectively elided or folded, in a fashion similar to the code folding feature offered by most modern editors.

I meander around between four different editors on my Mac (TextMate, Sublime Text 2, Komodo IDE and MacVim), and none of them appear to support such a feature out-of-the-box.

Does anybody know of a plugin for any of the above that does this?

Sunday, 6 May 2012

Rules can be broken, but there are Consequences. Beware.

If you're in an agile shop (of whatever persuasion), you're quite familiar with a basic, easily justifiable rule:

No line of production code shall be created, modified or deleted in the absence of a failing test case, for which the change is required to make the test case pass.

Various people add extra conditionals to the dictum (my personal favourite is "…to make the test case pass with the minimal, most domain-consistent code changes currently practicable") but, if your shop is (striving towards being) agile, it's very hard to imagine that you're not honouring that basic dictum. Usually in the breach at first, but everybody starts in perfect ignorance, yes?

I (very) recently was working on a bit of code that, for nearly its entire existence over several major iterations, had 100% test coverage (technically, C0 or line coverage) and no known defects in implemented code. It then underwent a short (less than one man-week) burst of "rush" coding aimed at demonstrating a new feature, without the supporting tests having been done beforehand. It then was to be used as the basis for implementing a related set of new features, that would affect and be affected by the state of several software components.

That induced some serious second-guessing. Do we continue mad-hatter hacking, trusting experience and heroic effort to somehow save the day? Do we go back and backfill test coverage, to prove that we understand exactly what we're dealing with and that it works as intended before jumping off into the new features (with or without proper specs/tests up front)? Or do we try to take a middle route, marking the missing test coverage as tech debt that will have to be paid off sometime in The Glorious Future To Come™? The most masochistic of cultists (or perhaps the most serenely confident of infinite schedule and other resources) would pick the first; the "agile" cargo-cultist with an irate manager breathing fire down his neck the second; but the third is the only pragmatic hope for a way forward… as long as development has and trusts in assurances that the debt will be paid in full immediately after the current project-delivery arc (at which time increased revenue should be coming in to pay for the developer time).

The moral of the story is well-known, and has been summed up as "Murphy" (of Murphy's Law fame) "always gets his cut", "payback's a b*tch" and other, more "colourful" metaphors. I prefer the dictum that started this post, perhaps alternately summed up as

Don't point firearms at bits of anatomy that you (or their owner) would mind losing. And, especially, never, ever do it more than once.

Because while even the most dilettante of managers is likely to have heard Fred Brooks' famous "adding people to a late project only makes it later" or his rephrasing as "nine women cannot have a baby in one month", too few know of Steve McConnell's observation (from 1995!) that aggressive schedules are at least equally delay-inducing. If your technical people say that, with the scope currently defined, something will take N man-weeks, pushing for N/4 is an absolutely fantastic way to make delivering in even N*4 a major challenge.

Remember, "close" only counts in horseshoes and hand grenades and, even then, only if it's close enough to have the intended effect. Software, like the computers it runs on, is a binary affair; either it works or it doesn't.

Sunday, 15 August 2010

Where Do We "Go" From Here? D2 Or Not D2, That the Question Is

Unless you've been working in an organization whose mission is to supply C++ language tools (and perhaps particularly if you have, come to think of it), you can't help but have noticed that a lot of people who value productive use of time over esoteric language theory have come to the conclusion that the C++ language is hopelessly broken.

Sure, you can do anything in C++... or at least in C++ as specified. How you do it in any particular C++ implementation varies. That's a large part of the justification for libraries like the (excellent) Boost: to tame the wilderness and present a reasonable set of tools that will (almost certainly) work with any (serious) C++ compiler in the wild.

But, as several language professionals have pointed out, very, very few (if any) people know all there is to know about C++; teams using C++ for projects agree (explicitly or otherwise) on what features will and won't be used. Sometimes, as in the case of many FLOSS projects, this is because compiler support has historically been inconsistent across platforms; more generally, it's to keep the codebase within the reasonable capabilities of whatever teammates poke at it over the life of the project.

I've been using C++ for 20+ years now, and I still learn something — and find I've forgotten something — about C++ on every project I embark on. Talking with other, often very senior, developers indicates that this is by no means an isolated experience. Thus, the quest for other language(s) that can be used in all or most of the numerous areas in which C++ code has been written.

If you're on Windows or the Mac, and happy to stay there, this doesn't much affect you beyond "there but for the grace of $DEITY go I." Microsoft folks have C#, designed by Anders Hejlsberg, who previously had designed and led development of the sublime Delphi language. C# essentially takes much of the object model from Delphi, and does its best to avoid the various land mines encountered by Java and particularly C++ while remaining quite learnable and conceptually familiar to refugees from those languages.

On the Mac, Objective-C is the language of choice, providing commendably clean, understandable object-oriented wrapping around the nearly-universal C language (which is still fully usable). Objective-C is the reason most Mac-and-other-platform developers cite for their high productivity writing Mac code, along with the excellent Mac OS X system libraries. Objective-C is supported on other platforms, but the community of users on non-Apple or -NeXT systems is relatively small.

Google created the Go language and first made it public in late 2009. The initial design was led by Robert Griesemer, Rob Pike and Ken Thompson. Pike and Thompson are (or should be) very well-known to C and Unix developers; they were among the field's pioneers. Robert Griesemer, currently with Google (obviously), was apparently previously best known in the Oracle data-warehousing community, but a quick search turns up a few interesting-sounding language-related hits prior to Go.

Go is described as "simple, fast, concurrent, safe, fun" and open source. It doesn't try to be all possible things to all possible users; it has a relatively small, conceptually achievable mission, and leaves everything else to either The Future™ or some to-be-promulgated extension system.

Go is small, young, has a corporate sponsor acting as benevolent dictator with an increasingly active community around the language and, if it doesn't fall into the pits of internal dissension or corporate abandonment (as, say, OpenSolaris has), looks set to keep that community happily coding away for some years to come. Given a choice for a greenfield project between C and Go, I can see many possibilities where I'd choose Go. I can see far fewer similar instances where I'd choose C++ over either Go or D.

D is another, rather different, successor language to C++ (which, as the name states, considers itself an improved or "incremented" C). Where Go is starting small and building out as a community effort, D was designed by Walter Bright, another C language luminary; he created the first native compiler for C++ (i.e., the first compiler not producing intermediate C code to be compiled "again"). Unlike Go, D's reference implementation is a revenue-generating commercial product. Unlike Go, D has an extremely rich language and set of core libraries; one can argue that these features make it closer to C++ than to the other languages discussed here. Yet it is that richness, combined with understanding of the two major competing implementations, that makes D attractive for C++ and Java refugees; they can implement the same types of programs as before in a language that is cleaner, better-defined and more understandable than C++ can ever be.

D2, or the second major specification of the D language, appears to have wider community acceptance than D1 did, and we confidently expect multiple solid implementations (including GCC and hopefully LLVM)... if the larger community gets its house in order. I wrote a comment on Leandro Lucarella's blog where I said that "Go could be the kick in the pants that the D people need to finally get their thumb out and start doing things 'right' (from an outside-Digital Mars perspective)."

Competition is generally a Good Thing™ for the customers/users of the competing ideas or products. It's arguably had a spotty record in the personal-computing industry (85%+ market share is not a healthy sign), but the successor-to-C++ competition looks to be entering a new and more active phase... from which we can all hope to benefit.

Wednesday, 14 April 2010

Process: Still 'garbage in, garbage out', but...

...you can protect yourself and your team. Even if we're talking about topics that everybody's rehashed since the Pleistocene (or at least since the UNIVAC I).

Traditional, command-and-control, bureaucratic/structured/waterfall development process managed to get (quite?) a few things right (especially given the circumstances). One of these was code review.

Done right, a formal code review process can help the team improve a software project more quickly and effectively than ad-hoc "exploration and discovery" by individual team members. Many projects, including essentially all continuing open-source projects that I've seen, use review as a tool to (among other things) help new teammates get up to speed with the project. While it can certainly be argued that pair programming provides a more effective means to that particular end, it (and, honestly, most agile processes) tends to focus on the immediate, detail-level view of a project. Good reviews (including but not limited to group code reviews) can identify and evaluate issues that are not as visibly obvious "down on the ground." (Cédric Beust, of TestNG and Android fame, has a nice discussion on his blog about why code reviews are good for you.)

Done wrong, and 'wrong' here often means "as a means of control by non-technical managers, either so that they can honour arbitrary standards in the breach or so that they can call out and publicly humiliate selected victims," code reviews are nearly Pure Evil™, good mostly for causing incalculable harm and driving sentient developers in search of more humane tools – which tend (nowadays) to be identified with agile development. Many individuals prominent in developer punditry regularly badmouth reviews altogether, declaring that if you adopt the currently-trendy process, you won't ever have to do those eeeeeeeeevil code reviews ever again. Honest. Well, unless.... (check the fine print carefully, friends!)

Which brings us to the point of why I'm bloviating today:

  1. Code reviews, done right, are quite useful;

  2. Traditional, "camp-out-in-the-conference-room" code reviews are impractical in today's distributed, virtual-team environment (as well as being spectacularly inefficient), and

  3. That latter problem has been sorted, in several different ways.

This topic came up after some tortuous spelunking following an essentially unrelated tweet, eventually leading me to Marc Hedlund's Code Review Redux... post on O'Reilly Radar (and then to his earlier review of Review Board, and to numerous other similar projects).

The thinking goes something like, Hey, we've got all these "dashboards" for CRM, ERP, LSMFT and the like; why not build a workflow around one that's actually useful to project teams? And these tools fit the bill – helping teams integrate a managed approach to (any of several different flavours of) code review into their development workflow. The review step generally gets placed either immediately before or immediately after a new, or newly-modified, project artifact is checked into the project's SCM. Many people, including Beust in the link above, prefer to review code after it's been checked in; others, including me, prefer reviews to take place before checkin, so as to not risk breaking any builds that pull directly from the SCM.

We've been using collaborative tools like Wikis for enough years now that any self-respecting project has one. They've proven very useful for capturing and organising collective knowledge, but they are not at their best for tracking changes to external resources, like files in an SCM. (Trac mostly finesses this, by blurring the lines between a wiki, an SCM and an issue tracker.) So, a consensus seems to be forming, across several different projects, that argues for

  • a "review dashboard," showing a drillable snapshot of the project's code, including completed, in-process and pending reviews;

  • a discussion system, supporting topics related to individual reviews, groups of reviews based on topics such as features, or the project as a whole; these discussions can be searched and referenced/linked to; and

  • integration support for widely-used SCM and issue-tracking systems like Subversion and Mantis.

Effective use of such a tool, whatever your process, will help you create better software by tying reviews into the collaborative process. The Web-based versions in particular remove physical location as a condition for review. Having such a tool that works together with your existing (you do have these, yes?) source-code management and issue-tracking systems makes it much harder to have code in an unknown, unknowable state in your project. In an agile development group, this will be one of the first places you look for insight into the cause of problems discovered during automated build or testing, along with your SCM history.

And if you're in a shop that doesn't use these processes, why not?


On a personal note, this represents my return to blogging after far, far too long buried under Other Stuff™. The spectacularly imminent crises have now (mostly, hopefully) been put out of our misery; you should see me posting more regularly here for a while. As always, your comments are most welcome; this should be a discussion, not a broadcast!

Wednesday, 18 November 2009

Two steps forward, three steps back

Alternate title: ARRRRRGGGGGHHHHHHH!!!!!

Both of you Gentle Readers may have noticed that I've been away from the blog for a while, and that a few posts that were previously published have gone missing. I've been busy fighting some other fires for a while, and my current network access has lacked the stability and efficiency that local propaganda would have you expect.

This evening (Wednesday 18th) I came across a piece of nifty-looking software, MacJournal by Mariner Software. It looks great — software that would let me compose/revise blog posts offline, in a native Mac app with nice organizational features and so on, at an attractive price, and with a 15-day evaluation period thrown in, just so you can try before you buy.

"Cool," I thought; "I'll be able to multitask on my shiny new MBP that's coming Any Day Now™."

Downloading and installing the eval copy went just as you'd expect; drag an icon into a folder, wait for the "Copying" progress dialog to go away, and it's done. Standard Mac user experience; nothing to see here, folks — unless you were expecting a Windows-style "Twenty Questions" installation.

I decided to do a really simple, trivial first exercise: select the five posts I'd written so far in a tutorial series; add the keyword ("label" in Blogger.com parlance) tutorial to them; save them back to Blogger. No animals were harmed in the performance of this experiment, and very explicitly, no content was directly, intentionally edited. (Note the qualifiers; they're important.)

(insert "train wreck" sound effects here.)

The first two parts were (relatively) unmolested; they didn't have any code blocks in them. The latter three did, however, and those were completely deformed. Numerous span elements were added, particularly around links (MacJournal seems to think links shouldn't be underlined, ever). Other formatting was changed; in particular, code tags were replaced by spans that set the font size to 13 points.

WTF?

It's going to take me a bit of noodling around in the software to figure out how to change the defaults to something that makes sense (at least for me), and until then, I'm back to editing in the browser. If the evaluation period expires before I'm happy with the configuration, then I'll comply with the license and blow it off my system. I'd really rather not do that; the feature list looks good, the interface is clean, and best of all, I don't have this Could not contact blogger.com line underneath my editing area as I type.

I understand that MacJournal, like most apps, has default ways of laying things out and working with things. I'm well aware of the difference between an "import" and a "copy" of something. But... I believe very strongly that the first rule of software, as in medicine, should be "First, do no harm" — and that includes "don't mess with my formatting without even putting up a confirmation dialog asking my permission!" I really don't think that's too much to ask, or too hard to implement — and doing so would a) make a much more positive initial user experience by b) showing that you've thought things through well enough that c) your still-potential user isn't looking at an hour or two of careful, detailed work just to get back to where he was before he touched your product — or, rather, it touched his work.

Like most Mac users, I've gotten spoiled by how well most software on this platform is thought through to the tiniest details. Like most, I get annoyed when I have to deal with Windows or Linux apps that simply aren't thought through at all, apparently. (Spend a week with Microsoft Office or, even better, Apple iWork on the Mac; I dare you to go back to Office on Windows and be happy with it.) To run into a Mac app that fails such a simple use case so spectacularly (granted, in its default configuration) simply beggars explanation.

Tuesday, 18 August 2009

Responding while Forbidden; Gender and Other Issues in OSS and PHP

This post is what would have been a response to a post on Elizabeth Naramore's blog, which she titled Gender in IT, OSS and PHP, and How it Affects Us *All*. Quite a good post, actually, with a long and often thoughtful (but as often thoughtless) comment thread following. I was hoping to respond to a comment on the post, but that apparently is no longer allowed, even though the "Leave a Reply" form at the bottom of the page is functional. There's also a mysterious "Login/Password" pair on the page, but no indication of which ID one uses, or how to go about getting one.

Following is the content of the reply-that-wasn't: I (perhaps unreasonably) think this has some points worth pondering. Please do read the original post first and then come back here - where the "comment" feature definitely does work.


@A Girl: Great that you're doing AP CompSci next year. As someone who's been in the Craft for 30 years, I have a great sentimental attachment to your idea that "teachers and professors have the chance to shape the mindsets of their students towards women in the industry". If we were a true profession, where essentially all practitioners have a certain common level and content of educational background combined with qualifying experience (e.g., apprenticeship, internship), I'd agree wholeheartedly.

The fact is that many if not most of the people in the industry - both the "coders in the trenches" and the ex-coders who got promoted into management ("because they were such great coders" - thereby removing two qualified people from an organization)... far too many of these people have no formal education in CS (or, often, anything else). And by the time they realize how important that might be, they're old enough that they're facing ageism in the workplace already - they're not confident enough to put the "big hole in the middle of [their] career" and "go back" to school. It doesn't help that schools the world over do such a lousy job of outreach and marketing to those potential students - they're focused on the Executive MBAs and other graduate-level returning students, who can have their pricey programs paid for by their employer. Joe or Jane Schmuck trying to keep head above water in the face of cut-throat competition from planeloads of new arrivals with mimeographed certifications, who've been taught their entire lives to never think out of the box to begin with... things start getting really tough out there. I'm not surprised that enrollment in CS programs is down. I'm amazed beyond words that it's still as high as it is; a less starry-eyed observer might expect the number of CS majors to closely track, say, majors in Phoenician economics.

A lot of the new entrants into CS and IT over the last ten years or so have degrees - they're just not in the "obvious" field. Someone, and I wish I could find the original, wrote an article in one of the industry mags (like C/C++ Users Journal, not IEEE Computer) that, to be good at software in the modern era, one needed to have exposure to "behavioral science, psychology, linguistics, human factors, sociology, philosophy, rhetoric, ethnology, ethnography, information theory, economics, organizational politics, and a dozen other things - and please, please learn to write competently in English!" - I've had that taped above my display for years now. So it's not that we're not educated; the problem - and it is a problem - is that there is no universal common body of knowledge for software "engineering" - which is one of the necessary precursors of any true profession. We're not going to have a CBOK without either a broad consensus within the industry, or imposition of a system from outside (and very narrowly-focused) forces in government or the larger economy. Given the prevailing social and political attitudes of current practitioners ("herding libertarian-poseur cats" is a phrase not infrequently heard), that would seriously disrupt the Craft and, by extension, any industry or field dependent upon software (which by now is pretty much everything).

How to solve the problem - and, in so doing, help redress the pandemic sexism, racism and ageism (in huge parts of the world, recruiting with explicit age limits is perfectly legal, and here in South Asia, you're old for coding at 28)? I've got no idea. When I first started doing this, I thought that within the next thirty years or so (from 1979), we'd be able to turn this informal craftwork that had taken the industry away from the "educated CS types" and turn it into a real profession. Now? I'd say we're 30 to 50 years away from now - unless we have a software equivalent of the New London School explosion and a "solution" gets imposed from outside. We need to grow up, and quickly.

Thursday, 6 August 2009

Smokin' Linux? Roll Your Own!

As people who've encountered the "business end" of Linux have known for some time, the system (in whichever distribution you prefer) greatly rewards (some would say 'requires') tinkering and customisation. This can be done with any Linux system, really; some distros, like LinuxFromScratch and, to a lesser degree, Gentoo and its derivatives, explicitly assume that you will be customizing their base system according to your specific needs.

Now the major distros are getting into it. There have been various Ubuntu and Fedora customisation kits on the Net, but none as far as I can tell that are as directly supported (or easy to use) as from OpenSUSE, the "community-supported" offering from Novell (who also offer SUSE Linux Enterprise Desktop and Server).

Visit the OpenSUSE site, and prominently visible is a link to the OpenSUSE Build Service, which "allows developers to package software for all major Linux distributions", or at least those that use rpm packaging, the packaging system used by Red Hat, Mandriva, CentOS, and other similar systems. But that's not all...

SUSE now have a new service, SUSE Studio, which allows users to create highly customized systems based on either the community (OpenSUSE) or enterprise versions of SUSE Linux. These "appliances" can be put together on the basis of "patterns", such as lamp_server (LAMP, or Linux/Apache/MySQL/PHP Web server) or technical_writing (which includes numerous tools like Docbook). You can even supply your own (either self-built or acquired elsewhere) RPM packages to include in the appliance you're building, and SUSE Studio will deal with the dependency matching (warning you if packages are required that aren't either among its standard set or uploaded by you).

Startup scripts, networking, basically anything that is usually handled through the basic installation or post-installation configuration - all can be configured within the SUSE Studio Web interface.

And then, when you've got your system just the way you want it, you can build it as either an ISO (CD/DVD) image to be downloaded and burned onto disc, or as a VM image for two of the most popular VM systems (VMWare and Xen).

But wait, there's more...

Using a Flash-enabled browser, you can even "test drive" your appliance, testing it while running (transparently) in an appropriate VM hosted within the SUSE Studio infrastructure. Especially if you have a relatively slow connection, this will let you do preliminary "smoke testing" without having to download the actual image to your local system. Once you're ready to do so, of course, downloading is very nearly a single-click affair. Oh, and you're given (presently) 15 GB of storage for your various builds - so you can easily do comparative testing.

What don't I like about it? In the couple of hours I've been messing around with it today, there's really only one nagging quibble: When you do the "test drive" of your new creation, the page you're running it in is a standard, non-secure http Web page. The page warns you that any data and keystrokes sent will not be encrypted, and recommends the use of ssh if that is a concern (by which most people will think https). But there's no obvious way to switch, and shutting down the running appliance (which starts by the time you read the warning) involves keystrokes and so on...

In fairness, this is still very clearly a by-invitation beta offering (but you can ask for an invite), and some rough edges are to be expected. I'm sure I'll run into another one or two as things go on. I'm equally certain that all the major problems will be smoothed out before SUSE Studio goes into general public availability.

So, besides the obvious compulsive hackers and the people building single-purpose appliance-type systems, who would really make use of this?

One obvious use case, which the SUSE Studio site describes, is as a canned demo of a software system. If you're an ISV, you can add your software or Web app to a SUSE Studio appliance, lock down the OS image to suit (encrypting file systems and so on), and hand out your discs at your next trade show (or have them downloadable from your Website). No worries about installing or uninstalling from prospective customers' systems; boot from the CD (or load it into a VM) and they're good to go.

Another thought that hit me this morning was for use as an interview filter. This can be in either of two distinct modes. First, you might be looking for people who are really familiar with how Linux works. Write up the specs of a SUSE Studio appliance (obviously more demanding than just the click-and-drool interface) and use an app of your own devising to validate the submitted entries. This validation could be automated in any of several ways.

The second possible interview filter would be as a programming/Web dev system. As a variation on the "ISV" example above, you load up an appliance with a set of tools and/or source files, ready to be completed and/or fixed by your candidates. They start up the appliance (either a live CD or VM), go through your instructions for the test, and then submit results (probably an encrypted [for authentication] archive of all the files they've touched, as determined by the base system tools) via email or FTP. On your end, you have a script that unpacks the submission into a VM and uses the appropriate automated testing tools to validate it. I can even see this as a business model for someone who offers this capability as a service to companies wishing to have a better filter for prospective candidates than resume-keyword matching - which as we all know is practically useless due to the high number of both false negatives and false positives.

What do you all think?

Friday, 10 July 2009

"You can have my...when you pry it out of my cold, dead hands"

One of the, shall we say, unusual things about being in this line of work is that you develop stronger-than-is-healthy bonds to particular bits and pieces of technology, both hardware and software. For example, I think that anybody who's worked on a Mac as their main system for a year or so would take a catastrophic productivity hit if they were required to work in a Windows-only environment. Further, I assert, based on experience and observation, that this hit would actually increase in severity commensurate with previous Windows experience, as the nature of the problems and hassles encountered on a continuous basis would be that much more familiar.

But, believe it or not, this isn't a hit piece on Windows. If anything, Mac OS X takes the brunt here, in more or less a continuation of a previous post.

Don't get me wrong. The Mac itself is still one of the very few things I'd plug into that quote in the title. But the differences between it and the BSD Unix heritage it's based on, at an architecture-implementation level, sometimes drive me up the wall; it's as though I'm caught in the old Saturday Night Live "It's a floor wax AND a dessert topping!" skit.

Case in point: the Web developer's Swiss Army Ginsu Knife, PHP. On Unix and Linux systems, it's very straightforward to build a scripting engine that contains just the features and extensions needed, with security and debugging features added in to taste. The last few releases have even made that process humanly feasible for Windows users. On Mac OS X, however, it's a different story entirely. (Item: As of Friday 11 July, the search "mac php configuration" returned some 5.22 million hits. Applying Sturgeon's Second Law to this leaves us with at least half a million pleas for and/or offers of help.)

The practical result is that, on Mac OS X, one uses one of the various prebuilt binaries for PHP (from Apple, MacPorts, MAMP or similar) — or goes without. If you're interested enough in PHP to want it to work on a Mac, you probably have some urgent deadline breathing down your neck; you don't have time to figure out how/why there are unique runes and incantations involved in making it work. This is the dark flip side of "almost everything Just Works"; the things that don't Just Work tend to average the entire experience out.

So, what's the most efficient solution for the reasonably serious Mac-loving Web developer? Simple: max out the memory in your Mac (and I do mean "as much as Steve&Co let you put in and not one byte less, blastit!"). Then, add one of the software tools that's going to be on that Mac somebody pries out of my "cold, dead hands": VMWare Fusion. Once you have Fusion installed, grab an ISO image of your BSD version or Linux distro of choice, install it, and use that for your PHP and other web-dev activities.

Better still, you can still use your fave Mac editor, browser and so on during development in the VM; you'll want to set up ssh on your VM, and then use sshfs to mount your Linux/BSD filesystem as a disk volume in your Mac. From that point — a file is a file is a file; you're just taking advantage of the *ix VM's tools. This is how I do my PHP 5.3 and PHP 6 testing now.

And, after going through all this extra effort, you may well pray that the new version of Mac OS X enables a better native solution. I know I do.

Nothing is perfect in this world, and all technologies have speed bumps in some fashion. The good/bad thing about OS X is that there aren't nearly as many as in some other systems I could name — but when you hit the ones that are there, you hit them hard.


EDITED 2010/11/03: At the time this was originally written, there was a prerelease bit of software that lots of people, including the vendor and the opportunistic authors and publisher of at least one programming book, thought was going to eventually become PHP version 6. Not quite; one of the long-promised features for PHP 6 went down a rabbit hole and ultimately killed the whole thing. Most of the usable "PHP 6" features were incorporated into version 5.3.2.

Tuesday, 23 December 2008

Maybe not eating 'crow', specifically, but..... DUCK!!!

As in, "bend over, here it comes again..."

One of the things I have greatly appreciated about the Mac, especially with OS X, is how simple and straightforward software management is, compared to Linux and especially Windows (where every system change is a death-defying adventure against great odds). Operating system or Apple-supplied apps need an update? Software Update is as painless as it gets: the defaults Just Work in proper Mac fashion, but you can set your own schedule, along with a few other options. There is a well-established convention for third-party apps to check for updates via a Web service "phoning home" at app startup; this has been very easy to deal with. Application and file layout is regular and sensible; libraries and resources are generally grouped in bundles at the system or user level. After a few years of DLL hell in Windows and library mix-and-match in Linux, this was shaping up to be a real pleasure.

Then, as some of you know, I updated Mac OS X on my iMac from 10.5.5 to 10.5.6. As expected, that apparently went as smooth as glass. I even blogged about it. XCode worked; MS Office 2008 for the Mac worked; Komodo Edit worked; all my IM clients worked; all seemed customarily wonderful in the omniverse. I even started up Mail; it opened normally and happily downloaded my regular mail and Google mail, just as it had done every day for months. (I didn't actually open any messages then; that will turn out to be important.) Satisfied that everything Just Worked as always, I went back to working on a project for a few hours before turning in for the night.

Next morning, I went through the usual routine. Awake the Mac from hibernation; log in; start Yahoo, MSN and Skype; start Mail; open Komodo; open Web browsers (Safari, Opera and Camino) and I'm ready to get started. First thing...here's an interesting-sounding email message; let's open that up and... *POOF* — Mail crashes.

WTF? It started up just fine; I even got the "Message for you, Sir" Monty Python WAV I'd set Mail to use as my new-mail-received notification. I start Mail again. Picking a different message, I double-click it in the inbox. A window frame opens with the message title, sits empty for a few hundred milliseconds, then Mail goes away again. Absolutely, totally repeatable. Reboot changes nothing. Safe Boot (the Mac equivalent of Windows' "safe mode") changes nothing. The cold fingers of panic stroke my ribs like Glenn Gould at the piano. On a bad-karma scale of 0 to 10, initial reaction is an "O my God"; we're not dead, but we're hurt bad; the karma has definitely run over the dogma. 

The next couple of days are spent using my ISP's Webmail service, and a set of Python scripts I'd previously written to search mailbox contents — Apple Mail, like any sensible email program, adheres to established standard formats. If I'd been using Microsoft Lookout! in a similar situation, I'd have been up the creek.

Finally, I come across some Web-forum items that indicate that GPGMail needs to be updated; if it's not, Mail will crash under OS X 10.5.6 — which is exactly what was happening. (If you're not using GPGMail, GNU Privacy Guard, or any of the various GPG interfaces for Windows such as Enigmail for Mozilla Thunderbird, you don't know how many people are recording and/or reading your email — but if it transits a server in the US or UK, it's guaranteed that it will be.)

Installing the upgraded GPGMail bundle was the work of less than two minutes (hint: remove or rename the old bundle before copying the new one over. You probably don't need the insurance, but consider how we got here...). Then start up Mail as usual. It should, once again, Just Work — complete with being able to read and reply to messages, with or without GPG signatures.

OK, so what lessons can we take away from this experience, both as users and developers?

Time Machine may well be the single most rave-worthy piece of software I've touched in 30 years, but it can't (obviously, easily) do everything, and in a crisis, even experienced users may well not want to risk bringing too much (or too little) "back from history". There's definitely a market for addons to TM to do things like "look in my user and system library directories, the Application directory structure, MacPorts, etc., and bring application Foo back to the state it was in last Tuesday morning, but leave my data files as they are." I almost certainly could do that with the bare interface -- but, especially since Mail was "broken" as part of an OS upgrade, and since (with the Windows/Linux experience fresh in mind) I wasn't comfortable exploring hidden dependencies... I was without my main email system for three days. Sure, I had workarounds -- that I wouldn't have had if I'd been in a stock Windows situation -- but that's not really the point, is it?

Also, app developers (Mac or other), add this to your "best practices" list: If your software uses any sort of plug-in/add-on architecture, where modules are developed independently of the main app, then you can have dependency issues. The API you make available to your plugin developers will change over time (or your application will stagnate); if you make it easy for them (and your users) to deal with your latest update, you'll be more successful. There are (at least) two ways to go about doing this:

The traditional "brute force" approach. Have a call you can use to tell plugins what version of the app is running, and allow them to declare whether or not they're compatible with that version. Notify the user about any that don't like your new version. For examples of this, see the way Firefox and friends deal with their plugins. Yes, it works, but it's not very flexible; a new version may come out that doesn't in fact modify any of the APIs you care about — which means that the plugin should work even though it was developed against version 2.4 of your app and you're now on 4.2.

Alternatively, a more fine-grained approach. Group your API into smaller, functional service areas (such as, say, address-book interface or encryption services for an email program). Have your plug-in API support a conversational approach, along the lines of the checklist below (a code sketch follows the list):

  1. The app calls into the plugin, asking it which services it needs and what versions of each it supports.
  2. The app parses the list it gets back from the plugin. If the app version is later than the supported range for a specific feature identified by the plugin, add that to a "possibly unsupported" list. (If the app version is earlier than the range supported by the plugin, assume that it's not supported and go on to check the next one.)
  3. If the "possibly unsupported" plugin list is empty, go ahead and continue bringing up the app, loading the plugins normally; you're done with this checklist.
  4. For each item in the "possibly unsupported" list, determine whether the API for each feature required for the plugin has changed since the plugin was explicitly supported. (This is how a plugin for an earlier release, say 2.4, could work just fine with a later version, like 4.2.) If there's no change in the APIs of each feature required by the plugin, remove that plugin from the "possibly unsupported" list.
  5. If any plugins remain in the list, check if there's an updated version of that plugin on the Net. This might be done using a simple web-service-to-database-query on your Web server. If your Web server knows of an update, ask the user for permission to install it. If the user declines, or no upgrade is available, unload the plugin. (You'll check again next time the app is started; maybe there's an update by then.)
  6. Once the status of each plugin has been established, and compatible plugins loaded, finish starting up your app.
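Here's that checklist boiled down to a rough, hypothetical sketch, again in PHP strictly for illustration (a real host app would use its own language, and every name here is invented):

<?php
// Step 1: what the plugin reports -- each service area it needs, and the
// range of API versions for that service it was written against.
$pluginNeeds = array(
    'address-book' => array( 'min' => '2.0', 'max' => '2.4' ),
    'encryption'   => array( 'min' => '2.0', 'max' => '2.4' ),
);

// What the app knows about itself: the current version of each service's
// API, and the last version at which that API changed incompatibly.
$appApi = array(
    'address-book' => array( 'current' => '4.2', 'lastApiChange' => '2.1' ),
    'encryption'   => array( 'current' => '4.2', 'lastApiChange' => '3.9' ),
);

// Steps 2 through 4: build the "possibly unsupported" list, then prune any
// service whose API hasn't changed since the plugin's supported range.
function unsupportedServices( array $appApi, array $pluginNeeds )
{
    $unsupported = array();
    foreach ( $pluginNeeds as $service => $range )
    {
        if ( !isset( $appApi[$service] ) )
        {
            $unsupported[] = $service;   // app doesn't offer this at all
            continue;
        }
        $current = $appApi[$service]['current'];
        if ( version_compare( $current, $range['min'], '<' ) )
        {
            $unsupported[] = $service;   // app older than plugin: no hope
        }
        elseif ( version_compare( $current, $range['max'], '>' )
                 && version_compare( $appApi[$service]['lastApiChange'],
                                     $range['max'], '>' ) )
        {
            $unsupported[] = $service;   // API really did change since then
        }
        // Otherwise: in range, or newer-but-unchanged -- the plugin built
        // against 2.4 that still works fine on 4.2.
    }
    return $unsupported;
}

// Step 5 (looking on the Net for updates) and step 6 are left out here;
// this just reports what the host app would act on.
$stale = unsupportedServices( $appApi, $pluginNeeds );
echo $stale ? 'Check for updates covering: ' . implode( ', ', $stale ) . "\n"
            : "All services check out; load the plugin.\n";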

Of course, there are various obvious optimisations and convenience features that can be built into this. Any presentation to the user can and likely should be aggregated; "here's a list of the plugins that I wasn't able to load and couldn't find updates for." Firefox and friends are a good open-source example of this. The checks for plugin updates can also be scheduled, so as not to slow down every app startup. This might be daily, weekly, twice a month, whatever; the important thing is to let the user configure that schedule and view a list of plugins that are installed but not active.

As I started this post by saying, I've been very favorably impressed by Mac apps' ease of use (including installation and maintenance). Mail fell down and couldn't get up again without outside assistance; this is unusual. The fact that this was caused by a plugin and that Mail could not detect and work around the conflict just amazes me; I expect more from Apple. I'm not ready to decrease my use of the Mac because this happened — but I am going to pay more attention to how things work under the hood. The fact that I have to even be aware of this -- which is one of the features that hitherto distinguished the Mac from the grubbier Windows and Linux alternatives -- is worrisome.

Again, your comments are welcome.

Wednesday, 17 December 2008

Things that make you go 'Hmmmm', continued

Very much picking up from the mindset expressed in my earlier post... with the knowledge that this could (and probably should) be broken up into at least three different rants...

I've been working heavily in Python for the past couple of months, regrettably letting some other projects slide a bit. Now done with that, I spent yesterday picking up where I'd left off in a moderately-sized, reasonably well-designed PHP 5.2 project. (Bear in mind that my PHP experience is easily 5 or 6 times as much as my Python.)

And... while I'm not ready to jump on the "PHP sucks" bandwagon, it does feel clunky. Occasionally obscure (though never up to the standards of obscurity a good Perl hacker deals with every day).

Why is this? Three years ago, Joel Spolsky wrote an excellent rant on The Perils of JavaSchools. His point essentially boils down to that how you're trained (or "educated") as a developer shapes the way you look at problems; if all you know is a "Hammer", you try to visualize every problem as a "Nail" (even when it's a "Glass Figurine"). You may well have less-than-satisfying success with that view.

More importantly, the tools and techniques you know shape whether you can properly understand a problem at all. Not having certain features in a language (or not being knowledgeable in their use) means that you'll write clunky, hard-to-understand (and therefore -maintain) code to achieve the desired result...and wind up with (an attempt at) calculus using Roman numerals. Just as having a positional numeric system (e.g. Arabic or "modern" numerals) makes whole classes of problems tractable that weren't otherwise, languages and their features make programming problems practical or more efficient.

What we don't want is one language that tries to do everything, in every way possible. We already have that; it's called Perl, and one ramification of its overriding philosophy, "There's more than one way to do it", is that there's always a better way to do it; the search for same can and often does suck in resources faster than a Sol-sized black hole. This also illustrates the failing of most of the more recent practitioners of software development that I've worked with. While those of us who've been working since before oh, about 1988 or 1990 generally make a point of reading at least one new technical book a month, I recently led a group of about 20 young (less than 5 years experience as of 2006) developers where not a single one admitted to reading more than one technical book a year since graduation. These people didn't know how to solve the problem we were working on because the two or three tools which they were familiar with encourage their users to think in ways that do not lead to effective solutions for this problem.

A language, any language, can really only do a limited number of things well. If you are fluent in more than one human language, say, English and Mandarin Chinese, think for a moment about concepts and sayings that are natural in one language but just don't work well in the other. Computer languages are like that, too, which is one reason why your computer's operating system is much less likely to be written in COBOL than your company's accounting program is (with benefits for all concerned).

Getting back to what started this rant....coming back to PHP after a sojourn in Python....

PHP gets the job done. Recent versions, particularly the current 5.2, are much more pleasant to work in than previous versions were for those of us who "think in objects". But it has taken years to get here.

As I look at one class in particular in this PHP project's code base, part of my mind is working on "if this were Python code, I'd write it like..." — and a two-hundred-line unit of code would be about 60 or 80, and much cleaner and easier to understand to boot. Why is that? Think "original intent".

PHP was originally developed as an adjunct to HTML for Web pages, to provide some simple dynamic content. It then "just growed", adding new features and capabilities (database access, object orientation) as it became used in a wider variety of problems. It is, quite simply, a tool that grew into a reasonably useful — if not quite general-purpose — language. There are a number of things it does quite well, especially since it can be used to do useful work without requiring a steep learning curve beforehand.

Python is different. A general-purpose language, with functional-programming features, it is useful for Web application development (e.g., with mod_python for the Apache Web server). Whereas Perl has the idea that "There's more than one way to do it" — and therefore neither a best way, nor strictly an advance certainty that anything in particular will work — Python argues that "there should be one — and preferably only one — obvious way to do it", whatever "it" is.

So am I suggesting that PHP developers ditch everything and go with Python? Of course not. What I am arguing is the seemingly quaint notion that developers, especially those who aspire to work in the craft for a living, should continually strive to learn new tools, techniques, languages and processes. Your abilities are like every other living thing; they're either growing, or they're dying.

Sunday, 12 October 2008

Things that make you go 'Hmmmm'

...or 'Blechhhh', as the case may be... I've been using PHP since the relative Pleistocene (I recently found a PHP3 script I wrote in '99). I've been using and evangelising test-driven development (TDD) for about the last five years, usually with most such work being done in C++, Java, Python or other traditionally non-Web languages (with PHP really only being amenable to that since PHP 5 in 2004). So here I am, puttering away on a smallish PHP project that I've decided to TDD from the very beginning. For one of the classes, I throw together a couple of simple constructor tests in PHPUnit, to start, such as:
require_once( 'PHPUnit/Framework.php' );

require_once( '../scripts/foo.php' );

class FooTest extends PHPUnit_Framework_TestCase
{
    public function testCanConstructBasic()
    {
        $Foo = new Foo( 'index.php' );
    }

    public function testCanConstructBasicWildcard()
    {
        $Foo = new Foo( '*.php' );
    }
};
And, as is right and proper, I code the minimal class necessary to make that pass:
class Foo
{
};
That's it. That's really it. No declaration whatever for the constructor or any other methods in the class. Since it doesn't subclass something else, we can't just say "oh, there might be a constructor up the tree that matches the call semantics." PHPUnit will take these two files and happily pass the tests.

I understand what's really going on here - since the class is empty, you've just defined a name without defining any usage semantics (including construction). I would say fine; not a problem. But I would think that PHPUnit should, if not give an error, then at least have some sort of diagnostic saying "Hey, you're constructing this object, but there are no ctor semantics defined for the class." I can see people new to PHP and/or TDD, who are maybe just working through and mentally adapting an xUnit tutorial from somewhere, getting really confused by this. I know I did a double-take when I opened the source file to add a new method (to pass a test not shown above) and saw nothing between the curly braces.

On one level, very cool stuff. On another, equally but not always obviously important level, more than enough rope for you to shoot yourself in the foot. Or, to put it another way, even though I've been writing in dynamic languages off and on for ages, I still tend to think in incompletely dynamic ways. Sometimes this comes back and bites me. Beware: here be (reasonably friendly, under the circumstances) dragons.
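If you want the test suite itself to shout about this, one option (just a sketch, and only sensible when an explicit constructor really is part of the class's contract) is to add a guard test to the FooTest class above:

    public function testFooDeclaresItsOwnConstructor()
    {
        // method_exists() returns false for a class that is coasting on
        // PHP's implicit do-nothing constructor, so this fails -- loudly --
        // until Foo actually declares __construct().
        $this->assertTrue( method_exists( 'Foo', '__construct' ),
                           'Foo does not declare a constructor' );
    }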

Thursday, 19 June 2008

Good Things are good things....aren't they?

Anybody who's worked with me over the last 20 years or so knows that I generally evangelize conforming to standards when they exist, are relevant and widely agreed on. As the famous quote from Andrew Tanenbaum (in Computer Networks, 2/e, p. 254) reminds us, "The nice thing about standards is that you have so many to choose from." When "standards" are used to promote vendor agendas (e.g., Microsoft force-feeding OOXML to a hapless ISO), or when they go against the common sense built up through hard-won experience by practitioners, they deserve the skepticism they get. And when multiple standards for a product or activity exist, and those standards are each widely used by various users (who could have chosen other alternatives), and when those standards conflict with each other in important ways that can't be amicably resolved, then those "standards" cause reasonable people to not merely question their validity, but, too often, the entire concept of "standards".

As any developer who's worked in more than one shop, or sometimes even on more than one project in a shop, knows, coding standards are sometimes arbitrary, often the prizes and products of epic bureaucratic struggle, and (in the absence of automated enforcement such as PHP CodeSniffer) often honored more in the breach than the compliance. What makes things even more "fun" is conflicting standards. It's not all that unusual for a company to contract out for development work, specifying that their coding standards be complied with (since they're the customer and they're going to maintain, or control maintenance of, the code). If the contractor has their own set of standards that conflict with the customer's, then problems arise with internal process compliance, customer involvement and final delivery. It can be — and too often is — a sorry mess. Simple code reformatting problems can be taken care of with a pretty-printer program; oftentimes, though, one sees entire programs (which have to be debugged, documented and maintained) developed just to "translate" one format to another. Many shops just give up, declare the project to be an exception or exemption from their own internal standards and processes, and try to conform to the customer's demands. "Try to", since their developers, both writing and reviewing the code, are going to be fighting against it tooth and nail because it just "feels wrong".

This whole rant was inspired by reading through yet another coding-standard document; this one the Zend Framework PHP Coding Standard. One item in particular struck me as counter-intuitive. In Item B.2.1.1, PHP File Formatting — General, it says:

For files that contain only PHP code, the closing tag ("?>") is never permitted. It is not required by PHP. Not including it prevents trailing whitespace from being accidentally injected into the output.
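That last sentence is the whole point, and a contrived two-file sketch (file names invented) shows why. Anything after the closing tag in an included file, even one accidental newline, becomes page output:

<?php
// config.php: note the closing tag below, followed by a stray blank line.
$dbName = 'example';
?>

<?php
// index.php: requiring config.php emits that stray newline as output right
// away, so this redirect dies with the familiar "headers already sent"
// warning. Omitting the "?>" from config.php entirely avoids the trap.
require_once 'config.php';
header( 'Location: /login.php' );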

Experienced PHP developers are quite likely to have problems with this, not least because it conflicts with earlier behavior of the PHP interpreter and with tools that expect well-formed code. This is one of the oddities which tools like the aforementioned PHP CodeSniffer need to take into account. (There are other, more blatant "yellow flags" as well.)

If you're in a shop which takes standards seriously, uses PEAR code and uses the Zend Framework, your code review meetings are likely quite interesting.

  • "OK, we're going to look at foodb.class.php first, and then the others I mentioned in the email yesterday."
  • "Which standard does it use?"
  • "Well, it ties in with PDO, so it ought to follow the PEAR standard, right?"
  • "OK, that sounds reasonable."
As the meeting continues...
  • "Hey, what's this at the end of omnibar.class.php? There's no 'close-PHP' tag! If we start using the code-search Wiki plugin that the Bronx Project folks keep raving about, it's not going to like that...."
  • "Oh, yeah, but that's because  it uses all this Zend Framework stuff, so we use Zend's coding conventions... see that comment at the top about how to run CodeSniffer?"
  • "Riiiiight...."

and so on. Weren't process and standards supposed to make development easier and more reliable?

I agree with the sentiment, apocryphally attributed to one or another of numerous software gurus, that, in the presence of otherwise adequate and sufficient standards, we shouldn't be so "egotistical" as to think developing a "better" standard than others already have is worth our time; take what's already out there, adapt as necessary, and move forward. The trick, of course, is in evaluating that condition, "otherwise adequate and sufficient." Also, since our craft is (hopefully) continuing to advance and adopt standard patterns for things done before, striking out on your own (after careful consideration) demands that the question be revisited from time to time. What are other development groups, using (broadly) similar techniques to solve (broadly) similar problems, settling on? Is a consensus forming, and do we have anything useful to say about it? Or has a single standard already taken hold, and we can take advantage of it (at least for new or reworked code)?

Code analyzers like lint and PHP CodeSniffer can be amazingly useful. But for them to function as standard/policy enforcement tools, there must be a standard, or a small group of similar standards for them to enforce. When development teams have to juggle between incompatible standards, it discourages them from following any standards. And in that direction lie... the 1970s.

Saturday, 10 May 2008

ANFSD: starting a series to scratch an itch

(And Now For Something Different, for the 5LA-challenged amongst you...)

I've made my living, for about half my career, on the proposition that if I stayed (at least) three to six months ahead of (what would become) the popular mean in software technology, I'd be well-positioned to help out when Joe Businessman or Acme Corporation came along and hit the same technology — with the effect of "Refrigerator" Perry hitting a reinforced-concrete wall. This went reasonably well as "the market" started using PCs, then GUIs, then object-oriented programming, and then "that Internet thingy" (Shameless plug: résumé here or in PDF format).

In other ways, I've been a staunch traditionalist. I've used IDEs from time to time, because I was working as part of a team that had a standard tool set, or because I was programming for Microsoft Windows and the Collective essentially requires that that be done in their (seventh-rate) IDE unless you want to decrease productivity by several dozen orders of magnitude.

Otherwise, just give me KATE or BBEdit and a command-line compiler and I'm happy. This continued for a significant chunk of the history of PCs, until I decided that, for the Java work I was doing, I really needed some of the whiz-bang refactoring and other tie-ins supported by Eclipse and NetBeans. Then I started hacking around on a couple of open-source C++ packages and thought I'd give the Eclipse C/C++ Development Tooling a try. Now I'm coming up to speed on wxWidgets development in C++.

During this learning-curve week, I spent a lot of time browsing the Web for samples, tutorials and so on. To call most of them execrable is to give them unwarranted praise. Having recently resumed work on a Web development book dealing with useful standards and helpful process, and since I've been doing C++ off and on since the mid-80s, I thought I'd start a series of blog entries that would:

  • Document some of the traps and tricks I hit to get a simple wxWidgets program into Eclipse;
  • Illustrate some early, very simple refactoring of the simple program to get a bit more sanity;
  • Get Subversion and Eclipse playing well together;
  • Explain why I think parts of the Agile method are simultaneously nothing new and the best new idea to hit development in a very long time;
  • Start using an automated-testing tool to build confidence during debugging and refactoring; and
  • Use a code-documentation tool in the spirit of JavaDoc to produce nice technical/API docs.
At the end of the series, you'll have a pretty good idea of how I feel most projects (regardless of underlying technology and specific tools) "should" be done. You'll have seen a very simple walk-through of the process, demonstrated using Linux, Eclipse, C++ and wxWidgets, but actually quite broadly applicable well beyond those bounds.

Please send comments, reactions, job offers, etc., to my email. Death threats, religious pamphlets, and other ignorance can, as always, go to /dev/null. Thanks!