Wednesday 26 May 2010

Beating Your Head Against the Wall, Redux

...or, the "Monty Python and the Holy Grail" monks' guide to making your Mac desktop work like a server, instead of going and getting a copy of OS X Server like you should...

Mac OS X Server brings OS X simplicity and Unix power to a range of hardware systems. Most of the things that Server makes trivially simple can be done in OS X Desktop. Some of them, however, require the patience of Job and the ingenuity and tenacity of MacGyver...or so it appears at first.

One such task is installing Joomla! This is a nice little Web CMS with useful features for users, developers and administrators. On most Unix-like systems, or even recent versions of Microsoft Windows, installation is a very straightforward process for any system which meets the basic requirements (Web server, PHP, MySQL, etc.), as documented in the (PDF) Installation Manual or the PDF Quick Start guide. On most systems, it takes only a few minutes to breeze through from completed download to logging into the newly installed CMS.

The OS X desktop, as I said, is a bit different. This isn't a case of Apple's famous "Think Different" campaign so much as it appears to be a philosophical conflict between Apple's famous ease-of-use as applied to user and rights management, coming up against traditional Unix user rights management. Rather than the former merely providing a polished "front end" interface for the latter, some serious mind-games and subversion are involved. And I'm not talking about the well-known version control software.

When things go wrong with regard to a simple installation process of a well-known piece of software, usually Google is Your Best Friend. If you search for joomla mac install, however, you quickly notice that most of the hits talking about OS X desktop recommend that you install a second Apache, MySQL and PHP stack in addition to the one that's already installed in the system — packages such as XAMPP, MAMP and Bitnami. While these packages each appear to do just what it says on their respective tins, I don't like having duplicate distributions of the same software (e.g., MySQL) on the same system.

Experience and observation have shown that that's a train wreck just begging to happen. Why? Let's say I've got the "Joe-Bob Briggs Hyper-Extended FooBar Server" installed on my system. (System? Mac, PC, Linux; doesn't matter for this discussion.) When FooBar bring out a new release of their server (called the 'upstream' package), Joe-Bob has to get a copy of that, figure out what's changed from the old one, and (try to) adapt his Hyper-Extended Server to the new version. He then releases his Hyper-Extended version, quite likely some time behind the "official" release. "What about betas," you ask? Sure, Joe-Bob should have been staying on top of the pre-release release cycle for his upstream product, and may well have had quite a lot of input into it. But he can't really release his production Hyper-Extended Server until the "official" release of the upstream server. Any software project is subject to last-minute changes and newly-discovered "show-stopper" issues; MySQL 5.0 underwent 90 different releases. That's a lot for anybody to keep up with, and the farther away you get from that source of change (via repackaging, for example), the harder it is to manage your use of the software, taking into account features, security and the like.

So I long ago established that I don't want that sort of duplicative effort for mainstream software on modern operating systems. (Microsoft Windows, which doesn't have most of this software on its default install, is obviously a different problem, but we're not going there tonight.) Conflicts (like failing to completely disable the "default" copy of a duplicated service) are too easy to come by to justify injecting that kind of complexity into a system needlessly.

That isn't to say that it doesn't become very attractive sometimes. Even on a Mac OS X desktop — where "everything" famously "just works" — doing things differently than the "default" way can lead to great initial complexity in the name of avoiding complexity down the line. (Shades of "having to destroy the village in order to save it.")

The installation of Joomla! went very much like the (PDF) Installation Manual said it should... until I got to the screen that asks for local FTP credentials that give access to the Joomla! installation directory. It would appear that setting up a sharing-only user account on the system should suffice, and in fact several procedures documenting this for earlier versions of Mac OS X describe doing just that. One necessary detail appears different under 10.6.2, however: the "Accounts" item in System Preferences no longer allows the specification of user-specific command shells...or, if it does, it's very well hidden.

Instead, I created a new regular, non-administrative user for Joomla! I then removed the new user's home directory (under /Users) and created a symlink to the Joomla! installation directory.
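
For the record, here's roughly what that looked like from Terminal. The account name and paths are assumptions from my own setup (a user called joomlaftp, with the Joomla! files under the default Apache document root); adjust to match yours:

    # Remove the home directory that was auto-created for the new account
    # (assumes the account is named 'joomlaftp')
    sudo rm -rf /Users/joomlaftp

    # Point the "home" at the Joomla! installation directory instead
    # (assumes Joomla! lives under the default Apache document root)
    sudo ln -s /Library/WebServer/Documents/joomla /Users/joomlaftp

    # Depending on how the files were unpacked, the FTP account may also
    # need write access to the Joomla! tree:
    sudo chown -R joomlaftp /Library/WebServer/Documents/joomla

With that in place, the credentials for that account are what go into the Joomla! installer's FTP screen.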

Also, one difference between several of the "duplicating" xAMP systems I mentioned above and the standard Apache Web server shipped with OS X (and Linux) is that in the stock configuration, access to served directories is denied by default; the idea is that you define a new Apache <Directory> directive for each directory/application you install. Failing to do this properly and completely will result in Apache 403 ("Forbidden") errors. Depending on your attitude to security, you may either continue to do this, or change the default httpd.conf setting to Allow from All and use .htaccess files to lock down specific directories.
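
For what it's worth, the relevant httpd.conf fragment looks something like the following (Apache 2.2 syntax, as shipped with 10.6; the path is just an assumption for illustration, and you'd tighten this up if you go the .htaccess route instead):

    # Hypothetical location of the Joomla! files; adjust to your own layout
    <Directory "/Library/WebServer/Documents/joomla">
        Options FollowSymLinks
        AllowOverride All       # lets Joomla!'s own .htaccess rules take effect
        Order allow,deny
        Allow from all
    </Directory>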

Once you have the underlying requirements set up (and FTP access is the only real think-outside-the-box issue), Joomla! should install easily. But if you're in a hurry and just trying to go through the documented procedures, you're quite likely to spend considerable time wondering why things don't Just Work.

And no, neither Fedora nor Ubuntu Linux "does the right thing" out-of-the-box either. At least, not in my tests.

Monday 17 May 2010

Those were the days, my friend...but they're OVER.

No, I'm not talking about Facebook and the idea that people should have some control over how their own information makes money for people they never imagined. That's another post. This is a shout out to the software folk out there, and the wannabe software folk, who wonder why nobody outside their own self-selecting circles seems to get excited about software anymore.

Back in the early days of personal computers, from 1977 to roughly 2000, hardware capabilities imposed a real upper limit on software performance and capabilities. More and more people bought more and more new computers so they could do more and more stuff that simply wasn't practical on the earlier models. Likewise, whenever any software package managed something revolutionary, or did something in a new way that goosed performance beyond what any competitor could deliver, that got attention...until the next generation of hardware meant that even middle-of-the-road software was "better" (by whatever measure) than the best of what could happen before.

After about 2000, the speed curve that the CPUs had been climbing flattened out quite a lot. CPU vendors like Intel and AMD changed the competition to more efficient designs and multi-core processors. Operating systems – Linux, Mac OS X and even Microsoft Windows – found ways to at least begin to take advantage of the new architectures by farming tasks out to different cores. Development tools went through changes as well; new or "rediscovered" languages like Python and Forth that could handle multiprocessing (usually by multithreading) or parallel processing became popular, and "legacy" languages like C++ and even COBOL now have multithreading libraries and/or extensions.

With a new (or at least revamped) set of tools in the toolbox, we software developers were ready to go forth and deliver great new multi-everything applications. Except, as we found out, there was one stubborn problem.

Most people (and firms) involved in software application development don't have the foggiest idea how to do multithreading efficiently, and even fewer can really wrap their heads around parallel processing. So, to draw an analogy, we have these nice, shiny new Porsche 997 GT3s being driven all over, but very few people know how to get the vehicle out of first gear.

I started thinking about this earlier this evening, as I read an article on Lloyd Chambers' Mac Performance Guide for Digital Photographers & Performance Addicts (pointed to in turn by Jason O'Grady and David Morgenstern's Speedtest article on ZDNet's Apple blog). One bit in Chambers' article particularly caught my eye:

With Photoshop CS4/CS5, there is also increased overhead with 8 cores due to CS5 implementation weakness; it’s just not very smart about knowing how many threads are useful, so it wastes time and memory allocating too many threads for tasks that won’t even benefit from them.

In case you've been stuck in Microsoft Office for Windows for the last 15 years, I'll translate for you: The crown-jewels application for Adobe, one of the oldest and largest non-mainframe software companies in history, has an architecture on its largest-selling target hardware platform that is either too primitive or defective to make efficient use of that platform. This is on a newly released version of a very mature product, that probably sells a similar proportion of Macs to the proportion of Windows PCs justified by Microsoft Office (which is even better on the Mac, by the way). Adobe are also infamous among Mac users and developers for dragging their feet horribly in supporting current technologies; CS5 is the first version of Adobe Creative Suite that even begins to support OS X natively – an OS which has been around since 2001.

I've known some of the guys developing for Adobe, and they're not stupid...even if they believe that their management could give codinghorror.com enough material to take it "all the way to CS50." But I don't recall even Dilbert's Pointy-Haired Boss ever actually telling his team to code stupid, even if he did resort to bribery (PHB announces he'll pay $10 for every bug fix. Wally says, "Hooray! I'm gonna code me a minivan!"...i.e., several thousand [fixed] bugs).

No, Adobe folk aren't stupid, per se; they're just held back by the same forces that hold back too many in our craft...chiefly inertia and consistently being pushed to cut corners. Even (especially?) if you fire the existing team and ship the work off halfway around the world to a bunch of cheaper guys, that's a non-solution to only part of the problem. Consider, as a counterexample, medicine; in particular, open-heart surgery. Let's say that Dr. X develops a refinement to standard technique that dramatically improves success rates and survival lifetimes for heart patients he operates on. He, or a colleague under his direct instruction, will write up a paper or six that get published in professional journals – that nearly everybody connected with his profession reads. More research gets done that confirms the efficacy of his technique, and within a fairly short period of time, that knowledge becomes widespread among heart surgeons. Not (only) because the new kids coming up learned it in school, but because the experienced professionals got trained up as part of their continuing education, required to keep their certification and so on.

What does software development have in common with that – or in fact, with any profession? To qualify on the level of the law, accounting, medicine, the military, and so on, a profession is generally seen as comprising

  • Individuals who maintain high standards of ethics and practice;

  • who collectively act as gatekeepers by disciplining or disqualifying individuals who fail to maintain those standards;

  • who have been found through impartial examination to hold sufficient mastery of a defined body of knowledge common to all qualified practitioners, including recent advances;

  • that mastery being retained and renewed by regular continuing education of at least a minimum duration and scope throughout each segment of their careers;

  • the cost of that continuing education being knowingly subsidised, directly or indirectly, by the professional's clients and/or employer;

  • such that these professionals are recognised by the public at large as being exclusively capable of performing their duties with an assuredly high level of competence, diligence and care;

  • and, in exchange for that exclusivity, give solemn, binding assurances that their professional activities will not in any way contravene the public good.

Whether every doctor (or accountant, or military officer, or lawyer) actually functions at that level at all times is less of a point than that that is what is expected and required; what defines them as being "a breed apart" from the casual hobbyist. For instance, in the US and most First World countries, an electrician or gas-fitter is required to be properly educated and licensed. This didn't happen just because some "activists" decided it was a good idea; it was a direct response to events like the New London School explosion. By contrast, in many other countries (such as Singapore, where I live now), there is no such requirement; it seems that anybody can go buy himself the equipment, buy a business license (so that the government is reassured they'll get their taxes), and he can start plastering stickers all over town with his name, cell phone number and "Electrical Repair" or whatever, and wait for customers to call.

Bringing this back to software, my point is this: there is currently no method to ensure that new knowledge is reliably disseminated among practitioners of what is now the craft of software development. And while true professionals have the legal right to refuse to do something in an unsafe or unethical manner (a civil engineer, for example, cannot sign off on a design that he has reason to believe would endanger public users/bystanders, and without properly-executed declarations by such an engineer, the project may not legally be built), software developers have no such right. Even as software has a greater role in people's lives from one day to the next, the process by which that software is designed, developed and maintained is just as ad hoc as the guy walking around Tampines stuffing advertisements for his brother's aircon-repair service under every door. (We get an amazing amount of such litter here.)

Adobe may eventually get Creative Suite fixed. Perhaps CS10 or even CS7 will deliver twice the performance on an 8-core processor as on a 4-core one. But they won't do it unless they can cost-justify the improvement, and the costs are going to be much higher than they might be in a world where efficient, modern software-development techniques were a professionally-enforced norm.

That world is not the one we live in, at least not today. My recurring fear is that it will take at least one New London fire with loss of life, or a software-induced BP Deepwater Horizon-class environmental catastrophe, for that to start to happen. And, as always, the first (several) iterations won't get it right – they'll be driven by politics and public opinion, two sources of unshakable opinion which have been proven time and again to be hostile to professional realities.

We need a better way. Now, if not sooner.

Saturday 8 May 2010

Making URL Shorteners Less "Evil"

The following is the text of a comment I attempted to post to an excellent post on visitmix.com discussing The Evils of URL Shorteners. I think Hans had some great points, and the comments afterward seem generally thoughtful.

This is a topic which I happen to think is extremely important, for both historical and Internet-governance reasons, and hope to see a real discussion and resolution committed to by the community. Thanks for reading.


I agree with the problem completely, if not with the solution. I was a long-time user and enthusiastic supporter of tr.im back in the day (up to, what, a couple of months ago?), for several reasons: it was obvious they were doing it more or less as a public service, not as a revenue-generating ad platform, and they were apparently independent of Twitter, Facebook and the other "social media" services (which is important; see below). Unfortunately, since the First Law of the InterWebs seems to be that "no good deed goes unpunished," they got completely hammered beyond any previously credible expectation, and, after trying unsuccessfully to sell the service off, are in the process of pulling the plug.

I think it's absolutely essential that any link-shortening service be completely independent of the large social-media sites like Facebook and Twitter, specifically because of the kind of trust/benevolence issues raised in the earlier comments. We as users on both ends of the link-shortening equation might trust, say, Facebook because their policies at the time led us to believe that nothing dodgy would be done in the process. I think the events of the past few weeks, however, have conclusively proven how illusory and ill-advised that belief can be. Certainly, such a service would give its owner a wealth of valuable marketing data (starting with "here's how many unique visitors clicked through links to this URL, posted by this user"). They could even rather easily implement an obfuscation system, whereby clicking through, say, a face.bk URL would never show the unaltered page, but dynamically rewrite URLs from the target site so that the shortener-operator could have even MORE data to market ("x% of the users who clicked through the shortened URL to get to this site then clicked on this other link," for example). For a simple, benign demonstration of this, view any foreign-language page using Google Translate. (I'm not accusing Google of doing anything underhanded here; they're just the most common example in my usage of dynamic URL rewriting.)

Another security catastrophe that URL shorteners make trivially easy is the man-in-the-middle exploit, either directly or by malware injected into the user's browser by the URL-shortener service. The source of such an attack can be camouflaged rather effectively by a number of means. (To those who would say "no company would knowingly distribute malware", I would remind you of the Sony rootkit debacle.)

So yeah, I resent the fact that I essentially must use a URL-shortener (now j.mp/bit.ly) whenever I send a URL via Twitter. I also really hate the way too many tweets now use Facebook as an intermediary; whenever I see a news item from a known news site or service that includes a Facebook link, I manually open the target site and search for the story there. That is an impediment to the normal usage flow, reducing the value of the original link.

Any URL-shortening service should be transparent and consistent with respect to its policies. I wouldn't even mind seeing some non-Flash ads on an intermediate page. ("In 3 seconds, you will be redirected to www.example.com/somepage, which you requested by clicking on w.eb/2f7gx; click this button or press the Escape key on your keyboard to stop the timer. If you click on the ad on this page, it will open in a new window or tab in your browser.")

Such a service would have to be independent of the Big Names to be trustworthy. It's not for nothing that "that zucks" is becoming a well-known phrase; the service must not offer even the potential for induced shadiness of behaviour.

I'd like to see some sort of non-profit federation or trade association built around the service; the idea being that 1) some minimal standards of behaviour and function could be self-enforced, and especially 2) that member services that fold would have some ability/obligation to have their shortened link targets preserved. This way, there would still be some way of continuing to use links generated from the now-defunct service.

Since the announcement that the Library of Congress will be archiving ALL tweets as an historical- and cultural-research resource, and contemplating a future in which it is expected that URL-shortening services will continue to fold or consolidate, the necessity and urgency of this discussion as an Internet-governance issue should have become clear to everyone. I hope that we can agree on and implement effective solutions before the situation degrades any further.

She's Putting Me Through Changes...

...they're even likely to turn out to be good ones.

As you may recall, I've been using and recommending the Kohana PHP application framework for some time. Kohana now offer two versions of their framework:

  • the 2.x series is an MVC framework, with the upcoming 2.4 release to be the last in that series; and

  • the 3.0 series, which is an HMVC framework.

Until quite recently, the difference between the two has been positioned as largely structural/philosophical: if you wish to develop with the 'traditional' model-view-controller architecture, then 2.x (currently 2.3.4) is what you're after; with great documentation and tutorials, any reasonably decent PHP developer should be able to get Real Work™ done quickly and efficiently. On the other hand, the 3.0 offering (now 3.0.4.2) is a hierarchical MVC framework. While HMVC via 3.0 offers some tantalising capabilities, especially in large-scale or extended sequential development, there remains an enthusiastic, solid community built around the 2.3 releases.
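
To make the difference a bit more concrete, here's a minimal sketch of the HMVC idea as I understand it from a first look at 3.0: a controller action that composes its page partly from an internal sub-request to another controller. The class, action and URI names are made up, and the method calls are from memory of the 3.0-era API, so treat the details as approximate rather than gospel:

    <?php defined('SYSPATH') or die('No direct script access.');

    // Hypothetical controller; in Kohana 3.0, controllers extend Controller
    // and public action_* methods map to URIs.
    class Controller_Dashboard extends Controller {

        public function action_index()
        {
            // The "H" in HMVC: an internal sub-request to another
            // controller/action, whose output we reuse in this page.
            $recent = Request::factory('articles/recent')->execute()->response;

            // Hand the assembled view back as this request's response.
            $this->request->response = View::factory('dashboard/index')
                ->set('recent_articles', $recent);
        }
    }

In 2.3, by contrast, a controller is only ever invoked from the routed URI; there's no first-class notion of one request delegating to another.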

One of the long-time problems with 2.3 has been how to do unit testing. Although vestigial support for both a home-grown testing system and the standard PHPUnit framework exists in the 2.3 code, neither is officially documented or supported. What this leads to is a separation between non-UI classes, which are mocked appropriately and tested from the 'traditional' PHPUnit command line, and UI testing using tools like FitNesse. This encourages the developer to create as thin a UI layer as practical over the standalone (and more readily testable) PHP classes which that UI layer makes use of. While this is (generally) a desirable development pattern, encouraging and enabling wider reuse of the underlying components, it's quite a chore to get an automated testing/CI rig built around it.
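
As a contrived illustration of that split, the non-UI pieces end up as plain classes you can point PHPUnit at directly, with no framework bootstrapping involved. Both the class and the test below are hypothetical examples of mine, not anything from Kohana:

    <?php
    // A standalone, framework-free class: trivially testable in isolation.
    class PriceFormatter
    {
        public function format($amount)
        {
            if ( ! is_numeric($amount))
            {
                throw new InvalidArgumentException('Amount must be numeric');
            }
            return 'S$'.number_format($amount, 2);
        }
    }

    // Run with: phpunit PriceFormatterTest.php (PHPUnit 3.x-era syntax)
    class PriceFormatterTest extends PHPUnit_Framework_TestCase
    {
        public function testFormatsToTwoDecimalPlaces()
        {
            $formatter = new PriceFormatter();
            $this->assertEquals('S$1,234.50', $formatter->format(1234.5));
        }

        /**
         * @expectedException InvalidArgumentException
         */
        public function testRejectsNonNumericInput()
        {
            $formatter = new PriceFormatter();
            $formatter->format('not a number');
        }
    }

The thin Kohana controller sitting on top of something like this then has almost nothing left in it that needs testing through the browser, which is exactly the point.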

But then I came across a couple of pages like this one on LinkedIn (free membership required). The thread started out asking how to integrate PHPUnit with Kohana 2.3.4, and ended up describing the move to 3.0 like this:

I grabbed Kohana 3, plugged in PHPUnit, tested it, works a treat! So we're biting the bullet and moving to K3! :)

I've done a half-dozen sites in Kohana 2.3, as I'd alluded to earlier. I've just downloaded KO3 and started poking at it, with the expectation of moving my own site over shortly and, in all probability, of moving 3.0 to the top of my "recommended tools" list for PHP.

Like the original poster, Mark Rowntree, I would be interested to know if and how anybody got PHPUnit working properly in 2.3.4.

Thanks for reading.