
Tuesday, 1 April 2014

Rails is Rails is NOT Rails. Oh, my aching head.

For a couple of decades, I was into Linux. One of the ways you could tell a good distribution and (especially) good documentation for that distribution was that they were (usually) careful to make notes like

This is a new feature as of kernel 2.6, first introduced in QuuxLinux in Version 6.0.

or

That API changed in kernel 2.4 and again in kernel 3.0, first introduced in QuuxLinux in Versions 2.0 and 11.0 respectively.

Now I'm a Ruby and (pro tem) Rails developer, and all that is old is new again, and you don't always know it.

I just spent a day and a half of highly-energetic tail-chasing, working at getting my code and specs for (what is so far) a "toy" demo project to understand Rails routing when namespaces are introduced. Namespaces, which are (as far as I can tell) functionally equivalent to Ruby modules, let you provide an additional level of separation between different parts of your app. They're also practically essential for writing Ruby Gems of any complexity, which is a basic requirement for "Vendor Everything" — an important tool in structuring your apps for maintainability and reusability. So "supporting namespaces" was an essential, if not universally obvious, part of the exploration that I'd embarked upon.

As I write in the project README, I want to "namespace everything". I also write "no code in the absence of a failing spec", so obviously my specs need to work properly with namespaces. Believing in six impossible things before breakfast is trivial in comparison, if you've never done it before.

Well, how better to learn than with a demo project? And how better for my current and future employers to judge my progress and reasoning than by making such (non-proprietary) explorations in the full glare of public scrutiny and ridicule, à la GitHub?

After accomplishing this first tiny step of The Mission, I'm more convinced than ever that part of what makes Rails development "interesting" is that there's such a mix of information about Rails 2.3, Rails 3.x and Rails 4 out in the wild, treated as though they were slight variations on a common theme, rather than three different products unified mainly by a shared product name.

For those who care, the initial plea for help was on Reddit's /r/ruby sub-Reddit. Initially I'd written up this Gist to document the files I was using and the output I was getting. After reading /u/Nitrodist's response, I made some changes, as reflected in this Gist. Finally, after further prodding from Nitrodist and others, and intensive BFI, I determined the two changes my original code needed, documented in this final Gist. By pulling together information from several sources referencing different versions of Rails, this *early step* in the exploration has been completed — with several times the expected effort.

Things have to get better, don't they? Only if you're not aware that Captain Edward A Murphy was a raving optimist.

Monday, 24 February 2014

Let's Stop Pretending that Source Control is "Optional but Recommended"

Essentially all of us who make our living from the craft of software development, early in our careers, have had the experience of losing source code that we couldn't get back. Oh, we could type what we remembered into a file of the same name, but it has important differences that we will discover as we work with it. (There is no such thing as eidetic, or photographic, memory).

This has been recognised as a sufficiently universal phenomenon that most developers and managers I have known in the last 10-15 years (at least) treat use of a versioning system as a filter against dangerous dilettantes; if someone claims to have developed "the next VisiCalc for Linux" without using one, the file containing that person's "CV" may be safely deleted (and do remember to Empty the Trash).

So why on $DEITY's green Earth do we still see well-meaning tutorials on blogs the Internet over which include a step titled "Setup source control (Optional but recommended)"? The particular tutorial that drew my ire this morning even went out of its way to note that you will need to use a "text editor of your choice" with the author noting that "I use Emacs". (Subject and direct object inverted in the original sentence, but that's another rant.)

Don't

DO

that!!

The person reading your tutorial is probably going to be someone with strong continuing economic motivation to improve his software skills; he's reading your tutorial on the off chance that you'll actually teach him something useful. Seeing those three words ("optional but recommended") tells him, almost literally, that you don't really take him seriously. Your half-dozen other readers are going to be (would-be) apprentices in the craft, just learning their way around; they can and, based on experience, will take those three words as permission to just blaze on ahead and, when something goes pear-shaped, to start all over. Maybe you think "nobody I know would take it that way", or even "how could anybody who can read this take it that way", but The Voice of Experience™ is here to tell you that this Internet thingie is a global phenomenon. People from all walks of life, with every language, technical and educational level variation imaginable, can wake up one fine morning and say "I'm going to learn something new, using the Internet", and set out to do just that. That's a feature, not a bug.

I can hear you sputtering "but I didn't want to have to cover how to set up a VCS or use it in the tutorial's workflow; I just wanted to teach people how to use batman.js". Fair enough, and a noble goal; I don't mean to dissuade you (it's actually a pretty good tutorial, by the way). Might I suggest adding to the bullet list under "Guide Assumptions" a new bullet with text approximating

  • A version control system of your choice (I use Foo)
with an appropriate value of Foo. We don't care whether the Genteel Reader uses git, Bazaar, sccs, or Burt Wonderstone's Incredible Magical VCS; we can merely assume that if he and we are worth each other's time, he's using something; we only need that little extra bit of reinforcement. (Though some choices on that list might be cause for grave concern.)
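After all, the entire "step" being waved off costs all of three commands in git (a sketch; substitute the VCS of your choice):

    git init
    git add .
    git commit -m "Initial import"

That's the whole burden "optional but recommended" is sparing your reader.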

One of the ways in which the craft of software development needs to change if it's ever going to become a professional engineering discipline is that we're going to have to hammer out and apply a shared set of professional ethics; the closest we've got now is various grab-bag lists of "best practices". I'd argue that "optional but recommended" source control is on the wrong side of one of the lines we need to draw.

Thursday, 14 November 2013

Buy Shares in the Garden Path, friends; it's a wonder!

Not giving actual securities advice, of course; merely commenting on the metaphorical "garden path" one is "led down" by seemingly trustworthy sources. The next thing you know, it's three minutes since you started falling; you've no idea where "bottom" is and even less idea how you got there. (For the record, at three minutes, were it not for atmospheric drag slowing you down, you'd have been falling for ~158,760 meters and have an instantaneous speed of ~1764 m/s. Happy landings!)
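(Checking the arithmetic: with g ≈ 9.8 m/s² and no drag, the distance fallen is ½gt² = ½ × 9.8 × (180 s)² = 158,760 m, and the instantaneous speed is gt = 9.8 × 180 = 1,764 m/s. The Maths Work™.)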

All right, what am I blathering on about? It's really simple. (Until you fall off that cliff.) See here…

The problem is actually documented rather clearly in the Position section of the W3C "DOM Level 2 Traversal Range" spec, and had I read that instead of relying on the documentation of the jQuery++ class that wraps it, I'd have saved several hours and $DEITY-knows-how-many litres of stomach acid. According to the spec,

The offset within the node is called the offset of the boundary-point and its position. If the container is [any of 5 different types of] node, the offset is between its child nodes. If the container is a CharacterData, Comment or ProcessingInstruction node, the offset is between the 16-bit units of the UTF-16 encoded string contained by it.

(Emphasis mine.)

The jQuery++ documentation gives an example that uses a single element containing a single text node. No mention is made of the (sensible once you figure it out) distinction between offsets in a text node vs. offsets in an element. And attempting to adapt their example to your markup, which is unlikely to be so trivially simple, is bound to lead to confusion unless you know the answer to The Riddle of the Magical Redefining Offset™.
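A minimal illustration (my markup, not the spec's or jQuery++'s):

    <p>Hello <em>brave</em> new world</p>

If a boundary-point's container is the <p> element, offset 1 falls between the text node "Hello " and the <em> element: element offsets count child nodes, so the valid offsets here run 0 through 3 around its three children. If the container is the text node "Hello ", offset 1 falls between "H" and "e": text offsets count UTF-16 code units. The same integer means two entirely different positions depending on the container's node type. That's the Riddle, solved.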

Now you do, and you can be about your work a great deal more quickly than I.

So how do I do what I set out to do, which is to build up an HTML fragment matching selected text on a page? With great and grandiose ceremony, alas.

  1. Get the active Range for the block-level element containing my content. That gives me the starting and ending offsets (child nodes of that outer block) corresponding to the child nodes that the selection overlaps.
  2. If that outermost range spans multiple child nodes
    1. Walk down the first selected node and its descendants, until we find the actual text node containing the start of the selection;
    2. Add that text and the trailing child nodes, if any, of the element node containing that text node to a buffer;
    3. Iterate for that element node's parent element node's trailing child nodes, and on again until we've walked back up to the outermost content area;
    4. Add the entire markup of each succeeding top-level child node up to but not including the block identified by that initial ending offset;
    5. Descend into the top-level ending node and its descendants, adding markup of each node until we reach the actual node containing the endpoint of the selection;
    6. Add the selected text fragment from the end point text node to the buffer. Done.
  3. If that outermost range has a single child node (the starting and ending offsets are the same)
    1. Walk down the selected node and its descendants, until we find the actual text node containing the start of the selection;
    2. Add that text and the trailing child nodes, if any, of the element node containing that text node to a buffer;
    3. Iterate for that element node's parent element node's trailing child nodes, and on again until we find a node containing the endpoint node (or that node itself);
    4. Descend into the top-level ending node and its descendants, adding markup of each node until we reach the actual node containing the endpoint of the selection;
    5. Add the selected text fragment from the end point text node to the buffer. Done.

Oh, my aching head.

If anybody has any better ideas, I'd love to hear them. Oh, did I mention that this is in CoffeeScript/JavaScript in the browser, so we don't have any fancy tools like Nokogiri, which I've previously described as "the Swiss Army Ginsu Chainsaw for parsing markup"? With tooling like that, this little exercise would be over and done with in an hour. Building a no doubt buggy improper subset of its functionality, in CoffeeScript, has taken days. Plural, and never mind how plural. If I were a drinking man, there'd be a crate of top-shelf Scotch in my immediate future.

Wednesday, 11 September 2013

"You use the tools you have, not the tools you wish you had."

(with apologies to Donald Rumsfeld, and noting that this is frustrating precisely because we're so asymptotically close!)

I forget who first said this, but "Experience is as much a catalogue of what doesn't work as well as you'd hoped as what does" might have had my career in mind. Or at least the 2013 segment of it.

A few short months ago, I came across Trello, by Joel Spolsky's Fog Creek Software. As with all great Web products, it started as a brilliantly simple idea (a Web version of the classic stick-user-story-index-cards-to-it whiteboard that every Agile project centres around) that's slowly sinking under a tsunami of well-intentioned customer feature requests.

So excuse me if I seem to be a) piling on in b) an arguably hypocritical manner when I say that, like most things, it's both a blessing and a curse. You use it for a while, you start thinking in terms of cards and such… and then you realise that it's really not going to help you as much as you expected. Partly because of features, alas, but more because of philosophy, or communication of same.

Some features that I would gladly pay to be able to use:

  1. There doesn't appear to be a "show me a list of all cards on this board, sorted by card number (creation date), last modification, and so on" feature. Apparently, the workflow Trello expects you to adopt is to have your to-be-done and in-work cards in various lists, and your "done" cards (and lists) should be archived. If, instead, you prefer to have a "Done" list, you'll quickly find that it grows into a mess that you're going to be searching through using the (very handy in itself) filtering feature, but that requires you to remember keywords exactly and, more importantly, prevents the "aha!" moments that often come when you can look at a list of not-very-obviously-related information and draw new insights from what you're seeing.

  2. Trello also have (generally deservedly) made a Big Deal out of their support for "Markdown Everywhere", but their parser doesn't support the feature that future-proofs Markdown itself: the ability to embed HTML inline seamlessly. This is especially noticeable for things like tables, which have no direct Markdown (or Trello) implementation.

    This lack means, for instance, that "rolling your own" card that contains a list of all cards in the board looks pretty cluttered and hard to read.

  3. Checklists are almost optimally useful and easy to use. However, they suffer from one major misfeature: converting a checklist item to a separate card "removes the item and creates a card with the title of the item beneath the current card". It would be very useful to have Trello, instead of deleting the checklist item after the card has been created, replace the checklist item's text with a link to the new card. Many (most?) of our cards start out as blue-sky, high-level descriptions of a functional feature. For example, "Any User, or any Guest, may view a list of all Articles with public status." That card had a six-item checklist on it, several of which turned out to be "worth" breaking out to separate cards. For each of those checklist-items-turned-cards, I went back and re-added an item to the checklist with a one-sentence summary of the card and a link (e.g., "Update the `Ability` model; see Card #47."), followed by editing the new "item" card so that its description started out like "This is a checklist item from Card #44."

    Why? In our process, each card (most granular tracked activity) should take a reasonably consistent amount of time, generally under a man-day of effort. When a single card instead takes 4-5 days (as has happened), it's easy for customer reps and investors to get the idea that we've slacked off because they understand our estimating process/goals. Being able to drill down (and up) from any given card, without the level of effort presently required on my part, helps avoid misunderstandings and time lost responding to panicked emails.

  4. Trello allows you to "subscribe" to cards, lists and boards. This adds items to a list viewable when you're logged into Trello. It would be much more useful if these "feeds" were available via RSS/Atom as well.

tl;dr: Trello is a great tool; so great that many users (including me) are either pulling it in directions it wasn't designed to go, or demanding features that don't exist in the way we'd like them. I'll keep using it and recommending it for now, even though the time involved in running a project with it is scaling up prodigiously. That time, however, is less than the time it would take to work without Trello or something quite similar to it.

Sunday, 26 May 2013

FOCUS! Oooh, shiny, shiny...

That didn't last long; having a nice set of components to build on and being satisfied that it would carry us forward indefinitely…

As Ruby developers on Rails or Padrino or bare metal, we're absolutely spoilt for choice. As I write this, Rubygems claims 1,577,138,099 downloads of 56,183 gems since July 2009. (Compare to ~495 PHP packages in the official PEAR repo on GitHub.) If you can figure out what you're looking for well enough for a Google search, chances are excellent that you'll find something you can at least use; likely several somethings. As with all such, this is both a blessing and a curse.

The blessing is that the stuff is out there; it is mostly at least usable, especially for the gems that are at or past Version 1.0 (yes, there's a formal numbering system for gems); and if it doesn't do precisely what you want, chances are likewise excellent that you can wrap your own code around it to do your bidding, and/or pull a copy of the official source, make changes you'd like to see, and submit a pull request to tell the original developer(s) what you'd like to see pulled into the "official" version. And so on.

Back a couple of years ago when I was doing PHP for a living, it wasn't that hard to keep a rough idea of what was both available and usable on PEAR in my head, so that as I developed a project, I knew what I had available and could try to structure my code to work effectively with it. It's a lot easier when you can reverse that: think of what you want and how you'd like to work with it, and go run down the list of candidates to plug that hole.

But that touches on the other edge of the sword, and it's sharp. Doing all that browsing and cataloguing and selecting and integrating takes time and effort; time and effort that your customers (external or otherwise) generally assume you're putting into making a kick-ass product. Well, you are, sort of; you're just trying to figure out which giants' shoulders to stand on. Throw in the usual social media like Google+ and Twitter and so on, alerting you to the newest and shiniest of the new-and-shiny, and it's a wonder you get any "real work" done.

Memo to bosses: This is real work, and it's almost certainly less expensive real work than having your folks reinvent several dozen (or several hundred) wheels as they develop your Awesome Product, Version next.0. Deal.

But the "is it done yet?" bosses do have a very good point, devs. Promises have been made; contracts have been let; people's jobs (including yours) may well be on the line. Focus is necessary. You just have to learn to do it in a reasonably structured way, like your RSI breaks.

When you come across the latest shiny-shiny that you just know is going to make your job so much easier, better, more fun and profitable, add it to a list. When you're about to start on some new part of your project, glance at the list and see if it makes sense to use your new feature-thingy as a test bed for the new gem. Be careful that it can peacefully coexist with the rest of your codebase that hasn't been updated yet, and do remember to use a new feature branch. Set yourself a time limit for how long you'll allow yourself to get the new gem working. When that's expired, give yourself at most one extension (hey, you're exploring the not-yet-known-to-you, after all) and if it's still not walking and chewing gum at the same time, go back to where you branched and try again, the "old" way. This is so simple, yet not doing it can have a devastating effect on your schedule (and project, and…)

But given a choice between running the risk of distraction by a myriad assortment of Things That Generally Work, and trying to make one of a smaller number of Things That Should Work But Don't, Quite… well, that's what brought so many VB6 and Java folk to PHP in the first place, and what brought so many PHP people (including yours truly) into Ruby.

Given sufficient resources and patience, you can make almost anything work in almost any language; I once saw a quite-playable version of the classic Star Trek game written in RPG II. I thought, and still think, that the guy who wrote it was absolutely nuts (especially since he knew at least three other languages that would have been easier/better aligned to the job), but *it worked*. So, as the adage goes, never argue with success. However, given the chance early enough in the project for it to make a difference, do respectfully suggest that jobs are best done with the available tools most closely aligned to the job. RPG is still in production for what it does best: complicated report generation from highly-structured data. PHP remains a valid choice for small projects, or for those companies unwilling to rely on non-local talent and willing to trade initial cost for quality.

Ruby has numerous tools available, for just about anything, that challenge its practitioners to be better developers than they started out to be. Part of that challenge is the sheer wealth of tools available, and their haphazard discoverability. Part of it is looking back at code you wrote a few short months, or even weeks, ago; knowing it could be so measurably better; and having the self-discipline to leave it be until you have a business reason to change it.

Problems like that make my days more enjoyable than solutions in other languages generally have. Now if I could just turn this megalithic Rails app into something more componentised, service-supporting. Oooh, shiny, shiny…

Sunday, 31 March 2013

Yes, It's Nonsense.

No doubt profitable nonsense. What follows is a reply to a comment by one Andrew Webb on an InfoQ.com sales-pitch-as-technical-"journalism" puff piece. The piece was "reporting" on a study by Dr Donnie Berkholz of the analyst firm RedMonk. This kind of hucksterism is one of the most formidable barriers blocking our craft of software development from ever becoming a professional engineering discipline but, well, read on if you will.

Why here rather than on InfoQ? Simple: even though their reply-entry form states that <a> is an accepted HTML element in comments, it refused to accept any of the links I had in this post. Shoddy "journalism", meet shoddy Web development.


Worse than nonsense; this was written for PHBs, likely as a tool to sell consulting hours training teams in the "more expressive" languages. Anybody who remembers Java's transition from a language and VM into a marketing platform, circa 1997-1998, has seen this done before and better.

Note that I am not talking about Dr Berkholz' original study, the link to which was buried in this puff piece. But there appears to be a real problem with the Ohloh data on which the study was based. From what I've been able to see, the data covers a period of at least 15 years (late 1990s to present). Subversion(!) was used for 53 percent of the projects in the data set; when you add in the now-Palaeolithic CVS, the share rises to nearly 2/3 of the projects covered by the data set.

Let me beat that timeframe to death one more time: In the last 20 years, we've gone through at least two major revolutions in the techniques mainstream developers use. Twenty years ago, if you mentioned "OOP" in half the corporate development centres in North America or Asia, the response would have been "what's wrong?" We as a craft were just beginning to get our minds wrapped around object-oriented software development, moving it out of its metaphorical Bronze Age, when the larger revolution which that enabled, BDD/TDD/whatever your flavour of Agile, hit like a silver tsunami. Twenty years ago, I was writing "C/C++", Fortran and Ada code, using various odd bits of version control (anybody else remember how good SourceSafe was before Microsoft bought it?), and checking in massive commits, because centralised SCM systems like SourceSafe, CVS and RCS were a pain; they were seen less as a design/team-support tool than as an insurance policy against Raj's hard drive being wiped by some virus or other. Network connectivity was not as omnipresent, reliable or fast as it is today, by several orders of magnitude.

Nowadays, regardless of what language you're in, you're (hopefully) using some form of agile process. You're strongly encouraged to tunnel in, specifying the smallest change that could possibly fail, and then implementing just enough code to make that spec pass. You're not going to check in thousands (or even hundreds) of lines of code at one go; you're going to build that feature you're working on over several commits, on a branch of your SCM tree, and then merge it back into the mainline when it's done. The last ten PHP projects I worked on, over a six-year period, averaged less than 50 lines per commit – in one of the most overly verbose languages since ALGOL 68 and COBOL.

Changes in development practices, and tools, will hopelessly skew that data set unless additional controls are applied, and I could find no description of any. That, to me, says that this is, at best, an enjoyable-in-the-moment bit of data-mining with all the relevance to day-to-day developers' lives of the Magic 8-Ball.

This article was even worse; no examination of assumptions, no discussion of changing trends in the craft and industry, just a vacuous puff piece. An insult to the intelligence of the readers and, despite the possible flaws in the original study, to Dr Berkholz' work as well. If my mind could go take a hot shower to wash the oily residue off, it would. I used to think ZDNet was the bottom of the barrel for this sort of thing. I was wrong.


NOTE: Earlier, I'd written "Twenty years ago, I was writing "C/C++", Fortran and Ada code, using Subversion", which is an anachronism, of course. Subversion came out in 2000, and I was not in a time warp in 1994. It just feels like I'd been using (used by?) it for that long.

Tuesday, 27 December 2011

Hallowe'en II: Boxing Day

Here's a scary/stupid trick to pull if you, like me, are a finalist for Biggest Email Pack Rat On The Internet.

Open Apple Mail's Preferences, select the "General" tab, and change Dock unread count from Inbox Only to All Mailboxes. I double-dare you.

Or, it might just be too depressing. I went from a mere(!) 355 unread messages to…

56,232

Let's see, at an average of just under 2 minutes each (clocked over a recent week), cleaning that out would take me some 78 days, 2 hours and 24 minutes, during which some 31,000 new emails (not including auto-filtered spam at ~85% of total incoming email) would arrive. Assuming I could stay awake that long. Caffeine is The Elixir of Life™, but… "filling a baby's eyedropper from a raging waterfall" does not even begin to do the image justice. Maybe the old Ragú ad slogan, "It's In There".
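(The arithmetic, for fellow sufferers: 56,232 messages × 2 minutes ≈ 112,464 minutes ≈ 1,874.4 hours ≈ 78.1 days.)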

There has to be a better way.

Thursday, 24 November 2011

ANFSD: Fascism Bites Everyone In The Pocket (Among Other Places)

Fascism should more properly be called corporatism, for it represents the fusion of State and corporate power.

— B. Mussolini

If you've lived in the USSR, former Soviet republics, or much of south Asia (including Singapore), you're quite familiar with the concept of "exclusive distributors", which the Free World seems to have thrown on the ash-heap of history at roughly the same time as leaded petrol. For those who've never had the displeasure, it works just like it sounds: one appointed business entity is the only way for subjects of a particular country to lawfully access the products or services of a particular (foreign) company. In places like Singapore which follow a State-capitalism model, that exclusive agent customarily has strong ties to either the Government or to entities acting as agents of the Government (sovereign-wealth funds, public officials who also run nominally-private-sector companies, and so on). This rarely, if ever, is a boon for the consumer.

Case in point: today, I wanted to buy a good fan, a Vornado Compact 530, comparable to the Vornado fans I owned when in the States.

Naturally, I can't buy a fan from the Vornado Website, though it gives me plenty of information about the thing. A little search engine-fu tells me that Home-Fix are the exclusive distributor for Vornado in Singapore.

Naturally, I can't order anything, or even get any product information, from Home-Fix's site. I could, however, get the phone number(!) for their nearest store, and called to enquire about availability and pricing.

The conversation went something like this:

Me: Can you tell me if you have Vornado fans?
Clerk: Yes, we do. Which one are you looking for?
Me: The "Compact 530".
Clerk: Yes, we have that one, in white and black.
Me: How much is it?
Clerk: S$170.¹
Me: $170?!? So your rate on the US dollar is about three Singapore dollars, then? The list price on the Vornado Website is US $49.99!
Clerk: (as to a very small, very slow child) Oh, that's the online price. It's bound to be cheaper.
Me: Well, I've done some checking on various US, European and Australian stores; the walk-in retail price is right about the same.
Clerk: Well, our price is S$170.
Me: Well, thank you for your time.

Not to be terribly unfair to either Home-Fix or the clerk; that's the way the system operates here, and any company that didn't jack their prices up to whatever the marks will pay isn't doing things The Singapore Way. It's not as though it's actually people's own money; they're just holding it until those who pull the levers of State decide they want it back².

So, ragging at Home-Fix or any of the many, many other businesses whose prices have no apparent correlation to corresponding prices elsewhere won't accomplish anything. If, as was let slip during the Singapore Presidential "election" this year, the Government's sovereign-wealth funds and people connected to High Places really do control 2/3 or more of the domestic Singapore economy, then complaining about any individual companies is rather like the man who's been attacked by a chainsaw-wielding madman who then worries about all the blood on his shirt. Fix the real problems, and the small ones will right themselves.

Incidentally, this also illustrates why I support the Occupy movement. If Americans want to see what another decade or two of economic and political polarisation between the top 400 families and the rest of the country will look like, Singapore is a good first-order approximation. And some sixty percent of Singaporeans apparently either couldn't be bothered to think differently, or were too afraid to.

Sigh. I still need to find a reliable, efficient fan.

Footnotes:

1. According to the current XE.com conversion, 170 Singapore dollars (S$170) is approximately US$130.19, or some 260 percent of the list price. In a free society, that would be called "gouging"; here in Singapore it's called "buying a foreign product". Remember, no foreign companies really operate here. When you walk into a McDonald's or Starbucks or Citibank in Singapore, you're walking into a local, almost invariably Government-linked, franchise or local representative. The quality and experience difference can be, charitably, stark. (Return)

2. Known in the US as the "thug defense", after a Los Angeles mugger who used that line of "reasoning" as the basis for an attempted pro se defence in court. He was, of course, unsuccessful. (Return)

Wednesday, 20 October 2010

It's Not Lucid For Me Anymore

For those of you a bit limited in your English knowledge, lucid means "easily understood; completely intelligible or comprehensible."

My experience bringing up some Web servers on the latest release of Ubuntu Linux, release 10.10 aka "Maverick Meerkat" (replacing 10.04, "Lucid Lynx"), left me feeling scalped by Apache (the Web server software).

It's (usually) quite straightforward to bring up a Web site in a server's main document root directory, the system-wide, (hopefully) secured directory that Apache loads files from when you browse a simple Web URL such as http://www.example.com/.

Many modern Web sites, and the tools used to build them, depend heavily on Apache's mod_rewrite module, which takes the URL requested by your browser and "rewrites" it into something quite different for security purposes and/or to enable various Web application frameworks to function. Your browser's request for http://www.example.com/ may be presented to the server as if you'd typed http://server1.example.com/webapps/whatever/index.php?node=1. All this is transparent FM to you and, more importantly, to the search engines (Google and friends).

OK, fine. Anybody who's been doing Web work at all knows this, having used the Web server on his own system (Apache comes standard with every Apple Mac or Linux system; you Windows usees will have to do it for/to yourselves). The mod_rewrite syntax makes "arcane" a laughable understatement, but again, no biggie. If you're using just about any framework, it comes with (or documents how to create) the configuration file needed, which generally requires minimal-to-no twiddling from you. Again, as long as you're working at the system document root.

What will send you on a magical voyage of discovery, or an extreme bout of Google-fu, is if you're not working at the system root. Maybe you've got more than one Web site or app you're hosting on a single server (generally using "virtual hosts"). More likely, you want to be able to serve a Web page or site from your own user-level directory. Apache has a module for that, too; it's called mod_userdir. And as long as you're dealing with static Web pages, it's drop-dead simple. That makes sense; Apache has been around quite nearly as long as the Web has; all the easy problems and most of the reasonably hard ones have long been solved by now.

As you can probably guess from the build-up here, mixing all the above ingredients (Apache, a framework, mod_rewrite and mod_userdir) is a fairly reliable way to ensure that "hilarity ensues". This is not so much because the combination inherently poses challenges (reasonably experienced developers may differ on that), but because the interaction is quite often (read: until and unless decisively proven otherwise) platform-specific. By "platform-specific," I mean "virtually certain to differ between various distributions of Linux, not to mention Mac OS X, BSD Unix (FreeBSD, OpenBSD, etc.) and Windows."
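By way of illustration, the moving parts on a Debian/Ubuntu-style layout look roughly like this (a sketch: the directives are real Apache, but the paths and the exact Options list are assumptions that vary per platform, which is rather the point):

    # userdir.conf (sketch): serve each user's ~/public_html
    <IfModule mod_userdir.c>
        UserDir public_html
    </IfModule>

    <Directory /home/*/public_html>
        # Without FileInfo in AllowOverride, any RewriteRule lines in a
        # user's .htaccess are silently ignored; cue hours of flailing.
        AllowOverride FileInfo AuthConfig Limit Indexes
        Options Indexes SymLinksIfOwnerMatch IncludesNoExec
    </Directory>

Typically the user's .htaccess wants a RewriteBase /~username/ line as well, and that interplay is exactly where the per-platform surprises live.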

You may have made it all work on a dozen systems before, only to find that the next system you try displays the performance characteristics of an in-process actuation of an IED. You'll flail around, you'll feel like a "stupid newbie," you'll get on IRC and pound on Google. And sooner or later, because you are a stubborn SOB (aren't you?), you'll find (situation-specific) Enlightenment.

And so, I wish to announce my heart-felt thanks to user FRuso on the Ubuntu user forums who posted this response to The Question. It's a different answer than for openSUSE or Fedora, and please don't do this to an OS X system. But if you're one of the Teeming Millions™ (well, Hundreds, anyway) trudging down this well-worn trail and you're trying to make Ubuntu work, this will help out.

Because, after all, if every similar platform worked the same way, people might actually get work done. If it was that easy, ordinary Joe-Sixpack users might start setting up their own servers, instead of relying on corporate-sanctioned and -supported hosting providers to centralise everything. And then, people might actually exchange ideas and grow their minds a bit! Who knows?

Ahhh, it's "the Internet" (actually, the Web). Most people don't care, as long as they can view the latest bread and circuses on YouTube or Hulu.

Tuesday, 14 September 2010

Saving Effort and Time is Hard Work

As both my regular readers well know, I've been using a couple of Macs¹ as my main systems for some time now. As many (if not most, these days) Web developers do, I run different systems (Windows and a raft of Linuxes) using VMware Fusion so that I can do various testing activities.

Many Linux distributions come with some form of automation support for installation and updates². Several of these use the (generally excellent) Red Hat Anaconda installer and its automation scheme, Kickstart. Red Hat and the user community offer copious, free documentation to help the new-to-automation system administrator get started.

If you're doing a relatively plain-vanilla installation, this is trivially easy. After installing a new Fedora system (image), for example, there is a file named anaconda-ks.cfg in the root user's home directory, which can be used to either replay the just-completed installation or as a basis for further customisation. To reuse, save the file to a USB key or to a suitable location on your local network, and you can replay the installation at will.

Customising the installation further, naturally, takes significant additional effort — almost as "significant" as the effort required to do the same level of customisation manually during installation. The saving grace, of course, is that this only needs to be done once for a given version of a given distribution. Some relatively minor tinkering will be needed to move from one version to another (say, Fedora 13 to 14), and an unknowable-in-advance amount of effort needed to adapt the Kickstart configuration to a new, different distribution (such as Mandriva), since packages on offer as well as package names themselves can vary between distros³.
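For a flavour of what's involved, here's a trimmed, illustrative Kickstart sketch (not my actual file; the %packages entries are precisely the names that shift between distros and versions):

    # ks-sketch.cfg: mostly-automated Fedora-era install
    install
    lang en_GB.UTF-8
    keyboard us
    timezone Asia/Singapore
    rootpw --iscrypted $6$...
    bootloader --location=mbr
    clearpart --all --initlabel
    autopart

    %packages
    @core
    httpd
    mysql-server
    php
    %end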

It's almost enough to make me pull both remaining hairs out. For several years, I have had a manual installation procedure for Fedora documented on my internal Wiki. That process, however, leaves something to be desired, mainly because it is an intermixture of manual steps and small shell scripts that install and configure various bits and pieces. Having a fire-and-forget system like Kickstart (that could then be adapted to other distributions as well), is an extremely seductive idea.

It doesn't help that the Kickstart Configurator on Fedora 13, which provides a (relatively) easy-to-use GUI for configuring and specifying the contents of a Kickstart configuration file, works inconsistently. Using the default GNOME desktop environment, one of my two Fedora VMs fails to display the application menu, which is used for tasks such as loading and saving configuration files. Under the alternate (for Fedora) KDE desktop, the menu appears and functions correctly.

One of the things I might get around to eventually is to write an alternative automated-installation-configuration utility. Being able to install a common set of packages across RPM-based (Red Hat-style) Linuxes such as Fedora, Mandriva and CentOS as well as Debian and its derivatives (like Ubuntu), and maybe one or two others besides, would be a Very Handy Thing to Have™.

That is, once I scavenge enough pet-project time to do it, of course. For now, it's back to the nuances of Kickstart.

Footnotes:

1. an iMac and a MacBook Pro. (Return)

2. Microsoft offer this for Windows, of course, but they only support Vista SP1, Windows 7 and Windows Server 2008. No XP. Oops. (Return)

3. Jargon. The term distro is commonly used by Linux professionals and enthusiasts as an abbreviation for "distribution"; a collection of GNU and other tools and applications built on top of the Linux kernel. (Return)

Thursday, 2 September 2010

Patterns and Anti-Patterns: Like Matter and Anti-Matter

Well, that's a few hours I'd like to have over again.

As both my regular readers know, I've long been a proponent of agile software development, particularly with respect to my current focus on Web development using PHP.

One tool that I, and frankly any PHP developer worth their salt, use is PHPUnit for unit testing, a central practice in what's called test-driven development or TDD. Essentially, TDD is just a bit of discipline that requires you to determine how you can prove that each new bit of code you write works properly — before writing that new code. By running the test before you write the new code, you can prove that the test fails... so that, when you write only the new code you intend, a passing test indicates that you've (quite likely) got it right.

At one point, I had a class I was writing (called Table) and its associated test class (TableTest). Once I got started, I could see that I would be writing a rather large series of tests in TableTest. If they remained joined in a single class, they would quickly grow quite long and repetitive, as several tests would verify small but crucial variations on common themes. So, I decided to do the "proper" thing and decompose the test class into smaller, more focused pieces, and have a common "parent" class manage all the things that were shared or common between them. Again, as anyone who's developed software knows, this has been a standard practice for several decades; it's ordinarily the matter of a few minutes' thought about how to go about it, and then a relatively uneventful series of test/code/revise iterations to make it happen.

What happened this afternoon was not "ordinary." I made an initial rewrite of the existing test class, making a base (or "parent") class which did most of the housekeeping detail and left the new subclass (or "child" class) with just the tests that had already been written and proven to work. (That's the key point here; I knew the tests passed when I'd been using a single test class, and no changes whatever were made to the code being tested. It couldn't have found new ways to fail.)

Every single test produced an error. "OK," I thought, "let's make the simplest possible two-class test code and poke around with that." Ten minutes later, a simplified parent class and a child class with a single test were producing the same error.

The simplified parent class can be seen on this page, and the simplified child class here. Anybody who knows PHP will likely look at the code and ask, "what's so hard about that?" The answer is, nothing — as far as the code itself goes.

What's happening, as the updated comments on pastebin make clear, is that there is a name collision between the $data item declared as part of my TableTest class and an item of the same name declared as part of the parent of that class, PHPUnit's PHPUnit_Framework_TestCase.

In many programming languages, conflicts like this are detected and at least warned about by the interpreter or compiler (the program responsible for turning your source code into something the computer can understand). PHP doesn't do this, at least not as of the current version. There are occasions when being able to "clobber" existing data is a desirable thing; the PHPUnit manual even documents instances where that behaviour is necessary to test certain types of code. (I'd seen that in the manual before; but the significance didn't immediately strike me today.)

This has inspired me to write up a standard issue-resolution procedure to add to my own personal Wiki documenting such things. It will probably make it into the book I'm writing, too. Basically, whenever I run into a problem like this with PHPUnit or any other similar interpreted-PHP tool, I'll write tests which do nothing more than define, write to and read from any data items that I define in the code that has problems. Had I done that in the beginning today, I would have saved myself quite a lot of time.

Namely, the three hours it did take me to solve the problem, and the hour I've spent here venting about it.
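Before closing, here's a minimal sketch of what such a probe test looks like (the class and field names are hypothetical; the point is that the probe exercises nothing but the suspect data item):

    <?php
    // Probe for a name collision on a suspect field. PHPUnit's TestCase
    // keeps a $data member of its own internally, which is what bit me.
    require_once 'PHPUnit/Framework.php'; // PHPUnit 3.x-era include

    class DataProbeTest extends PHPUnit_Framework_TestCase
    {
        protected $data; // deliberately redeclare the suspect name

        public function testDefineWriteAndReadData()
        {
            $this->data = array( 'marker' => 42 );
            $this->assertSame( 42, $this->data[ 'marker' ] );
        }
    }

If the framework already owns the name, a probe like this blows up with nothing else in play, and you have your answer in minutes instead of hours.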

Thanks for your patience. I'll have another, more intelligent, post along shortly. (That "more intelligent" part shouldn't be too difficult now, should it?)

Wednesday, 7 July 2010

Phing! It's a Dessert Topping! It's a Floor Wax! No, it's efPhing something!

I want my hour back. No, seriously; that's what happens when you don't touch a tool for a while, but you spend a lot of time with its "kissin' cousin."

Phing, if you're new-ish to serious PHP development, is your ultimate Swiss Army Ginsu Chainsaw™. It's a "build tool" that lets you automate pretty near anything, especially having to do with PHP. There are several dozen "tasks", like the PhpCodeSnifferTask, the IfTask, and of course the PhingTask. There is also good documentation on extending Phing; adding all sorts of new tasks and other bits, and people have run with the ball.

Put another way, if you're coming from the Java omniverse, Phing is analogous to Apache Ant, with at least as vibrant a community effort throwing stuff over the wall.

Sometimes the things that get thrown over the wall blow up, though. Sometimes that isn't Phing's phault, though that's the fish-eye lens you're looking through as you grapple with the problem. For instance, the PEAR coding standards as interpreted by PHP CodeSniffer require, in the internal documentation you must produce to comply with the standard, tags that are not supported by the documentation generator they're ostensibly intended for. You'll see your Phing script break on that rock — unless you "change the conditions of the test." Figuring out the (multiple) flags, option settings and occult incantations necessary to make things run smoothly will take you some time, unless you just did it last week and wrote copious notes in your wiki.

But the absolute break-down-and-laugh-until-you-cry moment came from the way Phing handles "custom properties," the symbols you can define to make your Phing life easier in various ways, like having common, shared policies and processes across projects, with specific values set on a per-project basis. You whip up a new "property file" as part of a new project, and reuse your existing XML "build file." Ah, but there is one sizable bump to trip the unwary or rushed: Properties are specified in a sensible XML format when included in the build file, but separate, included "property files" look like old-style Windows .ini files. And $DEITY help you if you forget, because Phing sure won't.

Let's say you have a "drop-dead simple" build.xml file, like this:

<?xml version="1.0" encoding="UTF-8"?>

<project name="demo1" default="demo">

    <if>
        <equals arg1="${usepropfile}" arg2="false" />
        <then>
            <property name="foo" value="baz" />
        </then>
        <else>
            <property file="build.properties" />
        </else>
    </if>

    <target name="demo">
        <echo message="Demo target; foo= ${foo}." />
    </target>
</project>

If you screw up the build.properties file — say, by specifying your values in the same sort of XML format:

    <property name="foo" value="quuz" />

you'll see a most sublimely confarkled message:

    ....
     [echo] Demo target; foo= ${foo}.
    ....

At this point, you'll either have a D'oh! moment, or you'll start chasing your tail. Choose Door #2, and you could be at it a while.

The answer, as often, is RTFM. Appendix F (!) of the Phing manual defines the "Property File Format," which is our venerable foil the .ini file with a few twists (like being able to define properties that incorporate the values of other properties – just like in build.xml).
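For the record, in that format the working build.properties for the example above collapses to a single line (comments use #):

    # build.properties: Phing property-file (.ini-style) format
    foo = quuz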

Would it really have been too complicated to use the same format (i.e., the XML tags) in the property file as in the build file? You could even deal with making it valid XML with minimal effort. But no...

I actually did learn this properly a couple of years ago, when Phing was new and kind of shiny. It's only become more powerful since then. Just don't hold three lighted M-80s in your hand.

Why rant about this? I've long firmly believed that one of the main duties of a master craftsman, in any craft – including software development – is to occasionally fall into the weeds, pick himself up, and remind the young journeymen and apprentices "don't do what I just did; you'll make yourself look silly, or worse." Better for one person to do it who can be reasonably expected to pick himself up, than for a dozen others to fall in with no idea how deep they're getting.

Wednesday, 30 June 2010

The Decline of the Internet, Part MMMDIC

Rant mode: on. Some effort will be made to keep ear-steam to a minimum.

Internet Relay Chat (better known as IRC), specifically Undernet, was one of the very first things I discovered on the (then-)text-mode-only Internet. Over the next 20 years or so, I've gone through varying levels of activity, including writing an IRC client for OS/2, the operating system that brought technology to a marketing fight (as in, "bringing a pocket knife to a firefight").

One of the changes that happened on Undernet (and other networks) when the Net started getting used by larger hordes of people, with varying intentions, was to institute some form of registration. You'd visit a bare-bones Web site (such as the one for Undernet), fill in a few items on a form, and wait for an email to pop into your inbox stating that your ID was active (known to the bot managing authentication on the servers). I'd used the same "nick" (nickname), JeffD, for at least as long as I or my logs can remember (certainly before 1998), and formalised this when the channels I frequented (chiefly #usa) started requiring it.

Now, I (like to think I) do have a life outside IRC and the Net in general; I've been known to go as long as a year without signing on in Undernet. Not a problem; things Just Worked™, in almost Mac-like fashion.

Until today. I fired up my current-favourite client program (Colloquy, highly recommended); it connected to a random Undernet server (in Budapest in this case), and sent the customary PRIVMSG (private message) to the bot that handles authentication, with my username and password. Instead of the usual "authentication accepted" message coming back, I see "X NOTICE I don't know who JeffD is."

Well, blowups happen. I go to the Undernet CService page I mentioned earlier; it's a whiz-bang series of 7 pages now. I go through it, enter my desired username, password, and an email address that I can be reached at (the same one I've had since text-mode days), and am finally presented with this marvel of clarity:

Congratulations! Something went wrong. Sorry.

I'd bet heavily that the reason for the breakage would, on investigation, turn out to be a case of unchecked assumptions no longer being valid, combined with a lack of human management. Nobody has the time for community-without-a-pricetag anymore, and that, in a nutshell, is what IRC is.

And, two hours later, I still haven't received the expected email. Feh.

Rant mode: standby.

Friday, 18 June 2010

I Thought Standard Libraries Were Supposed to be Better...

...than hand coding. Either the PHP folks never got that memo, or I'm seriously misconceptualising here.

Case in point: I was reading through Somebody Else's Code™, and I saw a sequence of "hand-coded" assignments of an empty string to several array entries, similar to:

    $items[ 'key2' ] = '';
    $items[ 'key1' ] = '';
    $items[ 'key6' ] = '';
    $items[ 'key3' ] = '';
    $items[ 'key8' ] = '';
    $items[ 'key5' ] = '';
    $items[ 'key4' ] = '';
    $items[ 'key7' ] = '';

I thought, "hey, hang on; there's a function to do easy array merges in the standard library (array_merge); surely it'd be faster/easier/more reliable to just define a (quasi-)constant array and merge that in every time through the loop?"

Fortunately, I didn't take my assumption on blind faith; I wrote a quick little bit to test the hypothesis:


<?php
// Quick benchmark: array_merge() versus direct assignment for
// resetting a fixed set of keys to empty strings.
$count = 1e5;
$data = array(
        'key2' => '',
        'key1' => '',
        'key6' => '',
        'key3' => '',
        'key8' => '',
        'key5' => '',
        'key4' => '',
        'key7' => '',
        );
$realdata = array();

$start = microtime( true );
for ( $loop = 0; $loop < $count; $loop++ )
{
    $realdata = array_merge( $realdata, $data );
};
$elapsed = microtime( true ) - $start;
printf( "%ld iterations with array_merge took %7.5f seconds.\n", $count, $elapsed );

$start = microtime( true );
for ( $loop = 0; $loop < $count; $loop++ )
{
    $data[ 'key2' ] = '';
    $data[ 'key1' ] = '';
    $data[ 'key6' ] = '';
    $data[ 'key3' ] = '';
    $data[ 'key8' ] = '';
    $data[ 'key5' ] = '';
    $data[ 'key4' ] = '';
    $data[ 'key7' ] = '';
};
$elapsed = microtime( true ) - $start;
printf( "%ld iterations with direct assignment took %7.5f seconds.\n", $count, $elapsed );

I ran the tests on a nearly two-year-old iMac with a 3.06 GHz Intel Core 2 Duo processor, 4 GB of RAM, OS X 10.6.4 and PHP 5.3.1 (with Zend Engine 2.3.0). Your results may vary on different kit, but I would be surprised if the basic results were significantly re-proportioned. The median times from running this test program 20 times came out as:

Assignment process      Time (seconds) for 100,000 iterations
array_merge             0.41995
Hand assignment         0.15569

So, the "obvious," "more readable" code runs nearly three times slower than the existing, potentially error-prone during maintenance, "hand assignment." Hang on, if we used numeric indexes on our array, we could use the array_fill function instead; how would that do?

Adding the code:

// (Untimed) numeric-key seed for $data2; the timed loop below rebuilds
// it from scratch with array_fill on every iteration.
$data2 = array();
$data2[ 0 ] = '';
$data2[ 1 ] = '';
$data2[ 2 ] = '';
$data2[ 3 ] = '';
$data2[ 4 ] = '';
$data2[ 5 ] = '';
$data2[ 6 ] = '';
$data2[ 7 ] = '';

$start = microtime( true );
for ( $loop = 0; $loop < $count; $loop++ )
{
    $data2 = array_fill( 0, 8, '' );
};
$elapsed = microtime( true ) - $start;
printf( "%d iterations with array_fill took %7.5f seconds.\n", $count, $elapsed );

produced a median time of 0.21475 seconds, or some 37.9% slower than the original hand-coding.
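One more candidate I haven't timed here, for anyone playing along at home: PHP's array union operator (+). For keys present in both operands, the left-hand operand's values win, so unioning the constant array onto the target is equivalent to the hand assignment: the eight keys are reset to '', and any other keys in the target survive. The moral of this post being "measure rather than assume", here's the harness addition:

$start = microtime( true );
for ( $loop = 0; $loop < $count; $loop++ )
{
    // Union: for duplicate keys, the left operand's values win.
    $realdata = $data + $realdata;
};
$elapsed = microtime( true ) - $start;
printf( "%d iterations with the union operator took %7.5f seconds.\n", $count, $elapsed );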

For folks coming from other, compiled languages, such as C, C++, Ada or what-have-you, this makes no sense whatsoever; those languages have standard libraries that are not only intended to produce efficiently-maintainable code, but (given reasonably mature libraries) efficiently-executing code as well. PHP, at least in this instance, is completely counterintuitive (read: counterproductive): if you're in a loop that will be executed an arbitrary (and arbitrarily large) number of times, as the original code was intended to be, you're encouraged to write code that invites typos, omissions and other errors creeping in during maintenance. That's a pretty damning indictment for a language that's supposedly at its fifth major revision.

If anybody knows a better way of attacking this, I'd love to read about it in the comments, by email or IM.

Tuesday, 1 June 2010

The Sun Has Set on a Great Brand

Just a quick note pointing out, once again, that when many of us foresaw gloom and doom from the Oracle acquisition of Sun Microsystems....we were being hippie-pie-in-the-sky optimists.

Item: A couple of days ago, I tried to log into my Sun Developer Network account, so I could grab the latest "official" Java toolset for a Linux VM I was building. I couldn't log in; neither the password that was saved in my password manager nor any other password I could think of would work. I then went through the recover-lost-password form, expecting a reset email to pop into my inbox within a minute or two. An hour later, I had to go do other things. Yesterday, no email.

Finally, today (Tuesday morning), this gem appears in my inbox:

Date: Sun, 30 May 2010 10:07:20 -0700 (PDT)
From: login-autoreply@sun.com
To: jdickey@seven-sigma.com
Message-ID: <12602411.48691.1275239240752.JavaMail.daemon@mailhost>
Subject: Your Sun Online Account Information
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Source-IP: rcsinet15.oracle.com [148.87.113.117]
X-CT-RefId: str=0001.0A090205.4C029B48.0062,ss=1,fgs=0
X-Auth-Type: Internal IP

Thank you for contacting Sun Online Account Support.

We're sorry but a user name for your account could not be determined.

We would be happy to look into this further for you.  Please contact us at @EMAIL_FEEDBACK_CONTACT@.  We apologize for any inconvenience.

Thank you,
Sun Online Account Support

Oracle isn't even bothering to fill in (no doubt database-fed) form-mail fields to keep in touch with current and possible future Sun-originated customers, and the login-autoreply address is no doubt a black hole slaved to /dev/null. If they had sent an HTML email with 48-point red F*** YOU!!! as the entire content, the message received could not have been more clear: you don't matter, go away, throw your money, time and attention at somebody else.

I plan to do precisely that, and make clear to all and sundry precisely why.

Wednesday, 26 May 2010

Beating Your Head Against the Wall, Redux

...or, the "Monty Python and the Holy Grail" monks' guide to making your Mac desktop work like a server, instead of going and getting a copy of OS X Server like you should...

Mac OS X Server brings OS X simplicity and Unix power to a range of hardware systems. Most of the things that Server makes trivially simple can be done in OS X Desktop. Some of them, however, require the patience of Job and the ingenuity and tenaciousness of MacGyver...or so they at first appear.

One such task is installing Joomla! This is a nice little Web CMS which has some nice features for users, developers and administrators. On most Unix-like systems, or even recent versions of Microsoft Windows, installation is a very straightforward process for any system which meets the basic requirements (Web server, PHP, MySQL, etc.) as documented in the (PDF) Installation Manual or the PDF Quick Start guide. On most systems, it takes only a few minutes to breeze through from completed download to logging into the newly installed CMS.

The OS X desktop, as I said, is a bit different. This isn't a case of Apple's famous "Think Different" campaign so much as it appears to be a philosophical conflict between Apple's famous ease-of-use as applied to user and rights management, coming up against traditional Unix user rights management. Rather than the former merely providing a polished "front end" interface for the latter, some serious mind-games and subversion are involved. And I'm not talking about the well-known version control software.

When things go wrong with a simple installation process for a well-known piece of software, Google is usually Your Best Friend. If you search for joomla mac install, however, you quickly notice that most of the hits discussing the OS X desktop recommend that you install a second Apache, MySQL and PHP stack in addition to the one that's already installed in the system — packages such as XAMPP, MAMP and Bitnami. While each of these packages appears to do just what it says on the tin, I don't like having duplicate distributions of the same software (e.g., MySQL) on the same system.

Experience and observation have shown that that's a train wreck just begging to happen. Why? Let's say I've got the "Joe-Bob Briggs Hyper-Extended FooBar Server" installed on my system. (System? Mac, PC, Linux; doesn't matter for this discussion.) When FooBar brings out a new release of their server (called the 'upstream' package), Joe-Bob has to get a copy of that, figure out what's changed from the old one, and (try to) adapt his Hyper-Extended Server to the new version. He then releases his Hyper-Extended version, quite likely some time behind the "official" release. "What about betas," you ask? Sure, Joe-Bob should have been staying on top of the pre-release cycle for his upstream product, and may well have had quite a lot of input into it. But he can't really release his production Hyper-Extended Server until the "official" release of the upstream server. Any software project is subject to last-minute changes and newly-discovered "show-stopper" issues; MySQL 5.0 underwent 90 different releases. That's a lot for anybody to keep up with, and the farther away you get from that source of change (via repackaging, for example), the harder it is to manage your use of the software, taking into account features, security and the like.

So I long ago established that I don't want that sort of duplicative effort for mainstream software on modern operating systems. (Microsoft Windows, which doesn't have most of this software on its default install, is obviously a different problem, but we're not going there tonight.) Conflicts, like failing to completely disable the "default" copy of a duplicated service, are far too easy to run into; there's no good reason to inject that kind of complexity into a system needlessly.

That isn't to say that it doesn't become very attractive sometimes. Even on a Mac OS X desktop — where "everything" famously "just works" — doing things differently than the "default" way can lead to great initial complexity in the name of avoiding complexity down the line. (Shades of "having to destroy the village in order to save it.")

The installation of Joomla! went very much like the (PDF) Installation Manual said it should... until you get to the screen that asks for local FTP credentials that give access to the Joomla! installation directory. It would appear that setting up a sharing-only user account on the system should suffice, and in fact several procedures documenting this for earlier versions of Mac OS X describe doing just that. One necessary detail appears different under 10.6.2, however: the "Accounts" item in System Preferences no longer allows the specification of user-specific command shells...or, if it does, it's very well hidden.

Instead, I created a new regular, non-administrative user for Joomla!, then removed the new user's home directory (under /Users) and replaced it with a symlink to the Joomla! installation directory.
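In shell terms, that step amounted to something like the following; the account name and paths here are assumptions for illustration, not my actual values:

# Assumed: a "joomla" user created in System Preferences, and Joomla!
# installed under the default Apache document root. Adjust to suit.
sudo rm -rf /Users/joomla
sudo ln -s /Library/WebServer/Documents/joomla /Users/joomla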

Also, one difference between several of the "duplicating" xAMP systems I mentioned above and the stock Apache Web server on OS X (and Linux) is that in the stock configuration, access to served directories is disabled by default; the idea is that you define a new Apache <Directory> directive for each directory/application you install. Failing to do this properly and completely will result in Apache 403 ("Forbidden") errors. Depending on your attitude toward security, you may either continue to do this, or change the default httpd.conf setting to Allow from All and use .htaccess files to lock down specific directories.
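For illustration, a per-application stanza in httpd.conf might look something like this (Apache 2.2 syntax; the alias and paths are invented for this example, not my actual Joomla! setup):

# Hypothetical per-application stanza; adjust paths to your install.
Alias /joomla "/Library/WebServer/Documents/joomla"
<Directory "/Library/WebServer/Documents/joomla">
    Options FollowSymLinks
    AllowOverride All
    Order allow,deny
    Allow from all
</Directory>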

Once you have the underlying requirements set up (FTP access is the only real think-outside-the-box issue), Joomla! should install easily. But if you're in a hurry and just trying to race through the documented procedures, you're quite likely to spend considerable time wondering why things don't Just Work.

And no, neither Fedora nor Ubuntu Linux "does the right thing" out-of-the-box either. At least, not in my tests.

Tuesday, 27 April 2010

Let's Do The Time Warp Agai-i-i-i-n!! (Please, $DEITY, no...)

For those who may somehow not be aware of it, LinkedIn is a (generally quite good) professionally-oriented social-networking site. This is not Facebook, fortunately; it's not geared towards teenagers raving about the latest corporate boy band du jour. It can often be, however, a great place to network with people from a variety of vocational, industry and functional backgrounds, make contacts, share information, and so on.

One of the essential features of LinkedIn is its groups, which are primarily used for discussions and job postings. In the venerable Usenet tradition, these discussions can have varying levels of insightful back-and-forth, or they can degenerate into a high-fidelity emulation of the "Animal House" food fight. As with Usenet, they can often give the appearance of doing both at the same time. Unlike Usenet, one has to be a member of LinkedIn to participate.

One of the (several) groups I follow is LinkedPHPers, which bills itself as "The Largest PHP Group" on LinkedIn. Discussions generally fall into at least one of a very few categories:

  • How do I write code to solve "this" problem? (the 'professional' version of "Help me do my homework");

  • What do people know/think about "this" practice or concept?

  • I'm looking for work, or people to do work; does anybody have any leads?

As veterans of this sort of discussion would expect, the second type can lead to long and passionate exchanges with varying levels of useful content (what became known on Usenet as a "flame war"). The likelihood of such devolution seems inversely proportional to the specificity of the topic, and directly proportional to the degree to which the concept in question is disregarded/unfamiliar/unknown to those with an arguable grasp of their Craft.

It should thus be no surprise that a discussion on the LinkedPHPers group of "Procedural vs Object Oriented PHP Programming" would start a flame war for both of the above reasons. With 58 responses over the past month as I write this, there are informational gems of crystal clarity buried in the thick, gruesome muck of proud ignorance. As Abraham Lincoln is reported to have said, "Better to remain silent and be thought a fool than to speak out and remove all doubt."

What's my beef here? Simply that this discussion thread is re-fighting a war that the programming craft at large fought and settled over a quarter-century ago. The reality is that any language with a reasonable implementation of OOP (encapsulation/access control, polymorphism and inheritance, in that order of importance by my reckoning) should be used that way.
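To put concrete names on those three features, here's a minimal sketch in vanilla PHP 5. It's illustrative only; the classes are invented for this post, not taken from the LinkedIn thread:

<?php
abstract class Account
{
    private $balance = 0;            // encapsulation: state is hidden...

    public function deposit( $amount )
    {
        if ( $amount > 0 ) {         // ...and guarded by the class's own logic
            $this->balance += $amount;
        }
    }

    public function balance()
    {
        return $this->balance;
    }

    // polymorphism: each subclass supplies its own behaviour
    abstract public function monthlyFee();
}

class SavingsAccount extends Account     // inheritance
{
    public function monthlyFee() { return 0; }
}

class CheckingAccount extends Account
{
    public function monthlyFee() { return 5; }
}

$accounts = array( new SavingsAccount(), new CheckingAccount() );
foreach ( $accounts as $account ) {
    $account->deposit( 100 );
    printf( "Fee: %d, balance: %d\n", $account->monthlyFee(), $account->balance() );
}

The point isn't the toy domain; it's that the language itself enforces who may touch $balance, while each subclass answers monthlyFee() in its own way.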

Several of the posts trot out the old canard about a performance 'penalty' when using OOP. In practice, that's true of only the sharpest edge cases: simple, tiny, standalone classes that should never have been written as classes in the first place, because they don't provide a useful abstraction of a concept within the solution space. Such code is generally the work of developers who are not professionally knowledgeable of the concepts involved, and quite often of those who copy and paste code they don't understand into projects they also don't understand. That bunch sharply limited the potential evolution and adoption of C++ in the '80s and '90s, and many of their ideological brethren have made their home in Web development using PHP.

Yes, I know that "real" OOP in PHP is a set of tacked-on features, late to the party: first seriously attempted in PHP 4, successively reworked in 5.0, 5.2 and 5.3, with the semi-mythological future PHP 6 promising many more. I know that some language features are horribly unwieldy (which is why I won't use PHP namespaces in my own code; proven idea, poor implementation). But taken as a whole, it's increasingly hard to take the Other Side ("we don' need no steeeenkin' objects") at all seriously.

The main argument for ignoring the "ignore OOP" crowd is simply this: competent, thoughtful design using OOP gives you the ability to know and prove that your code works as expected, and data is accessed or modified only in the places and ways that are intended. OOP makes "software-as-building-blocks" practical, a term that first gained currency with the Simula language in the mid-1960s. OOP enables modern software proto-engineering practices such as iterative development, continuous integration and other "best practices" that have been proven in the field to increase quality and decrease risk, cost and complexity.

The 'ignore OOP in PHP' crowd like to point to popular software that was done in a non-OOP style, such as Drupal, a popular open-source Web CMS. But Drupal is a very mature project, by PHP standards; the open-source project seems to have originated in mid-2000, and it was apparently derived from code written for a project earlier still. So the Drupal code significantly predates PHP 5, if not PHP 4 (remember, the first real whack at OOP in PHP). Perusing the Drupal sources reveals an architecture initially developed by some highly experienced structured-programming developers (a precursor discipline to OOP); their code essentially builds a series of objects by convention, not depending on support in the underlying language. It is a wonder as it stands – but I would bet heavily that the original development team, if tasked with re-implementing a Web CMS in PHP from a blank screen, would use modern OO principles and the underlying language features which support them.

And why would such "underlying language features" exist and evolve, especially in an open-source project like PHP, if there were not a real, demonstrable need for them? Saying you're not going to do OOP when using PHP is akin to saying you intend to win a Formula One race without ever shifting out of second gear.

Good luck with that. You might want to take a good, hard look at what your (more successful) colleagues are doing, adopt what works, and help innovate your Craft further. If you don't, you'll continue to be a drag on progress, a dilettante intent upon somehow using a buggy whip to accelerate your car.

It doesn't work that way anymore.

Saturday, 27 February 2010

Protecting Yourself and Others from Yourself and Others

Nowadays, there's simply no excuse for any computer connected to the Internet, regardless of operating system, not to have both a hardware firewall (usually implemented in your router/broadband "modem") and a software firewall, monitoring both incoming and outgoing traffic.

The software firewall I've been recommending to those "unable"/unwilling to leave the hypersonic train wreck that is Windows has been ZoneAlarm's free firewall. Users of modern operating systems should start off by enabling and configuring the firewall built into their OS (either ipfw on Mac OS X/BSD Unix, or netfilter for Linux). That can be tricky to manage; fortunately there are several good "front end" helper packages out there, such as WaterRoof. Another excellent, popular Mac tool is Little Snitch; the latter two work quite well together.

However, no matter which tools you use to secure your system's Net connection, one of the main threats to continued secure, reliable operation remains you, the user. This has a habit of popping up in unexpected but obvious-in-hindsight ways.

For instance, I recently changed my Web/email hosting service. Long prior to doing so, I had defined a pair of ipfw rules that basically said "Allow outgoing SMTP (mail-sending) connections to my hosting provider; prevent outgoing mail-sending connections to any other address." Thus, were any of the Windows apps I ran inside a VMware Fusion VM to become compromised (as essentially all Windows PCs in Singapore are), they couldn't flood spam out onto the Net – at least not using normal protocols. This didn't do anything to protect the Net from other people's Windows PCs that might sometimes legitimately connect to my network, but it did ensure that the Windows VM (or anything else) running on the Mac was far less likely to contribute to the problem.
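In ipfw terms, the rule pair looked something like this; the rule numbers and the provider's server address are placeholders for illustration, not my actual values:

# Allow outgoing SMTP only to the hosting provider's mail server...
ipfw add 01000 allow tcp from me to 203.0.113.25 dst-port 25 out
# ...and deny outgoing SMTP to anywhere else.
ipfw add 01100 deny tcp from me to any dst-port 25 out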

A few days after I made the change, I noticed that the outgoing mail server configured in my email client hadn't been changed over to the new provider, so I fixed that. And then, I couldn't send mail anymore. It took an embarrassingly long time (and a search of my personal Wiki) to remember "hey, I blocked all outgoing mail except to (the old provider) in my software firewall." Two minutes later, WaterRoof had told ipfw to change the "allowed" SMTP connection, and I soon heard the "mail sent" tone from my client.

Mea culpa, indeed. But why bother blogging about this? To reinforce these ideas to my reader(s):

  1. If you aren't already using a software firewall in addition to the hardware one you probably have (especially if you're not aware of it), you should be. It will make you a better Net citizen;

  2. Use a Wiki, either a hosted one like PBworks or one running on your own system (I use Dokuwiki; this page on Wikipedia has a list of other packages for the host-it-yourselfer); and

  3. Use your Wiki to record things that you'd previously think of writing in a dead-tree notebook or a bajillion Post-it® notes stuck to all available surfaces. This specifically and emphatically includes details of your system configuration(s).

Of course, if using a public, hosted wiki, you'll want to make sure that you can secure pages with sensitive data, like system configurations; why make Andrei Cracker's job easier than it already is?

This whole rant is basically a single case of the age-old warning, "if you don't write it down (in a way that it can be found again at need), it never happened." As the gargantuan information fire-hose that is the Internet keeps increasing both the flow of information and the rate at which that flow grows, this becomes all the more critical for any of us.

Wednesday, 10 February 2010

Simple Changes Rarely Are

I want to say "thank you" to the anonymous commenter to my post, XHTML is (Nearly) Useless, who said "Paragraphs would be useful."

Google's Blogger software, by default, separates paragraphs with an "HTML break tag" (br); this preserves the visible separation as intended, but destroys the structural, semantic flow. I recently discovered this, found the feature to turn it off ("don't put "break" tags in unless I bloody tell you to!"), and set it. I naively thought that this would apply only to new posts, and that posts previously saved in the old, "broken" format would stay as they were.
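For the record, the two forms look like this (illustrative markup only); they render almost identically, but only one of them is structurally two paragraphs:

<!-- Blogger's default: one run of text with visual breaks forced in -->
One thought.<br /><br />Another thought.

<!-- Semantic markup: two genuine, structural paragraphs -->
<p>One thought.</p>
<p>Another thought.</p>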

Oops.

I must now go back and hand-edit every post over the last nearly five years so that their paragraph flow is properly marked up. Since I do in fact have other demands on my time, this is not likely to be completed immediately; it is, however, a very high priority.

It's enough to make you seriously consider migrating to WordPress — which is apparently trivially easy.

Tuesday, 2 February 2010

I thought OSS was supposed to break down walls, not build them.

Silly me. I've only been using, evangelizing and otherwise involving myself in open source software for 15 or 20 years, so what do I know?

In reaction to the latest feature article in DistroWatch Weekly, I'm angry. I'm sad. Most of all, I feel sorry for those losers who would rather keep their pristine little world free of outside involvement, especially when that involvement comes with the express intent of making it easy for non-übergeeks to use open source software — in this case, OpenBSD. OpenBSD, like its cousins FreeBSD and NetBSD, is one of the (current) "major" base versions of BSD Unix. While the latter two have had numerous other projects take their software as a base, fewer have branched from "Open" BSD, as can be seen in this comparison on Wikipedia. Recently, two OpenBSD enthusiasts attempted to address this, only to be flamed mercilessly for their trouble.

The DistroWatch feature previously mentioned concerns GNOBSD, a project created by Stefan Rinkes, whose goal plainly was to make this highly stable and secure operating system, with a lineage that long predates Linux, accessible to a wider audience (one not already using Mac OS X, which is itself based, indirectly, on both FreeBSD and NetBSD).

For his troubles, Mr. Rinkes was subjected to repeated, extreme, egregiously elitist flaming, this thread being but one example. He was eventually forced to withdraw public access to the GNOBSD disc image, and to add a post stating that he did not "want to be an enemy of the OpenBSD Project."

Reading along the thread on the openbsd-misc mailing list brings one to a post by one "FRLinux", linking to another screaming flamewar summarized here, with the most directly relevant thread starting with a message titled "ComixWall terminated". Another hard worker, intent on creating an OpenBSD-based operating system that was easy for "ordinary people" to use, had promptly incurred the wrath of the leader of the OpenBSD effort, Theo de Raadt, with numerous other "worthies" piling on.

WTF?!?

OK, I get it. The OpenBSD folks don't want anyone playing in their sandbox, providing access to the OpenBSD software (itself supposedly completely open-source, free software) in any way that might compete with "official" OpenBSD downloads or DVD/CD sales. They especially don't want any possibility of the Great Unwashed Masses™ — i.e., anyone other than self-professed, officially-blessed übergeeks — playing with "their" toys.

OK, fine. It has been a large part of my business and passion to evangelize open source and otherwise free software as an enabler for the use of open data standards by individuals and businesses. I have supported and/or led the migration, in whole or in part, of several dozen small to midsize businesses from a completely closed, proprietary software stack (usually Microsoft Windows and Microsoft Office for Windows) to a mix of other (open and proprietary) operating systems and applications. OpenBSD has historically been high on my list of candidates for clients' server OS needs; if you can manage a Windows Server cluster with acceptable performance and stability, your life will be easier managing almost anything else — provided that you're not locked in to proprietary Windows-only applications and services. In these days of "cloud" computing and "software as a service", the arguments against Ye Olde Standarde are becoming much more compelling.

I just won't be making those arguments using OpenBSD anymore. And to my two clients who've expressed interest in BSD on the desktop for their "power" developers over the next couple of years, I'll be pitching other solutions... and explaining precisely why.

Because, out here in the real world, people don't like dealing with self-indulgent assholes. That's a lesson that took this recovering geek (in the vein of "recovering alcoholic") far too many years to learn. And people especially don't like trusting their business or their personal data to such people if they have any choice at all.

And, last I checked, that was the whole point of open standards, open source, and free software: giving people choices that allow them to exercise whatever degree of control they wish over their computing activities. I'd really hate to think that I've spent roughly half my adult life chasing a myth. Too many people have put too much hard work into far too many projects to let things like this get in our way.