Archimedes' Lever: tutorial

Showing posts with label tutorial. Show all posts

Monday, 24 February 2014

Let's Stop Pretending that Source Control is "Optional but Recommended"

Essentially all of us who make our living from the craft of software development, early in our careers, have had the experience of losing source code that we couldn't get back. Oh, we could type what we remembered into a file of the same name, but it has important differences that we will discover as we work with it. (There is no such thing as eidetic, or photographic, memory).

This has been recognised as a sufficiently universal phenomenon that most developers and managers I have known in the lsat 10-15 years (at least) use use of a versioning system as a filter against dangerous dilettantes; if someone claims to have developed "the next VisiCalc for Linux" without using one, the file containing that person's "CV" may be safely deleted (and do remember to Empty the Trash).

So why on $DEITY's green Earth do we still see well-meaning tutorials on blogs the Internet over which include a step titled "Setup source control (Optional but recommended)"? The particular tutorial that drew my ire this morning even went out of its way to note that you will need to use a "text editor of your choice" with the author noting that "I use Emacs". (Subject and direct object inverted in the original sentence, but that's another rant.)

Don't

that!!

The person reading your tutorial is probably going to be someone with strong continuing economic motivation to improve his software skills; he's reading your tutorial on the off chance that you'll actually teach him something useful. Seeing those three words ("optional but recommended") tells him, almost literally, that you don't really take him seriously. Your half-dozen other readers are going to be (would-be) apprentices in the craft, just learning their way around; they can and, based on experience, will take those three words as permission to just blaze on ahead and, when something goes pear-shaped, to start all over. Maybe you think "nobody I know would take it that way", or even "how could anybody who can read this take it that way", but The Voice of Experience™ is here to tell you that this Internet thingie is a global phenomenon. People from all walks of life, with every language, technical and educational level variation imaginable, can wake up one fine morning and say "I'm going to learn something new, using the Internet", and set out to do just that. That's a feature, not a bug.

I can hear you sputtering "but I didn't want to have to cover how to set up a VCS or use it in the tutorial's workflow; I just wanted to teach people how to use batman.js". Fair enough, and a noble goal; I don't mean to dissuade you (it's actually a pretty good tutorial, by the way). Might I suggest adding to the bullet list under "Guide Assumptions" a new bullet with text approximating

A version control system of your choice (I use Foo)

with an appropriate value of Foo. We don't care whether the Genteel Reader uses git, Bazaar, sccs, or Burt Wonderstone's Incredible Magical VCS; we can merely assume that if he and we are worth each other's time, he's using something; we only need that little extra bit of reinforcement. (Though some choices on that list might be cause for grave concern.)

One of the ways in which the craft of software development needs to change if it's ever going to become a professional engineering discipline is that we're going to have to hammer out and apply a shared set of professional ethics; the closest we've got now is various grab-bag lists of "best practices". I'd argue that "optional but recommended" source control is on the wrong side of one of the lines we need to draw.

Wednesday, 18 November 2009

You want to start a tutorial; well, you know...

Not as catchy as the Beatles' Revolution, even if the meter works.... oh well....

Continuing from the first post in this tutorial. What do I think is important when starting to demonstrate some code? As with most writing, it depends on the audience. For the purpose of this series of posts, I'm assuming that you fit comfortably in or near the following:

You're comfortable with HTML and XML doesn't make you run screaming from the room;
You have a basic understanding of databases; you've run across SQL before and understand the basic concepts;
You understand PHP; you've written some code before;
You understand the concepts of "object-oriented development", "patterns", "best practices" and ideally "test-driven development" (usually abbreviated as "TDD"), even though you may not have loads of experience (yet) with them; and crucially
You want to improve your ability to write code that you can refine and possibly reuse over time.

The assumption that you know or at least are interested in PHP is a given, since that's the language we'll be using here.

What will you need to have installed and available to follow along?

Access to a system with PHP 5.2 or higher, available both from the command line and the Web server (via a module or CGI);
The PHPUnit and MDB2_Driver_mysql modules installed and available;
A text editor of your choice;
The ability to create PHP scripts and HTML files and have those accessible from the Web server as well as the command line.

These should all be pretty obvious to more experienced PHP developers, but making sure that we're both operating from the same set of assumptions — and no others — greatly reduces the likelihood of confusion and breakage along the way. Many of you haven't yet dealt much with unit tests using PHPUnit or similar systems; that's going to be a starting point for us.

Tuesday, 11 August 2009

Tutorials, best practices and staying current

A gent by the name of Brian Carey has written a very nice little tutorial on "Creating an Atom feed in PHP", and gotten it published on the IBM DeveloperWorks site. In the space of about ten pages, Brian gives a stratospheric overview of what Atom is and why PHP is a good language for developing Atom-aware apps, and then gets into the tutorial - defining a MySQL database table to hold the data used to 'feed' the Atom feed, and writing code to get the data out and put it into the form that a reader such as NetNewsWire expects an Atom feed to be in.

Now, to be fair, Brian describes himself as "an information systems consultant who specializes in the architecture, design, and implementation of Java enterprise applications", and the paper is clearly meant as a whirlwind tutorial, not to be taken by the careful/experienced reader as necessarily production-quality code. And if this had been published in, say, 2002 or so, I'd have thought it a great how-to for banging out some PHP 4 code. But this is 2009 PHP 4 is a historical artifact, and blogs and industry journals of seemingly every stripe are decrying the poor quality (security, maintainability, etc.) of PHP code... much of which is still written as if it were the turn of the century, ignoring PHP 5's numerous new features and the best practices that both spawned them and grew from them.

So what's really wrong with doing things like they did in Ye Olden Tymes™?

Procedural code makes it harder to make changes (fix bugs, add features, change to reflect new business rules) and be certain that those changes don't introduce new defects. This is largely because...
While it is possible to test procedural code using automatable tools like PHPUnit, it's a lot harder and more complex than testing a clean, object-oriented design.
Hard-coding everything, interspersing 'magic values' throughout code, is a major hindrance to future reuse - or present debugging;
Using quick-and-dirty, old-style database APIs exposes you to the risk of input data that is more 'dirty' than it should be - opening the door to SQL injection and other nastiness;
Not staying current exposes your code to the risk that it's either using features that have since been deprecated or removed entirely, or (arguably worse), the risk that new features (such as standard library additions) make some of your existing code redundant at best.

Each of these, to varying degrees, is true of much of the PHP code I've read in the last couple of years, including that in the DeveloperWorks paper. For example, the DW paper's code makes use of the old-style PHP4 mysql_* API rather than the more abstract/portable MDB2 database abstraction layer. And there's also the rather terse implementation of a date3339() function for converting a timestamp to RFC 3339 output format, that's now nicely handled through the standard PHP DateTime class.

How would I have done things instead? Read the next few posts in this blog to find out. And, of course, comments are always welcome.

Archimedes' Lever

Monday, 24 February 2014

Let's Stop Pretending that Source Control is "Optional but Recommended"

Wednesday, 18 November 2009

You want to start a tutorial; well, you know...

Tuesday, 11 August 2009

Tutorials, best practices and staying current

About Me

Labels

Blog Archive