Archimedes' Lever: refactoring

Showing posts with label refactoring. Show all posts

Wednesday, 19 February 2014

Going Forward; Avoiding the Cliff While Looking for Cliff's Notes

Over on the Ruby sub-Reddit, I got into a discussion (mostly with Jelly_Jim, the OP, and with realnti) about the conflation in far too many minds of Ruby, the programming language, with Ruby on Rails, the famously-"opinionated" Web application framework. While pontificating on Jelly_Jim's original question, I described my frustration-bordering-on-antipathy with the traditional (known as "The Rails Way") application structure; listed a few presentations, books and blog posts that have influenced me greatly, and described my own take on the Kübler-Ross model for Ruby-based (and at least initially, largely Rails-based) Web development.

That discussion, over the course of roughly a day, helped me figure out how our app is going to survive the expected Slashdot effect, transitioning from a pretty traditional, megalithic Rails app to an app that uses Rails components in a sharply different, more recognisably "object-oriented", service-based architecture.

Leaving aside my more prosaic experiential difficulties with Rails (I really loathe ActiveRecord as commonly used), and taking into account the "Uncle" Bob Martin keynote, the various blog posts I referenced in the original Reddit post, and a couple of books I mentioned in a later reply, I think I've possibly hit upon a way to get where I want us to go.

That's the easy part, of course; it only took me never-mind-how-many months. The "interesting" part is going to be getting there — understaffed, on a shoestring budget for the moment even by startup standards (another problem I'm working), and working a schedule for two and a half years that would land a healthy twenty-year-old in hospital by now. Comments/suggestions are, as always, greatly appreciated.

Basic Architectural Principles

Loosely-coupled, granular architecture FTW. That means, inter alia,
- Nearly the entire app to be packaged as Gems used by an otherwise minimal application;
- Rails mountable engines to package assets (particularly CoffeeScript/JavaScript) and major components;
- Plan and design for updating and provisioning components which communicate across natural architectural seams.
Hexagonal, "ports-and-adapters", or "clean" conceptual architecture; all dependencies point inward. Your domain entities, at the centre of the system, will change less than details like the delivery mechanism (Web UI, phone app, etc) or the database (which the domain shouldn't even be able to prove exists as such).
By adopting a heavily job-story-oriented view of the workflows; with the use of online tools like Trello, WebSequenceDiagrams.com; and with supporting code tools like ActiveInteraction and so on, we should be a lot more productive and a lot more Agile than we have been during our forty (calendar time: two) years lost in the wilderness.
And oh, yeah, just to keep things interesting: we've got to keep regularly demonstrating forward user-visible progress, for all the obvious reasons and a couple I really can't go into here.

How We Get There, v. 28.0

First, since the major focus of late has been on some unique (and not-so-unique) CoffeeScript code, start there. Separate the existing Script code out into four engine-based Gems, following/adapting Derek Prior's example. These levels are, from conceptual bottom to top:

A "gateway" CoffeeScript class we've written as a poor man's RequireJS, that gives us just enough pseudo-dynamic referencing of data/code without changing anything the Rails asset pipeline assumes;
Our directly DOM-aware Script code, as one or possibly two Gems (one is a major crown jewel; we might decide to split off utility/support code into a separate Gem, so as not to touch one when maintaining the other);
The code proxying our internally-used message queue (decoupling "FTW", remember?) and the Ajax code to communicate with the Rails back end; and
The code implementing various domain/service objects sitting atop the Script stack; one Gem per service.

Next, adapt the relatively-straightforward (as yet) Ruby/Rails code to honour the architectural principles listed earlier. This is where tools like ActiveInteraction pull their weight (and solve several existing pain points in the process).

Just to make sure we really do understand where we're going with this, take our first user story-specific code (including interactor) and package it up as a Gem that should then Just Work as in the previous step. Start thinking about whether we really want traditionally-packaged Gems or unbuilt dependencies that remain part of the single app repository.

Proceed to reimplement and repackage the (again, relatively few) job stories already existing as was done in the preceding step.

Start knocking down the remaining Trello cards as we complete our app.

We've been saying since the Very Beginning that anything that we were building that wasn't inextricably tied to proprietary, commercially-valuable code (or that gives too much of a hint into any unique details it may have) "should be" openly released, but hadn't yet figured out a feasible, cleanly reusable way to do that. If we've done our jobs properly up to this point, we have that way now.

Push the "deploy and launch" Big Green Button. We've bloody well earned it by now.

Any Suggestions or Comments?

Thursday, 2 September 2010

Patterns and Anti-Patterns: Like Matter and Anti-Matter

Well, that's a few hours I'd like to have over again.

As both my regular readers know, I've long been a proponent of agile software development, particularly with respect to my current focus on Web development using PHP.

One tool that I, and frankly any PHP developer worth their salt, use is PHPUnit for unit testing, a central practice in what's called test-driven development or TDD. Essentially, TDD is just a bit of discipline that requires you to determine how you can prove that each new bit of code you write works properly — before writing that new code. By running the test before you write the new code, you can prove that the test fails... so that, when you write only the new code you intend, a passing test indicates that you've (quite likely) got it right.

At one point, I had a class I was writing (called Table) and its associated test class (TableTest). Once I got started, I could see that I would be writing a rather large series of tests in TableTest. If they remained joined in a single class, they would quickly grow quite long and repetitive, as several tests would verify small but crucial variations on common themes. So, I decided to do the "proper" thing and decompose the test class into smaller, more focused pieces, and have a common "parent" class manage all the things that were shared or common between them. Again, as anyone who's developed software knows, this has been a standard practice for several decades; it's ordinarily the matter of a few minutes' thought about how to go about it, and then a relatively uneventful series of test/code/revise iterations to make it happen.

What happened this afternoon was not "ordinary." I made an initial rewrite of the existing test class, making a base (or "parent") class which did most of the housekeeping detail and left the new subclass (or "child" class) with just the tests that had already been written and proven to work. (That's the key point here; I knew the tests passed when I'd been using a single test class, and no changes whatever were made to the code being tested. It couldn't have found new ways to fail.)

Every single test produced an error. "OK," I thought, "let's make the simplest possible two-class test code and poke around with that." Ten minutes later, a simplified parent class and a child class with a single test were producing the same error.

The simplified parent class can be seen on this page, and the simplified child class here. Anybody who knows PHP will likely look at the code and ask, "what's so hard about that?" The answer is, nothing — as far as the code itself goes.

What's happening, as the updated comments on pastebin make clear, is that there is a name collision between the ''data'' item declared as part of my TableTest class and an item of the same name declared as part of the parent of that class, PHPUnit's PHPUnit_Framework_TestCase.

In many programming languages, conflicts like this are detected and at least warned about by the interpreter or compiler (the program responsible for turning your source code into something the computer can understand). PHP doesn't do this, at least not as of the current version. There are occasions when being able to "clobber" existing data is a desirable thing; the PHPUnit manual even documents instances where that behaviour is necessary to test certain types of code. (I'd seen that in the manual before; but the significance didn't immediately strike me today.)

This has inspired me to write up a standard issue-resolution procedure to add to my own personal Wiki documenting such things. It will probably make it into the book I'm writing, too. Basically, whenever I run into a problem like this with PHPUnit or any other similar interpreted-PHP tool, I'll write tests which do nothing more than define, write to and read from any data items that I define in the code that has problems. Had I done that in the beginning today, I would have saved myself quite a lot of time.

Namely, the three hours it did take me to solve the problem, and the hour I've spent here venting about it.

Thanks for your patience. I'll have another, more intelligent, post along shortly. (That "more intelligent" part shouldn't be too difficult now, should it?)

Sunday, 17 January 2010

Fixing A Tool That's Supposed To Help Me Fix My Code; or, Shaving a Small Yak

Well, that's a couple of hours I'd really like to have back.

One of the oldest, simplest, most generally effective debugging tools is some form of logging. Writing output to a file, to a database, or to the console gives the developer a window into the workings of code, as it's being executed.

The long-time "gold standard" for such tools has been the Apache Foundation's log4j package, which supports Java. As with many tools (e.g., PHPUnit), a good Java tool was ported to, or reimplemented in, PHP. log4php is uses the same configuration files and (as much as possible) the same APIs as log4j. However, as PHP and Java are (significantly) different languages, liberties had to be taken in various places. Add to this the fact that PHP has been undergoing significant change in the last couple of years (moving from version 5.2 to the significantly different 5.3 as we wait for the overdue, even more different, 6.0), and a famous warning comes to mind when attempting to use the tool.

Here be dragons.

The devil, the saying goes, is in the details, and several details of log4php are broken in various ways. Of course, not all the breakage gives immediately useful information on how to repair it.

Take, as an example, the helper class method LoggerPatternConverter::spacePad(), reproduced here:

    /**
     * Fast space padding method.
     *
     * @param string    $sbuf      string buffer
     * @param integer   $length    pad length
     *
     * @todo reimplement using PHP string functions
     */
    public function spacePad($sbuf, $length) {
        while($length >= 32) {
          $sbuf .= $GLOBALS['log4php.LoggerPatternConverter.spaces'][5];
          $length -= 32;
        }
        
        for($i = 4; $i >= 0; $i--) {    
            if(($length & (1<<$i)) != 0) {
                $sbuf .= $GLOBALS['log4php.LoggerPatternConverter.spaces'][$i];
            }
        }

        // $sbuf = str_pad($sbuf, $length);
    }

Several serious issues are obvious here, the most egregious of which is acknowledged in the @todo note: "reimplement using PHP string functions." The $GLOBALS item being referenced is initialized at the top of the source file:

$GLOBALS['log4php.LoggerPatternConverter.spaces'] = array(
    " ", // 1 space
    "  ", // 2 spaces
    "    ", // 4 spaces
    "        ", // 8 spaces
    "                ", // 16 spaces
    "                                " ); // 32 spaces

If you feel yourself wanting to vomit upon and/or punch out some spectacularly clueless coder, you have my sympathies.

The crux of the problem is that the function contains seriously invalid code, at least as of PHP 5.3. Worse, the error messages that are given when the bug is exercised are extremely opaque, and a Google search produces absolutely no useful information.

The key, as usual, is in the stack trace emitted after PHP hits a fatal error.

To make a long story short(er), the global can be completely eliminated (it's no longer legal anyway), and the code can be refactored so:

    public function spacePad(&$sbuf, $length) {
        $sbuf .= str_repeat( " ", $length );
    }

Of course, this makes the entire method redundant; the built-in str_repeat() function should be called wherever the method is called.

I'll make that change and submit a patch upstream tomorrow... errr, later today; it's coming up on 1 AM here now.

But at least now I can use log4php to help me trace through the code I was trying to fix when this whole thing blew up in my face.

At least it's open source under a free license.

Thursday, 14 May 2009

Jaw-Droppers - Blast from the Past

Just when you thought it was safe to forget that the 1970s ever existed... this gem shows up on the XML Daily Newslink, a mailing list I follow intermittently. (Actually, this was included in the XMLDN from Wed 11 Feb — an indication of how "closely" I've been following lately.)

Developing a CICS-Based Web Service
G. Subrahmanyam, G. Mokhasi, S. Kusumanchi; SOA World Magazine
Web services have opened opportunities to integrate the applications at an enterprise level irrespective of the technology they have been implemented in. IBM's CICS transaction server for z/OS v3.1 can support web services. It can help expose existing applications as web services or develop new functionality to invoke web services. One of the commonly used protocols for CICS web services is SOAP for CICS. It enables the communication of applications through XML. It supports as a service provider and service consumer independent of platform and language. SOAP for CICS enables CICS applications to be integrated with the enterprise via web services as part of lowering the cost of integration and retaining the value of the legacy application. SOAP for CICS also comes along with the implementation encoder and decoder.

"SOAP for CICS"? Give it a REST, guys. On the other hand.... preserving this by-now more-solid-than-most-rocks 1969-era technology IS a signal achievement; an example of engineering stability on par with Soyuz.

On the other hand... support for legacy technologies like this looks set to become increasingly expensive and risky over time, since apparently:

Numerous surveys published in the last few years indicate that the sizable majority of "old mainframe" tech folk will have retired by 2010 (do the math on the years), and
partly as a result of (1), users risk become increasingly dependent on outsourced Indian providers — hardly conducive to effective project control.

In my own professional view, technological preservation activities like this are mainly useful for one thing: they can serve as the pattern against which a re-implementation of the business process using more currently-supportable tech can be verified. And once an organization has done this for its most valuable legacy systems (which wouldn't have been preserved this long if they weren't so valuable), the actual and perceived risk of migrating to new technologies as conditions warrant drops dramatically. After all, if you (or your internal colleagues) have successfully brought your line-of-business systems from CICS via REST to, say, PHP or Java, you're a lot more comfortable with the idea of migrating to whatever the mid-to-trailing technology is in ten years' time — while the support costs of the (then) existing system are still manageable.

Are any of you actually involved in any technological archaeology like this? When I was younger, I used to brag that half the systems I'd worked with were older than I was, but that stopped being true about 1995. (AFAIK, I was one of the last guys to professionally touch a working IBM 360 mainframe, in about 1989.)

(Original article at this entry's title link, which is also here.)

Tuesday, 18 September 2007

Improvement as opposed to Change

A link on the Agile CMMi blog had some very interesting things to say about a gentleman named Brian Lyons, the CEO (as well as CTO and founder) of Number Six, a Vienna, Virginia (Washington, DC area) software technology services/business consulting firm. Number Six quite obviously 'get' the concepts behind Agile CMMi, Hillel Glazer of the blog wrote. Interested, I flipped over to their site to hopefully learn a bit more, maybe drop off a CV.

Mr. Glazer wrote the blog entry in question on 25 July. The first thing you notice when viewing the Number Six site is the home-page obituary of and tribute to Brian Lyons, who died on (Monday) 3 September 2007 in a motorcycle accident. There are various links to family and tribute sites, a press release, and a request that "in lieu of flowers, the family has requested that donations be made to" a scholarship fund at the University of Maryland (excellent, particularly for those too far or too late to attend the funeral). I wish the family and company all the best in their times of grief and trouble.

The point I originally started writing this entry to make, however, was inspired by one of the bullets on Number Six' careers page, under the heading "Why Six?": "We are committed to consistent improvement, not continual drastic change."

Think about that distinction for a moment. Many of us, particularly those in the software-geek persuasion, start our careers trying to remake everything; not necessarily because it's broken, but so that we can do it (whatever it is), make it ours, introduce new techniques or technologies, and hopefully (but improbably) provide a better solution to the problem at hand than what was there before.

The forces arrayed against that impulse are formidable. Playing the role of (and often in actual fact) those who are by nature suspicious of any radical approach to a problem, too often dismissing out of hand any alleged improvement or innovation without "giving it a fair chance", in the eyes of the wet-behind-the-ears Young Turk. Occasionally, of course, improvements and innovations too beneficial to ignore do come out of this process.

As the developer matures (which may take days, decades or eternities), the Young Turk recognizes that not all of the hindrances were traditionalist per se; rather, they were operating from a different (usually business-oriented) set of priorities. Learning to understand and appreciate those priorities is one of the fundamental aspects of any successful transformation from ivory-tower Geek to journeyman software craftsman (or -woman), able to make a contribution to customers and to the craft of software development.

What becomes obvious, to all thoughtful participants and stakeholders in the process of developing and maintaining any software system is that a natural tension exists between three apparently conflicting ideas:

Don't change what works well enough when there are more pressing needs to attend to;
New capabilities, chosen and implemented properly, can have strongly beneficial effects (efficiency, productivity, wider customer scope, 'better' according to various aspects of quality);
Any change to an existing system involves direct and indirect costs which must be carefully evaluated and compared against the expected improvements.

Balancing and managing the conflict between those concepts throughout the lifetime of a piece of software is an art that has yet to be thoroughly and reliably mastered with a level of reliability and cost widely acceptable to business customers. Several philosophies and techniques (often marketed as technologies) have been developed and marketed as (attempted) solutions to that problem. Three that I have found extremely useful, in complementary ways, over the last few decades are agile development, refactoring and the CMMI. (Those already familiar with the concepts may skip down to the paragraph which begins "As argued by practitioners such as Hillel...".)

The CMMI (previously more widely and less precisely known as the CMM, for Capability Maturity Model) has been around for quite some time as a formal means of evaluating the maturity and coherence of an organization's development efforts (among other activities). It requires its users (organizations) to acquire and document ever-increasing detail of precisely how they go about the process of development and how they are required to be able to prove (in an internal or external audit) that those processes are being followed and any discrepancies noted and accounted for. This suffers as a standalone policy framework for software development on two grounds:

the "what" and "why" of development are completely ignored. CMMI fundamentally doesn't care if you're building a pile of junk that has serious practical problems, as long as you do it using the process you've defined for yourself.
The CMMI, along with US DOD Standard 2167A, have been blamed for the logging of more trees (to make paper) than virtually any other industry business practice. This is largely because both are seen (in CMMI's case, somewhat unfairly) as pushing "hidden mandates" for the rigid , never-far-from-obsolete "waterfall" model of development.

Agile development, on the other hand, is a survival response to both the (largely management-driven) mantra that "Real Artists Ship", and on the other hand, to seemingly interminable periods between "milestones" (particularly in the waterfall model) where nothing of intrinsic value is visible to an outside observer (e.g., management or customers). Its main criticism from opponents is that it produces too little documentation (the product is the documenatation, to a large degree).

An aspect of, or adjunct to, agile development is refactoring, pioneered and popularized in the Martin Fowler-written book. Refactoring is all about how you can make (sometimes radical) changes to the internals of a software system, provided you keep the external (customer-facing, revenue-generating) interfaces constant. Wrapping one's mind around this as a formalized process, as opposed to the "don't-break-what-works" mentality common to experienced developers (and managers), is a threshold experience in a software craftsman's progress.

As argued by practitioners such as Hillel, the combination of the process-centric CMMI and the solution-centric, visible-progress Agile set of processes (including, particularly, refactoring) gives the "best of both worlds" for development — a focus on artifacts and products created through a survivable process, matched with a demonstrated and documented knowledge of exactly what is meant by "that process" under current circumstances. This also implies demonstrating understanding of the specifics of "current circumstances" that drive the decision-making and artifact-development processes — which brings us back full circle to understanding why Agile is a good fit to begin with.

Unlike the Rational Unified Process (RUP), which grew out of a waterfall-driven management-oriented mentality (and the opportunity to sell software tools and consulting services to any shop using the heavily-marketed process), Agile and CMMI development can be successfully started after passing around a few books among the development and management staff. Tools and training are, chosen thoughtfully, extremely helpful, and the appraisal process within CMMI (formally SCAMPI, the well-known "Level 1" to "Level 5" assessment of process maturity) does eventually require an outside assessor, or registrar, to formally certify the organization. But to take a typical small development group from the customary initial chaos to a finely-tuned, market-leading, customer-satisfying machine, it has been my experience and observation that it is far easier (and more cost-effective) to implement CMMI in an Agile fashion than to go down the (vendor-specific) RUP route. Imposing either on a large organization would require an unlikely combination of managerial brilliance, sadism and masochism; revolutions boiling up from below and winning eventual management sanction have proven much more likely to be successful in that type of environment.

So what's a Young Turk (or a more seasoned craftsman) to make of all this blather? The simple point: Consistent improvement is practical, and without life- or career-threatening implications, achievable on a repeatable basis. Continual drastic change, in contrast and almost by definition, will lead the development group to many sleepless nights trying to get that last show-stopper bug fixed, management to wonder why those bozos in development can't be trusted to ship anything on time, and customers to wonder what they're really getting for their money beyond the hype and bling of the marketing materials. Which team would you rather be on?

Archimedes' Lever