Wednesday 18 April 2012

'Least Knowledge' May Be More than You Think

I was chatting on IRC the other day and got into a discussion that stuck in the back of my mind until now. This guy was a fresh convert to the ideas of SOLID and to the Law of Demeter (or Principle of Least Knowledge). Now, these are principles I hold at least as strongly as the average experienced developer, and we eventually agreed to disagree over the propriety of code such as a line I just wrote (which in fact prompted this post).

Consider the Ruby code:

    data[:original_content] = article.content.lines.first.rstrip

This says:

  1. "take whatever your article is;
  2. get whatever its content attribute is or method returns;
  3. (assuming that is a String or something that can be treated as one,) get its lines enumeration;
  4. get the first entry in that enumeration;
  5. (again, assuming it's a String or workalike,) strip any trailing whitespace from that string; and
  6. stuff it into the collection data, indexed by the symbol :original_content."

It's not hard to argue that this is a bad code smell, being a long list of assumed return types and so on, except for one thing: in my view, the LoD does not cover sequencing a series of standard library calls.

Look at the code again. The bit that references my code is the fragment

  data[:original_content] = article.content

That fragment is clean; article is an object and content is an attribute of or method on that object. This code assumes that the value of article.content is a standard String. As a standard class, it's virtually guaranteed not to change in backwards-incompatible ways over the useful life of the code. What then? Standard library calls!

  1. String#lines returns a standard Enumerator instance;
  2. Enumerable#first (via the earlier Enumerator) returns a standard String instance again; and finally
  3. String#rstrip cleans off any trailing whitespace.

Nowhere are used any internal APIs that could change. Breaking that one line apart into 5 would produce repetitive, non-semantic, anti-Ruby-best-practices code of the style pandemic on large Java projects. We're in Ruby (and Rails) for a reason: part of that reason is that the language idioms encourage an almost literary expressiveness that makes more explicit the reality that a program is a conversation between the humans developing and maintaining it, that just happens to be executable by a computer. Literate, semantic programming is like behaviour-driven development; once you've wrapped your head around it and see how much better a programmer it makes you, you really don't want to go back to BDUF jousting in Java or PHP.

On further review...

Now, if you argue that the LoD is an aid in reducing coupling between classes, which reduces per-statement complexity, I can see that. But, in the example above, we're back to the fact that there is only one dot before you start jamming together standard library calls. Had I instead written:

content = article.content lines = content.lines first_line = lines.first data[:original_content] = first_line.rstrip

then I'd argue that I've introduced three unnecessary local variables when only one (the second) could foreseeably change. If I later modified the return value from article.content to something other than a Stringlike object, then probably the assignment to lines would have to change. But only that one line — the same level of change that would be required in my current code.

Comments?

No comments: