Sunday 1 September 2013

Lies, Damned Lies, and Truths Backed By Statistics

(with apologies to Mark Twain and Henry Du Pré Labouchère)

UPDATED 6 October 2014; see below. Original content follows:

This afternoon, I came across a fascinating post by Prem Sichanugrist on the Thoughtbot blog, courtesy of the ginormous informational firehose known as Twitter. In it, he discusses a new feature of Bundler 1.4.0, currently in prerelease, that lets you use multiple cores to install the Gems for your project.

Finally, you exult. One of the slowest (read: most network-bound) tools in the Ruby toolbox is getting at least some help. In his post, Prem recommends "setting the size" (the number of bundle subtasks to run concurrently) "to the number of CPU cores on your machine".

Don't do that! At least, not without doing some more exhaustive testing first. Not because the feature doesn't work (it does, splendidly) but because matching the --jobs option to the number of cores may well give you suboptimal performance.


Following is a summary of the results on my current project on my main development system, a Mid-2011 iMac with a 3.1 GHz Core i5 CPU (4 cores) and 16 GB of RAM under OS X 10.8.4. The command lines used for benchmarking were

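for BIJOBS in 1 2 3 4; do
  # record the start time of this run
  export T1=`date`
  # start from a clean slate so every Gem is fetched and installed again
  rm -rf Gemfile.lock vendor/cache/* vendor/ruby/*
  bundle install --jobs $BIJOBS --path vendor
  bundle package --all
  bundle install --binstubs
  # print the start time recorded above
  echo $T1
done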

Time to Install Gems with Different --jobs Settings

Value of --jobs   Elapsed Time   Time Savings from --jobs 1   Time Savings in % from --jobs 1
1                 10m 41s        0m 00s                       0%
2                 6m 21s         4m 20s                       40.56%
3                 4m 44s         5m 57s                       55.69%
4                 6m 31s         4m 10s                       39.00%
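
The percentage column is each run's time savings divided by the --jobs 1 baseline. A minimal sketch of that arithmetic, with elapsed times converted to seconds by hand; the helper name pct_saved is made up for illustration and is not part of Bundler:

```shell
# Hypothetical helper: percentage of time saved relative to a baseline run.
# $1 = baseline elapsed time in seconds, $2 = elapsed time of the run being compared.
pct_saved() {
  awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * (base - t) / base }'
}

# Baseline: --jobs 1 took 10m 41s = 641 seconds.
pct_saved 641 381   # --jobs 2 (6m 21s) prints 40.56
pct_saved 641 284   # --jobs 3 (4m 44s) prints 55.69
```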

As you can see, the trend was progressing nicely for --jobs values of 2 and 3, at ~40% and ~56% time savings relative to --jobs 1 respectively. When --jobs had the value of 4, however, the savings were smaller than with --jobs 2. Why?

My conjecture is that Bundler suffers a (relative) drop in efficiency if it has to swap out use of one of the CPU cores between bundling Gems and everything else. I suspect that many people will find the advice

Set the value of --jobs to one less than the number of cores in your CPU(s)

to be optimal.
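
As a rough sketch of that rule of thumb, the --jobs value can be derived from the core count instead of being hard-coded. The helper name jobs_for_cores is hypothetical, and the commented usage line assumes that `getconf _NPROCESSORS_ONLN` reports the number of online cores on your system (it does on OS X and most Linux systems, but check yours):

```shell
# Hypothetical helper: suggest a --jobs value of (cores - 1), never less than 1.
# $1 = number of CPU cores. Runs in a subshell so no variables leak out.
jobs_for_cores() (
  cores=$1
  if [ "$cores" -gt 1 ]; then
    echo $((cores - 1))
  else
    echo 1
  fi
)

# Example usage (assumes getconf reports the online core count):
# bundle install --jobs "$(jobs_for_cores "$(getconf _NPROCESSORS_ONLN)")" --path vendor
```

On the four-core iMac above this works out to --jobs 3, the fastest setting in the table. Treat it as a starting point for your own measurements rather than a guarantee.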


It's good to have our tools use the available hardware resources (CPU cores, RAM, etc) more efficiently. But we should always remember that the obvious "best" configuration is not always the actual best configuration.

Thanks again to Prem Sichanugrist of Thoughtbot, as well as to Kohei Suzuki who Prem credits with the original patch enabling multiple cores in Bundler. Thanks, guys!

UPDATE (6 October 2014): Matthew Rothenberg has published new benchmarks and a discussion of how things have (apparently) changed significantly with the current-as-I-write-this Bundler 1.7.3, and apparently as far back as 1.5.3. I don't have the resources to replicate his test environment (local propaganda is titled "From Third World to First"; we made it halfway) but I don't doubt his results. It will be interesting to see what insights may be drawn from repeating his benchmarks on slower hardware and a far slower Internet connection. I expect things to become I/O-bound far more quickly than in his results, but I'll be benchmarking and reporting again by mid-October.
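
For anyone repeating the measurements on their own hardware, here is a minimal sketch of a timing wrapper that reports elapsed wall-clock time directly instead of comparing printed timestamps. Both helper names are invented for illustration, and `date +%s` is assumed to be available (it is on OS X and Linux):

```shell
# Hypothetical helper: run a command and print its wall-clock time in whole seconds.
# The command's own output goes to stderr so stdout carries only the number.
elapsed_seconds() (
  start=$(date +%s)
  "$@" >&2
  end=$(date +%s)
  echo $((end - start))
)

# Hypothetical helper: format a number of seconds in the "Mm SSs" style used above.
format_elapsed() {
  printf '%dm %02ds\n' $(($1 / 60)) $(($1 % 60))
}

# Example usage:
# format_elapsed "$(elapsed_seconds bundle install --jobs 3 --path vendor)"
```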

Thanks again, Matthew!


Sikachu! said...

Whoa! This is awesome. Thank you for doing a benchmark for this. I'm going to update the post that the optimal number is N-1, where N is number of cores.

Matthew Rothenberg said...

I re-ran similar benchmarks with new versions of bundler, and found different results. Thoughts?