Attack of the Codosaurus!: September 2013

OOPS! I was asked to flesh out an earlier version of this post, to contribute to the blog of one of my clients (Celerity IT). In doing so, I messed up the existing post. Rather than retrieve it from the dim dark depths of Internet history, I give you here the new version. Meanwhile, they have further edited it and posted their version at http://www.celerity.com/blog/2013/09/14/test-driven-development/.

   There is some controversy among developers over whether TDD (Test Driven Development) is really worth all the extra time it seems to take.

   To answer this, first we must define what TDD is! Basically, it means developing a small piece of functionality by first writing a test for it, then code to make that test pass. For instance, in making a job board, a piece of functionality might be "get a list of jobs with given text in the title". So, you might write a few tests like "with an empty database, create a job with the title 'Java Developer', ask the Job class for all jobs with 'Java' in the title, and assert that it finds that job", and then the same but looking for 'Ruby' and asserting that it doesn't find the job.
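   Here is a rough sketch of what those two tests might look like in Ruby, using Minitest. Everything named here is hypothetical: the Job class, its create, with_title_containing, and clear methods, and the plain array standing in for the database are all invented for illustration, not taken from any real job board.

```ruby
require "minitest/autorun"

# Hypothetical in-memory stand-in for a database-backed Job model.
class Job
  attr_reader :title

  @jobs = [] # class-level store, playing the part of the database

  class << self
    def create(title:)
      job = new(title)
      @jobs << job
      job
    end

    # Returns every stored job whose title contains the given text.
    def with_title_containing(text)
      @jobs.select { |job| job.title.include?(text) }
    end

    def clear
      @jobs.clear
    end
  end

  def initialize(title)
    @title = title
  end
end

class JobSearchTest < Minitest::Test
  def setup
    Job.clear # start every test with an "empty database"
  end

  def test_finds_job_with_matching_text_in_title
    job = Job.create(title: "Java Developer")
    assert_includes Job.with_title_containing("Java"), job
  end

  def test_does_not_find_job_without_matching_text_in_title
    Job.create(title: "Java Developer")
    assert_empty Job.with_title_containing("Ruby")
  end
end
```

   In real TDD you would write the two test methods first, watch them fail, and only then fill in with_title_containing; the sketch shows the end state so that it runs as-is.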

   (This is not to be confused with BDD, or Behavior Driven Development. BDD is like TDD from a user's point of view, rather than a developer's. It usually uses much more English-like language, so as to let non-technical stakeholders be involved. This can help narrow the gap of understanding between them and the developers. Many people do BDD for the broad overview and then TDD for the nitty-gritty internal details.)

   Most developers, however, go a bit further. To most, TDD is a cycle of "red, green, refactor":

Red: write a test, to test whether the code (that you haven't written yet) does what you want... and verify that it fails. If it doesn't, then your test is meaningless! (Writing good tests is an art unto itself, which I won't go into in this post.)
Green: make the test pass... and keep the whole test suite passing! If your code broke anything else, you must now go fix the breakage, whether that means updating an outdated test, tweaking your new code, tweaking old code, etc. You can't call it "green" until the whole test suite passes!
Refactor: this is what makes it "go further". To refactor a piece of code means to improve the internal design, without altering the behavior. There have been many books written about this, so I won't go into detail; just know that, even above and beyond the benefits of the red and green parts, TDD practitioners feel a responsibility to clean it up. If the way you got the test to pass was a horrible little kluge (admit it, we've all done it!), make it right before you check it in.
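
To make the Refactor step a bit more concrete, here is a minimal sketch in Ruby, loosely based on the job-board example. Both method names are hypothetical, made up for illustration. The only point is that the cleaned-up version behaves exactly like the kluge, so whatever tests were green before the cleanup should still be green after it.

```ruby
# Hypothetical "first thing that worked" version: it gets the tests to pass,
# but it is clumsier than it needs to be.
def titles_containing_kluge(titles, text)
  result = []
  for i in 0...titles.length
    if titles[i].index(text) != nil
      result.push(titles[i])
    end
  end
  result
end

# Refactored version: same behavior, clearer intent.
def titles_containing(titles, text)
  titles.select { |title| title.include?(text) }
end

titles = ["Java Developer", "Ruby Developer", "JavaScript Engineer"]

# Refactoring must not change behavior, so both versions have to agree.
puts titles_containing_kluge(titles, "Java") == titles_containing(titles, "Java")
puts titles_containing_kluge(titles, "Perl") == titles_containing(titles, "Perl")
```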

   So... does this take extra time, and is it still worth it?

   One of the dirty little secrets of TDD is that, yes, it will slow you down... in the very short term. If you just want to get a feature implemented today, and don't care about tomorrow, you might be better off skipping testing, whether before or after coding.

   BUT....

   That would not be wise in the long run, or even the medium run. You have to think of it as an investment. (This pairs quite nicely with the notion of "technical debt".)

   TDD will help you get that feature to market even more quickly than skipping the tests, and with far better quality! The process of getting a feature not just implemented but also to market allows enough time for bugs to be noticed, and have to be fixed. And for other features to be added, that might interfere with this one. And bugs to be noticed in that other feature, whose fixes might interfere. And situations to crop up that you just didn't anticipate.

   THAT is where TDD will save your bacon! The test suite that you grow along the way will help get those features implemented, and bugs fixed, without breaking other features. It's almost like having guard rails. If something does break, then the test suite will help pinpoint it, saving you hours of exploratory manual testing. In the long term it will save the project hours of exploration, debugging, finger-pointing, and other such nonsense... and if you're really doing Test-DRIVEN Development, it will probably guide you to higher-quality code in the first place, saving months of disentangling and re-implementation.

   But how does TDD do that? First we have to define what we mean by "quality". The two main things TDD will help with are, from a general standpoint, "it does what it's supposed to do" (including not having bugs), and from a geekier standpoint, "it has better internal design".

   The first part is obvious. After all, that's what the tests prove. But what about "better internal design"? What does that even mean? There are many aspects of software design, but TDD guides you to think in terms of small, easily testable pieces. This leads to code that is more reliable, modular, reusable, and flexible overall, along with a host of other benefits. For this reason, some people are now claiming that TDD should stand for Test Driven Design rather than Development. Perhaps our more DoD-minded colleagues will call it Test Driven Design and Development, or TDDD, or T3D for short, in much the same way they keep coming up with more C's to precede an I. ;-)

   Of course, if you include the Refactor part of the cycle, that's another investment, one that usually pays off quite well in the long run. Paying attention to proper design early will make it much more likely that the code will stand the test of time, lasting much longer before needing to be totally chucked out and rewritten. We've all seen code so horrendous that we'd rather start over from scratch than modify it -- don't be "that coder".

TL;DR: TDD does make it take longer to implement a feature... but not to get it to market, and it yields much better code, saving even more time and expense later.

Recently I encountered some Ruby code that looked like:

ids = holder.things.collect { |thing| thing.id }

(I prefer to say map rather than collect, but they're really the same thing. Which one you use is largely a matter of taste, influenced by what languages you've used in the past, and your laziness in typing.)

There are two small successive improvements that can be made to this. First, when you have any code of the form:

bunch_of_things.collect { |thing| thing.some_method }

(and remember, in Ruby, retrieving a data member of an object is a method call!) you can shorten that to:

bunch_of_things.collect(&:some_method)

This uses the & shorthand for Ruby's to_proc method. Long story short, the : makes a Symbol, and the & calls to_proc on that, turning it into a Proc that gets passed as the block. collect then hands each item in turn to that Proc, which calls the named method on the item, making it behave just like a block explicitly calling the method on each item. (I won't go into the nitty-gritty details here of how that works; if you care, investigate Ruby's yield keyword.)
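
If you'd like to poke at this yourself, here is a small sketch that pulls the pieces apart. The variable names are just for illustration; Symbol#to_proc and collect are the only real players.

```ruby
words = %w[ruby rails]

# Symbol#to_proc builds a Proc that calls the named method on its argument.
upcaser = :upcase.to_proc
puts upcaser.call("ruby") # prints RUBY

# All three of these produce the same array.
with_block  = words.collect { |word| word.upcase }
with_proc   = words.collect(&upcaser)
with_symbol = words.collect(&:upcase)

puts with_block == with_proc && with_proc == with_symbol # prints true
```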

For example, if you have an array of numbers and you want to get their even-ness, you can do:

[1, 2, 3, 5, 8].map(&:even?) # => [false, true, false, false, true]

You can also use the &: trick with block-taking methods other than collect/map, such as inject/reduce:

[1, 2, 3, 4, 5].inject(&:+) # => 15

though of course inject will want a method that takes an argument. (Why this is so, is left as an argument for the reader.)

   Sometimes you can omit the &. As far as I can tell, that only works when the method itself accepts a Symbol naming the method to call, as inject/reduce does (inject(:+)); collect/map does not. At the cost of one more character, you may as well just always use it.
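
   A quick sketch of where dropping the & works and where it doesn't, using only core Ruby. inject/reduce is documented to accept a bare Symbol; map takes no arguments at all, so handing it one is an error.

```ruby
numbers = [1, 2, 3, 4, 5]

# inject/reduce accepts a bare Symbol naming the method to apply.
puts numbers.inject(:+)  # prints 15
puts numbers.inject(&:+) # prints 15

# collect/map takes no such argument, so a bare Symbol raises an error.
begin
  numbers.map(:even?)
rescue ArgumentError => e
  puts "map(:even?) raised ArgumentError: #{e.message}"
end
```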

  Back to our original code, though, there's another trick we can use to simplify this.

  ActiveRecord provides a method called pluck... and we were indeed using ActiveRecord. pluck sets the SQL SELECT statement to retrieve only the columns you want. The result is an array of the values, ready to be used by your program. (If you give it more than one column to pluck, the values are themselves arrays. However, in this case, as in the vast majority, we were only interested in one column.) Not only does this often make the results easier to deal with, it can also help with a large dataset by saving I/O between the database and your application, memory on both ends, etc.

   So, rather than jump through the hoops of retrieving the things associated with holder, and then looping through them to extract the id column, this could be written more simply as:

ids = holder.things.pluck(:id)

What are some of your favorite Ruby (or Rails) idioms for making common code more concise (short but still clear)?

Monday, September 23, 2013

Is TDD Worth It?

Thursday, September 12, 2013

Pluck Your Colon, or, Concisifying Your Ruby on Rails Code