Moving from cowboy coding to agile development

Friday, February 10, 2006

On Test Coverage

On Wednesday I spent three hours at Reaktor again, participating in an Agile Finland's coding dojo session for the second time ever. It was fun, and this time I had a particularly good lesson about (lacking) test coverage.

First, a small disclaimer: I don't mean to reprove anyone involved in writing the code that we started with. I can't claim I would have noticed this bug during the previous session myself, if I had happened to be there. I read the code base through before the Wednesday's session and this didn't catch my eye.

Now, to the point. The challenge was to write a poker hand evaluator, and for this, we need a way to construct the hand in some way. This is the corresponding method as it was:

private Card[] createHand(String hand) {
String[] cardStrings = hand.split("-");
List cards = new ArrayList();
for (String cardString : cardStrings) {
cards.add(new Card(cardString.substring(0, 1),
cardString.substring(1, 2)));
return cards.toArray(new Card[cards.size()]);

This method takes a string representation of a five-card hand, e.g. "DA-D2-D3-D5-D4" (which would be a straight flush from the ace of diamonds to the five of diamonds). The method generates the corresponding Card objects with a suit and a rank. However, there's a small error. A card with a rank of 10 can't be generated, since the latter substring only takes one character to represent rank (and elsewhere in the program the rank was supposed to be "10" - this is also the only card in which the error occurs, since aces are represented as "A":s, and facecards are "J", "Q" and "K", respectively).

The fix was very simple:

cards.add(new Card(cardString.substring(0, 1),

This error had not caught anyone's eye - maybe because there was not a single test including a card with the rank of 10. To me, this raises a question about the right kind of test coverage to use with TDD. Should there have been test cases with every single card tested? Probably not. But this time taking something - even this simple - for granted resulted in a bug.

Personally, I think one of the points of TDD is not to think you have taken everything into account, but knowing you have done so. (Well, of course programs can never be completely bug-free, but you know what I mean...) To a non-expert coder like me, I guess it's exactly the small things like this that I can learn to use extensive unit tests with. Live and learn - sometimes the hard way.


Timo Rantalaiho said...

Now afterwards it is easy to say that this particular bug would have easily been caught with simple equivalence class analysis.

Since 10 is the only card that has a two-character rank, a hand containing it should have been tested.

This is also a good example how code coverage tools can tell you only so much: That particular piece of code might have had a 100 % coverage on Clover, but it had to be exercised with certain input to expose the bug.

Mika Viljanen said...

Yes, it should have been tested. The reason why this was a good lesson for me was that a test like that is probably never written if the coder(s) don't expect any problems in that part of the program. Here it had not occurred to anyone that the hand creation method would not work with a certain card, so no tests were written for that special case. (As one of the people said on Wednesday, it's funny when you can fix a bug by removing code - it certainly hints that the problem was never even considered, if it could have been avoided by just NOT writing something.) So, in a way, writing a test like that would mean you already know what kind of troubles to expect.

I think the only way to avoid a situation like this would be to write tests for everything that happens in the program, however trivial it may be. However, that would probably be overkill.

Dave Nicolette said...

I disagree the only way to avoid this kind of situation is to write tests for every possibility. Design tests carefully, and you need not try to cover every permutation.

This example looks like an edge case to me. The 10 is the only rank described by a two-character string. Therefore, it's the "longest string" case. That's a very basic test condition, generally speaking. It's possible that a case like that would be more obvious when writing the unit test first than when writing the code first and then trying to think of tests to cover areas where you "expect" problems. (Would you need any tests at all if you already knew where the problems would be?)

I see another possible red flag in all this. Some of the entities in the solution are implemented as strings instead of objects. It's an approach that can increase the potential for subtle errors. What if you defined typesafe classes for the cards more or less along the lines of this concept? And why not a Hand class instead of just Card[]? Isn't a hand more than just a stack of cards?

Timo's comment about Clover is well taken. Tools like that may be useful, but we have to avoid using them as a crutch. It's all too easy to slip into a false sense of security. Conversely, it's possible to have a useful test suite that doesn't achieve 100% coverage. What's the value in testing trivial getters and setters that are generated by tools like IntelliJ or Eclipse? Better to spend one's time looking for cases like the "10".

Dave Nicolette said...

Sorry, didn't think of this before: What does this give you


if you get an input value like "D387"? It's only a string, so there's no protection against that sort of thing.

Mika Viljanen said...

I guess I went a bit too far when I suggested that the only way to avoid the aforementioned situation would be to write tests for everything. The "edge case" point you make is well justified, and it should have sufficed.

When it comes down to the class structure, I should have provided links to the whole code bases. They are both available online: the former, the latter. So, there are both Card and Hand classes. The "D387" problem you asked about is tackled in the Card class constructor. (However, as I was writing this I noticed there is no validation whatsoever for the suit. Speaking about test coverage...)

The string representation is only used to generate hands as easily as possible and is not used after that. We also discussed about changing the rank list to an enumeration but ran out of time.

Dave Nicolette said...

I see what you mean, after having a look at the code. I shouldn't leap to judgments based on small examples.

It sounds as if the coding dojo events are a lot of fun and a valuable experience for everyone involved. If I were just a bit closer to Helsinki, I would definitely join you!