Design Pattern Evangelist Blog

Smart pointers about software design

Writing Tests Before the Implementation

I know it sounds completely backwards, but please give it some consideration


Fosbury Flop

Introduction

I’ve written about my conversion to unit testing, the attributes of effective unit tests, the basic elements of unit tests, and most recently test doubles.

But I haven’t described a process to create automated unit tests. The process mostly features writing the test before implementation. Please stay with me, gentle reader. I know many of you may hate this idea. I won’t force you to adopt it. Let me say my piece, and you can decide for yourself.

Why Are Developers So Reluctant to Write Tests First?

I can’t tell you how many times I heard stand-up status reports like this:

Yesterday, I implemented the feature. Today I’m going to write the unit tests.

Next day’s stand-up:

I’m still working on the unit tests. They’re harder to create than I thought they were going to be.

I’d suggest that their efforts would be easier if they wrote their tests first. They’d look at me like I had three heads.

Writing tests first didn’t make sense to me either. I’ve already described this in my conversion to unit testing.

Once I started writing tests before implementation, I delivered fewer bugs into production. I can only recall one bug that escaped into production in the last two years of my career. It occurred because I was overly confident about a minor refactoring, and I didn’t write a test to confirm it. I should have tested it, since the refactoring was flat out wrong. Mea culpa. Mea culpa. Mea maxima culpa.

I suspect two basic reasons developers avoid writing the test before the implementation: it feels wrong, and their previous experiences with automated testing weren’t fruitful.

I think there’s also a third one, more fundamental, which I’ll present in the next blog (TDD Is Introduced Too Late).

It Feels Wrong

Writing the test first doesn’t make sense. Most developers write their code and then test it. Maybe they create an automated test. Maybe they just run it manually. This is the way.

Dick Fosbury revolutionized high jumping with the Fosbury Flop. Before Fosbury, high jumpers jumped the high bar by thrusting their leading leg over the bar first, then curling their body around the bar, face down, with their trailing leg being the last part of their body clearing the bar.

Fosbury jumped headfirst on his back over the bar. His legs would be the final parts of his body to clear the bar. He didn’t invent the flop, but he popularized it once there was sufficient padding in the landing zone so that jumpers were less likely to injure themselves while landing on their backs.

There was skepticism in the track and field community at first, but Fosbury’s success, including a 1968 Olympic gold medal, was too much evidence to ignore.

Writing tests before the implementation is backwards, just like the Fosbury Flop is backwards. The flop wasn’t feasible until there was sufficient padding for the landing. Likewise, test-before-implementation wasn’t feasible until we had sufficient test frameworks.

Previous Experiences

Previous automated testing experiences weren’t fruitful. Here are some common experiences or impressions, which I may cover in more detail in a future blog (TBD):

Test First Processes

There are several processes for writing tests before the implementation. Most are variations on the same theme.

It’s not just that developers are reluctant to adopt test-first techniques. The test-before and test-after camps can almost get into holy wars over the practices.

I’m going to present several test first processes as best I understand them. I probably won’t change too many opinions for those staunchly in either camp. If you haven’t chosen a camp, approach these with an open mind. Some of the steps don’t make sense at first. I’ve provided commentary for additional context.

Test-Driven Development

Test-Driven Development (TDD) is the most common practice. It was introduced by Kent Beck in his 2002 book: Test-Driven Development: By Example. It focuses upon two basic rules:

  1. Never write a single line of code unless you have a failing automated test.
  2. Eliminate duplication.

The first rule ensures that the code works. The second rule ensures that the code is clean. These rules are often reduced to: Make it work; make it right.

The TDD process consists of several steps:

  1. Write a failing test case. Since there is no implementation for that scenario, it should fail when executed. In most test frameworks, the failing test will be color coded red.
  2. Make the failing test case pass. Write the smallest amount of implementation needed to get the test to pass when executed. This might be as simple as returning a hard-coded constant that the test expects. In most test frameworks, the passing test will be color coded green. This is the Make it work phase. Kent Beck refers to this as: Fake it until you make it. This made no sense to me when I first read it. Shouldn’t we implement a general solution? Not yet. It will emerge through the process. NOTE: If the new simple implementation causes an existing test to fail, then do not proceed until all tests are passing once more. If the issue is not obvious, then undo the new simple implementation and start again.
  3. Refactor. Clean up the code if needed. Refactor by changing the structure of the code without changing its behavior. This is the Make it right phase. This may not occur until you’ve faked it for several tests, and a more generic implementation starts to emerge. As Bob Martin puts it: “As the tests get more specific, the code gets more generic.” Run your tests after each refactoring. If the refactoring has broken any previously passing tests, they will fail, and you’ll need to address those failures before moving forward.
  4. Repeat Steps 1-2-3. Add new failing tests repeating steps 1-2-3 for each until you can no longer add another failing test. I.e., all known behaviors, including corner/edge cases, are defined via a test.

The first two steps are not about the implementation. They are about the test. Step #1 ensures that you don’t create a false positive test. If you write the test after the implementation, and it always passes, you’ll never know for sure whether it’s a legitimate passing test or a false positive test. Step #2 ensures that the failing test passes when it should pass.

Step #3 updates the implementation, making it more generic, while the tests provide the safety net. Be careful to only refactor existing behavior covered by tests. Don’t implement new behavior that doesn’t have a test case that specifies it.

Evolve the implementation through a sequence of edits until it looks as though you knew what you were doing all along. You know you are working on clean code when each routine you read turns out to be pretty much what you expected. (Ward Cunningham)

Don’t do Steps #2 and #3 together. Step #2 adds new behavior. Step #3 refactors the implementation for the existing behavior. If you try to add new behavior and refactor existing code and test cases fail, then you don’t know if the failure is due to the new behavior or the refactoring. Always run the tests between Steps #2 and #3.
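
Here is a minimal sketch of a refactoring step on its own, assuming a hypothetical shipping-cost function (the names and rates are made up for illustration). The structure changes and the behavior does not, so the same test cases must stay green before and after.

```python
# Before: behavior is correct, but the rates are buried in branches.
def shipping_cost_before(weight_kg):
    if weight_kg <= 1:
        return 5
    if weight_kg <= 5:
        return 9
    return 20


# After: same behavior, with the rates pulled out into data.
RATES = [(1, 5), (5, 9)]  # (maximum weight in kg, cost)
HEAVY_COST = 20


def shipping_cost_after(weight_kg):
    for max_weight, cost in RATES:
        if weight_kg <= max_weight:
            return cost
    return HEAVY_COST


# The safety net: cases that were written before the refactoring started.
CASES = [(0.5, 5), (1, 5), (1.1, 9), (5, 9), (5.1, 20)]


def check(impl):
    """Raise AssertionError if impl disagrees with any existing case."""
    for weight, expected in CASES:
        assert impl(weight) == expected, (weight, expected)


check(shipping_cost_before)  # green before the refactoring
check(shipping_cost_after)   # still green after it
print("refactoring preserved behavior for all", len(CASES), "cases")
```

No case was added or changed during the refactoring. A new rate tier would be new behavior, and that belongs to the next red test.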

A complete 1-2-3 cycle is often called the Red-Green-Refactor cycle. Red for the failing test. Green for the passing test. Refactor for refactoring. Step 4 repeats the Red-Green-Refactor cycle until you can no longer create a failing test.

Step #4 requires creative thinking and domain knowledge. We want to specify all scenarios based upon customer/domain/behavior needs. We don’t want tests based upon the implementation. I’ll address this in a future blog about behavior. See: Be On Your Best Behavior.

The Three Laws (Rules) of TDD

Bob Martin added more discipline and granularity to TDD with his Three Laws of TDD (3TDD):

  1. You may not write production code until you have written a failing unit test.
  2. You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
  3. You may not write more production code than is sufficient to pass the current failing test.

At this point you should refactor if there’s enough code to refactor meaningfully, and even then, there won’t be much to refactor. Consider refactoring the test too. Tests are code, and they deserve the same care as the implementation. Do not refactor both code and test without running the tests between the refactorings. We don’t want to run the risk of adding a fault to both the test and implementation such that they mask one another.

Each Red-Green-Refactor cycle should only take about 30 to 120 seconds with 3TDD. Multiple iterations may be needed until one unit test and its implementation are complete.

Repeat with a new test using the Red-Green-Refactor TDD process until you can no longer create a failing test.

Martin admits that these laws are counter-intuitive at best and possibly nuts. He recommends two open windows in your IDE. One for the test file and one for the implementation file. You’ll be bouncing back and forth between them during each Red-Green-Refactor cycle. It’s a little disorienting at first, but you’ll get used to it.

Refactoring

Returning to Make it work; make it right, briefly. Developers get their code to work. But they often don’t take the time to make it right. Code is like any creative writing endeavor. The first draft puts your initial ideas on paper. Refinement makes it comprehensible to others.

All too often developers deliver their first drafts. Their code works, but it may not tell the story well. This is how technical debt is added to our codebases, one commit at a time. I’ll blog (TBD) about technical debt in the future.

Refactoring is built into the TDD/3TDD process. Refactoring within the process allows us to:

The word “refactoring” should never appear in a schedule. Refactoring is not a story or a backlog item. Refactoring is not a scheduled task. Refactoring is immediate and continuous. It’s like washing your hands in the bathroom. You always do it. (Bob Martin)

Test && Commit || Revert (TCR)

Kent Beck posted a short blog a few years ago: Test && Commit || Revert (TCR). If you thought Martin’s Three Laws were a bit crazy, wait until you hear TCR:

Period. The code is always working. If anything fails, it vanishes, and you’re back to your previous passing green state. You’re always in a known state, but it’s quite Draconian when returning there. If a test fails, then poof, all non-committed work vanishes.
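
The rule in the title can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `tcr` function and the in-memory `FakeRepo` are hypothetical stand-ins so that the example runs without a real repository. In practice, the three callables would wrap your test runner and your version control commands.

```python
def tcr(run_tests, commit, revert):
    """One TCR step: commit if the tests pass, otherwise revert.

    Mirrors `test && commit || revert`, except that a failing commit
    does not trigger a revert here as it would in the shell form.
    """
    if run_tests():
        commit()
        return "committed"
    revert()
    return "reverted"


class FakeRepo:
    """Hypothetical in-memory stand-in for a working copy and its last commit."""

    def __init__(self, content):
        self.committed = content
        self.working = content

    def commit(self):
        self.committed = self.working

    def revert(self):
        self.working = self.committed


repo = FakeRepo("return 1")

# A change that keeps the tests green is committed.
repo.working = "return 1 + 1"
print(tcr(lambda: True, repo.commit, repo.revert), repo.committed)

# A change that turns the tests red vanishes from the working copy.
repo.working = "return oops"
print(tcr(lambda: False, repo.commit, repo.revert), repo.working)
```

The second call shows the Draconian part: the failing edit is gone, and the working copy is back at the last green state.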

Beck introduced TCR in this blog, and stated that he thought it would never work, but he decided to give it a try. He was surprised to find that he liked aspects of it. Beck felt TCR provided the following benefits:

I doubt that many follow TCR religiously. People will fudge here and there and debug rather than revert. However, if you maintain the spirit of TCR and 3TDD, your code will most likely always be in a green known state.

I feel TCR is mostly consistent with 3TDD. It adds version management to the 3TDD process. There is one aspect of TCR that I don’t like. There’s no ability to introduce the failing test, which I think is critical to the overall process. I don’t know how TCR prevents false positive tests. Jeff Grigg proposed a TCR modification in his blog entry: Test-Driven Development with “( Test AND Commit ) OR TestCodeOnly OR Revert”.

My Preference

I tend to follow Martin’s 3TDD process, but with the spirit of TCR. If one of my refactoring updates causes tests to fail, I’ll give myself about 5 minutes to find the source of the problem and address it. If I can’t, which usually happens when I’ve made too many changes without running the tests, I’ll revert my changes one at a time until my tests are passing, and I’ll start again.

Sometimes it’s painful to see updates I added an hour ago vanish. But once I start again, I run my test suite more frequently to stay true. I also find that I proceed much faster on the second attempt and without any further issues.

I’ve only reverted a significant amount of code a handful of times. I’ve never had to revert on the second attempt.

Code Coverage

Code coverage is not a target. It’s a measurement. It’s a tool for developers to identify code that is not being tested.

TDD and 3TDD yield high code coverage by default. Code isn’t written until there’s a failing test case that executes it. Dead code should not be introduced. If refactoring renders code dead, then it can be removed.

Humble Object

Code that interacts directly with external dependencies, often Adapters, can’t be tested easily since we want to eliminate external dependencies from unit tests. We can’t kick the indirection can down the road any further once we’ve reached the edge of the design, where the Adapters interact with the external dependencies directly. Therefore, this code might not be covered via automated tests.

One way to eke out a bit more coverage is via the Humble Object. It’s the only design pattern I know of whose intent is to make testing easier. It’s not in the Gang of Four Design Pattern catalog.

Use the Humble Object when code that should be easy to test is tightly coupled to code that is difficult to test. Separate the difficult code into its own class or method, as I showed with the Semaphore in Suril, the Semaphore and Me.

Isolating the difficult-to-test code should make testing easier for the remaining code. The isolated Humble code should be as small as possible, that is, humble. We can’t easily test it, but it should be humble enough to confirm its functionality via visual inspection. Integration and full system testing should provide the remaining confidence.
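
A minimal sketch of the idea, using the system clock as a hypothetical stand-in for a difficult dependency (this is not the Semaphore example, and all class names are my own). The humble `SystemClock` is one line of real work, and the logic in `Greeter` can be tested with a fixed clock.

```python
from datetime import datetime


class SystemClock:
    """Humble object: the only code that touches the real clock."""

    def current_hour(self):
        return datetime.now().hour


class Greeter:
    """Easy-to-test logic, with the clock passed in rather than called directly."""

    def __init__(self, clock):
        self._clock = clock

    def greeting(self):
        hour = self._clock.current_hour()
        if hour < 12:
            return "Good morning"
        if hour < 18:
            return "Good afternoon"
        return "Good evening"


class FixedClock:
    """Test double that always reports the same hour."""

    def __init__(self, hour):
        self._hour = hour

    def current_hour(self):
        return self._hour


# Unit tests pin the clock, so they are deterministic.
assert Greeter(FixedClock(9)).greeting() == "Good morning"
assert Greeter(FixedClock(12)).greeting() == "Good afternoon"
assert Greeter(FixedClock(18)).greeting() == "Good evening"

# Production wiring uses the humble object.
print(Greeter(SystemClock()).greeting())
```

Everything with a branch lives in `Greeter`. What remains in `SystemClock` is small enough to verify by reading it.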

Coding Katas

TDD, 3TDD and TCR practices feel awkward at first. Don’t practice them on a project that matters. Practice them via katas.

Kata (カタ) is Japanese for form. It’s a martial arts term for a choreographed pattern of movements to train your muscle memory. Practicing musical scales would be another analogy for those who study music performance.

The idea of kata in software development is the same as martial arts. Use practice and repetition to hone a skill.

Choose a relatively small problem that’s not too simple yet not too complex either. Spend half an hour implementing it repeatedly over several days. You don’t even have to save your work. Implement it with TDD. Implement it without TDD. Implement it with 3TDD, etc.

Summary

I’m going to wrap it up here, but there’s more to come. I’ll continue the TDD themes with a true story (Yuri, the Programming Assignment and Me), how to determine what to test and some of the benefits (TBD) of TDD and possibly more. Spoiler Alert: Testing isn’t necessarily the most important benefit of TDD.

I’ll also provide a blog (TBD) to resolve some of the conflicts between My Personal Process and TDD.

References

https://nick.kriesing.de/ScientificWork/TDDOverThePastDecade.pdf
