Dave's Reflections » Blog Archive » Unit Testing As a Standard Is Nondescript

Unit Testing As a Standard Is Nondescript

Often when I am discussing programming practices with developers they are quick to mention their use of unit testing. It is a badge of honor that they wear and rightly so. “I test my code. I care about the quality of my work!” Of course unit testing means that the test is exercising a small “unit” of code, typically a method or function. Does the term tell us anything else? Are all unit tests equivalent?

The tools and techniques developers use for unit tests differ, such as using frameworks versus more homegrown approaches, but many developers stress the importance of unit testing. However, saying that one subscribes to the use of unit tests is like saying that someone uses a motorized vehicle. There is a lot of detail missing.

If you explore someone’s motorized vehicle it might turn out to be a motorcycle, car, train, boat, etc. What is meant by motorized vehicle can vary widely, making the term ineffective when trying to determine the vehicle’s ability to carry or tow something. In the same fashion, if you dive more deeply into each person’s definition of unit tests, it is clear that one developer’s unit testing is not the same as another’s.

Here I’ll explore some details surrounding unit testing. I’ll start off with my two invariants: predefined results and automation.

A test is less effective if its result is not defined before the test is run. Each test must be defined with its input and expected result documented. This means that no interpretation on the part of the tester can be involved. The reason for this is simple: people can convince themselves that an answer is correct when they see it. If a tester enters a test input without a definition of the correct answer, he or she may be willing to accept the result as correct. It is human nature to accept the information we see presented.

A logical corollary is that unit tests should be automated. In order to guarantee that the tester is not interpreting any results, let the computer do the check. The result is a pass or fail. No murky gray area where the tester tries to understand whether it is correct. The automation also leads to the creation of a regression test suite that will grow with the code. Finally, if the tests are automated, other tools can help us assess the completeness of the tests.
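As a minimal sketch of these two invariants, here is what a predefined, automated test might look like using Python's standard unittest module. The function celsius_to_fahrenheit and the chosen cases are hypothetical, picked only for illustration. Each case records the input together with the result decided ahead of time, and the computer does the comparison.

```python
import unittest


def celsius_to_fahrenheit(temp_celsius):
    """Hypothetical function under test (not from any real library)."""
    return temp_celsius * 9.0 / 5.0 + 32.0


class CelsiusToFahrenheitTest(unittest.TestCase):
    # Each case pairs an input with a result that was decided before
    # the test was ever run, so nobody has to judge the output by eye.
    CASES = [
        (0.0, 32.0),
        (100.0, 212.0),
        (-40.0, -40.0),
    ]

    def test_predefined_results(self):
        for temp_celsius, expected in self.CASES:
            with self.subTest(temp_celsius=temp_celsius):
                actual = celsius_to_fahrenheit(temp_celsius)
                # The check is pass or fail; there is no gray area.
                self.assertAlmostEqual(actual, expected)
```

Saved in a test module, a class like this can be run with python -m unittest, and it becomes part of the regression suite from then on.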

Beyond these two statements there are decisions that developers need to make about the types of unit tests that will be created. These decisions impact the complexity of the tests that must be written as well as the number of tests required.

A basic place to start is with black and white box techniques. Black box simply means we cannot see into the implementation (the code) and we write our tests based on what the code is supposed to do. White box testing allows us to leverage our knowledge of the implementation to craft tests that execute specific code and force specific branches to be taken.

In the case of test-first development, where the tests are written before the code exists, we are constrained to the world of black box testing. Once the code is written we can create additional white box tests based on our understanding of the code.

Both black and white box testing have value and may peacefully coexist. The black box tests concentrate on the contract presented by the method and can leverage the design and even requirements as their basis. They are good at finding deviations from the specification. White box techniques allow us to ensure that all the code, even the less-used paths, is tested. They allow us to verify code reachability and completeness.

Moving deeper into the white box testing arena, we need to understand the concepts of paths, branches and domains. When we create a white box test we know the path that we expect to be covered in the code, that is, the set of statements which will be executed. For complete coverage at this level (what coverage tools usually report as statement coverage), we need to ensure that each statement executes during at least one test. Falling short of that, we have no evidence that the statement works as intended or is even reachable.

When decisions are made in the code, we are creating a branch. A simple branch requires at least two separate unit tests, one where the branch result is true and one where it is false. The complexity of branch testing goes up quickly as conditions become more involved. Concepts such as short-circuiting, where the language skips the evaluation of a condition once it knows the Boolean outcome of a branch, also add to the need for effective branch testing.
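A simple branch and its two tests might look like the following sketch. The function clamp_depth is hypothetical. Notice that the first test alone already executes every statement in the function, so it is only the second test that tells us what happens when the decision is false.

```python
def clamp_depth(depth):
    """Hypothetical function: negative depths are treated as zero."""
    if depth < 0.0:      # the branch under test
        depth = 0.0
    return depth


def test_branch_true():
    # Decision is true: the assignment inside the branch runs.
    assert clamp_depth(-1.0) == 0.0


def test_branch_false():
    # Decision is false: the assignment is skipped and the input passes through.
    assert clamp_depth(1.5) == 1.5
```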

Achieving full branch coverage, as I use the term here, requires tests where each individual Boolean test is run with a true result and a false result, combined with every combination of any additional Boolean tests controlling the same branch. (This is stricter than covering both outcomes of the decision as a whole, and is sometimes called multiple-condition coverage.) For instance, if I want to test a branch whose decision logic is “TempCelcius <= 0.0 Or TempFaren <= 32.0” (which I can think of as “A Or B”), I need to create a minimum of 4 tests to achieve complete branch coverage (A is true, B is false; A is true, B is true; A is false, B is false; A is false, B is true).
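The four tests for “A Or B” can be written as a small table, sketched here in Python. The function name ice_possible is hypothetical, and I am assuming the two readings arrive as independent inputs (say, from two separate sensors) so that all four combinations can actually occur.

```python
def ice_possible(temp_celcius, temp_faren):
    """Hypothetical decision: A is temp_celcius <= 0.0, B is temp_faren <= 32.0."""
    return temp_celcius <= 0.0 or temp_faren <= 32.0


# (temp_celcius, temp_faren, expected result), one row per combination of A and B.
CASES = [
    (-1.0, 50.0, True),    # A is true,  B is false
    (-1.0, 20.0, True),    # A is true,  B is true
    (5.0, 50.0, False),    # A is false, B is false
    (5.0, 20.0, True),     # A is false, B is true
]


def test_all_condition_combinations():
    for temp_celcius, temp_faren, expected in CASES:
        assert ice_possible(temp_celcius, temp_faren) is expected
```

Keeping the combinations in a table makes it easy to see at a glance whether any row is missing when a third condition is added later.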

Now, one could argue that I may only need three tests since for a short-circuiting environment the second Boolean test won’t be checked if the first is true. Although this is correct in specific cases, it is best for development teams to create standards that don’t rely on careful examination of each situation.
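The short-circuit behavior behind that argument is easy to observe. In this sketch (the helper evaluated_conditions is hypothetical) each condition records its name when it is evaluated, so we can see that B is never checked once A is true.

```python
def evaluated_conditions(a_value, b_value):
    """Evaluate 'A or B' and report which conditions were actually checked."""
    seen = []

    def check(name, value):
        seen.append(name)     # record that this condition was evaluated
        return value

    result = check("A", a_value) or check("B", b_value)
    return result, seen


# When A is true, Python's 'or' never evaluates B.
assert evaluated_conditions(True, False) == (True, ["A"])
assert evaluated_conditions(True, True) == (True, ["A"])

# When A is false, B decides the outcome and is always evaluated.
assert evaluated_conditions(False, True) == (True, ["A", "B"])
assert evaluated_conditions(False, False) == (False, ["A", "B"])
```

The two tests with A true are indistinguishable to the running code, which is where the "only three tests" argument comes from. That stops being true if the expression is later reordered or rewritten without short-circuiting, which is one reason to keep all four.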

The third area I mentioned above is testing the domain. Domain testing goes beyond evaluating whether each branch’s conditions are being tested for all true and false combinations. Instead we want to assure that the domains created by those branches work correctly.

For instance, given the test “TempCelcius < 0.0 And Depth > 1.0 And Depth < 2.0” there are 6 domains created, two for each of the three comparisons (e.g. one domain is all temperatures less than 0.0 and another is all temperatures greater than or equal to 0.0). When creating domain tests, we use values at and near (epsilon) the decision boundary (for instance at a temperature of 0.0 and 0.01) as well as far from the boundary (for instance a temperature of 50 or -50).
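Here is a sketch of what domain tests for that condition could look like in Python. The function name in_target_zone is hypothetical, and I am using 0.01 as the epsilon, as in the example values above. One input is held safely inside its domain while the other is moved to, near, and far from each boundary.

```python
def in_target_zone(temp_celcius, depth):
    """Hypothetical decision taken from the condition in the text."""
    return temp_celcius < 0.0 and depth > 1.0 and depth < 2.0


# Temperature boundary at 0.0, with depth held well inside (1.0, 2.0).
TEMPERATURE_CASES = [
    (-50.0, True),    # far below the boundary
    (-0.01, True),    # just below the boundary
    (0.0, False),     # exactly on the boundary
    (0.01, False),    # just above the boundary
    (50.0, False),    # far above the boundary
]

# Depth boundaries at 1.0 and 2.0, with temperature held well below 0.0.
DEPTH_CASES = [
    (-50.0, False),   # far below the lower boundary
    (0.99, False),    # just below the lower boundary
    (1.0, False),     # exactly on the lower boundary
    (1.01, True),     # just above the lower boundary
    (1.99, True),     # just below the upper boundary
    (2.0, False),     # exactly on the upper boundary
    (2.01, False),    # just above the upper boundary
    (50.0, False),    # far above the upper boundary
]


def test_temperature_domains():
    for temp_celcius, expected in TEMPERATURE_CASES:
        assert in_target_zone(temp_celcius, 1.5) is expected


def test_depth_domains():
    for depth, expected in DEPTH_CASES:
        assert in_target_zone(-10.0, depth) is expected
```

The on-the-boundary rows are the ones most likely to catch a "<" that should have been "<=", which is a flaw that true and false combinations alone would not reveal.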

Domain testing is much more rigorous than branch testing, which in turn is more rigorous than path testing as described above, where the only goal is that every statement executes. In all cases, developers need to understand the basis for their unit tests and make sure they have established standards for creating a meaningful set of tests.

I’ll dive more deeply into these areas in future posts. These different test types address different sorts of flaws in our code. Understanding the flaws and testing options will help us design more appropriate and effective test cases.

Next time you are talking about your approach to testing your code, don’t leave it at unit testing. Share your approach to test creation just like you might describe your dream motorized vehicle!

This entry was posted on Monday, September 8th, 2008 at 23:53 and is filed under Testing.

David S. Read
