Domain Testing at the Unit Level, Part 1: An Introduction

It is surprising how many times I still find myself talking to software teams about unit testing. I’ve written before that the term “unit testing” is not definitive. “Unit testing” simply means that tests are being defined that run at the unit level of the code (typically methods or functions). However, the term doesn’t mean that the tests are meaningful, valuable, or quality-focused.

From what I have seen, the term is often used as a synonym for path or branch level unit testing. Although these are good places to start, such tests do not form a complete unit test suite. I argue that the pursuit of 100% path or branch coverage and the exclusion of other types of unit testing is a waste of time. It is better for the overall quality of the code if the unit tests achieve 80% branch coverage and include an effective mix of other unit test types, such as domain, fuzz and security tests.

For the moment I’m going to focus on domain testing. I think this is an area ripe for improvement. Extending the “ripe” metaphor, I’d say there is significant low-hanging fruit available to development teams which will allow them to quickly experience the benefits of domain testing.

First, what do I mean by unit-level domain testing for the purposes of this article? Unit-level domain testing is the exercising of program code units (methods, functions) using well-chosen values drawn from the sets of values that the code groups together, most often through its Boolean tests. (Note that the well-chosen values are not completely random. As we will see, they are constrained by the decision points and logic in the code.)

The provided definition is not meant to be mathematically precise or even receive a passing grade on a comp-sci exam. In future postings I’ll delve into more of the official theory and terminology. For now I’m focused on the basic purpose and value of domain testing.

I do need to state an assumption and create two baseline definitions in order to proceed:

Assumption: We are dealing only with integer numeric values and simple Boolean tests involving a variable and a constant.

Definitions:

Domain - The set of values included (or excluded) by a Boolean test. For example, the test “X > 3” has its domain of matching values 4, 5, 6, … and its domain of non-matching values 3, 2, 1, …
Boundary - The constant used in a Boolean test, forming the point between the included and excluded sets of values. So for “X > 3” the boundary value is 3.
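
To make the two definitions concrete, here is a minimal Java sketch of the test “X > 3”. The class name DomainExample and the method name matches are illustrative choices of mine, not part of any library.

```java
final class DomainExample {
    // The boundary is the constant used in the Boolean test "x > 3".
    static final int BOUNDARY = 3;

    // Returns true for values in the matching domain (4, 5, 6, ...)
    // and false for values in the non-matching domain (3, 2, 1, ...).
    static boolean matches(int x) {
        return x > BOUNDARY;
    }
}
```

Calling DomainExample.matches(4) returns true and DomainExample.matches(3) returns false, so the boundary value itself falls in the non-matching domain for this particular test.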

Now let’s look at some code. Here is a simple Java method:

public int add(int op1, int op2) {
    return op1 + op2;
}

This involves one domain, the domain of all integers, sort of. Looking closely, there is an issue: the domain of possible inputs (integers) is not necessarily the domain of possible (correct) outputs.

If two large integers were added together, they could produce a value too large to be held in a Java 32-bit integer. So the output domain is the set of values that can be derived by adding any two integers. In Java we have the constants MIN_VALUE and MAX_VALUE in the java.lang.Integer class. Using that vernacular, the domain of all output values for this method can be represented as: MIN_VALUE + MIN_VALUE through MAX_VALUE + MAX_VALUE.
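
As a rough illustration of the gap between the input and output domains, the sketch below computes the same sum in 64-bit arithmetic. The class AddDomains and the helpers addWide and sumFitsInInt are hypothetical names introduced only for this example.

```java
final class AddDomains {
    // The add method from the article, made static for brevity.
    static int add(int op1, int op2) {
        return op1 + op2;
    }

    // The same sum in 64-bit arithmetic, which cannot overflow for int inputs.
    static long addWide(int op1, int op2) {
        return (long) op1 + (long) op2;
    }

    // True when the mathematically correct sum fits in a 32-bit int.
    static boolean sumFitsInInt(int op1, int op2) {
        long wide = addWide(op1, op2);
        return wide >= Integer.MIN_VALUE && wide <= Integer.MAX_VALUE;
    }
}
```

With these helpers, addWide(Integer.MAX_VALUE, Integer.MAX_VALUE) is 4294967294 and addWide(Integer.MIN_VALUE, Integer.MIN_VALUE) is -4294967296, which are the two ends of the output domain. By contrast, add(Integer.MAX_VALUE, 1) wraps around to Integer.MIN_VALUE without any exception. If an exception is preferred, the standard Math.addExact(int, int) method (Java 8 and later) throws an ArithmeticException on overflow.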

Here is another simple method:

public int divide(int dividend, int divisor) {
    return dividend / divisor;
}

Again we seem to have one domain, the set of all integers. However, we all know there is a problem latent in this code. Would path testing effectively find it?

We’ll return to those examples later. For now, let’s look at how the number of domains can grow quickly. Here is a method containing several decisions:

public String calcGrade(int score, int effort) {
    String letterGrade;

    if (effort > 90) {
        score += 15;
    } else if (effort > 80) {
        score += 10;
    } else {
        score += (int) ((double) effort * .01);
    }

    if (score < 60) {
        letterGrade = "F";
    } else if (score < 70) {
        letterGrade = "D";
    } else if (score < 80) {
        letterGrade = "C";
    } else if (score < 90) {
        letterGrade = "B";
    } else {
        letterGrade = "A";
    }

    if ("ABC".indexOf(letterGrade) > -1) {
        score %= 10;
        if (score < 3) {
            letterGrade += "-";
        } else if (score > 6) {
            letterGrade += "+";
        }
    }

    return letterGrade;
}

How many domains does that code contain? 3 domains for the “effort” tests, 5 domains for the first “score” tests, 2 domains for the “indexOf” test and 3 domains for the nested “score” tests. In total we have 13 domains.
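
One way to keep that count honest is to write the domains down as data. The sketch below is only one possible way of documenting them: the class GradeDomains, its nested Domain type and the sample values are all my own assumptions, and the two “indexOf” domains are restated in terms of the adjusted score (grades A, B and C correspond to a score of 70 or more).

```java
import java.util.function.IntPredicate;

final class GradeDomains {
    // One documented domain: the value it constrains, its condition, and a sample value.
    static final class Domain {
        final String variable;
        final String description;
        final IntPredicate test;
        final int sample;

        Domain(String variable, String description, IntPredicate test, int sample) {
            this.variable = variable;
            this.description = description;
            this.test = test;
            this.sample = sample;
        }
    }

    static final Domain[] DOMAINS = {
        // Three domains from the "effort" tests.
        new Domain("effort", "effort > 90", e -> e > 90, 95),
        new Domain("effort", "80 < effort <= 90", e -> e > 80 && e <= 90, 85),
        new Domain("effort", "effort <= 80", e -> e <= 80, 50),
        // Five domains from the first "score" tests (score after the effort bonus).
        new Domain("score", "score < 60", s -> s < 60, 40),
        new Domain("score", "60 <= score < 70", s -> s >= 60 && s < 70, 65),
        new Domain("score", "70 <= score < 80", s -> s >= 70 && s < 80, 75),
        new Domain("score", "80 <= score < 90", s -> s >= 80 && s < 90, 85),
        new Domain("score", "score >= 90", s -> s >= 90, 95),
        // Two domains from the indexOf test, restated on the adjusted score.
        new Domain("score", "grade is A, B or C (score >= 70)", s -> s >= 70, 85),
        new Domain("score", "grade is D or F (score < 70)", s -> s < 70, 65),
        // Three domains from the nested tests on score % 10.
        new Domain("score % 10", "remainder < 3", r -> r < 3, 1),
        new Domain("score % 10", "3 <= remainder <= 6", r -> r >= 3 && r <= 6, 5),
        new Domain("score % 10", "remainder > 6", r -> r > 6, 8)
    };
}
```

The array has 13 entries, matching the count above, and each sample value satisfies the predicate of the domain it is listed under.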

To understand the purpose and value of domain tests we need to look at these sample methods differently. First, consider branch-level unit testing. How many tests would be required to achieve branch coverage, and would those tests do a good job of exposing run-time bugs? For the first two examples there are no branches, so we can get 100% code and branch coverage with one test per method. For the final code sample the three decision blocks run one after another, so a single test exercises one branch in each block. As few as five well-chosen tests (one per letter grade, varying the effort range and the plus/minus outcome along the way) should be enough for most branch coverage tools to score the coverage as 100%.

Let’s shift our thinking and explore a couple of questions: 1) if the branch coverage is 100% but there are unexplored domains, which domains are being missed; and 2) if a value in each domain has been used, what else does domain testing require and what is the value of that additional testing? In other words, is testing one value within each domain a good approach to testing?

The first question seems straightforward to answer. The missed domains are those whose values exercise the same branches as values from some other domain, so testing them adds no new coverage. However, it is actually not that straightforward an issue (otherwise automated tools would just create a complete set of domain tests for us).

Our code explicitly defines domains through Boolean expressions, but there are also domains created through logic and operations. These domains are harder to see but are important to our domain testing process.

In the first example, we don’t explicitly create multiple domains, yet we have two logical domains of values involved (those that can be represented in a Java 32-bit integer and those that cannot). In the second example we also have two domains (legal and illegal divisors). The last sample creates many domains in multiple dimensions (a concept I’ll explore in a future posting).

The first differentiator of domain testing is a requirement to ensure the use of test values chosen from each domain. The net effect of this is often to gain significant path and branch coverage for the unit being tested. However, this represents a basic requirement and still leaves us open to a variety of undiscovered bugs. To fully flesh out a set of domain tests we have to document the domains and then test in the following ways:

1. Test at and near all boundaries. The “near” part gets interesting and involves choosing an “epsilon” value. I’ll discuss epsilon in a future posting dealing with floating-point value domains.
2. Test far from all boundaries.
3. Test using the values -1, 0 and 1, which often cause problems under a variety of circumstances.
4. For multi-dimensional tests (those involving more than one variable, such as “X > 3 and Y < 7”) follow instructions 1, 2 and 3 for all permutations.
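
The four items above can be sketched as a small value generator. This is only an illustration under stated assumptions: the class BoundaryValues, the methods candidates and pairs, and the FAR distance of 1000 are hypothetical, “near” is taken to mean plus or minus 1 because we are dealing only with integers, and the boundary is assumed to sit far enough from Integer.MIN_VALUE and Integer.MAX_VALUE that the arithmetic here cannot overflow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class BoundaryValues {
    // Hypothetical "far" distance; choose one that is meaningful for the unit under test.
    static final int FAR = 1000;

    // Candidate integer test values for a single boundary constant.
    static SortedSet<Integer> candidates(int boundary) {
        SortedSet<Integer> values = new TreeSet<>();
        // Item 1: at and near the boundary (assumed epsilon of 1 for integers).
        values.add(boundary - 1);
        values.add(boundary);
        values.add(boundary + 1);
        // Item 2: far from the boundary, on both sides.
        values.add(boundary - FAR);
        values.add(boundary + FAR);
        // Item 3: the values -1, 0 and 1.
        values.add(-1);
        values.add(0);
        values.add(1);
        return values;
    }

    // Item 4: every combination of candidate values for a two-variable test.
    static List<int[]> pairs(int xBoundary, int yBoundary) {
        List<int[]> result = new ArrayList<>();
        for (int x : candidates(xBoundary)) {
            for (int y : candidates(yBoundary)) {
                result.add(new int[] {x, y});
            }
        }
        return result;
    }
}
```

For the test “X > 3”, candidates(3) yields the eight values -997, -1, 0, 1, 2, 3, 4 and 1003. For “X > 3 and Y < 7”, pairs(3, 7) yields 64 combinations, which shows how quickly the number of tests grows once more than one variable is involved.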

These are basic best-practices. Let’s see if they help discover flaws in the previous examples where simple branch-level testing wouldn’t.

For example 1, testing at or near the MIN_VALUE and MAX_VALUE integer values would expose integer overflow. Java does not throw an exception in this case; the sum silently wraps around and the method returns a wrong answer. For example 2, testing with a divisor of 0 would discover the divide-by-zero exception (an ArithmeticException in Java). In the last example there are permutations of very small and very large scores and efforts that will lead to range errors or odd grading effects. None of these would likely be discovered with simple path and branch-level unit tests.
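
Here is a hedged sketch of what such tests might turn up for the second and third examples. The methods are copied from above and made static; the wrapper class DomainFindings and the helper divideThrows are names I have invented for the illustration.

```java
final class DomainFindings {
    // The divide method from the article, made static for brevity.
    static int divide(int dividend, int divisor) {
        return dividend / divisor;
    }

    // Returns true when divide throws ArithmeticException for the given inputs.
    static boolean divideThrows(int dividend, int divisor) {
        try {
            divide(dividend, divisor);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // The calcGrade method from the article, logic unchanged.
    static String calcGrade(int score, int effort) {
        String letterGrade;

        if (effort > 90) {
            score += 15;
        } else if (effort > 80) {
            score += 10;
        } else {
            score += (int) ((double) effort * .01);
        }

        if (score < 60) {
            letterGrade = "F";
        } else if (score < 70) {
            letterGrade = "D";
        } else if (score < 80) {
            letterGrade = "C";
        } else if (score < 90) {
            letterGrade = "B";
        } else {
            letterGrade = "A";
        }

        if ("ABC".indexOf(letterGrade) > -1) {
            score %= 10;
            if (score < 3) {
                letterGrade += "-";
            } else if (score > 6) {
                letterGrade += "+";
            }
        }

        return letterGrade;
    }
}
```

Using the value 0 as a divisor, divideThrows(10, 0) returns true. Combining a boundary value with -1, divide(Integer.MIN_VALUE, -1) quietly returns Integer.MIN_VALUE because the true quotient does not fit in an int. For the grading method, calcGrade(84, 95) returns "A+" while calcGrade(85, 95) returns "A-", calcGrade(100, 0) returns "A-", and calcGrade(Integer.MAX_VALUE, 95) returns "F" because the effort bonus overflows the score.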

When I return to this topic I’ll drill into more of the terminology and theory which will allow us to explore a broader set of domain issues that I’ve seen in many real-world coding situations.

For now I would enjoy hearing about your use of domain testing techniques and tools. Are there situations that help identify good candidate units that would benefit from domain testing? Are there other best practices that you would recommend when applying domain testing to a development team’s process?

Tags: linkedin, programming, Testing

This entry was posted on Tuesday, February 1st, 2011 at 13:45 and is filed under Quality, Testing.

David S. Read

Dave's Reflections (Blog)
