
Archive for the ‘Testing’ Category

Heartbleed – A High-level Look

Saturday, April 12th, 2014

There has been a lot of information flying about on the Internet concerning the Heartbleed vulnerability in the OpenSSL library. Among system administrators and software developers there is a good understanding of exactly what happened, the potential data losses and proper mitigation processes. However, I’ve seen some inaccurate descriptions and discussion in less technical settings.

I thought I would attempt to explain the Heartbleed issue at a high level without focusing on the implementation details. My goal is to help IT and business leaders understand a little bit about how the vulnerability is exploited, why it puts sensitive information at risk and how this relates to their own software development shops.

Heartbleed is a good case study for developers who don’t always worry about data security, feeling that attacks are hard and vulnerabilities are rare. This should serve as a wake-up call that programs need to be tested in two ways – for use cases and misuse cases. We often focus on use cases, “does the program do what we want it to do?” Less frequently do we test for misuse cases, “does the program do things we don’t want it to do?” We need to do more of the latter.
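To make the two kinds of tests concrete, here is a minimal sketch in Java. The echo method and its tests are hypothetical illustrations of a Heartbleed-style mistake (trusting a caller-supplied length), not code from OpenSSL:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class EchoServiceTest {

    // Hypothetical method that echoes back a prefix of a payload, trusting a
    // caller-supplied length (the shape of the Heartbleed mistake).
    static String echo(String payload, int claimedLength) {
        if (claimedLength < 0 || claimedLength > payload.length()) {
            throw new IllegalArgumentException("claimed length exceeds payload");
        }
        return payload.substring(0, claimedLength);
    }

    @Test  // Use case: does the program do what we want it to do?
    public void echoesTheDeclaredPrefix() {
        assertEquals("bird", echo("birdhouse", 4));
    }

    // Misuse case: does the program do things we don't want it to do?
    @Test(expected = IllegalArgumentException.class)
    public void rejectsClaimedLengthLongerThanPayload() {
        echo("bird", 500);
    }
}

Note that a version of echo that skipped the length check would still pass the use-case test; only the misuse-case test exercises the validation.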

I’ve created a 10-minute video to walk through Heartbleed. It includes the parable of a “trusting change machine.” The parable is meant to explain the Heartbleed mechanics without requiring that the viewer be an expert in programming or data encryption.

If you have thoughts about ways to clarify concepts like Heartbleed to a wider audience, please feel free to comment. Data security requires cooperation throughout an organization. Effective and accurate communication is vital to achieving that cooperation.

Here are the links mentioned in the video:

Expanding on “Code Reviews Trump Unit Testing, But They Are Better Together”

Tuesday, October 18th, 2011

Michael Delaney, a senior consulting software engineer at Blue Slate, commented on my previous posting.  As I created a reply I realized that I was expanding on my reasoning and it was becoming a bit long.  So, here is my reply as a follow-up posting.  Also, thank you to Michael for helping me think more about this topic.

I understand the desire to rely on unit testing and its ability to find issues and prevent regressions.  As for TDD, I’ll need to write about it separately.  Fundamentally I’m a believer in white box testing.  Black box approaches, like TDD, seem to be of relatively little value to the overall quality and reliability of the code.  In other words, I’d want to invest more effort in white box testing than in black box testing.

I’m somewhat jaded, being concerned with the code’s security, which to me is strongly correlated with its reliability.  That said, I believe that unit testing is much more constrained as compared to formal reviews.  I’m not suggesting that unit tests be skipped, rather that we understand that unit tests can catch certain types of flaws and that those types are narrow as compared to what formal reviews can identify.


Code Reviews Trump Unit Testing, But They Are Better Together

Tuesday, October 11th, 2011

Last week I was participating in a formal code review (a.k.a. code inspection) with one of our clients.  We have been working with this client, helping them strengthen their development practices.  Holding formal code reviews is a key component for us.  Part of the formal process we introduced includes reviewing the unit testing results, both the (successful) output report and the code coverage metrics.

At one point we were reviewing some code that had several error handling blocks that were not being covered in the unit tests.  These blocks were, arguably, unlikely or impossible to reach (such as a Java StringReader throwing an IOException).  There was some discussion by the team about the necessity of mocking enough functionality to cover these blocks.
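As a hedged illustration (this is not the client’s code), here is the kind of block we were discussing. The Reader API forces a catch of IOException, yet a StringReader over an in-memory string will only throw if it is read after being closed:

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CharCounter {

    // Illustrative only: counts characters in a string via the Reader API.
    public static int countChars(String text) {
        Reader reader = new StringReader(text);
        int count = 0;
        try {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            // The compiler requires this block, but a StringReader over an
            // in-memory string throws only if read after close(). Covering
            // this branch in a unit test means mocking the Reader just to fail.
            throw new IllegalStateException("unreachable in practice", e);
        }
        return count;
    }
}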

Although we agreed that some of the more esoteric error conditions weren’t worth the programmer’s time to mock up, it occurred to me later that we were missing an important point.  What mattered was that we were holding a formal code review and looking at those blocks of code.

Let me take a step back.  In 1986, Capers Jones published a book entitled Programming Productivity.  Although dated, the book contains many excellent points that cause you to think about how to create software in an efficient way.  Here efficiency is not about lines of code per unit of time, but more importantly, lines of correct code per unit of time.  This means taking into account rework due to errors and omissions.

One of the studies presented in the book relates to identifying defects in code.  It is a study whose results seem obvious when we think about them.  However, we don’t always align our software development practices to leverage the study’s lessons and maximize our development efficiency.  Perhaps we believe that the statistics have changed due to language constructs, experience, tooling and so forth.  We’d need studies similar to the ones presented by Capers Jones in order to prove that, though.

Below are a few of the results from the book’s study of defect detection approaches.  I’ve skipped the low-end and high-end numbers that Capers includes, simply giving the modes (the most typical values), which are a good basis for comparison:

[Table: Defect Identification Rates]
[Graph: Defect Identification Rates]


Domain Testing at the Unit Level, Part 1: An Introduction

Tuesday, February 1st, 2011

It is surprising how many times I still find myself talking to software teams about unit testing.  I’ve written before that the term “unit testing” is not definitive.  “Unit testing” simply means that tests are being defined that run at the unit level of the code (typically methods or functions).  However, the term doesn’t mean that the tests are meaningful, valuable, or quality-focused.

From what I have seen, the term is often used as a synonym for path or branch level unit testing.  Although these are good places to start, such tests do not form a complete unit test suite.  I argue that pursuing 100% path or branch coverage to the exclusion of other types of unit testing is a waste of time. It is better for the overall quality of the code if the unit tests achieve 80% branch coverage and include an effective mix of other unit test types, such as domain, fuzz and security tests.

For the moment I’m going to focus on domain testing.  I think this is an area ripe for improvement.  Extending the “ripe” metaphor, I’d say there is significant low-hanging fruit available to development teams which will allow them to quickly experience the benefits of domain testing.

First, for my purposes in this article, what is unit-level domain testing?  Unit-level domain testing is the exercising of program code units (methods, functions) using well-chosen values based on the sets of values that are grouped, most often by Boolean tests in the code.  (Note that the well-chosen values are not completely random.  As we will see, they are constrained by the decision points and logic in the code.)

The provided definition is not meant to be mathematically precise or even receive a passing grade on a comp-sci exam.  In future postings I’ll delve into more of the official theory and terminology.  For now I’m focused on the basic purpose and value of domain testing.

I do need to state an assumption and create two baseline definitions in order to proceed:

Assumption: We are dealing only with integer numeric values and simple Boolean tests involving a variable and a constant.

Definitions:

  • Domain - the set of values included (or excluded) by a Boolean test.  For example, the test, “X > 3” has its domain of matching values 4, 5, 6, … and its domain of non-matching values 3, 2, 1, …
  • Boundary – The constant used in a Boolean test forming the point between the included and excluded sets of values.  So for “X > 3” the boundary value is 3.

Now let’s look at some code.  Here is a simple Java method:

public int add(int op1, int op2) {
    return op1 + op2;
}

This involves one domain, the domain of all integers, sort of.  Looking closely there is an issue; the domain of possible inputs (integers) is not necessarily the domain of possible (correct) outputs.

If two large integers were added together they could produce a value larger than a Java 32-bit integer can hold.  So the output domain is the set of values that can be derived by adding any two integers.  In Java we have the constants MIN_VALUE and MAX_VALUE in the java.lang.Integer class.  Using that vernacular, the domain of all output values for this method can be represented as: MIN_VALUE + MIN_VALUE through MAX_VALUE + MAX_VALUE.
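A boundary-focused unit test makes this concrete. Here is a minimal sketch (the test names are mine) probing the edge of the output domain, where Java’s 32-bit arithmetic silently wraps:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class AddDomainTest {

    public int add(int op1, int op2) {
        return op1 + op2;
    }

    @Test  // Well inside the domain: behaves as expected.
    public void addsTypicalValues() {
        assertEquals(7, add(3, 4));
    }

    @Test  // At the boundary: the sum wraps around to a negative value.
    public void wrapsAroundPastMaxValue() {
        assertEquals(Integer.MIN_VALUE, add(Integer.MAX_VALUE, 1));
    }
}

Whether the wraparound is a defect depends on the method’s contract, but a suite aimed only at path coverage would never ask the question.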

Here is another simple method:

public int divide(int dividend, int divisor) {
    return dividend / divisor;
}

Again we seem to have one domain, the set of all integers.  However, we all know there is a problem latent in this code.  Would path testing effectively find it?
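Probably not; a single call such as divide(6, 3) covers the method’s only path. A domain test that includes the boundary value zero finds the problem immediately. A minimal sketch (the test name is mine):

import org.junit.Test;

public class DivideDomainTest {

    public int divide(int dividend, int divisor) {
        return dividend / divisor;
    }

    // Zero sits on the boundary of the divisor's domain; in Java, integer
    // division by zero throws ArithmeticException at run time.
    @Test(expected = ArithmeticException.class)
    public void divisionByZeroThrows() {
        divide(1, 0);
    }
}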


Fuzzing – A Powerful Technique for Software Security Testing

Friday, January 21st, 2011

I was participating in a code review today and was reminded by a senior architect, who started working as an intern for me years ago, of a testing technique I had used with one of his first programs.  He had been assigned to create a basic web application that collected some data from a user and wrote it to a database.  He came into my office, announced it was done and proudly showed it to me.  I walked over to the keyboard, entered a bunch of junk and got a segmentation fault in response.

Although I didn’t have a name for it, that was a standard technique I used when evaluating applications.  After all, the tried and true paths, expected inputs and easy errors will be tested early and often as the developer exercises the application using the basic use cases.  As Boris Beizer said, “The high-probability paths are always tested if only to demonstrate that the system works properly.” (Beizer, Boris. Software Testing Techniques. Boston, MA: Thomson Computer Press, 1990: 76.)

It is unexpected input that is useful when looking to find untested paths through the code. If someone shows me an application for evaluation, the last thing I need to worry about is using it in an expected fashion; everyone else will do that.  In fact, I default to entering data outside the specification when looking at a new application.  I don’t know that my team always appreciates the approach.  They’d probably like to see the application work at least once while I’m in the room.

These days there is a formal name for testing of this type, fuzzing.  A few years ago I preferred calling it “gorilla testing” since I liked the mental picture of beating on the application. (Remember the American Tourister luggage ad in the 1970s?)  But alas, it appears that fuzzing has become the accepted term.

Fuzzing involves passing input that breaks the expected input “rules”.  Those rules could come from formal requirements, such as an RFC, or informal requirements, such as the set of parameters accepted by an application.  Fuzzing tools can use formal standards, extracted patterns and even randomly generated inputs to test an application’s resilience against unexpected or illegal input.
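As a toy example of the idea, here is a naive fuzz loop in Java. The parseRecord method is a hypothetical unit under test, standing in for any code with informal input rules (here, “name,age”):

import java.util.Random;

public class NaiveFuzzer {

    // Hypothetical unit under test: expects input of the form "name,age".
    static void parseRecord(String input) {
        String[] fields = input.split(",");
        int age = Integer.parseInt(fields[1].trim());
        if (age < 0) {
            throw new IllegalArgumentException("negative age");
        }
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so findings reproduce
        for (int i = 0; i < 10_000; i++) {
            // Build a random printable string that ignores the "name,age" rule.
            StringBuilder input = new StringBuilder();
            int length = random.nextInt(20);
            for (int j = 0; j < length; j++) {
                input.append((char) (32 + random.nextInt(95)));
            }
            try {
                parseRecord(input.toString());
            } catch (IllegalArgumentException expected) {
                // Documented rejection of bad input (includes NumberFormatException).
            } catch (RuntimeException e) {
                // Anything else, such as ArrayIndexOutOfBoundsException when the
                // comma is missing, is the unexpected behavior fuzzing surfaces.
                System.out.println("Fuzz finding on \"" + input + "\": " + e);
            }
        }
    }
}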


Cut Waste, Not Costs

Monday, March 15th, 2010

As I read more and more about the Toyota debacle I’m struck by an apparently myopic management drive to cut costs.  In the case of Toyota it appears that cost cutting extended into quality cutting.  A company once known for superb quality had methodically reduced that aspect of their output.  This isn’t just conjecture; it seems that people inside the company had been aware of a decline in quality due to a focus on reducing costs. Is there a general lesson to consider?

I believe the failure is one of misplaced focus. The focus when Toyota began cutting costs was to remove waste.  That waste could be found throughout their manufacturing processes.  Wasted materials, productivity, tooling, and equipment were all identified as Toyota’s management and workers struck out on a journey to reduce waste and improve productivity.  They ushered in a set of practices that others would soon adopt.

Head back to the 1950s and you’ll find Taiichi Ohno hard at work addressing myriad manufacturing shortcomings at Toyota.  Mr. Ohno is really the father of lean manufacturing and just-in-time inventory management.  He didn’t name them as such.  He was just trying to remove waste from the entire manufacturing process.  By the late 1990s these concepts had become standard operating procedure at many firms.

It makes sense that a business would focus on reducing waste.  Although it may require effort to remove waste without reducing productivity, one would expect a leaner process to yield an overall efficiency gain.  It would also seem that quality does not benefit from waste.  After many years of experience with these principles, companies have found that using only the resources that are needed, when they are needed, provides a sound basis for their operations.  So what happened at Toyota?  They apparently went beyond cutting waste.


Design and Build Effort versus Run-time Efficiency

Saturday, October 17th, 2009

I recently overheard a development leader talking with a team of programmers about the trade-off between the speed of developing working code and the effort required to improve the run-time performance of the code.  His opinion was that it was not worth any extra effort to gain a few hundred milliseconds here or there.  I found myself wanting to debate the position but it was not the right venue.

In my opinion a developer should not write inefficient code just because it is easier.  However, a developer must not tune code without evidence that the tuning effort will make a meaningful improvement to the overall efficiency of the application.  Guessing at where the hotspots are in an application usually leads to a lot of wasted effort.

When I talk about designing and writing efficient code I am really stressing the process of thinking about the macro-level algorithm that is being used.  Considering efficiency (e.g. big-O) and spending some time looking for options that would represent a big-O step change is where design and development performance effort belongs.

For instance, during initial design or coding, it is worth finding an O(log n) alternative to an O(n) solution.  However, spending time searching for a slight improvement in an O(n) algorithm that is still O(n) is likely a waste of time.
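A sketch of that distinction, assuming a sorted data set: the two lookups below return the same answers, but one is a design-level (big-O) improvement that no amount of micro-tuning of the other could match.

import java.util.Arrays;

public class LookupComparison {

    // O(n): scans every element in the worst case.
    static boolean containsLinear(int[] sorted, int target) {
        for (int value : sorted) {
            if (value == target) return true;
        }
        return false;
    }

    // O(log n): halves the search space each step; worth finding at design time.
    static boolean containsBinary(int[] sorted, int target) {
        return Arrays.binarySearch(sorted, target) >= 0;
    }

    public static void main(String[] args) {
        int[] sorted = new int[10_000_000];
        for (int i = 0; i < sorted.length; i++) sorted[i] = i * 2;
        // Same answer either way, but the binary version does roughly 24
        // comparisons where the linear version may do 10 million.
        System.out.println(containsLinear(sorted, 19_999_999)); // false
        System.out.println(containsBinary(sorted, 19_999_999)); // false
    }
}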

Preemptive tuning is a guessing game; we are guessing how a compiler will optimize our code, when a processor will fetch and cache our executable and where the actual hotspots will be.  Unfortunately our guesses are usually wrong. Perhaps the development team lead was really talking about this situation.

The tuning circumstances change once we have an application that can be tested.  The question becomes how far do we go to address performance hotspots?  In other words, how fast is fast enough?  For me the balance being sought is between application delivery time and user productivity; the benefits of tuning can be valuable.

Unit Testing As a Standard Is Nondescript

Monday, September 8th, 2008

Often when I am discussing programming practices with developers they are quick to mention their use of unit testing.  It is a badge of honor that they wear and rightly so.  “I test my code.  I care about the quality of my work!”  Of course unit testing means that the test is exercising a small “unit” of code, typically a method or function.  Does the term tell us anything else?  Are all unit tests equivalent?

The tools and techniques developers use for unit tests differ, such as using frameworks versus more homegrown approaches, but many developers stress the importance of unit testing. However, saying that one subscribes to the use of unit tests is like saying that someone uses a motorized vehicle.  There is a lot of detail missing.

If you explore someone’s motorized vehicle it might turn out to be a motorcycle, car, train, boat, etc.  What is meant by motorized vehicle can vary widely, making the term ineffective when trying to determine the vehicle’s ability to carry or tow something.  In the same fashion, if you dive more deeply into each person’s definition of unit tests, it is clear that one developer’s unit testing is not the same as another’s.

Here I’ll explore some details surrounding unit testing.  I’ll start off with my two invariants: predefined results and automation.

A test is less effective if its result is not defined before the test is run.  Each test must be defined with its input and result documented.  This means that no interpretation on the part of the tester can be involved.  The reason for this is simple; people can convince themselves that an answer is correct when they see it.  If a tester enters a test input without a definition of the correct answer, he or she may be willing to accept the result as correct.  It is human nature to accept information that we see presented.

A logical corollary is that unit tests should be automated.  In order to guarantee that the tester is not interpreting any results, let the computer do the check.  The result is a pass or fail.  No murky gray area where the tester tries to understand whether it is correct.  The automation also leads to the creation of a regression test suite that will grow with the code.  Finally, if the tests are automated, other tools can help us assess the completeness of the tests.
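Here is a minimal sketch of both invariants in JUnit (the calculator method is hypothetical): the expected value is written into the test before it ever runs, and the framework, not a person, decides pass or fail.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class InterestCalculatorTest {

    // Hypothetical unit under test.
    static long monthlyPaymentCents(long principalCents, int months) {
        return principalCents / months;
    }

    @Test
    public void paymentForOneYearLoan() {
        // The expected result (10,000 cents) was computed and written down
        // before the test ever ran; there is no interpreting output after
        // the fact. The framework decides pass or fail, so this check can
        // run automatically as part of a regression suite on every build.
        assertEquals(10_000, monthlyPaymentCents(120_000, 12));
    }
}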

Beyond these two statements there are decisions that developers need to make about the types of unit tests that will be created.  These decisions impact the complexity of the tests that must be written as well as the number of tests required.