Tuesday, September 25, 2007

Using the wrong tool

I just saw a Selenium test script that was supposed to test a routing system (routing as in finding a driving route between places on a map, not network routing).
The script would open the routing web-page, enter a source and destination into the appropriate inputs, click "search" and then assert on some key properties of the resulting route (such as route length, total time, etc).

All's nice, except the script repeated this sequence for about twenty-something different routes.

So what's wrong with this?
1. Wrong focus:
The test is in fact about checking the results of different route-requests. It has nothing to do with the web-GUI, and having the test repeat the GUI actions (enter this into that textbox, enter that into this textbox, click that button, wait for this panel to be visible, etc.) just creates "noise" in the test which is totally irrelevant to the actual "beef" of it: "for a given route request, assert that the total time is XXX, that the route length is YYY", and so on.

2. Slow test:
Testing the routing server through the web-GUI just makes the test run slow, which lengthens the feedback cycle (more time passes between running the test and getting feedback about the system-under-test) and makes it much less likely that the test will be run on a continual basis.

3. Test Brittleness:
Since the test repeats a sequence of web-GUI actions (click this link, type into that textbox, wait for this panel to be visible, etc.), any change to the web-GUI will force us to fix this test accordingly. Like I mentioned before, these GUI changes have nothing to do with the "beef" of this test, and having to apply repeated and tedious fixes to the test-script makes it brittle (i.e. it breaks easily due to irrelevant changes in the system-under-test) and hard to maintain.

The solution?
The same assertions that this test makes can be made at a lower level, such as sending raw requests to the routing server and asserting on the results. I'm thinking of something like a FitNesse ColumnFixture which sends requests with different source and destination points and expects certain key values to be returned. This focuses the test on what it's actually about, makes it run faster (since it won't go through the web-GUI) and makes it more maintainable (since it won't have to be fixed due to changes in the web-GUI).
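As a rough sketch of what such a lower-level, table-driven test could look like - note that RoutingClient, RouteResult and GetRoute are all made-up names standing in for whatever API the routing server actually exposes, and a real client would send a raw request instead of the canned result used here to keep the sketch self-contained:

```csharp
using System;

// Hypothetical stand-in for a client that talks directly to the routing server.
class RouteResult
{
    public int LengthKm;
    public int TotalMinutes;
}

class RoutingClient
{
    public RouteResult GetRoute(string source, string destination)
    {
        // A real client would send a raw request to the routing server;
        // this canned result just lets the sketch run on its own.
        return new RouteResult { LengthKm = 95, TotalMinutes = 62 };
    }
}

class RouteChecks
{
    static void Main()
    {
        RoutingClient client = new RoutingClient();

        // Each check plays the role of one ColumnFixture row:
        // inputs (source, destination) and expected key values of the route.
        RouteResult result = client.GetRoute("CityA", "CityB");
        Console.WriteLine(result.LengthKm);      // expected route length
        Console.WriteLine(result.TotalMinutes);  // expected total time
    }
}
```

Repeat a row per route-request and you get the twenty-something checks of the original script, with none of the GUI noise.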

Balanced breakfast:
In addition to being misplaced, the bad Selenium test did indirectly test the web-GUI, and that coverage should not be discarded. I guess a focused FitNesse test for the routing service should be complemented with a (much shorter) Selenium test for the web-GUI, repeating the original sequence just once or twice to make sure the web-GUI actually works.

Sunday, September 23, 2007

Testing "Everything"

Roy Osherove has posted about how to test-drive composite methods such as validation. This triggered me to ramble a bit about my own experience with this kind of task.

Here's the short background:
(I'll be using Roy's example to demonstrate my own solution here).
We start test-driving a Validate() method which in fact involves validating several rules (HasAtLeast6Chars, HasDigits, HasMax10Chars etc).
We start by test-driving the first validation rule. Easy.
Next, we want to test-drive the second rule, but for this we need to "assume" that the first rule is valid, otherwise we might be getting false-negatives.
Once we get to the third rule, we wish we could "assume" that "everything is valid except the rule-under-test".

Solution: Everything = Abstraction
I find that words like "everything", "all", "anything" and other generalization-words in the specifications can be like a big neon sign above our heads flashing "abstract it!". Imagine that we could actually write the test for the 3rd rule like this:

//arrange - assume all rules are valid except HasMax10Chars:
validator.AnyRuleShouldReturnTrueExcept(HasMax10Chars);

//act:
validator.Validate("this is longer than ten chars but has no digits");

// assert:
Assert.IsFalse(validator.IsValid);


Hmmm... wait - we're practicing TDD, so maybe we can get the code to be like that? Remember - the tests should "tell" us what the code should look like (hence the word "Driven" in "Test-Driven-Development" - the tests "drive" the design of the code).

Let's try refactoring a bit, replacing each of the "other" validation-rules with stub-rules that are always valid:

//arrange - assume all rules are valid except HasMaxChars:
validator.HasAtLeast6CharsRule = new Stubs.ValidRule();
validator.HasDigitsRule = new Stubs.ValidRule();
...



The validator class now relies on extracted collaborators to perform each rule. Each of these rules implements an IRule interface which can be stubbed or mocked:

class Validator
{
    public IRule HasAtLeast6CharsRule = new HasAtLeast6CharsRule();
    public IRule HasMax10CharsRule = new HasMax10CharsRule();
    public IRule HasDigitsRule = new HasDigitsRule();
    ...

    public void Validate()
    {
        if (this.HasAtLeast6CharsRule.IsValid &&
            this.HasMax10CharsRule.IsValid...)
        ...
    }
}
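For completeness, here's my guess at the IRule interface implied by the snippets above (the validator only ever reads an IsValid property) and the always-valid stub that neutralizes whatever rule it replaces:

```csharp
using System;

// A guess at the IRule interface - the validator code only needs IsValid:
public interface IRule
{
    bool IsValid { get; }
}

// The always-valid stub (the Stubs.ValidRule used above): it implements
// IRule and simply reports success, neutralizing the rule it replaces.
namespace Stubs
{
    public class ValidRule : IRule
    {
        public bool IsValid { get { return true; } }
    }
}

class Demo
{
    static void Main()
    {
        IRule rule = new Stubs.ValidRule();
        Console.WriteLine(rule.IsValid);
    }
}
```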



Now this is a little better, since we can really neutralize "everything except" the rule under test by replacing all the other rules with stubs, but it's still a hassle dealing with "all" these rules in each test's "arrange" section. Let's generalize a bit further - how do we generalize "neutralizing all the rules"? With a collection, of course. We can run through the rules, replacing each one with an always-passing stub:

//arrange - assume all rules pass....
// (copy the keys first, so we can safely replace values while iterating)
List<string> ruleNames = new List<string>(validator.Rules.Keys);
foreach (string ruleName in ruleNames)
{
    validator.Rules[ruleName] = new Stubs.AlwaysPassingRule();
}

// ...except HasMax10Chars:
validator.Rules["HasMax10Chars"] = new HasMax10CharsRule();



Where the Validator class now uses a collection of IRule objects (a dictionary of named rules, in fact), like so:

class Validator
{
    public Dictionary<string,IRule> Rules = new Dictionary<string,IRule>();

    ...

    public void Validate()
    {
        if (this.Rules["HasAtLeast6Chars"].IsValid &&
            this.Rules["HasMax10Chars"].IsValid...)
        ...
    }
}


But there must be a better way to make "all the rules" valid... How?
"The truth is, there is no spoon" - we can make all the rules valid by simply eliminating all the rules:

//arrange - assume all rules pass....
validator.Rules.Clear();

// ...except HasMax10Chars:
validator.Rules["HasMax10Chars"] = new HasMax10CharsRule();



And we quickly realize that the rule-names aren't doing us much good either, so we throw them away too:

//arrange - assume all rules pass....
validator.Rules.Clear();

// ...except HasMax10Chars:
validator.Rules.Add(new HasMax10CharsRule());



...And in the validator:

class Validator
{
    public List<IRule> Rules = new List<IRule>();

    ...

    public void Validate()
    {
        foreach (IRule rule in this.Rules)
        {
            if (!rule.IsValid)...
        }
    }
}


Voilà - we've got generalized composition through refactoring.

Testing this AND that
Roy also mentioned that we need to be able to test what happens if the first rule is satisfied but the second is not, and so on. Notice that in the above example we can very easily test this too, through generalization.

When we ask "what happens when the first rule is satisfied and the second rule is not satisfied?", this can be reduced to "assume there are only two rules; the first is satisfied and the second isn't". In codespeak this would read:

// arrange - assume we have 2 rules, the first one is satisfied and the second one isn't:
validator.Rules.Clear();
validator.Rules.Add(new Stubs.AlwaysPassingRule());
validator.Rules.Add(new Stubs.AlwaysFailingRule());


Notice how (again) the actual rules don't matter? We're testing the composition-logic decoupled from the actual rules that compose the whole. Abstraction at its finest hour.
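Putting it all together, a self-contained version of this composition test could look like the following - the class names match the post, and the short-circuiting Validate() body is my own filling-in of the "..." from the earlier snippets:

```csharp
using System;
using System.Collections.Generic;

public interface IRule
{
    bool IsValid { get; }
}

namespace Stubs
{
    // Stub rules that ignore any actual validation logic:
    public class AlwaysPassingRule : IRule
    {
        public bool IsValid { get { return true; } }
    }

    public class AlwaysFailingRule : IRule
    {
        public bool IsValid { get { return false; } }
    }
}

public class Validator
{
    public List<IRule> Rules = new List<IRule>();
    public bool IsValid;

    public void Validate()
    {
        // The whole is valid only if every composed rule is valid.
        this.IsValid = true;
        foreach (IRule rule in this.Rules)
        {
            if (!rule.IsValid)
            {
                this.IsValid = false;
                break;
            }
        }
    }
}

class CompositionTest
{
    static void Main()
    {
        // arrange - assume we have 2 rules; the first passes, the second fails:
        Validator validator = new Validator();
        validator.Rules.Add(new Stubs.AlwaysPassingRule());
        validator.Rules.Add(new Stubs.AlwaysFailingRule());

        // act:
        validator.Validate();

        // assert - one failing rule fails the whole validation:
        Console.WriteLine(validator.IsValid);
    }
}
```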