Tuesday, January 17, 2012

Known ways of managing ET #05 - Off-Piste Testing

tl;dr – scripted guides may not help exploration

Team leaders tell me ‘My testers use manual testing scripts*, but I want them to do more than just plod through – I want them to find problems’. This seems a reasonable idea, but is fundamentally incoherent; ‘Finding a problem’ is not necessarily a complementary goal to the action of ‘following a script’. However, it happens. Let’s look at two common approaches. I’ll call them Iron Script, and (by way of contrast) Marshmallow Script.

Iron Script
The scripts remain authoritative. Testers are expected to deviate at agreed points, typically by changing their data within reasonable limits. Adding new events or changing the path is frowned upon; in extremes, a bug that is found using a non-standard path may be rejected, and the tester asked to reproduce the bug on the accepted path. If you can get some kind of pre-approval for diversions taken through error states and validation checks, you’ll open up a bunch of interesting routes whose valuable information might otherwise be used against you by project pedants.

It’s my experience that scripts in these situations frequently mirror end-user activity, and that the steps focus on triggers at the expense of observations. If your testers must run through the script, then they must, but don’t let them get dull. Remember that you are a team of testers, not users, and that you can still get unassailably-helpful information from querying the database, watching the CPU, intercepting transactions, using a diff tool on the filesystem, or any other neat trick that takes your fancy. Constraints breed creativity.
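One of those neat tricks, diffing the filesystem before and after a scripted step, is cheap to sketch. The snippet below is a minimal illustration of the idea only (the post names no tool or code); the function names and approach here are my own assumptions, not the author's.

```python
# A small sketch of 'using a diff tool on the filesystem': snapshot the
# files under a directory before and after a scripted action, then report
# anything created, deleted, or changed -- side effects the script's
# expected results may never mention. Names are illustrative.
import hashlib
from pathlib import Path

def snapshot(root):
    """Map each file under root to a hash of its contents."""
    state = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            state[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return state

def diff_snapshots(before, after):
    """Report files created, deleted, or changed between two snapshots."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    changed = sorted(p for p in before.keys() & after.keys()
                     if before[p] != after[p])
    return {"created": created, "deleted": deleted, "changed": changed}
```

Run `snapshot` before the scripted step and again after; `diff_snapshots` then gives the tester an observation the script never asked for.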

Marshmallow Script
The scripts are a guide, a collection of hints or waypoints. Testers can deviate wherever they want, using the scripts to get to interesting points, or as a checklist to verify their expectations. The scripts act as fat charters, and by giving their names to paths, help testers to talk about what they’ve done and what they’ve found. This isn’t bad, as far as it goes.

However, the approach puts happy paths – reliable routes that demonstrate limited capability – at the core of decisions about what to test and how to test it. This emphasis can be a terrible drag on the swift revelations that might be desired from unfettered ET. It can wastefully restrict your testers’ imaginations, and seems to reinforce manual testing at the detriment of small cheap tools.

I tend to find that these approaches exist in parallel, but may not be acknowledged as such. It is powerful – sometimes, too powerful – to wonder out loud whether the team as a whole is looking to move away from their scripts or to add to their scripts. This can turn out to be emotive enough to be ignored in polite company; bringing it up in public can make people very impolite indeed.

One might question why the team is writing scripts at all. Scripts are expensive to make and hard to maintain. If they exist to give the testers a framework to explore requirements and product use while writing them, other frameworks might work just as well. If they are primarily a guide for novices or a crutch for the feeble, then perhaps one needs to rethink one’s approach to learning, and possibly to hiring. If they are primarily a way of recording work, then why not record the work with something more unambiguous, or more searchable? If they exist because an environment is hard to automate, then I would wonder if everything scripted is quite so hard to automate. If they exist to keep testers on a leash, then I have no further questions.

These are, however, rationalisations of a generally irrational position. I think the answer lies not in conscious choice, but in habit. The approach seems common in situations where budgets are only available for work that can be measured with reference to requirements and scripts, yet where the test team and its decision makers know that their survival-in-current-form requires fast information about emerging trouble. Maybe it’s endowment bias; if no one wants to chuck away all those scripts they’ve been working on so hard, then the scripts will remain central to the work no matter what the work is. In the first case, future plans don’t match current practice; in the second, the past doesn’t match it either. I often see both. Is it any wonder that the team lead’s goals might not match their means?

As a skier**, I’m drawn to the term ‘Off Piste’, and the term ‘Off-Piste Testing’*** seems a popular metaphor for this approach. Between the mountain top and the snug chalet in the valley floor, there are many paths: some groomed, marked and filled with tourists; others quiet or exposed, with cliffs and bears. There is an implication that off-piste is only for the most skilled, keen and well-equipped skier. The implied kudos is eagerly lapped-up by testers, and off-piste testing can be used as a motivator with two caveats: status should be earned through good work, and good information can be gained from diverse approaches. Whatever the rules of the mountain might be, it is perilous to restrict off-piste testing to your elite.

More importantly, off-piste is still downhill. Scripts, whether used as hard or soft guides, bias testers towards a set of activities that most typically follow whatever actions are expected to be valuable to the user, system or owner. These activities are not the only ways to find problems. Those who manage exploratory testing by running after scripts will handicap their team.

* For this blog post, script means a step by step set of instructions to be read through and followed manually by a tester. Some of you may be aghast that such things are still in use. Some of you may see no alternative. For each of you, please believe that each position exists.
** Note to prospective clients – if you're near a properly-skiable mountain and book me to come to you close to a weekend during the season, I may have a seriously-tasty winter offer for you.
*** ‘piste-off testing’, anyone? Just me? Hey ho.


  1. Hi James

    First I must say your series is very good, the Set Aside Time post is brilliant.

    In this one I lack some parts (maybe they will come later?)
    I have used Marshmallow scripts + Off-Piste quite a lot, so I might be just defensive.

    I think the "habit" reason is spot on, but another driver can be that audits go well if the pre-defined scripts exist (although I think auditors will change their minds in the upcoming years.)

    I might be wrong, but it seems you assume that the scripts are of the positive, happy path type. I have used vague scripts with essences like "provoke errors", "perceived performance at least as good as the old product", "investigate upgrade from previous version".
    The set of tests is broader than requirements, and exists to make sure you cover what you think is important.

    A difficult thing with exploration around scripts is to know when to stop. This requires both skill and judgment, but is easier when collaborating (this is a reason why pair testing can be more effective.)

We should also be aware that scripts and off-piste are often just two of many testing activities, exploratory or not.
We want diversity, also in the way we do things.

1. Audits can be helped along by the presence of scripts. However, I've read at least one auditor's report highlighting that although scripts existed, the audit could not find a single instance in which they were followed. The audit hadn't gone well; the auditor had fixed their judgement when they found such a clear inconsistency between the team's stated intentions and the testers' actions in practice. This judgement precluded any deeper questions about whether the work was worthwhile.

      You're right to pick out my assumption that scripts tend to follow the happy path. I should have made it explicit: most scripts that I see are essentially confirmatory (especially the automated ones). I have tried to put together scripts with instructions like 'provoke' and 'investigate'*. However, I've not generally got those scripts to stick, and I want to know** the ways that you've made yours work. It sounds like your instructions hand the initiative to the tester. This, to me, is one of the vital enabling elements of an exploration.

      Stopping is easy with a script – you stop when you get to the end, or when something prevents you getting to the end. Some off-piste testing running close to a happy path still has a sense of 'end'; the path that takes an account from birth to death, a closing message in a back-and-forth of transactions, getting the basket through checkout, finishing the level. Once one gets to a point where the instructions have to carefully aim and ration the initiative of a headlong off-piste tester, if only so they can stop and get to work on the next most important thing, then again those carefully-tuned instructions are characteristic of an exploration.

      * any readers from my distant past may also recall instructions to use 'appropriate' and 'inappropriate' data
      ** steal

  2. Interesting one and I agree with most of what you wrote there. Two things I spotted made me respond:

    "If they (test scripts) are primarily a guide for novices or a crutch for the feeble, then perhaps one needs to rethink one’s approach to learning, and possibly to hiring."

    I agree a bit to the learning and not necessarily to the hiring part.
If test scripts are available that can guide a new user, similar to a manual, I don't have anything against those; I wouldn't object to them reading the manual either. I wouldn't, and in fact don't, put a lot of effort into maintaining those that we currently use; they come in handy for the first couple of days alongside other training approaches.

    About the hiring part - if you sit down testers in front of test scripts and expect them to perform, then I'd agree. That's inhumane in my eyes though and I hope this breed dies out quickly.
I do use some smoke test scripts for the new starters to guide them towards the high risk areas that we want to look at for each build. After a few weeks they know the areas, don't usually need the scripts anymore, and expand (test around) the scripts anyway as required. To me that's a good use - the scripts don't need much maintaining, and they show people at a high level what should be checked before they leave the nest and develop their own understanding of what's important.
I think that most new starters need some help to become productive with a new application quite quickly. Even hiring the best people won't get away from the fact that, with help from documentation, we see results faster. If that learning from docs (including test scripts) is bolstered with help from human beings then I go with that. But you probably meant that when you said primarily anyway...

    I haven't heard of the off-piste metaphor. What springs to mind for me is that asking people to go off-piste is either only for experienced people or the reckless. Which is incidentally how some people see ET (not me, as you know).
    So sending a newbie down the scary path but with someone experienced as backup would be the way to go imo.

    Interesting read, thanks!

1. Scripts can give a new team member a neat (and potentially rather reliable) path through various actions and functions. If it's important to keep the scripts up-to-date, then one might ask the new tester to make the script reflect the current state of play. You'd get an engaged tester and a more accurate script. If they can use it as a stepping off point for their own work, then that's fine. I entirely agree; it's rotten to ask a person to do no more than follow a sequence of instructions and report when something doesn't match (but unfortunately not all that unusual in many lines of work).

      Having set up the 'off-piste' metaphor in the heading, I'd like to move away from it for the reasons you give. However, it struck me earlier today that off-piste skiing leaves behind it a temporary history of actual use, possibility and hazard. Off-piste skiers* are drawn to fresh and unskied powder, and so their tracks collectively increase to fill the skiable space. Until the next snow, skiers looking at a slope can see where previous skiers have been and where they have avoided, and the keen eye can even spot the kind of skier and the trouble they've got into. I wish I had something similarly visual for testing; various maps of the system, with heat trails of testing activity criss-crossing them, gently overlapping and fading until smoothed away by the next release.

      * yes, I know. And snowboarders. Just assume I'd rather not write 'snow users' or 'off-pisteurs'. As a man who zips about on tiny comedy blades**, and is therefore at the bottom of the pecking order of slopeside cool, I'm not about to take sides.
      ** don't laugh until you've tried. Much more fun, and differently versatile.
