Friday, December 14, 2012

Modelling super powers

tl; dr it's not just the tactics that matter

In this scenario, I've modelled five testers. Each has a super power.

  • One logs bugs more easily than the others – effectively, they log as many bugs as they can see. The others can only log 10 bugs for each bit of budget they consume.
  • One only logs big bugs – bugs with a cost of 10 or more. The others log any bug they find.
  • One learns three times more effectively than the others.
  • One switches tactic twice as often.
  • One finds it easier to retain their skills after switching tactic.
What difference might each of these qualities make?

Run the exercise a few times. Make some changes. You may find it easier fullscreen, and of course the XML is available to play with. Does the model match your experience?

More to the point, perhaps, how are you comparing the different testers in the model? How do you compare real testers on your team?

Friday, December 07, 2012

Diversity matters, and here's why

tl;dr – It ain't what you do, it's the way that you do it

We've got a model which tells us we have an hopeless problem. I promised some perspective.

Let's try throwing people at our problem. In the exercise below, we're using five testers. If a bug has a 1:100 chance of being found by one tester in one cycle, surely five testers should have a better chance.*

How much better? Run the thing and find out.

Less than impressed? That's because hard-to-find bugs are still hard to find, even with a few more people. Your one-in-five-million shot is not going to jump into your lap if you've only managed to make the chance of finding it one-in-a-million.

There's a key quality I've not changed in this model. We've said that some bugs are harder to find than others. We've not yet mentioned, or modelled, that my Mum merrily finds problems that have eluded me. The way that you don't see my bugs on your machine. The way that performance testing jiggles bugs by the bucketload out of systems which seemed to be working just fine, or the way that unit testing and usability studies find (mostly) entirely different collections of bugs.

Our model should reflect that any individual bug** should have lots of different likelihoods of being found. For this model, we're going to make the choice of likelihood depend on the tactic that is in use. Indeed, for this model, that's what differentiates and defines "tactic" – changing tactic changes the distribution of likelihoods across the collection of bugs.

Below, you'll find an exercise which again has five testers working in parallel. This time, each tester has their own individual profile of chances, and a bug that one finds easily may be much less likely to be found by another.

In the model, we do this by setting up tactics. Each tester has one tactic, which they use exclusively. Each tactic is set up in the same way across the full population of bugs – it's just a distribution of probabilities. If you were to look at one bug, you'd find it has five separate probabilities of being found. Have a play.

The difference is clear.

Diversity matters*. In this model, it matters a lot; more than budget, more than the number of testers.

For those of you who prefer analysis over play, also clear if you think about the chances of finding an individual bug. Tactic 1's million-to-one chance bug may be a billion-to-one for tactic 2, too, but tactic 3 might well see it as a hundred-to-one. Ultimately, the no-chance-for-that-bug tactic would continue to have no chance whatever your budget (or patience), but by having many tactics, one increases the chance of having a technique that with easily find that particular bug easy.

QED – but I hope the toys help the demonstrandum at least as much than the argument.

Note that a key assumption hidden in this model of diverse approaches is that the different tactics are utterly different. In the real world, that's hard. There's plenty of refinement to do to our model to make it a more accurate reflection of the world. However, the central idea remains: in this model of discovery, you get much more from changing your variety than from changing your effort.

This then is the perspective – in this exploratory environment, persistence is what leads to hopelessness. Variety gets you closer. Just for fun, here's a model with the five tactics, and just one tester – but this tester can switch tactics. I'll be mean, so they switch randomly, and each time they switch, their skill slides backwards. Look at the poor beggar ticking away; hardly ever gets over 50%.

See how well this works with just one tester.

One random tester does better***** than five monotonic testers? You're surprised by that conclusion? Enough with the rhetoricals: I have (metaphorical) knobs for you to play with.

The sharp-eyed will notice an extra button – I've finally given you a reset. Indeed, this is a rather more interactive machine than you've had so far – you can change the number of bugs and the cost model. You can also give (not entirely reliably) the machine a (not entirely reliable) "seed" to start from as it builds the model, which lets you replay scenarios. Be aware that the I've not sorted out a fully-intuitive workflow around set/start/stop/change/reset, nor have I tested this well (it's mine, and I'm too close to do a job to be proud of). I'd appreciate any feedback – be aware that behaviours may change in the near future.

If you want to dig deeper into the model, I've made a change that allows you to play with the machine offline. Download the .swf, and the Exercise.xml file from the same directory. Bung them in the same folder on your own machine, and the .swf will pick up your local copy. Have a play with Exercise.xml and see what you can come up with. I'll share interesting exercises and conclusions here, and you're welcome to post them on your own site. I'd like to hear about your postings, because then I'll know who to send updated machines to. I'll open-source this sometime,

There's lots further one can go with this model, and over the next few posts, we'll explore and illustrate the effects of some common constraints and balances.

It looks like I'll be teaching exploratory testing in Amsterdam early next year. I'm just about to set the dates. If you want 30% off the price for simply telling me you're interested, you've got a couple of days to catch the opportunity.

Cheers -


* maths people will know; (1- (1-0.01) ^ 5) ~ 4.9%, which is just a tad more unlikely than a 1:20 chance.
** for the purposes of this explanation, let's assume we can identify at least one bug.
*** this panders directly to my prejudices, so I'm pleased to reach this conclusion, and find it hard to argue against effectively. I'd be grateful**** if you felt able to disagree.
**** through gritted teeth, but still grateful.
***** better?