Wednesday, November 28, 2012

Enumeration hell

tl;dr some bugs are beyond imagining

"Rational people don't count bugs."

There's a rash statement. Let's say that rational people who do count bugs ought to count other, less pointless more meaningful things, too.

Bugs* are rotten to count. There are plenty of posts** about this, and I won't go over the same ground here. Counting bugs is a bit like counting holes – superficially obvious until someone takes a shovel to your cheese.

But the big problem with a bug count is that it summarises a potentially useful collection of information into a number that is mostly meaningless. A single nasty that makes the wheels fall off is worth any number of niggles that make the horn too loud. Unless you're driving a clown car.

In our idealised model, we're counting surprises because it's interesting to see how many are left. None is none on any scale, and if there's none, we're done. We're still not done if we've got one left, because that one might be a stinker.

You've noticed that I've only given you one knob to twiddle*** on these toys. You only get to change the budget – you don't get to change the context****. This is a cheap manipulation on my part, because I've been asking you to concentrate on where you might set that budget to feel reasonably confident that the thing is tested.

So far, we've not considered bug stink in our model. It's time that changed.

In the same way that our model gives each bug a chance of being found, it gives each bug a quality I'll call cost. That's probably not the best word, but it's the one I've chosen for now*****. I'll give it a local meaning. Cost is the amount by which the value of the system goes down when it contains the bug. Quality is value to someone. Trouble makes that value go down. Cost, here, is not cost of fixing the bug. It's the cost of leaving it in, and it's the cost to the the end users.

Bugs aren't made equal, so we'll need to consider a distribution again, this time of (our local definition of) cost. Experience leads me to believe that most bugs have low cost, some bugs have higher cost, and a very few (so few that they might not exist in a given system) have astronomically large costs that outweigh the value of the system.

In earlier examples, each bug had the same cost. The distribution I've chosen to use in this model, to match my experience, is called a "power law" distribution. Power law distributions fir lots of things observed in the real world, such as city sizes, distribution of wealth, and the initial mass of stars. Power law maths underlie the Pareto Principle (aka the 80:20 rule), and Taylor's Law****** (and , more incomprehensibly, phase changes). If you want to dive into this, set your head up with this handy note comparing the similarities of Power/Zipf/Pareto in a real (if rather antique) context.

Why have i picked this distribution? Because it feels right. Instinct is no justification, so you can expect that we'll have a look at other distributions later. For now, though here's a fourth assumption:

4        The cost of a bug to (all the end users over the life of a product) has a power law distribution.

Enough of the hands-waving. Let's play.

Below you should find an identical machine to last time's closing toy, but with costs set to match a pareto-style distribution. You'll quickly see that there are two "stuff found" numbers, and that the size of the yellow dot is related to the cost. Run this a few times.

Don't be surprised if, occasionally, you see a simply huge yellow dot. Try hovering over the top right of the square set of 400 circles, and click on the ? you see to reveal a god-like understanding of how much trouble this system is hiding. Know that, generally, you'll see the total trouble is around 1000*******. If you see around 2000, expect that one of the bugs has a cost of 1000. If you happen to see around 11000, you've probably got a fat 10K bug hiding away.

In our most recent outing, I hope you got a feel for why it's hard to use a bug rate to say that you're done testing. If you play with the models in this posting, you may get an idea for how 'not done' feels in terms of the cost of what you've left behind.

I hope you're still considering where your omnicognisant self would set a reasonable budget so you could say with confidence that you'd done enough. Have a look at the left-hand graph of what's been found. It's still very front-loaded, but you'll see the occasional big spike as a particularly troublesome bug is revealed.

Let's rack up the difficulty another notch. I set up the model above so that the budget and the bug distribution meant that you got to find most of the bugs in a relatively brief exercise. Of course, that's no use at all. Here's another; more bugs, smaller budget. Crucially though, in this model plenty of the bugs are very hard to find indeed. You're not going to find the lot, so that's what this model looks like.

Hopeless, isn't it? If the real world looks anything like our model, how can anyone be bothered to give a sensible answer when asked to set out a budget?

Next time, all being well, we'll approach these frustrations sideways on. We won't find clarity, but we may find perspective.

* I'm not going to define "bug", because it's a vague word, and therein lies its power. But if there's a scale that runs through vague to countable, then I suggest these two ideas are at opposite ends.
** Try Michael Bolton's Another Silly Quantitative Model and Elisabeth Hendrickson's What Metrics do you use in Agile.
*** there's lots more interactivity to come. For now though, mull on how it must feel to be a leader whose only effective control is over budget-setting, then be nicer to your poor distant Chief Money Officer next time.
**** suggestions accepted, but without any guarantee they'll be used.
***** "Law" appears to be used by some scientists in a similarly-imprecise way to the way some lawyers use "Proof". Business people naturally choose to use both words with abandon. I would treat the word "Law" here with as much scepticism as you might treat it in Moore's Law. They're empirical laws, and describe, rather than necessarily account for, system behaviour.
******* 1000 what? I don't care. Stop your whining and go count the number of things in this list.

No comments:

Post a Comment