**tl;dr – you still need your imagination, even with real-life examples**

Magic pennies? Pshaw.

Let me put this another way.

This problem has been put to me frequently in my testing life. Here's one close-to-actual situation.

My client has a hundred trucks. Each has a bit of kit, and I've been told that the bit of kit needs to be replaced occasionally. Actually, not so occasionally – it's new kit, and I'm told that it's likely to fail at least once in the first hundred days' use.

So, how many trucks will experience that failure in their first hundred days? All of them? Also, how long should we test for? How many rigs should we use? How reliable is that suspiciously-round 1 in 100 figure?

As it happens, there's a bit of maths one can do. If the chance of a truck failing on any given day is 1%, then the chance of it *not* failing that day is 99%. The chance of it *not* failing for 2 days in a row is 99% * 99% (just over 98%). For 3 days, 99% * 99% * 99% (a tad over 97%).

Can you see where I'm going? The chance of a truck not failing for 10 days in a row is 99% * [99% another 9 times]. That's 99%^10.

For 100 days in a row, it's 99% ^ 100. Which is about 37%*.

So after a hundred days, I'm likely to still have 37 trucks, more or less, that haven't failed yet.

Which makes around 63 trucks that I need to go and mend**.
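If you'd rather check the arithmetic than trust me, here's a minimal sketch. It assumes what the maths above assumes: each truck has the same 1% chance of failing on any given day, independent of every other truck and every other day. The names (`survival`, `fleet_size` and so on) are mine, made up for illustration.

```python
# Assumption: every truck fails independently, with the same 1% chance each day.
daily_failure = 0.01
fleet_size = 100


def survival(days, p=daily_failure):
    """Chance that one truck gets through `days` days in a row without failing."""
    return (1 - p) ** days


for days in (1, 2, 3, 10, 100):
    print(f"{days:>3} days: {survival(days):.1%} chance of no failure")

# Expected number of trucks that still haven't failed after 100 days,
# and the number that will have needed at least one mend.
expected_unfailed = fleet_size * survival(100)
print(f"Still unfailed after 100 days: about {expected_unfailed:.0f} trucks")
print(f"Needing a first mend: about {fleet_size - expected_unfailed:.0f} trucks")
```

These are expected values. Any real fleet of a hundred trucks will land somewhere near 37 and 63, not necessarily on them.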

The maths is satisfying, but it doesn't tell me any more than the question I was first asked. Nonetheless, we know that all good testers have a practically unlimited supply of extra questions to ask, so we're probably not completely satisfied.

However, if I go grab my hi-viz jacket and get to work on the trucks, I'll get a better idea of what happens. I'll find that some days everything works as well as it did yesterday, and occasionally three new trucks phone in failed. I'll get an idea that I'll see more failures when there are more things that work – so as the period goes on, I'll see fewer and fewer. Some trucks could go on for ages (I'm sure that you've all heard of immortal lightbulbs, too. Survivorship bias – mostly.)
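You can get a pale imitation of the hi-viz experience with a few lines of code. This is a rough sketch, not the simulation from this post: it assumes the same independent 1%-a-day chance, counts only each truck's *first* failure, and uses names I've invented (`simulate_first_failures`, `seed`).

```python
import random


def simulate_first_failures(fleet_size=100, days=100, daily_failure=0.01, seed=None):
    """Return a list with the number of trucks failing for the first time each day.

    Assumption: every not-yet-failed truck has the same independent chance
    of failing on each day.
    """
    rng = random.Random(seed)
    unfailed = fleet_size
    new_failures_per_day = []
    for _ in range(days):
        # Each truck that hasn't failed yet gets one roll of the dice today.
        failed_today = sum(1 for _ in range(unfailed) if rng.random() < daily_failure)
        unfailed -= failed_today
        new_failures_per_day.append(failed_today)
    return new_failures_per_day


counts = simulate_first_failures(seed=1)

# Summarise in ten-day blocks, to see whether the phone rings less as time goes on.
for start in range(0, len(counts), 10):
    block = counts[start:start + 10]
    print(f"days {start + 1:>3}-{start + 10:>3}: {sum(block)} new failures")

print(f"Trucks with at least one failure: {sum(counts)}")
```

Change the seed and run it a few times. The quiet days, the occasional cluster, and the slow tailing-off all show up, but differently each time.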

Working on the trucks allows a visceral, complex experience. It takes a while to get, it's not terribly transferable, and it's hard to forget. You know it deeply and in many different ways. You are "experienced". The maths approach is different; the result is ephemeral, and you may remember the method more easily. To imagine its implications, you'll have to think hard. You are "expert"***, and because you can remember the method, you might be able to re-apply it in a different context.

In between these two, there are models and simulations. Models aren't reality, but neither are they primarily symbolic (at least, not on the outside). I hope that the right model might engender something between experience and expertise. For what it's worth, I think that asking "How long should I test for to be confident that I'm not going to see problem X much in real life" is a fair question, and I think that "It depends" is a rotten answer without some idea of what "it" might depend on.
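Here's one thing "it" might depend on, as a sketch. I'm reading "confident" in one particular way, which is my assumption: if the kit really does fail at some daily rate, how many truck-days of testing do I need for a given chance of seeing at least one failure? The chance of seeing none in n truck-days is (1 - p)^n, so I want the smallest n that pushes that below 1 minus my confidence. The function names and the `rigs` parameter are hypothetical.

```python
import math


def truck_days_needed(daily_failure, confidence):
    """Smallest number of truck-days giving at least `confidence` chance
    of seeing one or more failures, assuming independent daily failures."""
    # Solve (1 - p) ** n <= 1 - confidence for n, then round up.
    return math.ceil(math.log(1 - confidence) / math.log(1 - daily_failure))


def days_needed(daily_failure, confidence, rigs):
    """Calendar days of testing when `rigs` test rigs run side by side."""
    return math.ceil(truck_days_needed(daily_failure, confidence) / rigs)


for rigs in (1, 10, 100):
    days = days_needed(0.01, 0.95, rigs)
    print(f"{rigs:>3} rigs: {days} days for a 95% chance of seeing a failure")
```

So the answer depends on at least three things: how often you believe the failure happens, how sure you want to be, and how many rigs you can run at once. It also depends on the rigs behaving like the trucks, which no formula will tell you.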

I've given you three machines below. 10 trucks, 100 trucks, 1000 trucks. I've knocked out various noisy bits, but it's otherwise the same simulation. Have a play. You can change the budgets. Think about what the frequency of failure tells you, especially over time. While you play, just have in the back of your mind the ways that this kind of failure differs from the failures that we discover when exploring...

* We're assuming here that a once-broken truck is no more likely (or less likely) to break down again. We're also assuming that the non-broken trucks are at no greater chance of breaking. In one of the cases I'm thinking of, the "broken" truck was entirely functional as far as most people were concerned, so the broken trucks didn't get less use, and the working trucks didn't get more use. If you're thinking of an un-enlargeable fleet of trucks with broken axles, we've got different models.

** If I'm swift to mend, some of these probably will have needed to be mended more than once.

*** Nobody said that being experienced and being expert were mutually exclusive. You can be both, you can be either, most of us are neither outside our fields of interest.

