Tuesday, January 10, 2012

Uncommon Ways of Managing ET #04 - Post-partum Labelling

tl;dr – tl;dr your ET notes to see where you've been

I’ve worked with plenty of testers who don’t timebox their work, don’t set out a charter before testing, and don’t do formal debriefs. Clearly, they’re not following session-based testing, but that doesn’t mean they’re necessarily doing bad exploratory work. Indeed, some of the most brilliant exploratory testers I’ve worked with are fully able to do all these things yet choose not to for much of their exploratory testing.

Personally, I almost always have a timebox, and find I prefer my results (but not my activity) if I make good notes. I can find charters trivial or restrictive, and debriefs can lead me to remember only the edited highlights of my exploration – so if my debrief sucks, my memory can be less useful than if I’d not debriefed at all.

Charters, timeboxes, notes and debriefs have a value to the team and the project as well as to the tester. If the team habitually relies on them, but an individual works best without them, then you’re faced with a choice of whether to force that tester towards an ineffective discipline, or whether to damage team performance. Which is no fun.

This then is a simple and obvious alternative, but I’m not aware of much that has been written to describe it. Nonetheless, I’m sure that many readers will recognise the activities below, and I can’t claim that this is in any way novel. Perhaps if no one’s written about it, it doesn’t seem legitimate, so no one writes about it. Perhaps I’ve just forgotten what I’ve read. Anyway, the following is a collection, and to that extent an imaginary extension, of stuff that has worked for me. I’m going to call it Post-partum labelling*. If you’ve got a better name, or know where someone else has named it, super. Comment away.

After a chunk** of testing, the exploratory tester describes their work in a sentence or two. They write this up in public. For example:
8 Jan – 60 minutes: Ed used a JavaScript profiling tool to analyse the web app.
8 Jan – 120 minutes: Sue spent 2 hours exploring reported instabilities related to switching sort order while autosaving.
9 Jan – 180 minutes: Brin and Rudi spent 30 minutes watching two users interact with the demo app for the first time, and spent the next 60 minutes reviewing and annotating video of the sessions.
10 Jan – 180 minutes: Jonno spent 3 hours on batch input, generating 3088 files that together contained all possible orderings of 5-step transactions.
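As an aside, here’s one way Jonno’s batch generation might be sketched. The post doesn’t say how he built his 3088 files, and plain permutations of five distinct steps give only 5! = 120 orderings, so his generation rule was presumably richer (repeats, subsets, interleavings). The step names below are made up for illustration.

```python
import itertools
from pathlib import Path

# Hypothetical step names -- the post doesn't say what the 5 steps were.
STEPS = ["open", "validate", "transform", "commit", "notify"]

def write_ordering_files(out_dir: str) -> int:
    """Write one input file per ordering of the five steps.

    Each file lists the steps of a single transaction, one per line.
    Plain permutations of 5 distinct steps give 5! = 120 files.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for i, ordering in enumerate(itertools.permutations(STEPS)):
        (out / f"txn_{i:03d}.txt").write_text("\n".join(ordering) + "\n")
        count += 1
    return count
```

Feeding every ordering to the batch input, as Jonno did, then sorts the results into correctly accepted, correctly rejected, and everything else.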

This works well if you’ve set aside time for experienced and self-directed explorers to test. If you’re expecting a terse diurnal list like the one above, you might find it to be a good fit with daily news. It’s perhaps not such a good fit if you’ve got testers who have problems with focus, or if your test approach means that your list grows by more than half a dozen lines a day.

The list won’t help you know where testing is going, but it’s great to help you know where it’s been. Everyone in the team can see who explored what and when, so you know who to talk to, you know what’s been hit recently, and your memory and understanding of the system’s recent history have enough to fill in the blanks. The team knows what it is paying attention to, and knows where individual interests lie. I think this is generally more useful than having an obscured testing genius bringing the system to its knees in interminably unfathomable ways.

Writing a post-partum label helps me put most recent test activity behind me, and allows me to think diversely as I enter the next round. I like knowing that I’ll need to write a public one-line summary of my hours of discovery; it helps maintain focus.

While I like a timebox, you might not. I wouldn’t insist on timeboxes if I was doing post-partum labelling. The people in the team know the budget, and they’re already trusted. The exploration is done when it’s done; forcing a timebox is a silly micromanagement. However, if people on your team are prone to pissing away their time and don’t embrace timeboxes or similar tools as part of their personal discipline, they’re probably not the best people to be doing post-partum labelling.

It’s time to change approach when your post-partum labels turn into “looked at login, again” or “checked out last week’s bugfixes”. If your label can be made before exploring, then it probably should be. Post-partum labels arrive after, and may not fit what you would have expected at the start. If you’re exploring, this is often a good thing.

Please, don’t get the impression that the label is an adequate substitute for notes. Sometimes, awfully, unfortunately, that’s what it is. Try to avoid this.

I’ve used similar approaches when I’ve been the exploratory addition to a team that has been relying solely on scripted or massive, confirmatory tests. I found it helpful when we had more test ideas than we could easily manage, and yet had target pathologies, observations and triggers that urgently called for our attention. Post-partum labelling helped me fit my work with other explorers and the rest of the team, helped me gain trust by offering visibility, and acted as a trigger and conduit for other people to bring me ideas. It let my team spin very swiftly back through a couple of weeks of exploration, identifying which set of notes might hold relevance. It helped explorers who weren’t happy with SBT fit into a team that was trying to gain the disciplines of SBT. It wasn’t much good for assessing coverage. It didn’t link to requirements. It was rubbish for steering. But I liked it.

I’m very tempted to extend the idea further. I want to capture the information electronically. I want to add tags, to allow me to analyse where we’ve been spending time. I’m keen to describe problems found. I’d like to try using Stefan Butlin’s interesting TestPad web app (and I shall, it’s neat). However, these adjustments change the emphasis of the list. Have a look:
8 Jan – 60 minutes: Ed used a JavaScript profiling tool to analyse the web app. [code, performance, UX] We’re spending plenty of time inside check_constraints(), which looks recursive.
8 Jan – 120 minutes: Sue spent 2 hours exploring reported instabilities related to switching sort order while autosaving. [instability, UX, autosave] She found a reproducible crashing bug, logged a couple of UX issues, and identified potential exploitation.
9 Jan – 180 minutes: Brin and Rudi spent 30 minutes watching two users interact with the demo app for the first time, and spent the next 60 minutes reviewing and annotating video of the sessions. [UX] We identified and logged UX issues around context menus, the hiding menu bar, and error messages.
10 Jan – 180 minutes: Jonno spent 3 hours on batch input, generating 3088 files that together contained all possible orderings of 5-step transactions. [batch, instability] The system correctly accepted 182, correctly rejected 2900, but hung on 2 that it should have rejected. No bugs logged yet, as we think this may be to do with a mistake in the configuration data in the test system.
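If you did capture the list electronically, the tag analysis I’m tempted by is cheap. This is a minimal sketch, assuming label lines keep the shape used above (“8 Jan – 120 minutes: … [tag, tag] …”); the regex and function name are my inventions, not anything from TestPad.

```python
import re
from collections import Counter

# Assumed line shape: "8 Jan – 120 minutes: <summary> [tag, tag] <findings>"
LABEL_RE = re.compile(r"–\s*(\d+)\s*minutes:.*?\[([^\]]+)\]")

def minutes_per_tag(lines):
    """Total the minutes recorded against each tag across a set of labels."""
    totals = Counter()
    for line in lines:
        m = LABEL_RE.search(line)
        if not m:
            continue  # untagged or unparseable label -- skip it
        minutes = int(m.group(1))
        for tag in m.group(2).split(","):
            totals[tag.strip()] += minutes
    return totals
```

Run over the four labels above, it would tell you, for instance, that UX has soaked up time in three of the four chunks, which is exactly the “where have we been spending time” question the tags are for.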

Do you find yourself skipping over stuff now? I do. It’s as if it’s all too much to hold together. You’ll be keeping this information somewhere else, too, I expect – and I think that’s where it should stay. Keep the list single-purpose. You’ll find it lives in people’s heads more easily and more consistently, becoming part of the shared consciousness of the test team. And how cool is that?


* made-up name. Obviously. Post-partum is a Latin term used to refer to the mother after giving birth (as opposed to post-natal, which apparently applies to the baby). You know what a label is. I want to get across the idea of a tester giving their work a unique and meaningful title, once it’s done.
** a chunk? What’s a chunk? I find that my mind merrily organises memory and activity, and groups the similar and temporally-close. If you have control over your interruptions, you’ve come to the limits of your chunk when you choose to change task. Sometimes, you don’t choose consciously. It’s still a chunk. My chunks of time testing are often hours. Writing, just minut… hey! A squirrel***!
*** I can see six, right now, in the evergreen oak outside my window. No, seven. Five. A parrot!

2 comments:

  1. What about calling it "Post-hoc Labeling"?

  2. I've consciously avoided post-hoc for a couple of reasons. Firstly, lots of the kinds of people who might be aware of this blog know that ad-hoc testing is a neat term, poorly used. I wanted to use something that wouldn't ask for direct comparisons. Secondly, although my Latin is awful, I'm aware that the opposite of post is ante, not ad, and I couldn't face my inevitable pedantry at the inevitable moment when someone insists that post-hoc is the opposite of ad-hoc.

    Now I've written it down, I rather like the implication that an exploratory test is conceived and developed by a tester, and that when it has finished being made, its results pass into the world and out of the direct control of the person who conceived it. However, that derivation wasn't at all planned, and could be the result of anything from coincidental rationalisation to an over-active unconscious.

    An aside: Google translate works as expected on "post hoc", but currently reckons "post-hoc" means "this the Post-". Which is what I might remark to the wife tomorrow when the letters come through the door. Just the once, though. Could get tedious.
