Tuesday 1 October 2013

Problem patterns


I had a puzzling problem. I had noticed for some time that one particular automated scenario often failed, and yesterday it was time to dig deeper into that. The scenario was very simple: it searched for and displayed a person's name and address, then checked that the person's name matched the search criteria. Not a revolutionary or even finished scenario, which is why it had been ignored earlier on. We are still building this thing up.

When it failed, the screen dumps that were automatically saved on every failure confirmed that another person's name was in fact shown on the screen - and not just the name: a totally different person had been found instead.

To debug it, I created a special command to take a screen dump whenever I wanted, which I thought would come in handy anyway, and then plastered it all over the scenario so I could follow each step.

I was baffled when the next test run showed that the right person was found and displayed. Aha, I thought, this is probably like following a link. An internet posting suggested giving the web page focus before activating a link. I guessed that taking a screen dump could have that effect.
Quickly I disabled all the screen dump commands and ran it again. Theory confirmed: the scenario found the wrong person this time.

Back to the code: I enabled the screen dump right before the search, thinking that it would focus the page and make it catch the search criteria correctly, and ran the whole thing again. My theory was confirmed again - it really did find the right person now.
But saving a screen dump unnecessarily was not really what I wanted, so instead I used the set-focus command and checked that it worked. It failed dramatically - a quick inspection showed that I had ruined the scenario with my sloppy copy-pasting. Dammit. After I fixed this, the right person was found and I happily leaned back. Victory!

All I needed to do now was simply to build the whole thing and I was ready to finish up. My mandatory test run came as a surprise. It failed again. Flabbergasted, I decided to run it again, just for the heck of it and to give me time to think it over once more. I lost my breath... it passed.

Suddenly it dawned on me. I had not bothered to investigate the pattern of the bug. I could possibly have guessed that something funny was going on, since I knew that it didn't fail all the time. But I just didn't follow that thread.

After running the scenario 20 times in a row without changing anything, the pattern was clear. It failed exactly every second time. Reviewing my efforts, described above, it was clear to me that my test runs had matched this pattern exactly, and that I had drawn false conclusions on that basis. In fact, probably none of my changes had had any effect. Shoot.
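
The kind of check that would have exposed this up front can be sketched in a few lines of Python. This is only an illustration, not the tooling used here: `run_repeatedly`, `describe_pattern` and `make_alternating_scenario` are hypothetical names, and the stand-in scenario simply fails on every second call to mimic the behaviour described above.

```python
from typing import Callable, List


def run_repeatedly(scenario: Callable[[], bool], runs: int = 20) -> List[bool]:
    """Run the same scenario `runs` times, changing nothing in between."""
    return [scenario() for _ in range(runs)]


def describe_pattern(results: List[bool]) -> str:
    """Give a rough label to a sequence of pass (True) / fail (False) results."""
    if all(results):
        return "always passes"
    if not any(results):
        return "always fails"
    # Every result differs from the one before it: pass, fail, pass, fail...
    if all(a != b for a, b in zip(results, results[1:])):
        return "alternates"
    return "intermittent"


def make_alternating_scenario() -> Callable[[], bool]:
    """Hypothetical stand-in for the real scenario: fails on every second call."""
    state = {"calls": 0}

    def scenario() -> bool:
        state["calls"] += 1
        return state["calls"] % 2 == 1

    return scenario


if __name__ == "__main__":
    results = run_repeatedly(make_alternating_scenario(), runs=20)
    print("".join("P" if ok else "F" for ok in results))
    print(describe_pattern(results))
```

With a real scenario plugged in instead of the stand-in, a result of "alternates" before any change is made is a strong hint that a pass right after a "fix" proves nothing.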

I thought it was a valuable lesson about investigating a bit more before trying to fix a bug. I mean, had I bothered to run the scenario just two or three times in a row, I would have seen it right away. Probably.

And the bug? It evaporated with the next Get Latest... another funny day in automation land, eh?
