Friday, 12 August 2016

Human centered test automation

The opening keynote at CAST2016 was Nicholas Carr. Though his talk was live streamed, unfortunately a recording is not going to be published. If you missed it, much of the content is available in the material of a talk he delivered last June titled "Media takes command".

Nicholas spoke about typology of automation, the substitution myth, automation complacency, automation bias and the automation paradox. His material focused on the application of technology in traditionally non-technical industries e.g. farming, architecture, personal training.

As he spoke, I started to wonder about the use of automation within software development itself. Specifically, as a tester, I thought about the execution of test automation to determine whether a product is ready to release.

Automation providing answers

Nicholas shared an academic study of a group of young students who were learning about words that are opposite in meaning e.g. hot and cold. The group of students were divided in two. Half of the students received flashcards to study that stated a word with the first letter of it's opposite e.g. hot and c. The other half of the students received flashcards that stated both words in their entirety e.g. hot and cold.

The students who were in the first group performed better in their exam than those in the second group. Academics concluded that this was because when we need to generate an answer rather than simply study an answer, then we are more likely to learn it. This phenomenon is labelled the generation effect.

On the flip side, the degeneration effect is where the answers are simply provided, as in many automated solutions. Nicholas stated that this approach is "a great way to stop humans from developing rich talents".

It's interesting to consider which of these effects are most prevalent in processing the results provided by our continuous integration builds. I believe that the intent of build results is to provide an answer: the build will pass or fail. However, I think the reality of the result is that it can rarely be taken at face value.

I like to confirm that a successful build has truly succeeded by checking the execution time and number of tests that were run. When a build fails, there is a lot of investigative work to determine the real root cause. I dig through test results, log files and screenshots.

I have previously thought that this work was annoying, but in the context of the degeneration effect perhaps the need to manually determine an answer is how we continue to learn about our system. If continuous integration were truly hands-off, what would we lose?

Developing human expertise

Nicholas also introduced the idea of human centered automation. This is a set of five principles by which we see the benefits of automation but continue to develop rich human expertise.

  1. Automate after mastery
  2. Transfer control between computer and operator
  3. Allow a professional to assess the situation before providing algorithmic assistance
  4. Don't hide feedback
  5. Allow friction for learning

This list is interesting to consider in the context of test automation. The purpose of having test automation is to get fast feedback, so I think it meets the fourth point listed above. But against every other criteria, I start to question our approach.

When I think about the test automation for our products, there is one suite in particular that has been developed over the past five years. This suite has existed longer than most of our testers have been working within my organisation. There is logic coded into the suite for which we no longer have a depth of human expertise. So we do not automate after mastery.

The suite executes without human interaction, there is no transfer of control between computer and operator. Having no tester involvement is a goal for us. Part of providing rapid feedback is reliably provide results within a set period of time regardless of whether there is a tester available.

The suite provides a result. There is human work to assess the result, as I describe above, but the suite has already provided algorithmic assistance that will bias our investigation. It has decided whether the build has passed or a failed.

Finally, the suite is relatively reliable. When the tests usually pass, there is no friction for learning. When the tests are failing and flaky, that is when testers have the opportunity to deeply understand a particular area of the application and associated test code. This happens, but ideally not very much.

So, what are the long term impacts of test automation on our testing skills? Are we forfeiting opportunities to develop rich human expertise in testing by prioritising fast, reliable, automated test execution?

I plan to think more about this topic and perhaps experiment with ways to make our automation more human centered. I'd be curious to hear if other organisations are already exploring in this area.

Other posts from CAST2016:

1 comment:

  1. Great thought provoking post, Katrina.