On joining this organisation as a tester I was astounded by their approach. On my first day a bad release went to production and had to be rolled back off the live platform. I felt that I had entered the Wild West of software development; it appeared reckless.
As time went on I saw the process repeated; there were more good releases than bad, and things started to make a certain amount of sense. In a twice-daily release cycle there isn't time to spend on detailed performance testing, usability sessions, or security audits. Though I became more comfortable, I had a nagging doubt that this was not the way that others were solving this problem.
At CITCON this weekend I had the opportunity to find out, by facilitating a session titled "How do you incorporate non-functional testing in continuous delivery?".
The first thing that struck me, after sharing what I had just described, was that no-one jumped in to volunteer a better solution. People started to talk around the topic, but not to it. I had to repeat my question several times before one attendee said "We focus on functionality and kind of ignore non-functional testing". I felt this statement reflected the position of many in the group.
Someone proposed that the first obstacle to incorporating non-functional testing is a lack of written non-functional requirements. People can quickly tell that something is not working when it is too slow, difficult to use, or succumbs to malicious infiltration. Defining what is expected of the application in terms of performance, usability, and security is much more difficult. The rapid pace of continuous delivery, coupled with a relatively robust process for testing in production, creates a compelling excuse not to stop and think about non-functional requirements.
Where requirements are present, how do testers find time to test them? The general consensus was that the requirements would form the basis for a suite of discrete automated checks designed to alert the tester: a prompt to hold the release while the tester investigated the problem. Pre-release non-functional testing would be driven by a failing check.
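To make that idea concrete, here's a rough sketch of how written requirements might map to a suite of discrete checks that gate a release. The requirement wording, the stand-in results, and the release_gate function are purely my own illustration, not something anyone at the session showed.

```python
# A minimal sketch, assuming non-functional requirements have been written down
# in a form that a small function can verify. Requirement text and the stand-in
# results below are illustrative only.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NonFunctionalCheck:
    requirement: str            # the written requirement this check covers
    passed: Callable[[], bool]  # returns True if the requirement is currently met


def release_gate(checks: List[NonFunctionalCheck]) -> bool:
    """Run every check; any failure is a prompt to hold the release and investigate."""
    failures = [check.requirement for check in checks if not check.passed()]
    for requirement in failures:
        print(f"HOLD RELEASE: {requirement}")
    return not failures


# Stand-in results so the sketch runs on its own; a real suite would measure these.
checks = [
    NonFunctionalCheck("Search results return in under 4 seconds", lambda: True),
    NonFunctionalCheck("Login page sends a Content-Security-Policy header", lambda: True),
]

if release_gate(checks):
    print("All non-functional checks passed; the release can proceed.")
```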
In the case of performance, the check may fire when a threshold is exceeded, or highlight a marked degradation that still falls within the threshold; for example, if the page load time jumps from 0.3s to 2s and the threshold is set at 4s, we would still want to know about the change. Some in the audience had already implemented lightweight, targeted, automated performance checks that were running in their continuous integration environment.
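As a sketch of what such a performance check could look like, the snippet below applies both a hard threshold and a degradation rule against a stored baseline. The 4-second threshold comes from the example above; the three-times-baseline rule and the function itself are my own illustration.

```python
# A minimal sketch of a lightweight performance check: fail above a hard threshold,
# warn on a marked degradation that still falls within it. The degradation factor
# is an assumption for illustration.
THRESHOLD_SECONDS = 4.0    # hard limit: fail outright above this
DEGRADATION_FACTOR = 3.0   # warn if this many times slower than the baseline


def check_page_load(measured: float, baseline: float) -> str:
    """Return 'fail', 'warn', or 'pass' for a single page-load measurement."""
    if measured > THRESHOLD_SECONDS:
        return "fail"    # hold the release
    if measured > baseline * DEGRADATION_FACTOR:
        return "warn"    # within threshold, but worth investigating before release
    return "pass"


# The example from the text: a jump from 0.3s to 2s stays under the 4s threshold,
# but we would still want to know about it.
print(check_page_load(measured=2.0, baseline=0.3))   # -> warn
print(check_page_load(measured=5.1, baseline=0.3))   # -> fail
print(check_page_load(measured=0.35, baseline=0.3))  # -> pass
```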
As the conversation turned to security there was doubt that the same principle could be applied. However, one tester in the audience was doing just this, using the results of security audits to create scripted security checks. Though vigilance is required to keep up with evolving security threats, he felt that the maintenance overhead was no different from that of any other automated test suite.
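I didn't see his scripts, but a scripted security check might look something like the sketch below, where findings from a past audit become a list of response headers that every release must send. The header list, the placeholder URL, and the use of the requests library are my assumptions, not details from the session.

```python
# A minimal sketch of turning audit findings into a scripted check, assuming the
# `requests` library is available. The header list and URL are placeholders.
import requests

REQUIRED_HEADERS = [
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "Content-Security-Policy",
]


def missing_security_headers(url: str) -> list:
    """Return the audit-required response headers that the page fails to send."""
    response = requests.get(url, timeout=10)
    return [header for header in REQUIRED_HEADERS if header not in response.headers]


missing = missing_security_headers("https://staging.example.com/login")  # placeholder URL
if missing:
    print(f"HOLD RELEASE: missing security headers: {missing}")
else:
    print("Security header check passed.")
```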
Finally we spoke about usability. The first thought from the audience was that perhaps A/B testing is how most companies achieve this in a continuous delivery environment. Those assembled were familiar with the concept, as New Zealand is often used as the trial region for new Facebook features. Some used this approach, though others argued that if your focus is user loyalty or sales you may not want to risk alienating a proportion of your clientele by giving them a weaker design.
Interestingly, there were those who thought that the same principle of checks might even work for usability, in particular for the accessibility aspects, which often require that the application can be used by a machine. Tools to check for tab order, alternate text in images, appropriate colour and contrast, and valid HTML were all mentioned.
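For instance, one of those machine-checkable rules, that every image carries alternate text, can be scripted with nothing more than the standard library. The sample HTML below is invented for illustration; the tools mentioned in the session cover far more than this single rule.

```python
# A minimal sketch of one accessibility check: flag <img> tags without alt text.
# Uses only the Python standard library; the sample HTML is illustrative.
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collect <img> tags that have no alt attribute, or an empty one."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if not attributes.get("alt"):
                self.missing_alt.append(attributes.get("src", "<unknown src>"))


sample_html = """
<html><body>
  <img src="logo.png" alt="Company logo">
  <img src="chart.png">
</body></html>
"""

checker = MissingAltChecker()
checker.feed(sample_html)
if checker.missing_alt:
    print(f"Accessibility check failed, images without alt text: {checker.missing_alt}")
else:
    print("All images carry alternate text.")
```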
The session finished with a conversation about whether this would really work. The arguments against seem to be invalidated by the type of organisation that chooses continuous delivery. Organisations that make frequent releases a priority and pride themselves on responsiveness must acknowledge that this comes at the expense of quality. It's fine if a user sees something that isn't quite right, so long as it's only there briefly. I found it interesting that those with real-world experience in continuous delivery often worked in an iconic or monopolistic organisation where the user has strong brand loyalty and little choice.
Are you using continuous delivery? How do you incorporate non-functional testing?