Quality Experience Infodeck

Holger Schmeisky

on 11 May 2016

Transcript of Quality Experience Infodeck

Quality Experience
Quality Assurance Without Separate Testers
Lutz Prechelt, Holger Schmeisky,
Franz Zieris, Freie Universität Berlin
Testers or no Testers?
Case Study
Grounded Theory
Quality = Business Value

OnM, Pay
Want to discuss Quality Experience
in your Team?

Drop me a message!

Quality Experience is a concept and a work mode that allows agile teams to assure their quality without separate testers.
It was discovered through observations of, and interviews with, software engineers in three teams from SoundCloud and ImmobilienScout24: one working with separate testers, two without.

Quality Experience centers on a tight field-use feedback loop, driven by a strong sense of responsibility, supported by test automation, and resulting in frequent deployments.
If the conditions are met, separate testers hamper the feedback loop more than they contribute to quality, so working without them is preferred.
Conventional wisdom is that separate quality assurance is good and there are often whole departments devoted to testing.
Agile methodologies like XP, Scrum, and Kanban are unclear on whether one should employ separate testers or not.
We observed that in the companies we work with, there are agile teams that employ separate testers and agile teams that don't, and we wanted to find out what the differences are.
A scientific case study is a way to deeply study phenomena in the real world in their actual context where it is not possible for the researcher to exert control. This makes it much more suitable for studying software engineering than other methods, like experiments or surveys.
In this case Holger Schmeisky observed the teams for about a week to get a fundamental understanding of their work and conducted interviews with representatives of all roles in the team (Developer, Product Owner, Team Lead, QA Engineer).
ImmobilienScout24 (or IS24 for short) is by far the largest real-estate web portal in the German-language area, with about 1.5 million offerings and about 10 million monthly visitors. Its core is real-estate offerings (sale or rent; houses, apartments, commercial property), but the portal also brokers financing, insurance, and many other services.

IS24 has about 180 software developers organized in about two dozen more-or-less cross-functional teams. Each team uses some (typically Scrum-ish) flavor of agile process. The portal software was originally a large, monolithic Java EE application, built by a department-structured organization. Much of the software has since been split into separately deployed services; the remainder is called the core application.

SoundCloud is a music sharing service: artists can present themselves and upload their own music (3 hours for free, more against payment); other users can browse this music, listen to it, comment on it, and share it on social networks. SoundCloud has about 10 million users and is usually among the world’s top 200 web sites. It supports web browsers, iOS and Android apps, and community-built apps based on the SoundCloud API.

The SoundCloud architecture started as a single Rails application (now called mothership) but since 2011 is gradually split into separate services. New services are realized in a variety of technologies. Any SoundCloud development team consists of developers only (there are no architects, testers, etc.) and most are vertical, i.e., in charge of one functional area completely. SoundCloud has around 80 developers overall.
Team Pay is responsible for the Buckster service that contains all payments-related functionality such as user subscriptions, fraud detection, and reporting. A few parts of Buckster were still in the mothership during our study. Team Pay consisted of two developers (previously three) and one product owner and used a lightweight Kanban process, without fixed iterations.
Team OnM (full name “Online Marketing”) develops a range of services with little end-user visibility, such as search-engine optimization, marketplace integration with partner portals, landing pages for AdWord campaigns, visitor and campaign analytics and reporting, and data export APIs.

The team, with formerly seven developers, had recently been split into a sub-team oriented towards routine tasks and another for new tasks; we talk only about the latter here. It consists of four developers, a technically knowledgeable product owner, and a technical lead, and uses a Kanban process without fixed iterations. When we asked the product owner how happy they were with the quality of their work, the answer was “Extremely happy.”

Team OffProf (full name “Offerings Professional”) is another team within IS24, but with a rather different character. OffProf develops some functional areas of the ScoutManager, a closed-group part of the portal used for creating advertisements by private individuals and real-estate agents. The ScoutManager is part of the core application, which is still very large, with over 10,000 source files that form a single Spring MVC application and several teams working on it.

The core application has a weekly release cycle as follows: all core-related teams provide a development snapshot every Tuesday at 10am, which is then tested by the separate QA department, handed over to the operations department, and deployed on Wednesday the next week (day 8 after submission); defect fixes can be inserted into an ongoing QA week, but still normally need to wait for a deployment Wednesday.

Quality in this context is not just the absence of defects. For all three teams, quality is a holistic attribute that covers most aspects of business value, from functional defects and gaps, through all kinds of attractiveness issues, to operational problems regarding deployability, scalability, monitorability, and so on.
Quality Experience is a concept that explains how and when teams
are able to assure quality well without separate testers.

It is not a theoretical model but founded on empirical observations
of software engineers in their daily work context.

Our results suggest the Quality Experience mode of
quality assurance has strong positive effects on business flexibility and developer motivation.

The mode can only be reached under certain circumstances. If it can be reached, separate testers would only get in the way. If it cannot be reached, working with separate testers is presumably preferable for the well-known conventional reasons.
Quality Experience is a mode of quality assurance and deployment in which the developers
(1) feel fully responsible for the quality of their software;
(2) receive feedback about this quality, in particular the quality of their recent changes, that is (2a) quick, (2b) direct, and (2c) realistic; and
(3) rapidly repair deficiencies when they occur.

These are the five core concepts: full responsibility; quick, direct, and realistic feedback; and rapid repair. Working in a context that emphasizes these means working in the Quality Experience work mode.
Each of the nodes in this picture is a factor for quality experience.
The connecting lines mean that one factor is influencing the other positively.
Almost all the nodes are related to each other somehow, so we only show the most important connections.

The fundamental precondition for an arrangement that leads to a strong Quality Experience appears to be an existing software architecture that sufficiently decouples the work of one team from that of another (here: separate web services communicating via HTTP).
Without such a Modular Architecture, much additional beyond-team coordination effort is required; it is not always clear which team owns a particular piece of code, and defect introduction frequency is higher.
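Loosely, such decoupling means the only thing two teams share is an HTTP contract. The sketch below is a hypothetical illustration (the service name and endpoint are invented, not SoundCloud's or IS24's actual APIs): one team owns a small "payments" service, and another team's code talks to it only over HTTP, so each side can change and redeploy independently.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class PaymentsHandler(BaseHTTPRequestHandler):
    """Hypothetical service owned entirely by one team."""

    def do_GET(self):
        if self.path == "/subscription/42":
            body = json.dumps({"user": 42, "plan": "pro"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep demo output quiet

def fetch_subscription(base_url, user_id):
    """Another team's client: it depends only on the HTTP contract."""
    with urllib.request.urlopen(f"{base_url}/subscription/{user_id}") as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Run the service on an ephemeral port and call it like a second team would.
    server = HTTPServer(("127.0.0.1", 0), PaymentsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_address[1]}"
    print(fetch_subscription(base, 42))
    server.shutdown()
```

The point is not the toy server but the boundary: as long as the response contract holds, either team can deploy without coordinating with the other.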
While Modular Architecture is a technical precondition, it enables a management decision: To hand over the power to deploy their applications to the teams. OnM and Pay are very deliberate about it:

"[We want] the team to own, end-to-end, the stuff they produce so there are no external dependencies, no external QA teams or anything that says you can release or you cannot". -- Manager at SoundCloud

Team OffProf is not able to do this, because their application is large and very difficult to deploy.

In the previous step, responsibility was assigned to the development teams; now it needs to be taken. In our cases it was, as this quote from a developer of Pay shows:
"In most cases it is not about getting woken up, but the implication that people are not able to upgrade [their SoundCloud subscription]. That is what we want to prevent. [...] I demand of myself that the system is running."

Together with this empowerment, developers are held accountable for their piece of functionality.
In the Quality Experience work mode, feedback is extremely important and has to be of very high quality: quick, realistic, and direct.

Feedback on the code written has to come very fast: ideally immediately after writing the code, at most within a few days.
Teams Pay and OnM release several times a day, meaning that feedback from the live system relates directly to code just written. Team OffProf has at least 8 days of delay between implementation and release, which was obviously too much.
No matter how good your testing is, it is always trying to emulate real customers and cannot match the depth and breadth of real use.
Pay and OnM usually roll out new features only in small slices, in order to reduce the impact on customers, but always with real customers (for example, rolling out a new payment service only for redeeming coupons).
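Rolling a feature out in small slices like this is commonly implemented by deterministically bucketing user IDs. The helper below is a generic sketch of that technique, not the teams' actual tooling (the function and feature names are invented):

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in a [0, 100) bucket for one feature.

    The same user always lands in the same bucket, so raising `percent`
    only ever adds users to the slice; it never flips anyone back out.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100
    return bucket < percent

# Hypothetical usage: serve the new payment path to 5% of real users.
if in_rollout("user-1234", "new-payment-service", 5):
    pass  # route this request through the new service
```

Because the slice consists of real users, the feedback stays realistic even while the blast radius of a defect stays small.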

Feedback from the live systems goes directly to the teams, displayed on large TV sets in their team areas.
There are no intermediate teams involved, even when reacting to problems in the live system.
This had several effects:
* Metrics are better understood (because the team defines them itself)
* The team can act on feedback more quickly (no intermediate steps)
* It creates a high sense of urgency (problems have to be fixed NOW!)

Team OffProf also had access to live metrics, but since they were part of a larger application, the data was intermingled and not very actively used.
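The "large TV sets" boil down to team-defined metrics plus thresholds. A minimal, hypothetical sketch of one such metric (not the actual monitoring stack of either company) could look like this:

```python
from collections import deque

class ErrorRateMonitor:
    """Team-defined live metric: error rate over the last `window` requests."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        self.outcomes = deque(maxlen=window)  # True = request succeeded
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        self.outcomes.append(ok)

    @property
    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def alarming(self) -> bool:
        """True when the metric should light up the team's dashboard."""
        return self.error_rate > self.threshold

# Hypothetical usage: 1 failure in the last 10 requests.
monitor = ErrorRateMonitor(window=10, threshold=0.2)
for ok in [True] * 9 + [False]:
    monitor.record(ok)
print(monitor.error_rate, monitor.alarming())  # 0.1 False
```

Because the team itself chooses the metric and the threshold, the alarm is meaningful to them, which is exactly what creates the sense of urgency described above.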

Deployment, monitoring, and taking responsibility for quality can take a lot of time. The development teams still have to be able to focus on development, so the solution is, obviously, automation.

The smaller part of the automation is deployment. Team OnM used to hand deployment artifacts over to the Ops team manually (with ensuing phone calls about whether deployments were successful).
To enable team deployment, they developed a web application that allows deployments with a single click.
Team Pay used an internal tool called Bazooka to deploy new instances of the service with just one console command.
This reduces the wall-clock time for a deployment to roughly a minute (for OffProf's legacy application it takes about a day!).
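Neither OnM's web application nor Bazooka is public, but the underlying idea (one click or command that runs all deployment steps in order, stopping at the first failure) can be sketched like this; the step names are invented placeholders:

```python
def deploy(steps):
    """Run named deployment steps in order; stop at the first failure.

    steps: list of (name, callable) pairs, e.g. build, test, release.
    Returns a small report instead of raising, so a dashboard can show it.
    """
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            return {"ok": False, "failed": name,
                    "completed": completed, "error": str(exc)}
        completed.append(name)
    return {"ok": True, "failed": None, "completed": completed}


# Hypothetical one-command deployment: every step is automated.
def build():   pass  # e.g. compile and package the service
def test():    pass  # e.g. run the automated test suite
def release(): pass  # e.g. push the artifact and restart instances

report = deploy([("build", build), ("test", test), ("release", release)])
print(report)  # {'ok': True, 'failed': None, 'completed': ['build', 'test', 'release']}
```

The key property is that a human triggers the pipeline but never performs any step by hand, which is what makes a one-minute deployment plausible.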

The larger contribution comes from almost fully automated testing. Deploying frequently clearly brings forward the need for automated tests that check very nearly all of what is necessary to assure correctness. You can see this clearly in this quote from a Pay developer:

"I know that when there is an error I might be woken up at 2 in the morning. That is enough motivation to say: I will [write] twice and thrice [as many tests] and make sure I will not be woken up [through an incident]."
Pay initially spent a lot of time on creating a suitably balanced testing pyramid, adding new types of tests and removing those that did not provide more assurance. The end result was heavy use of fine-grained unit tests, few integration tests on an HTTP level and very few end-to-end Selenium tests.
Team OnM similarly invested a lot of time into their test code, mainly parallelizing End-To-End tests and refactoring test code to make it easier to maintain.

Note that OffProf invests a lot of effort into automated testing as well; for example 30 complex web tests are executed with every continuous integration run, and 350 additional ones are run every 2 hours. But the team has a hard time making these tests adequate.
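The study does not publish the teams' test code, so here is a hypothetical illustration of the bottom of such a pyramid: a small payments-style function covered by many fast, fine-grained unit tests (the function name and rules are invented):

```python
def redeem_coupon(balance_cents: int, coupon_cents: int) -> int:
    """Apply a coupon to an outstanding balance; never go below zero."""
    if balance_cents < 0 or coupon_cents < 0:
        raise ValueError("amounts must be non-negative")
    return max(balance_cents - coupon_cents, 0)

# Bottom of the pyramid: many fine-grained unit tests, milliseconds each,
# run on every commit. Integration and end-to-end tests sit above these,
# in far smaller numbers, because they are slower and harder to maintain.
assert redeem_coupon(1000, 300) == 700   # normal redemption
assert redeem_coupon(200, 300) == 0      # coupon larger than balance
assert redeem_coupon(0, 0) == 0          # edge case
try:
    redeem_coupon(-1, 100)
except ValueError:
    pass  # invalid input is rejected
```

Shifting assurance toward such cheap tests is what keeps the suite fast enough to run before every one of several daily deployments.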
All of these factors (quick, direct, realistic feedback, plus automated tests and deployment) lead to a new strategy for handling faults:
* Faults in the live system do not occur too often, because automated testing is extensive.
* If faults do reach the live system, they are detected quickly through the excellent feedback.
* They can be repaired in a very short time, through automated deployment.

This definitely has an impact on motivation. Which team do you think this quote is from?

"Especially bugs [in the live system] are very annoying. [...] Sometimes I think 'What will our customers be thinking?'. But this is probably a developer thing, to be so angry with oneself for overlooking something. [...] And then it takes another 2 weeks ..."
This is the work mode that results from all these factors and allows teams to ditch separate testers. Teams can move to a much finer-grained iteration style, releasing new versions of services several times a week. This has important consequences:
* It makes the risk of not having separate testers approve releases bearable. The consideration is roughly as follows: a live fault might be costly, but only for its duration. Even if the fault is expensive, if it is live for only a very short time, its total impact is small. Compare this to team OffProf, where all except the largest bugs will remain live for at least 8 days, due to the deployment cycle.
* Defects are much easier to identify, because only small changes are made at once (compared to huge releases with many tickets).
* It makes realistic feedback easily available, allowing frequent A/B testing.
* It enables risk-reduction strategies for larger goals, e.g. piecemeal refactorings and migrations.
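The consideration in the first point is simple arithmetic: total impact is roughly cost per unit of time multiplied by the time the fault stays live. With invented numbers:

```python
def fault_impact(cost_per_hour: float, hours_live: float) -> float:
    """Total impact of a live fault grows with the time it stays live."""
    return cost_per_hour * hours_live

# Hypothetical figures: the same defect, costing 100 EUR per hour while live.
quick_fix = fault_impact(100.0, 0.5)     # repaired within 30 minutes
slow_fix = fault_impact(100.0, 8 * 24)   # waits out an 8-day release cycle
print(quick_fix, slow_fix)  # 50.0 19200.0
```

Under these assumptions, the same defect costs several hundred times more when repair has to wait for a weekly release cycle, which is why rapid repair makes skipping a separate approval step bearable.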
Lean Software Development incorporates most of these factors (but puts them together differently). You can map the concepts as follows:
Several concepts map to various topics under thinking tool 1 “Seeing Waste” [18, pp.4-8]: Automated Deployment, Automated Tests, and Frequent Deployments to “Waiting”; Direct Feedback to “Motion”; Rapid Repair to “Defects”.
Direct Feedback, Quick Feedback, and Realistic Feedback pertain to thinking tool 3 “Feedback” [18, pp.22-27], from principle 2 “Amplify Learning”, but are much more refined.
Empowered To Deploy maps to principle 5 “Empower the Team”, thinking tool 13 “Self-Determination” [18, pp.99-103].
High Motivation also comes under principle 5, in thinking tool 14 “Motivation” [18, pp.108-109], specifically as the motivation building blocks “Competence” (via the availability of feedback), “Progress” (via the deployment-feedback cycle plus Rapid Repair), and “Belonging” (via Co-define Requirements, which again is a part of tool 13 “Self-Determination”).
Automated Tests maps to principle 6 “Build Integrity In”, thinking tool 20 “Testing” [18, pp.145-149].
Frequent Deployments, as the key outcome of Quality Experience, maps to principle 4 “Deliver as Fast as Possible” [18, pp.69-92] and is also a core topic of at least three currently-popular streams of lean thinking: Kanban [2], Continuous Delivery [9], and DevOps [10].
