Quality assurance in agile projects
Quality in agile projects is created not at the end, in acceptance testing, but at the very beginning, in dialog. Here is why that is the case and what often goes wrong.

Quality assurance in agile projects starts not with testing but with talking. Teams that clarify requirements early, derive acceptance criteria analytically and classify risks reduce errors structurally. Tests build up in stages: static analysis, module testing, integration testing, system testing. A stage may be omitted, but only deliberately and with a known risk.
Key Takeaways
- Early involvement of the test team in the requirements phase prevents expensive rework, because acceptance criteria are already checked for testability when the user story is written.
- Those who consciously reduce test scopes must explicitly name the risk and regularly check whether it is still acceptable, otherwise unnoticed quality gaps will arise.
- Static code analysis and four-eyes reviews are the fastest quality assurance measures in the process because they catch errors before the code is even executed.
- Artificial intelligence will derive and execute test cases from acceptance criteria, but it requires human control at a meta-level because an AI is only as good as its training.
Talking comes before coding
Quality assurance in agile projects does not start with testing, but with conversations. Before anyone writes code or builds a function, it is important to clarify what is to be created and where the journey is going. Only from this understanding can you derive how to test the whole thing.
The dialog is the most important phase in the entire development process, and it can be the longest. What is discussed clearly at the beginning shortens everything that follows. This doesn’t just apply to agile. Early involvement matters in the classic approach too, but the agile framework forces everyone to sit down at one table right at the start.
In the classic approach, testers could duck away for a long time: let the requesters do their part first, then look at it later. This convenience is no longer available in an agile environment. Testers must actively make sure that all parties are brought in at an early stage, including themselves.
How a requirement becomes a testable user story
Requirements end up in the backlog, typically as a user story. An example: “As Thomas, I want the button to be green so that I can distinguish it from the red background.” A story like this is the start, not the end of the work.
The acceptance criteria emerge from the story during the conversation. Which button is meant? Which color code exactly? Should similar buttons be included? How does the button behave when clicked? Each of these questions sharpens the requirement and makes it verifiable.
If you listen with a tester’s ear, you will recognize early on the points that will later count for acceptance. This is exactly where the value becomes apparent: if the tester is already involved in formulating the story, test thoughts flow into the requirement before a single line of code exists.
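The clarifying questions above can be written down as checkable acceptance criteria. The following is only a minimal sketch: the `Button` model, the function `check_acceptance_criteria`, the button name `submit` and both color codes are assumptions made up for illustration, not values from the story.

```python
from dataclasses import dataclass

# Hypothetical values a team might agree on in the conversation.
AGREED_GREEN = "#2E7D32"
BACKGROUND_RED = "#C62828"


@dataclass(frozen=True)
class Button:
    """Illustrative stand-in for the UI element under discussion."""
    name: str
    color: str


def check_acceptance_criteria(button: Button) -> list[str]:
    """Return the violated acceptance criteria; an empty list means accepted."""
    violations = []
    color = button.color.upper()
    if button.name != "submit":
        violations.append("criterion 1: only the submit button is in scope")
    if color != AGREED_GREEN:
        violations.append("criterion 2: color must be the agreed green")
    if color == BACKGROUND_RED:
        violations.append("criterion 3: button must differ from the background")
    return violations
```

Each criterion answers one of the questions raised in the conversation (which button, which color code), so a failing check points directly at the part of the requirement that was missed.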
Everyone in the team needs tester glasses
Quality is not the job of a lone tester in sleeve protectors who applies the stamp of approval at the end. Ideally, everyone carries a pair of tester glasses in their pocket, including the person who raises the requirement and the person who implements it.
If the requester and the developer both look at a story from a testing perspective, they consider in good time how the subsequent quality can be ensured. If the technical stakeholder sits at the table and gives their assessment, the circle is complete.
For this to work, bridges must be built between the specialist department and IT. Young, freshly assembled teams need support to overcome the perceived gap. IT staff don’t bite when you explain something technical to them, and the experts are not aloof. Once these bridges are in place, the collaboration works.
Dialogue should be dosed, not maximized
Forcing everyone to sit at the same table all the time is rarely effective. It is more important that nobody is left working in isolation. It makes sense to withdraw, think about what has been written down and sleep on it, but then the result has to go back into the exchange.
Small groups that coordinate with each other again and again and work through a topic often achieve more than a large meeting. Once a topic has reached a certain level of maturity, it goes into refinement or backlog grooming, where the key stakeholders work together to create a common understanding.
The right balance is difficult to find. Packing everything into refinement meetings overloads them, and not every detail is of interest to everyone. Sometimes technical parts have to be broken down separately, or the technicians have to go over the concept again. The key is to tailor the dialogs so that the right conversations take place in the right circles.
Four environments, four test levels that build on each other
A typical setup consists of four environments that become progressively more stable from development to production. As the stability of the environment grows, so does the scope of the tests that run on it.
| Environment | Character | Focus of the tests |
|---|---|---|
| Development | Unstable from the outside, stable enough to work; hardly any real test data | Developer checks whether what has been built matches their understanding of the requirement |
| Integration | Software is inserted into the existing system landscape | Interfaces, version statuses, data exchange between systems |
| Acceptance | Close to production: data, interfaces, performance | Department checks business processes throughout |
| Production | Live operation | Post-go-live check |
Before anything is executed dynamically, there is a static check. A static code analysis answers whether the code is understandable, maintainable and clean, or whether it contains dead code, syntax violations and inappropriate constructs. This first quality loop provides findings without the program running.
In the banking environment, a batch code analysis is often added because security counts: there must be nothing in the code that later diverts half a cent unnoticed. This is followed by a review by a second developer. Static analyses and reviews are quick wins because they start early and give the system a robustness that would otherwise have to be painstakingly built up later.
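To illustrate what "findings without the program running" means, here is a minimal sketch of one static check: detecting dead code that directly follows a `return` or `raise`. It uses Python's standard `ast` module; the function name `find_dead_statements` and the sample source are assumptions for illustration. Real static analysis tools check far more than this.

```python
import ast

# Sample code to analyze. It is only parsed, never executed.
SOURCE = '''
def fee(amount):
    return amount * 0.01
    print("never reached")
'''


def find_dead_statements(source: str) -> list[int]:
    """Return line numbers of statements that directly follow a return or raise.

    Simplification: only the `body` of each node is inspected,
    not `else` or `finally` branches.
    """
    dead_lines = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue  # e.g. lambda bodies are single expressions
        for previous, current in zip(body, body[1:]):
            if isinstance(previous, (ast.Return, ast.Raise)):
                dead_lines.append(current.lineno)
    return dead_lines


print(find_dead_statements(SOURCE))  # prints [4]
```

The finding is produced purely from the syntax tree, which is why such checks can run as soon as the first lines of code exist.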
Why you can deliberately omit test levels, but not accidentally
The complete sequence of static analysis, module testing, integration testing and system testing is the ideal case, not mandatory for every situation. No static analysis is required for the first five lines of code.
The point is awareness. Anyone who skips a step must actively decide to do so and name the associated risk. With five lines the risk is manageable; with 5,000 lines you lose the overview, and additional safeguards must be added.
The test manager’s job is to create this awareness: why you are testing, what you can do, what you can deliberately leave out and what risks result from this. This also includes constantly checking whether a risk that was once accepted is still acceptable or whether the framework conditions have changed.
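One lightweight way to make such a decision explicit is to record it together with its risk and a date for re-checking. The sketch below is an assumption about how this could look, not a prescribed format: the class `AcceptedRisk`, its fields and the 30-day default interval are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AcceptedRisk:
    """Record of a test level that was skipped on purpose."""
    skipped_level: str           # e.g. "static analysis"
    reason: str                  # why skipping is acceptable right now
    risk: str                    # the named risk that comes with it
    accepted_on: date
    review_after_days: int = 30  # assumed interval; choose what fits the team

    def needs_review(self, today: date) -> bool:
        """True once the accepted risk is due for re-evaluation."""
        return today >= self.accepted_on + timedelta(days=self.review_after_days)


decision = AcceptedRisk(
    skipped_level="static analysis",
    reason="prototype with only a handful of lines",
    risk="style and dead-code issues go unnoticed",
    accepted_on=date(2024, 1, 1),
)
```

A periodic look at all records where `needs_review` returns `True` covers the second half of the job: checking whether a risk that was once accepted is still acceptable.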
Who tests the interface? The question of responsibility in integration testing
When it comes to integration testing, clarifying responsibility is crucial to quality. If each side assumes that the other is already testing, there will be a hole in the middle of the interface that nobody checks.
The opposite is just as unhelpful: if both sides test deep into the other’s part, there is so much overlap that the work makes no progress. It must be clear who owns which part of the interface and who tests where.
Each new increment changes this coordination. One side understands their part technically, the other theirs, and both must remain in dialog as to whether the interaction still works. The experts behind it also need to be involved in this testing.
What belongs in a single sprint
After static testing, a sprint includes component and module testing, system integration testing and system testing. Once the specialist department has taken a first look at the function and the acceptance criteria are met, the Definition of Done is reached and the sprint can be completed.
It is worth noting that this is not the end of the software development process. The software is not yet in production, but the sprint is finished nonetheless and the next iteration can start. Acceptance testing follows later and is often required by regulations.
Module tests are usually automated because a version is iterated over several times. Creating and maintaining these tests takes effort, but lays a solid foundation. If all functions are tested with the conceivable data characteristics and possible errors are handled, there is little that can go wrong technically.
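As a sketch of what testing "with the conceivable data characteristics" can look like, the following automated module test uses Python's standard `unittest` module. The unit under test, `split_amount`, is a hypothetical example chosen for illustration; it is not taken from any real project.

```python
import unittest


def split_amount(total_cents: int, parts: int) -> list[int]:
    """Hypothetical unit under test: split an amount without losing a cent."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    if total_cents < 0:
        raise ValueError("total_cents must not be negative")
    base, remainder = divmod(total_cents, parts)
    # The first `remainder` parts receive one extra cent so the sum stays exact.
    return [base + 1 if i < remainder else base for i in range(parts)]


class SplitAmountTest(unittest.TestCase):
    def test_data_characteristics(self):
        cases = [
            (100, 4, [25, 25, 25, 25]),  # divides evenly
            (100, 3, [34, 33, 33]),      # remainder must not get lost
            (0, 2, [0, 0]),              # lower boundary
            (1, 2, [1, 0]),              # fewer cents than parts
        ]
        for total, parts, expected in cases:
            with self.subTest(total=total, parts=parts):
                self.assertEqual(split_amount(total, parts), expected)

    def test_invalid_input_is_rejected(self):
        for total, parts in [(100, 0), (100, -1), (-1, 2)]:
            with self.subTest(total=total, parts=parts):
                with self.assertRaises(ValueError):
                    split_amount(total, parts)
```

Saved as a test module, this can be run with `python -m unittest` on every iteration, which is where the automation pays back the maintenance effort.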
Deliver several times in a sprint instead of five minutes before the review
The most common pitfall is the hectic rush at the end of the sprint, when everything is checked in, merged, pushed and pulled five minutes before the review. You can avoid this by delivering finished pieces early.
Once a story has gone through static analysis, review, module testing and integration, deliver it and let the department take a look at it before you move on to the next story. This way, you get feedback in the same sprint, can work on defects immediately and present a tested story at the end instead of piling up three days of test work on the last few meters.
This early delivery is often lost in the heat of the moment, without any ill intent. This is where an outside perspective can help: knock on the door in a friendly manner and remind the team of an interim delivery instead of building up pressure. In the retrospective at the latest, you can agree to deliver every finished story immediately in future.
Where the test manager sits in the agile team
The pure role of the test manager hardly exists in agile anymore. Ideally, the test manager sits in the dev team as a T-shaped person: they have test management skills, but can also develop. In practice, there are often full-stack developers and a test manager alongside them.
Their task then is to raise awareness of quality among the developers and remind them of steps that would otherwise go unnoticed. They do not just manage tests from the outside, but also get people involved so that everyone is on board early on.
Why knee-jerk action does not improve quality
When quality is in a mess, the first reflex is usually “we need to test more”. Test cases are then written en masse at the tail end, in acceptance testing. Often they are not of high quality; any test case will do, as long as there are a lot of them. This does not improve quality.
Typically, there are two ends with a gap in between. On one side are many acceptance tests written from the business point of view. On the other are unit tests, which almost every developer writes out of their own need for quality, but mostly from the gut, without the four-eyes principle and without an analytical method. The two sides do not mesh.
What is missing is a systematic approach. Many people only think of ad hoc testing or requirements-based testing because the terms sound good. That other test design techniques exist is often overlooked. With the right skill set, you ask different questions and notice, for example, that a text field accepts special characters and that its behavior with them needs to be checked.
If this test case is written down early on, the developer can catch it while it is being built. The quality is then built in from the outset.
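Boundary value analysis is one such systematic technique. The sketch below derives test inputs for a text field from its length limits and adds inputs containing special characters. The function names and the list of sample characters are assumptions for illustration; a real field would need a character list that matches its own rules.

```python
def boundary_lengths(min_len: int, max_len: int) -> list[int]:
    """Boundary value analysis: lengths just outside, on and just inside each limit."""
    candidates = [min_len - 1, min_len, min_len + 1,
                  max_len - 1, max_len, max_len + 1]
    return sorted({n for n in candidates if n >= 0})


# Assumed sample of characters that commonly cause trouble in text fields.
SPECIAL_CHARACTER_SAMPLES = ["'", '"', "<", "&", "ä", " "]


def text_field_test_inputs(min_len: int, max_len: int) -> list[str]:
    """Derive test inputs for a text field that accepts min_len..max_len characters."""
    if not 1 <= min_len <= max_len:
        raise ValueError("expected 1 <= min_len <= max_len")
    inputs = ["a" * n for n in boundary_lengths(min_len, max_len)]
    # One input of valid length per special character.
    inputs += [ch + "a" * (min_len - 1) for ch in SPECIAL_CHARACTER_SAMPLES]
    return inputs


print(boundary_lengths(1, 20))  # prints [0, 1, 2, 19, 20, 21]
```

The derived list is small, but every entry exists for a stated reason, which is the difference from test cases written from the gut.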
Test scope is risk management, not full coverage
Very little software needs to be fully tested; that would not be economical. A mission to the moon, where human lives depend on the code, requires a great deal more effort. With normal software, you can omit test cases, but you have to know which ones.
The selection is based on risk, which combines two factors: probability of occurrence and impact. Highly complex, highly nested or library-dependent code sections are given special attention. The same applies to places with a high business risk. A login is often technically simple, but it has to work because no software can be used without a login.
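The two factors can be combined into a rough risk class. The following is a sketch under stated assumptions: the three-level scale, the multiplication, the thresholds and the rule that high impact always yields high risk are illustrative choices, not a standard. The last rule mirrors the login example, where the probability of failure is low but the function simply has to work.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_class(probability: str, impact: str) -> str:
    """Combine probability of occurrence and impact into a rough risk class."""
    if impact == "high":
        # Assumption: must-work functions are high risk regardless of probability.
        return "high"
    score = LEVELS[probability] * LEVELS[impact]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


print(risk_class("low", "high"))     # login: simple, but must work -> high
print(risk_class("high", "medium"))  # complex, nested code -> high
print(risk_class("medium", "low"))   # -> low
```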
In agile development over several iterations, regression testing is added, and not everything that has ever been tested is tested there either:
- New functions (progression): as many test cases as economically justifiable in order to test them cleanly.
- General regression: high-risk test cases that verify the software’s vital functions.
- Changed areas: additional medium-risk test cases to check whether the test object still works in the areas adjacent to the change.
A risk classification of the test cases makes this selection fast. It need not be a doctoral thesis; rough levels such as high, medium and low for complexity and impact are enough. With this matrix, test cases can be picked specifically for each regression run, and if the test cases are automated, an additional one costs hardly anything.
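The three selection rules can be sketched as a simple filter over risk-classified test cases. The class `RegressionCase`, its fields and the sample data are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionCase:
    name: str
    area: str
    risk: str            # "high", "medium" or "low"
    is_new: bool = False  # True for test cases of newly built functions


def select_regression_suite(cases, changed_areas):
    """Pick test cases for one regression run according to the three rules."""
    selected = []
    for case in cases:
        if case.is_new:
            selected.append(case)  # progression: new functions
        elif case.risk == "high":
            selected.append(case)  # general regression
        elif case.risk == "medium" and case.area in changed_areas:
            selected.append(case)  # areas adjacent to the change
    return selected


cases = [
    RegressionCase("login works", "auth", "high"),
    RegressionCase("export csv", "reporting", "medium"),
    RegressionCase("refund rounding", "payment", "medium"),
    RegressionCase("tooltip text", "payment", "low"),
    RegressionCase("new voucher field", "payment", "low", is_new=True),
]
suite = select_regression_suite(cases, changed_areas={"payment"})
print([case.name for case in suite])
```

With a change in the payment area, this run keeps the high-risk login test, the medium-risk payment test and the new voucher test, and leaves out the rest.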
Documentation does not contradict the agile idea
The Agile Manifesto does not say that documentation is unimportant, but that there are more important things. The more important thing is the conversation. At some point, however, what has been discussed must be written down so that you can still understand what you have done in three months or three years’ time.
A risk classification on the test case, a defined test object and structured test cases help, especially in agile environments, because test cases are handled very frequently and teams change. The documentation remains concise: briefly note how you arrived at the chosen test scope, no novels.
The team decides on the scope together. Product owners as representatives of the specialist department and developers consider together what a reasonable scope is, instead of a single test manager writing a large test plan in the background. This ensures that the entire team has the same understanding.
Quality as an attitude, AI as a tool with blind spots
Understanding quality as an attitude, dealing with automation and using artificial intelligence, but distinguishing between what you can and cannot do.
- Christian Mercier
Artificial intelligence will automate testing to a greater extent. Test cases can be derived from acceptance criteria and executed without anyone having to write each one by hand. This shifts the work, but does not replace control.
Even an AI has errors and blind spots, and it is only as good as it has been trained to be. If you don’t train an AI yourself, you have to take an even closer look at what it does and doesn’t do. Humans reach a meta-level here: they need to know what the automation can do and where its limits lie.
Quality does not end with the review. Acceptance testing and a post-go-live check are part of an end-to-end process. Everyone who works on software should have this attitude, not just the tester.