Software quality is not the property of an individual test, but a joint task for the entire development team. It arises when product owners, developers and testers work together to define what quality means for their product. When testing AI systems in particular, it becomes clear that classic quality criteria are not enough, and the question “What is quality?” often remains unanswered.
Key Takeaways
- Agility has taken the topic of software quality out of its niche: Where regression testing used to take place every few months, two-week sprints enforce continuous quality work across the entire team.
- Today, quality is a collaborative task in which product owners, business analysts, UX designers and developers think along, rather than leaving it to the testers alone.
- AI-supported testing tools help with requirements testing and deriving test ideas, but require an organizational maturity that many teams have not yet reached.
- Testing AI systems raises a fundamental question that should have been answered long ago: What does quality mean in concrete terms for our product, and when has enough quality been achieved?
How quality moved out of the basement and into the projects
For a long time, software quality was a marginal topic, relegated to windowless test rooms with decommissioned computers. Today, it is on the table of every development team every day. This shift over the last 20 years or so has fundamentally changed the profession.
In the classic model, testing waited until development was complete. Testers wrote a few test cases in advance, ran them, reported the errors back and then had weeks of peace and quiet. Quality was a downstream step, not a joint task.
Those days are over, and for one specific reason: software is everywhere today, even in business applications that used to be considered mere tools. The demand for functioning software now applies across the board. If a Windows update is poorly tested, you can very quickly see how much is at stake.
Why agility was a boost for testing
Agile working has pulled quality and testing forward because the old way of waiting suddenly stopped working. Iterations, sprints and constant regression testing forced teams to organize quality differently.
What used to be tested every few months suddenly had to be run every two weeks. This was a tough change for testers and quality specialists. The first reflex was mini-waterfalls: testing was squeezed into the last two days of the sprint, and a single mistake found that late could ruin all the work.
A better practice emerged from this pain. Today, quality is thought of more holistically, no longer just as the job of a single role. During refinement and planning, the team thinks together: Where could we test what? What makes sense at which test level?
One key benefit is that you no longer have to test everything twice. Instead of repeating every case from unit and integration testing all the way up to acceptance testing, you know which tests are running at which level. The business department moves closer to the team and gains confidence in what is happening.
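Knowing which tests run at which level can be made visible with a very small inventory. The following is only a sketch under assumptions: the test names, behaviours and the `TEST_INVENTORY` structure are hypothetical and stand in for whatever overview a team actually maintains.

```python
from collections import defaultdict

# Hypothetical inventory: (test case, behaviour it checks, test level).
TEST_INVENTORY = [
    ("test_vat_rounding", "vat calculation", "unit"),
    ("test_vat_rounding_via_api", "vat calculation", "integration"),
    ("test_order_reaches_warehouse", "order handover", "integration"),
    ("test_checkout_happy_path", "checkout flow", "acceptance"),
]


def levels_per_behaviour(inventory):
    """Group the test levels by the behaviour they cover."""
    levels = defaultdict(set)
    for _name, behaviour, level in inventory:
        levels[behaviour].add(level)
    return dict(levels)


def covered_more_than_once(inventory):
    """Return the behaviours that are checked at several levels, sorted by name."""
    grouped = levels_per_behaviour(inventory)
    return sorted(b for b, found in grouped.items() if len(found) > 1)


if __name__ == "__main__":
    # Candidates for a team discussion: is the repetition intended?
    print(covered_more_than_once(TEST_INVENTORY))
```

A behaviour that shows up at several levels is not automatically a mistake. The list is only a prompt for the conversation in refinement about where the check belongs.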
Back in 2003, one thesis was: agility means quality. The reaction was strong because many people confused agility with “we no longer have to document anything”. Practice has shown the opposite.
“If I don’t pay attention to quality, everything will blow up in my face after the third sprint.” (Richard Seidl)
There is a lot of structure and discipline behind the supposedly easy-going agile development. This is precisely what makes quality work in the first place.
Agility is not over, it is only now being understood
The theory that agility is over is based on a misunderstanding. Agile working essentially means designing a process for yourself as a team, not working through a specific framework.
Which framework is used is of secondary importance. A team can use Scrum, Kanban, Design Thinking, Working Out Loud, retrospectives or even large models such as SAFe and LeSS. The key is to choose the building blocks that suit your own process.
What remains is the idea behind it: pursuing common values and a vision as a team and building a process that is fun, efficient and improves over time.
The difficult part is personal responsibility. If you come from a traditional environment and suddenly have to design a process yourself, it is a stressful change. However, this effort cannot be avoided if you want to master complex requirements. Teams are only just beginning to really understand what agility means.
Better tools have defused the maintenance nightmare
The tool landscape in testing has improved noticeably and now takes real work off your hands. On the developer side, scripts, open-source tools, frameworks and automated security checks support quality.
Automation tools are getting smarter. Concepts such as keyword-driven and data-driven testing are established, and object recognition is becoming increasingly reliable. Nevertheless, every decision requires brainpower: what do you automate at all, and where does it make sense?
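To make the two concepts concrete, here is a minimal sketch that does not rely on any particular tool. The `Cart` class, the keyword names and the test data are invented for illustration. Keyword-driven means that test steps are written as readable keywords that map to code, and data-driven means that the same logic runs against a table of inputs and expected results.

```python
class Cart:
    """Hypothetical, deliberately tiny system under test."""

    def __init__(self):
        self.items = {}

    def add(self, name, price):
        self.items[name] = price

    def remove(self, name):
        self.items.pop(name, None)

    def total(self):
        return sum(self.items.values())


def run_keywords(steps):
    """Keyword-driven part: execute readable steps against a fresh cart."""
    cart = Cart()
    keywords = {
        "add item": cart.add,
        "remove item": cart.remove,
    }
    for keyword, *args in steps:
        keywords[keyword](*args)
    return cart.total()


# Data-driven part: each row holds the steps and the expected total.
CASES = [
    ([("add item", "book", 20), ("add item", "pen", 5)], 25),
    ([("add item", "book", 20), ("remove item", "book")], 0),
    ([], 0),
]


def run_all(cases):
    """Run every data row and report whether it met its expectation."""
    return [run_keywords(steps) == expected for steps, expected in cases]


if __name__ == "__main__":
    print(run_all(CASES))
```

Adding a test case means adding a row to `CASES`, and adding a new action means adding one entry to the keyword table. What is worth automating in the first place still has to be decided by the team.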
The earlier reflex to automate everything via the UI often led to high maintenance costs. Today, the team consciously sets up tests at the appropriate level, usually below the UI. Developers and others on the team help to place tests where they belong. A bloated, top-heavy UI test suite that runs for hours on end can be reduced in this way.
AI is not a panacea for testing problems
AI only helps in testing where the actual problem is also an AI problem. Many workshops show that the most pressing problems lie elsewhere and cannot be solved with AI.
There is a danger of declaring AI an all-purpose tool and using it to solve problems that do not exist. Budgets are available, the funding pots are full, and that is precisely what leads to activity for its own sake.
But there are sensible fields of application. These include checking requirements, deriving test ideas and supporting automation. The prerequisite is a certain maturity in the team, otherwise the application will come to nothing.
The real question is: What is quality?
AI throws testing back to a fundamental question that should have been answered long ago. The classic quality criteria are deterministic and clear: functionality, efficiency, usability, security. When it comes to testing AI systems, they no longer go far enough.
This raises the question of what quality actually means. If you ask it in a team, the room goes quiet. Hardly anyone has given this question any serious thought, apart from a paragraph in the test plan.
The pattern is not new. A performance criterion such as “this thing has to be fast” has always forced testers to look for someone to define what “fast” actually means. Often no one could be found, and the numbers were pulled out of thin air.
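One way to turn “fast” into something testable is to agree on a percentile and a budget. The sketch below is only an illustration: the 95th percentile and the 300 ms budget are assumptions chosen for the example, not values from the text, and the real numbers have to come from whoever owns the requirement.

```python
def percentile(samples, p):
    """Nearest-rank percentile of the samples for an integer p from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, which avoids floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def is_fast_enough(latencies_ms, p=95, budget_ms=300):
    """Check an agreed criterion: the p-th percentile stays within the budget.

    Both default values are assumed for illustration only.
    """
    return percentile(latencies_ms, p) <= budget_ms


if __name__ == "__main__":
    # Hypothetical measurements: 19 quick responses and one slow outlier.
    measured = [100] * 19 + [900]
    print(is_fast_enough(measured))
    print(is_fast_enough(measured, p=100))
```

The code itself is trivial. The value lies in the fact that someone had to commit to the two parameters, which is exactly the decision that is so often missing.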
Today, this problem is getting worse. Software integrates many technical views and serves many stakeholders. The difficult question is not only what quality is, but also who answers it and when enough quality has been achieved.
These questions remain unanswered, and it is honest to say so. AI will not disappear, even if not everything is an LLM. The whole system still has to work. How the concept of quality will be shaped in the future and how test organizations will change has not yet been decided.


