Software testing Christmas chat 2025

Software quality in 2025 is the result of three interacting factors: a functioning test process as a foundation, an understanding of non-functional requirements such as performance, usability and security, and the targeted use of AI tools. If the process foundation is missing, the gap catches up with every team, and with AI it does so faster than ever before.

Key Takeaways

Messy processes and missing groundwork slow down the introduction of AI, because gaps in the setup take their toll faster with AI than they ever did with test automation.
Non-functional quality characteristics such as performance, usability and accessibility have gone from a bonus to a requirement, and their weight continues to grow.
Testers will need a broader technical skillset in future, but must not lose their intuition, as gut instinct uncovers a significant proportion of defects.
Quality must be integrated into the product at an early stage, not just at the end: those who integrate QA too late will pay significantly more, both financially and in terms of the effort required to make improvements.

AI only accelerates what is already running smoothly

Artificial intelligence amplifies existing weaknesses instead of solving them. If you have a patchy test process and put AI on top of it, your problems will surface faster, not slower.

Matthias Groß describes a mechanism familiar from test automation that is now repeating itself. It was already true with automation: if the underlying process is garbage, the initiative will sooner or later fall by the wayside. Exactly the same happens with AI, only faster. Incorrect setups and gaps in the process catch up with the team so quickly that the AI cannot be used as intended.

This leads to an uncomfortable truth for many projects. AI is often pushed to the front in order to avoid dealing with the real issue. The real issues lie in the test strategy and the test process: How do I really test? What do I actually want to achieve with it? What problem am I solving? If you try to answer these questions by reaching for an AI tool, you have not improved anything.

However, perhaps the current hype is drawing attention back to exactly that. If AI only works on a solid foundation, the groundwork will once again become mandatory.

Why groundwork is now becoming more important, not less

A uniform understanding and a practiced process are the prerequisites for AI in testing to work at all. Without this foundation, every further layer becomes more difficult, regardless of whether it is automation or AI.

Some of the established methods in testing are decades old, while the framework behind the common certifications has existed for around 20 years. This maturity is an advantage: these are tried and tested, clearly defined concepts with a common language.

AI, on the other hand, is new and changes dynamically. Even the terms are still vague. Some people talk about AI agents, others about prompt chains, and often everyone means something different. Before a team can use AI sensibly, it needs to clarify what the terms actually mean.

This is exactly where the old groundwork pays off. If a company can say “this is our process and this is how we practice it”, it has the foundation on which AI can work. If this foundation is missing, getting started with automation or AI will be much more difficult.

Vibe coding produces software that needs to be tested

With AI, software is created at a pace that outruns the usual QA procedure. Testers are structurally lagging behind because they have to understand a topic before they can validate it.

On the other hand, there are people who simply do things with AI. A hackathon results in something that works without anyone having thought about quality or security in advance. The problem is that quality and security can hardly be added after the fact to something created this way.

Please, please don’t develop a product out of a hackathon, but think about it beforehand. Build in quality assurance beforehand, build in security beforehand, because you won’t be able to catch it later.

Wolfgang Sperling

In the coming months and years, a lot of software will be created that is pushed out quickly and whose quality remains questionable. Testers have to deal with this. You won’t get very far with 15-page Excel test lists. Such lists still exist and sometimes even work, but they are not enough for this new speed.

The non-functional quality characteristics come to the fore

The focus in testing is shifting from pure functionality to non-functional criteria. Functionality remains the starting point, but it is no longer what makes good software.

Christian Mercier draws a comparison over the years. About 30 years ago, it was enough if a piece of software fulfilled its function at all. Today, the bar is higher. Performance used to be of secondary importance; today everyone expects quick answers. For a long time, software merely had to be usable; today, very good usability is required. Accessibility was once a unique selling point; today it is standard.

The relevant quality models also reflect this shift. Wolfgang points out that a single sentence is dedicated to functionality, while several criteria are allocated to the non-functional area.

AI agents also shift part of the quality assurance to production. Anyone using agents must ask themselves what happens during ongoing operations and what means can be used to safeguard against risks there. The platform providers supply tools for risk mitigation. Which ones you use, and how you use and configure them, becomes a concrete testing task.

Troubleshooting in AI requires new reading skills

When testing AI systems, you analyze traces instead of classic test results. A trace documents the individual steps within the AI and can quickly comprise several pages.

Reading traces is tiring but doable. You have to recognize whether there is an error at all and what type it is. A typical case: a function is called three times and no response is returned. The question then arises as to whether the prompt needs to be improved.
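The typical case described above can be checked mechanically before anyone reads the full trace. The following is a minimal sketch, assuming a simplified, hypothetical trace format: `TraceStep`, its fields, the tool name `lookup_order` and the threshold `max_calls_per_tool` are illustrative assumptions, not the format of any particular platform.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TraceStep:
    """One step of a hypothetical agent trace (the format is an assumption)."""
    kind: str          # "tool_call" or "response"
    name: str = ""     # tool name, only used for tool calls
    content: str = ""  # text, only used for responses


def find_trace_issues(steps, max_calls_per_tool=2):
    """Return human-readable findings for a single trace."""
    findings = []
    # Count how often each tool was called within the trace.
    calls = Counter(s.name for s in steps if s.kind == "tool_call")
    for tool, count in sorted(calls.items()):
        if count > max_calls_per_tool:
            findings.append(f"tool '{tool}' called {count} times")
    # A trace without any non-empty response is suspicious on its own.
    if not any(s.kind == "response" and s.content.strip() for s in steps):
        findings.append("no response returned")
    return findings


# The case from the text: the same function is called three times, no answer.
trace = [
    TraceStep("tool_call", name="lookup_order"),
    TraceStep("tool_call", name="lookup_order"),
    TraceStep("tool_call", name="lookup_order"),
]
print(find_trace_issues(trace))
# → ["tool 'lookup_order' called 3 times", 'no response returned']
```

A check like this only flags candidates. Deciding whether the prompt, the tool or the setup is at fault still requires reading the trace.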

The mechanisms for finding errors are exactly those that testers have been trained to use anyway. This is a good starting point. What is new is a closer proximity to the code and a feel for what happens during prompting and what comes back as a response.

With non-deterministic systems, the task is no longer testing in the traditional sense but validation. You work with criteria rather than fixed expected values, because the same input is not guaranteed to reproduce the same result.
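One way to picture the difference is a check that runs the system several times and evaluates each answer against criteria instead of comparing it with one expected string. This is only a sketch under stated assumptions: `fake_model` is a stand-in for a real system, and the criteria, the number of runs and the 95 percent threshold are made up for illustration.

```python
import random


def fake_model(prompt, rng):
    """Stand-in for a non-deterministic system; NOT a real model API."""
    templates = [
        "Your order 4711 has shipped.",
        "Order 4711 is on its way to you.",
        "Good news: order 4711 was shipped today.",
    ]
    return rng.choice(templates)


# Criteria instead of one fixed expected value (all hypothetical).
CRITERIA = {
    "mentions order number": lambda text: "4711" in text,
    "mentions shipping": lambda text: "ship" in text.lower()
    or "on its way" in text.lower(),
    "stays short": lambda text: len(text) <= 80,
}


def validate(prompt, runs=20, min_pass_rate=0.95, seed=0):
    """Run the system several times and report the share of acceptable answers."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(runs):
        answer = fake_model(prompt, rng)
        # An answer counts only if it meets every criterion.
        if all(check(answer) for check in CRITERIA.values()):
            passed += 1
    rate = passed / runs
    return rate, rate >= min_pass_rate


rate, accepted = validate("Where is my order 4711?")
print(f"pass rate {rate:.0%}, accepted: {accepted}")
# → pass rate 100%, accepted: True
```

The wording of the answer varies from run to run, yet the validation passes because every variant meets the criteria. Which criteria and which pass rate are appropriate is a decision for the team, not something the code can supply.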

The skillset of the tester becomes broader and more technical

Specialist domain knowledge alone is no longer enough. The profile increasingly demands technical depth, and at the same time, business and IT need to move closer together.

Matthias describes two directions of movement. The business side must move closer to the technology and learn to test not only functionality, but also aspects such as usability. IT is moving closer to the business department and providing support with test automation, performance testing and compatibility. The goal is a joint product instead of code that is thrown over the fence to the business department with the instruction “now test it”.

One thing must not be lost amid all this technical focus: the instinct for defects. This professional intuition drives the search. According to Matthias’ estimate, 40 to 50 percent of defects are found not through a clean test case but by following up on a gut feeling.

This intuition is a form of unconscious competence that grows over the years. Intuition and methodical perseverance complement each other: together they cover the basics and also find what no test case anticipated.

In the end, quality is a feeling

Quality can be described as a feeling, and all quality characteristics contribute to this. The functional and non-functional criteria are tools to make this feeling measurable.

The user notices good quality. Their data is secure, the software is fast, it is pleasant to use, it does what they want it to do. The individual features increase this inner satisfaction, each in its own way.

This gives rise to a new role for the tester. They are the user’s advocate, bringing the user’s perspective to the team. This holistic view covers many criteria at the same time and is a skill that you should cultivate deliberately.

This also includes questioning the status quo. If you have been maintaining and implementing test case catalogs for years, you should check whether they still fit. Is the granularity right? Am I testing the right things? Have the requirements changed? This critical reflection is not a secondary task. If it’s not already part of the job, it has to become one, because no one else in the project approaches things with this perspective.

Exchange beats going it alone

When development moves this fast, knowledge sharing is the most effective tool for keeping up. No AI can replace the network, thinking outside the box and short lines of communication with others.

Regional communities show how quickly this can work. A new testing community founded in the southwest brought together people who had not known each other before, and after a short time they were more closely networked, with a shorter path to a second opinion. Anyone who offers something like this quickly realizes that people flock to it and want to exchange ideas.

The exchange is effective because different backgrounds come together there: system integrators, pure test providers, and people who implement IT solutions themselves. This mix leads to conversations about day-to-day challenges, from which everyone takes something away.

For the time to come, one thing counts above all: sharing concrete solutions, not just concepts at a high level of abstraction. Show what has worked for you so that others can implement it in their everyday work. There is a lot of untapped knowledge that everyone could learn from. Have the courage to show your solutions; they will be gratefully accepted.