Testing AI - a checklist

Maximize your quality results with effective strategies for AI software testing. Discover key insights with our comprehensive testing checklist.

The question “Can AI be tested at all?” leads to a dead end. The decisive question is whether the system behaves deterministically or not: the answer shapes the entire test strategy and takes away the fear of supposedly uncharted terrain. Testing AI systems requires fewer revolutionary new skills than expected: those who test performance properly already know statistics, and those who can deal with eventually consistent systems already have the tools for non-deterministic outputs. A checklist with the right questions, the result of collaboration between testers and data scientists, helps to take a structured approach instead of freezing up.

Podcast Episode: Testing AI - a checklist

What questions should be asked when testing AI and what can testers learn from data scientists? Marco Achtziger and Gregor Endler provide insights into the world of AI testing and discuss whether testing AI is really that different from traditional tests. A helpful checklist for AI testing contains valuable guidelines and exciting insights. The exchange between testers and data scientists offers the opportunity to learn from each other and improve the quality of AI systems.

“For me as a tester, the most important question is: ‘Is it a deterministic system?’ That’s the most important one, because it has an incredible amount of influence on my test strategy.” - Marco Achtziger, Gregor Endler

Marco Achtziger works at Siemens Healthineers in Forchheim. With qualifications from ISTQB and iSQI, he is a certified Senior Software Architect at Siemens AG. Deep down, however, he is a test architect. He leads trainings for test architects within Siemens AG and Healthineers. Achtziger enjoys sharing knowledge with other companies and regularly speaks at conferences such as OOP and Agile Testing Days.

Gregor Endler received his doctorate in computer science with an outstanding dissertation on “Adaptive Data Quality Monitoring”. At codemanufaktur GmbH he focuses on machine learning and data analysis. He has published several research papers and is a frequent speaker at academic and industry conferences. He also shares his experience with other companies.

Episode highlights

  • Question first: Is my AI system deterministic? The answer radically changes the entire test strategy.
  • Testers need the basics of statistics, data scientists need a view of the overall system - and both need to talk to each other.
  • AI testing is old wine in new bottles: performance tests have always worked with probabilities and statistics.
  • Data scientists’ training data can be turned directly into test data - ask for it.
  • Model updates in live operation need their own test datasets - otherwise things end in chatbot disasters.
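
The first and third highlights can be made concrete with a small sketch. It is an illustration under stated assumptions, not something taken from the episode: for a non-deterministic system a single pass or fail says little, so the test calls the system many times and checks whether the lower bound of a Wilson score interval on the pass rate stays above a required rate. The names wilson_lower_bound, pass_rate_is_acceptable and make_flaky_system, as well as the thresholds, are hypothetical.

```python
import math
from typing import Callable


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for an observed pass rate.

    z = 1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def pass_rate_is_acceptable(
    system: Callable[[], object],
    check: Callable[[object], bool],
    trials: int,
    required_rate: float,
) -> bool:
    """Call the system repeatedly and judge the pass rate statistically."""
    successes = sum(1 for _ in range(trials) if check(system()))
    return wilson_lower_bound(successes, trials) >= required_rate


def make_flaky_system(fail_every: int) -> Callable[[], bool]:
    """Hypothetical stand-in for an AI component: fails on every n-th call."""
    calls = 0

    def system() -> bool:
        nonlocal calls
        calls += 1
        return calls % fail_every != 0

    return system


# Usage: 500 calls, 25 of which fail; is a 90% pass rate statistically supported?
system = make_flaky_system(fail_every=20)
print(pass_rate_is_acceptable(system, lambda ok: ok, trials=500, required_rate=0.90))
```

The confidence bound matters because the observed pass rate here is exactly 95%, yet 500 calls are not enough to claim that the true rate is at least 95%. This is the same reasoning testers already apply to performance measurements.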

Effective test strategies for AI systems

In this podcast episode, we dive into the world of AI testing and discuss the importance of asking the right questions, the interaction between testers and data scientists, and practical approaches to testing AI-based systems.

The importance of AI testing

The world of software development is facing a new challenge: testing artificial intelligence (AI). Today I was able to dive deeper into this topic with Marco Achtziger and Gregor Endler. The concept of AI testing raises many questions - which are the right ones? How different is it really from conventional testing? Our discussion not only addressed key perspectives on these questions, but also provided a comprehensive insight into the complexity and necessity of a structured approach to testing AI systems.

Asking the right questions

One of the key topics of our conversation was the importance of asking the right questions when testing AI. Gregor emphasized that there are no wrong questions per se, but some are definitely more helpful than others. For example, the question “Can it be tested at all?” might be relevant at the beginning of a project, but quickly loses relevance. Instead, we should focus on questions that dive deeper into how AI works and its potential limitations. The development of a joint checklist by a workgroup from different companies showed the value of interdisciplinary collaboration.

Tester vs. Data Scientist

Another topic was the relationship between testers and data scientists. Both sides have their own perspectives and methodologies - but it is precisely in this diversity that the potential for mutual learning and understanding lies. For example, a tester can learn from a data scientist how to evaluate models, while the data scientist can learn from the tester how these models work in a larger system context. This synergistic collaboration opens up new avenues for more effective testing strategies and ultimately improves the quality of software products.

Practical approaches for testing AI

Our discussion also highlighted practical approaches for testing AI systems. A key finding was that traditional testing methods alone are not enough: testers also need to be familiar with stochastic methods and statistical models. The checklist created by the workgroup serves as a guide for testers dealing with AI-based systems. It covers important aspects such as deterministic vs. non-deterministic system behavior and the influence of model updates on test scenarios.
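
One way to read the point about model updates is to keep a dedicated test dataset and compare a candidate model against the one currently in operation before the update goes live. The following is a minimal sketch, assuming a classifier that maps an input string to a label. The names accuracy, update_is_safe and max_drop, as well as the toy models and data, are hypothetical and not part of the checklist discussed in the episode.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]      # (input, expected label)
Model = Callable[[str], str]   # assumed interface: text in, label out


def accuracy(model: Model, dataset: Sequence[Example]) -> float:
    """Share of examples for which the model returns the expected label."""
    if not dataset:
        raise ValueError("test dataset must not be empty")
    correct = sum(1 for text, expected in dataset if model(text) == expected)
    return correct / len(dataset)


def update_is_safe(
    current: Model,
    candidate: Model,
    dataset: Sequence[Example],
    max_drop: float = 0.01,
) -> bool:
    """Accept the candidate only if accuracy drops by at most max_drop."""
    return accuracy(candidate, dataset) >= accuracy(current, dataset) - max_drop


# Toy data and models, purely for illustration.
TEST_SET = [("good", "pos"), ("bad", "neg"), ("great", "pos"), ("awful", "neg")]


def current_model(text: str) -> str:
    return "pos" if text in {"good", "great"} else "neg"


def degraded_model(text: str) -> str:
    return "pos"  # always predicts the same label


print(update_is_safe(current_model, degraded_model, TEST_SET))
```

In a real project the metric and the tolerated drop would be agreed between testers and data scientists, and the comparison itself is subject to statistical uncertainty when the dataset is small.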

Closing thoughts: The road ahead

Testing AI represents both a challenge and an opportunity. Marco and Gregor made it clear that close collaboration between testers and data scientists is essential in order to conduct effective tests. It is important to recognize that many principles of traditional testing are also applicable to AI testing - but with a new dimension of complexity. A willingness to learn and adapt will be critical to success in this new era of software development.
