Model-based testing (MBT)

Model-based testing means using graphical representations such as flowcharts to visualize test procedures and systematically generate test cases from them. It clarifies early on what is to be tested, makes implicit assumptions in requirements explicit and forms a bridge to test automation. Tools generate either textual test cases or executable scripts from the model.

Key Takeaways

Model-based testing works without UML knowledge: Boxes, arrows, decision nodes and sub-diagrams are sufficient as a toolbox.
Flowcharts make implicit assumptions explicit: everyone involved unconsciously fills gaps in the requirements from their own experience, and the differences in understanding only become visible in the model.
The same model can be used to generate both manual test case descriptions and keyword scripts for test automation without changing the model itself.
Test case explosion through loops and combinatorics is controlled via coverage criteria: the minimum requirement is that each modeled test variant occurs in at least one generated test case.

What model-based testing really means

Model-based testing means using a graphical representation of what is to be tested and deriving test cases from it. The ISTQB defines it very broadly as any testing that involves models. At its core, however, it is about graphical models and the tools that generate test cases or test scripts from these models.

The graphical representation is the essential part. Rather than talking about UML or activity diagrams, it helps to use the simpler term flowchart. Each path through a flowchart yields a possible test case. The same principle applies to state charts: the different paths through the state chart form the different test cases.
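The path idea can be sketched in a few lines of Python. This is only an illustration, not the approach of any particular tool: the flowchart, its node names and the function enumerate_paths are invented for this example, and the model is assumed to contain no loops.

```python
from typing import Dict, List, Optional

# Hypothetical flowchart of an order process: node -> successor nodes.
FLOWCHART: Dict[str, List[str]] = {
    "start": ["enter_order"],
    "enter_order": ["check_stock"],
    "check_stock": ["ship_goods", "notify_customer"],  # decision node
    "ship_goods": ["end"],
    "notify_customer": ["end"],
    "end": [],
}


def enumerate_paths(
    graph: Dict[str, List[str]],
    node: str = "start",
    goal: str = "end",
    prefix: Optional[List[str]] = None,
) -> List[List[str]]:
    """Return every path from node to goal in a loop-free flowchart."""
    path = (prefix or []) + [node]
    if node == goal:
        return [path]
    paths: List[List[str]] = []
    for successor in graph[node]:
        paths.extend(enumerate_paths(graph, successor, goal, path))
    return paths


test_cases = enumerate_paths(FLOWCHART)
for number, path in enumerate(test_cases, start=1):
    print(f"TC{number}: " + " -> ".join(path))
```

The single decision node produces two paths, and therefore two possible test cases.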

Anne Kramer has been working with this method for years and argues that it should be stripped of its heavyweight reputation. The term “model-based testing” sounds complicated and suggests expert knowledge and mandatory test automation. “Visual test design” would often be more appropriate, because that is exactly what happens when someone uses boxes and arrows to describe what a system should do.

Which test types are particularly suitable

Flowchart models show their strength where the view of the system remains abstract. In the language of the V-model, these are the upper levels; in the language of the test pyramid, these are also the upper layers. This refers to system testing and system acceptance testing.

These tests describe entire processes instead of individual clicks, for example from creating the customer, through the data record in the database, to the actual dispatch of the goods. The tester thinks in terms of cases, not mouse movements. Anyone under 16 cannot get a beer in Bavaria. Anyone aged 16 or 17 gets beer, but no schnapps. From 18, you get everything. What the checkout process actually looks like is irrelevant for this consideration.
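The beer example can be written down as three cases with boundary checks. The sketch below assumes that ages 16 and 17 fall into the middle case and that 18 already belongs to the upper one. The function name allowed_drinks and the chosen test ages are made up for illustration.

```python
def allowed_drinks(age: int) -> set:
    """Return the drinks a guest of the given age may be served."""
    if age < 16:
        return set()
    if age < 18:
        return {"beer"}
    return {"beer", "schnapps"}


# The values on both sides of each boundary (16 and 18); together they
# also give at least one representative per case.
EXPECTED = {
    15: set(),
    16: {"beer"},
    17: {"beer"},
    18: {"beer", "schnapps"},
}

for age, drinks in EXPECTED.items():
    assert allowed_drinks(age) == drinks, f"unexpected result for age {age}"
```

Nothing in this check refers to screens, buttons or a checkout process, which is the point of staying at the level of cases.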

State-based models can also be used for unit testing. And in addition to graphical models, there are also approaches based on textual models. The range is wide, but the practical focus is on the processes above.

The model is created early on, often by the tester

The biggest advantage lies in clarifying early on what is to be tested and how. As soon as it is clear what a system should do, processes can be defined long before it is clear what a button is called or in which order the input masks should appear.

Modeling actually belongs in requirements engineering. The idea comes from the Unified Process environment, which is where UML comes from. The idea was to present requirements graphically to make them easier to understand. In practice, this rarely happens. Projects start with textual requirements or a product backlog, and a large part of the necessary information remains in the heads of the stakeholders.

Making this hidden information visible is the real added value. And it is often the testers who build the model. Developers come up with something and ask whether it works. Testers want to know beforehand what is right and what is wrong, what is required and what is not. This need for clarification makes them good modelers.

Why every model brings questions to light

As soon as someone starts drawing a model, questions arise that never appeared in the text. What happens when you click on Cancel? Is every data set really saved? The first versions of a model are full of yellow sticky notes. The note field is a central modeling element because it records open questions right where they arise, so that they can be resolved later.

The model thus becomes a means of communication. You can show: This is how I understood the process, does it match your plans, and I still have questions here and here. These questions can be addressed specifically.

Models make implicit assumptions explicit. The human brain doesn’t like gaps and fills in missing information from its own experience. Everyone is satisfied because everyone believes they have understood the text. But each person's experience is different, and so is their understanding. Only the drawn model makes it visible that your understanding differs from that of others.

Requirements engineering even has its own review method for this: transformation into a different form of documentation. When you transform requirements from text into a model, you discover inconsistencies that remain invisible no matter how often you reread the text. The earlier this step is taken, the further testing shifts left.

Models also work with paper and pencil

Model-based testing does not need complete UML notation or ready-made test automation. It can even be done with paper and pencil. Although the advantage of automatic test case generation is lost, even this pragmatic approach with a low level of maturity beats the variant without any model at all.

The more pragmatic I am, the better it works. A small modeling palette is sufficient: I need boxes and arrows, the option for sub-diagrams, start nodes, end nodes and perhaps a decision node.

Anne Kramer

Even this simple start pays off in terms of communication, clarity and mastering complexity. A model can even emerge from a meeting or a demonstration, not just from formal requirements.

From model to test case: how generation works

After the processes comes the second step: determining what is to be tested. This is where the divide-and-conquer principle comes into play. The processes consist of blocks, usually actions. For each action, you can consider what needs to be tested. At this level, the focus is on individual requirements or user stories and their acceptance criteria.

These considerations can also be shared with development so that it is clear what should happen in the event of an error, i.e. what the expected result is. Then it becomes tool-specific. A test case generator runs through the model from the starting point, collects the stored information about the various paths step by step and uses it to build a scenario.

What comes out at the end depends on what is in the model. In most cases, textual scenarios are created for manual testing. However, if you store keywords from a keyword-driven approach, the tool builds a script from function calls. The keywords still have to be implemented afterwards, but the structure of the script is already in place.

This is exactly where model-based testing forms a bridge. Both manual text descriptions and automation scripts can be generated from the same model.
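How one model path can feed both outputs is sketched below. The Step structure, the keyword names and the two render functions are assumptions made for this example. Real generators use their own formats, and the keywords themselves would still have to be implemented in an automation framework.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    description: str            # wording for the manual test case
    keyword: str                # hypothetical keyword for automation
    args: Tuple[str, ...] = ()  # arguments passed to the keyword


# One path through a hypothetical model, with both kinds of information stored.
PATH: List[Step] = [
    Step("Create a new customer named Anna", "create_customer", ("Anna",)),
    Step("Place an order for one item", "place_order", ("item-1",)),
    Step("Check that the goods were dispatched", "verify_dispatch"),
]


def to_manual_test(path: List[Step]) -> str:
    """Render the path as numbered instructions for a human tester."""
    return "\n".join(f"{n}. {s.description}" for n, s in enumerate(path, start=1))


def to_keyword_script(path: List[Step]) -> str:
    """Render the same path as a sequence of keyword calls."""
    lines = []
    for step in path:
        arguments = ", ".join(repr(a) for a in step.args)
        lines.append(f"{step.keyword}({arguments})")
    return "\n".join(lines)


print(to_manual_test(PATH))
print(to_keyword_script(PATH))
```

The model data is untouched between the two calls. Only the rendering differs.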

Tools differ in openness and connection

There are roughly two types of tools. Some are relatively closed: they connect directly to their own test automation framework and generate output specifically for it. The others are more open and export in various formats.

Common test management and ALM tools such as Xray, Jira or ALM can be connected, as can implementation tools. According to the vendors, adding a missing output format is usually not a major effort, because essentially only a suitable export template needs to be created.

Consciously mastering the test case explosion

Model-based testing is often associated with the keyword test case explosion. It is caused by feedback loops. For example, if a step fails and the process starts again from the beginning, the loop can be run once, twice or any number of times. Together with combinatorics, the number of paths grows rapidly.
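The growth is easy to reproduce with a small experiment. The sketch below counts the paths through a made-up model with two retry loops while limiting how often each node may be visited. The model, the limit max_visits and the function count_paths are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical model with two feedback loops (failed login, failed payment).
MODEL: Dict[str, List[str]] = {
    "start": ["login"],
    "login": ["login_failed", "order"],
    "login_failed": ["login"],        # loop back to login
    "order": ["payment"],
    "payment": ["payment_failed", "end"],
    "payment_failed": ["payment"],    # loop back to payment
    "end": [],
}


def count_paths(
    graph: Dict[str, List[str]],
    max_visits: int,
    node: str = "start",
    goal: str = "end",
    visits: Optional[Counter] = None,
) -> int:
    """Count start-to-goal paths that visit no node more than max_visits times."""
    if visits is None:
        visits = Counter()
    if visits[node] >= max_visits:
        return 0
    if node == goal:
        return 1
    visits[node] += 1
    total = sum(
        count_paths(graph, max_visits, successor, goal, visits)
        for successor in graph[node]
    )
    visits[node] -= 1  # backtrack so sibling branches start from the same state
    return total


for limit in (1, 2, 3, 4):
    print(limit, count_paths(MODEL, limit))
```

With two independent loops, the number of paths is the square of the visit limit. Every additional loop in the model multiplies the count again.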

The central lever against this explosion is coverage. The model specifies what is to be tested, i.e. all cases that must occur at least once. The minimum requirement is that every modeled test variant appears in at least one generated scenario.

In addition, various coverage targets can be selected:

all activities
all equivalence partitions of data values
all state transitions in the state diagram

The specific methods available depend heavily on the tool.
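As an illustration of the coverage target "all activities", the sketch below picks a small set of paths from a list of candidates using a simple greedy selection. The greedy strategy was chosen here for brevity and is not taken from any specific tool. The candidate paths and the function name are invented for the example.

```python
from typing import List

# Hypothetical candidate paths through a model with two retry loops.
CANDIDATES: List[List[str]] = [
    ["start", "login", "order", "payment", "end"],
    ["start", "login", "login_failed", "login", "order", "payment", "end"],
    ["start", "login", "order", "payment", "payment_failed", "payment", "end"],
    ["start", "login", "login_failed", "login", "order", "payment",
     "payment_failed", "payment", "end"],
]


def select_for_activity_coverage(paths: List[List[str]]) -> List[List[str]]:
    """Greedily pick paths until every activity occurs in at least one of them."""
    uncovered = set().union(*paths)
    selected: List[List[str]] = []
    while uncovered:
        # Take the path that covers the most activities still missing.
        best = max(paths, key=lambda p: len(uncovered & set(p)))
        selected.append(best)
        uncovered -= set(best)
    return selected


selected = select_for_activity_coverage(CANDIDATES)
print(f"{len(selected)} of {len(CANDIDATES)} candidate paths selected")
```

In this example a single path already touches every activity, so the other three candidates are not needed for this particular coverage target. A stricter target, such as all transitions, would generally select more paths.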

Keeping models up to date: two strategies

In an agile context, the system is constantly changing and the model changes with it. Configuration management for the models is a prerequisite for dealing with this. You need to know which model was used to generate which test cases and what has changed between versions. Otherwise, at some point it will no longer be possible to tell which model belonged to which sprint.
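One lightweight way to keep this traceability is to stamp every generated test case with a fingerprint of the model it came from. The following sketch assumes the model can be serialized as a dictionary. The function names, the field model_version and the 12-character hash prefix are arbitrary choices for this example, and in practice a version control system would usually take over this role.

```python
import hashlib
import json
from typing import Dict, List


def model_fingerprint(model: Dict[str, List[str]]) -> str:
    """Return a short, order-independent hash identifying a model version."""
    canonical = json.dumps(model, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def changed_nodes(old: Dict[str, List[str]], new: Dict[str, List[str]]) -> List[str]:
    """List nodes that were added, removed or rewired between two versions."""
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))


# Hypothetical model versions from two consecutive sprints.
SPRINT_1 = {"start": ["enter_data"], "enter_data": ["save"], "save": []}
SPRINT_2 = {
    "start": ["enter_data"],
    "enter_data": ["confirm"],
    "confirm": ["save"],
    "save": [],
}

# Each generated test case records which model version produced it.
test_case = {
    "id": "TC-1",
    "steps": ["start", "enter_data", "save"],
    "model_version": model_fingerprint(SPRINT_1),
}

print(test_case["model_version"] == model_fingerprint(SPRINT_2))  # False: outdated
print(changed_nodes(SPRINT_1, SPRINT_2))  # ['confirm', 'enter_data']
```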

There are two basic strategies for dealing with changed test cases.

Approach	Procedure	Fits well when
Generate new	Discard old test cases, generate new ones every sprint	Agile and automated testing; avoids accumulating tests that check the same thing and allows different variants from sprint to sprint
Synchronize	Test cases retain their ID and are adapted to the changed model with tool support	Manual testing, where results from the last run are reused or argued to be still valid

Simply discarding everything often fails because of the process. Those who test manually want to reuse parts of the last run or justify that a result is still valid because nothing has changed. With the synchronizing approach, the test cases retain their ID, look largely the same as before and only contain the new steps, such as an additional dialog that now has to be dismissed. Both approaches have their place.
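The synchronizing strategy can be sketched as a matching problem. In the simplified example below, an old test case keeps its ID if all of its steps still appear, in the same order, in a newly generated path. This matching rule, the ID scheme and all step names are assumptions for illustration. Real tools are likely to use more robust matching.

```python
from typing import Dict, List


def is_subsequence(old: List[str], new: List[str]) -> bool:
    """True if all old steps occur in new, in the same order."""
    remaining = iter(new)
    return all(step in remaining for step in old)


def synchronize(
    old_cases: Dict[str, List[str]], new_paths: List[List[str]]
) -> Dict[str, List[str]]:
    """Assign existing IDs to matching new paths and fresh IDs to the rest."""
    unmatched = dict(old_cases)
    next_number = len(old_cases) + 1  # simplification: assumes IDs TC-1..TC-n
    synced: Dict[str, List[str]] = {}
    for path in new_paths:
        match = next(
            (cid for cid, old in unmatched.items() if is_subsequence(old, path)),
            None,
        )
        if match is None:
            match = f"TC-{next_number}"
            next_number += 1
        else:
            del unmatched[match]
        synced[match] = path
    return synced


OLD_CASES = {
    "TC-1": ["open_form", "enter_data", "save"],
    "TC-2": ["open_form", "cancel"],
}
NEW_PATHS = [
    ["open_form", "enter_data", "confirm_dialog", "save"],     # one step added
    ["open_form", "cancel"],                                   # unchanged
    ["open_form", "enter_data", "confirm_dialog", "discard"],  # new variant
]

for case_id, steps in synchronize(OLD_CASES, NEW_PATHS).items():
    print(case_id, steps)
```

TC-1 keeps its ID and only gains the additional dialog step, TC-2 is unchanged, and the new variant receives a new ID.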

Three reasons in favor of visual test design

The benefits can be summarized in three points.

Firstly, it is a powerful communication tool for clarifying what needs to be done at an early stage. If you do this at the very beginning of the process, you can save yourself some of the separate detailing of requirements and work directly on the test cases. This amounts to acceptance test-driven development, only with models.

Secondly, it makes work easier and more efficient. Drawing a model is more fun than writing pages and pages of test specifications, and it is easier to review. Someone will look at a model a third or fourth time. Nobody does that with a 200-page test specification. The model also makes it easier to see whether all requirements are covered, and it makes test quality measurable.

Thirdly, it forms the bridge to automation. Even those who do not program can contribute to automation because keyword scripts can be generated from the model.

Visual test design is coming back, AI lowers the hurdle

The pragmatic variant of model-based testing is currently experiencing a comeback. For a long time, the topic was too closely associated with embedded systems and automation. Now the purely graphical aspect is coming to the fore, similar to the trend towards no-code and low-code automation tools. New tools are coming onto the market and interest is growing.

One reason lies in agility itself. The focus on user stories works well, but the systematic overall view suffers. The overview of epics and features is nowhere near as clear, and much ends up in exploratory testing. A model returns precisely this system view.

The next boost comes from artificial intelligence. Modeling requires the ability to abstract, and this first step in particular is difficult for many people who have been describing tests step by step for decades. AI could analyze existing models, test cases and requirements and generate a first draft. Such efforts already exist, and some of them are already built into products.

One caveat remains. The mental effort of engaging with the system under test is one of the major sources of added value. If the AI takes over, you get a pre-digested text that you simply wave through without having thought about it yourself. Whether this helps the cause is questionable. What AI does do, however, is lower the barrier to entry. Perhaps it works in two steps: the AI clears the first hurdle, and once you are used to the way models are formulated, it becomes easier to write the next model yourself from scratch.