Hybrid app testing combines automated and manual testing to close gaps that neither approach alone can cover. Automation systematically checks many devices and content, while manual and exploratory testing detects visual errors, usability problems and device-specific special cases. An automation share of around 70 to 80 percent does not replace the manual part, but complements it.
Key Takeaways
- If you run test automation without any manual part, you will miss exactly the errors that only the human eye notices: visual overlaps, usability problems and device-specific edge cases.
- A device setup of 15 to 20 devices in automation does not cover all relevant OS variants because manufacturers such as Samsung place their own layers on top of Android, resulting in thousands of special cases.
- Around 90 percent of failed automated test cases are actual bugs in the software, not errors in the automation itself.
- Automation breaks faster than planned because an iOS update can put the tool used out of action. Manual testing catches exactly this gap.
- Test automation falls into disrepair when teams defer maintenance under deadline pressure: if a third of the tests are broken after a sprint and no one fixes them, the suite loses its value.
Why apps cannot be tested completely automatically
Full test automation of mobile apps fails because of device diversity. At ZDF, apps have to run on a wide range of devices, including old models, so that the content remains accessible to everyone. It is precisely this breadth that makes pure automation expensive and error-prone.
Anika Strake describes the core problem as follows: Different operating systems, different Android versions and manufacturer-specific interfaces such as Samsung’s create thousands of special cases. If you want to cover all of this automatically, you create a maintenance effort that quickly becomes disproportionate.
The consequence is a trade-off instead of a maximum principle. You have to decide what is automated, on how many devices, and which gaps remain. The remaining gap cannot be automated away; it can only be closed by other means.
How a tender for pure automation became a hybrid approach
The original brief was to automate everything. ZDF’s tender stated that only automated testing was to be offered. Quality assurance had previously been neglected and automation was seen as the logical next level.
Appmatics won the tender and still chose a different approach. The reason was practical: the software supplied was poorly suited to automation. Some of the tags and IDs that make it easier for automated tests to address UI elements were not set. In an app that is under constant development, this also increases the maintenance effort.
The kick-off phase dragged on as a result. However, the client wanted regular testing and regular results. So the team started manually and added automation later. This makeshift solution became the fundamental principle.
Manual and automated testing complement each other rather than compete
The strength lies in the combination of both approaches. Automation reaches many small corners of an app, while manual testing catches the things that machines are poor at assessing.
For example, automated tests click through every broadcast page in the media library and check whether content is available and links work. This would be extremely time-consuming manually. Conversely, a human recognizes usability problems, overlapping elements or hidden buttons much faster and finds errors during exploratory testing that no script would have foreseen.
The basic setup comprises 15 to 20 devices in test automation. Additional devices are added for manual testing. If an error only occurs with a certain OS version, narrowing it down manually is often faster than searching through all automated tests.
How to decide whether to automate a test case
The decisive factors are criticality and frequency. Key features and happy paths belong in automation because they must not break under any circumstances.
Manual testing itself provides a second signal. If a test feels tedious and repetitive, that is a first indicator that it should be automated. This is precisely why it is worth starting manually: you get a feel for the app and for the places where problems regularly occur.
Frequent errors in a certain area are the third criterion. If similar problems occur again and again, even outside of the absolute main features, this case moves to automation.
Benedikt Broich describes the typical process in concrete terms: in the first week, a critical error is identified through exploration. In the second week, it is retested manually, and by then a test case already exists. If this test is time-consuming or the error occurs again, the team adds automation as a safety net.
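The three criteria above (criticality, repetitiveness in manual runs, recurring errors in an area) can be sketched as a small checklist function. This is only an illustration, not the team's actual tooling: the class `CandidateCase`, its fields and the thresholds `min_runs` and `min_defects` are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class CandidateCase:
    name: str
    is_key_feature: bool     # key feature or happy path
    runs_per_month: int      # how often the case is run manually (assumed metric)
    feels_repetitive: bool   # tester's impression from manual runs
    recurring_defects: int   # defects found in this area so far


def automation_reasons(case: CandidateCase,
                       min_runs: int = 4,
                       min_defects: int = 2) -> list[str]:
    """Return the reasons for automating a case; an empty list means keep it manual."""
    reasons = []
    if case.is_key_feature:
        reasons.append("critical")
    if case.feels_repetitive and case.runs_per_month >= min_runs:
        reasons.append("repetitive")
    if case.recurring_defects >= min_defects:
        reasons.append("recurring defects")
    return reasons


player = CandidateCase("start playback", True, 8, True, 0)
footer = CandidateCase("imprint link", False, 1, False, 0)
print(automation_reasons(player))  # ['critical', 'repetitive']
print(automation_reasons(footer))  # []
```

Returning the reasons rather than a plain yes or no keeps the decision traceable when the suite is reviewed later.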
Automation breaks, and the system must be able to handle it
Mobile automation does not run stably on its own. It involves many tools and providers that have to work together. An iOS update can be enough for the automation tool to stop working with the new version.
The manual part catches such failures. If an iOS version cannot be tested automatically at the moment, it is covered manually instead. The hybrid structure is therefore not a luxury, but an insurance policy against the brittleness of the tools.
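A minimal sketch of this fallback, assuming the team keeps a list of devices and knows which OS versions the automation tool currently supports. The function name, device names and version strings are hypothetical.

```python
def split_test_plan(devices, supported_os):
    """Split devices into automated and manual runs.

    devices: list of (device_name, os_version) tuples
    supported_os: set of OS versions the automation tool currently handles
    """
    automated = [d for d in devices if d[1] in supported_os]
    manual = [d for d in devices if d[1] not in supported_os]
    return automated, manual


# Hypothetical device pool: the tool does not yet work with "iOS 18".
devices = [
    ("iPhone 13", "iOS 17"),
    ("iPhone 15", "iOS 18"),
    ("Galaxy S21", "Android 13"),
]
automated, manual = split_test_plan(devices, supported_os={"iOS 17", "Android 13"})
print(manual)  # [('iPhone 15', 'iOS 18')]
```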
The setup effort is high, but the ongoing maintenance effort remains manageable. Any errors that occur on the automation side can usually be fixed within a day. Of the failed test cases, around 90 percent are genuine bugs in the software and around 10 percent are problems with the automation itself.
Results found are verified manually, not blindly reported
Automated results end up in the in-house tool and undergo a human review. A test analyst looks over the results and makes an initial assessment: Is the error in the automation or in the app?
The error is then reproduced manually and narrowed down. Does it only occur on iOS or only on Android? Is it related to certain OS versions? Is it a visual error related to the screen size or to tablets?
This verification is particularly important for a media library. If the full-screen mode is not scaled correctly or content is displayed incorrectly, these are error classes that are difficult to find using automation alone.
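The narrowing-down questions in this section (iOS or Android, which OS versions) can be supported by a simple summary of the automated results before the manual check starts. The following is a sketch under assumptions: the result format with the keys `platform`, `os_version` and `passed` is invented for this example and says nothing about the in-house tool.

```python
from collections import defaultdict


def narrow_down(results):
    """Summarise where a test fails.

    results: list of dicts with the keys 'platform', 'os_version', 'passed'
    Returns a dict mapping each failing platform to its sorted failing OS versions.
    """
    failing = defaultdict(set)
    for r in results:
        if not r["passed"]:
            failing[r["platform"]].add(r["os_version"])
    return {platform: sorted(versions) for platform, versions in failing.items()}


results = [
    {"platform": "iOS", "os_version": "17", "passed": True},
    {"platform": "iOS", "os_version": "18", "passed": False},
    {"platform": "Android", "os_version": "13", "passed": True},
    {"platform": "Android", "os_version": "14", "passed": True},
]
print(narrow_down(results))  # {'iOS': ['18']}
```

A summary like this only tells the analyst where to look. Whether the cause lies in the app or in the automation still requires the human review described above.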
Gained capacity flows back into manual testing, not into savings
Automation accounts for around 70 to 80 percent of cases. Nevertheless, manual testing is not being phased out.
The capacity that automation frees up goes back into manual testing: more granular test cases, more exploratory testing, more devices. The proportion of automation therefore increases because new cases are added, not because manual testing is eliminated.
This distinguishes the approach from the usual shifting of resources from one side to the other. Only at the very beginning are manual tests replaced by automated ones. After a certain point, manual testing remains as a fixed pillar.
How to deal with test data that is constantly changing
With dynamic content such as a media library, generalized tests work better than fixed test data. The programs are constantly changing: content expires and new episodes are added.
Automation solves this using a data provider. The script scans the app, identifies the existing programs and runs the same test on each one until no more new programs appear. In this way, a single test can be rolled out across all programs.
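The data-provider idea described above can be sketched in a few lines: discover the programs that currently exist, stop when nothing new appears, and run the same check on each one. This is an illustration only. `fetch_page` and `check` are hypothetical callables standing in for whatever the real automation uses to read the app and to test a program page.

```python
def discover_programs(fetch_page):
    """Yield each program once; stop when a page brings nothing new.

    fetch_page(index) is a hypothetical callable returning the program IDs
    visible on page `index` of the app (an empty list past the end).
    """
    seen = set()
    index = 0
    while True:
        new = []
        for program in fetch_page(index):
            if program not in seen:
                seen.add(program)
                new.append(program)
        if not new:
            return  # no more new programs: the scan is complete
        yield from new
        index += 1


def run_for_all(fetch_page, check):
    """Run the same check on every discovered program; return the failures."""
    return [p for p in discover_programs(fetch_page) if not check(p)]


# Fake app content for demonstration; real data would come from the app.
pages = [["news", "docu"], ["docu", "sports"], ["sports"]]
fetch = lambda i: pages[i] if i < len(pages) else []
failures = run_for_all(fetch, check=lambda p: p != "sports")
print(failures)  # ['sports']
```

Because the list of programs is discovered at run time, the test needs no update when content expires or new episodes are added.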
This generalization has a deliberate limit. The automation does not check whether a title is linguistically correct or contains a spelling error because the effort involved would exceed the benefit. Manual testing, which is part of the setup anyway, takes care of such content assessments.
Why automation breaks down due to deadline pressure
The most common mistake is to hand test automation to the development team as a side task. The expectation is that the developers will write their automated tests and everything will fall into place by itself. In reality, the opposite happens.
When the next deadline arrives, testing is put on the back burner. Suddenly, a third of the tests are broken and there is no time to fix them in the next sprint. Automation is maintenance-intensive, especially when several teams are developing in parallel and release candidates are merged, so tests inevitably break.
The countermeasure is to reserve a fixed amount of capacity, whether a full position, half a position or a portion of the sprint. This is the only way to keep quality assurance flexible and functional at the same time.
In the end, no one is helped if the feature is ready by the deadline but is broken.
Benedikt Broich
Errors are spread across all levels
No single test level finds most errors. The biggest problems are spread across manual, automated and exploratory testing.
A broken player in which full-screen mode no longer works is more likely to be noticed manually. Automation finds content that fails to load in exactly the places where it happens. Odd errors on rare fringe devices often only show up when a problem is investigated through exploratory testing.
It is precisely this distribution that confirms the hybrid approach. Only the combination of the three levels ensures that so many errors are found at all.


