Testing data processes means checking machine-generated output data against a target result that is derived directly from the functional specification. The case distinctions in that specification yield equivalence partitions, and test coverage follows from them automatically. Experience shows that production data covers around 70 percent of cases; the remaining 30 percent is added synthetically.
Key Takeaways
- Production databases only cover around 70 percent of case constellations in testing because rare borderline cases simply do not occur there and therefore remain undetected.
- Specialist departments can only reliably detect data errors in processing procedures if the results are presented visually and in a way that is comprehensible without SQL knowledge.
- Automatically generated target results based on the functional specification replace manual individual test cases and provide higher test coverage for the same effort.
- The set of rules for data comparison can also serve as a detailed functional specification because it describes the mapping logic precisely and in a way that is understandable for non-technicians.
- AI is a useful assistant in this context, for example to generate SQL queries and display them visually, but it does not relieve the specialist department of responsibility for checking the content.
Data errors appear too late if the specialist department cannot access the data
In many companies, errors in data processing are discovered too late because knowledge of the data and access to it sit in different places. The IT department can access the data without difficulty, but lacks the domain understanding to recognize errors in its content. The specialist department would have that understanding, but not the access.
In practice, this leads to a cumbersome ping-pong. The specialist department comes up with individual test cases, receives data for them and checks them manually. This only covers a fraction of the possible cases and costs a lot of time.
The later an error is discovered, the more effort it generates. This is precisely why it pays to start testing data as early as possible, ideally during the development of the processing logic.
What distinguishes data quality from testing data processes?
Data quality and testing data processes are two different tasks. With pure data quality checks, data is delivered without any knowledge of how it was created, so it can only be checked for plausibility.
An example: In a list of living persons, a date of birth of 1810 is unlikely. Nothing beyond this plausibility check is possible because neither the input data nor the mapping rule is known.
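Such a plausibility check could look like the following minimal sketch. The field name `date_of_birth` and the 120-year threshold are assumptions chosen for illustration, not values from a real system.

```python
from datetime import date

def implausible_birth_dates(rows, today, max_age_years=120):
    """Return rows whose date of birth lies in the future or implies an
    implausibly high age. The threshold is an assumed value."""
    flagged = []
    for row in rows:
        dob = row["date_of_birth"]
        # Completed years between dob and today.
        age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
        if dob > today or age > max_age_years:
            flagged.append(row)
    return flagged

# Hypothetical list of living persons.
persons = [
    {"id": 1, "date_of_birth": date(1810, 5, 1)},
    {"id": 2, "date_of_birth": date(1985, 3, 12)},
]
flagged = implausible_birth_dates(persons, today=date(2024, 1, 1))
```

The check can only say that a value looks unlikely. It cannot say what the correct value would have been, which is exactly the limit of pure data quality.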
When testing data processes, on the other hand, there is a functional specification of how input data should be transformed into output data. Here, a target result can be derived and compared with the actual result. The question is then: Was the output data produced in accordance with the functional specification?
Systematically generate target results instead of building individual test cases
The more effective approach is to systematically generate target results and fully compare them with the actual results instead of defining individual test cases by hand. This is based on business rules that the specialist department has to define anyway in order to describe the data mapping.
These rules can be used to automatically generate a target, which can even be applied to the complete production database. Every row and every column can be checked in this way.
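The idea of generating a target from rules and checking every row and every column can be sketched as follows. The rule in `expected_row`, the column names, and the age limits are hypothetical; a real rule set would come from the specialist department.

```python
def expected_row(source):
    """Hypothetical business rule: derive the expected output row from one input row."""
    age = source["age"]
    if age >= 65:
        age_group = "senior"
    elif age >= 18:
        age_group = "adult"
    else:
        age_group = "minor"
    return {"id": source["id"], "age_group": age_group}

def compare(inputs, actual_by_id):
    """Compare every column of every row.
    Returns a list of (id, column, expected, actual) tuples."""
    differences = []
    for source in inputs:
        target = expected_row(source)
        actual = actual_by_id.get(source["id"])
        if actual is None:
            differences.append((source["id"], "<row>", "present", "missing"))
            continue
        for column, expected in target.items():
            if actual.get(column) != expected:
                differences.append((source["id"], column, expected, actual.get(column)))
    return differences

inputs = [{"id": 1, "age": 70}, {"id": 2, "age": 30}, {"id": 3, "age": 10}]
# Hypothetical output of the system under test: row 2 is wrong, row 3 is missing.
actual_by_id = {
    1: {"id": 1, "age_group": "senior"},
    2: {"id": 2, "age_group": "senior"},
}
differences = compare(inputs, actual_by_id)
```

Because the target is computed rather than written by hand, the same comparison runs unchanged on three rows or on the complete production database.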
The effort involved remains manageable. Achieving the same coverage with manual test cases takes at least as long, and the test quality is still poorer because fewer cases can be covered.
Production data only covers part of the cases
If automatic target generation is applied to the production data, this often only covers around 70 percent of the relevant cases in practice. The missing 30 percent must be added in order to achieve extensive coverage.
The reason for this lies in the business logic itself. The case distinctions naturally form equivalence partitions. If there is no suitable data for a case distinction, for example for the case “age greater than 90”, it is not known whether the system under test works correctly in this case.
Such gaps only become apparent when a real case occurs later and goes wrong. Those who only use production data simply do not test these constellations. This is why the anonymized 70 percent from production is specifically enriched with the missing constellations.
For ongoing testing, you build a test set that covers every case constellation with one representative each and therefore runs quickly. The full data set is tested separately, as quality assurance of the production data.
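Finding uncovered partitions and building a compact test set with one representative per partition can be sketched like this. The partitions, the sample rows, and the synthetic row are assumptions for illustration only.

```python
# Hypothetical equivalence partitions derived from case distinctions on "age".
PARTITIONS = {
    "age < 18": lambda row: row["age"] < 18,
    "18 <= age <= 90": lambda row: 18 <= row["age"] <= 90,
    "age > 90": lambda row: row["age"] > 90,
}

def representatives(rows):
    """Pick the first row that falls into each partition.
    A partition maps to None if the data contains no matching row."""
    chosen = {name: None for name in PARTITIONS}
    for row in rows:
        for name, matches in PARTITIONS.items():
            if chosen[name] is None and matches(row):
                chosen[name] = row
    return chosen

# Anonymized production sample (made-up values).
production_sample = [{"id": 1, "age": 34}, {"id": 2, "age": 12}, {"id": 3, "age": 67}]
chosen = representatives(production_sample)

# Partitions without any production row are the coverage gaps.
uncovered = [name for name, row in chosen.items() if row is None]

# The gaps are closed with synthetic rows, giving one representative per partition.
synthetic = [{"id": 1001, "age": 95}]
test_set = [row for row in chosen.values() if row is not None] + synthetic
```

In this sample the partition “age > 90” has no production row, so it would remain untested without the synthetic addition.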
It is not a double implementation if you only replicate the mapping
A common objection to data comparisons is that the system is built a second time, thereby introducing the same sources of error. This is not the case with the approach described here, because only the functional mapping rule is reproduced, not the system.
Most mappings are extensive, but not complex. It is all about case distinctions: If this, then that, otherwise something else. Such logic can be mapped directly from the specification without having to worry about performance or other constraints.
The difference in effort is clear. Where generating the target result takes a day, implementing the system takes one to two weeks with more people. So you are not doing the work twice, and above all you are not doing much.
“We don’t rebuild the entire system. We use the specification and recreate the mapping rule directly with the specification. It’s not about when exactly what has to happen, we just test the mapping.” (Joshua Claßen)
For really complex calculations, a different approach applies. In a complex simulation, you get verified target results from an independent source and compare them with a defined tolerance.
Data from many delivery systems must be merged before testing
In heterogeneous system landscapes, there is rarely just one input system. Several delivery systems feed in data that must be harmonized and transferred to a central interface.
One example is a money laundering check on transaction data. The input data from the various delivery systems is brought together and delivered to this interface. It is precisely this process of merging that can be tested using the same principles as a single mapping.
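A minimal sketch of how the expected content of such an interface could be derived from two delivery systems. Both source formats, all field names, and the target schema are invented for illustration.

```python
from decimal import Decimal

def from_system_a(record):
    """Hypothetical system A: amounts in cents, its own field names."""
    return {
        "transaction_id": f"A-{record['txn_no']}",
        "amount": Decimal(record["amount_cents"]) / 100,
        "currency": record["ccy"].upper(),
    }

def from_system_b(record):
    """Hypothetical system B: amounts as decimal strings."""
    return {
        "transaction_id": f"B-{record['id']}",
        "amount": Decimal(record["value"]),
        "currency": record["currency"].upper(),
    }

def expected_interface(system_a_records, system_b_records):
    """Harmonize both deliveries into one target, sorted by transaction id."""
    merged = [from_system_a(r) for r in system_a_records]
    merged += [from_system_b(r) for r in system_b_records]
    return sorted(merged, key=lambda row: row["transaction_id"])

expected = expected_interface(
    [{"txn_no": 7, "amount_cents": 1250, "ccy": "eur"}],
    [{"id": 3, "value": "99.90", "currency": "USD"}],
)
```

The resulting target can then be compared with what actually arrived at the interface, row by row and column by column, just as for a single mapping.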
Why data comparisons need tolerances
A data comparison is not always an exact one-to-one comparison. In certain cases, small deviations are acceptable from a business perspective and must be tolerated via defined thresholds.
In a trading system at banks, for example, cash flows are generated algorithmically. Due to numerical properties, the same result can deviate by one or two cents depending on the sequence of calculation operations. Such deviations are not critical as long as they do not rise above acceptable thresholds.
Tolerances do not only affect numbers. A tolerance can also be useful for text fields, for example if upper and lower case should not play a role in the comparison.
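Both kinds of tolerance can be sketched in a few lines. The two-cent threshold follows the example above; the function names are hypothetical, and the appropriate threshold is a business decision.

```python
from decimal import Decimal

def amounts_match(expected, actual, tolerance=Decimal("0.02")):
    """Numeric comparison that accepts deviations up to the given tolerance."""
    return abs(Decimal(expected) - Decimal(actual)) <= tolerance

def texts_match(expected, actual):
    """Text comparison that ignores upper and lower case."""
    return expected.casefold() == actual.casefold()
```

Using `Decimal` rather than binary floating point keeps the comparison itself from introducing further rounding effects at the cent level.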
Business rules that are also the specification
The set of rules used to generate the target can also be the detailed specification. In this way, the test basis and the functional specification are combined in one artifact instead of living in separate, divergent documents.
Such a rule requires no technical knowledge to read, only an understanding of plain language. An example of a naming rule:
- If the first name and last name are not empty, both are output separately.
- If only the first name is filled in, only the first name is output.
- If only the last name is filled in, only the last name is output.
- In all other cases, an error is displayed.
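The naming rule above translates almost line by line into code. This is a sketch under assumptions: the function name is hypothetical, “empty” is taken to mean missing or whitespace-only, the output is modelled as a list of separate parts, and the error is modelled as an exception.

```python
def output_name(first_name, last_name):
    """Apply the naming rule and return the name parts to output."""
    # Assumption: None and whitespace-only values count as empty.
    first = (first_name or "").strip()
    last = (last_name or "").strip()
    if first and last:
        return [first, last]   # both filled in: output both separately
    if first:
        return [first]         # only the first name is filled in
    if last:
        return [last]          # only the last name is filled in
    raise ValueError("neither first name nor last name is filled in")
```

Each branch corresponds to exactly one bullet of the rule, so a missing case distinction in the specification shows up as a missing branch.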
This approach has a double benefit. If the department sees the results of the defined logic early on, it immediately recognizes if a case distinction is missing. An implementation can be correct to the specification and still deliver incorrect results because the specification itself is incomplete.
Visualization brings business and IT on the same wavelength
Data comparisons must be visualized so that the business department understands them and both sides can work on the same object. A document written in technical terms separates business and IT; a visible definition of the data comparison connects them.
The same patterns recur in data testing. Testing always means comparing a target with an actual result. Most data is available in tabular form, but XML or JSON can be compared as well. Because these patterns occur constantly, they can be implemented more quickly and easily with low-code components than by reprogramming them for each department.
It is precisely this redundancy that can be observed at large banking customers with many departments. Each one builds its own CSV comparator; each one reinvents the wheel. Nobody writes their own word processor either.
Even those who know SQL benefit from good visualization. It is quicker to grasp than pure code. If you want to query 99 out of 100 columns, you have to specify all 99 in SQL; in a graphical user interface, you click away one column.
AI makes testing more efficient, not superfluous
AI does not replace data testing; it speeds it up. An AI system can make suggestions, such as defining case distinctions as intervals or designing a test, but the result must remain verifiable.
This is precisely where visualization is the lever. If an AI only generates code, you have to be able to read code in order to verify the suggestion. If, on the other hand, the result is presented visually, even someone without in-depth coding knowledge can quickly assess whether it fits and correct the rest themselves.
There is a hard limit for sensitive company data. Such data is not sent to an external service on the Internet. The development is therefore moving towards more efficient, locally usable language models that can be used to generate SQL from a natural language requirement and a visualized no-code query.
AI is also suitable as a co-pilot for operating the tool itself. It can suggest how to use a tool or answer a question about how best to solve a specific task. However, fully autonomous testing along the lines of “here is my system, test it” is not to be expected in the near future.


