Data Warehouse Test 

 8 April 2021

Alternative approaches to testing a data warehouse system (DWH)

Conventional testing of operational information systems involves testers sitting at a workstation and operating the user interfaces. They supply the input, start a transaction and check the results. Each transaction is a test case. The same is true for Internet systems, which are transaction-oriented. For this type of system testing, the tester simulates the future end user and needs no higher qualification than the end user himself.

With batch systems or background processes, the testing problem has always been different. Here, one is much more dependent on test tools. The tester needs a much higher qualification because the test is mostly invisible. One or more input files are provided, then batch runs are started that process those files to update databases or generate reports. The tester's job is to populate the input files and/or databases with test data, run the jobs, and validate the results.

This sounds simple at first, but it is in fact a very difficult task, far more difficult than testing a user interface. Data warehouse systems belong to this category of IT systems. When testing a user interface, the relationship between the input data and the result data is clear: they belong to the same transaction. They are also manageable in number. An input screen will contain about 10 data values; the number of results in the output mask is usually even smaller. So the tester has to deal with a manageable amount of data per transaction.

This is different with a batch test, and even more so with data warehouse systems. A data warehouse is usually filled with input files/tables from several source systems using load jobs. Each input medium has a large number of individual attributes, often more than a hundred. With multiple input sources, the number of data attributes multiplies. Within the data warehouse system, many further jobs transform and aggregate these attributes. The relationship between the input data on the one hand and the data stored in the data warehouse on the other is by no means as clear as when testing user interfaces; on the contrary, it is hardly intuitive to grasp.

Data warehouse structure

In this situation, the tester must rely on formal means. He needs rules that describe the relationship between the inputs and the outputs. Otherwise, he can only check the plausibility of the result data, i.e. whether they fall within a valid value range, regardless of the state of the input data. This, however, is a very superficial test. If the rule says every number > 0 and < 1000 is plausible, then every number between 0 and 1000 is a correct result, no matter how it came about. Such a test is better than none at all, but hardly sufficient to guarantee correctness.

Statistical testing, i.e. comparing the frequencies of selected types of results, is also inadequate because it can only detect gross deviations between target and actual values. It is better than plausibility testing, but still not good enough to prove the correctness of the results. At best, it can uncover some representative deviations.1

Test automation in the Data Warehouse Test

The only really effective method for verifying all individual results is to check each single result against its rule. For each result value, the prescribed formula of its derivation as well as the arguments or input values must be used to recalculate whether the value is correct. In principle, this check can be performed manually: the tester needs one window on the input data, a second window on the rule and a third window on the results. Proceeding this way, he needs a good half hour per result value. With larger data warehouse systems, it is quite possible that up to 5000 attributes are read from a single source system. Even if he only wants to verify one representative result value per result type, he will need about 2500 hours, or 312.5 working days, of intensive work just to verify the results for 5000 result types. When further source systems and the transformations within the data warehouse are considered, the manual effort multiplies.

The other problem with this manual matching is that it is not only time-consuming but also very demanding. It requires the tester's full concentration to reproduce the rule for each result type, and such intense mental effort can hardly be expected of employees under today's project conditions. The error rate of rule comprehension would be higher than the error rate of rule coding. Therefore, only one solution remains: to automate the result checking. In 2006, we were able to gain experience with automated result checking in one of our first data warehouse test projects.
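The principle of such automated result checking can be sketched in a few lines of Python. The names and rules below are illustrative, not the tool actually used in the project: each target attribute is recomputed from the source record according to its rule and compared with the value stored in the warehouse.

```python
# Illustrative sketch of automated result checking (not the project tool).
# Each rule recomputes the expected value of one target attribute from the
# source record; the check compares it with the value loaded into the DWH.

def check_record(source_row, target_row, rules):
    """Return a list of (attribute, expected, actual) deviations."""
    deviations = []
    for attr, rule in rules.items():
        expected = rule(source_row)          # recompute the value from its rule
        actual = target_row.get(attr)
        if expected != actual:
            deviations.append((attr, expected, actual))
    return deviations

# Hypothetical rules of classes 1 (one-to-one) and 6 (arithmetic):
rules = {
    "A_INTEREST": lambda src: src["ATTR_D"],
    "A_BALANCE":  lambda src: src["ATTR_B"] * -1 + 10,
}

source = {"ATTR_D": "3.5", "ATTR_B": 4}
target = {"A_INTEREST": "3.5", "A_BALANCE": 7}   # expected A_BALANCE is 6

print(check_record(source, target, rules))
```

Run over all records of a target table, such a loop replaces the half hour of manual recalculation per value with milliseconds.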

Specification of the transformation rules

At the beginning there is the requirement specification; without a specification, according to Hutcheson, there is no possibility of verification.2 In a data warehouse project, the specification is the sum of all data transformation rules. In this project, the data transformations were created using an entity-relationship modeling tool, and the rules were specified in Excel tables. In each table, there was a row for each new data attribute or result type. In addition to table and attribute names, the rows documented the attribute type, the attribute description and, of course, the rule for the attribute. For the example described here, values and rules of the load jobs from one source system were used. The test of the other source systems and of the transformation and aggregation jobs in the data warehouse system followed the same pattern.

Excerpt from the original rules table:

INTEREST_RATE: "Link to TAB_X / ATTR_D interest conditions applicable to the account."; If ATTR_B in TAB_Y(debit) has a value other than 0, this account is linked to an interest group. The following then applies: TAB_Y / KEY (Position 3-4) (Interest Type) A(2), TAB_Y / KEY (Position 5-9) (Interest Subtype) A(5). TAB_Y / KEY (Position 10-12) A(3). The above Key fields are concatenated in the ID. If in TAB_B the ATTR_D values are zeroes, the account interest condition has to be extracted from the X_TAB: TAB_X / ATTR_B (Base Rate) TAB_X / ATTR_S (Spread Rate). If both attributes <> 0 value, extract TAB_X If TAB_X / ATTR_D is available (i.e. <> 0), extract the same as for TAB_X_ID. If only ATTR_D of TAB_Y <> value, extract ATTR_D.

The rules were written in quasi-formal English prose by the analysts. At first glance, it seemed as if an automatic conversion of the informal rules into a formal interpretable expression would be impossible. However, closer examination revealed certain basic patterns in the rules. Over 4700 of the 5300 result attributes were specified with one of the following 7 rule patterns:

  • Rule class 1: A simple one-to-one assignment of a source attribute.
  • Rule class 2: A simple assignment of a constant value.
  • Rule class 3: A selection from a list of alternative constant values (enumeration).
  • Rule class 4: A union of multiple source attributes from a source table (concatenation).
  • Rule class 5: A union of multiple source attributes from multiple source tables (join).
  • Rule class 6: An assignment of the target attribute based on an arithmetic operation with constant values and/or source attributes.
  • Rule class 7: An assignment of the results of a function.

Each of these assignment rules could be conditional or unconditional. Conditional rules were introduced by some condition keyword, for example: if, when, as long as, in case, etc. The condition operands could be constant values as well as source attributes. A special condition was the success of a join operation. Usually these were simple ifs with else/otherwise clauses, but they were not always unambiguous.
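As a hedged illustration of how such rule texts might be sorted into the seven classes automatically (the actual project tooling is not shown here), a simple Python sketch exploiting the conventions of the formalized syntax, capitalized source attributes and quoted constants, could look like this:

```python
import re

# Hypothetical sketch: classifying formalized rule texts into the rule
# classes described above. Source attributes are written in capitals
# (TAB_A.ATTR_1), constant values are set between apostrophes.

CONST = re.compile(r"'[^']*'")
ATTR = re.compile(r"\b[A-Z][A-Z0-9_]*\.[A-Z][A-Z0-9_]*\b")

def classify(rule_text):
    consts = CONST.findall(rule_text)
    attrs = ATTR.findall(rule_text)
    if " ! " in rule_text:
        return 3                      # enumeration of alternative values
    if " ÷ " in rule_text:
        return 4                      # concatenation
    if " join " in rule_text:
        return 5                      # join of several source tables
    if any(op in rule_text for op in (" + ", " - ", " * ", " / ")):
        return 6                      # arithmetic operation
    if "Func." in rule_text:
        return 7                      # function call
    if attrs and not consts:
        return 1                      # one-to-one assignment
    if consts and not attrs:
        return 2                      # constant assignment
    return 0                          # unrecognized: rewrite by hand

print(classify("assign TAB_A.ATTR_1"))                      # class 1
print(classify("assign 'const01' ! 'const02' ! 'const03'")) # class 3
```

Rules that fall through to class 0 correspond to the roughly 600 rules that, as described below, turned out not to be implementable automatically.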

Of course, these ground rules were always formulated slightly differently by different analysts, but they were similar enough to recognize and classify. In addition, the constant values and the source attributes were recognizable: the source attributes were always written in capital letters, and the constant values were set between apostrophes. This provided the basis for automatically converting these semi-formal rules into strictly formal, testable assertion statements, which could be translated into interpretable test procedures by an assertion compiler. Before that, however, the rules had to be partially rewritten to be analyzable. This task was performed by the testers. Approximately 3900 rules belonged to rule classes 1 and 2 and did not need to be rewritten. The remaining 1400 rules had to be rewritten, but due to lack of time only 750 rules were implemented, according to the following patterns:

Rule class 3:

The rules for a list of alternative values (enumeration) could consist of constants as well as variables. The operands were separated by "!" characters.

assign 'const01' ! 'const02' ! 'const03'
assign TAB_A.ATTR_1 ! TAB_A.ATTR_2 ! TAB_A.ATTR_3

Rule class 4:

The rules for a union (concatenation) of several source attributes from a source table could consist of constants as well as variables. The operands were separated by "÷" characters.

assign TAB_A.ATTR_1 ÷ '_' ÷ TAB_A.ATTR_2

Rule class 5:

The rules for a union of multiple source attributes from different source tables were represented as join operations.

assign join TAB_A.ATTR_1 with TAB_B.ATTR_2 with TAB_C.ATTR_3

Rule class 6:

The rules for assigning target attributes by arithmetic operations had to be slightly reworded to become unambiguous.

assign TAB_A.ATTR_1 * -1 + TAB_B.ATTR_2 / 2

Since bracketing was not provided, the operands had to be put in the correct order so that they could be resolved from left to right. For the majority of arithmetic expressions this was sufficient.

Rule class 7:

The rules for assigning function results were formalized as a function call with the parameters written in curly braces.

assign Func.x {TAB_A.ATTR_1, '2', TAB_B.ATTR_2}

Finally, all conditions were converted into a uniform form. This form provided for an unbracketed listing of and/or comparisons for ease of understanding.

assign  <assignment> if TAB_B.ATTR_1 = '1' and TAB_C.ATTR_2 = '0'
assign  <assignment> if TAB_D.ATTR_3 < TAB_E.ATTR_1 or TAB_D.ATTR_3 > '0'

The output of a join operation could also be part of a condition, as the following example shows:

assign  <assignment> if join TAB_A.ATTR_1 with TAB_B.ATTR_3 = true

Every if condition was tied to an assign assignment: the assign statement was noted first, followed by the governing condition.

assign TAB_D.ATTR_2 if join TAB_D.ATTR_1 with TAB_B.ATTR_1 = true

The guiding principle for this syntax was to remain as close as possible to the existing English prose rule, so that the relevant analysts could easily understand it. Thus, it represents a compromise between the need to be understood by subject matter experts and the need to be interpreted by an automaton. As such, the expression comes close to a domain-specific language for data warehouse projects.
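To show how close this compromise syntax is to a machine-interpretable form, here is a minimal parsing sketch in Python. It is illustrative only, not the project's assertion compiler, and handles just the assign/if patterns shown above:

```python
# Illustrative sketch of parsing the compromise rule syntax shown above
# (not the actual assertion compiler used in the project).

def parse_rule(text):
    """Split 'assign <expression> if <condition>' into its parts."""
    text = text.strip()
    if not text.startswith("assign"):
        raise ValueError("rule must start with 'assign'")
    body = text[len("assign"):].strip()
    expr, _, cond = body.partition(" if ")
    rule = {"condition": cond or None}
    if expr.startswith("join "):
        rule["kind"] = "join"
        rule["operands"] = expr[len("join "):].split(" with ")
    elif " ! " in expr:
        rule["kind"] = "enumeration"
        rule["operands"] = [op.strip() for op in expr.split("!")]
    elif " ÷ " in expr:
        rule["kind"] = "concatenation"
        rule["operands"] = [op.strip() for op in expr.split("÷")]
    else:
        rule["kind"] = "assignment"
        rule["operands"] = [expr]
    return rule

print(parse_rule("assign TAB_A.ATTR_1 ÷ '_' ÷ TAB_A.ATTR_2"))
print(parse_rule("assign TAB_D.ATTR_2 if join TAB_D.ATTR_1 with TAB_B.ATTR_1 = true"))
```

The same keywords that keep the syntax readable for the analysts (assign, if, join, with) serve as the split points for the machine.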

Excerpt from the adjusted rules table:

INTEREST_RATE: "Link to TAB_X.ATTR_D interest conditions.
Debit interest conditions applicable to the account.";
" ? assign TAB_Y.ATTR_D | TAB_Y.ATTR_C | TAB_X.ATTR_C | 'D' if TAB_Y.ATTR_D (debit) <> '0',
assign TAB_X.ATTR_C if TAB_X.ATTR_D <> '0',
assign TAB_X.ATTR_C | TAB_X.ATTR_D if TAB_X.ATTR_B <> '0',
assign 'na' if TAB_Y.ATTR_D = '00' and TAB_X.ATTR_S = '0' and TAB_X.ATTR_B = '00'
assign TAB_X.ATTR_N|ATTR_Y|ATTR_O|ATTR_A if other.(comment).";

In the course of the formal revision of the rules, documentation errors were discovered, which were reported by the testers to the analysts. These were mostly incomplete rule descriptions (if without else, missing keys for join statements, etc.) that left room for interpretation, which prevented an exact check of the rule. This affected about 4% of all rules.

Conversion of the rules into test scripts

Of the approximately 5300 rules in the specification, over 3900 could be implemented without adaptation. Thus, 1400 rules had to be adapted, whereby about 600 turned out to be unimplementable. In the end, just under 800 rules were rewritten into the formal syntax. This took an average of 20 minutes per rule. So about 266 hours were required to formalize the rules. Alternatively, there would have been the options of a) not testing these rules or b) writing the assertion scripts manually.

However, writing the assertion scripts manually would have cost just as much, if not more, effort. Moreover, one would then have had to update two semantic levels with two different descriptions of one and the same rule. A similar problem arises in code generation. If the code is generated from the specification, only the specification needs to be maintained. But if, as is usually the case, the code is written manually, both the specification and the code must be updated. Since this is too costly in the long run, after a short time the specification is neglected and only the code is developed further. This leaves the analysts out of the game and the product remains in the hands of the programmers with all the negative consequences associated with historically grown systems, especially that no one understands them, not even the programmer himself.

To avoid this situation, it is recommended to maintain only one description of an IT system, namely the one whose language comes closest to the human language - the domain-specific language. At least the test scripts should be generated automatically. This was also done in this project. The assertions were generated from the specified rules using a tool.

Generated test script:

file: ACCOUNT;
// This comparison procedure assumes that the old file contains the following attributes:
//      from TAB_Y Table: ATTR_C, ATTR_D, ATTR_E, ATTR_F, ATTR_P
//      from TAB_Z Table: ATTR_R
if ( new.A_ID = old.ATTR_ID );
assert new.A_ID = old.A_TYPE if (old.A_TYPE = "G");
assert new.A_ID = "S" if (old.ATTR_A = "R" & old.ATTR_S = "S");
assert new.A_ID = "C" if (old.ATTR_A = "R" & old.ATTR_S = "C");
assert new.A_TYPE = "L" if (old.ATTR_A = "R" & old.ATTR_S = "L");
assert new.A_RATE = old.ATTR_C if (old.ATTR_B = "0");
assert new.A_INTEREST = old.ATTR_D;
assert new.A_LIQU = old.ATTR_E;
assert new.A_ACCT = old.ATTR_P;
assert new.A_PC   = old.ATTR_PC;
assert new.START_DATE = "2005.05.15" if (old.ATTR_H <> "0");
assert new.REVIEW_DATE = old.ATTR_V if (old.ATTR_F <> "0" & old.ATTR_N <> "0") ;
assert new.REVIEW_DATE = "NULL" if (old.ATTR_F = "0") ;
assert new.PRIMARY_INDICATOR = "P" ! "S" ! "n.a.";
assert new.INDICATOR = "inactiv" if (old.ATTR_R = X"3") ;
assert new.INDICATOR = "inactiv" if (old.ATTR_R = X"1") ;
assert new.INDICATOR = "closed" if (old.ATTR_R = "C") ;
assert new.INDICATOR = "active" if (other);

The convenient thing about test assertions, as opposed to code, is that for a given file or table they can be listed in any order. Each attribute of a table is tested on its own, independent of the other target attributes. The only important thing is that all assertions of a table are together in one test procedure. The test procedure has the name of the target table and an if-key condition to connect the target object to the source object. This if-key condition has the following format:

if (TAB_X.KEY_1 = TAB_A.KEY_1 & TAB_X.KEY_2 = TAB_B.KEY_2 & TAB_X.KEY_3 = TAB_C.KEY_3 &

The key comparison matches the target object with the correct source objects. The data of the source objects are previously fetched from the source tables using select and join statements and combined into comparison objects corresponding to the target objects. This means that for each target object there is a comparison object with all the source data from which the target object is formed. Creating these comparison objects from the source tables is a prerequisite for verifying the result values. All comparison objects for a given target object type are grouped together as rows in a table.

Therefore, the select statement for the formation of the comparison objects is also generated as part of the assertion generation. These select statements are derived from the join and select statements in the rule specification.
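The construction of comparison objects can be sketched with an in-memory SQLite database. The table and column names below are hypothetical; in the project, the generated select statements ran against the real source tables:

```python
import sqlite3

# Hedged sketch: building comparison objects by joining two source
# tables on their key, so that each target object has one combined
# source row to be checked against. Names are hypothetical.

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE TAB_A (KEY_1 TEXT, ATTR_1 TEXT);
CREATE TABLE TAB_B (KEY_1 TEXT, ATTR_2 TEXT);
INSERT INTO TAB_A VALUES ('100', 'G'), ('101', 'L');
INSERT INTO TAB_B VALUES ('100', '3.5'), ('101', '0');
""")

# One comparison object per target key, sorted by key as in the project.
comparison_objects = con.execute("""
    SELECT a.KEY_1, a.ATTR_1, b.ATTR_2
    FROM TAB_A a JOIN TAB_B b ON a.KEY_1 = b.KEY_1
    ORDER BY a.KEY_1
""").fetchall()

print(comparison_objects)   # [('100', 'G', '3.5'), ('101', 'L', '0')]
```

Each returned row is one comparison object; the target record with the same key is later verified against exactly this row.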

The assertion statements themselves are conditional or unconditional. Conditional assertions are linked with an if condition.

The assertion statement has the following basic syntax:

assert new.TAB.ATTRIBUTE <oper> <comparison expression>;

The operator <oper> can be =, <, >, <=, >=, <> or !=.

The comparison expression can be one of the following:

  • <operand> in the case of a 1:1 assignment.
  • <operand> ! <operand> ! <operand> in the case of a list of alternative values (enumeration).
  • <operand> | <operand> | <operand> in the case of a combined comparison value (concatenation).
  • Range <limit>':'<limit> in the case of a limit check.
  • <operand> <arithmetic expression> in the case of an arithmetic operation.

The operand is a constant like "const01" or a source attribute like TAB_A.ATTR_1.

If the assertion is conditional, an if condition is appended:


The condition <condition> has the following syntax:

if (<source attribute><oper><operand> {&

The source attribute is an attribute in the source tables, e.g. TAB_A.ATTR_1. The operands are the same as in the assert assignment.

Multiple 'and' conditions can be combined with the & connector. 'or' conditions are recorded as alternative assertions for the same target attribute.

assert new.TABX.ATTR_1='1' if (TAB_A.ATTR_2='A');
assert new.TABX.ATTR_1='2' if (TAB_A.ATTR_2='B');
assert new.TABX.ATTR_1='3' if (TAB_A.ATTR_2='C');
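A hedged sketch of how such conditional assertions might be evaluated (illustrative only, not the actual interpreter): each assertion fires only when its condition holds on the old (source) record, so 'or' alternatives coexist naturally as separate assertions.

```python
# Illustrative evaluator for conditional assertions (not the actual
# DataTest interpreter). Each assertion fires only when its condition
# holds on the old (source) record.

def evaluate(assertions, old, new):
    """Return failed assertions as (target_attr, expected, actual)."""
    failures = []
    for target_attr, expected, condition in assertions:
        if condition(old):              # 'and' terms live inside the condition
            actual = new.get(target_attr)
            if actual != expected:
                failures.append((target_attr, expected, actual))
    return failures

# The three alternative assertions from the example above:
assertions = [
    ("ATTR_1", "1", lambda old: old["ATTR_2"] == "A"),
    ("ATTR_1", "2", lambda old: old["ATTR_2"] == "B"),
    ("ATTR_1", "3", lambda old: old["ATTR_2"] == "C"),
]

old = {"ATTR_2": "B"}
new = {"ATTR_1": "2"}
print(evaluate(assertions, old, new))   # [] -> all applicable assertions pass
```

Because only the assertion whose condition matches ever fires, listing alternatives in any order is safe, which is exactly the property described above.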

Rules of rule class 7 (functions) could not be converted into testable assertions by the assertion generator because the connection to the necessary function libraries was missing. These assignments were marked as comments in the assertion procedures and had to be tested manually by the testers.

The assertion generator parsed the Excel tables specifying the rules and converted the rules into assertions in sequence. One or more assertions were generated from each rule; for the 266 target tables, this took less than a minute. The result was 266 assertion procedures plus a log of all missing and untranslatable rules, statistics on the rules, and SQL select statements for selecting the source data for each target object. The main advantage of automatic assertion generation, apart from the syntactic correctness of the assertions, is the repeatability of the process. If the rules are changed, it is only necessary to restart the assertion generation and thus regenerate the affected assertion procedures.

Statistics from test script generation:

| Assertion Generation completed !
| Number of E/R Rule Lines processed = 11384
| Number of new Tables processed = 266
| Number of new Attributes processed = 10910
| Number of old Tables processed = 132
| Number of old Attributes processed = 00891
| Number of new Attributes without a rule = 05573
| Number of Transformation Rules specified = 05337
| Number of Transformation Rules processed = 04742
| Number of Basel-II Rules recognized = 00890
| Number of Basel-II Rules processed = 00852
| Number of Complex Rules recognized = 00853
| Number of Complex Rules processed = 00562
| Number of Complex Basel Rules recognized = 00157
| Number of Assert Scripts generated = 00266
| Number of Assertions generated = 04897
| Number of Table Selectons generated = 00181
| Number of Assert Keys generated = 00023
| Number of Assert Conditions generated = 00308
| Number of Assert Concatenates generated = 00103
| Number of Assert Alternates generated = 00365
| Number of Assert Arithmetic generated = 00009
| Number of Test Cases generated = 05337

One way to determine test coverage is to determine the ratio of test cases performed to test cases planned. In order to be able to measure the test coverage in the present project and also to make the test progress quantifiable, a separate test case had to be created for each rule to be tested. Since the manual creation of the test cases would have been very time-consuming, they were also created by the assertion generator in the form of CSV files. The CSV files could thus later be imported completely into the test case management tool.

Information generated for each test case included:

  • The test case number unique across all rules.
  • The test case description.
  • The expected result, which consisted of the content of the rule.
  • The description of the pre- and post-conditions.
  • The priority of the test case. This was determined by the assertion generator based on the complexity of the rule, with rules of classes 1 and 2 classified as "Low", classes 3 and 4 as "Medium" and classes 5,6 and 7 as "High".
  • Miscellaneous metadata such as: test case author, tester, test system, keywords, test case status.
  • The test case group. To keep the number of test cases manageable, they were grouped by target table.
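Test case generation from the rules can be sketched as follows. The field names are assumptions modeled on the list above; the real CSV layout followed the Quality Center import interface:

```python
import csv, io

# Sketch of generating test cases as CSV rows from the rules.
# Field names are assumptions modeled on the list above; the real
# layout followed the Quality Center import interface.

PRIORITY = {1: "Low", 2: "Low", 3: "Medium", 4: "Medium",
            5: "High", 6: "High", 7: "High"}

def testcase_rows(rules):
    """rules: list of (target_table, target_attr, rule_class, rule_text)."""
    for number, (table, attr, rule_class, text) in enumerate(rules, start=1):
        yield {
            "TestCaseNo": number,                # unique across all rules
            "Group": table,                      # grouping by target table
            "Description": f"Verify {table}.{attr}",
            "ExpectedResult": text,              # the content of the rule
            "Priority": PRIORITY[rule_class],    # derived from rule complexity
        }

rules = [
    ("ACCOUNT", "A_INTEREST", 1, "assign TAB_Y.ATTR_D"),
    ("ACCOUNT", "A_RATE", 6, "assign TAB_A.ATTR_1 * -1 + TAB_B.ATTR_2 / 2"),
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["TestCaseNo", "Group", "Description",
                                         "ExpectedResult", "Priority"])
writer.writeheader()
writer.writerows(testcase_rows(rules))
print(buf.getvalue())
```

One CSV row per rule yields exactly the one-test-case-per-rule mapping needed to measure test coverage.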

In this data warehouse project, Mercury's "Quality Center" was used as the test case management tool; thanks to its flexible import interface, it was well suited to the generated test cases.

The data warehouse test process

A defined and repeatable testing process was established for multiple iterations of the data warehouse test using the testing tools described here. This process consisted of 8 steps:

  1. Revision of the data transformation rules in the original Excel spreadsheets.
  2. Generation of the assertion procedures from the transformation rules.
  3. Generation of the test data and the SQL-select statements from the assertion procedures.
  4. Compilation of the assertion procedures.
  5. Reading and saving the source tables (source system) in CSV files.
  6. Loading the data warehouse data tables.
  7. Reading and saving the target tables (data warehouse) in CSV files.
  8. Automated comparison of the CSV files of the source and target tables based on the assertions.

The result of the 8 steps was a log of the missing data objects and the deviating data attributes, i.e. those data whose values did not match the result of the assertions. These were identified as incorrect.

Data warehouse test process

Verification of the test results against the assertions

Functional testing requires an oracle. The oracle serves as a source of truth, telling us whether the test results are right or wrong. James Bach writes: "An oracle is an evaluation tool that will tell you whether the system has passed or failed a test. In high volume automated testing, the oracle is probably another program that generates or checks the test results. The oracle is generally more trusted than the software under test, so a concern flagged by the oracle is worth spending time and effort to check..." 3

In the data warehouse project, the in-house developed tool "DataTest" was used as the oracle; it interpreted the assertion procedures and verified the test results against the rules. For each target table, the corresponding source data was collected and stored in an intermediate database, where it was sorted by key. Then the target table was read and verified record by record, using the source data matching each record. In this process, each attribute was checked against its assertions. Result data that deviated from the result prescribed by assertion and source data were listed in a deviation log. The tests performed showed that it is quite possible to automatically validate a large set of test results in a few minutes.

Furthermore, all target records for which there was no matching source record, and all source records for which there was no matching target record, were logged. Thus, the testers had a list of all unpaired keys and all incorrectly filled target attributes of the data warehouse. These, of course, had to be checked manually by the testers to determine whether they represented true errors. The result was documented for each data record in the associated test case in the test case management tool "Quality Center", and if the test case failed, a defect was reported. Thus, the test progress could be determined for controlling and reporting.
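The key matching behind this log can be sketched with simple set operations (illustrative code, assuming both sides have already been read into dictionaries keyed by the record key):

```python
# Sketch of the key matching behind the deviation log: records that
# exist on only one side are reported as unpaired. Illustrative code,
# assuming both sides are dictionaries keyed by the record key.

def match_records(old_records, new_records):
    old_keys, new_keys = set(old_records), set(new_records)
    return {
        "missing_from_new": sorted(old_keys - new_keys),
        "missing_from_old": sorted(new_keys - old_keys),
        "paired": sorted(old_keys & new_keys),
    }

old = {"K1": {"ATTR": "G"}, "K2": {"ATTR": "L"}}   # source side
new = {"K1": {"ATTR": "G"}, "K3": {"ATTR": "C"}}   # warehouse side

print(match_records(old, new))
# {'missing_from_new': ['K2'], 'missing_from_old': ['K3'], 'paired': ['K1']}
```

Only the paired keys proceed to attribute-level checking; the two "missing" lists become the unpaired-key entries of the deviation log.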

In this way, several incorrect assignments from the import files to the data warehouse tables could be uncovered. Unfortunately, it was not possible to verify all attributes, since not all rules could be transformed into assertions: of the 5300 rules, only 4700 could be transformed. The rest had to be checked manually, which of course took a lot of time.

Excerpt from the data validation report:

|                              File/Table Comparison Report                              |
| Key Fields of Record(new,old)                                                          |
| New: ACCOUNT                                                                           |
| Old: Attributes                                                                        |
| RecKey:100000USD114601001                    | duplicate key in old File/Table         |
| RecKey:100000XXX104501001                    |                                         |
| New: ATTR_ID                                 | G                                       |
| Old: Constant_Value                          | L                                       |
| RecKey:100000YYY104501001                    |                                         |
| New: ATTR_C                                  | XXX00                                   |
| Old: ATTR_C                                  | 0                                       |
| RecKey:100000ZZZ104501001                    |                                         |
| New: ATTR_D                                  | 0                                       |
| Old: ATTR_P                                  | 1                                       |
| RecKey:100000ZZZ104501001                    |                                         |
| New: REVIEW_DATE                             | 9999-08-01_00:00:00                     |
| Old: Constant_Value                          | NULL                                    |
| RecKey:100000YYY104501001                    |                                         |
| New: ATTR_P                                  | GL                                      |
| Old: Constant_Value                          | RE                                      |
| RecKey:100000XXX104601001                    |                                         |
| New: START_DATE                              | 9999-08-01_00:00:00                     |
| Old: Constant_Value                          | NULL                                    |
| RecKey:100000XXX104701001                    |                                         |
| New: ATTR_X                                  | G                                       |
| Old: Constant_Value                          | C                                       |
| RecKey:100000ATS196501100                    | missing from the new File/Table         |
|  Total Number of old Records checked:                91                                |
|  Number of old Records found in new File:            08                                |
|  Number of old Records with duplicate Keys:          72                                |
|  Number of old Records not in new Table:             11                                |
|  Total Number of new Records checked:                59                                |
|  Number of new Records found in old File:            08                                |
|  Number of new Records with alternate Keys:          00                                |
|  Number of new Records not in old File:              51                                |
|  Total Number of Fields checked:                     93                                |
|  Total Number of non-Matching Fields:                46                                |
|  Percentage of matching Fields:                      51 %                              |
|  Percentage of matching Records:                     14 %                              |

Further automated test executions

In addition to the verification of the transformation rules, further test scenarios were automated in the data warehouse project. These were all developed in the Python scripting language; thus, a comprehensive framework for data-driven test automation and for static code analysis of the data warehouse environment was created in the course of the project. Among other things, the following test approaches were implemented in this framework:

  • Quantity checks: Is the correct number of records transported between stages?
  • Default data: Are all default values used correctly?
  • Data contents: Do all data contents match the specified and expected data types, data lengths, etc.?
  • Code characteristics: Are the correct properties and indicators used for each module?
  • Integrity check: Is the integrity of the loaded data given?
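The first of these checks, the quantity check, can be sketched in a few lines (hypothetical staging and warehouse tables, using an in-memory SQLite database):

```python
import sqlite3

# Sketch of a quantity check: is the same number of records transported
# between the staging area and the core warehouse? Table names are
# hypothetical.

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE STAGE_ACCOUNT (ID TEXT);
CREATE TABLE DWH_ACCOUNT (ID TEXT);
INSERT INTO STAGE_ACCOUNT VALUES ('1'), ('2'), ('3');
INSERT INTO DWH_ACCOUNT VALUES ('1'), ('2'), ('3');
""")

def count(table):
    return con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

stage_count, dwh_count = count("STAGE_ACCOUNT"), count("DWH_ACCOUNT")
quantities_match = (stage_count == dwh_count)
print(stage_count, dwh_count, quantities_match)   # 3 3 True
```

A count mismatch does not say which records are wrong, but it is a cheap first alarm before the attribute-level assertion checks run.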

Automation not only made the test repeatable, it also made testing across the complete data warehouse feasible in the first place.

Conclusion from the data warehouse project

In the end, the following conclusions could be drawn from the data warehouse test project:

  • The key to verifying the content of a data warehouse lies in a precise and complete specification of the data transformation rules. This is the basis for checking the individual attributes. Without such a specification, efficient testing is not possible.
  • For a mass test of this type, it must be possible to implement the transformation rules automatically. Otherwise, the effort is much too high, i.e. there must be a tool that generates assertions from the rules.
  • The assertions must be machine-processable. It is not enough to scan through the data attributes to see if their contents seem plausible. Each attribute must be automatically checked against its rule to determine its correctness.
  • Without powerful test tools, the systematic testing of a data warehouse system is not possible; in addition, the tools must be embedded in a well-thought-out test process. Process and tools must complement each other.
  • Such a complex test project requires a test automation project of its own, according to Kaner.4

The test approach described here formed the basis for further data warehouse tests and was continuously refined in the course of them.

  1. Dyer, Michael: "Statistical Testing" in The Cleanroom Approach to Quality Software Development, John Wiley & Sons, New York, 1992, p. 123
  2. Hutcheson, Maggie: Software Testing Fundamentals, John Wiley & Sons, Indianapolis, 2003, p. 12
  3. Bach, James: "Reframing Requirements Analysis", in IEEE Computer, Vol. 32, No. 6, 2000, p. 113
  4. Kaner, C. / Bach, J. / Pettichord, B.: Lessons Learned in Software Testing, John Wiley & Sons, New York, 2002, p. 111
