Estimation of test projects 

 19 January 2010

The planning of a requirements-based system test requires that the test effort can be calculated on the basis of the requirements. If test points are used for effort estimation, the test points must be derivable from the requirements documentation. If test cases are used as a basis for effort estimation, they must be derived from the requirements. With the text analyzer tool for English and German texts, both can be counted: test points and test cases. They are then incorporated into a modified COCOMO-II formula to estimate the effort and duration of the planned test. Factors such as the testability of the system, the degree of test automation, and the test conditions are taken into account. It is recommended that at least three estimates be performed and the results compared. A cost range should be created with lower and upper bounds. This is used to negotiate the scope of the test with the user. Ultimately, the scope of the test is a question of ROI.

System test planning

An indispensable task in test planning is the estimation of test costs. According to the forward planning method, the effort required to confirm the fulfillment of all requirements is estimated. Subsequently, the minimum duration of the test is calculated. Only then are the test budget set on the basis of the target costs and the end date set on the basis of the calculated duration. This is how test managers would often like to see it.

In practice, however, the reverse planning method is used. First, a test budget and an end date are defined. The purpose of the estimate is then to determine how many test cases or test points can be tested with the specified budget by the specified deadline. After that, priorities have to be set based on a risk analysis and a benefit analysis. The goal is to test as much as possible within the budgetary and time limits.

Both methods use the same estimation formula, but with different parameters. Depending on the situation, one or the other method is used.
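
As a rough illustration of the two planning directions, the following sketch (in Python, with purely illustrative numbers) uses only the simple relation between test cases and productivity; the full estimation formula with system type, testability and scaling exponent is introduced later in this article.

    # Simplified sketch of forward vs. reverse test planning.
    # Forward: estimate the effort needed to cover all planned test cases.
    # Reverse: determine how many test cases fit into a fixed budget.

    def forward_effort(test_cases: int, cases_per_tester_day: float) -> float:
        """Tester days needed to execute all planned test cases."""
        return test_cases / cases_per_tester_day

    def reverse_scope(budget_tester_days: float, cases_per_tester_day: float) -> int:
        """Number of test cases that can be executed within a fixed budget."""
        return int(budget_tester_days * cases_per_tester_day)

    print(forward_effort(1200, 8))   # 150 tester days for full coverage
    print(reverse_scope(100, 8))     # 800 test cases fit into 100 tester days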

Measurement of the test scope

A prerequisite for both methods is the measurement of the test scope. In the case of a requirements-based system test, this means measuring the requirements. The requirements documents are analyzed either manually or automatically in order to count the test cases or test points. The test cases are derived either from the requirements or from the use cases. Wherever an action is required, a state is queried or a condition is imposed, a test case must be specified. For conditions there are even two test cases, one for the positive and one for the negative outcome. In this way it is possible to count all test cases needed to cover the requirement text. Of course, more test cases will be added later, especially where the requirements are superficially defined and incomplete. But at least you have a starting point. If this count is also automated, you can quickly arrive at a test case count without much effort.
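
A minimal sketch of such an automated count is shown below. The keyword lists are illustrative placeholders and not the actual rules of the text analyzer; they merely demonstrate the counting principle of one test case per action or state and two per condition.

    # Rough sketch of rule-based test case counting from requirement text.
    # The keyword lists are invented placeholders, not the analyzer's rules.
    import re

    ACTION_WORDS = ("shall", "must", "will")       # an action is required
    STATE_WORDS = ("is", "are", "has", "have")     # a state is queried
    CONDITION_WORDS = ("if", "when", "unless")     # a condition is imposed

    def count_test_cases(requirement_text: str) -> int:
        count = 0
        for sentence in re.split(r"[.!?]", requirement_text):
            words = sentence.lower().split()
            if any(w in words for w in CONDITION_WORDS):
                count += 2        # one positive and one negative outcome
            elif any(w in words for w in ACTION_WORDS + STATE_WORDS):
                count += 1
        return count

    print(count_test_cases(
        "The system shall log every order. "
        "If the credit limit is exceeded, the order must be rejected."
    ))  # -> 3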

The same applies to test points, which, like function points, are counted on the basis of inputs, outputs, interfaces and database tables. If the user interfaces and the system interfaces are mentioned in the requirements document, they can be classified and weighted. The number of database tables and their attributes is derived from the data model. By merging the user interface, system interface and database points, one arrives at the test points. This count can also be automated, as sketched after the list below. This gives us two different measurements for the test scope:

  • One based on the internal logic of the system and
  • One based on the external system interfaces
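
A possible counting sketch for test points is shown below, assuming invented weights; the article only states that user interfaces, system interfaces and database attributes are classified, weighted and summed.

    # Illustrative test point count from interfaces and database attributes.
    # The weights are made-up placeholders for demonstration purposes.

    def test_points(user_interfaces: int, system_interfaces: int, db_attributes: int,
                    ui_weight: int = 4, si_weight: int = 3, attr_weight: int = 1) -> int:
        return (user_interfaces * ui_weight
                + system_interfaces * si_weight
                + db_attributes * attr_weight)

    print(test_points(user_interfaces=25, system_interfaces=10, db_attributes=180))  # -> 310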

Measurement of test productivity

After we have determined how large the test project is, the next step is to determine the test productivity, i.e. how many test cases or test points can be processed per person day. Here we are dealing with empirical values. They cannot be plucked out of the air. The most valuable thing about a professional test organization is that it has a memory, namely the experiences from past test projects. It is important to guard and perpetuate these experiences. They are difficult to take from the literature or from other projects, because each environment has a different productivity. This is partly due to culture, partly due to technology, and partly due to operations. For example, we measured an average of 3.8 test cases per day in one project and 8.2 test cases per day in another. In the former, many web interfaces were tested, in the latter batch processes.

In one author's project, 79,000 test cases were executed within 2 months with 92 testers per release. That makes 21 test cases per tester day. But you have to know that this was more of a regression test. Only the changes and enhancements were tested against the specification. The rest was tested against the previous version. Today the same test company tests 120,000 test cases within 6 weeks with 24 testers. This results in a productivity of no less than 166 test cases per tester day. This sensational increase in test productivity was made possible by automating the entire testing process.
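
A quick back-of-the-envelope check of these two figures follows; the working-day counts per period are assumptions (roughly 41 working days in two months, 30 in six weeks), since only the resulting rates are quoted above.

    # Productivity as test cases processed per tester day.
    def productivity(test_cases: int, testers: int, working_days: int) -> float:
        return test_cases / (testers * working_days)

    print(productivity(79_000, 92, 41))    # ~20.9, i.e. the 21 test cases per tester day quoted
    print(productivity(120_000, 24, 30))   # ~166.7, i.e. the 166 test cases per tester day quoted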

These experiences show how difficult it is to define average values for test productivity. An automated test is incomparably more productive than a manual test.

Measurement of testability

Application systems are not all equally testable. There are systems that are easy to test and others that are difficult to test. A system in which the application logic is interwoven with the user interface is more difficult to test than a system in which the application logic is separate from the user interface. The application logic of the second system can be tested through a batch interface, while that of the first system can only be tested through the user interface. Systems with wide import and export interfaces are also harder to test because testers must generate more parameters with more possible combinations. The size of databases also affects testability. The more attributes a database has, the more data must be generated and validated.

Testability can be determined by static analysis of the program sources, the user interface definitions, e.g. HTML or XSL sources, the system interface definitions, e.g. XML or IDL sources, and the database schemas in SQL or XML. The analysis of these software artifacts yields a rational measure on a scale from 0.1 to 0.9. The mean value of 0.5 is divided by this measure, e.g. 0.45, to obtain the testability multiplier, in this case 1.11. This means the test effort will be 11% higher because of the below-average testability of the software.
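
Expressed as a small calculation, following directly from the description above:

    # Testability multiplier: mean value 0.5 divided by the measured
    # testability on the scale 0.1 to 0.9.
    def testability_multiplier(testability: float) -> float:
        return 0.5 / testability

    print(round(testability_multiplier(0.45), 2))  # 1.11 -> effort ~11% above average
    print(round(testability_multiplier(0.60), 2))  # 0.83 -> effort below average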

Calculation of the test effort

Once the test scope, test productivity and testability have been determined, the calculation of the test effort is only a question of the estimation formula. A modified COCOMO formula is used here. The original COCOMO-II formula is:

Effort = System type × (System size / Productivity)^Scaling exponent × Influence factor

The system types are Standalone = 0.5, Integrated = 1, Distributed = 2, Embedded = 4.

The system size can be in statements, function points, object points, use case points, or whatever.

Productivity is the number of size units a developer can produce per month.

The influence factor is the product of 20 individual influence factors on the scale from 0.7 to 1.4.

The scaling exponent is the mean of five different project conditions:

  • Familiarity with the target architecture
  • Team spirit
  • Quality of the development environment
  • Degree of reuse
  • Process maturity

in the range of 0.91 to 1.23.
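
Put together, the formula as reconstructed above can be sketched as follows; the ranges are those given in this article, while the sample input values are purely illustrative.

    # COCOMO-II-style effort estimation as described above.
    # System type 0.5-4, influence factor the product of 20 factors
    # between 0.7 and 1.4, scaling exponent between 0.91 and 1.23.
    def cocomo_effort(system_type: float, size: float, productivity: float,
                      influence_factor: float, scaling_exponent: float) -> float:
        """Effort in person months; size and productivity use the same size unit."""
        return system_type * (size / productivity) ** scaling_exponent * influence_factor

    # e.g. an integrated system (type 1) of 600 function points, at 20 function
    # points per person month, neutral influences, average project conditions
    print(round(cocomo_effort(1, 600, 20, 1.0, 1.07), 1))  # ~38 person months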

For estimating test projects, the authors suggest expressing the system size and productivity in test cases or test points, replacing the influence factor with the testability multiplier, and using the following scaling exponents:

  • Familiarity with the application
  • Team spirit
  • Quality of the test environment
  • Test automation level
  • Test process maturity.

The system type shall be reduced to 0.5, 1, 1.5 and 2.

Accordingly, a distributed system (system type 1.5 on the reduced scale) with 1200 test cases and a testability of 0.45 (multiplier 1.11) would cost 320 tester days from a test team with a productivity of 8 test cases per tester day and a scaling exponent of 1.05.

Test effort = System type × (Test cases / Test productivity)^Scaling exponent × Testability multiplier
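
A one-line check of this example with the values as given in the text:

    # Distributed system -> type 1.5, 1200 test cases, 8 test cases per
    # tester day, testability 0.45 -> multiplier 1.11, scaling exponent 1.05.
    effort = 1.5 * (1200 / 8) ** 1.05 * 1.11
    print(round(effort))  # 321, i.e. roughly the 320 tester days quoted above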

This estimate shows what matters in minimizing the test costs. First, the testability of the software must be as high as possible, but the test team usually has little influence on this. They can only make the user aware of it. The system type is also a given. What the testers can influence are the test conditions and the test productivity. They can improve the test conditions by having professional testers who are attuned to each other, working in a mature test process with a high degree of automation. In turn, they can increase their productivity through more test automation. Ultimately, it comes down to getting as many test cases through in as short a time as possible while uncovering as many defects as possible. That means high test automation.

All of this argues for a departure from the traditional, homespun way of testing software systems. Users must be willing to have their application systems not only developed by professionals but also tested by well-equipped professionals.