Software testing as an independent discipline refers to the systematic testing of programs by specialized testers, separate from the development team. Billing is not based on time, but on performance: number of test cases, degree of code coverage achieved and errors detected. It is crucial that each component has clearly defined inputs and outputs that can be tested in a measurable way.
Key Takeaways
- The first independent test lab in Europe was established in 1978 because Harry Marsh Sneed could not find any employees within Siemens who were willing to work exclusively as testers.
- Harry Marsh Sneed refused to pay testers by the hour as a matter of principle: billing was based on test cases, program branches covered and errors detected, which directly motivated the Hungarian employees.
- Agile development often fails in practice because components are scoped too large to be completed and tested within a single sprint.
- Static analysis alone is not enough to fully validate software, because the specification would have to be at the same semantic level as the code itself.
- Test effort is systematically underestimated by project managers: anyone who states the necessary testing effort realistically risks being removed from the project, as the example of the electronic health card shows.
Why an independent test lab was a novelty in 1978
The first independent test lab in Europe was created in 1978 from a simple observation: testing costs more time than development. In a Siemens project for an interactive database query, a team of five people needed three to four days to write the code and at least as long to test it. More than half of the effort went into testing. The question of why testing is so expensive shaped the following decades.
Harry Marsh Sneed set up the laboratory because he couldn’t find any testers in Germany for a major project for the German Federal Railways. In the end, the project comprised around 200 modules and several hundred thousand lines of code. A system of that size could no longer be handled as a conventional small programming project. It needed its own organization for testing.
The solution came from Budapest. Through a contact at a testing conference in London, Harry found math-trained programmers there who were willing to test exclusively. Nobody in Germany wanted to do that at the time. The profession of pure tester practically did not exist.
Payment according to performance instead of time
A tester should be paid for what they deliver, not for the time they put in. This principle underpinned the entire setup of the laboratory. Billing between Siemens and the Hungarian institutes was based not on hourly rates, but on measurable results.
Three parameters defined the test performance:
- Number of test cases
- Branches covered in the program (degree of coverage)
- Errors detected
This was unusual for the Hungarian institutes because they had previously sold their staff to German companies on an hourly basis. In the end, they accepted the model. The employees themselves received a bonus for errors found, and this visibly motivated them.
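The billing idea can be illustrated with a small calculation. The following Python sketch is only an illustration under stated assumptions: the unit rates, the `Deliverable` type and the function names are hypothetical, since the source does not give the actual figures or the formula used in the laboratory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deliverable:
    """Measurable output of one testing assignment (hypothetical structure)."""
    test_cases: int        # number of test cases delivered
    branches_covered: int  # program branches exercised by those test cases
    errors_found: int      # errors detected


# Hypothetical unit rates; the source names the parameters but no prices.
RATE_PER_TEST_CASE = 10.0
RATE_PER_BRANCH = 2.0
RATE_PER_ERROR = 50.0


def invoice_amount(d: Deliverable) -> float:
    """Bill by measurable results; hours worked do not appear anywhere."""
    return (
        d.test_cases * RATE_PER_TEST_CASE
        + d.branches_covered * RATE_PER_BRANCH
        + d.errors_found * RATE_PER_ERROR
    )


def branch_coverage(d: Deliverable, total_branches: int) -> float:
    """Degree of coverage as a fraction of all branches in the program."""
    if total_branches <= 0:
        raise ValueError("total_branches must be positive")
    return d.branches_covered / total_branches


work = Deliverable(test_cases=120, branches_covered=450, errors_found=7)
print(invoice_amount(work))        # 2450.0
print(branch_coverage(work, 600))  # 0.75
```

Because time is not an input to `invoice_amount`, the only way to earn more is to deliver more test cases, cover more branches or find more errors, which is the incentive the model was meant to create.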
In Harry’s opinion, motivation remains the core problem of testing to this day, regardless of whether testing is manual or tool-supported. If you give testers a stake in the results, you get better tests.
What remains the same in testing over decades
Every test compares one object against another. This principle has not changed since the late 1970s, regardless of whether the test object is a batch program on a mainframe or a web service on a cloud server.
The prerequisite for every test is a baseline: a description of what goes in and what should come out of this input. Without this defined target specification, there is nothing to test against.
Harry’s early tool, the test bench, started right here. It used an assertion language in which a test case could be defined: if a certain condition holds, such as age greater than 90, then a certain output must follow. The computer compared the target values with the actual values and reported the deviations. The testers then had to find out why a deviation occurred.
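The principle of comparing target and actual values can be sketched in a few lines of Python. This is not the assertion language of the test bench, whose syntax the source does not show; the `Assertion` type, the `compare` function, the `risk_class` output and the deliberately faulty `classify` component are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Assertion:
    name: str
    condition: Callable[[dict[str, Any]], bool]  # when does the rule apply?
    output_field: str                            # which output value to check
    expected: Any                                # target value


def compare(assertions, inputs, actual):
    """Return one deviation record per applicable assertion that fails."""
    deviations = []
    for a in assertions:
        if not a.condition(inputs):
            continue  # rule does not apply to this input
        got = actual.get(a.output_field)
        if got != a.expected:
            deviations.append(
                {"assertion": a.name, "expected": a.expected, "actual": got}
            )
    return deviations


def classify(inputs):
    """Hypothetical component under test, with a deliberate bug:
    the threshold is 95 instead of 90."""
    return {"risk_class": "high" if inputs["age"] > 95 else "normal"}


rules = [Assertion("age over 90", lambda i: i["age"] > 90, "risk_class", "high")]

case = {"age": 93}
print(compare(rules, case, classify(case)))
# [{'assertion': 'age over 90', 'expected': 'high', 'actual': 'normal'}]
```

The tool only reports that target and actual differ for an age of 93. Tracing the deviation back to the wrong threshold remains the tester's job.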
At component level, testing is therefore a formal check of the input and output parameters against the returned result. Application knowledge is not needed here. This only changes in integration testing, when all modules are linked together and the final result must be checked against the functional requirements.
Static analysis: Finding errors without running the program
Validating software without running it was Harry’s goal for years. The idea: you just enter the source code and the tool reports where there are errors and contradictions with the specification.
The test bench therefore had two parts. The dynamic analysis ran the program. The static analysis parsed the code, documented the architecture (who calls whom, who passes which data to whom) and looked for weaknesses, such as rule violations or weak structures.
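The "who calls whom" part of static analysis can be sketched with Python's standard `ast` module. This is a minimal illustration on Python source, not a reconstruction of the test bench, which analyzed the languages of its time; the function name `call_graph` and the sample code are assumptions made for the example.

```python
import ast


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain names it calls.

    The code is parsed, never executed. Method calls such as obj.read()
    are ignored to keep the sketch short.
    """
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    callees.add(child.func.id)
            graph[node.name] = callees
    return graph


SAMPLE = '''
def load(path):
    return open(path).read()

def report(path):
    data = load(path)
    print(len(data))
'''

for caller, callees in call_graph(SAMPLE).items():
    print(caller, "->", sorted(callees))
# load -> ['open']
# report -> ['len', 'load', 'print']
```

The same traversal could be extended to flag rule violations, for example functions that exceed a size limit, without ever running the program.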
Complete static validation fails because of the effort involved. It requires the specification to be on the same semantic level as the program. A separate test would have to be described for each loop, each case decision and each if statement. This is so expensive that Harry stuck to noting gross rule violations.
A recurring point of contention was the GoTo. Until the late 1980s, many programmers thought in assembler terms: check a state, jump, transfer data in between. This thinking ran deep, and Harry spent a long time arguing against GoTo constructs.
Why agile development fails because of component size
Agile sprints only work if the delivered components are small enough to be tested and usable within the sprint. In Harry’s experience, this is exactly where things get stuck. Teams try to complete components that are too big in two to four weeks, and when the time runs out, they stop and hand over something that is far from ready.
The key question is: What does “done” mean? To what degree must testing be completed for a sprint result to be truly finished? If you don’t answer this question, you only appear to be delivering.
The architecture of the software must match the agile development theory.
Harry Marsh Sneed
Harry has studied several failed agile projects, including one for an oil company in Vienna and one for a hearing aid company in Graz. Both significantly exceeded their budgets and were not completed. His finding: they had not really worked in an agile way, but had crammed monster components into a single sprint.
His research at the University of Dresden explored the question of how big a web service can be. Calculating from the time required for testing, he came up with around 300 to 400 statements, depending on the language and complexity. A component module should not be larger than this.
Measuring software before testing instead of afterwards
Before testing components, it is worth measuring them: How big and how complex are they? These values can be used to determine which components are tested in which order and whether they fit into a sprint at all.
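A measurement step of this kind can be sketched as follows. The sketch counts statements and decision points in Python source using the standard `ast` module. The 400-statement limit takes the upper end of the range quoted above, while the choice of decision nodes as a complexity proxy and the "most complex first" ordering are assumptions of this example, not rules stated in the source.

```python
import ast

MAX_STATEMENTS = 400  # upper end of the 300 to 400 statement range

# Rough proxy for complexity: constructs that introduce a decision.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.IfExp, ast.BoolOp)


def measure(source: str) -> dict:
    """Measure size and complexity of one component without running it."""
    nodes = list(ast.walk(ast.parse(source)))
    statements = sum(isinstance(n, ast.stmt) for n in nodes)
    decisions = sum(isinstance(n, DECISION_NODES) for n in nodes)
    return {
        "statements": statements,
        "decisions": decisions,
        "fits_sprint": statements <= MAX_STATEMENTS,
    }


def plan_order(components: dict[str, str]) -> list[str]:
    """Order components for testing, most complex first (assumed policy)."""
    metrics = {name: measure(src) for name, src in components.items()}
    return sorted(
        metrics,
        key=lambda n: (-metrics[n]["decisions"], -metrics[n]["statements"]),
    )


SMALL = "def f(x):\n    if x > 0:\n        return 1\n    return 0\n"
TRIVIAL = "y = 1\n"

print(measure(SMALL))
# {'statements': 4, 'decisions': 1, 'fits_sprint': True}
print(plan_order({"trivial": TRIVIAL, "small": SMALL}))
# ['small', 'trivial']
```

A component for which `fits_sprint` is false is a candidate for splitting before the sprint starts, rather than a surprise at its end.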
This approach deliberately starts from the actual size of the components. The widespread opposing position works top-down: start with the objectives, derive the architecture from them, and let the size of the building blocks follow from the design. Harry argues that problems can be identified earlier if the programs are measured directly.
The lesson from many projects is sobering. Defining interfaces cleanly was always the most difficult task, in the past with software modules, today with REST and web services. The interface is the be-all and end-all.
If there is too little testing, the project will fail later on
A testing budget that is calculated too tightly comes back to haunt the project. In the health card project, Harry estimated that it would take several thousand man-days just to develop the tests, plus another few days to execute them. The project manager then removed him from the project. The figure was too high.
The project ended up failing because it was too buggy and not sufficiently tested. Harry had predicted exactly this outcome. The mechanism behind it is always the same: managers fixate on the deadline, insist on meeting it whatever the cost, and testing is the first item to be cut.
If you invest early in proper testing, you pay once. Those who cut corners on testing pay more later, often with the entire project.


