How Stiftung Warentest tests

Product testing at Stiftung Warentest is a structured process that begins around one and a half years before publication. A market analyst, a project manager and external testing institutes jointly define a test program, purchase products undercover and test them according to standards and subjective criteria. Each test is concluded with a verification step that checks measurement data for consistency.

Key Takeaways

Stiftung Warentest plans test topics up to one and a half years in advance and is guided by seasonality, user questions and market developments.
All products are purchased undercover so that manufacturers do not know the timing or scope of a test and cannot deliver special products.
Subjective criteria such as handling or feel are made as objective as possible through blind tests, trained test groups and reference devices.
Every test project is teamwork: market analysts, project managers, test laboratories and editors each contribute their own perspective, because bare measurement data alone does not tell a story.
Manufacturers receive the objective measurement data before publication so that they can react to safety-critical findings, but the publication date remains unknown to them.

What Stiftung Warentest has in common with software testing

Both disciplines test from the perspective of the user, not the manufacturer. Stiftung Warentest describes itself as an independent foundation established in perpetuity, with the aim of enabling consumers to make self-determined purchasing decisions. Software testers pursue the same goal when they check whether a product does what the user expects it to do.

The shared values are objectivity, independence and transparency. Testers slip into the role of the end user and ask whether the product delivers what it promises. This lens is the starting point for every test in both worlds.

Johannes Stiller uses an image from consumer protection: the testing or scientific view meets the question of what is actually relevant for the user in everyday life. This tension is familiar to anyone who measures software against real requirements rather than internal specifications.

How a test process is created: from the idea to the test program

A test process begins long before the actual test. At Stiftung Warentest, there is often a year to a year and a half between the initial idea for a topic and its publication. A test of running equipment published in January, for example, was decided on around a year and a half in advance, partly because sporting resolutions at the beginning of the year shape demand.

The topics come from several sources: user questions on the foundation's own website, market observation and suggestions from the teams. Each idea is critically evaluated according to clear criteria.

A topic only progresses if it fulfills four conditions:

It must make sense for the consumer.
It must be testable.
It must be usable and interesting for readers.
It must be measurable.
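The four conditions act as a simple gate: an idea moves on only if every one of them holds. A minimal sketch of that gate follows. The class, field and function names are purely illustrative assumptions and do not describe any tooling Stiftung Warentest actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicIdea:
    """A proposed test topic with the four conditions as yes/no answers (hypothetical model)."""
    name: str
    useful_for_consumers: bool
    testable: bool
    interesting_for_readers: bool
    measurable: bool


def failed_conditions(idea: TopicIdea) -> list[str]:
    """Return the labels of all conditions the idea does not fulfill."""
    checks = {
        "makes sense for the consumer": idea.useful_for_consumers,
        "is testable": idea.testable,
        "is usable and interesting for readers": idea.interesting_for_readers,
        "is measurable": idea.measurable,
    }
    return [label for label, fulfilled in checks.items() if not fulfilled]


def progresses(idea: TopicIdea) -> bool:
    """An idea progresses only if no condition fails."""
    return not failed_conditions(idea)


# Example with made-up topics.
running_gear = TopicIdea("running equipment", True, True, True, True)
vague_idea = TopicIdea("vague idea", True, False, True, False)
```

Returning the list of failed conditions rather than a bare yes or no keeps the rejection explainable, which fits the idea of evaluating each topic against clear criteria.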

A test program is created from the confirmed idea. It is based on existing standards, if there are any. Sometimes no standard exists, sometimes it is in preparation. The project manager works intensively on the product and drafts the program, which is then cross-checked several times before it is sent to the testing institutes.

Subjective criteria can be made measurable

Soft criteria are also tested rigorously and comparably. Ease of use, reaching hard-to-reach areas or how brushing your teeth feels cannot be measured objectively in the way power consumption can. Nevertheless, they must not be left to chance.

Comparability is achieved through several mechanisms. If possible, the same people test over longer periods of time. Reference devices are used as fixed candidates so that tests can be compared with each other. In this way, a benchmark remains stable even if the products change.
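One way to picture the role of a reference device: if the same device takes part in every test round, each candidate can be expressed relative to it, so rounds judged by stricter or more lenient panels still line up. The sketch below assumes a simple difference to the reference score. That formula, the function name and the numbers are illustrative assumptions, not the foundation's documented method.

```python
def relative_to_reference(scores: dict[str, float], reference: str) -> dict[str, float]:
    """Express every candidate's score as its difference to the reference device."""
    ref_score = scores[reference]
    return {
        device: round(score - ref_score, 2)
        for device, score in scores.items()
        if device != reference
    }


# Two made-up rounds; the panel in the second round scores everything higher.
round_one = {"reference": 3.0, "device X": 2.5}
round_two = {"reference": 3.4, "device Y": 2.9}

# Relative to the shared reference, X and Y end up on the same footing.
x_relative = relative_to_reference(round_one, "reference")
y_relative = relative_to_reference(round_two, "reference")
```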

Bias is deliberately removed. Product names are masked, devices are made unrecognizable, and they are tested blind and in varying order. The tester should not know which brand's product is in front of them.

During a test, no device is judged by one person alone. All testers take turns testing different devices in different sequences, and their impressions are collected and combined. This spreads subjective outliers across the group.
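The blinding and rotation described above can be sketched in a few lines. The example assigns neutral labels to the devices, gives each tester a different starting point in the sequence and combines the collected ratings. Using the median to combine ratings is an assumption chosen here because it dampens single outliers; the source only says that impressions are collected and combined. All names are hypothetical.

```python
import random
from statistics import median


def blind_labels(devices: list[str], rng: random.Random) -> dict[str, str]:
    """Map neutral labels to devices in random order; testers only ever see the labels."""
    shuffled = list(devices)
    rng.shuffle(shuffled)
    return {f"Sample {i + 1}": device for i, device in enumerate(shuffled)}


def rotated_sequences(labels: list[str], testers: list[str]) -> dict[str, list[str]]:
    """Give each tester the same samples, but starting at a different position."""
    n = len(labels)
    if n == 0:
        return {tester: [] for tester in testers}
    return {
        tester: labels[i % n:] + labels[:i % n]
        for i, tester in enumerate(testers)
    }


def combine(ratings: dict[str, list[float]]) -> dict[str, float]:
    """Combine all testers' ratings per sample (assumption: median)."""
    return {label: median(scores) for label, scores in ratings.items()}


rng = random.Random(7)  # fixed seed so the example is repeatable
key = blind_labels(["Brand A", "Brand B", "Brand C"], rng)
plan = rotated_sequences(sorted(key), ["tester 1", "tester 2", "tester 3"])
combined = combine({"Sample 1": [2, 2, 5]})  # one outlier rating of 5
```

The mapping in `key` stays with the project team and is only resolved after all ratings are in.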

How the test objects are selected and procured

The appliances are purchased undercover, never provided by the manufacturer. A manufacturer cannot submit its own washing machine. The foundation's own buyers procure the products on the market without the manufacturers knowing what is being bought and when.

The selection follows the consumer’s view. A market analyst works full-time to monitor the market: Who are the big players, who is bringing innovation, what is selling well. A niche device that hardly anyone can buy does not help the consumer.

One device is often not enough for endurance tests. Several devices are purchased in order to be able to test the load over time. If something breaks during the test, it is retested to confirm the result.

Quality comes from the team, not the individual

No test depends on a single person. Behind a project is a group with different roles: the project manager, a market analyst, the testing side and the editorial team, which knows what matters to readers. This mix is reminiscent of the cross-functional team in software development.

The diversity is intentional, not a coincidence. Those who provide the sharpest analysis but remain incomprehensible will not reach anyone. Only when the sober facts meet a story that can be told does measurement data become a useful test: what are the highlights, what are the weaknesses, is something dangerous.

It’s no good if you have the smartest analytical glasses on and nobody understands it.
Johannes Stiller

More eyes do not automatically produce uniform answers. During the complex verification process, every reviewer notices something different. This is exactly what makes the test stronger rather than smoothing it out.

Why everything is checked again at the end

There is a central verification step before publication. The measurement data is checked for errors and consistency. Only then is a result considered reliable.

The results of all checks are brought together in a central meeting for each individual topic. This is where the team discusses the results and the story that can be told from them. The meeting is held just a few days to weeks before publication so that the test remains as up to date as possible and at the same time sound.

Every tester knows this logic: a finding is only a finding if it can be reproduced. If a device breaks, it is retested until the result is confirmed.
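The reproduction rule can be written down as a small check: a failure counts as a finding only once it has shown up again on a further unit. The sketch below assumes that two failing units are enough to confirm a result. That threshold, the function name and the unit identifiers are illustrative assumptions, not a documented rule.

```python
from typing import Callable, Iterable


def confirmed_failure(
    survives_test: Callable[[str], bool],
    units: Iterable[str],
    needed: int = 2,
) -> bool:
    """Return True once `needed` separately purchased units have failed the test.

    `survives_test` returns True if a unit passes the endurance test.
    """
    failures = 0
    for unit in units:
        if not survives_test(unit):
            failures += 1
            if failures >= needed:
                return True  # the failure was reproduced
    return False  # not reproduced often enough to count as a finding


# Made-up example: units 1 and 3 break under load.
broken_units = {"unit-1", "unit-3"}


def survives(unit: str) -> bool:
    return unit not in broken_units


reproduced = confirmed_failure(survives, ["unit-1", "unit-2", "unit-3"])
single_break = confirmed_failure(survives, ["unit-1", "unit-2"])
```

A single broken unit is treated here as a prompt to retest rather than a result, which is the same reasoning a software tester applies to a flaky failure.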

How to deal with manufacturers and bad news

Manufacturers are informed before publication, but without having any influence on the result. They are told that they are being tested, but know neither the time of purchase nor the publication date. They are informed of the objective measurement data, including critical findings.

For safety-critical findings, this communication has a direct purpose. When child car seats tore out of their mountings, the manufacturers were informed in advance. The questions behind this: How do you react, can you offer goodwill, is there a material defect or a problem in the batch?

The reactions vary greatly. Some manufacturers engage in a cooperative dialog, others sue for injunctive relief, even over a second-place result. Over the years, a pattern can be observed in some product groups such as mattresses or cordless vacuum cleaners: at first few products perform well, but quality improves over time.

The tester as the bearer of bad news is a familiar role. Those who report defects are often met with hostility at first, even though the test serves the product and the user. This position only becomes tenable if there are verifiable rules that are consistently adhered to.

Transparency as the foundation of credibility

A comprehensible methodology is the basis of trust. Stiftung Warentest discloses the test conditions and refers to the underlying standards. In the magazine there is a compact box “This is how we tested”, and a more detailed version follows online.

This openness is also an obligation. A clear mission must be upheld, and compliance with the foundation's own rules must be verifiable. The same principle applies to software testers: a finding is only convincing if it is clear how it was arrived at.

Contact with users remains open. Feedback via e-mail inboxes and forum posts is monitored, and information on new aspects or changes is fed back into the work. Those who test for the user also listen to the user.