Testing and quality assurance for start-ups

Quality in start-ups is not created through formal test processes, but initially through customer-oriented observation: founders and product owners test from the user’s perspective at an early stage. As the team grows in size, a dedicated tester role is needed that acts as an active part of the development process, not as a final quality gate.

Key Takeaways

Testing is not an end in itself: If you don’t have anyone in the company who spends sleepless nights worrying about product quality, you won’t be able to establish effective testing.
Generated code from AI tools contains on average more errors and proportionally more critical errors than manually written code, which increases, not decreases, the need for professional software testing.
Startups that make the leap from seedling to scalable company have understood the issue of software quality early on, while pure nonchalance towards quality only survives in rare cases and over short periods of time.
Technical debt works like taxes: If you don’t invest in testability and code quality early on, you pay a much higher price later as complexity grows.
Agile testers are most effective when they act as an equal part of the development team and not as a pure quality gate through which everything is pushed in the end.

Testing in startups doesn’t start with processes, but with testability

In young companies, quality is not created by a ready-made test framework, but by a conscious decision in the conception phase. Anyone setting up a new software system can consider testability from the outset, similar to security. Whether testing is then automated or manual is a question of capacity and is often of secondary importance at the beginning.

The underlying attitude is what matters. If a system is built so that it can be tested later, the effort required remains manageable even as complexity increases. If, on the other hand, testability is ignored, the backlog grows with every additional line of code.
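What building for testability can look like in code is sketched below, under stated assumptions: the Booking type, the 24-hour surcharge rule and all names are hypothetical illustrations, not taken from Flix. The only point is that the clock is passed in rather than read inside the function, so a test can pin the time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Booking:
    # Hypothetical booking record, used only for this illustration.
    base_price_cents: int
    departure: datetime


def price_with_surcharge(
    booking: Booking,
    now: Callable[[], datetime] = datetime.now,
) -> int:
    """Return the price in cents; add 20% when booked less than 24h ahead.

    The surcharge rule is an assumption made up for this sketch.
    `now` is injected so that tests do not depend on the real clock.
    """
    hours_left = (booking.departure - now()).total_seconds() / 3600
    if hours_left < 24:
        return booking.base_price_cents * 120 // 100
    return booking.base_price_cents


# A test can replace the clock with a fixed point in time.
booking = Booking(base_price_cents=1000, departure=datetime(2024, 5, 2, 12, 0))
late_price = price_with_surcharge(booking, now=lambda: datetime(2024, 5, 2, 0, 0))
early_price = price_with_surcharge(booking, now=lambda: datetime(2024, 4, 30, 12, 0))
```

Had the function called datetime.now() directly, the same check would need patching or would depend on the day the test runs. Whether such checks are then automated or run by hand remains a question of capacity.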

Daniel Krauss, co-founder of Flix, puts it this way: ‘The idea that testing plays no role at all in start-ups can no longer be upheld today. A few years ago, things may have been different. Most people now know that testing is an elementary component of good software development.’

Why the difference between B2C and B2B matters more than the age of the company

Whether a company is young or established says less about its testing behavior than the question of who it is building for. For decades, classic enterprise software has shaped the structures that many people associate with testing today. B2C products work differently.

With an app in the app store, quality quickly determines whether the product is accepted or fails. The feedback is immediate. This proximity forces a different approach to errors than in a supply chain in which the finished product is several steps away.

Daniel describes this contrast from his own experience. In the automotive supply industry, the finished vehicle is three steps further away. There, it takes effort to connect the metrics of individual components with the end product in such a way that the customer is satisfied in the end. In a B2C start-up, this bridge is much shorter.

Testing is not an end in itself and should never become one

Testing without quality awareness does not work. Anyone looking for a fig leaf or testing because that’s what you do is missing the point. The goal is to evaluate the quality of a product and to increase it through a good development process.

Florian Fieber, Chairman of the German Testing Board, brings a simple test into play:

‘First of all, I ask, who among you can’t sleep if the product doesn’t work? If nobody gets in touch, then everything is actually fine. Then we don’t have a problem.’ (Florian Fieber)

Without someone who has a genuine interest in product quality, meaningful testing can hardly be established. In this respect, a start-up is no different from a large company. The decisive factor is how important quality actually is to the team.

Metrics say nothing if the goal is missing

A test coverage of zero percent means that working software is a matter of luck. Achieving one hundred percent does not necessarily make sense either. Test coverage is not an end in itself but an auxiliary value. The real question is what you want to achieve with it.

Classic metrics such as lines of code in programming or test coverage in testing are not worthless. They show extremes. But measuring them alone is misleading. Depending on the context, a completely different figure may be the right one.
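Why a coverage figure alone can mislead is easy to show with a small sketch. The function and the check below are hypothetical examples, not from any real code base: the single check executes every line of the function, so line coverage is complete, yet an obvious failure case remains untested.

```python
def average_rating(ratings: list[int]) -> float:
    """Return the mean of customer ratings (hypothetical example)."""
    return sum(ratings) / len(ratings)


def check_average_rating() -> None:
    # This one check runs every line of average_rating,
    # so a line-coverage tool would report full coverage for it.
    assert average_rating([4, 5]) == 4.5


check_average_rating()

# The coverage figure hides an untested case: an empty list of ratings,
# for example a product that has no reviews yet, raises ZeroDivisionError.
try:
    average_rating([])
    empty_input_fails = False
except ZeroDivisionError:
    empty_input_fails = True
```

The metric reports the extreme correctly (nothing is left unexecuted), but whether the empty case matters is a question about the product and its customers, not about the number.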

Startups often work with metrics that are more focused on customer feedback than on formal coverage. A more formally organized, document-heavy industry such as the automotive supply industry will count test cases and measure coverage. An organization focused on speed and learning is more likely to measure what actually reaches the customer.

How test culture has grown at Flix from the customer perspective

At Flix, the testing discipline did not emerge from the technical corner, but from the customer’s perspective. In the early days, the founders did the testing themselves, later the product owners in an agile context. They downloaded the app, clicked through it and reported back what didn’t work.

The purely technical tasks remained with engineering. The genesis of the dedicated role of the agile tester, on the other hand, was characterized more by the customer’s perspective than by the idea of writing a test case against a piece of software.

The former terse email has become a regulated process through which feedback flows back. This professionalization does not apply to every three-person company, but can be observed in companies with 30, 40 or 50 people. The driving force behind this is the realization that a bad release can very quickly put you out of business.

Learning through pain works differently in a startup than in a corporation

In a small company, quality problems can quickly become life-threatening because the product is still a seedling. At the same time, they can often be resolved quickly because dependencies are low. The depth of the pain therefore remains limited.

Formal processes belong where mistakes are expensive and difficult to reverse. If you have to recall 100,000 cars in the automotive environment, you have a real problem. Both worlds have their justification, depending on the cost of an error.

Florian turns the common image of the turtle that makes it into the sea on its head. Instead of saying that the survivors are the ones who understood quality, the argument can be reversed: those who get quality right make it into the sea in the first place.

The tester needs a voice because he is outnumbered

The tester role must be formally anchored in the team; otherwise quality depends on the goodwill of individuals. A developer in the flow does not automatically think about how to ensure consistent quality while building. This requires people who keep an eye on the critical points.

This role requires attitude. In most teams, there are more developers than testers. Testers are therefore outnumbered and must be prepared to speak up, even to management.

At Flix, an early head tester and principal engineer shaped this role. She drove the issue forward and did not mince her words with management. On this view, treating testers as second-class developers is outdated. It’s about different strengths, not hierarchy.

From start-up to grown-up company: when disciplines are added

Many start-ups grow like an onion around a strong development team. Often a founder is a developer himself. Up to a certain point, this focus is sustainable. Then it reaches its limits.

With size, other disciplines come into play that were previously underestimated:

Architecture as a separate task alongside pure implementation
Requirements management
Testing as a separate area of expertise
Project management and supporting organizational processes

This determines whether a startup will make it to adulthood. As long as the team is small, a founder can participate in testing. As the organization grows, it needs a systematic, professional approach as a separate discipline. The two are complementary, not alternative.

Change is not the pain, the loss of customer focus is

Growth in itself does not cause pain. Those who understand change as a flow experience it as a movement forward, not as a burden. Discomfort usually only arises when people reflexively file every change under pain.

It becomes painful at another point. When the focus on the customer is lost and instead process is piled on top of process on top of process, self-interest takes over. The result is actually unpleasant.

A simple question that everyone in the team can ask helps to counteract this: Why are we actually doing this? Here, too, it is up to the testers to regularly raise their hands and point out nonsense. Culture, consciously set, takes the whole team with it.

What the role of the Agile Tester means at Flix

The Agile Tester is a player in the development team, not a goalkeeper at a quality gate. The name arose from the team itself and expresses the fact that this role is an elementary part of the development process and can adapt.

Agile does not mean arbitrariness here. It means being able to react to changing requirements and still deliver a set result. The tester is part of the team, not the door through which everyone has to squeeze in the end.

In Flix’s distributed architecture, the development teams are typically six to eight people, sometimes up to ten. Such a team consists of a product owner, an agile tester and various developer profiles, from classic developers to data and AI topics to DevOps. Close coordination with the product owner is at the heart of the role.

Methodical foundation remains, the tools change

Yesterday’s tools no longer fit today, but the methods and strategies do. This is precisely what makes methodologically oriented training valuable. The Certified Tester focuses on fundamentals, principles and the test process, agnostic to development method, approach or company size.

These basic ideas can be applied anywhere. That testing should start early is not a new insight. Today it is called shift left, an idea that has accompanied testing for decades. The specifics differ depending on the technology, tool and product, but the basic idea remains stable.

At Flix, further training is based on personal responsibility. Team-specific topics are decided by the team itself. Overarching issues that affect all agile testers are discussed together. Keeping your finger on the pulse is necessary because a growing part of the world consists of software and therefore there is more and more to test that did not exist years ago.

Vibe coding belongs in prototyping, not in enterprise architecture

AI-supported tools increase productivity, but vibe coding is a separate matter. At Flix, developers work with tools such as Claude Code to mitigate shortcomings in development with AI support. That makes sense. Putting something that has been whipped up quickly live and untested into an enterprise architecture does not.

Daniel draws a parallel to earlier tools such as Dreamweaver, which allowed a lot of code to be created, but the results were poor. Being able to produce more does not automatically make the result better. On the contrary, it shifts work to those who have to use such tools sensibly.

For rapid prototyping and requirements engineering, the benefits are real. Product owners can build faster and better prototypes, even without in-depth technical knowledge. This reduces the back and forth about what is actually desired. For production code in a more complex product, skepticism remains. Nevertheless, getting stuck in and exploring what works is the right attitude. Rejection would be a mistake.

Software is more than code, and that is precisely why the need for testing is increasing

The common mistake is the developer-centered view that reduces software to writing code. Software is a product of technical components and people. Building it requires more than just quickly generating code.

AI-based tools bring considerable speed when they are used seriously in development, which is something different from vibe coding. For routine tasks and quick solutions, much of the work can be done remarkably fast.

Florian does not see this as a cause for concern for testing, but rather the opposite. In many observations, generated code shows more errors and, relatively speaking, more critical errors than code written by humans. This means that a large wave of sometimes bad code is rolling towards software testing. Economically, this is questionable. From a testing perspective, the conclusion is clear: those who thought testers were no longer needed will need them all the more.