Shift Left but Right
Why do testing activities always end in a traffic jam just before go-live? Shift left alone is not enough: How synthetic monitoring reuses the same test case.

“Shift Left on the Right” refers to a test approach that combines early parallelization of development and testing with active monitoring after go-live. Test cases are created before coding, run repeatedly during the sprint and then reused as synthetic monitoring in production. In this way, the same pool of artifacts uncovers errors before customers notice them.
Key Takeaways
- Synthetic monitoring with reused UI test cases enables proactive error detection in production before real customers notice a failure.
- Test automation before go-live and synthetic monitoring in production use the same artifacts instead of building and maintaining two separate test sets in parallel.
- The fast feedback loop in the sprint is achieved through a deliberately reduced test set: feature-relevant tests plus prioritized regression testing until the team’s self-selected time tolerance is reached.
- Only predefined synthetic data may be used in production, and the test case must be recognizable as a monitoring run so that customer evaluations and web tracking are not distorted.
Why testing activities end up piling up despite agile working methods
Even in agile teams and in the DevOps model, test activities accumulate shortly before the go-live. This is not a waterfall problem that has disappeared with agility. Each feature goes through a linear process: at some point the team starts working on it, at some point it goes live, and in between it is developed and tested.
The team does not work on all features at the same time from the outset. It starts with one, brings it to maturity and moves on to the next. This is exactly what creates a pile-up of testing activities before the go-live. And in the end, testing usually gets the blame when things are not finished on time.
Björn Scherer, who comes from the insurance environment at Cosmos Direct, describes two opposing responses to this pattern: shift left pulls activities forward, shift right moves part of the observation into production. Both approaches have their appeal. The interesting question is how to make use of both worlds.
Shift Left means parallelizing, not just starting earlier
Starting earlier alone is useless. If you just move the same mechanics forward, you create the congestion in a different place. The lever lies in slicing work into small pieces so that they can be worked on in parallel.
Instead of completing a user story as one block and only then handing it over, it is worth finishing individual acceptance criteria so that the first can already be tested while the second is still being developed. By the time the first is tested, the second is ready.
A bigger step is to look at a feature with several disciplines at the same time. At code level, this leads towards Test-Driven Development, at functional level towards Acceptance Test-Driven Development. The principle remains the same: parallelize development and testing instead of lining them up one after the other.
The functional definition can be brought forward, the technical glue code cannot
The valuable part of a test is its functional definition, and that is known from the outset. Before anyone builds a feature, it is clear what it should be able to do. This part can be formulated in executable form right away, or written so that it can be made executable later.
What cannot be brought forward is the technical part. Selecting a field in Selenium only works once the UI exists. But this glue code is the last step, not the first. The functionally motivated test cases are created beforehand.
The goal is to create test cases that no longer contain any technology at the top level and are purely functionally motivated. A keyword-driven approach is one way to achieve this. Whether a team chooses Gherkin or a keyword-driven notation is up to the team.
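The split between functional definition and glue code can be pictured with a small keyword-driven sketch. The keyword names, the `KeywordRegistry` class and the recording lambdas are illustrative assumptions, not the tooling used at Cosmos Direct. In a real setup the bound functions would drive the UI, for example through Selenium.

```python
from typing import Callable, Dict, List, Tuple

# A functional test case: keywords plus arguments, no technology at the top level.
# It can be written before the UI exists.
Step = Tuple[str, tuple]

CAR_INSURANCE_HAPPY_PATH: List[Step] = [
    ("open application route", ("car insurance",)),
    ("enter vehicle data", ("VW Golf", 2020)),
    ("conclude contract", ()),
]


class KeywordRegistry:
    """Maps functional keywords to technical glue code (hypothetical helper)."""

    def __init__(self) -> None:
        self._glue: Dict[str, Callable[..., None]] = {}

    def bind(self, keyword: str, func: Callable[..., None]) -> None:
        self._glue[keyword] = func

    def missing(self, steps: List[Step]) -> List[str]:
        """Keywords that still lack glue code, e.g. because the UI is not built yet."""
        return [keyword for keyword, _ in steps if keyword not in self._glue]

    def run(self, steps: List[Step]) -> None:
        for keyword, args in steps:
            self._glue[keyword](*args)


# Glue code comes last. Here it only records calls instead of driving a browser.
log: List[str] = []
registry = KeywordRegistry()
registry.bind("open application route", lambda product: log.append(f"open:{product}"))
registry.bind("enter vehicle data", lambda model, year: log.append(f"vehicle:{model}/{year}"))

print(registry.missing(CAR_INSURANCE_HAPPY_PATH))  # ['conclude contract']

# Once the last screen exists, the remaining keyword is bound and the case runs.
registry.bind("conclude contract", lambda: log.append("concluded"))
registry.run(CAR_INSURANCE_HAPPY_PATH)
print(log)
```

The test case itself never changes when the UI does. Only the bound glue functions need to be adapted.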
Fast feedback needs graded test sets according to pain threshold
Fast feedback for the development teams requires a higher deployment frequency, at least on the test environment. Deployment allows for higher test levels: not only unit testing in the build pipeline, but also API testing and, beyond that, UI or end-to-end testing. This allows errors to be found that are not noticeable in the unit test.
Nobody runs thousands of UI tests in ten minutes. That’s why each team works with a subset. The test cases for the current feature form the core. This set is filled up with regression tests until the pain threshold for quick feedback is reached.
The time limit is set by the team itself. Ten minutes, a quarter of an hour, half an hour, depending on the tolerance. The size of the test set is based on this, and it can be resized for each sprint. The regression cases are selected via tagging and prioritization, not via automatic generation.
This can be staggered for larger applications: the focused set during the day, a broader regression set at night, and an even larger set every week if required. This maintains fast feedback without permanently sacrificing the breadth of coverage.
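The selection described above can be sketched as a simple greedy fill: feature tests first, then regression tests by priority until the time budget is used up. The `Case` fields, the tag names and the durations are assumptions made for illustration. A real team would take them from its own tagging scheme and measured runtimes.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Case:
    name: str
    duration_s: int  # measured or estimated runtime in seconds
    priority: int    # 1 = highest regression priority
    tags: Set[str] = field(default_factory=set)


def select_test_set(cases: List[Case], feature_tag: str, budget_s: int) -> List[Case]:
    """Feature tests form the core; regression tests fill the remaining budget."""
    selected = [c for c in cases if feature_tag in c.tags]
    used = sum(c.duration_s for c in selected)

    regression = sorted(
        (c for c in cases if "regression" in c.tags and feature_tag not in c.tags),
        key=lambda c: (c.priority, c.duration_s),
    )
    for case in regression:
        # Skip a case that does not fit; a shorter one further down may still fit.
        if used + case.duration_s <= budget_s:
            selected.append(case)
            used += case.duration_s
    return selected


CASES = [
    Case("new tariff calculator", 120, 1, {"feature-123"}),
    Case("login", 60, 1, {"regression"}),
    Case("happy path application", 240, 1, {"regression"}),
    Case("change address", 180, 2, {"regression"}),
    Case("cancel contract", 300, 3, {"regression"}),
]

# Staggered budgets: ten minutes during the day, one hour at night (assumed values).
daytime = select_test_set(CASES, "feature-123", budget_s=10 * 60)
nightly = select_test_set(CASES, "feature-123", budget_s=60 * 60)

print([c.name for c in daytime])
# ['new tariff calculator', 'login', 'happy path application', 'change address']
print(len(nightly))  # 5
```

Because the budget is a parameter, the same function serves the focused daytime set and the broader nightly or weekly sets.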
Synthetic monitoring closes the gap when there are no customers on the system
Classic monitoring evaluates what real users are doing on the system. It writes logs and provides evaluations based on them. The catch: it only works as long as customers are active. During the lunch break or at night there is no data to evaluate.
Synthetic monitoring therefore proactively executes use cases, cyclically and regardless of whether anyone is currently using the system. This means that a defect is detected before the first customer notices it. In the best case scenario, it is rectified before anyone notices.
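A minimal sketch of such a cyclic run, assuming each use case is a callable that raises an exception when the journey fails. The function names, the 15-minute interval and the alert callback are illustrative, not a description of a specific monitoring product.

```python
import time
from typing import Callable, Dict

UseCase = Callable[[], None]  # raises an exception when the user journey fails


def run_cycle(use_cases: Dict[str, UseCase]) -> Dict[str, str]:
    """Execute every use case once and record 'ok' or the failure message."""
    results: Dict[str, str] = {}
    for name, use_case in use_cases.items():
        try:
            use_case()
            results[name] = "ok"
        except Exception as exc:  # a monitoring run must survive any failure
            results[name] = f"failed: {exc}"
    return results


def monitor(
    use_cases: Dict[str, UseCase],
    interval_s: int,
    cycles: int,
    on_failure: Callable[[str, str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run the use cases cyclically, whether or not real users are active."""
    for cycle in range(cycles):
        for name, status in run_cycle(use_cases).items():
            if status != "ok":
                on_failure(name, status)
        if cycle < cycles - 1:
            sleep(interval_s)


def broken_route() -> None:
    raise RuntimeError("submit button not found")


alerts = []
monitor(
    {"car insurance happy path": lambda: None, "liability variant": broken_route},
    interval_s=15 * 60,
    cycles=2,
    on_failure=lambda name, status: alerts.append((name, status)),
    sleep=lambda seconds: None,  # no real waiting in this demonstration
)
print(alerts)
```

In practice the loop would usually be replaced by a scheduler or a pipeline trigger. The point is only that execution is driven by the clock, not by customer traffic.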
The decisive difference to pure backend tools lies in the perspective. Tools such as Dynatrace show which service has an error or where things are slowing down. From a business perspective, the question of where the customer is being disrupted remains unanswered.
“We put the cart before the horse from the other direction. Because we do it from the customer’s point of view, we know this directly and then use the other to analyze errors.” (Björn Scherer)
Reuse existing test artifacts in production
The core of the approach is reuse. The artifacts that are created to the left of the go-live should not be left behind at the release, but should continue to deliver value in production, to the right of it.
The team selects one to three particularly relevant use cases from the regression set, which must be available by acceptance at the latest. These then continue to run in synthetic monitoring. The tests have to match the application by the time of acceptance anyway, otherwise you will be putting an application live whose tests no longer work.
The logic comes full circle: if a use case is important enough to run in production, it is also important enough for the fast feedback loop. Such use cases are adapted first when changes are made. Most applications have one or two core use cases that are eligible for this.
At Cosmos Direct, these are online application routes. At the end of a route, the insurance is concluded, not just an application sent. There are around 1.4 million possible routes through a car insurance application. That cannot be tested exhaustively. Proof that the happy path and two or three variants run through is sufficient for the company.
Why UI testing, even though it is considered fragile
The obvious question is why use fragile UI testing instead of more stable API testing. There are several reasons. Stability is a topic in its own right, and one the team has put a lot of work into.
- Reuse: The UI and end-to-end testing already exist from the development process. Nothing new needs to be built.
- User view: UI and end-to-end testing map real user journeys. This perspective is much more difficult to create using pure API testing.
- No extra tooling: An additional monitoring tool would need someone in the cross-functional team who knows how to use it. If this is one person, the bus factor is one. The existing testing stack eliminates this risk.
- Early entry: Testing can be integrated into the standard testing process early on, so that problems with the monitoring use cases are noticed early.
Test flakiness remains the big issue. Anyone who extends testing into production must have invested in its stability beforehand. Slow response times show up indirectly: if the UI responds more slowly than the configured timeout, the test fails. The tolerances have been deliberately increased to avoid constantly triggering false alarms due to flakiness.
Test data in production is the real hurdle
Executing tests in real production was a long internal struggle. Real customer data is taboo. For each use case, we define which data is required and which must be available in the system. This synthetic data is put into production once, and the tests are only allowed to work on it.
The use case must match the data situation. An application route behaves differently for an existing account than for a new customer. Before a use case is even allowed to run in production monitoring, it must prove in pre-production that it will not break anything.
To do this, the applications must be able to distinguish synthetic traffic from real traffic. The use case signals that it is a monitoring run. Otherwise the evaluations are distorted: with a niche product, a monitoring run every 15 minutes can generate a multiple of the real number of customers and thus distort any business statistics.
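The size of the distortion is easy to estimate. The sketch below assumes a niche product with 30 real applications per day and a `synthetic` flag that the application sets whenever a request announced itself as a monitoring run. Both are hypothetical and only serve the arithmetic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Application:
    product: str
    synthetic: bool  # set by the application for announced monitoring runs


def count_real(applications: List[Application], product: str) -> int:
    """Count concluded applications, ignoring synthetic monitoring traffic."""
    return sum(1 for a in applications if a.product == product and not a.synthetic)


# One monitoring run every 15 minutes, around the clock.
runs_per_day = 24 * 60 // 15
real_per_day = 30  # assumed volume of a niche product

day = (
    [Application("niche", synthetic=True)] * runs_per_day
    + [Application("niche", synthetic=False)] * real_per_day
)

print(runs_per_day)               # 96
print(len(day))                   # 126 if synthetic traffic is not filtered
print(count_real(day, "niche"))   # 30
```

Without the flag, the statistics would show more than four times the real volume for this product.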
Web tracking from e-commerce must not be distorted either, which can be achieved, for example, by switching off cookies. In practical terms, this means that your test framework needs a monitoring mode. The same test case uses different data in the test environment than in production. This requires some tuning, but is feasible.
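What such a monitoring mode might look like on the framework side is sketched here. The profile fields, the `TEST_ENV` variable, the placeholder URLs and the `X-Synthetic-Monitoring` header are all assumptions. The source only states that the run must identify itself, use predefined synthetic data and leave web tracking untouched.

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RunProfile:
    base_url: str
    customer_id: str        # predefined data record the test may work on
    monitoring: bool        # True: announce the run as synthetic monitoring
    tracking_cookies: bool  # False: keep web tracking free of synthetic visits
    timeout_s: int          # higher tolerance in production to avoid false alarms


PROFILES: Dict[str, RunProfile] = {
    "test": RunProfile(
        "https://test.example.invalid", "TEST-CUSTOMER-001",
        monitoring=False, tracking_cookies=True, timeout_s=10,
    ),
    "production": RunProfile(
        "https://www.example.invalid", "SYNTHETIC-MONITORING-001",
        monitoring=True, tracking_cookies=False, timeout_s=30,
    ),
}


def load_profile(env: Optional[str] = None) -> RunProfile:
    """Pick the profile explicitly or from the (hypothetical) TEST_ENV variable."""
    return PROFILES[env or os.environ.get("TEST_ENV", "test")]


def request_headers(profile: RunProfile) -> Dict[str, str]:
    """Headers sent by the test; the marker header name is an assumption."""
    headers = {"Accept": "text/html"}
    if profile.monitoring:
        headers["X-Synthetic-Monitoring"] = "true"
    return headers


profile = load_profile("production")
print(profile.customer_id, request_headers(profile))
```

The test case itself stays identical in both environments. Only the profile decides which data it touches and whether it announces itself.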
Alerting and reporting make the results manageable
Synthetic monitoring is only as useful as the reaction to it. A central control center is active anyway and receives the alerts, integrated into the existing company processes. This is particularly effective at night when the team is not on site.
The next step reverses the process and alerts the teams directly. The team that develops an application is responsible for it. A fast lane routes the problem directly back to them, while the official alerting path remains in place.
Whether a team uses synthetic monitoring at all is up to them. The idea behind the move to DevOps is: take responsibility for your application. If a team says it has everything under control, that’s fine too. The central task is to provide the infrastructure and methodology into which teams can integrate their use cases.
A dashboard quickly shows a central on-call service whether an individual application is affected or whether there is a wildfire. The next expansion stage goes deeper: evaluating causes from the user’s perspective, recognizing the steps in which things go wrong and aggregating this over time.
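The two alerting paths and the dashboard question can be sketched in a few lines. The 50 percent threshold for a wildfire, the application names and the team names are assumptions chosen for illustration.

```python
from typing import Dict, List

CONTROL_CENTER = "control-center"  # official alerting path, always informed


def classify(results: Dict[str, bool], wildfire_share: float = 0.5) -> str:
    """Summarize one monitoring cycle: results maps application -> passed."""
    failed = [app for app, ok in results.items() if not ok]
    if not failed:
        return "all clear"
    if len(failed) / len(results) >= wildfire_share:
        return "wildfire"
    return "single application"


def recipients(results: Dict[str, bool], owners: Dict[str, str]) -> Dict[str, List[str]]:
    """Per failed application: control center plus the owning team (fast lane)."""
    routing: Dict[str, List[str]] = {}
    for app, ok in results.items():
        if not ok:
            routing[app] = [CONTROL_CENTER]
            if app in owners:  # teams opt in to the fast lane
                routing[app].append(owners[app])
    return routing


OWNERS = {"car": "team-car", "household": "team-household"}

cycle = {"car": False, "household": True, "legal": True}
print(classify(cycle))            # single application
print(recipients(cycle, OWNERS))  # {'car': ['control-center', 'team-car']}

outage = {"car": False, "household": False, "legal": False}
print(classify(outage))           # wildfire
```

An application without an entry in the owner mapping is still reported to the control center, which mirrors the idea that the fast lane is optional for teams.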
Frequently Asked Questions
Acceptance-Test-Driven Development (ATDD) promotes the shift-left strategy by defining test requirements early on in the development process. This enables early feedback and reduces misunderstandings between developers, testers and stakeholders. By incorporating acceptance criteria at an early stage, potential problems are identified and resolved more quickly, which increases the quality and efficiency of development. ATDD thus leads to continuous improvement and shortens the time it takes to deliver a functional product as part of shift-left practices.
The combination of Shift Left and Shift Right improves the software development process through early error detection and continuous feedback. Shift Left focuses on quality assurance at the beginning, which saves costs and time. Shift Right makes it possible to make adjustments after the release through monitoring and user feedback. Together, they ensure more robust software that is user-friendly and responds more quickly to changes, which increases user satisfaction.
The key difference between Shift Left and Shift Right in software development lies in the timing of the tests. Shift Left emphasizes early testing during the development phase to identify bugs early and minimize costs. In contrast, Shift Right focuses on post-deployment testing to analyze user behavior and promote continuous improvement. The two approaches complement each other, with Shift Left ensuring quality from the start, while Shift Right provides valuable insights after release.
Shift Left focuses on performing tests and quality checks early in the development process to identify and fix bugs early. In contrast, Shift Right refers to post-deployment testing to analyze user feedback and performance in real time. While Shift Left proactively ensures quality, Shift Right aims to continuously improve the application during operation.
Shift Left significantly influences security practices in software development by integrating security early in the development process. This means that developers take security considerations into account during the planning and design phase instead of testing them at the end. As a result, potential security vulnerabilities are identified and fixed early on, reducing the cost and effort of subsequent adjustments. In addition, Shift Left promotes a culture of shared responsibility for security throughout the team.
The Shift Left strategy supports DevOps implementation by integrating quality checks and tests early on in the development process. As a result, errors are identified and rectified more quickly, which shortens development time and improves software quality. Teams work more closely together, which promotes the exchange of information. This leads to faster delivery of features and increases flexibility to respond to change requests. Overall, Shift Left increases the efficiency and effectiveness of the entire software delivery process.
The Shift Left approach is a method that aims to integrate testing and quality assurance early on in the development process. Instead of testing at the end of the development cycle, attention is paid to quality as early as the planning and design phase. This allows errors to be identified and rectified more quickly, saving costs and time. The Shift Left approach encourages collaboration between developers and testers to deliver a better end product and improve overall quality.
Shift left in software testing means integrating testing activities early on in the development process. As a result, errors are detected and rectified more quickly, which improves the quality of the software and reduces the costs of subsequent bug fixes. This approach promotes collaboration between developers and testers and speeds up the delivery of software. Shift Left is important for increasing efficiency, minimizing risks at an early stage and achieving greater customer satisfaction.
The shift left strategy in software development means integrating quality assurance and testing early on in the development process. This allows errors to be identified and rectified more quickly, which reduces time and costs. The benefits of shift left include better product quality, faster time to market and greater team collaboration. By addressing problems in the planning phase, a more efficient development cycle and smoother workflow is created.