Ensemble testing and ensemble programming

Ensemble testing refers to a collaborative working method in which four to six people test or program together: one person executes, the others think aloud and provide guidance. Thinking and executing are thus deliberately separated. This makes implicit knowledge explicit, promotes the exchange of knowledge between testers and developers and delivers more creative results than working individually.

Key Takeaways

Ensemble testing consistently separates thinking and execution: one person types only when instructed, while the rest of the team thinks aloud and thereby makes implicit knowledge explicit.
The ideal ensemble consists of four to six people: with fewer, individual voices are drowned out, and with more, the session becomes sluggish.
Ensemble formats convince skeptics more easily than pair programming because the group setting creates less personal pressure than direct one-to-one work.
Testers do not have to become programmers. Through ensemble sessions they build up enough understanding to ask better-informed questions about testability.

What Ensemble Testing and Ensemble Programming mean

Ensemble testing and ensemble programming are collaborative work formats in which an entire group works together on a task instead of dividing it up among themselves. One person sits at the keyboard and mouse while the rest of the group sits around them and thinks out loud. The aim is not to finish faster but to share knowledge, build the team and learn together.

Thomas Much, who worked with pair programming for years as a coach, describes the format as its further development. Instead of two people, four to six work together. The format goes by several names: Mob Programming, Team Programming, Software Teaming. In the testing context, it is called ensemble testing.

The appeal lies in the fact that several roles share the same space. Developers, testers, and sometimes also architects or people from the business side sit together and produce something concrete: code or tests.

The most important rule: separate thinking and execution

The central principle of ensemble working is the separation of thinking and execution. The person at the keyboard only executes. They do not decide what happens. The group does the thinking.

This executing person is called the typist. They type, click and enter values, but they do not act on their own initiative. If nothing comes from the group, they ask: Where should I click? What amount should I enter? Thomas calls this a “smart input device”. The typist can ask for clarification at any time if they have not understood something, and can insist that the group come to an agreement.

This thinking aloud makes implicit knowledge explicit. Why does someone enter a number with a period and not a comma? What upper and lower limits must be observed when testing? In normal individual work, such decisions remain invisible. In an ensemble, they are expressed and discussed.
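One way such spoken decisions can be captured is as explicit boundary checks. The following is only a minimal sketch: the function `parse_amount` and the limits `MIN_AMOUNT` and `MAX_AMOUNT` are hypothetical and chosen purely for illustration, not taken from any real system.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical limits, chosen only for illustration.
MIN_AMOUNT = Decimal("0.01")
MAX_AMOUNT = Decimal("10000.00")


def parse_amount(text):
    """Parse an amount written with a period as the decimal separator.

    Returns a Decimal, or None if the input is malformed or out of range.
    """
    try:
        amount = Decimal(text)
    except InvalidOperation:
        # For example "12,50": a comma is not accepted as a separator here.
        return None
    if not amount.is_finite():
        # Rejects special values such as "NaN" or "Infinity".
        return None
    if MIN_AMOUNT <= amount <= MAX_AMOUNT:
        return amount
    return None


# Decisions the group said out loud, written down as checks.
assert parse_amount("12.50") == Decimal("12.50")     # period is accepted
assert parse_amount("12,50") is None                 # comma is rejected
assert parse_amount("0.01") == Decimal("0.01")       # lower limit itself
assert parse_amount("0.00") is None                  # just below the lower limit
assert parse_amount("10000.00") == Decimal("10000")  # upper limit itself
assert parse_amount("10000.01") is None              # just above the upper limit
```

Each assertion corresponds to a question somebody in the room would otherwise have answered silently for themselves.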

Why the format doesn’t overwhelm anyone

Nobody has to be able to do anything at the keyboard because the intelligence is in the room. Those who can’t program may only enter individual characters at first. This levels out quickly: after watching what the others type for twenty minutes, you get a feel for how things work.

This is exactly where the leverage for mixed teams lies. Testers acquire an understanding of development along the way, developers build up test knowledge. No one has to take on the other role completely.

If I say, “I didn’t understand you, please explain it to me in more detail,” there are guaranteed to be one or two people in the room who also say, “I’ve always wanted to know that.”
Thomas Much

It works because an ensemble is a safe space. Supposedly stupid questions that nobody dares to ask in a one-to-one situation are easier to ask in a larger group. Thomas has seen CEOs, team leaders and project managers join in, simply to see if it really is that accessible.

The right group size is four to six

Four to six people is the ideal size for an ensemble. With fewer participants, there is a lack of ideas and the dynamic shifts.

In a group of two, one person usually dominates and the other is drowned out or doesn’t dare to ask. Even with three people, one person is often sidelined as soon as the other two share the same opinion. With four or more people, the voices balance each other out and the group moderates itself. The loudest voice sometimes has to be quiet because it is sitting at the keyboard.

More than six participants make the session sluggish. Thomas once worked with twelve people and rotated the typist every two minutes. It was fun, but the group was actually too big.

Take enough time, otherwise it’s no good

An ensemble session takes time, and rushing is poison. Under stress, no one dares to ask questions and the safe space disappears.

Thomas recommends that teams who are starting out reserve a block of two to three hours per week. If the task is finished earlier, all the better. Then use the rest of the time to make the code more maintainable or automate another test. The time fills itself.

Experienced teams can get by with an hour once the setup is prepared and the computer is running. When starting out, however, it is better to plan too generously than too tightly.

The person at the keyboard changes at short intervals, every five to ten minutes. A timer sets the timebox so that the changeover takes place without discussion. Technology enthusiasts like to hang on to the keyboard and are reluctant to give it up. The timer solves this.
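The rotation can be sketched as a simple round-robin schedule. This is an illustrative sketch only: the function `rotation_schedule`, its parameters and the participant names are assumptions made for the example, and in practice a smartphone timer does the same job.

```python
from itertools import cycle


def rotation_schedule(participants, session_minutes, timebox_minutes):
    """Return (start_minute, typist) pairs for one session.

    Participants take the keyboard in a fixed order, one timebox each,
    and the order repeats until the session time is used up.
    """
    if not participants:
        raise ValueError("at least one participant is required")
    if timebox_minutes <= 0:
        raise ValueError("timebox_minutes must be positive")
    typists = cycle(participants)
    return [
        (start, next(typists))
        for start in range(0, session_minutes, timebox_minutes)
    ]


# Hypothetical group of four, 30-minute excerpt, 7-minute timebox.
for start, typist in rotation_schedule(["Ana", "Ben", "Cem", "Dee"], 30, 7):
    print(f"minute {start:2d}: {typist} types")
```

Because the order is fixed in advance, nobody has to negotiate who types next; the schedule plays the role of the timer that ends each turn without discussion.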

Ensemble work is more sustainable than pair programming

Winning over skeptics for ensemble work is easier than for pair programming. This has to do with the different setting.

Thomas quotes a comparison he knows from an agile coach: pair programming is like a date where the people have to be a perfect match, otherwise it becomes exhausting. Ensemble work, on the other hand, is like a party with friends, more relaxed and non-binding. That makes the format more durable in everyday life. On average, it is used more often.

In practice, teams are somewhere between two extremes. Some try it once and forget about it. Others work intensively with it. Most end up with a healthy mix: in the daily stand-up, they discuss what they want to tackle together this week.

Architecture is a team sport

From an agile, continuous perspective, architecture is not a specialist topic, but teamwork. Local design and architecture decisions are made by the team. Central architects moderate the whole process and ensure that the company does not drift apart.

If architecture is decentralized, it belongs in the ensemble like code and testing. The distinction is simple: what is hidden in the code is design. What can be observed from the outside is architecture.

Where architecture teams work more separately, it is worth bringing them into the sessions on a regular basis. Architects who make decisions centrally then experience what their specifications mean in terms of implementation. This experience gives them new ideas on how they can set more appropriate guidelines.

The same applies to business departments that complain about too few features. If you bring them in, they understand what makes development difficult: unclearly specified features or contact persons who are rarely available.

An ensemble meeting produces code, not minutes

The most common management objection is: four people, two hours, and they should be working instead. Thomas turns the argument around. Managers never sit alone in meeting rooms. They exchange ideas because they want to achieve something.

The difference with an ensemble session is the result. At best, management meetings produce an agenda or minutes. An ensemble session produces tested code with a proper architecture. Exactly what a development team needs.

The format was created over ten years ago by Woody Zuill and his team, initially as an under-the-radar project that bypassed management. The reason given to superiors was simple: we’re having a meeting. But this meeting produces code and tests.

How to set up the first session

Start on site, not remotely. On site, there are facial expressions, gestures and shared laughter, and people do not talk over each other. Remote work requires screen sharing, tooling for the handover and much more discipline. You can save that for later.

The most important success factors for getting started:

Room: not too small, and not so big that it becomes anonymous.
Participants: Four to five colleagues who want to take part.
Preparation: One person prepares the setup in advance so that the computer is running. If the group has to wait for half an hour, it loses interest.
Setup: Computer at a standing desk, the rest of the group sits around it and relaxes. The person whose turn it is to type stands up and walks to the computer. This small movement separates doing from watching.
Roles: The typist holds back and does not become active on their own initiative. The group thinks aloud.
Timer: Rotation via smartphone timer so that the change happens without discussion.

Coding katas, i.e. small exercises, are suitable for getting started. Then it’s quickly on to real tasks so that the format doesn’t just prove itself in the classroom. If you want to delve deeper, you can find video material from Woody Zuill on Mob Programming and Software Teaming and from Lisi Hocke on Ensemble Testing.