How to create the perfect conference program

The program of a software testing conference is created through a multi-stage selection process: a program committee evaluates all submissions, each of which is reviewed by at least three reviewers. At Eurostar, 527 valid submissions were received for around 50 program slots, which corresponds to an acceptance rate of less than ten percent. Quality, practical relevance and thematic balance are the deciding factors for acceptance.

Key Takeaways

Of 527 valid submissions, fewer than one in ten make it into the Eurostar program: with around 50 slots, the acceptance rate is below 10 percent.
AI-generated abstracts are not rejected because of their origin, but because they leave it unclear what the problem is, who is speaking and what personal experience is behind it.
Those who receive a rejection and request feedback will receive individualized feedback from Eurostar that will identify specific improvements for the next submission attempt.
Members of the program committee and their company colleagues are not allowed to speak at Eurostar so that there is no conflict of interest in the program selection.
Those wishing to submit a presentation will increase their chances if they start early, have someone proofread the abstract and contact the Program Chair directly.

What distinguishes a good test conference from an average one

A good conference is not created by having the best talks, but by the process that leads to those talks. The answer “choosing the best talks” is as empty as saying that good software comes from writing good software.

From a quality perspective, it is the selection process behind it that counts. How does a program committee filter out the few that make it into the program from hundreds of submissions? Which criteria decide? How do you ensure that the result is a strong and coherent program?

This is precisely where the difference between conferences lies. Not in slightly different rules, but in how strict and how structured the filter is.

How a program committee builds a program from 527 submissions

The Program Chair is responsible for the program and puts together a small team. At this year’s Eurostar in Oslo, this committee comprised five people. Their primary task: to act as a filter between all the submissions and what actually makes it onto the stage.

The figures show just how tight this filter is. 527 complete, valid submissions entered the review process, after incomplete or obviously unusable entries had already been sorted out. Set against this were around 50 slots in the program over three days. This results in an acceptance rate of less than 10 percent.

Each committee member evaluated around 100 submissions. In addition, further reviewers looked at each submission, so that each presentation received at least three reviews. These reviewers come from the community: people who have already spoken themselves, are involved in the topics and work their way through the submissions on a voluntary basis.
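The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, the slot count is the approximate figure from the text, and the even split of submissions across committee members is an assumption made only for this calculation.

```python
# Figures taken from the text; the slot count is approximate ("around 50").
valid_submissions = 527
program_slots = 50
committee_members = 5
min_reviews_per_submission = 3

# Share of valid submissions that can be accepted.
acceptance_rate = program_slots / valid_submissions

# Assumption: submissions are split evenly across committee members.
submissions_per_member = valid_submissions / committee_members

# Lower bound on the number of reviews written in total.
min_total_reviews = valid_submissions * min_reviews_per_submission

print(f"Acceptance rate: {acceptance_rate:.1%}")                            # 9.5%
print(f"Submissions per committee member: {submissions_per_member:.0f}")   # 105
print(f"Minimum number of reviews overall: {min_total_reviews}")           # 1581
```

Under the even-split assumption, each member handles roughly 105 submissions, which is consistent with the "around 100" mentioned above.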

Such a low acceptance rate separates Eurostar from most other test conferences. As a result, many good presentations are rejected, not because they are bad, but because another contribution on the same topic was a tad better.

Why a rejected presentation is often still good

When slots are tight, a rejection rarely means that the submission was weak. With an acceptance rate below 10 percent, submissions that would have been accepted at many other conferences are left out.

At Eurostar, after a rejection there is the opportunity to ask for the reasons. For this year’s edition, around 70 such requests were answered individually: the submission was looked at again, the video was viewed, all review comments were read and constructive feedback was formulated to increase the chances in the next round.

This feedback no longer benefits the current program. It is motivated by personal experience of how frustrating a rejection without a reason is. If you only get a “no”, and then hear nothing at all or only another “no” when you ask, you learn nothing from it.

This feedback is valuable for speakers. It shows whether the submission lacked practical grounding, whether the examples should have been chosen differently or whether the wording simply missed the point. Sometimes it’s just a nuance.

Different perspectives in the committee make the program better

A program committee should deliberately bring together different backgrounds. When five people with different professional backgrounds look at the same submission, this mix catches contributions that a single perspective would have overlooked.

A presentation that leaves one person cold can strongly convince another for an understandable reason. This friction gives rise to the discussions that balance a program: Which submission brings a perspective that would otherwise be missing? Which one complements the picture?

When slots are tight, the decision often comes down to details. Deciding between two very good, thematically similar presentations is not pleasant because only one of them fits in.

Why an AI-generated abstract is rarely convincing

A completely AI-generated abstract does not fail because it comes from an AI, but because it is usually poor. It remains unclear what it is about, what the contribution brings and what qualifies the person speaking for it.

An excellent presentation is based on personal work experience. From your own failures, your own successes, from things you have done yourself. That’s why you are the best expert, can answer questions authentically and have material that no language model knows. An AI does not know what you have experienced in your work.

There is nothing wrong with using language models as long as they support rather than replace. For non-native speakers in particular, it helps to look at suggestions at word or sentence level. It is crucial to check each individual suggestion: Does the sentence still sound like you? Is it really easier to understand? Would you have formulated it that way yourself? Then adopt it. Otherwise, don’t.

A telltale pattern: If an abstract is still not clear after three readings, a check with a detection tool almost always indicates that the text was entirely generated.

A mandatory video raises the entry threshold

A mandatory video link upon submission filters out random, purely generated contributions. For Eurostar, 30 seconds recorded directly on a cell phone is enough.

The video serves two purposes. It increases the effort required so that no-one just goes ahead and submits generated material. And it gives the committee an impression of how a person acts when they speak and whether the whole setting fits together plausibly.

Keynotes thrive on an extraordinary career

The most interesting keynotes are given by people with an unusual career path who have experienced something that is relevant to the community and has not yet been told very often. Here too, personal experience is the best prerequisite.

Two examples from this year’s program illustrate the principle. Wolfgang Platz, founder of Tricentis, developed Tosca as a test automation solution within Allianz, founded a company from it and built it up over decades. With this span of experience, he can describe how testing and test automation have changed over three decades and across different cultures, organizations and domains.

Michael Kutz has held various roles within the same company, always with a focus on software quality. His question is directed at the teams: why do some work very well and others not, and what factors determine this? The quality lens is not only focused on software, but also on the team.

Users deserve more space on the program

Conferences receive a disproportionately high number of submissions from people who make a living from presenting or who want to make a product more visible. These can be excellent presentations, but they don’t have to be.

In contrast, the user group is often underrepresented: people who do the testing in a company themselves, run a testing department or introduce a tool and report on it. It is precisely these voices that need special attention so that they are strongly represented in the program.

The fact that someone has a product in the background does not preclude a good presentation. It depends on whether stories, experiences and new insights are in the foreground and the product takes a back seat.

A strict rule against self-placement

At Eurostar, neither the members of the Program Committee nor the Program Chair is allowed to speak. No one who works for the committee members’ companies is allowed on the program either.

This rule prevents the impression that the committee is putting together a program to put itself on the stage. It forces the committee to concentrate on a good program instead of trying to fit in a colleague here and there. For very large companies, this is hard because the whole company is shut out of the program for a year. But that’s how the separation works.

Speaking from personal experience beats any generic slide

The most worthwhile presentations report specifically: this is what we did, this is what worked, this is what didn’t, this is how the problem turned out. Not “here are the possible problems and maybe it will work”, but a real review from practice.

When it comes to AI in testing, this year’s Eurostar marks a turning point. It is the first edition with contributions that evaluate a deployment over the course of a year because the tools were simply too young beforehand. The discussion is shifting from “we have to use it and be careful” to concrete applications and the question of how humans and AI can interact meaningfully.

Three examples illustrate the range well because they each combine a specific domain with a concrete experience:

Lecture	Speaker / Source	What it’s about
Closing the Loop: Using Field Data as a Quality Metric	Florian Wartenberg, Vestas	Using failure data from the field to improve testing and quality assurance. A fault costs money directly because a wind turbine then delivers less power.
Exploratory Testing for Massively Multiplayer Online Games	Oliver Hilt	Exploratory testing in the game domain. A bug does not lead to hardware downtime, but to players dropping out, which means financial damage in a completely different way.
Using AI to Create User Acceptance Tests: A Case Study from BBC Radio	BBC Radio	AI to create user acceptance tests, evaluated as a concrete case with a review of how it worked.

These presentations are relevant because they are based on a real-life application. A certain maturity is recognizable, especially for the short time the tools have been available.

How you as a speaker overcome the threshold to the call for papers

Start early with the abstract, not at the last minute. The reason is subtle: the better you know a topic, the harder it is to put yourself in the shoes of someone who doesn’t know it. You often hit the wrong level of abstraction for the very topics you are best suited to.

Have your abstract reviewed, ideally by a colleague directly at hand. Unusual, but effective: Write to the program committee or the program chair. Briefly describe your topic, ask if it fits and send a proposal. This rarely happens, but it helps both sides because it improves the contribution.

Talk about something for which you have personal experience. A reviewer should be able to tell from your CV and short bio that you are reporting from your own experience.

And then: trust yourself. It’s normal to be nervous; everyone is. Sometimes it helps to simply send the submission in a confident moment, before the doubts come back.

There are usually three to six months of preparation time between acceptance and the conference. Use this time for a trial presentation. First on your own, to check slide transitions and timing; you don’t need an audience for this. Then in front of a small test audience who will give you real feedback.