Formal methods in software quality

Formal methods are mathematically based techniques that can be used to prove certain properties of software. Instead of a few examples, they generally show that a program meets a specification. The basis is usually functional programming, because code there consists of equations and can therefore be directly mathematically deduced.

Key Takeaways

Formal methods do not provide proof that software is “absolutely correct”, but increase redundancy so massively that an undetected error is practically impossible.
Functional programming reduces the distance between code and mathematics because each function defines an equation and thus “like can be replaced by like”, which is not the case in imperative languages.
Property-based testing is the first practical step towards formal methods: properties are formulated with all-quantifiers across all inputs instead of just checking individual examples, and this works in all common programming languages.
C and C++ are considered the most dangerous programming languages, but are used in safety-critical embedded systems such as cars and medical devices, where formal methods can at least ensure basic correctness properties.

What are formal methods in software development?

Formal methods are a generic term for mathematically based techniques that developers use to convince themselves of certain properties of their software. The name comes from formalization, i.e. translating a requirement into mathematical form.

In essence, it is a matter of proof rather than sampling. In classic testing, there is a specification, often supplemented by a handful of examples. These examples answer the question of what happens in precisely these cases. They say nothing about what happens everywhere else.

Michael Sperber describes this using the example of a speedometer. A car drives at a certain speed and the speedometer is supposed to display the actual speed. There is an EU standard for this requirement, which specifies a formula: At a given speed, the display may only deviate by a certain percentage. A certification authority does not want to be convinced by seven examples, but by a general argument.

Formal methods provide this argument. The standard formula becomes a mathematical statement, the code is also translated into mathematical form, and a proof shows that the two fit together.

Correctness is not an endpoint, but built-in redundancy

The question of whether software is then “really correct” actually belongs at the end. Sperber turns it around and looks at formal methods from an engineer’s point of view.

In every engineering discipline, redundancy is built in to make a system reliable. Formal methods are a means of massively increasing this redundancy. The protection becomes so tight that hardly any errors can get through.

In practice, those involved call it “bomb-proof” or “bullet-proof”. From a more cautious engineering perspective, the statement is more precise: it is the most reliable method known for verifying software properties, and it goes many times further than what testers alone can achieve.

Why functional programming is closer to mathematics

Formal methods require a proof, and proofs live from equations. This is exactly where functional and imperative programming differ.

In C or Java, x = x + 1 is a familiar expression. Mathematically, it is wrong, because x is not the same as x + 1. The expression only works as an instruction: x new is x old plus one. There is a gap between this programming model and mathematics.

This gap can be bridged, but it takes work. If you want to verify a C or Java program, you also have to deal with pointers that point to a common state. A change at one point can affect another point. In mathematics, on the other hand, if x stands for 5, 5 can be used everywhere. Replace the same with the same. In imperative code, this does not apply, and it takes a lot of machinery to restore it.

The problem does not arise in functional programming. When you define a function, you are essentially writing an equation. This is already close to mathematical reasoning and what a proof needs.

Markus Schlegel adds a second view: Functional programming describes what is, i.e. the technical relationships. Imperative languages such as C describe how the computer should do something step by step. A correct speed measurement in a car can be formulated functionally as a property, not as a sequence of instructions.

Functional programming does not cost more, but often less

Doing without familiar imperative constructs initially seems like asceticism. In practice, however, functional development is not more expensive, but often more efficient.

One reason is the additional abstraction possibilities, especially in data modeling. The other reason is the simpler reasoning about the behavior of the program. Thinking about input and output is more straightforward than keeping every state step in the right order in your head.

I don’t debug that much in functional programming. I just look at how my function gives me the right results for the corresponding inputs. Michael Sperber

According to Sperber, studies on the effort per piece of functionality are consistently in favor of functional development. The exact factor is debatable, but there is no loss.

If you come from university and the imperative school, you have to rethink. Schlegel describes ‘i = i + 1’ as the first stumbling block in his learning of programming, because it contradicted what he knew from school mathematics. The path to formal methods leads a little way back to this equation thinking.

Functional languages are an ecosystem of their own, not a single language

Functional programming does not exist as a single language, but as a parallel ecosystem with its own languages, package managers and frameworks. It has developed largely independently of the mainstream.

Haskell is considered a prototypical representative with a strong type system.
Scala on the JVM.
F# on the .NET platform, the functional parallel world to C#, so to speak.
Clojure on the Java platform.

The strong type system in Haskell is intentional: if you program something that tends to be wrong, the program is not even compiled. This strictness puts some people off. Schlegel points out that you don’t necessarily need Haskell with its monads and monoids. Clojure is also functional without placing these hurdles in the foreground.

Many features of modern object-oriented languages originate from functional programming anyway. What is borrowed there has been available in the functional camp for some time.

How a proof is created and appraisaled in code

A proof can be written on paper, on a whiteboard or by machine. For formal methods, machine appraisal is what counts in the end.

The easiest way to start is with type systems. A compiler already proves that a program behaves in certain ways, for example that there is no integer where a string is expected. Newer functional languages know so-called dependent types. This allows more of a specification to be expressed directly as a type, and the type checker verifies it.

The second way is to use proof assistants. They perform a proof like in school, only with tool support. You write down a proof in machine-readable form and have it checked to see if it is valid. Modern tools take care of the annoying parts. If you tell the assistant to perform an induction on a variable, you can have the tool elaborate the individual steps.

The pair cite Isabelle, a tool that is half Munich-based and has been in productive use for many years, as a mature example.

The security comes from the system itself. In a language with a strong type system, the program does not compile unless the proof is correct. Schlegel summarizes the principle simply: if the light turns green at the end, the proof is correct and the specification and implementation are in harmony.

Proofs belong in the CI like tests

A finished proof is code and is treated like code. Sometimes it is stored in a separate file, sometimes it is integrated directly into the source code in the case of dependent types.

This allows it to be stored and integrated into Continuous Integration. Instead of just running a test suite, the proof checker also runs over the program and reports if something is wrong. The proof thus becomes a regular part of the pipeline, not a one-off special action.

Formal methods are a niche, but do not belong in a

Formal methods are considered a niche, although they are suitable for almost every area. In response to the frequent question of what the approach is suitable for, Sperber answers: for everything. The good properties of functional programming offer advantages everywhere.

The reality is different in the embedded environment. C and C++ dominate there, i.e. the most dangerous languages for systems that need to function reliably, for example in a car or a medical device. This is where formal methods come into play, even below functional correctness. They check whether a program uses limited resources, whether pointer aliasing remains clean and whether the C program exhibits well-defined behavior at all.

A practicable path looks like this: The software is functionally developed, tested and verified, and only then is C code generated. The generated code then moves within narrow, defined limits.

Formal methods have been around since the 1950s. The tools have matured accordingly. What is often missing in the mainstream is simply the knowledge that all this exists.

Three life hacks you can start using right away

Getting started with formal methods doesn’t need a proof wizard on day one. Three steps work in any context.

**First: write down the specification ** When you formulate test cases, you don’t pull them out of a hat, but from an idea of how the program should work. Writing down this idea explicitly is the first step and helps regardless of the technology used.

Secondly, use property-based testing. This technique originates from functional programming, but is available for all common languages. Instead of individual examples, properties are formulated for all possible inputs, for example with an all-quantifier. This is not yet bulletproof because the tests are generated and not proven. However, it is a real step towards formal specification.

Schlegel and Sperber note that property-based testing is hardly known in the testing community, even though it is an original testing technology.

**Thirdly, learning to think functionally ** Thinking about the relationship between input and output instead of sequences of instructions reduces errors during development. Material on this is freely available, and the ISAQB has developed a new curriculum on formal methods that provides an overview of the field.