Best practices for (architecture) documentation

Architecture documentation quickly becomes outdated because the tooling makes it slow to maintain. Documentation as Code and Continuous Documentation address exactly that.

Documentation as Code means treating architecture documentation like source code: you write it in text formats such as Markdown or AsciiDoc, version it in Git and deliver it automatically as PDF or HTML via build pipelines. Combined with Continuous Documentation, i.e. iterative maintenance during ongoing development, the documentation stays up to date and reviewable.

Key Takeaways

  • Documentation as Code treats architecture documentation like source code: same tools, text formats like Markdown or AsciiDoc, versioned in Git with diff capability and review via pull request.
  • AsciiDoc is a better choice than Markdown for technical documentation because it offers a real standard, lets you modularize documents and enables a single source of truth.
  • Diagrams as text, for example with PlantUML, can be versioned and compared directly, but break down with larger architecture diagrams, where the automatic layout can no longer be controlled.
  • Continuous Documentation integrates the writing and maintenance of documentation into the CI/CD pipeline so that the documentation is built with each build and delivered as PDF or HTML.
  • Missing documentation becomes visible in the review process if documentation is part of the Definition of Done and incomplete documentation blocks the merge of a pull request.

Why architecture documentation is so often left undone

Architecture documentation regularly falls by the wayside in day-to-day project work. It is postponed, shortened or left out altogether because other tasks seem more urgent and documentation is rarely fun.

The reasons are mundane and recurring. Something has to be finished quickly, so the documentation moves down the priority list. Developers prefer programming to writing text. And those who do want to document often need a special tool, such as a UML tool for which the company has a single license that someone else is currently using.

Then there is resignation. Many people assume that nobody reads the documentation anyway and that it is outdated as soon as it is written. If that’s the expectation, you don’t even start.

Falk Sippach relates these observations to architecture documentation, but emphasizes that the concepts apply equally to other types of documentation. This is relevant for testers: documentation is what they orient themselves by, whether as a test basis or as a reference. Testers today also look at many levels; they sometimes ask for a sequence diagram or want to understand how the architecture fits together.

Documentation as code: Documentation is treated like source code

The core idea is to treat documentation like source code. The same tools, the same text formats, the same version management, the same integration into the build.

The approach is not new, and many practise it without calling it that. The suffix “as code” is now widely used, from infrastructure as code to diagrams as code. The idea behind it remains constant: expressing things in code, or at least handling them like code.

In concrete terms, this means that you write the documentation in the same IDE as the source code, in a text file with a lightweight format such as Markdown or AsciiDoc. You need no extra tool, no license and no context switch. Editor, command line and build tool are already there.

You simply write from top to bottom, as you would a letter. You mark headings, bulleted lists and tables with lightweight markup. You don’t see the finished layout while typing, but writing is easier that way. IDE plugins provide a preview: the text on the left, the rendered result on the right.

“When I introduce the concept, many people already know it. And those who don’t quickly realize: that’s really easy, why don’t we just do it that way?”

  • Falk Sippach

Why AsciiDoc has an advantage over Markdown

AsciiDoc is the more robust choice for technical documentation because Markdown has only a thin standard and many vendor-specific dialects.

Markdown works for many things, but for technical documentation some features are missing or exist only in certain dialects. AsciiDoc provides them out of the box, for example a table of contents and proper tables.

A practical advantage is modularization. Instead of one large file, you create, say, ten small ones, each for a chapter or section, and a central document includes these parts. This lets you compile content flexibly and still keep a single source of truth.
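The modularization described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_master` and the chapter file names are assumptions, not something the article prescribes. The sketch relies on the AsciiDoc `include::` directive and on the `leveloffset` attribute as supported by Asciidoctor.

```python
def build_master(title: str, chapters: list[str], level_offset: int = 1) -> str:
    """Return the text of a central AsciiDoc document that includes each chapter file."""
    lines = [f"= {title}", ":toc:", ""]
    for chapter in chapters:
        # include:: pulls in another file; leveloffset shifts the chapter's
        # headings so that they nest below the document title.
        lines.append(f"include::{chapter}[leveloffset=+{level_offset}]")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical chapter files, one per section of the architecture document.
    chapters = ["chapters/context.adoc", "chapters/building-blocks.adoc"]
    print(build_master("Architecture Documentation", chapters))
```

Each chapter lives in exactly one file, so a second central document (for example a short overview for stakeholders) can include a subset of the same chapters without copying any text.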

The advantage of plain text: Focus on the content

Text formats are not distracting. You write content, not formatting.

In Word, you deal with a hundred side issues and the layout often ends up looking different than intended. Word is made for letters, not for large documents. Anyone who has written a long paper in it during their studies knows this.

Nobody thinks about formatting when writing an email either. Writing in Markdown or AsciiDoc feels the same: the content is in the foreground, the trappings take a back seat.

Git makes documentation versionable and reviewable

Documentation belongs in the same version management system as the code, and Git is the standard for this today.

Git is a distributed system. You can continue working offline with a local copy, for example on the train, and push the changes later. You can merge, open pull requests and carry out reviews for both code and documentation.

This creates real traceability. You can compare versions, see differences and incorporate feedback in a traceable way. This is exactly where Documentation as Code merges seamlessly into Continuous Documentation.

Three levels for integrating diagrams

A picture is worth a thousand words, and a document with thirty pages and lots of diagrams is more likely to be read than a hundred pages of text alone. There are three levels for diagrams, which differ in terms of flexibility and maintenance effort.

The levels at a glance:

| Level | What | Problem with feedback |
|---|---|---|
| 1 | Binary formats such as PNG, JPEG | Diagram must be opened, changed and re-exported in the original program |
| 2 | Vector graphics tools such as Visio or UML tools | Own format, additional export to PNG or JPEG required |
| 3 | Text-based diagrams (diagrams as code) | Hardly any, as the source is text and remains versionable |

With the first two levels, an additional step is required for each change. You open another program, change the diagram and export it again. This costs time and effort with every feedback loop.

The trick with Draw.io

A good compromise for more complex graphics is Draw.io, also known as diagrams.net, a lightweight open source vector graphics program. It runs standalone, as a plugin in Confluence and in IDEs, i.e. exactly where you need it.

The real trick is in the metadata. Draw.io saves its vector data directly in the PNG image, comparable to the Exif data of a camera. You reference the image as normal and can still open and edit it again at any time. The separate export step is no longer necessary. This is a practical solution for architecture diagrams with lots of boxes and lines.
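To make the metadata idea tangible, here is a minimal sketch that lists the keywords of the textual metadata chunks in a PNG file, using only the chunk layout defined by the PNG format. The article does not say which chunk or keyword Draw.io uses, so the sketch makes no assumption about that; the keyword `diagram-source` in the demo is made up, and the demo bytes are just enough for the parser, not a displayable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def text_keywords(png_bytes: bytes) -> list[str]:
    """List the keywords of all textual metadata chunks (tEXt, zTXt, iTXt) in a PNG."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    keywords = []
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        # Every chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type in (b"tEXt", b"zTXt", b"iTXt"):
            # These chunks begin with a Latin-1 keyword terminated by a NUL byte.
            keywords.append(data.split(b"\x00", 1)[0].decode("latin-1"))
        pos += 8 + length + 4  # header + data + CRC
    return keywords


def chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Assemble one PNG chunk; used here only to build the demo bytes."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


# Demo bytes: signature, one text chunk with a made-up keyword, end marker.
demo = PNG_SIGNATURE + chunk(b"tEXt", b"diagram-source\x00<xml/>") + chunk(b"IEND", b"")
print(text_keywords(demo))
```

Running `text_keywords` on the bytes of an image saved by your own diagram tool shows whether it carries extra text metadata next to the pixels.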

Diagrams as text with PlantUML

At the third level, you write diagrams as text using tools such as PlantUML. You describe sequence or component diagrams in a fixed notation, and the tool computes the layout itself.

This works well for sequence diagrams because the direction is clear: from top to bottom and from left to right. With component diagrams, the layout can come out differently after every change, and the approach reaches its limits as diagrams grow. Then the Draw.io middle way is the better option.

The text-based approach has a clear attraction: what remains are text files that go into Git, can be versioned and can be compared with previous versions. The text alone is often enough to capture the current state, even without the rendered image.
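A small sketch shows why text-based diagrams compare so well. It produces PlantUML sequence diagram source from a list of messages and then diffs two versions with Python's standard `difflib`, much like Git would. The participant names, message labels and the file name `order.puml` are invented for illustration.

```python
import difflib


def sequence_diagram(messages: list[tuple[str, str, str]]) -> str:
    """Render (sender, receiver, label) triples as PlantUML sequence diagram source."""
    lines = ["@startuml"]
    lines += [f"{sender} -> {receiver}: {label}" for sender, receiver, label in messages]
    lines.append("@enduml")
    return "\n".join(lines) + "\n"


# Two versions of the same diagram: the second adds a call to a payment service.
old = sequence_diagram([("Client", "OrderService", "placeOrder")])
new = sequence_diagram([
    ("Client", "OrderService", "placeOrder"),
    ("OrderService", "PaymentService", "charge"),
])

# The diff shows the added interaction as a single added line.
diff = list(difflib.unified_diff(
    old.splitlines(), new.splitlines(),
    fromfile="order.puml (before)", tofile="order.puml (after)", lineterm="",
))
print("\n".join(diff))
```

A reviewer sees the new interaction as one added line, without opening a diagram tool or comparing two images by eye.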

Limitation as a side effect: smaller, more targeted diagrams

The fact that text-based diagram renderers reach their limits with large structures is not purely a disadvantage. It forces abstraction.

The real problem is that diagrams generally become too large. You want to pack everything in, and then the diagram is out of date before it is saved. Limiting yourself to what is important forces you to create several smaller views instead of a wallpaper in A0 format.

How continuous documentation gets into the pipeline

Continuous documentation means that the documentation becomes part of the build and delivery, just like the code.

In the simplest case, you have an AsciiDoc or Markdown file. Processors generate PDF or HTML from it, on the command line or via plugins. Build tools are scripted anyway: they build the software, run the tests and create containers. A step for the documentation is simply added.

The result is stored as a PDF, published as a website or exported to a company-wide wiki such as Confluence. Platforms such as GitHub and GitLab render Markdown and AsciiDoc directly in the browser anyway, so you can always see the end result.

The documentation is versioned along with the software. If you build version 7.3, the documentation is built as version 7.3 and stored with it. Version 7.2 remains accessible in a different directory. You can access any version at any time.
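As a rough sketch of such a build step, the following script assembles an Asciidoctor command that renders a document to HTML into a directory named after the version. It assumes the `asciidoctor` command line tool is installed on the build agent; the paths `docs/architecture.adoc` and `build/docs` are placeholders, not part of the article.

```python
import subprocess
from pathlib import Path


def asciidoctor_command(source: str, version: str, out_root: str = "build/docs") -> list[str]:
    """Build the Asciidoctor call that renders `source` to HTML in a per-version folder."""
    out_dir = Path(out_root) / version
    return [
        "asciidoctor",
        "-b", "html5",                 # output format
        "-a", f"revnumber={version}",  # version attribute available in the document
        "-D", out_dir.as_posix(),      # destination directory, e.g. build/docs/7.3
        source,
    ]


def build_docs(source: str, version: str) -> None:
    """Run the build; raises if Asciidoctor exits with a non-zero status."""
    subprocess.run(asciidoctor_command(source, version), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Asciidoctor.
    print(" ".join(asciidoctor_command("docs/architecture.adoc", "7.3")))
```

Because the output directory contains the version number, building 7.3 leaves the 7.2 output untouched. In a real pipeline the version would come from the build tool rather than a literal.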

Reviews make missing documentation visible

A review process ensures that documentation is not forgotten. It becomes part of the Definition of Done.

If a developer implements a feature that affects the architecture, updating the documentation belongs to the work. During the review, it becomes apparent if the code has been adapted and tested but the documentation has not been updated. The change is then not approved and goes back to the author.

This is exactly where the text format pays off. In the pull request, you can see the diff of the documentation directly, just like with the code. For tools with their own binary format, this diff is missing and the review becomes more difficult.
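Part of this check can be automated as a reminder for reviewers. The sketch below flags a change set that touches code but no documentation. The directory prefixes `src/` and `docs/` are assumptions about the repository layout, and the rule is deliberately crude: it cannot tell whether a change actually affects the architecture, so it supports the human review rather than replacing it.

```python
def missing_docs(changed_files: list[str],
                 code_prefix: str = "src/",
                 docs_prefix: str = "docs/") -> bool:
    """Return True if the change set touches code but no documentation file."""
    touches_code = any(path.startswith(code_prefix) for path in changed_files)
    touches_docs = any(path.startswith(docs_prefix) for path in changed_files)
    return touches_code and not touches_docs


if __name__ == "__main__":
    # Hypothetical change sets of two pull requests.
    print(missing_docs(["src/orders/service.py"]))
    print(missing_docs(["src/orders/service.py", "docs/architecture.adoc"]))
```

The list of changed files could come from `git diff --name-only` against the target branch; how the result is surfaced (comment, warning or failed check) is a team decision.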

Generate yes, but only where the code does not provide it

Generating documentation from the source code is worthwhile where information is otherwise not visible, not for what is in the code anyway.

What is directly readable in the source code should not end up redundantly in the documentation, not even generated. Nobody reads it. Documentation must remain lean to ensure acceptance.

Software architecture, on the other hand, is not obvious from the code. It is the big picture that holds everything together, and that is exactly what you don’t see at first. This is where generation becomes valuable. With meta information in the code, domain-driven design elements such as bounded contexts, aggregates or value objects can be marked and used to generate diagrams. If the code changes, the diagram changes with it.

There is hardly any widely used off-the-shelf tool for this; some industries with a specific need build their own. It is not difficult. You can write an automated test that reads the source code, extracts information and generates a PlantUML or AsciiDoc file from it, which is then rendered in the documentation.
