Building Trust with AI Agents

Trusting an AI agent works the same way as trusting a colleague: you need clear communication, checks, and a system that catches what the model gets wrong.


Trusting AI agents in software systems means ensuring that automated processes produce correct, intended outputs through checks, guardrails, and validation layers built around the AI model itself. This requires solving two core problems. The first is communicating requirements clearly enough for the system to act on them. The second is verifying that the output actually matches what was wanted, which testers call the Oracle problem.

Key Takeaways

  • AI systems amplify both mistakes and successes at scale, so the checks, guardrails, and validation processes built around the model matter more than the model itself.
  • Testing AI requires a shift from deterministic pass/fail checks to monitoring trends and mean time between failures, because non-deterministic outputs cannot be verified with a single green test.
  • The communication problem with AI agents is structurally identical to the bug-report problem with humans: vague input produces generic, context-free output that misses the actual need.
  • As AI-generated code becomes a black box, test specifications and acceptance criteria become the primary source of truth, making the tester’s skill set central rather than peripheral.
  • AI democratizes software creation by removing the need for programming knowledge, which surfaces long-ignored organizational problems such as document version control and missing single sources of truth.

Trust does not live in the model, it lives in the system around it

The unit of trust in AI is not the model but the system you build around it. A language model predicts; it does not analyze, and it does not think. Trust has to come from the process, the checks, and the guardrails wrapped around that prediction.

This is the same standard you already apply to people. You trust a colleague because of a track record, shared understanding, and the ability to verify what they hand back. With an AI agent, the same scaffolding has to exist: a process, checks and balances, and validation steps that catch the system when it drifts.

When that scaffolding works, you can treat the agent the way you treat subcontracted software. You write the spec, you accept the delivery, and you check whether you got what you asked for. Henri Terho frames the whole question of AI trust around two old problems that testers already know well.

The two problems that decide whether you can trust an AI agent

Two classic testing problems return with AI, and both have to be solved before trust is possible: the communication problem and the Oracle problem.

The communication problem is about conveying what you actually want. When someone tells an AI “fix my code now”, the result is poor for the same reason a bug report saying “production is broken” is useless. There are no logs, no context, no specifics. People get angry at the output without noticing they gave the system nothing to work with.

Humans paper over these gaps with experience. A colleague fills in the missing context from years of shared work. A model has no such experience, so the prompt, the requirements, and everything you feed it have to be specific enough to stand on their own.

The Oracle problem is about knowing whether the output is right. Even when the AI produces something, you still have to decide whether it is correct or merely plausible. That question gets philosophical fast inside a company. “Right” might mean the boss asked for it, or that the feature makes money, or that it matches a real specification. Without a clear answer, you cannot judge the output.

Answer both problems and the trust you are looking for follows. Skip them and you are guessing.

Why generic AI output fails you

A generic prompt produces a generic answer, and that is the failure mode most people walk into. Ask a general-purpose chatbot for a business strategy as a small software business owner, and you get a clean, well-structured strategy that fits no one in particular.

The output looks complete. It has all the expected points and reads well. But it carries none of your context. It does not know you sell from Finland into another country, or that German rules apply, unless you said so.

The fix is unglamorous: more context, more specifics, better requirements. The reason this annoys people is simple. Writing a precise prompt is more work than typing “fix this”, and most people would rather have the fast answer than the correct one.

Testing AI means watching trends, not single results

AI testing shifts from a single pass-fail result to statistical behavior over time. Conventional software is deterministic. You give it an input, you expect one answer, and if you do not get it, you have a bug. That makes testing straightforward.

AI models are non-deterministic, so a single green test proves little. You watch trends, mean time between failures, and behavior across many runs. This is closer to how machine shops and aviation think about reliability than to a one-point check.
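To make the idea of watching behavior across many runs concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from the article: `measure_reliability`, `ReliabilityReport`, and `make_flaky_check` are hypothetical names, and the flaky check is a seeded random stand-in for "call the model and validate its answer". Mean time between failures is approximated here as mean runs between failures, since the sketch has no real clock.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReliabilityReport:
    runs: int
    failures: int
    pass_rate: float
    # Average number of runs per observed failure; infinity if none failed.
    mean_runs_between_failures: float


def measure_reliability(check: Callable[[], bool], runs: int) -> ReliabilityReport:
    """Run a pass/fail check many times and summarize the results."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    results: List[bool] = [check() for _ in range(runs)]
    failures = results.count(False)
    pass_rate = (runs - failures) / runs
    mrbf = runs / failures if failures else float("inf")
    return ReliabilityReport(runs, failures, pass_rate, mrbf)


def make_flaky_check(failure_probability: float, seed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a non-deterministic model call plus validation."""
    rng = random.Random(seed)
    return lambda: rng.random() >= failure_probability


if __name__ == "__main__":
    # One green run says little; a thousand runs give a rate you can track.
    report = measure_reliability(make_flaky_check(0.05, seed=1), runs=1000)
    print(report)
```

In practice the interesting signal is how `pass_rate` moves between releases or model updates, so a report like this would be stored and compared over time rather than read once.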

Plan for rare failures. Black swan events appear in these models because you cannot predict every output. A test suite that only confirms “green once” gives you false confidence.

There is a moving-target problem on top of this. The models change underneath you when providers update them. When the components of your system keep shifting, stability becomes something you have to engineer deliberately, not something you inherit.

Code becomes a black box, so the spec becomes the truth

As AI generates more of the application, the specification and the test become the only reliable source of truth. If the software passes what you specified, it is fine, and you do not need to read the generated code.

This pushes practices like behavior-driven and test-driven development from “nice tooling” to the core of the work. The real question becomes how you write down what you want, in a form precise enough to act as acceptance criteria.
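One way to picture acceptance criteria acting as the source of truth is to write them as executable checks that treat the implementation as a black box. The sketch below is only an illustration: the discount example, `Criterion`, `CRITERIA`, `failed_criteria`, and `generated_apply_discount` are all hypothetical and do not come from the article or from any particular BDD framework.

```python
from typing import Callable, List, NamedTuple

# The black box under test: any function taking (price, percent) and returning a price.
DiscountFn = Callable[[float, float], float]


class Criterion(NamedTuple):
    description: str
    holds: Callable[[DiscountFn], bool]


# Acceptance criteria written independently of any implementation.
CRITERIA: List[Criterion] = [
    Criterion("no discount leaves the price unchanged",
              lambda f: f(100.0, 0.0) == 100.0),
    Criterion("a 25 percent discount on 200 gives 150",
              lambda f: f(200.0, 25.0) == 150.0),
    Criterion("the discounted price is never negative",
              lambda f: f(50.0, 100.0) >= 0.0),
]


def failed_criteria(implementation: DiscountFn) -> List[str]:
    """Return the descriptions of every criterion the implementation violates."""
    return [c.description for c in CRITERIA if not c.holds(implementation)]


def generated_apply_discount(price: float, percent: float) -> float:
    # Stand-in for code an AI agent produced; the checks never read its body.
    return price * (1 - percent / 100)


if __name__ == "__main__":
    print(failed_criteria(generated_apply_discount))
```

The design choice worth noting is that the criteria only call the function and inspect results, which is what lets them stay valid when the generated code is replaced wholesale.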

There is a trap waiting here. People will overcorrect and write enormous specs, and you end up with millions of lines of specification the way teams now have millions of lines of code. Those specs will conflict. Systems will behave strangely, and you will spend your time debugging the spec rather than the code.

The abstraction level rises sharply. Writing a full open specification for a system as large as a CRM platform is genuinely hard, because no single person can lay out everything such a system should do.

AI pressure comes from outside IT, not from within it

The push for AI arrives from the business side, not from the engineering core. People from marketing now walk up and ask whether AI can solve a concrete problem they have. That kind of inbound request rarely happened with traditional software development.

This changes who drives the work. Demand comes from people closer to where the business creates value, who want a tool that tracks clients or surfaces what is happening, rather than from someone whose identity is built on being the database expert.

AI is an approachable enabler because it does not require fluency in a programming language. You can prompt it in plain language, and that democratizes computing. The flip side is that quality becomes questionable, because almost anyone can now produce software without anyone checking whether it holds up.

AI is not gonna take your job. They’re not gonna automate you, it’s going to augment you and the way that you work. And there’s a lot of work to be done around this area. — Henri Terho

Old organizational problems resurface, and now you have to solve them

AI forces unsolved human problems back to the surface, because the system needs a single source of truth that people never bothered to define. When AI becomes the main interface to a company’s knowledge, you have to write down what the organization actually does.

Consider a document store with twenty versions of the same file. Which one is correct? Answering that needs real context: whether a version went to a client, which one, and whether it was revised afterward. The problem sounds trivial and is not.

Software decisions have always carried hidden business decisions. A developer who locked a company into one cloud platform made a far-reaching call that shaped everything afterward, often without anyone noticing at the time. AI makes these buried decisions visible and demands they be addressed.

To handle the verification side, build guardrails into the platform. One option is to have multiple AI instances examine and discuss whether a given output is actually good, so the system checks itself rather than trusting a single pass.
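A rough sketch of that guardrail might look like the following. It simplifies "examine and discuss" down to independent votes with a quorum, which is an assumption made for brevity. The reviewers here are plain rule-based functions standing in for separate AI instances, and every name (`Reviewer`, `accept_output`, `is_non_empty`, `is_short`, `avoids_placeholder`) is hypothetical.

```python
from typing import Callable, Sequence

# A reviewer inspects one output and returns True to approve it.
Reviewer = Callable[[str], bool]


def accept_output(output: str, reviewers: Sequence[Reviewer], quorum: float = 0.5) -> bool:
    """Accept the output only if the share of approving reviewers exceeds the quorum."""
    if not reviewers:
        raise ValueError("at least one reviewer is required")
    approvals = sum(1 for review in reviewers if review(output))
    return approvals / len(reviewers) > quorum


# Rule-based stand-ins; in a real system each would be a separate model call.
def is_non_empty(output: str) -> bool:
    return bool(output.strip())


def is_short(output: str) -> bool:
    return len(output) <= 200


def avoids_placeholder(output: str) -> bool:
    return "TODO" not in output


if __name__ == "__main__":
    reviewers = [is_non_empty, is_short, avoids_placeholder]
    print(accept_output("Refund approved.", reviewers))
    print(accept_output("TODO", reviewers, quorum=0.9))
```

Raising `quorum` trades throughput for caution: a stricter threshold rejects more borderline outputs, which then need a human or another pass.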

Why a tester’s skill set fits the AI era

The testing mindset maps directly onto what AI work demands. The job already revolves around defining criteria, writing them into specs and test cases, and verifying outputs. That baseline puts testers ahead of many other roles in IT.

What you add on top is statistical thinking. Move away from caring about one test going green and toward watching how results trend across runs. Then widen your reach beyond the tester’s box.

Testing and validation will absorb parts of neighboring roles, including pieces of DevOps and programming. Expanding into those areas raises your value, and the broader your context, the better your judgment, exactly as more context improves an AI’s output.

The dominant reaction to all this change is fear, and it is a natural response to a shift this large. The fear of being automated away is the one worth setting aside first. The work ahead augments testers rather than replacing them, and there is a great deal of it.
