Audit-Ready AML: Evidence-First Investigations and the Four-Eyes Principle
When an examiner asks you to walk through a single decision, can you? How an append-only audit trail, evidence-first design and the four-eyes principle make every AML decision reconstructable — and inspections far less painful.

An examiner rarely asks to see your whole programme at once. They ask about one thing. One customer who scored low-risk and probably should not have. One alert that was closed without a report. One transaction that moved through a corridor nobody flagged. Then they ask the question that decides how the rest of the inspection goes: walk me through this decision. Who made it, on what evidence, who checked it, and when.
If your answer is a confident reconstruction — here is the alert, here is the customer's risk band at the time, here is the analyst's rationale, here is the second approver, here is the timestamp on each step — the inspection becomes a conversation between professionals. If your answer is "let me get back to you" while three people search across systems and email threads, the examiner has learned something about your controls that no policy document will undo.
Audit-readiness is not a report you produce at the end of a quarter. It is a property of how your AML system records what it does, every day, whether or not anyone is watching. This article explains the three design choices that make a Creodata-based programme reconstructable on demand: an append-only immutable audit log, evidence-first design, and four-eyes approval on consequential actions. For where audit-readiness sits among the other controls, see the complete AML platform guide; this piece is about making any single decision defensible.
What examiners and assessors actually ask for
It helps to start from the demand side. Two kinds of reviewer will test your records, and they want slightly different things.
The first is your national supervisor or financial intelligence unit — the FRC in Kenya, the FIA in Uganda, the FIU in Tanzania, the FIC in Zambia or Rwanda — conducting an on-site or thematic examination. Their questions are concrete and case-level. Show me how this customer was risk-rated. Why was this alert closed? Who approved this exception to policy? Prove that the person who escalated the case is not the same person who later signed it off. They are testing whether your stated procedures are the procedures you actually follow.
The second is a mutual-evaluation assessment under the FATF methodology, run regionally through ESAAMLG. Here the lens is wider — effectiveness, not just technical compliance — but it still resolves to evidence. Assessors sample real files and ask the institution to demonstrate that decisions were made on a reasoned basis and can be reconstructed. A programme that produces clean, time-stamped, attributable records sample after sample is making the effectiveness case for itself. We cover that exercise in more depth in preparing for an ESAAMLG mutual evaluation.
Both reviewers are, in the end, asking the same thing: can you reconstruct a decision faithfully, from the evidence that existed at the time, without anyone having had the opportunity to quietly rewrite history? Three design choices answer that question.
The append-only, immutable audit log
Most systems have logs. Far fewer have logs an examiner would trust, because an ordinary log is editable, and an editable record proves nothing about the past — it only tells you what someone last decided it should say.
A Creodata programme records consequential events to an append-only, immutable audit log. Entries are written, never updated or deleted. A correction is not an edit over the old value; it is a new entry that supersedes the previous one, leaving both visible in sequence. The log is the system's memory of events: you cannot reach back and change what was said, only add what was said next.
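To make the "correction is a new entry" idea concrete, here is a minimal in-memory sketch. It is not Creodata's implementation; the names `AuditEntry`, `AppendOnlyLog` and `supersedes` are hypothetical, and a real log would live in durable storage with database-level protections.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: an entry cannot be mutated once created
class AuditEntry:
    seq: int                          # position in the log, assigned on append
    at: datetime                      # when the event was recorded (UTC)
    actor: str                        # who did it
    action: str                       # what happened
    entity: str                       # which customer, alert or case
    detail: str
    supersedes: Optional[int] = None  # seq of the entry this one corrects


class AppendOnlyLog:
    """Hypothetical log that exposes append and read, but no update or delete."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def append(self, actor, action, entity, detail, supersedes=None) -> AuditEntry:
        if supersedes is not None and not (0 <= supersedes < len(self._entries)):
            raise ValueError("can only supersede an entry that already exists")
        entry = AuditEntry(len(self._entries), datetime.now(timezone.utc),
                           actor, action, entity, detail, supersedes)
        self._entries.append(entry)
        return entry

    def entries(self) -> tuple[AuditEntry, ...]:
        # Return an immutable copy so callers cannot alter the stored sequence.
        return tuple(self._entries)


log = AppendOnlyLog()
first = log.append("analyst.a", "risk_band_changed", "customer:1001", "band=LOW")
# The correction is a new entry pointing back at the old one; both stay visible.
fix = log.append("analyst.a", "risk_band_changed", "customer:1001",
                 "band=MEDIUM (corrects a data-entry error)", supersedes=first.seq)
```

Reading `log.entries()` afterwards returns both entries in order, with the second naming the first as the one it supersedes.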
Practically, every consequential action lands in that log with the same anatomy: what happened, who did it, when, and against which entity or case. A risk band changes; an alert is assigned; a request for information (RFI) is sent; a case is escalated; a manual override is requested and approved; a report moves from draft to submitted. Because the platform raises every domain event through a transactional outbox, with the event committed in the same transaction as the change that caused it, the log cannot drift out of step with reality. There is no window in which the action happened but the record of it did not, and no path by which a change lands without a corresponding event.
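The transactional outbox pattern can be sketched with Python's built-in `sqlite3` module. The table names, columns and the `fail` switch are assumptions made for illustration only; the point is simply that the state change and its event share one transaction, so a failure rolls back both.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, risk_band TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
    INSERT INTO customer (id, risk_band) VALUES (1001, 'LOW');
""")
conn.commit()


def change_risk_band(conn, customer_id, new_band, actor, fail=False):
    """Write the state change and its event in ONE transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE customer SET risk_band = ? WHERE id = ?",
                     (new_band, customer_id))
        if fail:  # illustrative switch: simulate a crash between the two writes
            raise RuntimeError("simulated failure between change and event")
        payload = json.dumps({"customer": customer_id, "band": new_band, "actor": actor})
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("risk_band_changed", payload))


change_risk_band(conn, 1001, "MEDIUM", "analyst.a")
try:
    change_risk_band(conn, 1001, "HIGH", "analyst.a", fail=True)
except RuntimeError:
    pass  # the failed attempt leaves neither a changed band nor an event

band = conn.execute("SELECT risk_band FROM customer WHERE id = 1001").fetchone()[0]
events = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
# band == "MEDIUM", events == 1
```

In a fuller design a separate relay would read the outbox table and publish each event to the audit log; that step is omitted here.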
This is what lets you answer the "who was responsible" question with a timestamp instead of a shrug. The sequence is the evidence. When an examiner asks why a customer's risk rating dropped in March, the log shows the change, the user, the moment, and — through evidence-first design — the reasoning attached to it.
Evidence-first: the proof is one click away
An audit trail tells you that a decision was made and by whom. Evidence-first design tells you why, without a scavenger hunt.
The principle is simple to state and demanding to build: every consequential decision shows its evidence one click away. The system does not merely record an outcome and trust you to remember the basis for it. It keeps the basis attached to the outcome, so that opening the decision opens the case for it.
Consider what that means at each surface:
- A risk rating does not just display a band. It shows the six factors behind it — country, industry or business type, product, delivery channel, customer behaviour, and PEP or sanctions exposure — with the weights and the scoring that produced the band. If the rating was changed by hand, the override and its justification sit beside the automated result, not in a separate file.
- A screening hit does not just say "matched." It surfaces the match score and the top-three reasons the engine flagged it, so a reviewer can see whether the system caught a genuine sanctions match or a coincidence of common names. The false-positive disposition is recorded against that evidence.
- A closed alert carries the analyst's reasoning, the documents gathered during any request-for-information cycle, and the linked-case context (the case decisions you must be able to evidence), so the close can be defended on the same screen where it was made.
- An AI-assisted decision carries its own provenance: an "AI · model · version" label, the SHAP top-three explanation, a confidence percentage, and the human Accept, Modify or Reject control that a person actually used. The decision is logged with the model version and an inputs hash, so the exact scoring can be reconstructed later. Evidencing model-assisted judgement is a discipline in itself, which we treat separately in evidencing AI-assisted decisions.
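As an illustration of the risk-rating surface in the first bullet, the sketch below keeps each factor's contribution next to the resulting band. The six factor names come from the text; the weights, the 0 to 100 scoring scale and the band cut-offs are hypothetical values chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass

# Hypothetical weights (summing to 1.0); a real programme sets its own.
WEIGHTS = {
    "country": 0.25,
    "industry": 0.15,
    "product": 0.15,
    "delivery_channel": 0.10,
    "customer_behaviour": 0.20,
    "pep_sanctions": 0.15,
}


@dataclass(frozen=True)
class RiskRating:
    band: str
    score: float
    contributions: dict  # factor -> weighted contribution: the evidence behind the band


def rate(factor_scores: dict) -> RiskRating:
    """factor_scores maps each of the six factors to a score from 0 (low) to 100 (high)."""
    if set(factor_scores) != set(WEIGHTS):
        raise ValueError("all six factors must be scored")
    contributions = {f: WEIGHTS[f] * factor_scores[f] for f in WEIGHTS}
    score = round(sum(contributions.values()), 2)
    # Hypothetical cut-offs: 70 and above is HIGH, 40 and above is MEDIUM.
    band = "HIGH" if score >= 70 else "MEDIUM" if score >= 40 else "LOW"
    return RiskRating(band, score, contributions)


rating = rate({
    "country": 80, "industry": 40, "product": 20,
    "delivery_channel": 60, "customer_behaviour": 40, "pep_sanctions": 0,
})
# rating.score == 43.0, rating.band == "MEDIUM"
```

Because `contributions` travels with the band, a reviewer can see that country risk drove most of the score without leaving the rating.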
The pattern is the same everywhere: the consequence and its evidence travel together. You are never reconstructing the rationale from memory or inference, because the rationale never left the decision.
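For the AI-assisted case, one way to keep the scoring reconstructable is to store a hash of the exact inputs alongside the model version and the human disposition. The sketch below assumes SHA-256 over canonical JSON; the function names, field names and sample features are hypothetical, and the platform's actual hashing scheme is not described in this article.

```python
import hashlib
import json


def inputs_hash(features: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not change the hash."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_ai_decision(model, version, features, confidence, top_reasons,
                       disposition, reviewer):
    if disposition not in {"Accept", "Modify", "Reject"}:
        raise ValueError("disposition must be Accept, Modify or Reject")
    return {
        "label": f"AI · {model} · {version}",
        "inputs_hash": inputs_hash(features),
        "confidence_pct": confidence,
        "top_reasons": list(top_reasons)[:3],  # keep the top three only
        "disposition": disposition,            # what the human actually chose
        "reviewer": reviewer,
    }


features = {"txn_count_30d": 42, "corridor": "KE-AE", "avg_amount": 1850.0}
decision = record_ai_decision("alert-scorer", "1.4.2", features, 87,
                              ["corridor", "txn_count_30d", "avg_amount"],
                              "Accept", "analyst.b")

# Re-hashing the same inputs later, in any key order, reproduces the stored hash.
same = inputs_hash({"avg_amount": 1850.0, "corridor": "KE-AE", "txn_count_30d": 42})
```

If `same` matches `decision["inputs_hash"]`, the reviewer knows the features being replayed are the ones the model saw at decision time.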
Four-eyes approval on anything consequential
The third choice is about who is allowed to act alone. The answer, for consequential actions, is no one.
Four-eyes approval means a second authorised person must review and approve before a consequential action takes effect. It is the control that prevents a single individual — through error, pressure, or intent — from quietly making the decision that matters. In a Creodata programme it is applied where the stakes justify it rather than everywhere, because friction in the wrong place trains people to route around the control.
Where it bites:
- Manual risk overrides. An analyst can propose to override a customer's calculated risk band, but the override does not take effect until a second person approves it. The proposal, the justification, and the approver all enter the audit log.
- Model activation. Bringing an AI model into live scoring is a four-eyes action — one person cannot promote a model on their own authority — and the platform keeps a kill switch and rollback for when something looks wrong after the fact.
- Consequential case and reporting actions. Escalations, exceptions, and the move to submit a regulatory report follow the same draft, review, approve, submit discipline, so the decision to file or not file is never one person's unrecorded call.
The audit value of four-eyes is not only that two people agreed. It is that the separation is provable. When an examiner suspects that the approver and the maker were the same person wearing two hats, the log settles it: distinct identities, distinct timestamps, recorded in sequence. Combined with role-based access control, four-eyes turns "we have segregation of duties" from an assertion in a policy into a fact in the record.
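A maker-checker rule of this kind can be sketched in a few lines. The class and field names below are hypothetical, and the check compares user identities only; role checks would sit in the access-control layer, which is out of scope here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class FourEyesError(Exception):
    pass


@dataclass
class OverrideRequest:
    customer: str
    proposed_band: str
    justification: str
    maker: str
    requested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    approver: Optional[str] = None
    approved_at: Optional[datetime] = None

    @property
    def effective(self) -> bool:
        # The override takes effect only once a second person has approved it.
        return self.approver is not None

    def approve(self, approver: str) -> None:
        if self.effective:
            raise FourEyesError("request already approved")
        if approver == self.maker:
            raise FourEyesError("maker cannot approve their own request")
        self.approver = approver
        self.approved_at = datetime.now(timezone.utc)


req = OverrideRequest("customer:1001", "MEDIUM",
                      "Salary account, verified employer", "analyst.a")
try:
    req.approve("analyst.a")   # same identity as the maker: refused
    self_approved = True
except FourEyesError:
    self_approved = False

req.approve("mlro.b")          # distinct identity: the override becomes effective
```

The record that results holds two identities and two timestamps, which is the provable separation an examiner looks for.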
Reconstructing a single decision
Put the three together and the examiner's hardest question becomes a routine retrieval. Suppose the question is: why was this high-value customer rated medium-risk, and why was their alert closed without a report?
You open the customer. The risk rating shows its six factors and the band they produced; one of them was overridden, so you see the original automated band, the analyst's stated justification, the second approver who signed it, and the timestamp on each — all from the append-only log. You open the alert. It carries the rule that fired, the transactions involved, the RFI documents the analyst gathered, the screening evidence with its top-three reasons, and the close rationale. If a model contributed, its version, confidence, and SHAP explanation are stamped on the decision, with the human disposition recorded beside it.
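Seen as data, that retrieval is a filter and a sort over the log. The rows below are invented sample data, and the tuple layout and `reconstruct` helper are assumptions for illustration rather than the platform's query interface.

```python
from datetime import datetime

# Hypothetical rows: (recorded_at, actor, action, entity, detail). Deliberately unsorted.
LOG = [
    (datetime(2024, 3, 12, 16, 5), "analyst.c", "alert_closed", "alert:77",
     "no report; rationale and RFI documents attached"),
    (datetime(2024, 3, 4, 9, 15), "system", "risk_band_calculated", "customer:1001",
     "band=HIGH"),
    (datetime(2024, 3, 9, 11, 0), "analyst.a", "alert_assigned", "alert:80",
     "unrelated case"),
    (datetime(2024, 3, 4, 14, 40), "mlro.b", "override_approved", "customer:1001",
     "band=MEDIUM"),
    (datetime(2024, 3, 4, 10, 2), "analyst.a", "override_requested", "customer:1001",
     "proposed=MEDIUM"),
    (datetime(2024, 3, 11, 8, 30), "system", "alert_raised", "alert:77",
     "rule fired on linked transactions"),
]


def reconstruct(log, entities):
    """Return every entry touching the given entities, oldest first."""
    wanted = set(entities)
    return sorted((row for row in log if row[3] in wanted), key=lambda row: row[0])


timeline = reconstruct(LOG, ["customer:1001", "alert:77"])
for at, actor, action, entity, detail in timeline:
    print(f"{at:%Y-%m-%d %H:%M}  {actor:<10} {action:<22} {entity:<14} {detail}")
```

The printed timeline runs from the calculated band, through the override request and approval, to the alert being raised and closed, and it leaves out the unrelated alert.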
At no point did you assemble this from three systems and a shared inbox. The evidence was attached to each decision; the sequence was preserved in the log; the separation of duties was provable. That is what "reconstructable" means in practice, and it is the same retrieval whether the question comes during a quiet internal review or a live examination.
Preparing for inspection without a fire drill
The institutions that find examinations painful are usually the ones that treat audit-readiness as a project that begins when the examiner schedules a visit. The ones that find them manageable have made readiness a steady-state property and do three things ahead of time.
First, they sample their own files the way an assessor would. They pick a handful of cases at random each quarter, attempt to reconstruct each one cold, and note anywhere the evidence was thin or the trail had a gap. The exercise finds the weaknesses while there is still time to fix the underlying practice, not the record.
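The random draw itself is easy to make repeatable. The sketch below is one possible approach, with a hypothetical helper name and made-up case IDs; recording the seed with the review is a suggestion, not a requirement stated anywhere in this article.

```python
import random


def quarterly_sample(case_ids, k, seed):
    """Draw up to k cases without replacement; the same seed repeats the same draw."""
    rng = random.Random(seed)
    population = sorted(case_ids)  # fixed order so the draw depends only on the seed
    return rng.sample(population, min(k, len(population)))


cases = [f"case:{n}" for n in range(1, 201)]   # hypothetical case identifiers
sample = quarterly_sample(cases, k=5, seed=20240401)
repeat = quarterly_sample(cases, k=5, seed=20240401)
```

Because `repeat` equals `sample`, anyone reviewing the exercise later can confirm the cases were drawn at random and not hand-picked.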
Second, they rehearse the walk-through. The reconstruction above should be muscle memory for your MLRO and senior analysts. If retrieving the evidence for a single decision takes a meeting, the fault usually lies not with the system but with a team that has not practised using it.
Third, they treat audit as a pillar of the programme, not a clean-up after it. Risk assessment, screening, monitoring, case management and reporting each generate the records that make the others defensible, which is why we frame audit as a programme pillar rather than a reporting afterthought. When the trail is a by-product of doing the work properly, readiness is not extra effort — it is what is left behind.
Frequently asked questions
How is an append-only audit log different from an ordinary system log?
An ordinary log can be edited or deleted, so it only proves what the record currently says, not what happened. An append-only log is written once and never changed; corrections are added as new superseding entries with the originals still visible. Because the entries are raised through a transactional outbox alongside the changes that caused them, the log cannot drift out of step with what the system actually did — which is what makes it trustworthy as evidence.
Doesn't four-eyes approval slow everything down?
Only where it should. Four-eyes is applied to consequential actions — manual risk overrides, model activation, escalations, the decision to submit a report — not to routine work. Applying it selectively keeps the control credible; if every click needed a second signature, people would learn to treat the approval as a rubber stamp, which defeats the purpose. The aim is a provable second pair of eyes on the decisions that carry risk.
What does "evidence one click away" mean for an examiner?
It means a reviewer never has to assemble the basis for a decision from separate systems. Opening a risk rating shows the factors and any override; opening an alert shows the rule, the documents, and the screening reasons; opening an AI-assisted decision shows the model version, confidence, and SHAP explanation. The evidence is attached to the outcome, so reconstructing why a decision was made is a retrieval, not an investigation.
How does this help with an ESAAMLG mutual evaluation specifically?
Mutual evaluations test effectiveness by sampling real files and asking you to demonstrate reasoned, reconstructable decisions. A programme with an immutable trail, evidence attached to every consequential decision, and provable separation of duties makes that case sample after sample. The detail of the exercise is covered in preparing for an ESAAMLG mutual evaluation, and the reporting hand-off it depends on lives in the goAML Reporting Platform.
If your honest answer to "walk me through this decision" is currently a search across systems, that is a control gap worth closing before an examiner finds it for you. See how the Creodata AML Platform builds the audit trail, evidence-first surfaces and four-eyes approvals into everyday work — or, if you would rather start with the programme around them, our financial crime compliance advisory can help. To see a single decision reconstructed end to end, book a demo.
