
How to run an audit drill on your AML screening system

AMLO-FINMA Article 22 requires that documentation allow a knowledgeable third party to reconstruct individual transactions. The best way to find out whether yours actually does is to run a drill.

Antoine Bedaton
04 Feb 2026 · 7 min read

Part of our complete guide to negative news screening for Swiss banks. This post is the deep dive on running an audit drill on your screening system; the guide covers the end-to-end picture.

AMLO-FINMA Art. 22 requires Swiss financial intermediaries to keep documents and supporting evidence so that individual transactions can be reconstructed by a knowledgeable third party. The retention period under AMLA Art. 7 is ten years. Most AML stacks claim to satisfy these requirements. Most have never been tested under realistic adversarial conditions. What those requirements actually say is covered in our breakdown of Swiss AML evidence rules; this post is about how to verify that your system measures up.

The cheapest insurance against an examination going badly is to run a drill against your own system before someone else does. This post is a playbook for running one.

What a drill is, and what it is not

An audit drill is a controlled exercise. A randomly selected past investigation gets pulled out of the system, and a person who did not work on it has to reconstruct, in writing, why the original decision was made. The output is a timeline an external auditor could follow without explanation.

A drill is not:

  • A demo. Your vendor will not run it. The point is to see what your team can produce from your data.
  • A spot check by the original analyst. They remember the context. Reconstruction has to come from someone who does not.
  • A walkthrough of features. Whether your system can show evidence is not the question. Whether someone unfamiliar with the case can turn it into a defensible written record is.

Why this matters

A reconstruction obligation is meaningfully different from a record-keeping obligation. Record-keeping says "store the data". Reconstruction says "prove a third party can use it". The two are not the same, and most deployments quietly fail the second test even when they pass the first.

FINMA's published enforcement actions over the last several years, including Credit Suisse over FIFA, Petrobras, and PDVSA (2018), the Julius Baer Latin America matter (2020), and Credit Suisse over Mozambique (2021), did not turn primarily on missing files. They turned on documentation and reporting that could not, when produced, support the bank's earlier decisions. The artifacts existed; the trail did not lead anywhere defensible. A drill is how you find out whether yours does.

The protocol

Five rules that make the drill produce useful findings:

  1. Pick the auditor before you pick the cases. The selector should not be the team being tested. Internal audit, an external auditor, or a board member is appropriate. The compliance team running the drill is not.
  2. Random selection from a defined pool. "All COMPLETED investigations from year X". Not a curated shortlist. The selector's job is to sample at random, not to hunt for difficult cases.
  3. Reconstruction by an analyst who did not work on the case. This is the load-bearing rule. If the original analyst does the reconstruction, the system has not really been tested. Their memory has.
  4. Time-box the exercise. Examiners ask for reconstruction in days, not weeks. A 30-minute time box per investigation surfaces the workflows that need streamlining.
  5. Demand a written output. A timeline an external auditor could read cold. If it cannot be produced in 30 minutes from a clean read of the system, the system is not as ready as it looked.
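
Rule 2 can be made checkable with a seeded draw. The sketch below is a minimal illustration in Python, not a prescribed method: the case IDs, the seed label, and the function names `draw_cases` and `pool_fingerprint` are all hypothetical, and it assumes the pool has already been exported as a list of investigation IDs.

```python
import hashlib
import random


def draw_cases(pool_ids, k, seed):
    """Draw k case IDs uniformly at random from the defined pool.

    Sorting first makes the draw independent of the order the pool was
    exported in, so the same pool and seed always yield the same cases.
    """
    ordered = sorted(pool_ids)
    if k > len(ordered):
        raise ValueError("pool is smaller than the requested sample")
    rng = random.Random(seed)  # seeded generator, separate from global state
    return rng.sample(ordered, k)


def pool_fingerprint(pool_ids):
    """SHA-256 over the sorted pool, so the pool itself can be verified later."""
    joined = "\n".join(sorted(pool_ids)).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


# Hypothetical pool: every COMPLETED investigation from the chosen year.
pool = [f"INV-2024-{n:04d}" for n in range(1, 201)]
selected = draw_cases(pool, k=5, seed="2026-Q1-drill")

print(selected)
print(pool_fingerprint(pool))
```

Recording the seed and the pool fingerprint in the drill report lets a third party rerun the draw and confirm that nobody curated the shortlist. For that to mean anything, the selector has to fix the seed before seeing the pool.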

What 'reconstruct' actually means

Under AMLO-FINMA Art. 22, the third party performing reconstruction is not just any external observer. The ordinance specifically refers to FINMA, FINMA-engaged auditors, FINMA-appointed investigators, and audit firms approved by the audit oversight authority. The reconstruction has to be understandable to that audience. Internal jargon, half-explained acronyms, and "ask the analyst" notes do not meet the bar.

What a drill typically surfaces

The findings vary by institution, but the patterns are consistent:

  • Evidence that no longer renders. A captured news article that relied on a third-party CDN, a registry record stored as a link rather than a snapshot, a PDF that referenced an internal SharePoint permission since changed. (Hash-chained capture is what stops this happening.)
  • Free-text fields that lost meaning. Escalation reasons like "talk to D about this" are useless once D has left the firm. Structured fields with named individuals resolved at write-time hold up; free text often does not.
  • Schema changes that broke historical context. A custom-fields schema that has evolved twice means investigations from year X used fields that no longer exist or have been renamed. The system either preserves the old schema for old records or quietly hides those fields.
  • Identity resolution gaps. "Reviewer X" turns out to be a service account, or a person who has since changed roles, or someone whose authority was time-limited and is now expired. The audit trail needs to attribute decisions to the person at the time, not the current identity.
  • Annotation drift. Notes that were structured at capture but got exported to flat formats during a system migration. The notes still exist. The metadata around them is gone.

An audit drill mostly tells you what your system makes hard to reconstruct. The findings are what to fix before someone with subpoena power asks the same questions.
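
The first finding above mentions hash-chained capture. As a rough illustration of the general idea (not a description of any particular product's implementation), the Python sketch below stores the snapshot bytes themselves and links each entry to the hash of the one before it. The field names, the `GENESIS` value, and the example URLs are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, captured_at, source_url, content):
    """Hash one evidence entry together with the hash of the entry before it."""
    header = json.dumps(
        {"prev": prev_hash, "captured_at": captured_at, "source_url": source_url},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(header + b"\n" + content).hexdigest()


def append_evidence(chain, captured_at, source_url, content):
    """Append a snapshot (the bytes, not a link) to the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "captured_at": captured_at,
        "source_url": source_url,
        "content": content,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, captured_at, source_url, content),
    })


def verify_chain(chain):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for i, e in enumerate(chain):
        expected = entry_hash(prev_hash, e["captured_at"], e["source_url"], e["content"])
        if e["prev"] != prev_hash or e["hash"] != expected:
            return i
        prev_hash = e["hash"]
    return None


chain = []
append_evidence(chain, "2024-03-05T09:12:00Z", "https://example.com/article", b"<html>article</html>")
append_evidence(chain, "2024-03-05T09:15:00Z", "https://example.com/registry", b"registry extract")

print(verify_chain(chain))            # None: the chain is intact
chain[0]["content"] = b"edited later"
print(verify_chain(chain))            # 0: the first entry no longer matches its hash
```

The two properties do different jobs. Storing the content keeps the evidence renderable after the source has moved or vanished; the chain makes any later alteration or deletion detectable by someone who was not there at capture.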

Categories of drill, in increasing difficulty

The same protocol applies to several scenarios. The easy ones first, the hard ones last:

  • Year-old completed investigation. Schema is current. Evidence is fresh. Reviewers are still at the firm. Should be near-instant.
  • Three-year-old investigation. Schema has likely changed once. Source URLs may have rotted. Some reviewers may have moved teams. This tests whether snapshots, not links, were captured.
  • Investigation predating a system migration. This is where most institutions discover that the migration lost more than expected.
  • Escalated or rejected investigation. Tests whether the disagreement is reconstructable, not just the final decision. The hardest category for most systems.
  • Investigation involving a reviewer who has left the firm. Tests identity resolution and whether the audit trail can stand without that person available to fill in gaps.

A useful first drill covers one investigation from each category. The findings from the harder cases tend to be the actionable ones.

Frequency

Quarterly works for most institutions, with a different selector each time. Budget half a day of effort per quarter; the alternative is finding out during an actual examination that your records cannot be reconstructed.

A drill is portable

The protocol does not depend on which screening system you use. If you are mid-procurement and trying to compare vendors, a drill against each vendor's demo environment (using the same fictional case) is one of the more honest evaluation methods available. A vendor that will not support drill access during procurement should be treated as a yellow flag.

What to do with the findings

The most valuable outcome of a drill is a punch list of specific gaps, each with an owner and a deadline. A few rules of thumb that hold up:

  • Capture is more urgent than retrieval. Lost evidence cannot be reconstructed; messy evidence can be indexed later.
  • Structured fields beat free text for anything an auditor will care about. Reasons, escalations, named individuals, decision codes: structure these at capture, not at export.
  • Identity must be resolved at write-time, not read-time. The question "who was this person at the moment they made this decision?" has to be answerable by reading the audit log, not by querying the current user directory.
  • Schema versioning matters. Investigations from prior years should remain comprehensible. Either preserve the old schema (read-only) or migrate forward with explicit backfill.
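
The write-time identity rule is easier to see in code. The following is a minimal Python sketch under stated assumptions: the `directory` dictionary stands in for whatever user directory the institution runs, and the class names, field names, and decision code are all hypothetical. The point it illustrates is that the decision record copies the reviewer's name, role, and authority at the moment of the decision, and carries the schema version it was written under.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReviewerSnapshot:
    user_id: str
    full_name: str
    role: str
    authority_valid_until: date


@dataclass(frozen=True)
class DecisionRecord:
    investigation_id: str
    decision_code: str
    decided_on: date
    schema_version: int
    reviewer: ReviewerSnapshot  # copied values, not a reference to the directory


def record_decision(directory, investigation_id, decision_code,
                    user_id, decided_on, schema_version):
    """Copy the reviewer's identity out of the directory at write-time."""
    person = directory[user_id]
    if person["authority_valid_until"] < decided_on:
        raise PermissionError(f"{user_id} had no authority on {decided_on}")
    snapshot = ReviewerSnapshot(
        user_id=user_id,
        full_name=person["full_name"],
        role=person["role"],
        authority_valid_until=person["authority_valid_until"],
    )
    return DecisionRecord(investigation_id, decision_code, decided_on,
                          schema_version, snapshot)


# Hypothetical directory entry and decision.
directory = {
    "u-017": {
        "full_name": "D. Example",
        "role": "Senior Analyst",
        "authority_valid_until": date(2024, 12, 31),
    }
}
record = record_decision(directory, "INV-2024-0042", "CLEARED",
                         "u-017", date(2024, 3, 5), schema_version=2)

# The directory changes later; the record does not.
directory["u-017"]["role"] = "Relationship Manager"
print(record.reviewer.role)  # Senior Analyst
```

A record shaped like this can still be read after the reviewer has changed roles or left, because nothing in it depends on a lookup against the current directory.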

If you would like the protocol templates we use internally, including the selector instructions and the written-output template, get in touch. The exercise is valuable independent of which system you run it against. (NNSFlow is what we built when we ran it ourselves.)

#audit #playbook #evidence #AMLO-FINMA