The problem everyone just noticed

There is a kind of work where being wrong is not a quirk. It is a liability. Tax. Law. Trade finance. Banking. Healthcare. Securities. In these domains a confident wrong answer costs someone money, or their license, or their freedom, or their health. The cost is not embarrassment. It is real.

These are the domains the industry now wants to point AI at. And it has finally noticed the obstacle. A probabilistic system cannot serve a domain that runs on certainty. "The model was 95% confident" is not a sentence you can say to a regulator, an auditor, or a court. A compliance system that is right 95% of the time is a liability machine the other 5%.

The field's instinct has been to make the model better. Bigger, more grounded, better prompted. That instinct does not solve the problem, because the model is the part that lies, and you do not fix lying by asking more nicely. You fix it by putting something underneath the model that does not lie, and letting the model do the one thing it is good at, which is reading and writing language.

That something is rules. Codified properly. Deterministic where the regulation is deterministic, and flagged for human judgment where it is not. The rule produces the answer. The model only makes it readable. The rule is what survives audit. The model is what makes it usable.

None of this is new in computer science. Formal methods, deterministic verification, expert systems: the lineage runs back decades. What is new is the urgency, and a realization that the bottleneck was never the engine. It was always the rules. Someone has to write them down.

This paper is about what "written down properly" means, and about a claim that sounds too strong until you have seen the mechanism work. The same engine, unchanged, serves every regulated domain.

What a rule looks like when you build it right

Most regulatory knowledge lives in one of three forms, and all three fail a machine.

It lives in PDFs. Auditable, but not executable. A human reads and interprets it every time.

It lives in content databases. Queryable, but not executable. You get the text of the regulation and still have to turn it into a decision yourself.

It lives in hand-coded business logic buried inside a product. Executable, but not auditable. The code and the regulation drifted apart years ago and nobody can prove they still agree.

No form is executable, auditable, and queryable at once. That is the gap.

A rule built right is a typed object that pulls apart four things people usually tangle together. What the rule says, meaning its conditions and outcomes. What the rule rests on, meaning the source publication, the article, the effective date, the amendment history. How the rule runs, meaning whether it is deterministic or needs a language model for a genuinely semantic judgment, and whether it is ready for production. And how far you should trust it, meaning whether a human reviewed it and what evidence stands behind it.

Here is the shape, drawn entirely from public law. The GDPR's 72-hour breach-notification window:

{
  "rule_id": "DP-BREACH-EU-001",
  "title": "Personal-data breach not notified within 72 hours",
  "jurisdiction": "eu",
  "source": "GDPR Article 33",
  "severity": "block",
  "expected_outcome": {
    "action": "review",
    "message": "A notifiable breach was not reported to the supervisory authority within 72 hours of awareness. Document the delay justification or notify immediately."
  },
  "conditions": [
    {
      "type": "breach_notification_window_check",
      "regime": "eu_gdpr",
      "notify_party": "authority",
      "awareness_path": "breach.aware_at",
      "notified_path": "breach.authority_notified_at"
    }
  ],
  "deterministic": true,
  "requires_llm": false,
  "source_authority_tier": "authoritative_source",
  "validation_status": "expert_reviewed"
}

Look at what is kept apart. The logic (a time-window check) sits independent of the provenance (GDPR Article 33), and both sit independent of the trust signal (expert_reviewed). Every field is queryable on its own. Every decision the rule produces traces back to the article. Every claim of certainty is held to the evidence behind it. A rule no human has reviewed is marked that way, and is never allowed to read as a confident verdict.

Do this once and you have a rule. Do it for tens of thousands of rules across a domain and you have something else. A substrate. Probabilistic systems cannot reach it, because they cannot guarantee the answer. Content libraries cannot execute it, because they only hold the text. It gets built one rule at a time. Boring, on purpose.
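To make the time-window condition in the rule above concrete, here is a minimal sketch of how such a check might be evaluated. It is an illustration, not the engine's implementation. The function names, the WINDOWS table, and the three status strings are assumptions. So is the idea that timestamps arrive as parsed, timezone-aware datetimes. Only the 72-hour figure and the field names come from the rule itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

# Assumption: a per-regime window table. Only the 72-hour value comes from the rule above.
WINDOWS = {"eu_gdpr": timedelta(hours=72)}

# The condition block of DP-BREACH-EU-001, copied from the rule.
CONDITION = {
    "type": "breach_notification_window_check",
    "regime": "eu_gdpr",
    "notify_party": "authority",
    "awareness_path": "breach.aware_at",
    "notified_path": "breach.authority_notified_at",
}


def resolve(payload: dict, dotted_path: str) -> Optional[Any]:
    """Walk a dotted path such as 'breach.aware_at'; return None if any step is missing."""
    node: Any = payload
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_breach_window(condition: dict, payload: dict, now: datetime) -> str:
    """Return 'triggered', 'not_triggered', or 'could_not_verify' (hypothetical labels)."""
    window = WINDOWS.get(condition["regime"])
    aware_at = resolve(payload, condition["awareness_path"])
    notified_at = resolve(payload, condition["notified_path"])
    if window is None or aware_at is None:
        # Unknown regime or no awareness timestamp: nothing can be proven either way.
        return "could_not_verify"
    deadline = aware_at + window
    # If no notification is recorded yet, measure the clock against the current time.
    effective = notified_at if notified_at is not None else now
    return "triggered" if effective > deadline else "not_triggered"
```

Under these assumptions, a breach notified 80 hours after awareness triggers the rule, one notified at 48 hours does not, and a payload with no awareness timestamp comes back as could_not_verify.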

The claim: every domain takes the same shape

This is the part that is easy to say and hard to believe until you have built it several times over.

Every regulated domain, however different its subject, comes down to the same four moves.

First, a normalization step. Messy real-world input gets resolved to canonical entities before any rule runs. A party name resolves to a sanctioned-entity record. A drug brand resolves to its generic. A stock ticker resolves to an instrument. A contract resolves to typed clauses. A data flow resolves to a pair of jurisdictions. The subject changes. The move does not. Turn the messy thing into the canonical thing.

Second, a split between rules and reference data. Bulk facts that change on someone else's schedule (sanctions lists, drug interactions, tax rates, adequacy decisions between countries) live in maintained reference data that gets refreshed when the world changes. The high-judgment layer, the obligations and thresholds and prohibitions, lives as version-controlled rules a domain expert signs off on. The two stay separate because they go stale at different speeds and belong to different people.

Third, mostly-reused logic plus a few new operators. Each domain reuses almost all of the condition types already built and adds only three or four genuinely new evaluators. Never a new engine. A whole regulated industry, delivered by recombining operators that already exist plus a small handful of new ones.

Fourth, a fail-closed advisory endpoint. Each domain exposes an interface that says "could not verify" out loud, and never lets silence pass for "compliant." In high-stakes work, a false clear is the catastrophic failure. An unlawful transfer waved through. A sanctioned party missed. So when the system cannot prove an answer, it says so plainly instead of going quiet and being mistaken for approval.

A domain that does not fit this shape is not a rule-engine domain. The surprise, after building across trade finance, banking, sanctions, healthcare, securities, legal, data protection, and AI governance, is that they all fit. The shape was general from the start.
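The first of the four moves, normalization, can be sketched in a few lines. This is an illustration under stated assumptions. The entity IDs, the party names, and the suffix list are hypothetical, and a real resolver would draw on maintained reference data instead of an in-memory dictionary.

```python
import re
from typing import Optional

# Hypothetical reference rows: canonical name to canonical entity ID.
CANONICAL_PARTIES = {
    "acme trading": "ENT-0001",
    "globex shipping": "ENT-0002",
}

# Hypothetical list of legal-form suffixes to ignore when matching.
_SUFFIXES = re.compile(r"\b(ltd|limited|llc|inc|co|gmbh)\b")


def normalize_name(raw: str) -> str:
    """Lowercase, strip punctuation and legal-form suffixes, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", raw.lower())
    text = _SUFFIXES.sub(" ", text)
    return " ".join(text.split())


def resolve_party(raw: str) -> Optional[str]:
    """Return the canonical entity ID, or None when the name cannot be resolved."""
    return CANONICAL_PARTIES.get(normalize_name(raw))
```

Here resolve_party("Acme Trading Co., Ltd.") and resolve_party("ACME  TRADING LTD") both land on the same canonical record. An unknown name returns None, which the advisory layer should surface as "could not verify" and never read as a clean result.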

Why this compounds: the primitive that travels

The deepest reason one engine can serve every domain is not tidiness. It is that each domain hands the next one a reusable primitive, and the next domain gets cheaper because of it.

Take the deadline check. It gets built first for a trade-finance presentation period, the window in which documents must be presented under a letter of credit. That same logic, unchanged, becomes a securities disclosure window, the days within which a material holding must be filed. The same logic again becomes a data-protection breach clock, the 72 hours within which a regulator must be told. The same logic again becomes a tax filing deadline. One primitive, built once, four domains served.
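The reuse described here can be pictured as one small class configured four ways. This is a sketch, not the engine's code. The class name, the registry keys, and every window except the 72-hour breach clock are placeholders chosen for illustration, not statements of what any regulation requires.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class DeadlineCheck:
    """One primitive: did the required action happen within `window` of `start`?"""
    label: str
    window: timedelta

    def missed(self, start: datetime, done_at: Optional[datetime], now: datetime) -> bool:
        """True if the action came after the deadline, or is still pending past it."""
        deadline = start + self.window
        effective = done_at if done_at is not None else now
        return effective > deadline


# Same logic, different configuration. All windows except the 72 hours are placeholders.
CHECKS = {
    "trade_finance_presentation": DeadlineCheck("presentation period", timedelta(days=21)),
    "securities_disclosure": DeadlineCheck("holding disclosure", timedelta(days=5)),
    "data_protection_breach": DeadlineCheck("breach notification", timedelta(hours=72)),
    "tax_filing": DeadlineCheck("filing deadline", timedelta(days=90)),
}
```

Adding a fifth domain under this sketch means adding a row to the registry, not writing a new evaluator.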

Take the cross-jurisdiction matrix. Built first for data protection. Given a data flow from country A to country B, is the transfer lawful, and under what mechanism? That same mechanism, pointed at a different matrix, becomes tax treaty withholding. Given a payment from country A to a resident of country B, what is the treaty-reduced rate? The mechanism is the same. Only the matrix changes.
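A minimal sketch of that shared mechanism follows. It assumes a matrix keyed by (origin, destination) pairs. The country codes AA, BB, and CC are fictitious, and the mechanisms and rates are hypothetical values, not real adequacy decisions or treaty rates.

```python
from decimal import Decimal
from typing import Dict, Optional, Tuple, TypeVar

T = TypeVar("T")


def lookup(matrix: Dict[Tuple[str, str], T], origin: str, destination: str) -> Optional[T]:
    """Return the cell for (origin, destination), or None when the pair is not covered."""
    return matrix.get((origin.upper(), destination.upper()))


# Data protection: which transfer mechanism applies (hypothetical rows).
TRANSFER_MECHANISMS = {
    ("AA", "BB"): "adequacy_decision",
    ("AA", "CC"): "standard_contractual_clauses",
}

# Tax: treaty-reduced withholding rate (hypothetical rows).
WITHHOLDING_RATES = {
    ("AA", "BB"): Decimal("0.10"),
}
```

The function never changes. A missing pair returns None in both matrices, which leaves the caller to report that the answer could not be verified.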

Take the semantic comparison built to fuzzy-match trade-document fields. Point it at contract clauses and it becomes the engine that scores whether a clause deviates from a standard position.

This is the thing that never shows up in a feature list and matters more than any feature. The domains teach each other. Choosing the next domain is not about market size. It is about which domain hands forward the primitive that unlocks the most of what comes after. Build them in the right order and the engine gets cheaper to extend with every domain, not more expensive. The last domain, governance of AI systems themselves, contributes the primitive that makes the engine truly one engine: a rule in one domain that calls the rules of another, with both verdicts assembled into a single report and their provenance intact.

Where the model belongs, and where it does not

None of this is anti-AI. The architecture leans on a language model. It just keeps the model to the job it is actually good at.

The model reads. It takes a messy document, an unstructured contract, a question in plain language, and turns it into the typed, canonical input the rules need. The model also writes. It turns a rule's verdict into a clear explanation a person can act on.

What the model does not do is decide. The decision belongs to the rule. Where a judgment is genuinely semantic, like whether a contract clause deviates materially from the standard position, the model is called explicitly, marked as such, and held to the same discipline as everything else. A low-confidence semantic judgment is a preview, never a verdict. It goes to a human. It does not quietly become a decision.

That line is the whole game. Probabilistic where the world is genuinely ambiguous. Deterministic where the regulation is precise. Honest, at the level of each rule, about which is which.
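One way to picture that discipline in code is a gate on every model-made judgment. This is a sketch under assumptions: the threshold value, the field names, and the rule that a result above the threshold may stand as a model-marked verdict are illustrative choices, not the engine's actual policy.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real value would be set and reviewed per rule.
REVIEW_THRESHOLD = 0.85


@dataclass(frozen=True)
class SemanticJudgment:
    label: str         # for example "material_deviation"
    confidence: float  # model-reported score between 0.0 and 1.0


def gate(judgment: SemanticJudgment, threshold: float = REVIEW_THRESHOLD) -> dict:
    """Mark every model judgment as model-made; route low confidence to a human."""
    needs_human = judgment.confidence < threshold
    return {
        "label": judgment.label,
        "decided_by": "llm",  # always marked, never disguised as a rule outcome
        "kind": "preview" if needs_human else "verdict",
        "route_to_human": needs_human,
    }
```

A judgment scored at 0.62 comes back as a preview routed to a human. It does not become a decision on its own.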

Fail closed, or do not bother

One commitment holds the whole approach up, and it is the one most easily skipped. The system has to fail closed.

In low-stakes AI, a system that does not know can guess, and a wrong guess costs little. In these domains the wrong guess is the entire risk. So the discipline flips. When the system cannot verify, whether from an unknown jurisdiction, a missing reference row, a semantic confidence below threshold, or a data source that is briefly down, it returns "could not verify," visibly, and never an implied "fine."

This sounds obvious and gets violated constantly, because failing closed makes a system look less capable in a demo. It says "I don't know" more often. But in a domain where a false clear means a fine, a missed sanction, or a harmed patient, "I don't know" is the correct and responsible answer, and a system that cannot say it is not safe to deploy. Failing closed is the line between infrastructure a bank can run in production and a clever demo that cannot survive a lawsuit.
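A fail-closed aggregator can be sketched as follows. The Outcome names and the advise function are assumptions made for illustration. The point is the ordering: "clear" is returned only when every check ran and every check proved it.

```python
from enum import Enum
from typing import Callable, Iterable


class Outcome(Enum):
    CLEAR = "clear"
    FLAGGED = "flagged"
    COULD_NOT_VERIFY = "could_not_verify"


def advise(checks: Iterable[Callable[[], Outcome]]) -> Outcome:
    """Combine check results so that silence or failure never reads as 'clear'."""
    results = []
    for check in checks:
        try:
            results.append(check())
        except Exception:
            # A source that is down or a missing row is not evidence of compliance.
            results.append(Outcome.COULD_NOT_VERIFY)
    if not results:
        return Outcome.COULD_NOT_VERIFY  # nothing was checked, so nothing was proven
    if Outcome.FLAGGED in results:
        return Outcome.FLAGGED
    if Outcome.COULD_NOT_VERIFY in results:
        return Outcome.COULD_NOT_VERIFY
    return Outcome.CLEAR
```

In this sketch a single timed-out lookup among otherwise clean checks yields could_not_verify, and an empty list of checks does too.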

What this is, in one line

It is the rule layer. The deterministic substrate beneath the probabilistic model. One engine whose architecture was general from the start, extended domain by domain, each domain making the next one cheaper, every decision traceable to its source, every claim of certainty held to its evidence, and every uncertainty stated out loud instead of hidden.

The world's hardest regulated problems were never unsolvable. They were unwritten. This is the work of writing them down properly, so a machine can reason over them with certainty, and a human can always check its work.

Enso Intelligence builds deterministic rule infrastructure for regulated domains. The first domain, trade finance, runs in production. This paper describes the architecture, not the rule corpus. The encoded rules are the asset, and they stay where they belong.

Dhaka, June 2026.