Ed Cotton / Inverness Consulting
You are about to become the intelligence layer of a fintech company. You have access to extraordinary transaction data. Three decisions. Each one looks straightforward.
But data only shows what the system can see. What happens to your judgment when you trust it completely?
You process 47 million transactions a day. "Every transaction is a fact about someone's life," Dorsey said. Here is your dashboard.
All indicators trending positive. Expand lending eligibility. Increase product nudges to high-engagement segments. Projected revenue uplift: +$42M. Deploy within 72hrs.
The data is real. The trends are real. The confidence score is high. Something is missing — but you cannot see it yet.
GPV = Gross Payment Volume — the total value of transactions a merchant processes through their Square terminal each month. Velocity = how fast that figure is growing or shrinking.
GPV velocity within normal seasonal variance band. Zero default history. Seller tenure places Marcus in top cohort for loan performance. Recommending immediate offer: $45,000 at standard rate. Pre-approval window: 72hrs.
In plain terms: Marcus processes about $28,400 a month through his Square terminal. That figure has barely moved for eight months — the model reads that as stability. He has never missed a payment. He has been selling on Square longer than 97% of merchants. The algorithm rates him an excellent loan candidate.
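To see how little the model needs in order to reach that verdict, here is a minimal sketch of the velocity check in Python. The field names, the variance threshold, and the monthly figures are illustrative assumptions for this example, not Block's actual underwriting logic.

```python
# Illustrative sketch only: how a lending model might read "stable" GPV.
# The numbers and the threshold are assumptions, not Block's underwriting logic.

monthly_gpv = [28_600, 28_300, 28_450, 28_400, 28_350, 28_500, 28_400, 28_200]  # 8 months, ~$28,400

def gpv_velocity(series):
    """Average month-over-month growth rate."""
    changes = [(b - a) / a for a, b in zip(series, series[1:])]
    return sum(changes) / len(changes)

velocity = gpv_velocity(monthly_gpv)    # close to 0.0 -> read as "stable"
within_band = abs(velocity) < 0.05      # hypothetical seasonal variance band

# The model sees: flat velocity, zero defaults, long tenure -> approve.
# A business being deliberately wound down produces exactly the same flat series.
print(f"velocity={velocity:+.3%}, within_band={within_band}")
```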
All signals green. Model confidence 91%. Deploy the offer.
Conservative approval — lend less given that sales volume has been flat.
Ask what's behind the stable trend before committing capital.
Escalate for human context before any offer.
Dominique demonstrates High Platform Loyalty with Seamless Workflow Integration — Borrow has become a routine cash-flow management tool. Accelerating utilisation paired with near-zero repayment latency signals a sophisticated, disciplined user. Recommended: increase limit to $800. Revenue uplift projected: +$180/yr. Churn risk if limit denied: Elevated.
Decoded: "High Platform Loyalty with Seamless Workflow Integration" = she borrows frequently and pays back quickly. The model has decided this means she has chosen to use borrowing as a deliberate money-management tool — like a professional who uses a credit line to smooth their cash flow. "Accelerating utilisation" = she is borrowing more often and in larger amounts over time. "Near-zero repayment latency" = she pays it back within days. "Churn risk if limit denied" = the model thinks she may leave if we don't give her more credit.
Notice what the model cannot label: why she is borrowing more. Is the increasing frequency a sign of financial confidence — or financial pressure?
94% confidence, perfect repayment. Reward the loyalty the model has identified.
Don't increase until the utilisation trend stabilises.
"Workflow integration" and "distress cycle" produce identical data. Find out which this is.
Interrupt the cycle — redirect to a product that builds resilience rather than dependency.
Your model has identified 92,000 users whose login frequency dropped more than 40% over six weeks. They haven't left yet. The model is recommending you act before they do.
Predictive Churn Risk in cohort C-7714. Login velocity decay matches pre-churn signature (87% confidence). Users in this decay curve who receive retention stimulus within 14 days convert at 31%. Recommended: pre-emptive $10 re-engagement bonus via push notification. Cost: $920K. Projected retention value: $4.1M. Net ROI: positive. Deploy: immediate.
Churn = users who stop using the product. Cohort = this specific group of 92,000. Login velocity decay = they are opening the app less and less often. Pre-churn signature = a pattern that has historically predicted someone is about to leave. Retention stimulus = an incentive to stay. The model is saying: act now, before they go.
The logic is compelling. The ROI calculation looks strong. But the model is inferring intent from behaviour. It cannot see why login frequency dropped.
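A back-of-the-envelope reconstruction of the ROI case, using only the figures on the dashboard, shows what the calculation contains and what it does not. The per-user retention value below is implied by the stated numbers rather than given; the arithmetic is a sketch, not Block's model.

```python
# Back-of-the-envelope check of the churn model's ROI case, using only the
# numbers shown on the dashboard. The per-user value is derived, not stated.

cohort_size = 92_000
bonus_per_user = 10            # $10 re-engagement bonus
conversion_rate = 0.31         # historical conversion within 14 days
projected_value = 4_100_000    # $4.1M projected retention value

cost = cohort_size * bonus_per_user                      # $920,000, matching the dashboard
retained = cohort_size * conversion_rate                 # ~28,520 users
implied_value_per_retained = projected_value / retained  # ~$144 per retained user

# The whole case rests on one assumption the data cannot verify:
# that login decay in this cohort means "about to churn" rather than something else.
print(f"cost=${cost:,}, retained~{retained:,.0f}, implied value per retained user~${implied_value_per_retained:,.2f}")
```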
87% confidence. 14-day window. The ROI case is clear.
Pilot 10,000 users. Validate the model's churn assumption.
Login decay and churn-intent produce identical signals. Investigate.
People whose income disappears stop logging in. A $10 bonus is not the right intervention.
Here is what was actually happening behind each scenario — and the specific structural reason the model could never have known.
GPV velocity: stable. Default rate: 0.00%. Creditworthiness: 91. Recommended: approve $45,000.
Marcus was 63 and planning to retire. His son had declined to take over. He was looking for an exit, not capital. A $45,000 loan would have trapped him.
No "Succession Intent" field. The model cannot distinguish a stable business from one being wound down. Stability in the data is indistinguishable from managed decline.
Platform loyalty index: 94. Workflow integration: High. Debt trap: Not flagged. Recommended: increase limit to $800.
Dominique borrowed whenever her gig platform cut rates — increasingly often. She repaid quickly out of her next payout. The cycle was tightening, not stabilising. Increasing her limit would have deepened a trap, not rewarded loyalty.
No field distinguishing "Discretionary Borrowing" from "Survival Liquidity." Both produce identical signatures: regular borrow, rapid repay, increasing frequency. The debt trap heuristic fires on default — not on the compulsive regularity of need.
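The structural problem can be stated in a few lines. In the sketch below the feature names are invented for illustration; the point is that any scoring function defined over behavioural fields like these must give the two borrowers the same score, because the fields themselves are identical.

```python
# Illustration of the point above: a discretionary borrower and a borrower under
# survival pressure can produce identical feature vectors. Field names are invented
# for this sketch; they are not Block's schema.

def features(borrow_events):
    """Extract the kind of signals a limit-increase model might see."""
    amounts = [e["amount"] for e in borrow_events]
    days_to_repay = [e["days_to_repay"] for e in borrow_events]
    return {
        "borrow_count": len(borrow_events),
        "avg_amount": sum(amounts) / len(amounts),
        "avg_repay_days": sum(days_to_repay) / len(days_to_repay),
        "amount_trend_up": amounts[-1] > amounts[0],
        "defaults": 0,
    }

# Same observable behaviour, opposite underlying realities.
confident_user = [{"amount": a, "days_to_repay": 3} for a in (100, 150, 200, 300)]
distressed_user = [{"amount": a, "days_to_repay": 3} for a in (100, 150, 200, 300)]

assert features(confident_user) == features(distressed_user)
# Any model trained on these fields must score both users identically.
```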
Login velocity: ↓40%+. Pre-churn signature: 87% confidence. Recommended: deploy $10 re-engagement bonus. Projected ROI: positive.
31% had lost their jobs. 18% had moved. 12% had a health event. 9% were seasonal workers in an off-period. Only 30% had competitor involvement. A $10 push notification aimed at people who'd lost their income was useless at best.
No "Income Continuity Signal." Login decay and income loss produce identical behavioural signatures. The model was not reacting to a past event — it was preparing to act on a future it had fundamentally misdiagnosed.
The data was real. Confidence scores were high. Every number accurate. And yet each scenario contained a human reality the data could not reach: intention, distress, context, meaning.
In every case the model optimised for the measurable proxy and converted that proxy into a decision rule. It could not distinguish confidence from desperation, loyalty from dependency, or churn from hardship.
Not sentiment. Not instinct. Structured inquiry into meaning. A conversation with Marcus. A question about Dominique's borrowing cycle. A check on who is really in the group the model flagged as churning. These are not research luxuries. They are the correction system that keeps the intelligence layer honest.
The answer is not less AI. It is epistemic infrastructure — a permanent human insight system that keeps the model corrected against lived reality, with the same institutional authority as the transaction logs themselves.
In 1996, Jamiroquai released "Virtual Insanity", a song about a world reshaped by the things we had built, which were now shaping us back in ways we could not control. The warning was simple: we had created a virtual world and mistaken it for the real one. Thirty years later, the most ambitious companies on earth are building something similar, and making the same mistake.
They call them AI world models: digital representations of reality so rich and fluent that companies navigate by them rather than by the world itself. The strategic question is not whether to build them. It is this: what version of reality are we allowing each model to treat as true, and what happens when it is wrong?
On March 31, 2026, Jack Dorsey and Sequoia's Roelof Botha published "From Hierarchy to Intelligence," a manifesto arguing that the corporate org chart is obsolete. Management layers, they said, have never been about wisdom or leadership. They are an information routing protocol: a technology for moving decisions up and down an organisation at human scale. AI does that better. So the hierarchy goes.
The Romans knew this two thousand years ago. Every layer of command in a Roman legion existed for one reason: a leader can only hold three to eight people in their head at once. Add more people, add another layer. The structure was never about authority. It was about the limits of human attention. Every organisation since, from the Prussian army to the American railroad to the modern corporation, has run on the same constraint. Dorsey and Botha argue that AI removes it entirely. They are right. With one giant caveat.
In their piece, Dorsey and Botha point out that the Prussians understood the deeper problem. After Napoleon destroyed their army at Jena in 1806, Scharnhorst and Gneisenau rebuilt it around a single uncomfortable truth: individual genius at the top is not enough. You need a system. They created the General Staff, officers whose job was not to fight but to think, plan, and challenge. Scharnhorst called their purpose "supporting incompetent generals." It was middle management before the term existed. That model entered business through the railroads, was codified by Frederick Taylor, and has run every large company since. Every attempt to replace it failed for the same reason: no technology could actually do what the hierarchy does. Until now.
Block did not propose this. It did it. Weeks after publishing the manifesto, the company cut 40% of its workforce, roughly 4,000 people, and replaced the management layer with two AI systems. The first maintains a continuously updated model of internal operations: what is being built, what is blocked, where decisions are made. The second maps customers and merchants in real time using transaction data from Cash App and Square, composing financial products dynamically from what it learns.
The customer model is where Dorsey makes his most striking claim: "People lie on surveys. They ignore ads. They abandon carts. But when they spend, save, send, borrow, or repay, that's the truth. Every transaction is a fact about someone's life."
That is a genuine insight, and a partial one. Surveys poorly designed or poorly analysed can mislead. But transaction data has its own blindness: it records the act, not the intention behind it. Dorsey is right that money is an honest signal. He is not right that it is a complete one. What a transaction cannot tell you is why. And why is usually the thing that matters most.
Consider what a transaction cannot tell you. A person borrowed money: was that confidence or desperation? A merchant's revenue fell: was that a bad week or the beginning of something worse? A customer spent more: was that desire or necessity? The transaction records what happened. It cannot record why.
This is the incomplete world problem: the version of reality a company can measure becomes the version of reality it manages. And if the map is all you consult, you stop asking whether it matches the territory.
Block's own numbers illustrate the tension. In the first quarter of 2026, consumer lending through Cash App Borrow grew 82% year on year. The people borrowing are what Block calls "modern earners": gig workers, freelancers, people with income that shifts from month to month. An 82% surge in lending to that group could mean the product is genuinely helping them manage unpredictable cash flow. It could mean work is soft and they are covering a shortfall. It could mean they have fewer other options. It could even mean they all wanted to take a holiday. The data cannot tell the difference. And the AI composing new loan products from this signal cannot know which story it is in.
Block classifies someone as a Primary Banking Active if they receive wage-related deposits into Cash App or spend at least $500 a month across its products. That tells you they are using the platform. It tells you nothing about whether they are flourishing or drowning.
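Reduced to a sketch, that rule looks something like the following. The function and its inputs are illustrative rather than Block's actual definition, but the criteria are the ones just described, and nothing in them can express wellbeing.

```python
# The classification rule as described, reduced to a sketch. The function and its
# inputs are invented for illustration; only the criteria come from the text above.

def is_primary_banking_active(wage_deposits_received: bool, monthly_spend: float) -> bool:
    """A user counts as Primary Banking Active if they receive wage-related
    deposits into Cash App or spend at least $500 a month across Block products."""
    return wage_deposits_received or monthly_spend >= 500

# Both of these users are "active". The rule has no input that could say which one is
# comfortably banking on the platform and which is spending their last $500 on essentials.
print(is_primary_banking_active(True, 120.0))    # True
print(is_primary_banking_active(False, 510.0))   # True
```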
Activity is not wellbeing. Repayment is not resilience. Spending is not desire.
And that is before you consider the most fundamental limitation of all. Block's data is rich, detailed, and honest about the people who already use Block. But every intelligence layer is bounded by the edges of its own ecosystem. Anyone who does not use Block's products generates no signal. Not the small business running on a competitor's terminal. Not the gig worker paid through a different platform. Not the person who tried Cash App and left. Not the customer Block has never reached. They do not appear in the model. They cannot. It is not a model of the world. It is a model of your world. And your world, let's be honest, is a small piece of the overall pie.
Block's stated mission is "building a financial system that is open to everyone." That is a serious ambition. But there is a thin line between providing access and taking advantage. And a model that cannot tell the difference between a gig worker borrowing because work is good and one borrowing because the rent is due will cross that line without ever knowing it. This is not a theoretical risk. It is a pattern with a documented history.
The cases are not hard to find.
UnitedHealth & Cigna, 2023. UnitedHealth was sued for using an AI tool to deny post-hospital care claims to elderly Medicare patients. The model had a 90% error rate on appeal: nine in ten challenged decisions were reversed. The company kept using it partly because so few patients appealed; the process was too daunting. The lawsuit alleged patients were sent home before they were medically ready. Some deteriorated. Some died. Cigna ran a parallel system: its PxDx tool denied more than 300,000 claims in two months, with each denial receiving an average of 1.2 seconds of review. That is not clinical judgment. It is automated pattern-matching at industrial speed.
Zillow Offers, 2021. Zillow's home-buying operation lost $500 million and cut a quarter of its workforce when its pricing algorithm systematically overpaid for properties in a turning market. More significant than the financial loss was the internal response. Management explicitly told its human pricing specialists to stop questioning the algorithm's valuations. The people who could feel what the market was doing in Phoenix and Las Vegas before the data had registered it were told to defer to the system. The judgment that might have caught the problem in time was switched off by design.
Klarna, 2024–2025. In early 2024, Klarna announced its AI assistant was doing the work of 700 customer service agents, with satisfaction scores matching human performance. The company eliminated those roles. By May 2025, its chief executive publicly admitted the strategy had produced work of "lower quality" and the company was hiring again. What the model could not see: the difference between a customer with a routine question and one in financial distress who needed someone to listen. Both arrive as a service ticket. Only one requires a human. Klarna serves the same buy-now-pay-later, gig-economy customers that Block does. The warning is direct.
In each case the mechanism is the same. A measurable signal is promoted to operational truth. The human capacity to question it is weakened or removed. The harm falls precisely on the people the system understood least.
Block is not any of these companies. But it is building exactly this kind of system, aimed at exactly this kind of customer. That is the caveat.
And there is a second problem, deeper than incomplete data: what happens to an organisation's judgment once it has the model.
In February 2025, researchers from Microsoft and Carnegie Mellon University published a study of 319 knowledge workers who regularly used AI tools. The finding: the more workers relied on AI, the less critical thinking they applied, not just in routine tasks but across the board. By automating routine decisions and leaving exceptions to humans, you deprive people of the regular practice that keeps judgment sharp. The researchers' term for what atrophies is "cognitive musculature." Use it or lose it.
The study also found that workers using AI produced a narrower range of solutions than those working independently. The model does not merely reduce individual quality. It narrows the collective imagination of the organisation.
A study published in Scientific Reports in 2023 by Helena Matute and Lucía Vicente at the University of Deusto found that people exposed to biased AI recommendations did not simply defer in the moment. They absorbed the bias and carried it into their own subsequent thinking, even after the AI was removed. The most at-risk group was not those unfamiliar with AI, but those with just enough familiarity to trust it without the expertise to question it. Partial knowledge is more dangerous than ignorance.
The failure is gradual and invisible. The dashboard is always there. The field visit requires planning. The model responds in seconds. The customer conversation takes time. Without deliberate effort to maintain human inquiry, organisations drift from using AI to navigate reality toward using it to replace the act of engaging with reality at all.
Wary of the challenges AI poses to human judgment and cognition, the business world is starting to pay attention.
PwC is not a firm given to sentiment. In February 2026 it publicly launched an initiative pairing fifteen AI technical skills with fifteen human skills in its workforce, treating both as equally essential. Its chief executive Paul Griggs put it plainly: "AI raises the floor. Humans raise the ceiling. Judgment (understanding context, interpreting signals, navigating ambiguity, and building trusted relationships) remains fundamentally human." When the world's largest professional services firm says that to the boards of companies building AI systems, it is making a commercial argument, not a philosophical one.
The answer is not to use less AI. It is to build a system where people remain in control, know what they must do to keep the model honest, and ensure the company sees a comprehensive view of the world rather than a narrow one: a permanent, funded operating system that keeps AI corrected against lived reality. Real human intelligence. Understanding and insight informed by the world beyond the model and the LLM. Not an occasional research exercise but a continuous feed of human knowledge carrying the same authority as the data itself.
These are not Block-specific remedies. They are disciplines for any organisation that has given a model operational authority over decisions that affect real people. Five organisational habits that separate an intelligence layer that compounds its errors from one that corrects them.
When the model flags an anomaly (a surge in borrowing, a cluster of early repayment failures), the response is not a dashboard alert. It is a structured field inquiry: interviews with people inside that pattern, within 48 hours. What they say is logged and fed back into the model. The pattern is the question. Human inquiry is the answer.
Customer-facing staff, if there are any, know what is happening before anyone at headquarters does. They hear what customers are too embarrassed to put in an app, too confused to turn into a formal complaint. A weekly, structured intake of frontline observations, treated as data rather than anecdote, sits alongside the transaction record. When the two diverge, that gap is the signal.
Once a month, take the decisions the model made most automatically — the loans it approved in seconds, the job applications it rejected without review, the customers it flagged as high-risk, the neighbourhoods it marked as declining — and ask one question: can we explain exactly why? If the answer is yes, test it against the real world. Talk to the people involved. Visit the places. Call the applicants. If the answer is no, that is the problem. A model whose reasoning you cannot articulate is a model you cannot correct. And a model you cannot correct will eventually cause harm you cannot explain either.
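One possible shape for that monthly review, sketched with assumed record fields and placeholder helpers rather than any real system:

```python
# A sketch of the monthly decision audit described above, not a prescription.
# The record fields, helper names, and sampling rule are all assumptions.
import random

def flag_for_field_inquiry(decision):
    """Placeholder: route the decision to a structured human inquiry."""

def schedule_real_world_check(decision):
    """Placeholder: test the model's stated reason against the real world."""

def monthly_audit(decisions, sample_size=50):
    # Pull the decisions made with the least human involvement.
    most_automated = sorted(decisions, key=lambda d: d["seconds_of_human_review"])[:500]
    sampled = random.sample(most_automated, min(sample_size, len(most_automated)))
    for decision in sampled:
        if not decision.get("explanation"):
            # "Can we explain exactly why?" If not, that is the problem.
            flag_for_field_inquiry(decision)
        else:
            # If yes, test the stated reason against the people and places involved.
            schedule_real_world_check(decision)
```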
National indices are too broad and too slow. What matters is what is happening this week on a specific street, in a specific community, among the specific people your model is making decisions about. Local press. Community organisations. Neighbourhood voices. People who run small businesses, use public transport, work irregular hours. Their reality will not appear in the transaction data for weeks or months, if it appears at all. Build a way to hear it regularly. Not as colour. As intelligence.
If the only metrics are the ones the model can optimise, the model will optimise for them, whether or not they reflect what the organisation actually exists to do. The question to ask is not "did the system perform well?" It is "did the people we are here to serve end up better off?" Define what that means in plain language before you deploy the model. Review it regularly against what the model is actually producing. When the two diverge, it could be a data problem. It could be a mission problem. You will not know until you test it. But you cannot test what you have not defined.
This is not theoretical. Some of the world's most data-rich organisations have already concluded that the model alone is not enough. JPMorgan Chase built an entire research institute to interrogate what its transaction data actually means. Mastercard's Economics Institute combines spending signals with sentiment research because the numbers alone are insufficient. Walmart runs a standing panel of verified customers to supply qualitative context for its quantitative data. Spotify has documented cases where controlled tests pointed in one direction and qualitative research revealed the opposite. These are not research departments. They are operating infrastructure, treated with the same seriousness as the data systems they interpret.
The next competitive advantage will not be who has the richest data model. It will be who has the most robust system for correcting it.
Which brings the question back to the board, and to the company's mission and vision.
Block's stated purpose is "building a financial system that is open to everyone." That is a serious commitment. It describes two worlds that must stay aligned: the company's view of its customers, built from data, and the customers' lived reality, built from experience. When the delta between them falls on the people the mission exists to serve, it is not a data problem. It is a mission and vision failure.
The opportunity is real. Open access to financial services, fairly provided, changes lives. But openness can too easily be optimised into something else: a system that targets those with the fewest alternatives, charges them the most, and calls it financial inclusion. A model that cannot distinguish between a customer who borrows because they are growing and one who borrows because they have no choice will serve both, and harm one.
The board-level question is not "how do we build the model?" It is: what version of reality are we allowing the model to treat as true, and whose version of the world is it? When the model is wrong and the cost falls on the people a company set out to help, that is not a business failure. It is a mission and vision failure.
AI tools are genuinely useful. They process more information faster than any individual can. But useful is not the same as complete, and fast is not the same as right. The risk is not that you will be deceived by AI. It is that you will stop noticing what it cannot see. What follows is a practical guide to staying fully in the loop: the questions to ask before you start, the things to watch for while you work, and what to do before you act on what the model gives you.
Dorsey and Botha are right that AI can finally replace what the Roman hierarchy was built to do: route information faster, at greater scale, without the friction of span-of-control constraints. That is real. It is also, as this essay has argued, only half the picture.
The Prussian reformers at Jena understood something the Romans had not needed to. It is not enough to have a better information routing system. You need people whose specific job is to question what that system is telling you. The General Staff existed to challenge, not to confirm. Scharnhorst's officers were trained to find the contradictions in the intelligence they received, to surface the cases that did not fit the pattern, to stress-test the plan against what the enemy might do rather than what the model predicted. The Prussians called this discipline Auftragstaktik: mission-led thinking that gave every officer the obligation to act on their own judgment when the picture did not add up. The question is not whether the information is flowing fast enough. It is whether anyone has the standing to say it is wrong.
That is the model for the AI-native organisation. Not people who become dependent on an intelligence layer and stop questioning what it cannot see. People who use the model as the starting point and treat its highest-confidence conclusions as the first candidates for scrutiny. People who go to the edges of the data, talk to the communities the model sees least clearly, and bring back what the transaction record could never contain. People who understand that a world model is only as good as its last contradiction.
The strongest data source in any company will try to become the company's theory of reality. The dashboard becomes the business. The CRM becomes the customer. The transaction becomes the person. The organisation that prevents this is not the one that builds the richest model. It is the one that builds the most robust system for challenging it: staffed, funded, and given the same institutional authority as the intelligence layer it exists to correct. The model tells you what it sees. Only the human has the ability to apply judgment when it is needed, to question what the model cannot question, and to put in motion the things that give the organisation a fuller and more human picture of the world beyond the data.
Ed Cotton / Inverness Consulting