It was the moment the technology was built for.

France against Senegal, Kylian Mbappé going down in the box after a challenge from Sadio Mané. The on-field referee, Alireza Faghani, Iranian-born, now Australian-based, and one of the most decorated officials in the world, working his fourth World Cup, judged it no penalty in real time. Then the VAR technology went to work. Every camera angle, frozen and slowed, the contact captured with a fidelity no human eye could manage at full speed. The VAR officials in the booth looked at that footage, judged the on-field call a likely error, and recommended Faghani come to the monitor to see for himself.

Faghani went to the pitchside monitor. More than 80,000 people in the stadium and over 20 million on television watched him stand in front of the replay and look.

And then he turned around, resumed play and kept his original call – no penalty.

Darren Cann, an assistant referee in the 2010 World Cup final, said when asked that he couldn’t support the decision. He believed it was a clear penalty, and many others, including some of the world’s best referees, agreed.

So far this has been one of the most controversial decisions of this World Cup, not because the technology failed, but because it worked perfectly and the call still came out wrong. The cameras saw everything. The footage was flawless. The human had every angle, every replay, all the time he needed, and the full authority to change his mind.

And he didn’t.

That is the part worth pausing on, because it cuts against the entire promise of the technology. The machine’s job is to see — to capture the physical world with a completeness no human can match. In this case it did just that. What it cannot do is the next thing: judge what the evidence means and decide what to do about it. That part is still human. And in the moment that mattered, with perfect sight in front of him, the human judged it wrong.

A missed call in real time is forgivable; the eye is slow and the game is fast. That was exactly why VAR was first introduced. But a referee who is shown everything, in perfect detail, and still gets it wrong is a different kind of problem, and a more instructive one, particularly for this conversation. Because it isn’t a problem with the seeing.

It’s a problem in the space between seeing and deciding.

The question every fan is actually arguing about

Strip the football away and you have one of the most important questions in artificial intelligence, and football is the one place on earth currently arguing about it out loud, in front of billions of people.

The promise of VAR was simple: give the human a powerful machine to check his work, and the decisions get better.

More information, more accuracy, fewer mistakes.

It is the same promise being made, right now, about AI in every consequential industry on earth. Put a capable system next to a skilled human, and the pairing will outperform either alone.

And here is the part that matters, the part that makes football – the beautiful game – a wonderful teacher: for the most part, the promise holds. The research on VAR bears this out. One study found the accuracy of referees’ final decisions rose from 92 percent to over 98 percent once VAR was in the mix. The technology, on the whole, makes the officiating better.
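A six-point rise can sound modest, so it helps to look at the same figures as error rates. The short sketch below only restates the two numbers quoted above; treating "over 98 percent" as exactly 98 is my simplification, which makes the result a lower bound on the improvement.

```python
# Figures quoted above: final-decision accuracy before and after VAR.
# "Over 98 percent" is treated as exactly 98 here (an assumption),
# so the improvement computed below is a lower bound.
accuracy_before = 92  # percent
accuracy_after = 98   # percent

errors_before = 100 - accuracy_before  # wrong calls per 100 decisions
errors_after = 100 - accuracy_after

reduction_factor = errors_before / errors_after
share_of_errors_removed = (errors_before - errors_after) / errors_before

print(f"Errors per 100 decisions: {errors_before} -> {errors_after}")
print(f"Errors cut by a factor of {reduction_factor:.0f}")
print(f"Share of errors removed: {share_of_errors_removed:.0%}")
```

On those figures, roughly three out of every four wrong calls disappear. What remains is the small residue of errors the rest of this piece is about.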

So this is not a story about a system that failed.

It’s a story about what’s left over after the system mostly succeeds. Because even with accuracy up near 98 percent, the moments that decide tournaments, the ones everybody remembers, like the Mbappé penalty decision, can still be wrong. Not because of the machine, but because of the human. Sometimes, as with Faghani, the referee under-trusts the machine: he waves away what the replay plainly showed and backs his own eyes when he shouldn’t. But the failure runs just as easily the other way. Sometimes the referee over-trusts the screen and defers when his own read was better. A referee with VAR to fall back on can feel the instinct to avoid committing to a hard call in real time and rubber-stamp the monitor instead. Present on the pitch, no longer really deciding.

That is over-trust and under-trust, the same relationship breaking in opposite directions.

None of that means the technology is broken. The cameras are close to perfect. The human still has the authority. The hard part was never the seeing.

The hard part is what the human does with what the machine sees, in the seconds that count.

The thing football gets right that almost no one else does

Here is what should give every organisation deploying AI pause or, put differently, what we can all learn from the evolution of VAR in football.

VAR is the most scrutinised human-machine decision system on the planet. Every contentious call is dissected by a billion people in real time. The protocols are constantly being reviewed and adjusted. The pitchside monitor is a relatively recent addition, introduced after early trials showed that having no monitor moved the referee too far from the decision. The review powers keep expanding; the offside tracking keeps getting faster and more automated; referees now use a microphone to explain their decisions. This is a human-machine relationship being actively, publicly, continuously redesigned, by a governing body that treats every failure as a reason to revise.

And notice what’s being automated and what isn’t. The offside line is a factual call (where was the body when the ball was played?), and the machine now owns it, because the question is a measurement and a tracking system answers it better than any eye. The penalty is a different kind of call. Whether the contact was a foul isn’t a measurement; it’s a judgment about what happened and what it meant. That one the machine can inform, but it cannot make. The system has quietly sorted its decisions by type – measurement to the machine, meaning to the human – and the hard cases, the ones that decide tournaments, are all on the human’s side of that line.
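The sorting described above can be written down as a rule. What follows is a minimal sketch, not a description of how VAR is actually implemented: the names `DecisionType`, `Owner`, `Decision`, and `route` are all hypothetical, chosen only to make the split between measurement and meaning explicit.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionType(Enum):
    MEASUREMENT = "measurement"  # a factual question a tracking system can answer
    JUDGMENT = "judgment"        # a question about what the facts mean


class Owner(Enum):
    MACHINE = "machine"  # the machine makes the call
    HUMAN = "human"      # the machine informs, the human makes the call


@dataclass(frozen=True)
class Decision:
    name: str
    kind: DecisionType


def route(decision: Decision) -> Owner:
    """Assign a decision to its owner by type (hypothetical rule)."""
    if decision.kind is DecisionType.MEASUREMENT:
        return Owner.MACHINE
    return Owner.HUMAN


# The two examples from the text above.
offside = Decision("offside line", DecisionType.MEASUREMENT)
penalty = Decision("penalty for contact", DecisionType.JUDGMENT)

for d in (offside, penalty):
    print(f"{d.name}: {route(d).value}")
```

The rule itself is trivial. The difficulty sits in everything it leaves out: what happens on the human side of the line once the call has been routed there.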

And yet this remains incredibly hard.

With all that scrutiny, all that iteration, all that accuracy … the decisive moments still come down to a human at a monitor who can get it wrong, even with the right answer in front of him.

And that is the part worth mulling. The point is not that “VAR is bad.” VAR is maybe the best example we have of active problem-solving on the human-machine relationship. And it still gets it wrong.

The research nobody in the AI conversation wants to quote

There is wider research on this topic, beyond VAR.

The largest analysis of its kind – over a hundred experiments, hundreds of measured outcomes – looked at humans alone, AI alone, and the two combined. The assumption, the one underneath every “human in the loop” promise, is that the combination always wins.

It doesn’t.

On average, the human-AI pairing performed worse than the better of the two working alone. And the losses were concentrated in one place above all: decision-making. On tasks where the job was to decide, bolting a human and a machine together tended to drag the result down, not up.
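The comparison behind that finding is easy to state precisely: the pairing is measured against whichever of the two does better alone, not against the human alone. Here is a minimal sketch of that yardstick. The function name `pairing_gain` is hypothetical, and the accuracy figures are invented for illustration; they are not taken from the analysis.

```python
def pairing_gain(human_alone: float, ai_alone: float, combined: float) -> float:
    """Accuracy of the pairing minus accuracy of the better solo performer."""
    return combined - max(human_alone, ai_alone)


# Hypothetical accuracies, invented purely for illustration.
human_alone = 0.80
ai_alone = 0.90
combined = 0.86

gain = pairing_gain(human_alone, ai_alone, combined)
beats_human_alone = combined > human_alone

print(f"Pairing beats the human alone: {beats_human_alone}")
print(f"Gain over the better of the two: {gain:+.2f}")
```

A negative value means the pairing lost to the better solo performer, which is the average pattern described above.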

That is not an argument against the pairing. It’s an argument that the pairing is not automatically better. Whether it helps or hurts depends entirely on how the relationship between the two is designed. Football is what it looks like when someone designs and redesigns that relationship in public and even then, as we have noted, it is hard.

Most industries have not even begun down this path.

There are names for the failure modes, drawn from decades of work in aviation and medicine, where the stakes forced somebody to study it:

Automation bias: The human defers to the system even when it’s wrong.
Over-compliance: The human stops checking, because the system is usually right.
Deskilling: The human’s judgment atrophies from disuse, until the day the system fails and there’s nothing to fall back on.
Out-of-loop syndrome: The human is technically present, nominally in charge, and no longer able to actually intervene.

Every one of them is a referee at a monitor. And every one is happening, right now, inside organisations that believe their “human oversight” is doing the work they think it is.

Even the off-switch is a referee at a monitor

The most serious people in AI keep walking right up to this, and it’s worth watching where they stop.

Will Marshall, in a recent and important essay, makes the case for governing advanced AI before it outruns us. He quotes the founders of the leading labs, whose predictions on the odds of disaster are simply scary. His frame, and theirs, is about control: keep the human above the machine, build the off-switch, hold the line. And he is honest about the catch — he concedes, almost in passing, that any off-switch will probably fail.

… because the weakest link is always the human element.

He’s right, and that line is the one worth taking seriously rather than waving past — it’s the truest thing in the argument. The off-switch does fail at the human. What it leaves open is why. And the answer isn’t that the human is weak. Faghani isn’t weak. He is one of the most decorated officials in the world, he had a flawless replay and complete authority, and he still got it wrong. The human is the failure point not because the person is deficient, but because the judgment is the one part of the system nobody designed. Everyone engineered the seeing. Everyone engineered the controls. No one engineered the seconds in which a human looks at perfect evidence and has to decide what it means.

What can be designed is the conditions around the judgment, so that good judgment is more likely and bad judgment is more catchable.

That is the gap: not the human, but the undesigned space around the human’s judgment. Control puts a hand on the switch. It does nothing for the judgment behind the hand.

Where this stops being a metaphor

Football works as the example because its decisions are made in the real world, under time pressure, and cannot be reversed. Once the whistle blows, the goal stands or it doesn’t, and there is no revisiting.

That is exactly the kind of decision where the human-machine relationship matters most, and exactly the kind where getting it wrong costs something you can’t take back. And unlike in football, almost none of these decisions are watched, scrutinised, or designed at all.

These are the decisions I am focused on. Not the boardroom call you can revisit, but the ones made under a running clock about the physical world. An incident commander deferring to a model’s evacuation sequence as the fire turns. A grid operator trusting a load-shed recommendation with seconds to act. A claims adjuster ratifying a total-loss call on a property they half-doubt. A flood-response team routing resources off a prediction of where the water goes next. In every one, the machine sees more than any human could.

And in every one, the real question is not whether the human is allowed to overrule it, but whether the pairing was designed so the answer comes out better instead of worse.

Football has a billion people watching this relationship and a governing body revising it continuously, and it remains incredibly hard. The incident commander, the grid operator, the adjuster have none of that. The machine was installed. A human was placed beside it. Everyone assumed the combination would help. No one has gone back to ask whether it actually does.

The technology in these systems is in large part fine. The unsolved problem is the one playing out on every pitch in this summer’s World Cup, in the one arena honest enough to argue about it in public: when you put a capable machine next to a skilled human and ask them to decide together, you do not automatically get a better decision. Sometimes, in the moment that matters most, you get a worse one.

Which way it goes is a design choice. And outside of football, it is one almost no one is making.

VAR saw everything. Faghani had the authority. He still got it wrong. Which decisions belong to the machine, which stay with the human, and how you design the handoff between them … that’s the question I’ll take up Thursday.

Has your organisation designed the human layer … or just assumed it’s there?

If you’re not sure, that’s usually your answer.

I write Decision Layer Weekly for people working on exactly this gap: Subscribe here. And if it’s live for you right now, reach me directly: mattsheehan@spatialnext.io

Matt Sheehan

Matt is a geographer and AI strategist with 25 years at the intersection of geospatial intelligence and decision-making. He maps the architecture connecting three layers most organisations haven’t yet seen together: the sensing layer the geospatial industry has built, the causal reasoning layer now arriving, and the human decision layer nobody is designing. The third layer is where most deployments fail. And where Matt has his primary focus.

VAR Saw Everything. The Referee had the Authority. He Still Got It Wrong.
