Data readiness

When AI projects fail, the data is usually to blame

Not the model, not the prompt, not the platform. The thing feeding it.

[Figure: Finance: revenue = 480k | Sales CRM: revenue = 512k | Ops sheet: revenue = 467k | Model: picks one?]
Three systems, three definitions of “revenue” — the model builds on the contradiction.

Most failed AI projects don't fail because the technology was weak. They fail because the data underneath was wrong, out of date, scattered across systems that don't talk to each other, or never properly defined in the first place.

There's a line a UK data firm uses that has stuck with me: AI doesn't fix your data problems, it puts them on display. That matches what we see. A model is happy to take a messy, half-contradictory dataset and produce a confident answer from it. The answer just won't be right, and now it's wearing a clean interface that makes it look right.

"Clean" data and "AI-ready" data are not the same thing

This is the part most people miss. Your data can pass every quality check (no blanks, no duplicates, every format correct) and still be useless to an AI system.

Gartner's framing is helpful here. Data that's ready for AI isn't just tidy. It's tied to a specific use, governed properly, fed through pipelines that catch problems as they happen, and kept current rather than checked once a quarter. By that definition, Gartner reckons that well over half of organisations don't actually have AI-ready data, and it has predicted that most AI projects without it will be abandoned.

A concrete example. Ask three teams in the same company what "revenue" means and you'll often get three answers. Finance counts it after refunds. The sales system counts it before. Operations has its own version baked into a spreadsheet from 2019. A human knows to ask which one you mean. A model doesn't. It treats all three as the same number and builds on the contradiction without telling you.
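To make that concrete, here is a minimal Python sketch of the check a model won't do for you: comparing the same metric across sources before anything is built on it. The figures are the ones from the illustration at the top of this piece. The source names, the function names and the 1% tolerance are our own assumptions for illustration, not a standard.

```python
# Illustrative only: source names and the 1% tolerance are assumptions.
REVENUE_BY_SOURCE = {
    "finance": 480_000,
    "sales_crm": 512_000,
    "ops_sheet": 467_000,
}


def relative_spread(readings: dict[str, float]) -> float:
    """Gap between the highest and lowest reading, as a fraction of the highest."""
    highest = max(readings.values())
    lowest = min(readings.values())
    if highest == 0:
        return 0.0
    return (highest - lowest) / highest


def sources_agree(readings: dict[str, float], tolerance: float = 0.01) -> bool:
    """True only if every source reports the metric within the tolerance."""
    return relative_spread(readings) <= tolerance


if __name__ == "__main__":
    spread = relative_spread(REVENUE_BY_SOURCE)
    if not sources_agree(REVENUE_BY_SOURCE):
        print(f"'revenue' differs by {spread:.1%} across sources; agree a definition first")
```

A check like this doesn't say which number is right. It only makes the disagreement visible, which is the question a human would have asked anyway.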

The unglamorous work that actually moves the needle

None of the fixes here are exciting, which is probably why they get skipped.

Agree on definitions. Every metric that matters should mean one thing across the whole business, written down and enforced, not left to whoever built the report. Connect the sources, so the model isn't pulling from four systems that each hold a slightly different truth. And treat data quality as something you maintain, not a one-off cleanup before launch. Models in production need their inputs checked in something close to real time, not reviewed at the next governance meeting six weeks out.
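As a rough sketch of what "maintained, not cleaned up once" can look like, the Python below checks each incoming record against a written-down definition and a freshness limit before it reaches a model. Everything named here is hypothetical: the registry, the record fields (`metric`, `value`, `updated_at`) and the one-hour limit are assumptions chosen for illustration, and a real pipeline would likely use a dedicated validation tool.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical registry: one agreed, written-down definition per metric.
METRIC_DEFINITIONS = {
    "revenue": "Invoiced amount after refunds",
}

# Assumed freshness limit; pick whatever the use case actually needs.
MAX_AGE = timedelta(hours=1)


def check_record(record: dict, now: datetime) -> list[str]:
    """Return a list of problems with one record; an empty list means it passes."""
    problems = []

    metric = record.get("metric")
    if metric not in METRIC_DEFINITIONS:
        problems.append(f"no agreed definition for metric {metric!r}")

    if record.get("value") is None:
        problems.append("missing value")

    updated_at = record.get("updated_at")
    if updated_at is None:
        problems.append("missing updated_at timestamp")
    elif now - updated_at > MAX_AGE:
        problems.append(f"stale: older than {MAX_AGE}")

    return problems


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    record = {"metric": "bookings", "value": None, "updated_at": now - timedelta(hours=3)}
    for problem in check_record(record, now):
        print(problem)
```

The point of the design is that the check runs on every record as it arrives, so a missing definition or a stale feed shows up within the hour rather than at the next governance meeting.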

McKinsey's data backs the order of operations: the firms getting real returns from AI were far more likely to have sorted out their data flows before picking a model. Fix the foundation, then build on it. Do it the other way round and you're decorating a house with no footings.

Why we lead with this

We'd rather tell a client their data isn't ready than sell them a clever model that's going to embarrass them in three months. It's a less flattering conversation to have on a first call. It's also the difference between a project that quietly works and one that quietly gets switched off.

If your AI initiative has stalled, or never made it past the demo, start by looking under it. The cause is almost always in the data, not the algorithm. The good news is that's fixable, and fixing it tends to pay off well beyond the one project that prompted it.

Think your data and AI could work harder?

Book a free 30-minute diagnostic — honest advice, no obligation.