The numbers are brutal. In August 2025, MIT published a study revealing that 95% of AI projects at large US corporations were failing to deliver meaningful value. But here's what should really keep you awake at night: that figure is actually worse than it sounds. The researchers' analysis included a small subset of projects that had performed exceptionally well. Strip out those bright spots, and the failure rate among the remaining projects is even more devastating.
There's one number, however, that offers genuine hope. When companies brought in outside help—particularly smaller vendors with specialized expertise—their success rate jumped to 67%. That's the difference between one success out of twenty and roughly two successes out of three. It's the difference between despair and redemption.
So why are projects failing? And more importantly, why are companies ignoring the solution that actually works?
The Seduction of Control
A few years ago, right after ChatGPT launched and captured the world's imagination, I pitched a document retrieval system—what's known as RAG, or Retrieval-Augmented Generation—to a large organization where we'd previously delivered a successful AI project. They had the budget. They had the motivation. But they made a decision that I've seen countless times since: they decided to build it internally.
Their reasoning was logical, or so they thought. "This is too core to our business," they told us. "We need to own this technology."
I'm confident they cobbled together something that kind of worked, probably using open-source tools like LangChain and whatever frameworks were trending on GitHub at the time. Did they get the full benefit of AI? I seriously doubt it. More likely, they got something that limped along, consuming resources and attention while delivering mediocre results.
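To be concrete about what they were attempting: at its core, a RAG pipeline indexes your documents, retrieves the chunks most relevant to a question, and hands those chunks to a language model along with the prompt. The sketch below is deliberately tiny and purely illustrative; the embed() helper is a toy word-count vectorizer and nothing calls a real model, so it runs as written, but it hides exactly the parts that are hard in production: chunking, evaluation, and what to do when retrieval misses.

```python
# Minimal, illustrative RAG skeleton. embed() is a toy bag-of-words stand-in
# for a real embedding model, and answer() returns the augmented prompt
# instead of calling an LLM, so the whole thing runs with no dependencies.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag of lower-cased words. A real system would call
    # an embedding model here.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank document chunks by similarity to the question and keep the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def answer(query: str, chunks: list[str]) -> str:
    # "Augment" the prompt with the retrieved context; a real system would
    # send this prompt to a language model rather than returning it.
    context = "\n\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Invoices are approved by the finance team.",
    "Travel requests need a manager's sign-off.",
    "Security incidents go to the on-call engineer.",
]
print(answer("Who approves invoices?", docs))
```

Wiring up a skeleton like this takes an afternoon; making it reliable enough to trust with real business questions is where the experience comes in.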
The tragedy wasn't unique to them. I've watched this pattern repeat across industries: mature companies with impressive track records in their core business, suddenly convinced they could become leaders in AI. What inspired this confidence? Largely, it was ChatGPT's deceptive simplicity. The chatbot made AI seem almost trivial—just write a prompt, get an answer. How hard could it be?
Very hard, as it turns out.
The Hidden Complexity
Here's what companies discover too late: ChatGPT made artificial intelligence seem seductively simple, but in practice, things get complicated fast. And complexity requires experience—real, hard-won experience that most organizations simply don't possess.
Building effective AI systems requires a solid foundation in software engineering fundamentals. You need to understand testing. You need to know how to benchmark what you're actually building. You need to recognize when something is broken, and here's where AI is uniquely deceptive.
In traditional software development, the program has the decency to crash when something goes wrong. It either works or it doesn't. Engineers might sometimes bury their heads in the sand if something works intermittently, but eventually, reality forces accountability. The program fails in obvious ways.
AI is worse. I've watched teams build models riddled with fundamental bugs—broken approaches, flawed architectures, garbage data—and they don't even realize it. The system sort of works. It produces outputs. It doesn't crash. So they keep going, making one more tweak, certain that everything will be okay. "Surely the next iteration will fix this," they tell themselves.
It won't.
The problem is this: with traditional software, it's usually clear whether your code is wrong. With AI, you often can't tell whether the model itself is broken or whether your implementation is faulty. You can't tell whether your data is poisoned or your training approach is fundamentally flawed. That clarity—that hard-won understanding of what's working and what's failing—is what experience provides. And you can't buy experience quickly. You can pay for outside experts, or you can pay your team to learn on your dime. Either way, you're paying. The only choice is how long you want to pay and how much organizational pain you're willing to endure during the learning process.
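To make the benchmarking point concrete: a useful benchmark can start as nothing more than a fixed "golden set" of inputs with checks you can score automatically, run on every change so regressions show up as numbers rather than hunches. The harness below is a sketch under that assumption; run_system() is a hypothetical hook for whatever pipeline you are actually testing, and the pass criterion is deliberately simple.

```python
# A minimal golden-set harness: fixed inputs, explicit checks, a score you
# can track over time. run_system() is a placeholder for your own pipeline.
GOLDEN_SET = [
    {"question": "Who approves invoices?", "must_contain": "finance"},
    {"question": "Who signs off on travel requests?", "must_contain": "manager"},
]

def run_system(question: str) -> str:
    # Hypothetical hook: replace with a call into the pipeline under test.
    raise NotImplementedError

def evaluate(threshold: float = 0.9) -> float:
    # Score every case with an automatable check instead of eyeballing output.
    passed = 0
    for case in GOLDEN_SET:
        output = run_system(case["question"])
        if case["must_contain"].lower() in output.lower():
            passed += 1
        else:
            print(f"FAIL: {case['question']!r} -> {output!r}")
    score = passed / len(GOLDEN_SET)
    print(f"golden-set accuracy: {score:.0%}")
    assert score >= threshold, "benchmark regression: do not ship this change"
    return score
```

It is crude, but it is the difference between "the demo looked fine" and a number you can watch move from one release to the next.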
Bad Tools in Inexperienced Hands
The situation isn't helped by technology vendors making things worse.
Microsoft's Copilot initiative is a perfect case study in misdirected ambition. The branding itself is so confused that most people don't actually know what Copilot is or what it does. Part of it involves allowing organizations to build their own AI workflows. Microsoft initially tried to charge $108,000 per year for this capability. Then they cut the price to $360 per year—a 300-fold reduction. Now they give it away for free.
I tried building something with it once, and I told my engineering team: if you ever produced something this poorly designed, I would fire the lot of you.
Think about that pricing trajectory. Microsoft doesn't slash a product's price by a factor of 300 and then give it away if people actually want to use it. That pricing tells you everything you need to know. The tool is bad. It was clearly built by software engineers who understood traditional software development but had little comprehension of how AI actually works. Despite being terrible, it's now in the hands of IT departments across America, most of them with zero training in AI, who are attempting to use it to build AI workflows for their organizations.
But here's what should really concern you: even if you somehow get it to work in Copilot, it won't stay free. Microsoft will hike the price to at least $100,000 per year at some point. You can count on it. There's an old joke in the software industry about why companies like Microsoft call their customers "users": give away free tools until people are hooked, then squeeze them once they depend on the platform. You've already seen this exact pattern with Copilot's pricing. Don't expect it to be the last time.
Is it surprising that projects built with inadequate tools and insufficient expertise are failing? Of course not. It would be surprising if they succeeded.
The Perfect Storm
Now we can see the complete picture of why 95% of AI projects fail.
First, companies and their internal AI champions wanted to own everything. They were seduced by visions of becoming the next Elon Musk—the next visionary who would build the magic system that changed everything. This desire for control blinded them to a simple truth: they didn't have the expertise to do it well.
Second, they reached for tools that were never designed for the work. They used platforms like Microsoft's Copilot that were built by people who didn't understand AI or what AI requires. They used generic software frameworks repurposed for machine learning. They used whatever was cheapest or most readily available, not what was actually appropriate.
Third, and most critically, they lacked the skill and experience to engineer good AI systems. Building production AI is different from running a successful traditional business, no matter how technologically sophisticated that business might be. The principles are different. The debugging process is different. The entire mental model of what constitutes success is different.
The result? Less than 5% of projects delivered results that justified the investment.
It's a tragedy not just for those companies, but for AI itself. AI has genuine power and potential. But when it's deployed poorly, when projects fail repeatedly, when billions are wasted chasing unrealistic goals with inadequate tools and inexperienced teams, AI gets a reputation for being hype rather than substance.
The Unrealistic Expectations Problem
Beyond execution failures, another reason projects collapse is simpler and more preventable: companies have wildly unrealistic expectations about what AI can actually do.
Too many leaders have listened to the hype. They've heard Sam Altman at OpenAI, Elon Musk, and countless other technology evangelists talk about AI as if it's on the verge of replacing human thinking entirely. They've absorbed talk of AGI—Artificial General Intelligence—and the brilliant capabilities of large language models, while largely ignoring the elephant in the room: hallucinations.
AI systems confidently produce false information. They make up facts. They sound perfectly plausible while being completely wrong. But when you're approaching AI through the lens of hype, these limitations seem like minor speed bumps. They're not. They're fundamental constraints that need to shape how you deploy AI technology.
This is where the concept of "Assistive Intelligence" becomes crucial. When you frame AI as a tool that helps you achieve more, rather than one that replaces what you do, you start thinking about AI fundamentally differently. You ask better questions about what AI is actually good at.
AI excels at processing vast quantities of information quickly. It's exceptional at sifting through thousands of documents and finding patterns that would take humans weeks to locate. It's good at asking hundreds of questions in the time it would take a human to ask one. It is genuinely powerful at narrowly defined, specific tasks where the data is clean and the objectives are clear.
But AI is terrible at the things that amateurs often expect from it: replacing human judgment, making consequential decisions independently, or serving as a substitute for human expertise. Almost every project that tries to use AI as a replacement for human workers ends in disaster. The technology simply isn't there. We keep overselling it, and we keep being disappointed when reality doesn't match the marketing.
The path to success is precisely the opposite: identify very narrow, specific tasks where AI can augment human capability. Spend the time to build and train your systems with that specific task in mind. Treat AI as one more check in the system, like the "Swiss cheese" approach to safety, where no single check is sufficient to catch every problem, but multiple overlapping checks catch most of them. Use AI as an extra set of eyes. Don't use it as a replacement for the eyes you already have.
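In code, that layering might look something like the sketch below, where a deterministic rule check, an advisory AI review, and a human fallback each get a say; the function names and the invoice example are illustrative assumptions, not any particular product's API.

```python
# Sketch of the "Swiss cheese" layering described above: the AI check is one
# layer among several, and anything flagged escalates to a person rather than
# being decided by the model alone. ai_review() is a hypothetical model call.
from dataclasses import dataclass

@dataclass
class Decision:
    approved: bool
    reason: str

def rule_checks(invoice: dict) -> Decision | None:
    # Layer 1: cheap, deterministic rules that never rely on a model.
    if invoice["amount"] > 10_000:
        return Decision(False, "over hard limit, needs sign-off")
    if not invoice.get("purchase_order"):
        return Decision(False, "missing purchase order")
    return None  # no objection from this layer

def ai_review(invoice: dict) -> Decision | None:
    # Layer 2 (hypothetical model call): flags anomalies it has been tuned to
    # spot. Treat its output as advisory only, because it can hallucinate.
    return None  # stand-in: assume the model raised no flag

def human_review(invoice: dict) -> Decision:
    # Final layer: a person decides anything the automated layers flagged.
    return Decision(True, "approved by human reviewer")

def process(invoice: dict) -> Decision:
    # Escalate on any objection; never auto-reject on the AI's word alone.
    for layer in (rule_checks, ai_review):
        verdict = layer(invoice)
        if verdict is not None and not verdict.approved:
            return human_review(invoice)
    return Decision(True, "passed all automated layers")

print(process({"amount": 2_500, "purchase_order": "PO-1138"}))
```

The important property is that the AI layer can stay silent, miss something, or hallucinate without anything slipping through, or being rejected, on its say-so alone.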
Who Bears Responsibility
The blame for inflated expectations doesn't rest entirely on the companies that tried to build AI systems in-house. The technology industry has pushed those expectations relentlessly.
Sam Altman has repeatedly overstated what AI can do. The constant drumbeat of AGI talk has created an expectation that AI systems are far more capable than they actually are. Elon Musk has promised autonomous vehicles and artificial general intelligence with a confidence that the underlying technology doesn't yet support. And Microsoft has released half-baked tools that give the impression that AI development is simpler and more approachable than it actually is.
When leaders and technology vendors oversell capabilities, they create a market expecting miracles. When the miracles don't materialize on schedule, disappointment is inevitable. Budgets expand. Timelines slip. Patience erodes. Projects collapse under the weight of unmet expectations that were never realistic to begin with.
At smartR AI, we had to learn what these models could and couldn't do when large language models burst onto the scene with ChatGPT. But we had an advantage: we'd already been working with GPT-2 and other language models for some time. We had built experience with their limitations and quirks. We understood hallucinations because we'd encountered them. We knew about token limits and context windows and the various ways that language models could confidently produce gibberish because we'd already seen it happen dozens of times.
In-house teams building AI for the first time? They're learning all of this from scratch. Results take longer. Budgets expand. Disappointment builds. And by the time they've learned what they need to know, the project is already overbudget, behind schedule, and at risk of cancellation.
The Path to Redemption
Here's the good news: it's not too late. That MIT study showing a 67% success rate when organizations bring in outside expertise isn't just a statistic—it's a roadmap.
But there's a specific kind of outside expertise that works. The study found that smaller vendors with specialized expertise were significantly more effective than the giant consulting houses like McKinsey or Accenture. Why? Because those large firms are learning on your nickel, and they're expensive learners. Their advantage is their scale and their brand, not necessarily their depth of expertise in machine learning. Smaller, independent teams with years of hands-on AI experience? They've already paid the price of learning. They know what works and what doesn't. They recognize failure modes you haven't even thought of yet. They can move fast because they're not rediscovering things from first principles.
Bringing in expertise from outside isn't a failure. It's an acknowledgment of reality. In the engineering and software development communities, there's a powerful cultural bias against asking for help. Admitting you need assistance can feel like admitting you're not smart enough or experienced enough. That bias is costing you billions in failed projects.
The simple "hack" of asking for help gets you to success 67% of the time. That doesn't guarantee every project will succeed, but it's immeasurably better than failing 19 out of 20 times. The math is overwhelmingly compelling.
Your Next Move
If you have an in-house AI project that seems destined to become one of those 19 failures, you still have a window to change course. It's not too late. You need to swallow your pride and ask for help.
Start by bringing in a small team of genuine AI experts—people who've been building production machine learning systems for years, not consultants who've been reading about AI in the news. Ask them to audit your project. Have them assess your approach, your tools, your data, your architecture. Most importantly, ask them whether you're solving the right problem in the right way.
Then, make a decision: do you try to build this capability in-house with expert guidance, or do you partner more deeply with outside specialists? Both paths can work. What doesn't work is pretending you can build production AI systems without expertise, using tools that weren't designed for the job, with expectations that are disconnected from reality.
The executives and engineering managers reading this have built successful products and businesses. You know how to deliver results. But AI is different enough that your existing playbook isn't sufficient. Recognizing that difference isn't weakness. It's the mark of a leader who understands the boundary between confidence and arrogance.
The question isn't whether you can afford to bring in help. The question is whether you can afford not to.
Written by Oliver King-Smith, Founder and CEO of smartR AI