The AI Lab: The Good, the Bad, and the Stuff That Blew Up in My Face
“Your scientists were so preoccupied with whether they could, they didn’t stop to think if they should.”
— Dr. Ian Malcolm, who clearly tried to ship software with AI at some point
In the last blog I talked about why I’m deliberately disrupting my own consulting firm and dragging it toward an AI-first future. That was the aspirational part. The vision part. The “future of the firm” part.
This is the part where I tell you that the reality is a lot messier, a lot dumber, and a lot more humbling than people on LinkedIn would have you believe. And dragging my company into this world has been like taking your teen’s phone away – literally that challenging…
Because over the last 12 months, I’ve been building. A lot. Some of it started with SportsOwl, which many of you already know was my side quest into sports analytics. Some of it turned into Legion. Some of it turned into Onyx AI Labs. Some of it turned into me staring at a screen at 1:30 in the morning wondering why a machine that sounds so confident is acting like a drunken intern with admin rights.
There’s a ton of noise right now around AI. Founders seem to be sorting themselves into two camps. One thinks AI is a miracle cure that’s going to save the business. The other thinks it’s the angel of death coming to wipe out everything they’ve spent twenty years building.
I think both takes are wrong. Or at least incomplete.
AI is not magic. It is not judgment. It is not wisdom. It is not a business model. And it is definitely not some magical little founder-in-a-box that can build you a hardened commercial platform between lunch and happy hour while you smoke cigars and fantasize about your future crypto empire.
That fantasy — the whole “vibecoding” mythology — is doing real damage. It’s convincing people that professional-grade software is now basically just a personality test with a prompt window attached. That if you can describe something with enough swagger, the machine will just sort it out.
It won’t.
Or more accurately: it will sort out just enough of it to make you think it’s working, right up until you realize you’ve built a very attractive shell around a hollow center.
That was one of my first big lessons.
Mistake Number One: Assuming the AI Knew What It Was Doing
When I first got going, I did what a lot of people did. Opened ChatGPT. Started asking it for code. Took that code and dropped it into VS Code. Rinse and repeat.
And to be fair, at first, it felt incredible.
Stuff appeared on the screen. Buttons worked. Pages rendered. Layouts looked good. It felt like cheating. Like I’d somehow stumbled into a warp tunnel where software that used to take months was now showing up in hours. That part is real. The endorphin hit is very real.
The problem is I made a very naive assumption: I assumed the AI had some idea what it was doing.
It did not.
What it had was confidence.
At the time I didn’t understand context windows properly. I didn’t understand drift. I didn’t understand how quickly these tools can lose a train of thought, contradict themselves, forget prior decisions, invent architecture, or go wandering off into some enthusiastic side quest you never asked for. So, I treated the output like it had some coherent internal plan behind it.
That was a mistake. A big one.
Because once you start from a vague “let’s build something cool” prompt, the AI will absolutely take that vague energy and run directly into the woods with mindless abandon. Imagine Gru’s Minions with way too much caffeine. It will add features. It will refactor things that were working. It will confidently suggest paths that are either stupid, impossible, or both. And if you don’t have a clear plan, clear steps, and a clear end state, you are not building. You are just generating chaos at a very efficient rate.
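If you want to see why the drift happens, here’s a toy sketch. This is not how any real model manages its context — the function, the word-counting, and the budget number are all made up for illustration — but the mechanic is the same: the model only “sees” what fits in a fixed window, so early decisions silently fall out.

```python
# Toy illustration of context-window drift: only the newest messages that fit
# a fixed budget stay "visible," so early decisions quietly disappear.

def visible_context(messages, budget=50):
    """Keep the newest messages whose combined word count fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):   # walk newest-first
        cost = len(msg.split())      # crude stand-in for token counting
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

conversation = [
    "DECISION: use PostgreSQL, never SQLite",     # early, important
    "long digression about button colors " * 10,  # filler eats the budget
    "please add the export feature",
]

window = visible_context(conversation)
# The architecture decision no longer fits in the window, so the model can
# "confidently" contradict it on the next turn.
print("DECISION" in " ".join(window))  # False
```

That’s all drift is: the model isn’t lying to you on purpose. It literally cannot see the decision you made an hour ago.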
I had resets. Many. Big messes. Untold frustration. Entire stretches of time where I was basically digging myself out of holes I’d paid for with my own bad assumptions.
That was the first real lesson: AI is not a substitute for thinking.
If anything, it punishes lazy thinking faster than any tool I’ve ever used.
The Good: It Looks Like You’re Winning
To be fair, there is a “good” part to all of this.
Seeing something actually work on screen feels fantastic. If you’ve ever wanted to build software and been blocked by time, cost, teams, complexity, or just the sheer pain in the ass nature of software development, AI can absolutely blast through some of those barriers.
That part is not hype. It’s real.
You can move faster. You can prototype faster. You can test ideas faster. You can get from concept to visible thing with a kind of speed that honestly would have sounded ridiculous not very long ago.
That’s the seductive part.
That’s also the dangerous part.
Because that display-layer progress can fool you into thinking the whole thing is further along than it really is.
The Bad: It Wasn’t Actually Working
This was the next punch in the face.
What looked done often wasn’t done at all. It just looked done.
Mike: ‘Looks good to me!’ ChatGPT: Muahahahaha! Moron!
The front end looked great. The screens were there. The interactions were there. Maybe even a few demo flows worked. Enough to make you think you were cooking.
Meanwhile, underneath it all, there was a decent chance the backend wasn’t really there, the wiring wasn’t complete, the flows weren’t durable, the architecture was shaky, and the whole thing had the structural integrity of the Maple Leafs in the playoffs…
That’s the thing AI is frighteningly good at: making something appear complete.
It can produce the illusion of completion better than almost any technology I’ve seen.
Which is great if you want a mockup.
Less great if you want a company or a product.
The Ugly: Security, Infrastructure, and All the Stuff AI Can’t Magically Wish Into Existence
Then came the truly ugly part.
You start realizing that even if the thing looks good, and even if parts of it work, you still have all the real-world problems sitting there waiting for you like land mines.
Infrastructure. Security. Hosting. Authentication. Permissions. Payments. Deployment. Protocols. Hardening. Edge cases. The endless list of boring but absolutely essential steps required to turn a “look what I built” demo into something an actual customer can safely use.
And let me tell you, AI is more than happy to help you create something with security holes you could drive a tank through.
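To make that concrete, here’s the single most classic hole these tools generate: SQL built by string formatting. The table and function names below are mine, not anything from my actual codebase, but the pattern is one you will see an assistant produce unprompted, right next to the boring, correct parameterized version.

```python
import sqlite3

# The kind of lookup an AI assistant will cheerfully generate: string-built SQL.
def find_user_unsafe(conn, username):
    # A username like  nobody' OR '1'='1  turns this into "return every row"
    query = f"SELECT name FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

# The boring, correct version: let the driver handle the value as data.
def find_user_safe(conn, username):
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("mike",), ("admin",)])

payload = "nobody' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2 -- injection dumps every user
print(len(find_user_safe(conn, payload)))    # 0 -- treated as a literal string
```

Both versions render the same screens and pass the same happy-path demo. Only one of them survives contact with the public internet.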
That was one of the uglier realizations for me. Not just that parts of what I’d built were hollow, but that parts of it were hollow in dangerous ways. The kind of dangerous that’s fine in a sandbox and catastrophic in public.
This is where the founder fantasy really starts to collapse.
Because publishing something is not the same as finishing something.
Making it available is not the same as making it real.
And “it works on my machine” remains one of the dumbest sentences in technology, regardless of how many LLMs you strap onto the workflow.
What the Tools Actually Taught Me
Each tool ended up teaching me a different lesson, usually by punching me in the face in a slightly different way.
ChatGPT taught me about hallucination and drift.
GCP taught me about infrastructure, security, and just how quickly cloud complexity shows up once you stop messing around.
Cloudflare taught me about web security and protocols.
Stripe taught me that payments are not some cute little widget you toss on top at the end. They’re an actual system and they matter.
Cursor taught me that sometimes you need to stop pretending the AI is going to sort it out and just open the IDE and fix the bloody line yourself because the machine is not seeing the problem.
That was an important mindset shift for me.
The goal is not to sit there like some AI priest hoping the model will bless the codebase.
The goal is to use the tools well, know when they’re helping, know when they’re lying, and know when to grab the wheel.
What Finally Started Working
Things improved when I stopped treating AI like an oracle and started treating it like a fast, talented, unreliable collaborator that needed adult supervision.
Claude Code: ‘Why didn’t we add an ice cream dispenser?’ Mike: ‘Shut up.’
The biggest change was not some giant technical breakthrough. It was discipline.
Better planning. Clearer tasks. Stricter context. Better boundaries. Less vague wandering. Less “build me this awesome platform.” More “do this exact thing, in this exact way, within these exact constraints.”
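Here’s the shape of that shift, sketched in code. None of this is a real API — the function and the field names are just my own way of showing what a disciplined brief contains versus what a vague one doesn’t: a single goal, explicit file boundaries, explicit constraints, and a testable definition of done.

```python
# The difference between how I used to prompt and how I brief tasks now.
# This is a sketch of the *shape* of a tight brief, not a real tool or API.

vague = "Build me an awesome analytics platform."

def task_brief(goal, files, constraints, done_when):
    """Assemble a tightly-scoped prompt: one task, explicit boundaries."""
    return "\n".join([
        f"GOAL: {goal}",
        f"TOUCH ONLY: {', '.join(files)}",
        "CONSTRAINTS: " + "; ".join(constraints),
        f"DONE WHEN: {done_when}",
    ])

tight = task_brief(
    goal="Add a CSV export button to the existing report page",
    files=["report_page.py", "export.py"],
    constraints=[
        "do not refactor working code",
        "no new dependencies",
        "no schema changes",
    ],
    done_when="export matches the on-screen table, covered by one test",
)

print(tight)
```

The vague version invites the Minions into the woods. The tight version gives the model nowhere to wander.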
Once I got more serious, the outcomes got more serious too. Funny how that works.
Once I moved to Claude Code with skills and tighter context discipline, things started moving forward in a far less stupid way. Not perfect. Not magical. But meaningfully better. That’s actually going to be the next blog, because I think the operational side of how to work with these tools is where a lot of the real value is.
The Real Point
If you’re a consulting founder reading all this and wondering whether AI is going to save your firm or kill it, I think you’re asking the wrong question.
The real question is whether you are capable of using it without fooling yourself.
Can you build with it honestly?
Can you separate a prototype from a product?
Can you tell the difference between visible progress and actual progress?
Can you accept that velocity without governance is just a faster route to a bigger mess?
Because that, for me, has been the real lesson of the last year.
Building Onyx AI Labs is less about models and more about governance, iteration discipline, and surviving your own bad assumptions.
That’s not as sexy as the internet wants this story to be. But it is a lot closer to the truth.
I’ve built things I’m proud of. I’ve also built things that were basically digital movie sets — looked great from the front, nothing behind the door.
Explosions. That’s part of the process in any good lab. Maybe. Or at least it was for me.
Next time I’ll talk about how I’m actually working now, what I tightened up, the tools and methods that work for me. Yes – I will spill the magic beans.