Quiet AI: shipping value, not buzzwords
Most AI projects fail because they start from the technology and look for a problem. Here’s how we flip that — and the four-question test we use before quoting any AI work.
The current AI cycle has trained a lot of teams to start from the technology — “we have GPT-4, where can we use it?” — and look for a problem. This is exactly backwards. It’s also why most AI initiatives quietly disappear after the launch announcement.
We start AI engagements with four questions, in this order. If we can’t answer all four with confidence, we don’t take the project.
1. What hour are we trying to give back?
Every useful AI feature replaces a specific hour of human time. Customer-support agents replace the hour someone spends answering “what’s your return policy?” in WhatsApp. Document-summarization pipelines replace the hour someone spends reading 40-page contracts. Content-moderation classifiers replace the hour someone spends scanning 500 user submissions for spam.
If you can’t name the hour, you’re not building an AI feature. You’re building a demo.
2. What does failure cost?
This determines architecture. A customer-support agent that occasionally gives a wrong answer about return policy is annoying. A medical-triage agent that occasionally misclassifies symptoms is dangerous.
The cost of failure decides:
- Whether you need a human-in-the-loop step (yes if cost-of-failure is high).
- Whether you can use a single LLM call or need a chain with verification (chain if stakes are real; see the sketch after this list).
- Whether you can ship in 4 weeks or need 12 (longer if guardrails are non-negotiable).
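To make the second point concrete, here is a minimal sketch of the difference in Python. The `call_llm` helper, the `Answer` type, and the prompts are illustrative placeholders, not any particular library's API or the wording we ship. The shape is what matters: one call when failure is cheap, a draft-plus-verify chain with a human escalation path when it isn't.

```python
# Illustrative sketch: single call vs. a draft -> verify -> escalate chain.
# call_llm() is a stand-in for whatever model client you actually use.

from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    needs_human: bool = False


def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (hosted API, local model, ...)."""
    raise NotImplementedError


def answer_low_stakes(question: str) -> Answer:
    # Low cost of failure: one call, ship the answer directly.
    return Answer(text=call_llm(f"Answer the customer question: {question}"))


def answer_high_stakes(question: str) -> Answer:
    # High cost of failure: draft, then verify with a second pass,
    # and hand off to a human whenever the verifier is not confident.
    draft = call_llm(f"Answer the customer question: {question}")
    verdict = call_llm(
        "Does this answer contain only claims supported by our policy docs? "
        f"Reply PASS or FAIL.\n\nQuestion: {question}\nAnswer: {draft}"
    )
    if verdict.strip().upper().startswith("PASS"):
        return Answer(text=draft)
    return Answer(text="", needs_human=True)  # human-in-the-loop step
```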
Teams that skip this question end up either over-engineering low-stakes features or under-engineering high-stakes ones.
3. Who actually answers when it breaks?
Every AI system breaks. The question is what happens next. If the answer is “the founder Slacks the engineer at midnight,” you don’t have a production system — you have a fragile prototype.
We won’t ship an AI feature without:
- A clear failure mode (graceful fallback to a deterministic path; sketched after this list).
- Monitoring that catches drift before users do.
- An on-call playbook the operating team can actually execute.
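Here is roughly what the first two items can look like in code, reusing the `Answer` type and `answer_high_stakes` chain from the sketch above. The `log_metric` helper is a placeholder for whatever monitoring backend you run; the point is that every outcome (answered, handed off, errored) emits a signal, and every failure degrades to a deterministic reply rather than a broken conversation.

```python
# Illustrative sketch: graceful fallback to a deterministic path, with a
# metric emitted so drift shows up on a dashboard before users notice.
# Assumes Answer and answer_high_stakes() from the previous sketch.

import logging

logger = logging.getLogger("support_agent")

CANNED_FALLBACK = (
    "I couldn't find a reliable answer for that, so I've forwarded "
    "your question to our support team."
)


def log_metric(name: str, value: float) -> None:
    """Placeholder: send to whatever monitoring backend you run."""
    logger.info("metric %s=%s", name, value)


def answer_with_fallback(question: str) -> str:
    try:
        answer = answer_high_stakes(question)
        if answer.needs_human:
            log_metric("support_agent.handoff", 1)
            return CANNED_FALLBACK
        log_metric("support_agent.answered", 1)
        return answer.text
    except Exception:
        # Any model or network failure degrades to the deterministic path,
        # never to a stack trace in the customer's chat window.
        logger.exception("agent failure, falling back")
        log_metric("support_agent.error", 1)
        return CANNED_FALLBACK
```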
This is the unglamorous part of AI work, and it’s the part that determines whether the system is still working in six months.
4. What does the model not need to do?
The mistake we see most often: trying to make one AI feature do everything. The chatbot answers questions, books appointments, processes refunds, and writes marketing copy. The result is a system that does all of those things badly.
Good AI work is narrowly scoped on purpose. It does one thing well, with clear edges where it hands off to deterministic code or a human.
The narrower the scope, the more reliable the system, and the more confidently you can ship it.
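A rough sketch of what those edges look like, with illustrative category names and placeholder helpers rather than anything client-specific: a cheap router decides whether a question is in scope, and everything outside that set goes straight to a person.

```python
# Illustrative sketch: a narrowly scoped agent with explicit edges.
# A few categories are in scope; everything else goes to a human.

IN_SCOPE = {"returns", "sizing", "shipping"}  # illustrative names


def classify(question: str) -> str:
    """Placeholder: a cheap router (keyword rules or a small model call)."""
    raise NotImplementedError


def scoped_agent(question: str) -> str:
    """Placeholder: the verified, monitored agent from the sketches above."""
    raise NotImplementedError


def hand_off_to_human(question: str) -> str:
    """Placeholder: forward the conversation to a person and say so."""
    return "Passing you to a teammate who can help with that."


def handle(question: str) -> str:
    category = classify(question)
    if category in IN_SCOPE:
        return scoped_agent(question)
    return hand_off_to_human(question)  # the clear edge: out of scope means a person
```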
What this looks like in practice
A recent client wanted to “use AI for customer support.” We narrowed the scope to: a WhatsApp agent that answers questions about three specific product categories, with a human handoff for everything else.
That narrow agent shipped in 4 weeks, runs reliably, and handles 60% of inbound volume. A “general AI customer support” project for the same business would have taken 6 months, drifted in production, and probably been quietly retired within a year.
Quiet AI. Less ambition per feature, more value per ship.
If you’re thinking about AI automation and want a calibrated take rather than a sales pitch, start a project and we’ll have an honest conversation about whether AI is the right tool for what you’re trying to do.