#122
Stop selling AI like it's a layoff machine, notification systems for MCP, and a contrarian take on Google I/O.
Welcome to another edition of our Builder Series Newsletter, where we dive into some of the challenges in building with AI, share research and wrap up all the latest AI news from across the world.
Stop Selling AI Like It's a Layoff Machine
The pressure to justify the staggering sums invested in artificial intelligence is real, and using it to cut costs and reduce headcount seems to be the only argument that reaches the CFO’s table. But if history is any guide, this approach will destroy as much value as it creates, and the people responsible will never be held to account.
The Doorman Fallacy
A consultant walks into a hotel and asks how much the doorman is paid. Perhaps it’s a modest five-figure salary. The consultant defines the doorman’s role as narrowly as possible, opening the door, proposes replacing him with an automatic mechanism, claims the cost saving, and moves on. What is never accounted for is everything else the doorman actually did: recognising regular guests, hailing taxis, quietly deterring trouble, lending the hotel a sense of occasion. None of that appears on a spreadsheet, so none of it is protected.
Another example is the introduction of self-checkout tills. As an option for customers in a hurry, they were perfectly reasonable. The problem came when finance departments decided that getting customers to do the work themselves was cheaper than employing till operatives, and what began as a choice became an obligation. Shoplifting surged, large family shops became an ordeal, and the human interaction many customers depended on evaporated. The cost saving was real and immediate. The value destruction was just as real but slower to show up. Now, most supermarket chains have raised the percentage they assume is lost to shoplifting from 1-2% to 3-4%, potentially eating the margin gains from introducing self-checkout.
This asymmetry defines much of modern business decision-making. You can claim credit for a cost reduction on the day it is implemented. The damage accumulates over years and lands on someone else’s budget.
Tangible vs. Non-Tangible
Beneath this pattern lies a deeper divide. Peter Drucker argued that only two business functions create value: marketing and innovation. Everything else is a cost. This view is rooted in the Austrian school of economics, which holds that value is subjective, existing in the minds of those who experience it, not in objects themselves. In this framework, creating the conditions in which something can be appreciated is just as productive as manufacturing the thing itself. Marketing is value creation, not overhead.
The competing Chicago school of economics assumes people already know what they want and can price it accurately, so the only useful thing a business can do is deliver it as cheaply as possible. Marketing, in this model, is at best a necessary evil.
The latter won, not because it is correct, but because it fits on a spreadsheet. Abstract concepts like trust, perceived value, and emotional resonance resist quantification in a way that cost-per-unit and conversion rates do not. In a world governed by quarterly reporting, measurable short-term metrics will always crowd out things that matter in the long term but are harder to see.
Customer Success by humans MATTERS!!
Against this backdrop, long-tail value carries several uncomfortable truths that the business world has chosen to ignore at scale:
Different is often better than better. As AI floods every sector, the businesses that retain genuine human interaction will stand out sharply. This is not nostalgia; it is strategy.
Human contact dominates customer satisfaction. Royal Mail spent heavily on improving delivery reliability, expecting brand perception to follow. It had no effect. Researchers found no correlation between service reliability and how much people liked the brand. What actually determined their attitude was whether they liked their postman. Customers whose postman left a parcel in the back porch while they were on holiday thought Royal Mail was wonderful. Those with an unfriendly postman did not, regardless of punctuality. This is consistent across service industries: the human element is disproportionately responsible for how customers evaluate their experience, and it is systematically underinvested in because its value is slow to accumulate and hard to attribute.
More channels mean more sales. Driving customers to the lowest-cost channel saves money on the spreadsheet while destroying revenue in reality. One online travel company found that website visitors converted at 0.5%. Phone callers converted at 30%. Hiding the phone number was not a cost-saving measure. It was revenue destruction dressed up as one.
Three Phases of AI Justification
Phase One: The Same, But Cheaper. Already underway. AI is sold as a cost-reduction tool because that is the easiest message to take to a CFO. Some applications will be genuinely useful. Many will quietly destroy value in ways that only become apparent years later, long after the people who implemented them have moved on.
Phase Two: The Same, But Better. Eventually, some organisations will ask not how to do this more cheaply, but how to use AI to genuinely improve the customer experience. This phase is more likely to happen in founder-led or family-owned businesses. Four of the five 2024 IPA Advertising Effectiveness Award winners were family-owned: McCain, Laithwaites, Yorkshire Tea, and Specsavers. Family businesses can pursue long-term objectives without the tyranny of quarterly reporting, a freedom that may prove a decisive competitive advantage.
Phase Three: Something Entirely New. The most transformative phase will not be automating existing processes but reinventing them entirely. When the electric motor was invented, factories initially just replaced their steam engines with electric ones. Gains were modest. The real revolution came when engineers realised small electric motors were efficient and practical, and rebuilt their entire production logic around that fact. AI will follow the same curve. The businesses that extract the most value will be those that build something that could not have existed before, and go all in.
A parallel is how we build software today compared with the past. The AI software factory is not about cost-cutting; it is about doing more while easing the traditional trade-offs between quality, speed, and cost and, at a lower level, skills and knowledge. There was always natural attrition, and that will continue, but let’s not dress it up as magic AI replacement.
Sell How You Think, Not What You Do
Selling and marketing in the AI era needs a different prism.
Marketers typically defend their existence by pointing to what they produce: campaigns, conversions, brand metrics. This is a losing position that keeps them permanently on the defensive, justifying budgets to finance departments focused on the short term.
The real value of marketing is not what it produces. It is how it thinks. The marketer’s fundamental contribution is the instinct to view any decision from the perspective of the person who will ultimately experience it, what Ritson calls the 180-degree flip.
Without this perspective, organisations make decisions that are rational by every internal metric and catastrophically wrong in human terms. The Concorde is the perfect illustration. An engineering triumph, and yet it contained a flaw no engineer apparently spotted, because it was a human problem, not a technical one. Flying London to New York was extraordinary. The return leg was a misery: an extra night in a hotel, an early start, an entire working day spent in the air, landing exhausted. The engineers optimised for speed. They missed the experience entirely.
This is what marketing thinking prevents. And the market for that kind of thinking, it turns out, is far larger than the market for marketing output, which is perhaps the final irony of an industry that has spent decades underselling itself.
Final Note:
“You are selling change.” A decade on, those four words still land. Change can build or burn, and it compounds at a different rate in each case. The CFO who signs off on a headcount reduction today will be gone before the damage shows up. That is not a coincidence; it is a system.
The real case for AI is not subtraction. It is multiplication. There is a 10 million developer shortage AI can close. A 2 million QA gap it can fill. An epidemic of burnout in customer service it can genuinely ease. These are additions to human capacity, not replacements for it.
If AI is truly core to your business, cost-cutting should be an afterthought, not the headline. The companies that win will be the ones who used AI to do what was previously impossible and grow the top line in ways they couldn’t have imagined, while the rest were busy counting the savings.
The good news is that some companies are recognising this.
MCP Without a Notifications System Is a Rube Goldberg Machine
There is a moment every developer using an AI coding agent will recognise. You trigger a tool call. Nothing happens. Or rather, something is happening, but you have no idea what, or how long it will take, or whether it has silently failed. You wait. You wait longer. You intervene. The environment was down the whole time.
This is not a minor inconvenience. It is a fundamental breakdown in the feedback loop that agentic coding depends on. And it is the problem we faced with our agent at Kerno.
The Problem in Plain Terms
You have asked Claude to refactor an authentication service, consolidating three legacy endpoints into a single unified route. Claude writes the code, it looks clean, and now it is time to validate. The developer triggers Kerno through MCP. Kerno needs to spin up the Docker Compose environment, wait for the database and cache to become healthy, boot the application, capture baseline responses across the affected endpoints, and diff them against the new implementation.
That is real work that takes real time. And until now, from the developer’s side and the agent’s side, it produced complete silence.
Is Kerno waiting on Postgres to start? Has it already captured baselines and stalled on the diff? Did Docker fail to pull an image? Is it finished and just not responding? There was no way to know without polling kerno_job repeatedly, adding noise to an already slow loop.
The second failure mode is more serious. During testing, a developer asked the agent to validate changes to a Spring Boot service backed by an H2 database. The Docker Compose environment went down mid-session. The application was unreachable. But neither Claude nor Kerno detected this. The agent kept attempting to validate. The developer kept waiting. Nobody knew the environment was gone until a human looked directly at the Docker logs.
In a workflow built to reduce cognitive load on the developer, requiring a human to diagnose silent infrastructure failure is exactly the wrong outcome.
Both failures share a root cause: the MCP server had no way to communicate state back to the client in real time.
What Shipped
The PR that just merged into Kerno’s main branch addresses both failure modes in a single, tightly scoped change. Two things were built.
Job log streaming via MCP notifications. Background job log entries are now forwarded to the connected MCP client as protocol-level notifications. A client, whether that is Claude Code, Cursor, or any MCP-compatible agent, can subscribe to a long-running job’s progress without polling kerno_job. Each log entry carries a jobId for routing and a terminal flag that signals end-of-stream. When a job completes, McpActiveSession.emitTerminal(jobId) closes the stream cleanly on the producer side.
In practice, this means the authentication refactor scenario described above now looks completely different. The developer triggers validation. Signals start flowing back immediately:
[kerno] Environment starting
[kerno] Postgres healthy on port 5432
[kerno] Application booted on port 8080
[kerno] Capturing baseline: GET /auth/token
[kerno] Capturing baseline: POST /auth/refresh
[kerno] Capturing baseline: DELETE /auth/session
[kerno] Diff detected: POST /auth/refresh — field "expires_at" missing in response
[kerno] Validation complete
The agent receives every one of those log lines in real time as MCP notifications. It does not need to poll. It does not need to wait for a final result. It can act on the diff signal the moment it arrives, locate the dropped field in the refactored code, and surface a fix before the developer has even processed the output themselves.
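To make the routing concrete, here is a minimal client-side sketch in Python. It is not Kerno's code: the LogNotification shape and the collect_job_logs helper are hypothetical, and the only details taken from the description above are that each entry carries a jobId and a terminal flag that marks end-of-stream.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LogNotification:
    """Hypothetical shape of one forwarded log entry (an assumption, not Kerno's schema)."""
    job_id: str             # used to route the entry to the right job
    message: str
    terminal: bool = False  # True marks end-of-stream for that job


def collect_job_logs(notifications: Iterable[LogNotification]) -> Dict[str, List[str]]:
    """Group messages per job and stop recording a job once its terminal entry arrives."""
    logs: Dict[str, List[str]] = {}
    finished = set()
    for note in notifications:
        if note.job_id in finished:
            continue  # ignore anything that arrives after end-of-stream
        logs.setdefault(note.job_id, []).append(note.message)
        if note.terminal:
            finished.add(note.job_id)
    return logs


# Two interleaved jobs; job-1 finishes, then a stray late entry shows up.
stream = [
    LogNotification("job-1", "Environment starting"),
    LogNotification("job-2", "Environment starting"),
    LogNotification("job-1", "Postgres healthy on port 5432"),
    LogNotification("job-1", "Validation complete", terminal=True),
    LogNotification("job-1", "late entry"),
]
result = collect_job_logs(stream)
```

The point of the terminal flag is that the consumer never has to guess whether a quiet job is finished or merely slow: it stops listening only when told to.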
Idle session reaping. The second change solves a subtler but equally big problem. The MCP SDK only removes a session when a client sends an explicit DELETE request. Clients that crash or restart abruptly, which includes Claude Code, do not send one. The result is that dead sessions accumulate indefinitely, each one holding a server instance, a full tool registry, and references to shared context objects. Given enough restarts over the course of a working day, this becomes a genuine resource problem.
The fix introduces activity tracking per session, with three new configurable parameters: a session idle timeout of ten minutes, a reaper interval of two minutes, and a maximum session cap of 1024 with LRU eviction. A background coroutine scoped to the application lifecycle checks for idle sessions and closes them cleanly. The eviction logic is written to be race-safe: the idle check happens inside a ConcurrentHashMap.compute block, so a request arriving at exactly the moment a session is being reaped cannot cause an active session to be dropped.
In the Docker Compose failure scenario, this means Kerno will now detect and clean up a session that has gone silent, rather than leaving the agent waiting against a dead connection indefinitely.
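The reaping rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the merged implementation: SessionRegistry, touch, and reap are hypothetical names, a plain lock stands in for the atomicity that ConcurrentHashMap.compute provides on the JVM, and the clock is passed in explicitly so the behaviour is easy to follow. The two-minute reaper interval would simply be a timer that calls reap.

```python
import threading
from collections import OrderedDict
from typing import List


class SessionRegistry:
    """Hypothetical registry: idle timeout plus a session cap with LRU eviction."""

    def __init__(self, idle_timeout_s: float = 600.0, max_sessions: int = 1024):
        self.idle_timeout_s = idle_timeout_s
        self.max_sessions = max_sessions
        self._last_seen: "OrderedDict[str, float]" = OrderedDict()  # least recently used first
        self._lock = threading.Lock()

    def touch(self, session_id: str, now: float) -> List[str]:
        """Record activity; return the ids evicted to stay under the cap."""
        evicted = []
        with self._lock:
            self._last_seen[session_id] = now
            self._last_seen.move_to_end(session_id)  # mark as most recently used
            while len(self._last_seen) > self.max_sessions:
                oldest, _ = self._last_seen.popitem(last=False)
                evicted.append(oldest)
        return evicted

    def reap(self, now: float) -> List[str]:
        """Remove sessions idle for at least the timeout; return their ids."""
        reaped = []
        with self._lock:  # idle check and removal happen as one atomic step
            for sid, seen in list(self._last_seen.items()):
                if now - seen >= self.idle_timeout_s:
                    del self._last_seen[sid]
                    reaped.append(sid)
        return reaped

    def active(self) -> List[str]:
        with self._lock:
            return list(self._last_seen)


registry = SessionRegistry(idle_timeout_s=600, max_sessions=2)
registry.touch("a", now=0)
registry.touch("b", now=100)
evicted = registry.touch("c", now=200)  # cap of 2 is exceeded, so the LRU session goes
registry.touch("b", now=650)            # "b" stays active
reaped = registry.reap(now=800)         # "c" has been idle for 600 seconds
```

Because touch and reap take the same lock, a request that refreshes a session either lands before the idle check, and keeps the session alive, or after the removal, and is handled as a new session. It can never land in between.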
Why the Implementation Details Matter for DX
The review process surfaced a few edge cases worth understanding, because they illustrate how subtle the failure modes can be at this layer.
One issue identified was a subscription leak in fire-and-forget tools. When the session reaper evicts a session, notification subscriptions for running jobs could keep attempting to send log messages to a dead session until the job itself emitted a terminal flag. The kerno_job tool handled this correctly with a cancelAndJoin() in a finally block, but other tools did not. The fix was to tie subscription lifecycle directly to the SDK server’s onClose callback, so that reaping a session automatically cancels all associated subscriptions.
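The shape of that fix, sketched in Python with hypothetical Session and Subscription classes (the real code uses the SDK server's onClose callback and coroutine cancellation): every subscription registers its own cancellation with the session at creation time, so no individual tool has to remember to clean up.

```python
from typing import Callable, List


class Subscription:
    """Hypothetical handle for one job's log forwarding."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.active = True

    def cancel(self) -> None:
        self.active = False


class Session:
    """Hypothetical session that runs registered callbacks when it closes."""

    def __init__(self) -> None:
        self._on_close: List[Callable[[], None]] = []
        self.closed = False

    def on_close(self, callback: Callable[[], None]) -> None:
        self._on_close.append(callback)

    def subscribe(self, job_id: str) -> Subscription:
        sub = Subscription(job_id)
        self.on_close(sub.cancel)  # lifecycle is tied to the session, not to the job
        return sub

    def close(self) -> None:
        if self.closed:
            return  # closing twice is harmless
        self.closed = True
        for callback in self._on_close:
            callback()
        self._on_close.clear()


session = Session()
first = session.subscribe("job-1")
second = session.subscribe("job-2")
session.close()  # for example, triggered by the reaper evicting the session
```

After close, both subscriptions are cancelled even though neither job has emitted a terminal flag.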
Another edge case: AgentLogger.logs is a shared flow with no replay buffer. If a job finishes and emits its terminal flag before the notification subscriber has connected, the subscription will never self-cancel. The onClose guard bounds this as a resource concern rather than a correctness issue, but it is a reminder that in asynchronous notification systems, timing assumptions can create invisible leaks.
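The timing problem is easy to reproduce with a toy broadcaster. The sketch below is a synchronous Python stand-in for a shared flow with no replay buffer; NoReplayBroadcast is a hypothetical class, not AgentLogger. A subscriber that attaches after the terminal event was emitted simply never sees it.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


class NoReplayBroadcast:
    """Delivers each event only to subscribers attached at emit time; nothing is stored."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def emit(self, event: Event) -> None:
        for handler in list(self._subscribers):
            handler(event)


bus = NoReplayBroadcast()
early: List[Event] = []
late: List[Event] = []

bus.subscribe(early.append)
bus.emit({"jobId": "job-1", "terminal": True})  # job finishes immediately
bus.subscribe(late.append)                      # subscriber connects too late
```

The late subscriber is left waiting for a terminal flag that will never arrive, which is why something outside the stream, such as the session's close callback, has to be able to cancel it.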
These are exactly the kinds of problems that only surface when you build a genuinely live, streaming communication layer rather than a polling wrapper. Getting them right is what separates infrastructure that holds up in production from infrastructure that works in demos.
The Broader Principle
The deeper point here is about what the feedback loop actually requires to function. Speed matters, but transparency matters just as much. A fast tool call that fails silently is worse than a slow one that tells you what it is doing, because at least the slow one gives you something to work with.
For AI coding agents in particular, the quality of context they receive determines the quality of the decisions they make. An agent that does not know the environment is down will keep trying to validate against it, burning tokens on work that cannot succeed. An agent that receives a clear notification (“compose service unreachable”, “Postgres failed to bind on port 5432”, “port already in use”) can stop immediately, explain the problem to the developer, and wait for resolution. That is the difference between an agent that compounds a problem and one that surfaces it.
As agentic workflows become more capable and more autonomous, this communication layer becomes more important, not less. The more an agent is expected to operate independently, the more it depends on accurate, timely signals about the state of the world it is operating in. Autonomy without awareness is not intelligence. It is just a faster way to fail.
Getting MCP notifications right is not a backend engineering detail. It is a product decision about how much developers can trust the tool that is supposed to have their back.
Validate code in flow
Switching gears…
Meme of the week
From a 1979 IBM training manual
News, views and more research
How much uncapped token time have we left? 6-12 months?
Remember when Uber was $4 per trip across SF?
Microsoft has cancelled its Claude Code Enterprise subscription because costs were becoming unsustainable. Uber’s CTO revealed that they have already burnt through all of their 2026 token spend.
The illusion of building is catching up with us.
A contrarian take on Google I/O
Here’s a contrarian take on Google I/O 2026
“The Biggest Search Upgrade in 30 Years” Is Mostly Just… More AI
Google dedicated the vast majority of its keynote to new Gemini models, and the fanfare was enormous. But look past the marketing: nearly every announcement revolved around making Gemini less of a chatbot and more of an always-present assistant, a direction that was not exactly a surprise or revolutionary.
Gemini Spark: Your New 24/7 Surveillance Buddy
Gemini Spark is framed as a proactive, 24/7 personal AI agent, which sounds helpful until you realise it means Google now wants a permanent, always-on window into your emails, messages, calendar, and behaviour. Users on tech forums raised persistent privacy concerns, and one commenter joked that “we’re about 2 years away from my calculator asking if I want an AI-generated summary of 7 + 5.”
Credit where credit is due: Gemini is the best model, imho, for multimodal work involving files, videos, or transformative work across modalities (document to video, image to text).
Universal Cart: Google Inserts Itself Between You and Every Purchase
Universal Cart feels like Google trying to position itself between users and online retailers in much the same way Search sits between users and websites. Instead of simply helping people find products, Google increasingly wants to become the layer that manages the entire shopping process.
Engadget put it bluntly: “Would you trust an AI agent with your credit card and a shopping list? I, uh, have a lot of questions.”
The Open Web Is Getting Quietly Strangled
Following I/O 2026, critics argued that the real issue isn’t whether blue links still exist in search results — it’s whether users still need them. Data cited from the Reuters Institute showed traffic sent from Google to publishers has already fallen significantly, and that decline happened before the arrival of Google’s new autonomous AI agents. Google’s rapid, defensive public response to this backlash was itself telling.
Gemini Is Becoming “the New Copilot of Android”
A widely upvoted thread on tech forums called Gemini “the new Copilot of Android” — a reference to Microsoft’s notoriously unwanted AI integration. Many users argued that this is all too much, and noted that so much of Gemini has been locked behind paid tiers anyway, meaning most of it isn’t available to regular users.
The Bottom Line
Google I/O 2026 was less a developer conference and more a declaration of intent: Google I/O has fully transformed into the company’s annual AI roadmap presentation. The company is betting its entire future on Gemini dominating every digital touchpoint in your life — search, shopping, email, glasses, science.
Tooling Corner
Some free, open-source and paid tools (by startups) worth exploring.
Charlemagne Labs | One agent - end to end privacy
Phishing defense, DLP, incident visibility, and agentic observability, powered by a local SLM your team can trust.
Thanks for reading. For more AI Builder Series editions, subscribe here.
AI Builders is sponsored by











does kerno also solve for the semantics when the client isn't there?
progress logs for a live call are one thing. durable wakeups for future state are another. if the client drops, do you replay, ack, retry, or just lose the signal?