Consider, for a moment, the ambient dread of the contemporary open-plan office, which has somehow achieved a state of total kinetic standstill despite every individual in the room generating syntax at roughly the speed of sound.
If you look at the telemetry, specifically the almost suspiciously encouraging data DX CTO Laura Tacho presented at the Pragmatic Summit, an aggressive audit of the behavior of some 121,000 developers, the baseline metrics look like a total, unmitigated triumph of the techno-optimist narrative. Almost 93% of engineers are now using AI coding assistants. A head-spinning 27% of all production code currently being pushed into repositories is AI-authored.
You look at those numbers on a glossy slide deck and your intuitive, late-capitalist reflex is to expect a corresponding, vertical rupture in organizational output. You expect the mythical 10x developer to have finally, alchemically, scaled into the 100x team.
Instead, what we are actually staring at is a brutally stubborn asymptote. The realized organizational productivity gain, the actual, tangible velocity of the business, has flatlined at roughly 10%.
The reason, once you see it, is almost embarrassing in its simplicity: we didn't build a shared system. We handed every individual contributor a private oracle and called it an AI strategy.
Every engineer on that floor has spent weeks quietly developing their own context rituals. Their own magic prompts. Their own custom configurations. Their own hard-won method for explaining five years of architectural decisions to a model with no memory of why any of them were made. And none of it - none of the breakthroughs, the refined workflows, the accumulated organizational knowledge painstakingly fed into a 128k context window - is visible to the person sitting two desks away.
We have successfully automated the isolation of the developer. Peer collaboration among developers has plummeted by nearly 80% since AI coding assistants became standard issue. We haven't built an exponential engineering army. We have built a floor full of furiously prompting monks, each one independently brilliant, collectively incoherent.
The Speed Trap and the Ontology of the Duct-Taped Monolith
The reason the 10% asymptote is so stubborn becomes visible the moment you sit behind a developer and watch them work. The model can synthesize two hundred lines of structurally flawless syntax in about four seconds. The initial burst of kinetic energy is intoxicating.
But this is where the illusion of velocity fractures against the jagged rocks of corporate reality.
As Martin Fowler recently observed regarding the friction of AI-assisted development, these models act like junior developers with infinite energy and absolutely zero context. The AI knows Python, it knows Go, it knows the abstract Platonic ideal of a microservice. It does not know that your specific payment processing pipeline is structurally dependent on a deprecated API that was duct-taped together by a guy named Todd in 2019 who now lives in a yurt in Oregon.
This creates the Ontology Gap. The AI writes code for a frictionless vacuum. The senior engineer spends forty-five agonizing minutes refactoring, debugging, and hammering that pristine, spherical syntax into the deeply compromised historical reality of the company's actual codebase. The labor hasn't disappeared; it has merely been transubstantiated from the act of creation to the act of janitorial integration.
But here is the part nobody says out loud: this is happening on every machine on the floor, simultaneously, in total isolation. The developer who finally figured out how to explain the payment service's retry logic to the model - the exact phrasing, the right context to include, the ADR to reference - keeps that knowledge in a browser tab that will be closed by end of day. Tomorrow, the engineer sitting next to them starts from zero. The day after that, the new hire starts from zero again. The breakthrough is non-transferable by design.
McKinsey's research on enterprise AI shows the organizations realizing a 20% EBITDA uplift are not using better models than their competitors. They are building shared context. The difference is not the quality of the individual oracle; it is whether the oracle's knowledge compounds across the organization or evaporates at tab close.
The Unbillable Janitor and the Non-Engineering Desert
The localized friction of this setup is exhausting, and expensive in ways that never show up in any sprint metric.
To get the AI to generate something contextually useful, the developer has to engage in a grueling, unbillable ritual of manual context-loading. They are dragging and dropping tickets, copying terminal error logs, pasting abandoned Slack threads, desperately trying to teach the model the architectural history of the last three sprints inside a context window that wipes its memory the second the tab closes.
When this developer eventually leaves - and they will - their entire accumulated setup walks out the door with them. The custom prompts, the hard-won context rituals, the painstakingly assembled mental model of Todd's legacy API and why it cannot be touched. The next person inherits the codebase and a blank context window.
Meanwhile, the rest of the organization has been left wandering in a technological desert. The engineer is theoretically driving a Ferrari, even if it is stuck in pull-request traffic. The QA engineer is still commuting via spreadsheet. The product manager is still writing tickets into the void, with no access to the accumulated context the engineering team has privately built. The knowledge is entirely trapped in the silo of the IDE, and the silo has no door.
The failure here is not a limitation of the foundational models. The models are spectacular. The failure is structural: we deployed the most powerful knowledge tools in history as personal productivity software, and then expressed surprise that the organization didn't get smarter.
Engineering the Institutional Harness (Or: How to Stop Building Cathedrals in the Dark)
The alternative has a shape, and it is worth describing in human terms before we talk about how to build it.
A new engineer joins the company on Monday. By the time they open their laptop, the AI has already been configured with the company's full operational context - the active tickets, the relevant Slack history, the architectural decisions, the post-mortems. They open a chat interface. The agent already knows their team, their role, their current project. They don't explain Todd. The system already knows about Todd.
By Wednesday, the QA engineer on their team opens the same interface and asks it to check regression coverage on a PR. The system knows the codebase, the test suite, the context of the PR. It answers. By Thursday, a workflow that a senior engineer built three weeks ago for investigating staging environment conflicts surfaces in the new hire's sidebar. They didn't ask for it. The system noticed it was relevant to their current ticket.
This is what RAMP built internally with a system called Glass, and it is how they reached 99% daily AI adoption across the entire company - not just the engineering floor. The CEO framed it directly: the models were already exceptional. The bottleneck was the harness.
Here is what that harness consists of.
Step One: Shared Memory (Killing the Unbillable Janitor)
The first problem to solve is the context ritual. You replace it with an automated ingestion pipeline - a system that continuously reads the company's operational exhaust and builds a shared, queryable knowledge base. GitHub repositories, yes, but more importantly the scar tissue: the closed tickets, the incident Slack threads, the architectural decision records that explain why you don't use Redis for the payment service.
The technical architecture behind this is called a RAG pipeline - Retrieval-Augmented Generation - and the reason most implementations fail is they build the version that works for the employee handbook and stop there. A production context graph for an engineering organization has to understand that code and prose are not the same artifact. You cannot slice a Python file into arbitrary text blocks and call it indexed. You lose the function signature from the body, the class context from the method. The system needs to handle a Slack incident thread differently from an ADR differently from a Linear ticket.
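Here is a minimal sketch of what structure-aware chunking looks like in practice, using Python's standard `ast` module. The `Chunk` shape and the prose fallback are illustrative assumptions, not a prescription:

```python
import ast
from dataclasses import dataclass

@dataclass
class Chunk:
    source_path: str
    symbol: str   # e.g. "payments/retry.py::PaymentService" rather than "chars 4096-8191"
    text: str

def chunk_python_file(path: str, source: str) -> list[Chunk]:
    """Split a Python file along structural boundaries (top-level classes and
    functions) so each chunk keeps its signature and docstring attached to its body."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            body = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(Chunk(path, f"{path}::{node.name}", body))
    return chunks

def chunk_document(path: str, source: str, kind: str) -> list[Chunk]:
    # Code and prose are not the same artifact: code splits on structure,
    # prose (ADRs, ticket bodies) splits on paragraphs, and an incident
    # thread would stay whole so the narrative survives retrieval.
    if kind == "python":
        return chunk_python_file(path, source)
    blocks = [b for b in source.split("\n\n") if b.strip()]
    return [Chunk(path, f"{path}#{i}", b) for i, b in enumerate(blocks)]
```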
The other failure mode is staleness - what happens to the embedded knowledge of a deprecated API nobody has touched in three years. The correct answer is not a nightly full re-index. You store a content fingerprint alongside every chunk and let GitHub and Linear webhooks trigger re-indexing only when something actually changed. The context graph reflects current reality, not the last time a cron job ran.
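A sketch of that incremental path, continuing the hypothetical `Chunk` objects from the sketch above; `store` and `embed` are stand-ins for whatever vector store and embedding call you already run:

```python
import hashlib

def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def reindex_changed(chunks, store, embed) -> int:
    """Re-embed only the chunks whose content hash differs from what is stored.
    A GitHub or Linear webhook would call this for the one repository or ticket
    that actually changed, instead of a nightly full re-index."""
    updated = 0
    for chunk in chunks:
        fp = fingerprint(chunk.text)
        if store.get_fingerprint(chunk.symbol) == fp:
            continue  # unchanged since the last index; skip the embedding call entirely
        store.upsert(chunk.symbol, embedding=embed(chunk.text), fingerprint=fp, text=chunk.text)
        updated += 1
    return updated
```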
For the infrastructure: teams already on Postgres can start with pgvector without adding new systems. For connectors, Airbyte provides open-source integrations to the tools your organization already runs.
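For instance, a minimal pgvector sketch using psycopg; the table and column names are assumptions, and the fingerprint column is what makes the incremental re-indexing above possible:

```python
import psycopg  # psycopg 3; assumes the pgvector extension is installed in Postgres

SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS context_chunks (
    symbol      text PRIMARY KEY,   -- e.g. "payments/retry.py::PaymentService"
    fingerprint text NOT NULL,      -- content hash driving incremental re-indexing
    body        text NOT NULL,
    embedding   vector(1536)        -- dimension depends on your embedding model
);
"""

def top_k(conn: psycopg.Connection, query_embedding: list[float], k: int = 5):
    # pgvector's <=> operator is cosine distance: smaller means more similar.
    vec = "[" + ",".join(f"{x:.6f}" for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(
            "SELECT symbol, body FROM context_chunks "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (vec, k),
        )
        return cur.fetchall()
```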
Step Two: Shared Workflows (The End of the Local Script)
The second problem is the breakthrough that stays on someone's MacBook. You solve this by building a mechanism for publishing agent workflows so they become organizational infrastructure rather than personal property.
The standard for this - the one that OpenAI, Google, Microsoft, and Anthropic have all converged on - is MCP, the Model Context Protocol. MCP solves the N×M problem: without a shared protocol, connecting N different AI models to M internal tools requires N×M custom integrations. MCP collapses this to N+M. You write a tool once, and every AI client in the organization can use it - Claude, Cursor, any LangChain agent - without modification.
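A minimal tool server, sketched with the official MCP Python SDK's FastMCP helper; the staging-conflict lookup itself is hypothetical:

```python
from mcp.server.fastmcp import FastMCP

# One server definition; every MCP-capable client in the org (Claude, Cursor,
# a LangChain agent) can call the tool without per-client integration code.
mcp = FastMCP("staging-tools")

@mcp.tool()
def staging_conflicts(service: str) -> str:
    """List deployments currently colliding with `service` on staging."""
    # Hypothetical internal lookup; wire this to your own deployment API.
    return f"No conflicting deployments found for {service}."

if __name__ == "__main__":
    mcp.run()
```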
The human implication matters more than the technical one. When one engineer builds a workflow for untangling staging environment conflicts, that workflow does not live on their machine anymore. It lives in a shared registry. The engineer who joins six months later inherits it on day one. One person's late-night breakthrough becomes the organizational baseline by morning.
RAMP's internal marketplace - the Dojo - has over 350 of these shared skills. The governance mechanism is elegant: skills are plain markdown files in a Git repository. Anyone submits a PR. Anyone reviews one. The same pull request workflow engineers already use every day. Not an IT request form, not a three-week committee. A PR.
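A sketch of how thin that registry can be, assuming a hypothetical skills/ directory of markdown files in the shared repo; the loader is illustrative, not RAMP's implementation:

```python
from pathlib import Path

def load_skills(repo_root: str) -> dict[str, str]:
    """Collect every skill from a skills/ directory of markdown files.

    Governance is just Git: adding or changing a file goes through the same
    pull request review engineers already use. The first '# ' line is treated
    as the skill's name; the body is what the agent loads on invocation."""
    skills = {}
    for path in sorted(Path(repo_root, "skills").glob("*.md")):
        text = path.read_text(encoding="utf-8")
        first_line = text.splitlines()[0] if text.strip() else ""
        name = first_line.lstrip("# ").strip() or path.stem
        skills[name] = text
    return skills
```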
Step Three: Intelligent Routing (Making the System Find You)
The obvious objection to a 350-skill marketplace is that nobody will use it because nobody can find what they need. This is the failure mode of every internal wiki, every Confluence space, every well-intentioned Notion database that a team built and then quietly abandoned.
The solution is not better documentation. It is making the system find you instead of making you find the system.
When a developer opens the platform, it reads their current context - their active ticket, their open branch, their recent file activity - and surfaces the three or four most relevant tools for what they are actually working on right now. The rest are invisible. RAMP calls this component the Sensei. When a new sales rep joins, the Sensei reads their role and their active deals and surfaces the five relevant skills out of 350, without requiring them to know the marketplace exists.
The technical name is Tool RAG - applying the same retrieval logic used for documents to the tool catalog itself. Anthropic's research on this pattern shows tool selection accuracy jumping from 13% to 43% on large toolsets when you do this. But the more important description is the human one: the system is proactively doing the work of figuring out what you need, rather than asking you to navigate a catalog that grows faster than anyone can follow.
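Tool RAG in miniature looks something like the following sketch; `embed` is assumed to be the same embedding call used on the document side, and the catalog is just tool names mapped to their descriptions:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def route_tools(context: str, catalog: dict[str, str], embed, k: int = 4) -> list[str]:
    """Expose only the k tools whose descriptions sit closest to the user's
    current working context (active ticket, open branch, recent files),
    instead of handing the model the full 350-tool catalog."""
    ctx_vec = np.asarray(embed(context))
    scored = [
        (cosine(ctx_vec, np.asarray(embed(description))), name)
        for name, description in catalog.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```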
The Compounding Asymptote (Or: Why You Cannot Rent a Nervous System)
There is a desperate, almost pathological desire among enterprise leadership to solve this by writing a check. They look at the fragmented chaos of the engineering floor, panic, and reach for a SaaS contract.
This is a fatal misunderstanding of what the problem actually is.
You cannot buy shared organizational memory off a shelf. If your institutional AI layer is a third-party wrapper around a generic model with no connection to your specific codebase, your specific incident history, your specific architectural decisions, your competitive advantage is exactly as deep as your closest competitor's subscription budget. When the context pipeline chokes on your particular, deeply weird legacy architecture, you cannot submit a Zendesk ticket and wait for someone in San Francisco to care. You need the capacity to fix it yourself, today.
The McKinsey data on the 20% EBITDA uplift is not about which model companies are using. It is about whether they have made AI an internal discipline or a rented service. Companies building internal AI infrastructure are, necessarily, building the organizational muscle to maintain it - which means they compound. The context graph gets richer. The Dojo gets more skills. The routing gets smarter. The system learns the company. A SaaS subscription does not.
An AI transformation is not a software deployment. It is a people transformation.
The final measure of this architecture is not what it does for your best engineers. It is what it does for everyone else. The QA engineer who used to commute via spreadsheet can now interrogate the entire codebase in plain English. The new hire on day two inherits three years of accumulated context rather than a blank window. The product manager's tickets land in a system that already understands the architectural constraints. The breakthrough one engineer had at 11pm on Tuesday becomes the shared default by 9am on Wednesday.
The isolated brilliance of the individual, finally, stops evaporating at tab close.
We are standing at the exact moment where syntax has become cheap and institutional memory has become the only thing left that compounds. The companies that survive the next decade won't be the ones that gave everyone a faster text box. They will be the ones that realized the oracle needs to be shared.