From Builders to Orchestrators

AI won’t take your job. But it is quietly rewriting your operating model. Here are the four questions every leader has to answer — and what the 2026 evidence actually says.

Short-term turbulence for long-term abundance.

There is a line, usually attributed to Scott Galloway, that has stuck to this moment because it is true: “AI won’t take your job. Someone using AI will.” I would go one step further. AI does not have ambition, a P&L, or a plan to reorganise your division. People do. Competitors do. And the quiet reshaping now underway is not really about the technology at all — it is about the operating model we build around it.

That is the shift I keep seeing in the room. Eighteen months ago, boards asked me whether they should be doing AI. Now the question is sharper and far more uncomfortable: how do we organise, staff, structure and govern for a world where software is nearly free to build and judgement is the scarce commodity? PwC’s 29th Global CEO Survey, released at Davos in January, captures the anxiety precisely. Confidence in revenue growth over the next twelve months has fallen to 30% — down from 38% a year earlier and 56% in 2022 — even as investment in AI accelerates. More than half of CEOs (56%) say they have seen neither higher revenues nor lower costs from that investment over the past year. Only about one in eight — 12%, PwC’s “vanguard” — report both revenue gains and cost reductions from AI. As PwC’s Global Chairman Mohamed Kande put it, 2026 is shaping up to be a decisive year: a small group is turning AI into measurable returns while most are still stuck in pilots, and that gap will widen quickly.

The interesting thing is why the vanguard pulls away. It is not that they bought better models. It is that they built the operating model first. The survey found only 26% of organisations have strong AI foundations across at least six of seven core areas — strategy, data, technology, talent, investment, culture and responsible AI — and those that do are roughly twice as likely to be seeing revenue and cost gains. The bottleneck is not intelligence. It is everything around the intelligence.

So this piece is organised around the four questions I now walk every leadership team through, in order: economics, talent, structure and governance. Get them in the wrong order and you burn the company. Get them right and you earn the abundance on the other side of the turbulence.

1. Economics: read the evidence honestly

Start with the data, because the headlines are almost all wrong in both directions.

The most careful piece of empirical work I have seen is Anthropic’s March 2026 study, Labor market impacts of AI: A new measure and early evidence, by Maxim Massenkoff and Peter McCrory. Its contribution is methodological, and it matters. Previous studies measured theoretical exposure — what a large language model could, in principle, do to a job. Anthropic instead built a measure they call observed exposure, combining three sources: the O*NET database of roughly 800 US occupations; the task-level “could-an-LLM-do-this” scores from Eloundou et al. (2023); and Anthropic’s own real-world usage data from the Economic Index. In plain terms: not what AI might do, but what it is actually doing, weighted towards automated and work-related use.

The gap between the two is the whole story. Tasks that are fully feasible for an LLM account for about 68% of observed Claude usage, yet actual coverage remains a fraction of the theoretical ceiling. In Computer & Maths occupations, where you would expect AI to be furthest along, roughly 94% of tasks are theoretically exposed but only 33% are actually being done with AI today. Office & Admin sits near 90% theoretical with a small fraction observed. At the top of the exposure table are Computer Programmers (about 75% coverage), Customer Service Representatives, Data Entry Keyers (67%) and Financial Analysts. At the other end, some 30% of workers (cooks, bartenders, lifeguards, mechanics) register zero observed coverage, because their work simply is not showing up in the data.

Two findings deserve to be printed and pinned above every workforce-planning meeting.

First, exposure is not destiny, at least not yet. Massenkoff and McCrory find no systematic increase in unemployment for the most-exposed workers since late 2022. The difference-in-differences gap between the most- and least-exposed groups is small and statistically indistinguishable from zero. The wave everyone fears has not arrived in the aggregate figures.

Second, the reshaping is real, and it is landing on the young first. The study finds suggestive evidence that hiring of workers aged 22–25 into exposed occupations has slowed by about 14% since ChatGPT’s release — echoing Brynjolfsson, Chandar and Chen’s “canaries in the coal mine” work, which found a 6–16% fall in employment for that same age group in exposed roles. Not a collapse. A slowdown. And it is concentrated at the entry level.
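For readers who have not met the term, the difference-in-differences comparison behind the first finding can be sketched in a few lines of Python. This is only the bare arithmetic: the unemployment rates below are hypothetical numbers chosen for illustration, not figures from the study, and the class and function names are mine. The real analysis also tests whether the resulting gap is statistically distinguishable from zero, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupRates:
    """Unemployment rate (%) for one group before and after a cut-off date."""
    before: float
    after: float

    @property
    def change(self) -> float:
        return self.after - self.before


def diff_in_diff(exposed: GroupRates, unexposed: GroupRates) -> float:
    """Change in the exposed group minus change in the unexposed group."""
    return exposed.change - unexposed.change


# Hypothetical rates, for illustration only (not taken from the study).
most_exposed = GroupRates(before=3.0, after=3.6)
least_exposed = GroupRates(before=5.0, after=5.5)

gap = diff_in_diff(most_exposed, least_exposed)
print(f"difference-in-differences: {gap:+.2f} percentage points")
```

The point of subtracting the unexposed group's change is to strip out whatever moved the whole labour market over the same period, so that only the part specific to exposed occupations remains.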

There is a counter-intuitive twist the data insists on. The most-exposed workers are not the low-skilled and low-paid. They are disproportionately older, more likely to be female, better educated and better paid — earning around 47% more on average, with graduate degrees nearly four times more common than in the unexposed group. The old story that automation eats the bottom of the ladder first is, on this evidence, backwards.

Now layer PwC’s own 2026 Global AI Jobs Barometer on top — more than a billion job advertisements across 27 countries — and the picture resolves into a bifurcation. The Barometer describes a two-track labour market. In “professionalised” roles, AI removes routine work and amplifies human judgement; these jobs are growing roughly twice as fast as “democratised” roles, with 42% faster wage growth since 2021. The wage premium for genuine AI skills has climbed to 62% globally (up from 57% the year before, and more than double the level of a couple of years ago), ranging from as high as 118% in consumer markets to about 16% in government. Jobs demanding AI skills grew 69% against 9% for the market as a whole — roughly eight times faster.

And here is the finding that should end the reflexive “AI means headcount cuts” conversation in the boardroom. Among the companies most exposed to AI, headcount grew 52% since 2018, against 36% for the least exposed. Productivity growth ran at 34% versus 24%, and the top quintile of AI-exposed firms — PwC’s “super-stars” — posted an extraordinary 163%. Used for growth rather than merely for cost, AI is behaving less like a job killer and more like a job expander. This is the empirical spine of the argument I have made for years that Human Experience is the sum of Customer and Employee Experience — HX = CX + EX — and that the firms which invest in the human side of the equation, not just the machine side, are the ones that compound.

But there is a warning tucked inside the Barometer for anyone tempted to protect the pipeline by pretending nothing has changed: AI-exposed entry-level roles are now seven times more likely to demand traditionally senior skills — judgement, leadership, stakeholder communication — than the least-exposed entry-level roles. Those reshaped junior roles grew 35% since 2019 while other entry-level roles fell 10%. The bottom rung of the ladder has not been removed. It has been raised.

The pricing scissors

Why is software becoming nearly free to build? Because two cost curves are moving in opposite directions. The cost of training a frontier model is rising on the order of 2.4× a year, pushing the price of building one towards the billions — a game only a handful of firms can play. The cost of inference — using a model — is falling roughly 10× a year, collapsing towards zero. The gap between those curves, the pricing scissors, is opening at something like 12–24× per year.
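A back-of-envelope sketch of that arithmetic, in Python. The 2.4× and 10× rates are the rough figures quoted above; the function name is hypothetical, and the 5× inference decline in the second call is an assumed, more conservative rate used only to show how the lower end of the 12–24× range could arise.

```python
def scissors_gap(years: int,
                 training_growth: float = 2.4,
                 inference_drop: float = 10.0) -> float:
    """Factor by which training cost has pulled away from inference cost.

    Both costs start at 1.0. Training cost multiplies by `training_growth`
    each year; inference cost divides by `inference_drop` each year.
    """
    training_cost = training_growth ** years
    inference_cost = inference_drop ** -years
    return training_cost / inference_cost


# Rates quoted in the text: training up 2.4x, inference down 10x per year.
for year in range(4):
    print(f"year {year}: gap is about {scissors_gap(year):,.0f}x")

# Assumed slower inference decline of 5x per year (illustrative only).
print(f"conservative one-year gap: about {scissors_gap(1, inference_drop=5.0):.0f}x")
```

Because the gap compounds, two years at the quoted rates already puts the curves several hundred times apart, which is why the choice between building and composing shifts so quickly.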

The strategic implication is that a proprietary small model is rarely the moat people hope it is. Value migrates from owning the intelligence to composing and orchestrating it. I frame the choice for clients as three worlds: Use (consume someone else’s managed AI — highest leverage, lowest differentiation), Compose (stitch frontier APIs into your own workflows — the pragmatic middle), and Build (train or fine-tune your own — highest control, highest cost, lowest speed, and justified only where you genuinely differentiate). The mistake is deciding which world you live in. The discipline is deciding which world each part of a workflow lives in, and letting economics move it over time — the frontier model does everything on day one; by month six you have learned enough to push commodity steps to Use and Compose; by year two the high-volume, high-differentiation work has earned its way into Build. The company that declares itself a “build shop” on day one usually burns its runway training models before it understands its own workflow.

And what, then, endures as a moat? Not the things that used to be hard to do — workflow embeddedness, software scale, integration lock-in, engineering complexity — because AI is erasing exactly those. What endures is what is hard to get: compounding proprietary data (years of operations), network effects (years of adoption), regulatory permission (years of process), capital at scale (decades of trust) and physical infrastructure (years of building). As I put it in the room: AI compresses the time it takes to do things; it does not compress the time it takes for things to happen. The easier software gets to build, the more value concentrates in what software cannot replicate.

2. Talent: from builders to orchestrators

For thirty years, being valuable in technology meant being able to build the thing — write the code, design the schema, ship the feature. That was the craft and the career ladder. The craft is changing under our feet. The most valuable people over the next five years will not be the ones who build fastest; they will be the ones who orchestrate fastest — who can point an agent at a problem, evaluate the output, steer the next iteration, and know when to overrule.

Martin Fowler and his colleagues at Thoughtworks named this archetype well in their 2025 essay: the expert generalist. They identify seven characteristics — curiosity, collaborativeness, customer focus, a bias towards fundamental knowledge, a blend of generalist and specialist skills, sympathy for adjacent domains, and a grasp of first-principle patterns. What strikes me is that these are precisely the traits agents amplify. An agent multiplies a curious person with deep fundamentals; it does nothing for someone who has only ever memorised one framework. So when you hire in this new world, hire for those seven — not for the framework of the year, which will change three times before the new joiner’s first review.

Watch what happens to a team when agents arrive. The specialist, your deep domain expert, gets pulled to broaden — their single lane is no longer enough. The generalist gets pulled to deepen — agents grant them specialist-level depth on demand. They meet in the middle as what Werner Vogels has called the “Renaissance developer”: the polymath with steering hands. It is, almost exactly, the opposite of what we have hired for over the past two decades.

If you want proof rather than theory, look at Anthropic’s Build with Claude hackathon in February: 13,000 applications, 500 teams accepted, 277 shipping production code, some 21 million lines generated. The finishers who topped it were not professional developers. First place went to a lawyer who built a California permitting tool; third to an interventional cardiologist who built a patient-care platform in seven days, coding between patients. The lesson is not “developers are dead.” It is that domain expertise plus AI now beats coding skill alone — and that is a profound signal for how you hire, especially for the next two years. Anthropic’s companion research on agentic coding, drawn from around 400,000 Claude Code sessions, makes the same point from the other direction: the returns to genuine expertise persist, even as the tools get better. Agents raise the floor; they do not raise the ceiling for those who never learned the fundamentals.

This is why the shape of teams is converging. The old world was a squad of six-to-eight specialists — a product manager, front-end, back-end and data engineers, an ML engineer, DevOps, QA, a security reviewer — each owning a lane, with value lost in the hand-offs between them. The new world, for genuinely new AI work, is two or three expert generalists plus agents. Each owns a workflow end-to-end rather than a lane; the agents fill the gaps where specialist depth is needed; the coordination overhead collapses. I call this hyper-convergence. I am not claiming the eight-person team dies tomorrow — but if the team you are assembling for new AI work still looks like the old squad, you are building yesterday’s team.

Four forces, all true at once

None of this points cleanly in one direction, and any honest account has to hold the contradictions together.

The expert multiplier. Senior, knowledgeable people with AI achieve order-of-magnitude gains in speed. AWS’s Project Mantle is the standing example.
The bottleneck shifts. The question stops being “can we build it?” and becomes “do we have the data, and can we decide fast enough to keep up with what we can now build?” Judgement, not execution, becomes the constraint.
The verification tax. AI generates code perhaps ten times faster, but reviewing it is around three times harder; independent evaluations such as METR’s have found that AI-assisted output can carry materially more defects and, in some settings, can even slow experienced engineers who must now scrutinise more. The bottleneck migrates from writing to reviewing.
The deskilling trap. Juniors using AI ship more code but understand less of it — on the order of 17% less comprehension in some studies. Faster and less grounded at the same time. If they never learn to work without AI, who verifies the AI in five years?

The leader’s job is to hold that tension, not resolve it prematurely.

3. Structure: the hourglass, and the operating model that houses it

Put the labour-market evidence together and a shape emerges. Most organisations today are a pyramid: many juniors at the base, fewer seniors directing. The reflexive over-reaction is the diamond: cut the juniors because “AI can do the work,” bulk up the middle with managers to “oversee the AI.” That is the trap; it starves the pipeline. What actually works for delivery is the inverted pyramid, the pod: three to five senior, full-stack people with agents doing the execution. It ships beautifully, but on its own it has no learning path. The shape we should be building for is the hourglass: senior-led execution at the top, a lean middle, and a broad base of juniors deliberately learning the craft on the way up.

Here is the nuance I most want leaders to carry out of the room. The inverted pyramid is the pod; the hourglass is the organisation. Pods run senior-heavy for delivery, while the enterprise that houses them stays hourglass-shaped to keep the junior pipeline alive. Both are true, at different altitudes.

The reason the hourglass matters is that the market is currently pulling in the opposite direction. Firms cut the base because it is the quickest route to the ROI their boards are demanding from AI, while simultaneously paying top dollar for senior talent and anyone with “AI” on their CV. The middle hollows; the top explodes. That is neither a healthy economy nor a healthy pipeline. And it sets up the question I cannot get out of my head: if you stop training juniors, where do your seniors come from? Judgement — the very thing that becomes scarce and valuable — only exists because someone spent a decade doing execution and learning from the mistakes. Matt Garman, AWS’s CEO, made the point bluntly last year: it makes no sense to stop hiring young people out of college, because in ten years you will have no one who has learned anything. He is right. The firms that cut the base today will not feel the pain today; they will feel it in 2034 — four CEO cycles away, which is precisely why so few are resisting the temptation. Your competitive advantage in 2034 is the juniors you hire now.

The CIO becomes the conductor

The org chart most of us still run was designed for determinism: runbooks, change advisory boards, SLAs measured in nines, predictable outputs for predictable inputs. In an agentic world the functions do not disappear, but the boundaries between them dissolve. Security stops being a review gate and becomes code running at the gateway. Architecture stops being a whiteboard and becomes the policy your agents respect at runtime. Foundation services stop being provisioning and become the operating envelope for autonomous systems. The CIO’s role shifts from owner of the stack to conductor of it.

That demands the single hardest mental-model change of all. Non-determinism is a feature, not a bug — you are asking agents to handle the cases you did not anticipate. So the discipline inverts: become tolerant of variance in execution and strict about variance in outcome. Stop prescribing the route; define the riverbank — what the outcome must be — and let the water find its way. Build a toll gate in the middle of a river and the river simply flows around it.

Structurally, that resolves into three operating models. Model A — build, then throw it over the wall to a separate operations team — was already creaking in the cloud era and breaks outright with agents; it is the anti-pattern, and it is dead. Model B — embedded, “you build it, you run it,” the same three-to-five senior engineers building and operating the pod — is DORA-Elite territory: multiple deployments a day, sub-hour recovery, low change-failure rates, because the people who wrote the agent are the ones debugging it at three in the morning. Model B strains past roughly ten pods, through duplication and drift. Model C — pod plus platform — solves that: a shared, governed platform providing runtime, memory, identity and observability, on which autonomous pods still choose their own models, agents and data. The principle that makes it work is simple: the platform enables, it does not constrain. Full autonomy, full accountability, on a common road.

4. Governance: policy as the riverbank

This is my home ground, and it is where most transformations quietly fail. If the new discipline is strict about outcomes, then governance is how you make “strict” real — and for autonomous systems it cannot live inside the model’s own reasoning loop. It has to sit outside it, as enforceable policy.

The most mature template I have seen is the one Singapore’s IMDA published in January — the world’s first Model AI Governance Framework written specifically for agentic AI, which I have had the privilege of engaging with closely. It lands on four dimensions, and it is telling that when Amazon shipped Bedrock AgentCore it converged on the same four — a case of independent evolution reaching the same answer:

Risk assessment — classify each agent by autonomy, data sensitivity and blast radius before it is deployed.
Human accountability — a named, accountable owner for every agent in production.
Technical guardrails — identity, authorisation, testing, and controls for multi-agent interactions.
End-user transparency — disclose the agent, its limits and its authorised actions.

In practice, that reduces to five questions you must be able to answer before an agent acts, and audit after: Who is the agent? Who authorised it? What is it allowed to do? Is it behaving as expected? Can we prove what it did? This is the essence of what I have called Loop Governance — governing not just the prompt, but the loop of autonomous action around it — and it is the layer my work on TrustOS is built to operationalise across MAS FEAT/AIRG, the EU AI Act, NIST’s AI RMF, ISO 42001 and the IMDA framework itself. Policy, in this world, is code. Policy as code is the riverbank.
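To make "policy as code" concrete, here is a minimal Python sketch of a gate that sits outside the agent's reasoning loop and answers the five questions for every proposed action. It is an illustration under stated assumptions, not the IMDA framework's specification, not Bedrock AgentCore's API and not TrustOS: every class, field and agent name is hypothetical, and "behaving as expected" is reduced to a simple per-run action budget.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentRecord:
    """Registry entry for one agent. All field names are hypothetical."""
    agent_id: str               # Who is the agent?
    authorised_by: str          # Who authorised it? (a named, accountable owner)
    allowed_actions: frozenset  # What is it allowed to do?
    max_actions_per_run: int    # Crude stand-in for "behaving as expected"


@dataclass
class PolicyGate:
    """Enforces policy outside the model; every action must pass through it."""
    registry: dict
    audit_log: list = field(default_factory=list)
    _counts: dict = field(default_factory=dict)

    def check(self, agent_id: str, action: str) -> bool:
        record = self.registry.get(agent_id)
        if record is None:
            allowed, reason = False, "unknown agent"
        elif not record.authorised_by:
            allowed, reason = False, "no accountable owner"
        elif action not in record.allowed_actions:
            allowed, reason = False, "action not permitted"
        elif self._counts.get(agent_id, 0) >= record.max_actions_per_run:
            allowed, reason = False, "behaviour outside expected bounds"
        else:
            allowed, reason = True, "ok"
            self._counts[agent_id] = self._counts.get(agent_id, 0) + 1
        # Can we prove what it did? Log every decision, allowed or denied.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "action": action,
            "allowed": allowed,
            "reason": reason,
        })
        return allowed


gate = PolicyGate(registry={
    "claims-triage-01": AgentRecord(
        agent_id="claims-triage-01",
        authorised_by="head.of.claims@example.com",
        allowed_actions=frozenset({"read_claim", "draft_reply"}),
        max_actions_per_run=2,
    ),
})

print(gate.check("claims-triage-01", "read_claim"))       # permitted action
print(gate.check("claims-triage-01", "approve_payment"))  # not on the allow-list
print(gate.check("unregistered-agent", "read_claim"))     # agent not in registry
```

The design choice that matters is that the agent never decides whether it is allowed to act. The gate does, and it records the decision whether or not the action goes ahead, which is what makes the audit trail usable after the fact.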

The CEO Survey makes the stakes concrete. Two-thirds of CEOs experienced stakeholder-trust concerns in the past year — many linked directly to AI, transparency and the pace of change — yet only 27% believe their leadership teams can anticipate disruption. Cyber risk now tops the threat list for 31% of CEOs, up from 24%. In regulated sectors especially, trust is not a soft virtue. It is the load-bearing structure that lets you move an agent from pilot to production with confidence, and then do it a thousand times more. Governance is not the tax on innovation. It is what makes innovation repeatable.

What to do on Monday morning

Six moves, in order. Do not jump ahead.

Economics. Pick one workflow and find out, honestly, what it would cost a well-funded rival to rebuild it at a tenth of the headcount. If you don’t know, that is the finding.
Talent. Decide who is on the pod — two or three expert generalists, plus agents — and hire for the seven traits, not the framework of the year.
Structure. Retire Model A. Choose embedded (B) or pod-plus-platform (C) depending on scale.
Governance. Stand up the four dimensions and the five pre-action questions before, not after, the agents scale.
People. Invest in your senior domain experts — the orchestrators with steering hands.
Pipeline. Protect the juniors. Your 2034 advantage is the base you build today.

The Fork

I have long framed AI’s trajectory as The Fork — the choice between a Star Trek future of shared abundance and a Mad Max future of concentrated scarcity. The 2026 evidence tells me the fork is not decided by the models. It is decided by the operating models. The Anthropic data says the disruption is still ambiguous and the young are feeling it first. PwC’s Barometer says the firms that use AI to amplify people are hiring more, paying more and pulling away. The CEO Survey says only a small vanguard has turned ambition into returns, and that the difference is foundations, not algorithms.

Put simply: it is not the best AI that wins — it is the best operating model around it. Economics, talent, structure and governance; then people and pipeline. Build the hourglass, run the pods, own the orchestration, and define the riverbank. Do that, and the turbulence of this decade resolves, as it always has, into abundance on the other side.

Short-term turbulence. Long-term abundance. The work now is to make sure the abundance is shared.

— Luke

Sources & further reading

Anthropic — Massenkoff, M. & McCrory, P. (2026). Labor market impacts of AI: A new measure and early evidence. anthropic.com/research/labor-market-impacts
Anthropic — Agentic coding and persistent returns to expertise (2026); and the Anthropic Economic Index reports. anthropic.com/economic-index
Eloundou, T., Manning, S., Mishkin, P. & Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv:2303.10130
Brynjolfsson, E., Chandar, B. & Chen, R. (2025). Canaries in the coal mine? Six facts about the recent employment effects of artificial intelligence.
PwC — 2026 Global AI Jobs Barometer: Two futures for jobs in an AI era (15 June 2026). pwc.com/gx/en/services/ai/ai-jobs-barometer.html
PwC — 29th Annual Global CEO Survey: Leading through uncertainty in the age of AI (January 2026). pwc.com/gx/en/issues/c-suite-insights/ceo-survey.html
PwC — Global Workforce Hopes & Fears Survey (2025).
Fowler, M. / Thoughtworks — The Expert Generalist (2025). martinfowler.com/articles/expert-generalist.html
METR (2025) — evaluations of AI’s effect on experienced-developer productivity and code review.
Matt Garman (AWS) — remarks on continuing to hire early-career talent (2025).
IMDA Singapore — Model AI Governance Framework for Agentic AI (January 2026).
Amazon — Bedrock AgentCore (2025–26), for the convergent governance dimensions.