AI Engineering Has Crossed the Delegation Threshold

David Wang
Jun 1

By David Wang (Founder, OpenDeSci Foundation) and Norman Colon (Chief Technology Officer, OpenDeSci Foundation), with research support and co-authorship by Prof. Dr. Christian Abegglen (President of the Council, StGallen Integrated imt Business School)

Published: 01 June 2026

1. The headline finding

Between 2023 and 2026, software engineering became the first business function in which generative AI crossed from augmenting the human operator to executing on its behalf. The benchmark evidence is striking: on SWE-bench Verified, frontier models advanced from approximately 60% of the human baseline to near 100% in a single calendar year [1]. Top-quintile organisations now report 16–30% productivity gains and 31–45% software-quality gains from structured AI adoption [2], while frontier teams achieve up to 20× operating leverage, with a few practitioners delivering what once required a full department [3].

The strategic question has shifted. It is no longer "should we adopt?" It is "how quickly can we redesign our operating model around delegation?"

2. Three productivity regimes in 36 months

A review of the empirical literature reveals three distinct regimes — each with a different productivity ceiling and a different management implication.

Regime 1: AI as Autocomplete (2020–2023). GitHub Copilot (GA June 2022), TabNine, and similar IDE plugins offered inline completions; in benchmark testing, approximately 29% of generations from early Copilot were fully correct on HumanEval, with a further 51% partially correct [4]. A controlled experiment found that developers using Copilot completed the assigned task about 56% faster [5]; gains reported in real-world use were more modest, typically in the +10–15% range. The work being automated was boilerplate, syntax recall, and documentation lookup. The operating model required no change.

Regime 2: AI as Pair Programmer (2024–2025). Cursor (1M+ users by 2025), Windsurf, and chat-based interfaces enabled multi-file edits, with the developer reviewing each diff. McKinsey's analysis of nearly 300 companies [2] showed +25–30% gains on complex tasks. Yet a controlled study by METR [6] found that experienced open-source developers were in fact 19% slower with AI assistance, despite believing themselves to be faster. The bottleneck shifted from typing speed to context curation, prompt skill, and model selection. Quality debt began to accumulate.

Regime 3: AI as Autonomous Agent (2026). Claude Code, OpenAI Codex, Devin, and Cursor's agent mode, combined with reasoning models such as Claude Opus 4.7 and Gemini 3.1 Pro, have fundamentally changed the unit of work. The engineer specifies intent; the agent plans, edits across 20+ files, runs tests, and ships a pull request. An empirical study of 567 pull requests across 157 open-source projects found an 83.8% acceptance rate for agent-generated PRs [7]. Anthropic's own reporting indicates that internal code output per engineer grew approximately 200% year-over-year following the deployment of agentic workflows [8]. The bottleneck is now inference cost, agent reliability, observability, and security, none of which can be solved by tooling alone.

3. Why integrated management — not tooling — is the missing variable

The Stanford 2026 AI Index reports that 88% of organisations now use AI in at least one business function [1], yet according to the McKinsey 2025 Global Survey on the State of AI, only 24% of technology firms have scaled AI agents in software engineering [9]. The gap between using AI and scaling it structurally is the gap that determines competitive advantage.

This is not a tooling problem. It is what the St.Gallen Integrated Management Concept [10, 11, 12] would describe as a systemic one: a change at the operational level that cannot succeed without simultaneous adjustments at the strategic and normative levels.

AI is not a trend. It is the operating system of the future. And like every operating system, it has to be installed not only in the tools your engineers use, but in the way your organisation defines roles, allocates capital, measures quality, and assigns accountability. The decisive question is no longer whether to adopt, but how to integrate.

— Prof. Dr. Christian Abegglen, StGallen Integrated imt Business School

In the integrated management framework [11, 12], three dimensions must move together:

Normative: how the organisation defines responsibility, governance, and ethics when AI agents take action on its behalf;
Strategic: how it redesigns workflows, restructures roles (engineers as agent-orchestrators rather than line-by-line authors), and reallocates capital from headcount to inference;
Operational: which tools, observability layers, and reliability practices are deployed day-to-day.

Top-quintile organisations are six to seven times more likely than peers to scale four or more AI use cases concurrently across the development life cycle [2] — precisely because they treat the change as integrated, not isolated.

4. From research to practice: a working case

OpenDeSci, a consumer-first science platform, was designed AI-natively from inception. Its engineering organisation operates on the agentic stack today: Claude Code for autonomous, multi-file delivery (backend services, smart-contract integration, large refactors); Cursor for daily development flow on the React Native mobile app and the AI Topic Explorer; and ClaudeBot, a custom in-house agent that handles code review, CI/CD automation, and on-call operations.

OpenDeSci is engineering-first and AI-native from day one. We built the engineering organisation around the assumption that engineers orchestrate agents, not write code line-by-line. The stack lets a small team ship what a 50-person engineering organisation delivered three years ago — but only because we designed the workflows, observability, and review processes around delegation from the outset, not bolted them on afterwards.

— Norman Colon, Chief Technology Officer, OpenDeSci Foundation

The biggest mistake teams make today is treating agentic AI as a tooling decision instead of an operating-model decision. We did the opposite. The St.Gallen lens is what kept us from optimising for output speed and forgetting about the integrity, governance, and quality dimensions that compound over time. In science, that compounding matters more than anywhere else.

— David Wang, Founder, OpenDeSci Foundation

The result, in OpenDeSci's case, is a small team operating at the throughput of an organisation several times its size — consistent with the leverage pattern documented at frontier teams [3].

5. Implications for management

For organisations beyond the top quintile, three actions follow from the evidence:

Treat AI engineering as an integrated change, not a tool rollout. Adoption alone is insufficient; without simultaneous redesign of roles, processes, and incentives, productivity gains plateau and quality debt accumulates [2, 6].
Measure inference cost as a first-class budget line. The economics of agentic engineering are fundamentally different: capital shifts from headcount to compute. Organisations that fail to instrument this lose visibility on unit economics.
Redesign the developer role. As engineers become orchestrators of asynchronous agent teams [2], the required skill profile shifts toward systems thinking, evaluation, and governance — competencies the St.Gallen tradition has emphasised for six decades [10, 11, 12].
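
As an illustration of the second action, one way to instrument inference cost is to track spend per merged pull request. The sketch below is a minimal example only: the AgentRun record, the token counts, and the prices of 3.00 and 15.00 USD per million tokens are hypothetical placeholders, not figures from this brief or from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    """One agent task. All values used below are hypothetical."""
    input_tokens: int
    output_tokens: int
    merged: bool  # True if the resulting pull request was accepted


def run_cost(run: AgentRun, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Inference cost of a single run in USD, priced per million tokens."""
    return (
        run.input_tokens * usd_per_m_input
        + run.output_tokens * usd_per_m_output
    ) / 1_000_000


def cost_per_merged_pr(
    runs: list[AgentRun], usd_per_m_input: float, usd_per_m_output: float
) -> float:
    """Total spend across all runs divided by the number of merged PRs.

    Rejected runs still cost money, so they stay in the numerator.
    """
    merged = sum(1 for r in runs if r.merged)
    if merged == 0:
        raise ValueError("no merged pull requests in sample")
    total = sum(run_cost(r, usd_per_m_input, usd_per_m_output) for r in runs)
    return total / merged


# Hypothetical sample: three agent runs, two of which were merged.
sample_runs = [
    AgentRun(input_tokens=2_000_000, output_tokens=100_000, merged=True),
    AgentRun(input_tokens=1_000_000, output_tokens=50_000, merged=False),
    AgentRun(input_tokens=3_000_000, output_tokens=150_000, merged=True),
]

# Assumed placeholder prices: 3.00 USD (input) and 15.00 USD (output) per million tokens.
unit_cost = cost_per_merged_pr(sample_runs, 3.0, 15.0)
print(f"Cost per merged PR: ${unit_cost:.2f}")  # prints "Cost per merged PR: $11.25"
```

Dividing by merged pull requests rather than by all runs is a deliberate choice in this sketch: it charges the cost of rejected agent output to the work that actually shipped, which is the figure a budget owner would compare against the cost of engineer time.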

The 36-month window is closing. First-movers compound their lead structurally; followers face a quality-debt and operating-cost penalty that grows with every release.

About this research brief

This brief was authored by David Wang (Founder) and Norman Colon (CTO) of the OpenDeSci Foundation, with research support and co-authorship by Prof. Dr. Christian Abegglen of the StGallen Integrated imt Business School. It is part of the ongoing strategic research collaboration between the two institutions, announced in February 2025 [13]. The collaboration applies the St.Gallen Integrated Management Concept to questions at the frontier of AI, decentralised science, and technology-enabled education.

References

[1] Stanford Institute for Human-Centered Artificial Intelligence. (2026). The 2026 AI Index Report. Stanford University. Retrieved from https://hai.stanford.edu/ai-index/2026-ai-index-report

[2] McKinsey & Company. (2025, November 3). Unlocking the value of AI in software development. QuantumBlack, AI by McKinsey & McKinsey Technology Practice. Analysis of nearly 300 publicly traded companies.

[3] Lamarre, E., Smaje, K., Levin, R., Singla, A., & Sukharevsky, A. (2026). Rewired: How Leading Companies Win with Technology and AI (2nd ed.). Hoboken, NJ: Wiley.

[4] Yetistiren, B., Ozsoy, I., & Tuzun, E. (2022). Assessing the quality of GitHub Copilot's code generation. Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '22), 62–71. https://doi.org/10.1145/3558489.3559072

[5] Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: Evidence from GitHub Copilot. arXiv preprint arXiv:2302.06590.

[6] Becker, J., et al. (2025). Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. METR (Model Evaluation & Threat Research).

[7] On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub. (2025). arXiv:2509.14745. Empirical study of 567 PRs across 157 open-source projects.

[8] Anthropic. (2026, March 9). Code Review for Claude Code. Anthropic Research Blog.

[9] McKinsey & Company. (2025, November). The state of AI: Global Survey 2025 (n = 1,993 organisations across 105 countries; field period: 25 June – 29 July 2025). McKinsey QuantumBlack.

[10] Ulrich, H. (1968). Die Unternehmung als produktives soziales System. Bern/Stuttgart: Haupt Verlag.

[11] Bleicher, K., & Abegglen, C. (2017). Das Konzept Integriertes Management: Visionen – Missionen – Programme (9th rev. ed.). Frankfurt am Main: Campus Verlag.

[12] Abegglen, C. (2021). Das Konzept Integriertes Management (10th, fully revised ed.). Frankfurt am Main: Campus Verlag.

[13] StGallen Integrated imt Business School & OpenDeSci. (2025, February 13). Strategic collaboration to shape the future of science. OpenDeSci Press Release.