Éric Larchevêque is a French entrepreneur, investor, and engineer, best known as the cofounder of Ledger and TBSO. His perspective on agentic enterprise systems is grounded in three decades of building, scaling, investing in, and operating technology companies.
Most small companies run on human clicks across software. Your team spends hours switching between Gmail, Slack, Notion, HubSpot, and Stripe. Instead of making decisions, serving customers, or building products, they navigate interfaces carrying context between tools that can't see each other.
You know AI should help. But after automating a few workflows, or using your favorite chatbot, you hit a wall. You get brittle integrations, cannot find anyone in-house who can maintain them, and you don't have clear expertise on what to do next. The real problem is also that your company's knowledge is scattered across twenty surfaces with no single version of the truth.
This paper describes the Agentic Enterprise OS: a single control surface where you express intent, agents execute across your tools, and the company's memory stays unified and governed. SaaS tools don't disappear; they move into the background.
Pieces of it are already in production at large enterprises (Salesforce, Microsoft, Atos...). But no one has made it real for a 1-50 person company yet, and only a few have addressed the hard problems that determine whether it works or becomes another failed AI pilot.
This paper focuses on five of those problems:
We also acknowledge what this paper is not. It is not a product spec, not a vendor-neutral analysis, and not a guarantee. TBSO is building toward this vision through its openskl.ai initiative. This paper is our honest assessment of what must be true for it to work.
From interface labor to supervised execution
Your day probably looks something like this: email for a customer request, CRM to check their history, Slack to ask your team about pricing, back to email to send a quote, then a task created somewhere so someone follows up. Maybe a spreadsheet update. Maybe a mental note that gets lost.
None of this is hard. All of it is time-consuming. The work isn't thinking, it's navigating.
The Agentic Enterprise OS replaces this with a cockpit. Imagine one screen where you see what matters, tell the system what you want, review what it proposes, and approve what it does. The CRM, billing system, email, and project tool still exist, but transition into background services called by the cockpit.
To understand exactly what this means, let's consider a basic scenario where a contact form is filled requesting a quote for a product in an SMB environment.
Why This Is More Than an Agent Orchestrator
Five things make this different from classical agent orchestrators :
- Scope — Orchestrators automate one workflow or one domain. An OS runs sales, support, finance, and founder tasks on the same backbone.
- Memory — Orchestrators pull a few fields into context. An OS maintains governed company memory based on which system owns which fact and using temporal proofs.
- Governance — In an orchestrator, each agent has its own rules. In an OS, a central safety layer classifies every action and enforces human oversight uniformly.
- Interface — Most orchestrators run invisibly. The OS puts the cockpit front and center, where you work.
- Adoption — An orchestrator assumes someone knows what to automate. The OS observes your existing work and suggests where to start.
An agent orchestrator is one building block inside the Agentic Enterprise OS, but the OS is what turns a collection of flows into a coherent way to start, grow, and run an agentic company without requiring the founder to become an AI workflow engineer.
Why it must be a control room, not a chatbot
A company can't be run through chat. Enterprise work has three modes that need different UX:
- Monitoring — What changed? What needs attention? (Always running in the background)
- Execution — Do this. Show me the plan. Then do it. (Staged, deliberate)
- Interrupts — Stop. Something's wrong. (Fast, with rollback)
A single chat thread is too slow for monitoring, too ambiguous for execution, and too noisy for interrupts. The cockpit needs to feel more like a Bloomberg terminal than like ChatGPT.
The new role of SaaS System of Records
SaaS tools don't disappear, but they retreat to the background. CRMs, billing systems, wikis, and project trackers become headless infrastructure the cockpit queries and updates.
The company's primary interface becomes the cockpit, and SaaS applications become infrastructure the cockpit calls into, rather than destinations where humans spend their days.
The Agentic Enterprise OS can be understood as seven interacting conceptual layers.
| Layer | Core question | Main responsibility |
|---|---|---|
| Systems of Record | Where does operational data live today? | Existing SaaS, internal tools, databases, documents, and data platforms. |
| Integration Gateway | How do agents reach those systems safely and uniformly? | MCP servers, APIs, webhooks, connectors, adapters, authentication, and normalization. |
| Governance & Safety Kernel | What is allowed, by whom, under what conditions? | Policy engine, permissions, classification, delegation tokens, escalation rules, and human-in-the-loop routing. |
| Company Memory | What does the company know and has decided? | Facts, decisions, roles, authority, context, temporal validity, commitments, and institutional memory. |
| Observability & Audit | What did the system actually do, and why? | Temporal audit feed, traces, metrics, explanations, approvals, replay, and accountability. |
| Agent Runtime Layer | Who plans, reasons, and executes? | Planners, orchestrators, domain agents, tool-calling agents, task decomposition, execution, and recovery. |
| Cockpit | Where do humans see, decide, and intervene? | Control room for plans, approvals, exceptions, monitoring, delegation, and operational supervision. |
Here is how a single request moves through the system:
What each layer does
Your existing tools (Systems of Record). Nothing moves, nothing migrates. HubSpot stays HubSpot. Gmail stays Gmail. Stripe stays Stripe. These systems keep doing what they do, which is storing contacts, emails, invoices, tasks. What changes is that you stop opening them directly to get work done. They become background services the cockpit reads from and writes to.
The connection layer (Integration Gateway). This is what lets agents reach your tools safely. It is like a secure switchboard where every agent request goes through it, making sure each agent only touches what it's supposed to touch, with credentials that expire after the task is done. No agent should ever hold a permanent key to your tools.
The rules engine (Governance & Safety Kernel). Before any agent takes an action, this layer asks: is this allowed? It classifies every proposed action from fully autonomous (write a draft, search memory) to requiring your explicit sign-off (send an external email, change pricing, trigger a payment). It also learns over time as actions you approve routinely without editing will gradually require less of your attention and become fully autonomous.
Company memory. This is the layer that makes the Agentic OS different from a chat bot or a workflow. It maps your company (people, projects, products...) and holds what your company actually knows (customers, deals, priorities, decisions, commitments, policies...) Contrary to most approaches, memory is not built as a pile of documents to search through but as a governed record where each important fact has a declared source, an owner, and a freshness date. This is probably the crown jewel of the system.
The audit trail (Observability & Audit). Every action the system takes is logged for future reference (what was done, why, which memory facts were used, who approved it, etc.) This is what lets you ask "what happened with this thing last Tuesday?" and get a precise answer. It's also what gives you a rollback path when something goes wrong, and the compliance evidence if anyone ever asks (useful to deflect blame and organize agent execution to appease auditors).
The agents themselves (Agent Runtime). This is where reasoning and execution happen. There is a planner that figures out what needs doing, specialized agents that handle specific areas like sales or support, and lightweight executors that carry out individual steps. They work in parallel where they can, check memory before acting, and escalate to you when they hit something they shouldn't decide alone.
The cockpit. In theory, the last screen you'll ever use for company work. This is where you see what's happening, tell the system what you want, review what it's proposing, and approve what it does. KPIs, state of your business, priorities, the list of things needing your attention, and the plans agents are about to execute (or are executing).
Why "just connect all your data" fails
Consider the question "What did we promise this client?"
A naive RAG (Retrieval-Augmented Generation) system will attempt to synthesize an answer from:
- A signed contract in Drive.
- A Slack thread where a sales engineer promised a feature.
- A CRM note summarizing a call.
- An email thread negotiating a discount.
- A Notion page describing the standard plan.
The real problem isn't model hallucination, it is that the company itself has unresolved truth. There is no single source of truth because different fields are legitimately owned by different systems and roles.
Before your agents write anywhere, you need to answer four questions:
- Who is allowed to assert which facts?
- From which system or channel?
- For how long are those assertions considered valid?
- How are conflicts detected, surfaced, and resolved?
The four memory layers and their failure modes
In an agentic system, there are four distinct layers of memory, each with its own failure mode.
| Memory Layer | Purpose | Typical Failure Mode |
|---|---|---|
| Working memory | Current task context and active reasoning state | Context window truncation, causing loss of relevant short-term context |
| Episodic memory | Past interactions, actions, and conversations | Summaries that lose detail or reflect the model's bias |
| Semantic memory | Company knowledge, facts, and reference information | Stale or conflicting information with no freshness or authority weighting |
| Procedural memory | Operational know-how and workflows ("how things are done") | Knowledge remaining implicit and undocumented, living only in people’s heads |
Agents cannot safely act on behalf of a company if these layers are not designed intentionally.
Decision memory and event sourcing
Companies don't run on data, they run on decisions. But decisions have never been made machine-readable. They are scattered across emails, meetings, and implicit conventions.
What is missing is a context graph that captures decisions and their reasons. One implementation pattern is event sourcing: you record decisions as append-only events and rebuild the current state from them.
This has two major implications:
- Facts cannot be overwritten; they can only be superseded.
- Every answer the cockpit gives can be traced back through a chain of proofs.
The memory constitution
Before deploying agents that write anywhere, a company must write a memory constitution.
At minimum, it should define:
- Authority per field — For each important fact (contract value, renewal date, payment status, health score, owner, etc.), specify the authoritative system and role.
- Temporal rules — For each field, specify freshness requirements, TTLs, and when old facts must be treated as stale.
- Invalidation responsibilities — Who is responsible for updating memory when a fact changes?
- Conflict resolution — How conflicts are surfaced and who resolves them (who is the memory steward?).
This is not only technical. It forces the company to decide which source is trusted for which fact. If that decision stays implicit, agents will automate the confusion.
The trust bridge problem
When an agent can read low-trust inputs and trigger high-trust actions, its permissions become the attack surface. Examples:
- A crafted email with hidden instructions that an agent ingests and uses to update pricing.
- A poisoned Notion page with embedded prompts that steer an agent updating CRM records.
- A compromised MCP tool description that contains malicious meta-instructions executed at discovery time.
The cockpit is a trust bridge from low-trust surfaces (email, documents, tickets, external tools) to high-trust actions (payments, critical changes, even code pushed to production). If that bridge is not segmented and monitored, the cockpit becomes the largest lateral movement surface in the organization.
Emerging attack vectors
There are three active vectors already exploited in real MCP ecosystems:
- Indirect prompt injection in content the agent is supposed to read (emails, docs, messages).
- Tool poisoning in MCP server descriptions read during tool discovery, not logged in conversations.
- Agent-to-agent lateral movement via shared memory and context one agent trusts because another agent trusted it.
In 2026, the majority of MCP deployments have no enforced security boundary between agent and tools and only 8.5% of MCP servers use OAuth for authentication[01]; 73% of production AI deployments are vulnerable to prompt injection[02]. OpenAI has formally labeled prompt injection a "frontier, unsolved security problem"[03].
OAuth and the confused deputy
Most current MCP deployments use token passthrough. A user authorizes an agent with a broad OAuth token; the agent passes it to every server. This creates a classic confused deputy scenario: the agent acts with the user's full authority in contexts the user never intended.[04]
For instance, one agent may need access to Google Workspace so it can summarize emails, but because it receives the full OAuth token and passes it to every MCP server, a compromised tool can suddenly read confidential Drive documents and send emails on your behalf.
There are, however, emerging best practices:
- The agent is a distinct principal from the user.
- The agent receives a task-scoped delegation token, not the user's full credential.
- Tokens are short-lived and revocable, issued by the safety kernel, not by individual agents.
Universality vs segmentation: the structural paradox
The cockpit's value proposition is universality: one interface, all systems, all context. But at the same time, its security requires a full segmentation, meaning that no agent can reach everything.
This creates a paradox. A cockpit that is truly unified at the UI layer must be deeply segmented under the surface. Every connection must have a credential boundary, every action must get an explicit delegation, and every escalation is a structured, auditable handoff.
This presents a real challenge. The architecture must live with this contradiction and own it completely, rather than hand-waving it away with basic solutions like scoped agents.
Three latencies
There are three kinds of latencies:
- Technical latency — processing time for LLM inference, tool calls, network trips.
- Perceptual latency — how fast the system feels, shaped by visible progress and responsiveness.
- Cognitive latency — how long the human must hold uncertainty about whether the system will do the right thing.
Technical optimizations (model speed, faster I/O) help, but they do not solve the cognitive problem. A frozen spinner during a critical sequence feels catastrophic, even if the result arrives in three seconds.
Conversational vs command: a false choice
Pure chat interfaces fail because enterprise work is not inherently conversational. Complex, parameterized instructions with constraints do not map cleanly to back-and-forth dialogue.
Pure command interfaces fail because enterprise intent is ambiguous and context-laden. Forcing users to learn a DSL (domain-specific language) to express "fix the deal with customer Y" pushes cognitive burden back onto humans.
The correct pattern is intent-first, plan-driven:
- The user expresses intent in natural language.
- The system interprets it, asks clarifying questions if needed and generates a structured execution plan with parameters, constraints, and decision points.
- The user reviews, edits, and confirms.
- The agent executes under that plan, with stage gates where needed.
Speculative execution and parallelism
Speculative execution is one of the most effective techniques for reducing perceived latency in agentic systems. While the user is reviewing a plan (often 5-15 seconds), the system pre-executes high-confidence dry-run sub-steps whose results are very likely to be needed regardless of minor plan edits.
When the user clicks Confirm some results are already available. If the plan is modified in ways that invalidate speculative work, that work is discarded. This is a direct borrow from CPU speculative execution applied to agent workflows.
Additionally, tool calls that do not depend on each other's outputs should never be serialized. Parallelizing them can halve perceived latency with no conceptual complexity.
Approval fatigue and fake governance
Approval fatigue is a governance catastrophe. Here is what happens when humans receive too many approval prompts:
- They approve faster than they read.
- They treat approvals as a mechanical step.
- The audit log records approvals that were never substantively given.
This outcome is worse than full autonomy! The company gains neither efficiency nor real oversight. To address this, there are three common models for human oversight:
- HITL (Human-in-the-Loop) — the agent proposes actions, but execution is blocked until a human explicitly approves.
- HOTL (Human-on-the-Loop) — the agent executes autonomously within predefined boundaries, while humans supervise, monitor, and retain intervention authority.
- HIC (Human-in-Command) — humans define objectives, constraints, and governance, while the system operates continuously and autonomously unless escalation or override is required.
Four-tier action classification
The solution: classify actions by reversibility and how much damage if it goes wrong:
| Tier | Characteristics | Approval posture | Examples |
|---|---|---|---|
| Tier 1 Autonomous | Reversible, narrow scope | Execute and log | Draft internal emails, summarize documents, search internal memory |
| Tier 2 Notify | Reversible, moderate scope; real systems touched | Execute, surface for review | Update non-critical CRM fields, create tasks, schedule internal meetings |
| Tier 3 Approve | Partially reversible or sensitive | Stage for human approval | Send external emails, modify pricing on an account, create a new vendor |
| Tier 4 Governance | Irreversible or high blast radius | Multi-party sign-off, challenge-and-response | Payment execution, contract signature, access changes, data deletion |
T3 and T4 actions must stay a small share of total actions. If they are not, the agent's scope is too broad and will bury the user under approval volume.
Challenge-and-response approvals
For T3 and T4 actions, simple Approve/Deny buttons encourage rubber-stamping. A better pattern is a structured checklist the approver must complete:
- I confirm the intent of this action is what I authorized.
- I confirm the data sources used are appropriate.
- I confirm this action is within the agent's declared scope.
- I understand which systems will be affected and the blast radius.
- I know how to reverse this action if needed.
This takes 30-60 seconds, but that is appropriate for high-stakes actions.
HITL as learning signal
HITL systems fail when human oversight becomes repetitive friction rather than a learning mechanism.
The correct architecture treats every HITL decision as a training signal:
- Approvals without modification → candidates for T1/T2 promotion.
- Approvals with edits → reveal miscalibrations in the agent's reasoning or prompts.
- Denials → indicate scope definitions that are too broad or misaligned.
- Approved actions that later produce bad outcomes → signals that policy or risk models need revision.
This creates a HITL flywheel: every human review helps recalibrate the system, progressively reducing the number of actions that require user intervention.
Empirical data is discouraging
- IDC: 54% of AI proofs-of-concept never reach production.[05]
- MIT NANDA: GenAI pilot failure around 95%.[06]
- PwC CEO survey: 56% of CEOs see no financial impact from AI investments despite adoption.[07]
- Gartner: over 40% of agentic AI projects are predicted to be cancelled by 2027.[08]
However, small companies have a structural advantage. The founder decides, the team aligns in days not months, and the tool stack is 5-15 SaaS products, not 200.
The failure pattern doesn't have to apply if you adopt progressively and avoid one specific trap.
The one trap to avoid: contested reality
In a naive transition, both old interfaces (Notion, Salesforce, Gmail, Stripe) and the cockpit are live and can write to operational truth. This creates a contested reality where two systems are both authoritative in practice but neither formally primary.
A forecast built on partially migrated data can be both precise and wrong. A support response based on cockpit memory can contradict what someone set manually in the underlying tool five minutes earlier.
The result is not hallucination in the classic LLM sense. It is institutional hallucination, multiple inconsistent truths written to systems by multiple humans and agents.
The fix is straightforward in principle: company memory must stay as close to real time as possible, and every important field must have a declared source of authority. If someone updates the CRM, the cockpit should know quickly; if Slack, email, and the CRM disagree, the system should know which source is allowed to decide.
The adoption pattern: connect, observe, act, expand
Migration to Agentic OS for a small team follows a natural rhythm:
The people side
In a large enterprise, migration is politically charged. It threatens middle-management roles, coordination empires, and information gatekeepers. In a small company, the friction is lighter but still real:
- The person who "owns the spreadsheet" may feel exposed when that knowledge becomes shared memory.
- Team members comfortable with their current routines may resist changing habits.
- If the owner doesn't visibly adopt the cockpit first, no one else will.
The fix is leading by example, starting with a domain that clearly saves everyone time, and making early wins visible fast. Once one loop works noticeably better through the cockpit, adoption tends to pull itself forward.
The Agentic Enterprise OS is still destination, not something you can buy off the shelf today. There are still a number of limitations and open questions that need to be addressed.
What about the costs?
For a 10-person company running this daily, you may expect LLM inference costs of $200–$2,000/month depending on usage volume and model choice, excluding Agentic OS vendor fees and the SaaS subscriptions that, at least initially, are still required underneath the system.
At first glance, this can feel like an additional operational burden, especially for small businesses already saturated with software costs. The economic value is in compressing administrative overhead, reducing context switching, and allowing owners and operators to focus on higher-value work such as sales, execution, customer relationships, decision-making, and growth.
For many small business owners already overwhelmed by operational complexity and unsure how to transition into the AI era, this may therefore become a powerful technology investment with a positive ROI.
Can it work for everyone?
This model also doesn't work well for companies whose work is inherently non-procedural (pure creative agencies, early-stage R&D). It works best where there are repeating operational loops that currently live in human heads and scattered tools.
Privacy, GDPR, and employee trust
A cockpit that ingests email, Slack, documents, CRM records, billing data, and calendar events can quickly feel like surveillance if deployed carelessly. For European SMBs, this is not only a cultural concern but a legal one: GDPR imposes clear obligations around purpose limitation, data minimization, lawful basis, retention, and the right to erasure, none of which map cleanly onto an always-on agentic memory layer.
Before deployment, a company needs explicit answers to a small set of questions: what is ingested, what is excluded by default (private DMs, personal email, HR conversations), who can query what, how long memory is retained, how deletion requests propagate across the memory graph, and whether any data leaves the EU through model providers. Data processing agreements with LLM vendors, sub-processor chains, and cross-border transfer mechanisms all need to be worked out before the cockpit touches production data.
Reliability and rollback must be designed in
Agentic execution will fail. Agents will misread context, act on stale data, pick the wrong tool, trigger the right action at the wrong time, or chain several small mistakes into one large one.
A serious Agentic OS therefore needs rollback patterns from day one. Every executed action should be logged with enough context to be explained, undone, or compensated. Irreversible actions (payments, deletions, external communications, contract changes) need explicit escalation paths, dry-run modes, and manual override.
Liability remains unresolved
Most agentic deployments still sit on top of SaaS-era contracts written for passive software, not autonomous execution. If an agent misprices a product, sends a misleading customer message, or performs an action that causes damage, liability will often remain with the deploying company, even if the mistake originated in the agent's reasoning.
This is a major open question for the Agentic Enterprise OS. Companies will need clear internal accountability for agent outcomes, and vendors will need contracts that recognize agents as operational actors, not just software features. Until that legal model matures, high-impact actions should stay gated, auditable, and tied to explicit human approval.
State of the Market
We are not the only ones seeing this shift. Major players are already moving in this direction, each from a different angle:
- Salesforce is pushing the "Agentic Enterprise" through Agentforce 360, with Data 360 as context and Slack as a collaborative agentic work surface.[09]
- Microsoft is expanding Copilot Studio into multi-agent orchestration across Microsoft 365, Fabric, SDKs, and Agent-to-Agent protocols.[10]
- ServiceNow is building AI Control Tower capabilities to govern and monitor agent workforces.[11]
- Palantir has long framed AIP, Foundry, and Apollo as an operating system for AI-driven operations.[12]
- SAP and Workday are turning core systems of record into agent platforms and agent governance layers.[13]:
- UiPath is extending automation into an agentic control plane through Maestro.[14]
- Atos, in Europe, is framing agentic AI as sovereign production infrastructure through Sovereign Agentic Studios.[15]
These moves confirm the direction that agentic AI is becoming an operating model. But most approaches remain enterprise-led, ecosystem-bound, consulting-heavy, or tied to a specific system of record. The missing layer is still the founder-friendly version: one cockpit for small and medium businesses, working across the tools they already use, with memory, governance, and execution designed together.
For a small business owner, the real question isn't "should I use AI?" You already know you should. The question is how to do it?.
You face four choices:
We believe option 4 is where this is all heading. Not because it's easy (this paper has tried to be honest about how hard it is) but because the alternative is spending the next decade as the human glue between fifteen SaaS tabs.
- According to NimbleBrains analysis of the full MCP registry, published March 2026: "The State of MCP Security in 2026" https://nimblebrain.ai/mcp/mcp-security/state-of-mcp-security/ ↩
- According to Obsidian Security's analysis of production AI deployments, prompt injection appears in over 73% of enterprise AI deployments assessed. Obsidian Security, "Prompt Injection Attacks: The Most Common AI Exploit in 2025" https://www.obsidiansecurity.com/blog/prompt-injection ↩
- OpenAI CISO Dane Stuckey stated, in October 2025 at the launch of the ChatGPT Atlas browser: "prompt injection remains a frontier, unsolved security problem, and our adversaries will spend significant time and resources to find ways to make ChatGPT agent fall for these attacks." Source: Simon Willison, "Dane Stuckey (OpenAI CISO) on prompt injection risks" https://simonwillison.net/2025/Oct/22/openai-ciso-on-atlas/ ↩
- The confused deputy problem was originally described by Norm Hardy in "The Confused Deputy (or why capabilities might have been invented)" Operating Systems Review, ACM, October 1988. https://people.cs.vt.edu/~kafura/cs6204/Readings/ConfusedDeputy.pdf; Token passthrough is explicitly forbidden by the official MCP specification, which states: "If the MCP server not only accepts tokens with incorrect audiences but also forwards these unmodified tokens to downstream services, it can potentially cause the 'confused deputy' problem." Official MCP Authorization Specification, June 2025: https://modelcontextprotocol.io/specification/2025-06-18/basic/authorization; For implementation-level analysis of the vulnerability in the wild, see: Obsidian Security, "When MCP Meets OAuth: Common Pitfalls Leading to One-Click Account Takeover", February 2026. https://www.obsidiansecurity.com/blog/when-mcp-meets-oauth-common-pitfalls-leading-to-one-click-account-takeover ↩
- Lenovo CIO Playbook 2026 (4th annual edition), a study commissioned by Lenovo and conducted by IDC, published January 2026, surveying 3,000+ IT and business decision-makers globally. Found that only 46% of AI POCs have progressed to production-scale deployment. Lenovo Newsroom, January 2026. https://news.lenovo.com/pressroom/press-releases/research-reveals-ai-is-paying-off-cios-arent-ready/ ↩
- MIT Project NANDA (MIT Media Lab) published The GenAI Divide: State of AI in Business 2025 in July 2025 (press coverage began August 2025), authored by Aditya Challapally, Chris Pease, Ramesh Raskar, and Pradyumna Chari. Methodology: systematic review of 300+ publicly disclosed AI initiatives, structured interviews with representatives from 52 organizations, and survey responses from 153 senior leaders across four major industry conferences. The report's precise finding: 95% of enterprise AI solutions delivered no measurable P&L impact. Only 5% of custom enterprise AI tools reached production. Reported by Fortune, August 2025: https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/ ↩
- PwC's 29th Global CEO Survey, "Leading Through Uncertainty in the Age of AI" published January 19, 2026 at Davos. Conducted September 30 to November 10, 2025, with 4,454 CEOs across 95 countries and territories. Exact finding: "more than half (56%) say they've realized neither revenue nor cost benefits" from AI; "only one in eight (12%) report both" positive impacts. Full report: https://www.pwc.com/gx/en/ceo-survey/2026/pwc-ceo-survey-2026.pdf ↩
- Gartner press release, June 25, 2025. "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027." Anushree Verma, Senior Director Analyst, Gartner. https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027 ↩
- Salesforce launched Agentforce 360 in late 2025, positioning Slack as the primary front-end interface for its agentic AI ecosystem. Agentforce in Slack grounds agent responses in both Slack conversational data and Salesforce CRM data, enabling agents to take actions such as creating and updating channels, lists, and canvases directly in the flow of work. Sources: Salesforce EU, "Work with AI Agents in Slack using Agentforce" https://www.salesforce.com/eu/slack/agentforce/ ↩
- Microsoft Copilot Studio has expanded into multi-agent orchestration, with capabilities reaching general availability in April 2026. The platform enables agents built in Copilot Studio, Azure AI Foundry, Microsoft Fabric, and the Microsoft 365 Agents SDK to collaborate across workflows using open Agent-to-Agent (A2A) protocols. Source: Microsoft Copilot Blog, "What's new in Copilot Studio: Updates to multi-agent systems" https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/new-and-improved-multi-agent-orchestration-connected-experiences ↩
- ServiceNow launched AI Control Tower at Knowledge 2025 (May 2025) as a centralized command center to govern, manage, secure, and realize value from any AI agent, model, and workflow. In May 2026, ServiceNow further expanded AI Control Tower through a deeper integration with Microsoft Agent 365. Sources: ServiceNow Newsroom, "ServiceNow launches AI Control Tower at Knowledge 2025" https://www.servicenow.com/uk/company/media/press-room/ai-control-tower-knowledge-25.html; ServiceNow Newsroom, "ServiceNow expands AI agent governance through deeper integration with Microsoft" https://newsroom.servicenow.com/press-releases/details/2026/ServiceNow-expands-AI-agent-governance-through-deeper-integration-wi ↩
- Palantir explicitly describes its combined AIP + Foundry + Apollo architecture as an "Enterprise Operating System." Foundry serves as the data operations platform, AIP as the generative AI platform, and Apollo as the continuous delivery platform. Source: Palantir, "Integrated platforms: AIP, Foundry, and Apollo" https://palantir.com/docs/foundry/architecture-center/platforms/ ↩
- SAP has grown Joule to over 40 AI agents and 2,400+ skills embedded across S/4HANA, Ariba, SuccessFactors, and IBP, and launched an AI Agent Hub within SAP LeanIX for central agent governance. Workday announced its Agent System of Record in February 2025, providing a unified governance layer for managing both Workday and third-party AI agents. Sources: SAP News Center, "SAP Business AI: Release Highlights Q1 2026" https://news.sap.com/2026/04/sap-business-ai-release-highlights-q1-2026/; Workday Newsroom, "Workday Unveils New Agent System of Record" https://newsroom.workday.com/2025-02-11-The-Next-Generation-of-Workforce-Management-is-Here-Workday-Unveils-New-Agent-System-of ↩
- UiPath Maestro™ is described officially as "a cloud-native orchestration platform that unifies automation, AI agents, and human interactions into streamlined, end-to-end business processes" using BPMN (Business Process Model and Notation) and DMN (Decision Model and Notation) for visual workflow and rules modeling. https://www.uipath.com/platform/agentic-automation/agentic-orchestration ↩
- Atos Group launched Sovereign Agentic Studios on March 12, 2026, as a new operating model to help organizations move from agentic AI pilots to production under full sovereign control. Source: Atos Group Press Release, "Atos Group Launches Sovereign Agentic Studios" https://www.atosgroup.com/en/press/atos-group-launches-sovereign-agentic-studios-bring-ai-safely-production-across-organizations ↩
