Mar 2026 Update: Computer Use has gotten a lot better. I myself have been very impressed with Manus's ability to take over a browser tab and architect actions. While this has weakened the initial premise, I still think this is an interesting viewpoint on how much can change in a year. More importantly, the core of the original argument still holds: enterprises with workflows worth automating already have an SOP, and they would not choose an unorchestrated agent over an orchestrated one (which can make use of more repeatable building blocks like Playwright MCP).
There’s a lot of noise right now about autonomous agents — systems where you give an AI a goal and let it figure out how to accomplish it. OpenAI’s Operator, Computer Use, CrewAI — these products are betting big on this paradigm. The enterprise pitch is compelling on the surface. But I think it misreads how enterprises actually work.
Two kinds of agents
It helps to think about the agentic world as a spectrum.
On one end are orchestrated agents — discrete workflows with LLM elements baked in. The flow is predefined; the LLM handles specific steps like routing, extraction, or generation, but it doesn’t decide the overall path. Writer’s Agent Editor works this way. So does LangGraph.
On the other end are unorchestrated agents: you give the agent a task, and it decides how to accomplish it using whatever tools it has. More flexible, but significantly less reliable. On recent benchmarks, multi-agent systems still fail roughly 70% of their tasks. That's not a foundation you can build mission-critical workflows on.
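The contrast is easy to see in code. A minimal sketch in Python, with everything hypothetical: `call_llm` is a canned stub standing in for a real model API, and the ticket-routing workflow is an invented example, not any vendor's actual product.

```python
def call_llm(prompt: str) -> str:
    # Canned stub so the sketch runs without a real model; swap in a real API call.
    if prompt.startswith("Classify"):
        return "billing"
    if "Pick a tool" in prompt:
        return "DONE"
    return "stub response"

# Orchestrated: the path is fixed in code; the LLM only fills in specific steps
# (here, routing and drafting). It never chooses the overall flow.
def handle_ticket(ticket: str) -> str:
    category = call_llm(f"Classify this ticket as 'billing' or 'tech': {ticket}")
    if category.strip() == "billing":
        return call_llm(f"Draft a billing response for: {ticket}")
    return call_llm(f"Draft a tech-support response for: {ticket}")

# Unorchestrated: the LLM decides the path, tool by tool, until it declares done.
def run_agent(goal: str, tools: dict, max_steps: int = 30) -> str:
    history = goal
    for _ in range(max_steps):
        action = call_llm(f"Given: {history}\nPick a tool from {list(tools)} or say DONE.")
        if action.startswith("DONE"):
            return history
        history += "\n" + tools[action.strip()](history)
    raise RuntimeError("agent did not converge")
```

In the first function you can enumerate every path at review time; in the second, the set of possible trajectories depends on whatever the model emits.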
Enterprises already have SOPs
The core argument for unorchestrated agents is that they handle ill-defined, ambiguous processes better. The problem is that enterprises don’t rely on ill-defined processes for anything important.
Think about how pricing a contract works at a large company. You don’t hand someone a pile of documents and say “figure it out.” You follow the procedure: talk to Deal Desk, get a quote, put it on paper. That process exists because hundreds of people have refined it over years. It works for 99% of cases.
Enterprises have SOPs for everything mission-critical. That means the upfront cost of building an orchestrated agent (codifying the workflow) has largely already been paid. You're translating an existing process, not inventing a new one.
The guardrail problem
Orchestrated agents have another practical advantage: they’re fully guardrailable. When you define all possible paths through a workflow, you can reason about all possible failure modes. Unorchestrated agents can’t offer that. The best guardrail Cursor has for its agentic loops is killing the process after 30 iterations — a hard stop, not real safety.
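To make the difference in guardrail surface concrete, here is a sketch under invented names (the 30-iteration cap mirrors the hard stop described above): an unorchestrated loop can only be bounded from the outside, while an orchestrated workflow can reject forbidden transitions by construction.

```python
# Unorchestrated guardrail: a blunt iteration cap. Inside the loop, anything goes.
def capped_loop(step, state, max_iters: int = 30):
    for _ in range(max_iters):
        state, done = step(state)
        if done:
            return state
    raise RuntimeError("hard stop: iteration cap reached")

# Orchestrated guardrail: every transition is checked against an explicit
# allowlist, so states outside the defined workflow are unreachable by design.
ALLOWED = {"intake": {"quote"}, "quote": {"approve", "reject"}}

def transition(current: str, nxt: str) -> str:
    if nxt not in ALLOWED.get(current, set()):
        raise ValueError(f"forbidden transition: {current} -> {nxt}")
    return nxt
```

The cap tells you when the agent has run too long; the allowlist tells you, before anything runs, everywhere the workflow can possibly go.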
For enterprise use cases touching revenue, customers, or regulated data, that’s not good enough.
The industry already learned this
LangChain tried the unorchestrated loop approach a few years ago and ran into the same wall: LLMs weren't reliable enough. LangGraph was their response, a graph-based framework that enforces tight orchestration. The most widely used agentic framework in the ecosystem pivoted away from unorchestrated agents because the approach didn't work in practice.
Anthropic’s own decision framework for agents reinforces this: the higher the cost of error, the more you want human-in-the-loop or read-only modes. For genuinely mission-critical workflows, the cost of error is almost always high.
Who unorchestrated agents are actually for
Experimental engineers exploring the frontier. Research teams. Internal tools where a failure means a retry, not a compliance incident. That’s a real and valuable use case — just not an enterprise production one.
When models reach the reasoning level required to make autonomous agents reliably safe for high-stakes workflows, the calculus will change. But betting your mission-critical processes on that today is a risk most enterprises can’t afford to take.