MCP Is a Great Protocol. Stop Using It as Your Default.
MCP adoption accelerated fast. Then engineers hit production and started reporting back. This is what the benchmarks, practitioner data, and real failures actually show — and what it means for how you build agent integrations.
Key Findings
- Three MCP servers can consume 72% of an agent's context window before it processes a single user message.
- In a 985-task benchmark, neither raw CLI nor MCP was optimal — an agent-optimized CLI design beat both on every metric: cost, speed, reliability, and turns.
- MCP costs 4–32x more tokens than CLI for identical operations, per Scalekit's controlled benchmark of 75 runs.
- The deciding variable is not the protocol. It is whether the interface was designed for an agent or for a human.
- MCP's real use case is distribution — publishing capabilities to the broader agent ecosystem. For internal tooling, it is the wrong default.
What Actually Happened
MCP launched in late 2024 with a compelling pitch: a universal protocol that lets any AI agent discover and use any tool through a standardized interface. Build once, connect anywhere. Claude, Cursor, VS Code, and Windsurf all added support. Adoption moved faster than most open standards ever do — 97 million monthly SDK downloads across Python and TypeScript in the first year, with over 10,000 active servers.
Then engineers moved from demos to production, and the reports started coming in. In March 2026, Perplexity CTO Denis Yarats announced at the Ask 2026 conference that Perplexity was moving away from MCP internally — a company that had shipped its own MCP server just four months earlier. Y Combinator CEO Garry Tan posted that he got frustrated, built a CLI wrapper for Playwright in 30 minutes, and said it worked 100x better. The CTO of Merge reported his team's MCP setup consuming 40–50% of their context window before the agent did anything at all.
This wasn't backlash. It was engineers encountering a real structural problem at scale and reporting back honestly. The protocol wasn't broken. It was being used in the wrong places.
The Context Tax
Every MCP server connected to an agent loads its full tool schema into the context window upfront — names, descriptions, parameter types, enumerations, system instructions. This is the protocol working as designed. The problem is what it costs.
| Metric | Value |
|--------|-------|
| Tokens consumed before a single user message (GitHub + Slack + Sentry, ~40 tools) | 55,000 |
| Context window burned on tool definitions in a real three-server deployment | 72% |
| Maximum token cost multiplier of MCP vs CLI for identical operations | 32x |
The most telling single data point: a task as simple as checking which programming language a repository uses consumed 1,365 tokens via CLI and 44,026 via MCP. The difference is entirely schema — 43 tool definitions injected into every conversation, of which the agent uses one or two.
Stack multiple servers and it compounds fast. At 143,000 tokens consumed by tool definitions on a 200K-token model, the agent has 57,000 tokens left for the actual conversation, retrieved documents, reasoning, and response. That is not a configuration problem. That is the predictable result of a design that loads everything upfront.
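The mechanics are easy to see in miniature. Below is a minimal sketch: one MCP-style tool definition (the `name` / `description` / `inputSchema` shape follows the protocol's convention, but the `create_issue` tool and its fields are hypothetical), plus a crude token estimate assuming roughly 4 characters per token. Every connected server injects dozens of these into every conversation.

```python
import json

# Hypothetical MCP-style tool definition. The shape (name, description,
# inputSchema) follows the protocol's convention; the contents are invented.
create_issue_tool = {
    "name": "create_issue",
    "description": "Create a new issue in a repository. Requires write access.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "owner": {"type": "string", "description": "Repository owner"},
            "repo": {"type": "string", "description": "Repository name"},
            "title": {"type": "string", "description": "Issue title"},
            "body": {"type": "string", "description": "Issue body (markdown)"},
            "labels": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Labels to apply",
            },
        },
        "required": ["owner", "repo", "title"],
    },
}

def estimate_tokens(obj, chars_per_token=4):
    """Crude estimate: serialized JSON length / ~4 characters per token."""
    return len(json.dumps(obj)) // chars_per_token

per_tool = estimate_tokens(create_issue_tool)
# 43 definitions of similar size, loaded up front into every conversation,
# regardless of whether the agent ever calls them:
total = 43 * per_tool
print(per_tool, total)
```

The agent pays this cost on every turn of every conversation, which is why the overhead scales with the number of connected servers rather than with the work actually performed.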
"MCP is how we wish AI worked. CLI is how it actually works today." — Vitor Zucher, 3x Founder, March 2026
But Raw CLI Is Not the Answer
The community's immediate response — rip out MCP, replace it with shell commands — solves the token problem and creates different ones. Raw CLI is unstructured. It returns text the agent has to parse and interpret. There is no typed output, no reliable error handling, no defined empty states. Agents improvise through --help flags and guess at subcommand behavior.
More critically, raw CLI fails silently in ways MCP does not. The benchmark data makes this visible.
| Interface | Success Rate | Avg Cost / Task | Avg Duration | Avg Turns |
|-----------|-------------|-----------------|--------------|-----------|
| AXI (agent-optimized CLI) | 100% | $0.050 | 15.7s | 3 |
| Raw CLI | 86% | $0.054 | 17.4s | 4 |
| MCP (raw) | 87% | $0.148 | 34.2s | 6 |
| MCP + Tool Search | 82% | $0.147 | 41.1s | 8 |
| MCP + Code Mode | 84% | $0.101 | 43.4s | 7 |
Source: Kun Chen, Lead Principal Engineer, Atlassian Rovo. 985 task runs, Claude Sonnet 4.6 as agent and judge. GitHub repository tasks including simple lookups, multi-step investigations, aggregate counts, and error handling.
Raw CLI failed specifically on aggregate tasks. When a CLI returns a paginated list, the agent has no way to know whether the page size equals the total count. It assumes it does. The result is a silent, uncorrectable wrong answer — no error thrown, no retry triggered. That 86% success rate is not close enough for production.
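The failure mode is easy to reproduce. A hypothetical sketch (the 130-issue dataset and page size are invented): a paginated listing where a naive agent treats the first page as the whole result set, next to an interface that reports the total explicitly.

```python
# Hypothetical repository with 130 open issues, served 30 per page,
# mimicking a CLI that prints one page of results with no total.
ISSUES = [f"issue-{i}" for i in range(130)]
PAGE_SIZE = 30

def list_issues_page(page=0):
    """What a human-oriented CLI effectively returns: one page, no total."""
    start = page * PAGE_SIZE
    return ISSUES[start:start + PAGE_SIZE]

# Naive agent behavior: count what came back and report it as the answer.
# No error is thrown, no retry is triggered -- the answer is simply wrong.
naive_count = len(list_issues_page())  # 30

# Agent-oriented behavior: the interface states the total, so the model
# never has to infer it from a page.
def list_issues_structured(page=0):
    return {
        "total_count": len(ISSUES),
        "page": page,
        "items": list_issues_page(page),
    }

true_count = list_issues_structured()["total_count"]  # 130
print(naive_count, true_count)
```

Nothing in the naive path signals that anything went wrong, which is exactly what makes the failure uncorrectable.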
Adding Tool Search to MCP to reduce upfront schema loading made things worse: success dropped to 82% and average turns rose from 6 to 8, because the extra discovery steps burned more context than the schema reduction saved.
The Variable Nobody Is Talking About
The finding that emerged from Atlassian's benchmark is the one that reframes the entire debate. The winning approach — AXI, Agent eXperience Interface — is not a new protocol. It is a CLI designed around how agents actually consume information, rather than how humans do.
The design principles are straightforward: pre-computed fields so agents never have to infer totals, explicit empty states so "0 results" is never ambiguous, structured error output that agents can parse without guessing, and output discipline that separates data from debug information. The result was 100% success at $0.050 per task — a third of the cost of raw MCP, half the cost of the cheapest MCP variant, and faster than every alternative tested.
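These principles can be made concrete in a few lines. The sketch below is a hypothetical output envelope, not the actual AXI format (which the benchmark write-up does not publish): totals are pre-computed, the empty state is explicit, errors are machine-parseable, and the data stream carries nothing but data.

```python
import json

def make_envelope(items=None, error=None):
    """Hypothetical AXI-style output envelope illustrating the design
    principles: pre-computed totals, explicit empty states, structured
    errors, and data kept separate from diagnostics."""
    if error is not None:
        code, message = error
        # Structured error: a stable code plus a human-readable message,
        # so the agent parses instead of guessing from freeform text.
        return {"ok": False, "error": {"code": code, "message": message}}
    items = items or []
    return {
        "ok": True,
        "total_count": len(items),  # pre-computed: the agent never infers it
        "empty": len(items) == 0,   # explicit: "0 results" is unambiguous
        "items": items,
    }

# Data goes to stdout as a single JSON document; anything diagnostic
# (progress, timing, warnings) would go to stderr, never interleaved.
print(json.dumps(make_envelope(items=["PROJ-1", "PROJ-2"])))
print(json.dumps(make_envelope(items=[])))
print(json.dumps(make_envelope(error=("AUTH_EXPIRED", "token expired"))))
```

The same envelope works whether the underlying transport is a CLI printing JSON or a tool call returning it, which is the point: the protocol is not the variable.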
The conclusion from Kun Chen's analysis is blunt: "The debate between CLI and MCP misses the deeper question — what design principles make any interface effective for agents? It's not a CLI. It's not an MCP."
Most teams are choosing between two protocols while never asking whether either was designed for the agent consuming it. The answer is almost always no.
Where MCP Is Actually the Right Call
None of this makes MCP the wrong choice everywhere. There is one scenario where its architecture is not just acceptable — it is irreplaceable.
If you are a SaaS company and you want any AI agent, built by anyone, to discover and use your product, MCP is how you publish that capability to the ecosystem. Stripe, Asana, Intercom, Cloudflare, and Webflow have all done this. It is a distribution channel. Agents connecting to these platforms do not need to know the underlying implementation — they discover the tools through the protocol, authenticate, and use them. There is no CLI equivalent for this use case.
MCP also earns its overhead for fixed, well-defined tool sets used frequently — cases where the schema cost is amortized across many interactions and where typed, structured tool calls reduce failure rates on complex multi-step workflows. For anything requiring OAuth, scoped permissions, or audit trails across organizational boundaries, MCP's auth model is genuinely better than long-lived API keys in config files.
The GitHub example is instructive in both directions: GitHub's official MCP server underperformed raw CLI in every benchmark. But a well-designed MCP server with fewer, better-composed tools — rather than 43 tool definitions dumped into every conversation — would narrow that gap significantly. The problem is not the protocol. It is the implementation.
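What "fewer, better-composed tools" means can be sketched directly. The example below is hypothetical (these are not GitHub's actual tool definitions): six narrow lookup tools collapse into one parameterized tool, and the schema the agent must carry in context shrinks accordingly.

```python
import json

def tool(name, desc, props, required):
    """Build a minimal MCP-style tool definition (invented contents)."""
    return {"name": name, "description": desc,
            "inputSchema": {"type": "object",
                            "properties": props, "required": required}}

num = {"number": {"type": "integer"}}
qry = {"query": {"type": "string"}}

# Six narrow tools, each carrying its own schema into every conversation.
granular_tools = [
    tool("get_issue", "Fetch one issue by number.", num, ["number"]),
    tool("list_issues", "List issues in a repository.", {}, []),
    tool("search_issues", "Search issues by query.", qry, ["query"]),
    tool("get_pull_request", "Fetch one pull request by number.", num, ["number"]),
    tool("list_pull_requests", "List pull requests.", {}, []),
    tool("search_pull_requests", "Search pull requests by query.", qry, ["query"]),
]

# One composed tool covering the same operations with two discriminators.
composed_tool = tool(
    "query_items",
    "Fetch, list, or search issues and pull requests.",
    {"kind": {"enum": ["issue", "pull_request"]},
     "action": {"enum": ["get", "list", "search"]},
     "number": {"type": "integer"},
     "query": {"type": "string"}},
    ["kind", "action"],
)

granular_cost = sum(len(json.dumps(t)) for t in granular_tools)
composed_cost = len(json.dumps(composed_tool))
print(granular_cost, composed_cost)  # the composed schema is far smaller
```

Composition trades a little parameter complexity for a large reduction in per-conversation schema cost, which is exactly the overhead the benchmarks penalize.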
The Decision
Use CLI (designed for agents) when:
- Internal developer workflows — git, deployments, infra, CI/CD
- Tools with mature CLIs the model already knows from training
- Prototyping where speed matters more than governance
- Cost-sensitive production workloads
- Single-user or single-tenant environments
Use MCP when:
- Publishing your SaaS product to the agent ecosystem
- Fixed, well-defined tool sets used at high frequency
- Multi-tenant products requiring scoped OAuth and audit logs
- Tools with no CLI and no public API
- Any integration that needs to be agent-agnostic
The shorthand: if you are automating workflows inside your own environment, design a CLI for agents. If you are publishing capabilities to the outside world for agents you do not control, build an MCP server.
The Real Gap in the Market
The CLI vs. MCP debate has been framed as a protocol war. It is not. It is a design problem. The vast majority of agent integrations in production today — internal tools, proprietary systems, developer toolchains — were built for humans first and never revisited for agents. That is the actual gap.
Building an agent integration that performs in production means asking a different question: not which protocol, but what does the agent need to know, in what format, at what cost, to complete this task reliably? Answer that and the choice between MCP and CLI becomes obvious.
Most teams are not asking it yet.