MCP vs. CLI Benchmarking

My thoughts

Oct 04, 2025

Someone recently wrote a great, detailed post benchmarking MCP servers against CLI tools for AI agents. The author, Mario Zechner, built a tool, created both MCP and CLI interfaces for it, and then ran it through a series of tests to see which performed better. Pretty awesome stuff.

His conclusion? It’s a wash. The protocol you use (MCP or CLI) is just plumbing. What really matters is how well your tool is designed.

He’s right. In the end, they’re both prompting tricks, and in theory they can perform closely enough that the difference doesn’t matter. In practice, it can be a different story. We’ve previously written more about our thoughts on tool architectures and MCP vs. CLI token differences.

The choice is an architectural decision about your workflow. The real question isn’t “Which is better?” but “Who is this tool for?”

The Right Tool for the Job

MCP and CLI tools are different ways to tell an AI about a function it can ask you to run. The model doesn’t care how that tool definition got into its prompt.
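To make that concrete, here is a minimal sketch of one tool described two ways. The tool name `word_count` and its single `path` argument are hypothetical, invented for illustration. The dictionary follows the `name` / `description` / `inputSchema` layout used for MCP tool listings, and the CLI side is plain `argparse` help text. Either way, what reaches the model is just text.

```python
import argparse
import json


def build_cli_parser() -> argparse.ArgumentParser:
    """CLI description of a hypothetical tool (names are illustrative)."""
    parser = argparse.ArgumentParser(
        prog="word_count",
        description="Count the words in a text file.",
    )
    parser.add_argument("path", help="Path to the text file.")
    return parser


def build_mcp_style_definition() -> dict:
    """The same hypothetical tool as an MCP-style tool definition."""
    return {
        "name": "word_count",
        "description": "Count the words in a text file.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path to the text file."}
            },
            "required": ["path"],
        },
    }


def as_prompt_text() -> dict:
    """Render both descriptions as the plain text a model would end up reading."""
    return {
        "cli": build_cli_parser().format_help(),
        "mcp": json.dumps(build_mcp_style_definition(), indent=2),
    }


if __name__ == "__main__":
    for kind, text in as_prompt_text().items():
        print(f"--- {kind} ---")
        print(text)
```

Both renderings carry the same three facts: what the tool is called, what it does, and what argument it takes. How an agent actually injects either one into its prompt varies by client, so treat this as an illustration rather than a description of any specific agent.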

CLI tools are for your personal workflow. I build a set of simple scripts for every project I work on and put them in a Justfile. Then I tell my agent to use that file. It’s the most token-efficient way to work because it gives the model only the tools it needs for the job.
MCP servers are for sharing. If I want a colleague or a non-technical client to use my tools, sending them a folder of scripts is a pain, and helping them install what they need and teaching them to use a terminal can be tricky. Walking them through adding an MCP server to Claude Desktop or another agent is simpler. MCP makes tools portable and plug-and-play.

It’s about choosing the right workflow for the human and the intended audience.

The High Cost of Clutter

Zechner’s post shows that context clutter kills agent performance. Many general-purpose MCP servers have dozens of tools you don’t need for a specific task.

One popular GitHub MCP server, for instance, injects definitions for 93 tools, consuming 55,000 tokens before you’ve even asked a question.
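A quick back-of-the-envelope calculation puts those numbers in perspective. The 93 tools and 55,000 tokens come from the figures above. The assumptions are mine: that every tool definition costs roughly the same, and that a given task needs only five of them (a hypothetical number picked for illustration).

```python
TOOL_COUNT = 93         # tool definitions injected, per the figure above
TOTAL_TOKENS = 55_000   # tokens consumed by those definitions, per the figure above
TOOLS_NEEDED = 5        # hypothetical: tools one task actually uses


def average_tokens_per_tool(total_tokens: int, tool_count: int) -> float:
    """Average definition cost, assuming all tools cost about the same."""
    return total_tokens / tool_count


def estimated_overhead(tools_kept: int, avg_tokens: float) -> int:
    """Rough token overhead if only `tools_kept` definitions were injected."""
    return round(tools_kept * avg_tokens)


avg = average_tokens_per_tool(TOTAL_TOKENS, TOOL_COUNT)
trimmed = estimated_overhead(TOOLS_NEEDED, avg)

print(f"Average per tool: {avg:.0f} tokens")              # Average per tool: 591 tokens
print(f"Five tools only:  {trimmed} tokens")              # Five tools only:  2957 tokens
print(f"Tokens saved:     {TOTAL_TOKENS - trimmed}")      # Tokens saved:     52043
```

Real tool definitions vary in length, so the per-tool average is only a rough guide. The order of magnitude is the point: most of that context is spent on tools the task never touches.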

However, Zechner’s benchmark revealed a small hidden cost for CLIs: Claude Code runs a security check on every command, adding token overhead that MCP calls bypass entirely. I believe this matters less than all the other considerations, so it won’t affect my decision-making process.

Ultimately, the benchmark reinforced a core principle of clarity: strip everything down to its cleanest components.

Better Tools, Not Better Protocols

I agree with Zechner’s takeaway: we should focus on building better tools. The protocol is just plumbing.

Start with a well-designed, token-efficient CLI. It’s simpler. Once you have that, adding an MCP server for a different distribution pattern is straightforward. But always start with the work itself. Build a tool that helps an agent complete a task cleanly and efficiently.
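One way to keep that later step cheap is to put the real work in a plain function and make the CLI a thin adapter around it. The sketch below assumes a hypothetical `summarize` tool; the function name, arguments, and output fields are all invented for illustration. The output is compact JSON with no decoration, so the agent spends few tokens reading it.

```python
import argparse
import json
import sys
from typing import List, Optional


def summarize_text(text: str, max_words: int = 20) -> dict:
    """Core logic of a hypothetical tool: word count plus a short preview."""
    words = text.split()
    return {
        "words": len(words),
        "preview": " ".join(words[:max_words]),
        "truncated": len(words) > max_words,
    }


def main(argv: Optional[List[str]] = None) -> int:
    """Thin CLI adapter: parse arguments, call the core function, print JSON."""
    parser = argparse.ArgumentParser(
        prog="summarize",
        description="Print word count and a short preview as compact JSON.",
    )
    parser.add_argument("text", nargs="?", default="", help="Text to summarize.")
    parser.add_argument("--max-words", type=int, default=20,
                        help="Maximum words in the preview.")
    args = parser.parse_args(argv)

    result = summarize_text(args.text, args.max_words)
    # Compact separators keep the output small for the agent reading it.
    print(json.dumps(result, separators=(",", ":")))
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Because `summarize_text` knows nothing about `argparse`, an MCP server could register the same function as a tool later without touching the logic. That wrapper is left out here since its exact shape depends on the MCP SDK you choose.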

Elite AI Assisted Coding
