Skills Over Tools: Why CLI Wrappers Beat MCP Servers for AI Agents

The standard approach to giving AI agents capabilities is tool definitions — fixed function signatures with typed parameters. MCP servers formalize this further: a dedicated process that exposes a set of tools over a protocol.

Both work. Both are also unnecessarily rigid. There’s a simpler pattern that’s more flexible and requires less infrastructure: a skill file that teaches the agent how to use an existing CLI tool.

The Three Approaches

Fixed tools are function signatures baked into the agent’s configuration. A browser tool might expose navigate(url), click(selector), extract_text(selector). Each action is a separate tool call with a defined schema.

MCP servers package these tools into a standalone process. The agent discovers available tools via the protocol, calls them with structured JSON, and gets structured JSON back. It’s tools-as-a-service.

Skills are instruction files — markdown documents that describe how to use a CLI tool via the agent’s existing shell access. No protocol, no server, no schema. Just documentation the agent reads and follows.
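To make the contrast concrete, here is a rough sketch of one "navigate" action in both styles. The JSON follows the general shape of an MCP tools/call request; the tool name and URL are taken from the examples above, the id is arbitrary, and a real server's schema may require different argument names.

```shell
#!/usr/bin/env sh
# One "navigate" action as an MCP client would send it: a JSON-RPC 2.0
# request using the protocol's tools/call method.
# ASSUMPTION: the tool name and argument key are illustrative only.
mcp_request='{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "navigate",
    "arguments": { "url": "https://example.com" }
  }
}'

# The same action through a skill is just the command line the agent runs.
cli_command='bb open https://example.com'

printf 'MCP request:\n%s\n\nCLI command:\n%s\n' "$mcp_request" "$cli_command"
```

Neither form is wrong; the point is only that the second one needs nothing beyond a shell.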

What a Skill Looks Like

I built bb, a browser automation CLI. Instead of wrapping it in an MCP server, I wrote a skill file — a markdown document that explains the commands:

# Browse
 
Browse the web using `bb`, a browser automation CLI
that drives a persistent headless Chrome instance.
 
## Quick reference
 
bb open https://example.com      # navigate and read
bb text ".article-body"          # extract specific content
bb click "button.load-more"      # interact
bb wait ".results"               # wait for dynamic content
bb screenshot page.png           # capture
bb stop                          # clean up

The file covers the most common commands and patterns — not all of them. It doesn’t have to. If the agent needs a command that isn’t in the skill file, it runs bb --help and figures it out. The skill provides the fast path; the CLI’s own documentation covers everything else.

Why This Works Better

No Capability Ceiling

Fixed tools define a finite set of operations. If no tool covers “wait for network idle,” the agent can’t do it, and if a task needs “wait, then extract text from a specific selector,” the agent has to stitch it together from separate calls. Someone has to anticipate every useful operation, and every combination worth doing in one step, and define a tool for it.

With a CLI, the agent composes commands freely:

bb open https://example.com
bb wait ".results"
bb text ".results .item:first-child"

No one had to define a navigate_wait_and_extract_first_item tool. The agent read the available commands and composed them to fit the task.
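Composition also covers failure handling. The sketch below chains the same three steps with `&&` so that a failed wait stops the sequence. The `bb` function here is a stand-in with invented output so the example runs without a browser, and it assumes the real bb exits non-zero when a command fails, which the article does not state.

```shell
#!/usr/bin/env sh
# Stand-in for the real bb CLI so this sketch runs without a browser.
# ASSUMPTIONS: the canned output is invented, and the real bb is assumed to
# exit non-zero when a command fails (for example, a wait that times out).
bb() {
  case "$1" in
    open) echo "opened $2" ;;
    wait) [ "$2" = ".results" ] || return 1; echo "found $2" ;;
    text) echo "First result" ;;
    *)    return 2 ;;
  esac
}

# Each step runs only if the previous one succeeded.
first=$(bb open "https://example.com" >/dev/null &&
        bb wait ".results" >/dev/null &&
        bb text ".results .item:first-child")
echo "first item: $first"

# A selector that never appears stops the chain before any text is extracted.
missing=$(bb open "https://example.com" >/dev/null &&
          bb wait ".no-such-element" >/dev/null &&
          bb text ".no-such-element") || echo "wait failed, nothing extracted"
```

Remove the stand-in function to run the same chain against the real CLI.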

Commands Chain Naturally

CLI tools compose with pipes and subshells. A single line can do what would take multiple round-trip tool calls:

bb text ".item" | grep "in stock" | wc -l

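That counts the in-stock items in one command, assuming bb text prints one matched element per line. With fixed tools, the agent would extract the text in one call and then either filter and count inside its own context or make further tool calls, each one a full round trip through the tool interface.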

The agent already knows bash. Piping, redirection, grep, jq, awk — these aren’t features anyone needs to build. They’re already there, and they work with any CLI that outputs plain text.
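The same idea can be tried without a browser. In this sketch a small function stands in for `bb text ".item"`, and the item names are invented. It assumes one matched element per line, which is what makes line-oriented tools like grep and awk fit.

```shell
#!/usr/bin/env sh
# Stand-in for `bb text ".item"`: the real command needs a running browser.
# ASSUMPTION: one matched element per line; the item names are invented.
bb_text_items() {
  printf '%s\n' \
    'Widget A - in stock' \
    'Widget B - sold out' \
    'Widget C - in stock'
}

# grep -c counts matching lines, which replaces `grep ... | wc -l`.
in_stock_count=$(bb_text_items | grep -c 'in stock')

# awk splits each line on " - " and prints the name of in-stock items only.
in_stock_names=$(bb_text_items | awk -F' - ' '/in stock/ { print $1 }')

echo "in stock: $in_stock_count"
echo "$in_stock_names"
```

None of the filtering logic is specific to the browser tool, so it carries over to any CLI with line-oriented output.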

No Server to Run

MCP servers are processes. They need to start, stay running, handle connections, manage state. That’s operational overhead for what amounts to “let the agent click buttons in a browser.”

A CLI needs no separate service. bb auto-starts Chrome on first use and persists state in ~/.bb/. The agent calls it like any other command. There is no daemon for you to manage and no protocol to speak.

The Agent Already Has a Shell

This is the part that gets overlooked. Agents that can execute bash commands already have a very flexible tool interface. Anything a tool definition wraps can also be run as a shell command, and the shell adds pipes, redirection, and scripting on top.

Adding an MCP server for browser automation means: install the server, configure the connection, handle the protocol overhead — all to give the agent capabilities it could have through bb open https://example.com.

Skill Files Are Self-Documenting

A tool definition is a schema. A skill file is documentation. The agent doesn’t just know what it can call — it knows when and how to use each command effectively.

From the bb skill:

## Tips
 
- Start with `bb open` — it navigates and extracts readable
  content in one step.
- For SPAs, use `bb open <url>` then `bb wait <selector>`
  before extracting.
- Use `bb ax-tree` when you need to understand page structure
  without reading raw HTML.

This kind of contextual guidance doesn’t fit in a tool schema. In an MCP server, it lives in description fields that are often too short to be genuinely useful. A skill file has no schema-imposed length limit; the only budget is the agent’s context window. Write whatever the agent needs to know.

Updates Are Trivial

Once a new command ships in bb, exposing it to the agent is a markdown edit to the skill file. No schema changes, no server redeployment, no version negotiation. The agent reads the updated file next time it needs to browse.

When Tools and MCP Servers Make Sense

This isn’t universal. Fixed tools and MCP servers work well when:

The agent doesn’t have shell access. Web-based agents or sandboxed environments need structured tool interfaces.
You need strict input validation. Tool schemas enforce types and required fields at the protocol level.
Multiple agents share the same server. MCP’s discovery protocol helps when you have many consumers.

But for a coding agent that already runs in a terminal, as many of them do, skills over CLIs are simpler, more flexible, and require less maintenance.

The Pattern

Build the capability as a CLI with plain-text output.
Write a skill file that documents usage, patterns, and tips.
Let the agent read the skill and call the CLI through bash.

The agent gets full access to the tool’s capabilities without anyone having to predict which combinations of operations it might need. The CLI handles the complexity. The skill file provides the context. The shell connects them.
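The three steps can be sketched end to end with a toy example. Everything here is hypothetical: `todo` is an invented CLI, written as a shell function so the sketch is self-contained, and the skill file goes to a temporary path because where an agent looks for skills depends on the agent.

```shell
#!/usr/bin/env sh
# Step 1: a capability exposed as a CLI with plain-text output.
# `todo` is a hypothetical example tool, written as a shell function so the
# sketch is self-contained; a real tool would be an executable on PATH.
TODO_FILE=$(mktemp)

todo() {
  case "$1" in
    add)    printf '%s\n' "$2" >> "$TODO_FILE" ;;
    list)   cat "$TODO_FILE" ;;   # one item per line, easy to pipe
    --help) echo "usage: todo add <text> | todo list" ;;
    *)      echo "todo: unknown command: $1" >&2; return 1 ;;
  esac
}

# Step 2: a skill file that documents usage, patterns, and tips.
SKILL_FILE=$(mktemp)
cat > "$SKILL_FILE" <<'EOF'
# Todo

Track tasks using `todo`, a plain-text task list CLI.

## Quick reference

todo add "write release notes"   # append a task
todo list                        # print all tasks, one per line

## Tips

- Pipe `todo list` into grep to filter tasks.
- Run `todo --help` for commands not covered here.
EOF

# Step 3: what the agent does after reading the skill: call the CLI
# through the shell and compose it with standard tools.
todo add "write release notes"
todo add "fix login bug"
bug_count=$(todo list | grep -c "bug")
echo "tasks mentioning bug: $bug_count"
```

The skill file mirrors the layout of the bb skill shown earlier: a one-line description, a quick reference, and tips, with `--help` as the fallback for anything not listed.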

No servers. No schemas. No protocol overhead. Just commands and documentation.