The Agent Ecosystem.
A tour of the shape of the field — the metaphors we bring to it, the risks inside it, and the decisions everyone building is actually arguing about.
The questions everyone building agents is actually arguing about.
Who is the agent for?
(the family computer)
- One license, many fingers — kids, partners, friends share the agent.
- Casual jobs. Brainstorming, writing, code-along, household research.
- Privacy boundary is the front door. No SLAs, no audit log, no IT.
- Failures are inconvenient, not catastrophic.
(the factory)
- Many seats, many roles. RBAC, audit log, retention, BAA.
- Throughput jobs. Summarize 10k tickets, generate 500 PRs, route inboxes.
- SLAs and pagers. Failures cost money or trust at scale.
- Procurement, security review, line-item invoicing.
Two audiences. Two metaphors. Same models underneath — wildly different products.
The lethal trifecta.
Three capabilities that are each useful on their own. Combine all three in one agent and an attacker can drain your data with a sentence of text.
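The text above does not name the three capabilities. A minimal sketch, assuming the common formulation of the trifecta (access to private data, exposure to untrusted content, and the ability to communicate externally), of how a harness might detect a tool set that combines all three. The tool names and capability tags are hypothetical.

```python
# Assumption: the trifecta is private data + untrusted content + external comms.
TRIFECTA = frozenset({"private_data", "untrusted_content", "external_comms"})

# Hypothetical tool catalog: tool name -> capabilities it grants the agent.
TOOL_CAPABILITIES = {
    "read_inbox": {"private_data", "untrusted_content"},
    "fetch_url": {"untrusted_content", "external_comms"},
    "read_notes": {"private_data"},
    "calculator": set(),
}


def combined_capabilities(tools):
    """Union of the capabilities granted by every tool in the list."""
    combined = set()
    for name in tools:
        combined |= TOOL_CAPABILITIES[name]
    return combined


def has_lethal_trifecta(tools):
    """True when the tool set, taken together, covers all three capabilities."""
    return TRIFECTA <= combined_capabilities(tools)
```

Under these tags, `read_inbox` plus `fetch_url` trips the check even though neither tool does so alone. The check runs over the whole tool set for that reason.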
Where agents live.
Three zip codes. Each one gets you different trust, different latency, different blast radius.
A · ON METAL Local hardware.
Yours end-to-end. Private by construction, limited by what fits in RAM.
B · CLOUD API Cloud API.
Best capability, strongest guardrails — trust boundary at the TLS handshake.
C · SHIPPED Binaries & software.
Agents as programs. Installable, sandboxed, carrying your creds.
Model vs Harness.
The model knows.
- Reasoning, planning, judgment under ambiguity.
- Deciding which tool, when, why.
- Long-context coherence — staying on the thread.
The harness can do.
- Loop control, retries, budgets, interruption.
- Tool catalog, sandboxes, permissions, memory.
- Everything around the model that makes it act.
A smart model in a dumb harness acts stupid. A dumb model in a thoughtful harness gets real work done.
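The harness side of that split can be sketched in a few lines. This is an illustration under stated assumptions, not a reference design: `model_step` is a hypothetical callable that returns either `("final", answer)` or `("tool", name, argument)`, and `tools` is a plain dict that doubles as the permission allowlist.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_steps: int = 10    # loop control: hard cap on model turns
    max_retries: int = 2   # extra attempts per failing tool call


def run_agent(model_step, tools, task, budget=None, should_stop=lambda: False):
    """Drive the model through tool calls until it answers or the budget ends."""
    budget = budget or Budget()
    history = []
    for _ in range(budget.max_steps):
        if should_stop():  # interruption hook, checked once per turn
            return {"status": "interrupted", "history": history}
        action = model_step(task, history)
        if action[0] == "final":
            return {"status": "done", "answer": action[1], "history": history}
        _, name, arg = action
        if name not in tools:  # permissions: only cataloged tools may run
            history.append((name, arg, "error: tool not permitted"))
            continue
        result = None
        for _attempt in range(budget.max_retries + 1):
            try:
                result = tools[name](arg)
                break
            except Exception as exc:  # keep the last error so the model sees it
                result = f"error: {exc}"
        history.append((name, arg, result))
    return {"status": "budget_exhausted", "history": history}
```

The model only decides what to do next. Step caps, retries, the allowlist, and the stop hook all live in ordinary code around it.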
Performance vs Quality.
Not every step needs the biggest brain. Four principles for keeping agents fast, cheap, and correct.
Principle 01: Match context length to the task.
A summarizer doesn't need a million-token window. Choose the smallest model that can hold the actual job — speed and cost compound.
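One way to act on this, as a rough sketch: estimate how many tokens the job needs and pick the smallest tier that fits. The tier names, window sizes, and the four-characters-per-token estimate are all assumptions for illustration, not figures from any real model.

```python
# Hypothetical tiers: (name, context window in tokens), smallest first.
MODEL_TIERS = [
    ("small", 8_000),
    ("medium", 32_000),
    ("large", 200_000),
]


def estimate_tokens(text):
    # Crude assumption: roughly four characters per token.
    return len(text) // 4 + 1


def pick_model(prompt, reserved_for_output=1_000):
    """Return the smallest tier whose window holds the prompt plus the reply."""
    needed = estimate_tokens(prompt) + reserved_for_output
    for name, window in MODEL_TIERS:
        if needed <= window:
            return name
    raise ValueError(f"prompt needs about {needed} tokens; no tier is large enough")
```

A real tokenizer would replace `estimate_tokens`. The selection logic stays the same either way.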
Principle 02: Use deterministic code for 80–90% of the work.
Parse, validate, route, format: these are solved problems. Save the model for judgment, not for things jq already does.
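A sketch of what "deterministic first" can look like for ticket routing. Parsing, validation, and the obvious routes are plain code, and the model is called only for what is left. The ticket shape, the `ORD-` ID format, the queue names, and the `ask_model` callable are hypothetical.

```python
import json
import re

ORDER_ID = re.compile(r"\bORD-\d{6}\b")  # hypothetical order ID format


def route_ticket(raw, ask_model):
    """Route a JSON ticket with plain code; fall back to the model last."""
    try:
        ticket = json.loads(raw)  # parse
    except json.JSONDecodeError:
        return {"queue": "rejected", "reason": "invalid JSON"}
    body = ticket.get("body") if isinstance(ticket, dict) else None
    if not isinstance(body, str) or not body.strip():  # validate
        return {"queue": "rejected", "reason": "missing body"}
    match = ORDER_ID.search(body)  # route on an unambiguous signal
    if match:
        return {"queue": "orders", "order_id": match.group(0)}
    if "password" in body.lower():
        return {"queue": "account"}
    return {"queue": ask_model(body)}  # judgment call: only now ask the model
```

The model never sees malformed input or tickets a regex can already classify, so it runs only on the ambiguous remainder.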
Principle 03: Bundle repeatable steps into skills.
If the agent does the same dance twice, name it. Skills are vocabulary — they let the model reason at a higher level instead of re-deriving every step.
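One possible shape for "naming the dance", as a sketch: a small registry where each skill has a name and a one-line description the model can see, while the steps stay in code. The registry, the decorator, and the `release_notes` skill are hypothetical and not taken from any particular framework.

```python
SKILLS = {}


def skill(name, description):
    """Register a function as a named skill the model can ask for."""
    def decorator(fn):
        SKILLS[name] = {"run": fn, "description": description}
        return fn
    return decorator


@skill("release_notes", "Turn commit subjects into a sorted, deduplicated bullet list.")
def release_notes(commits):
    subjects = sorted({c.strip() for c in commits if c.strip()})
    return "\n".join(f"- {s}" for s in subjects)


def skill_catalog():
    # What the model sees: names and descriptions, not the steps.
    return "\n".join(f"{n}: {s['description']}" for n, s in sorted(SKILLS.items()))


def run_skill(name, *args):
    """Execute a registered skill by name."""
    return SKILLS[name]["run"](*args)
```

The model reasons over the catalog and asks for `release_notes` by name. It does not have to work out the cleanup steps again each time.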
Principle 04: Not every task requires max intelligence.
Route by complexity. Haiku for triage, Sonnet for the work, Opus when it's genuinely hard. Spending Opus on a regex is just expensive shrugging.
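A toy router along these lines, assuming a keyword heuristic stands in for a real complexity estimate. The tier strings are plain labels taken from the text above, not API model identifiers, and the keywords and thresholds are made up for illustration.

```python
HARD_WORDS = ("design", "refactor", "prove", "debug")
EASY_WORDS = ("classify", "tag", "triage", "label")


def complexity_score(task):
    """Hypothetical heuristic: hard verbs and long prompts raise the score."""
    words = task.lower().split()
    score = 0
    if len(words) > 50:
        score += 1
    if any(w in words for w in HARD_WORDS):
        score += 2
    if any(w in words for w in EASY_WORDS):
        score -= 1
    return score


def pick_tier(task):
    """Map the score onto the three tiers named in the text."""
    score = complexity_score(task)
    if score <= -1:
        return "haiku"
    if score <= 1:
        return "sonnet"
    return "opus"
```

In practice the score might come from task metadata, past failure rates, or a cheap classifier. The routing step itself is ordinary code.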
The goal isn't the smartest agent. It's the agent that gets this specific job done, reliably, for a price that makes sense.
Managing complexity.
Every layer you add is a layer that can fail. Every layer you remove is a layer you have to think about yourself.
Metaphors matter.
The metaphor you choose for the work — not the codebase, not the UI — shows up in every decision downstream.
(the house)
- Stable schemas. Migrations are renovations — slow, deliberate.
- Permissions per room. Guest accounts. Property lines.
- "Don't break the kitchen while fixing the bathroom."
- You optimize for inhabitants. Comfort. Quiet.
(the ship)
- Direction matters more than location. Heading, ETA, course.
- Watches. On-call rotations. Someone is always steering.
- "Repair the hull at sea." Hot-patch culture.
- You optimize for the journey. Resilience. Speed.
Same model, same team — different metaphor. You'll get different architectures, different review culture, different pagers at 3am.