Chapter I  ·  The Primer

See which agent harness patterns actually work.

Primer reads the working notes of your engineering team (every session with Claude Code, Cursor, Codex CLI, and Gemini CLI) and measures how your agent harness determines outcome quality. Tool design, context management, caching, orchestration, permission boundaries: it turns each dimension into evidence your team can act on.


Captures Claude Code · Cursor · Codex CLI · Gemini CLI
Chapter II

Measure how your harness shapes outcomes.

Primer turns noisy sessions into harness pattern signatures, then shows which tool designs, caching strategies, and orchestration depths lead to cleaner reviews, stronger merge rates, and better follow-through. It is less a dashboard than a ledger of how your team's harness actually performs.

  • Compare harness configurations by merge rate, review load, and cost efficiency.
  • See which tool designs and context strategies create quality without overspending.
  • Move from usage tracking to harness-level evidence.
Read: Server & Intelligence Model →
Plate I. Quality clustered by harness pattern.
Chapter III

See cost through the lens of harness effectiveness.

FinOps becomes useful only when it stops treating model spend as the whole story. Primer shows which harness configurations are efficient, which are expensive but justified, and where caching, boundary design, or orchestration choices are creating unnecessary cost.

  • Break down spend by harness pattern and caching strategy.
  • Tie cost to effective harness configurations, not raw activity.
  • Spot when boundary design and permission modes create unnecessary friction costs.
Read: FinOps & Cost Management →
Plate II. Spend grouped by harness pattern, not by model.
Chapter IV

Turn effective patterns into standards your team can copy.

Primer surfaces the harness configurations behind your best outcomes — tool compositions, MCP stacks, skill setups, caching strategies, and orchestration patterns — so teams can standardize what is already working, not guess at best practices.

  • Promote effective harness configurations into reusable templates.
  • Find underused tool designs and context strategies worth spreading.
  • Coach with harness evidence instead of generic enablement advice.
Read: Hooks & Capture →
Plate III. An effective pattern promoted into a standard.
Technical interlude

The platform is evidence-forward. Every metric is a trail back to the sessions that earned it.

A live snapshot from the public demo instance, updated as sessions arrive.

Primer · live instance (synchronised 2s ago)
  • Sessions captured: 4,947, across Claude, Cursor, Codex, Gemini
  • Success rate: 96% (+4.1 after intervention)
  • Median time to merge: 7.1 h (−37% versus Q3 baseline)
  • Cost per success: $0.41 (−22% after playbook rollout)
Plate IV. Selected metrics from the open demo.
Knowing which engineers ship more is easy. Knowing which harness patterns — tool designs, caching strategies, context boundaries — compound their effectiveness, and which are dead weight, is the question Primer is written to answer.
— From the Primer field guide
Apparatus

Common questions, briefly answered.

What is Primer?
Primer is an open-source agent harness intelligence platform. It captures session data from Claude Code, Cursor, Codex CLI, and Gemini CLI and measures how your agent harness — tool design, context management, caching, orchestration, and permission boundaries — determines outcome quality. It turns that data into harness effectiveness scores, exemplar configurations, coaching, and measurable experiments.
What do you mean by 'agent harness'?
The agent harness is everything around the model: which tools are available, how context is managed (subagents, compaction, memory), caching strategy, orchestration depth, and permission boundaries. Research shows that harness design determines outcome quality more than model capability alone. Primer measures each of these dimensions.
How is this different from generic AI usage tracking?
Most tools count tokens, sessions, or accepted suggestions. Primer measures how your harness configuration determines outcomes — which tool compositions produce cleaner merges, which caching strategies reduce cost without sacrificing quality, which orchestration patterns correlate with faster follow-through. It lets you standardize what actually works.
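To make the contrast concrete, here is a minimal sketch of the idea: group sessions by harness pattern, then compare merge rate and cost per success across groups, instead of summing tokens or sessions. The `Session` fields, the pattern labels, and the sample numbers are all hypothetical and chosen for illustration; they are not Primer's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical fields for illustration; not Primer's actual schema.
    harness_pattern: str  # label for a tool set + caching/orchestration setup
    cost_usd: float       # total model spend for the session
    merged: bool          # whether the session's work ended up merged


def summarize_by_pattern(sessions):
    """Group sessions by harness pattern; report merge rate and cost per merged session."""
    groups = defaultdict(list)
    for s in sessions:
        groups[s.harness_pattern].append(s)

    summary = {}
    for pattern, items in groups.items():
        merged = sum(1 for s in items if s.merged)
        total_cost = sum(s.cost_usd for s in items)
        summary[pattern] = {
            "sessions": len(items),
            "merge_rate": merged / len(items),
            # Undefined when nothing merged, so report None instead of dividing by zero.
            "cost_per_success": total_cost / merged if merged else None,
        }
    return summary


# Made-up sample data, purely to exercise the function.
sessions = [
    Session("cached-subagents", 0.30, True),
    Session("cached-subagents", 0.50, True),
    Session("no-cache-flat", 0.90, True),
    Session("no-cache-flat", 0.70, False),
]
summary = summarize_by_pattern(sessions)
```

In this sketch, cost per success divides a pattern's total spend by its merged sessions, so failed sessions still count against the pattern that produced them. A real deployment may define success and attribution differently.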
Is Primer self-hosted?
Yes. Primer is open source under MIT, runs on your own infrastructure, and stores all data in your own PostgreSQL database. Telemetry never leaves your network. You can deploy it via Docker Compose, Helm, or directly with pip.
Can I try Primer without installing anything?
Yes — demo.useprimer.dev is a live read-only instance pre-loaded with a fictional 25-engineer org so you can explore every dashboard before installing.
Colophon

Open source. Self-hosted. Your own infrastructure.

Primer is MIT-licensed and runs on your own Postgres. Telemetry never leaves your network. Install with a single command, connect your agents and GitHub data, and begin the study.