Spec Kitty vs OpenSpec: Which Spec-Driven Tool Should You Use?

Spec-driven development is the current answer to “vibe coding” — the failure mode where your requirements live only in a chat log and the agent’s output drifts from what you actually wanted. The idea is simple: write the spec down, in the repo, and have humans and agents agree on it before code gets written. Two open-source tools take that idea in noticeably different directions.

Spec Kitty (Priivacy-ai) builds a governed software factory: missions, work-package lanes, isolated git worktrees, review/merge gates, a dashboard, and an audit trail. OpenSpec (Fission-AI) adds a lightweight spec layer: a change folder per feature, delta specs, and a fluid propose → apply → archive loop with no phase gates. Both are MIT-licensed, repo-native, and work with most AI coding agents. This post compares them head to head and tells you when to reach for each.

The two at a glance

Spec Kitty is a Python CLI (pipx install spec-kitty-cli) for teams turning agentic coding into a repeatable, governed process. Its loop is spec → plan → tasks → next → review → accept → merge, with mission artifacts under kitty-specs/, work packages moving through lanes (planned, in_progress, for_review, approved, done), isolated worktrees under .worktrees/ so multiple agents work in parallel without branch chaos, and per-mission retrospectives. Humans define intent and acceptance criteria; agents implement; reviewers accept or reject with an audit trail.
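To make the lane model concrete, here is a minimal sketch of a work package moving through the lanes named above. It is not Spec Kitty's implementation: the `WorkPackage` class, the strictly linear ordering, and the rule that a rejection sends a package back to `in_progress` are all assumptions made for illustration. Only the lane names come from the tool's description.

```python
from dataclasses import dataclass, field

# Lane names come from the article; the linear ordering and the
# "reject goes back to in_progress" rule are assumptions for illustration.
LANES = ["planned", "in_progress", "for_review", "approved", "done"]


@dataclass
class WorkPackage:
    """Hypothetical stand-in for a work package; not Spec Kitty's data model."""
    name: str
    lane: str = "planned"
    history: list = field(default_factory=list)  # (actor, from_lane, to_lane)

    def _move(self, actor, target):
        # Every transition is recorded, which is what makes an audit trail possible.
        self.history.append((actor, self.lane, target))
        self.lane = target

    def advance(self, actor):
        """Move one lane forward, recording who did it."""
        index = LANES.index(self.lane)
        if index == len(LANES) - 1:
            raise ValueError(f"{self.name} is already done")
        self._move(actor, LANES[index + 1])

    def reject(self, reviewer):
        """A reviewer sends a package under review back for rework."""
        if self.lane != "for_review":
            raise ValueError("only packages in for_review can be rejected")
        self._move(reviewer, "in_progress")


wp = WorkPackage("WP01-dark-mode-toggle")  # hypothetical package name
wp.advance("agent-a")    # planned -> in_progress
wp.advance("agent-a")    # in_progress -> for_review
wp.reject("reviewer-1")  # for_review -> in_progress
for actor, source, target in wp.history:
    print(f"{actor}: {source} -> {target}")
```

The design point is that the history list, not the current lane, carries the governance value: it answers who moved what and in which order.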

OpenSpec is a Node package (npm i -g @fission-ai/openspec) built around a smaller, fluid loop. You run /opsx:propose to scaffold a change folder, /opsx:apply to implement it, and /opsx:archive when you are done. Its philosophy is explicit: fluid not rigid, iterative not waterfall, easy not complex, and built for brownfield (existing codebases). That is why it uses delta specs, which describe only what’s changing (ADDED/MODIFIED Requirements) rather than rewriting the whole spec each time.
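The delta idea can be sketched in a few lines. This is a conceptual model only, assuming a spec can be treated as a mapping from requirement name to text; OpenSpec stores deltas as files in the change folder, and the `apply_delta` function and the sample requirements here are hypothetical.

```python
def apply_delta(spec, delta):
    """Return a new spec with the delta's ADDED and MODIFIED requirements applied.

    Conceptual model only: `spec` maps requirement name -> text, and `delta`
    groups changes under "ADDED" and "MODIFIED" keys.
    """
    merged = dict(spec)  # copy so the current spec is left untouched
    for name, text in delta.get("ADDED", {}).items():
        if name in merged:
            raise ValueError(f"cannot add existing requirement: {name}")
        merged[name] = text
    for name, text in delta.get("MODIFIED", {}).items():
        if name not in merged:
            raise ValueError(f"cannot modify unknown requirement: {name}")
        merged[name] = text
    return merged


# Sample data is invented for illustration.
current_spec = {
    "Theme": "The app renders in a light theme.",
    "Settings": "Users can change their display name.",
}
delta = {
    "ADDED": {"Dark mode": "Users can switch to a dark theme."},
    "MODIFIED": {"Theme": "The app renders in the user's selected theme."},
}
updated = apply_delta(current_spec, delta)
for name, text in updated.items():
    print(f"{name}: {text}")
```

Note that the "Settings" requirement never appears in the delta, which is the brownfield benefit: a change only has to describe what it touches.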

How they differ

The clearest way to see the difference is the shape of the workflow. OpenSpec keeps it to three light steps; Spec Kitty wraps the same intent in a longer, gated runtime loop.

Figure. Same goal, different weight: OpenSpec’s three fluid steps versus Spec Kitty’s seven-step governed loop with explicit review, accept, and merge gates.

That picture is the whole comparison in miniature. OpenSpec optimizes for getting a single change specced and shipped with minimal ceremony; Spec Kitty optimizes for many changes, many agents, and a visible chain of custody from intent to merge.

| Dimension | Spec Kitty | OpenSpec |
| --- | --- | --- |
| Posture | Governed software factory | Lightweight spec layer |
| Runtime | Python CLI (`pipx`) | Node package (`npm`) |
| Workflow | spec → plan → tasks → next → review → accept → merge | propose → apply → archive |
| Spec style | Full mission artifacts | Delta specs (ADDED/MODIFIED) |
| Parallel agents | Git worktree isolation built in | Not a built-in concern |
| Governance | Review/accept/merge gates, audit trail, retrospectives | You supply the discipline |
| Best fit | Teams, multi-agent, brownfield + greenfield | Solo/teams, especially brownfield |
| Overhead | Higher (more concepts) | Lower (deliberately minimal) |

Examples

With OpenSpec, a feature is three commands. You ask for add-dark-mode; it scaffolds openspec/changes/add-dark-mode/ with proposal.md, specs/, design.md, and tasks.md; you apply the tasks; you archive. The spec it writes is a delta — it records that a requirement was ADDED or MODIFIED, not the entire product spec — which is what makes it comfortable on a codebase that already exists.
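To visualize the change folder described above, the sketch below recreates the same layout in a temporary directory. It does not call OpenSpec: the `scaffold_change` helper is hypothetical, the file contents are placeholders, and what OpenSpec actually puts inside `specs/` is not modeled here.

```python
import tempfile
from pathlib import Path


def scaffold_change(root, change_name):
    """Create the folder layout the article describes (hypothetical helper)."""
    change_dir = Path(root) / "openspec" / "changes" / change_name
    (change_dir / "specs").mkdir(parents=True)  # delta specs would live here
    for filename in ("proposal.md", "design.md", "tasks.md"):
        # Placeholder contents; the real tool generates meaningful documents.
        (change_dir / filename).write_text(f"# {change_name}: {filename}\n")
    return change_dir


with tempfile.TemporaryDirectory() as tmp:
    change = scaffold_change(tmp, "add-dark-mode")
    layout = sorted(p.relative_to(tmp).as_posix() for p in change.rglob("*"))
    print("\n".join(layout))
```

Everything about the change sits in one folder, which is what lets it be archived as a unit once the work is done.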

With Spec Kitty, the same feature becomes a mission. You run /spec-kitty.charter, /spec-kitty.specify, /spec-kitty.plan, /spec-kitty.tasks, then spec-kitty next drives execution while work packages move across lanes; /spec-kitty.review, /spec-kitty.accept, and /spec-kitty.merge --push close the loop, and a retrospective is generated. If three agents are working at once, each runs in its own worktree under .worktrees/, and spec-kitty dashboard shows the board. The extra steps are the point: they’re where governance lives.
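The worktree isolation mentioned above rests on a standard git feature. As a rough sketch, the snippet below builds (but does not run) one `git worktree add` command per agent. The `.worktrees/` directory comes from the article; the path and branch naming scheme, the `worktree_commands` helper, and the agent names are assumptions, and Spec Kitty manages this for you with its own conventions.

```python
def worktree_commands(mission, agents, base_dir=".worktrees"):
    """Build one `git worktree add` command per agent, without running it.

    The naming scheme is hypothetical; only `git worktree add -b <branch> <path>`
    is real git syntax.
    """
    commands = []
    for agent in agents:
        path = f"{base_dir}/{mission}-{agent}"  # separate checkout per agent
        branch = f"{mission}/{agent}"           # separate branch per agent
        commands.append(["git", "worktree", "add", "-b", branch, path])
    return commands


cmds = worktree_commands("add-dark-mode", ["agent-a", "agent-b", "agent-c"])
for cmd in cmds:
    print(" ".join(cmd))
```

Because each agent gets its own directory and branch, three agents can edit files at once without stepping on a shared working tree; the merge gate is where their work is reconciled.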

Pros and cons

Spec Kitty: the strengths are governance and scale (a real audit trail, review/merge gates, worktree isolation for parallel agents, a dashboard, and retrospective learning across missions). The costs are weight and surface area: more concepts to learn, more process per change, a Python install, and genuine overkill for one-off edits or tiny scripts. Independent third-party coverage of Spec Kitty specifically is also thin so far, so you’re largely evaluating it from its own docs.

OpenSpec: the strengths are speed and fit (minimal ceremony, a fluid model where you can edit any artifact anytime, delta specs that suit brownfield work, broad tool support, and a faster path from zero to your first spec). The costs are the flip side. It’s a spec layer, not an orchestration system, so it gives you no built-in parallel-agent isolation or formal merge gates; that discipline is on you. Telemetry is also on by default (you opt out via an environment variable). It has more visible community write-ups, though some echo the project’s own framing.

When to use which

Reach for OpenSpec when you’re solo or a small team, working mostly on an existing codebase, and you want just enough spec discipline to stop the agent from wandering — without adopting a process. It’s the right call when “agree on the change, then build it” is all the structure you need, and you value iterating freely over formal gates.

Reach for Spec Kitty when you’re scaling agentic coding across a team, running multiple agents in parallel, and you need the chain of custody: who specified what, which work package an agent touched, who reviewed it, and when it merged. If you’re building a repeatable “software factory” and an audit trail is a requirement rather than a nice-to-have, the extra ceremony pays for itself.

A simple heuristic: if your bottleneck is clarity (the agent keeps building the wrong thing), OpenSpec fixes that cheaply. If your bottleneck is coordination and accountability (many agents, many changes, who-did-what), that’s what Spec Kitty is built for. And they’re not mutually exclusive religions — both are MIT-licensed and repo-native, so trying one on a real feature is the fastest way to learn which friction you actually have.

Takeaway

Spec Kitty and OpenSpec answer the same question — how do we make AI coding predictable? — at two different scales. OpenSpec is the lightweight spec layer you can adopt in an afternoon and barely feel; Spec Kitty is the governed factory you adopt when parallelism and accountability become the hard part. Pick by your actual constraint: clarity for one change, or coordination across many. Start light; graduate to governance only when the lack of it starts to hurt.