ai-spec is an AI-driven development orchestrator — from a single requirement to fully reviewed, test-covered, spec-aligned code, in minutes.
Every AI coding tool faces the same structural limitations. ai-spec is designed to address all of them.
A fully automated 10-step pipeline from idea to reviewed, scored, production-ready code.
Scans routes, schemas, dependencies, middleware, and the project constitution. Every prompt is grounded in your actual codebase — not a generic template.
Generates a human-readable Markdown spec and decomposes it into ordered tasks: data → service → api → view → route → test. One AI call, complete output.
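The layer ordering above can be sketched as a simple stable sort. This is an illustrative sketch, not ai-spec's actual code; the `Task` shape is an assumption.

```typescript
// Hypothetical task shape; the real decomposition output may differ.
type Layer = "data" | "service" | "api" | "view" | "route" | "test";

interface Task {
  file: string;
  layer: Layer;
}

const LAYER_ORDER: Layer[] = ["data", "service", "api", "view", "route", "test"];

function orderTasks(tasks: Task[]): Task[] {
  // Stable sort: tasks within the same layer keep their original order.
  return [...tasks].sort(
    (a, b) => LAYER_ORDER.indexOf(a.layer) - LAYER_ORDER.indexOf(b.layer)
  );
}
```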
AI polishes the spec and shows a colored diff. You approve, reject, or request changes. Multiple rounds supported — no code is written until you say so.
Extracts a SpecDSL JSON — models, endpoints, behaviors — from the spec. Validated against 9 schema rules. The single source of truth for codegen, tests, and exports.
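To make the contract concrete, here is a hypothetical SpecDSL shape with a two-rule validator sketch. The field names and the actual nine schema rules are assumptions for illustration only.

```typescript
// Hypothetical SpecDSL shape; real field names may differ.
interface SpecDSL {
  models: { name: string; fields: Record<string, string> }[];
  endpoints: { method: string; path: string; returns: string }[];
  behaviors: string[];
}

// Validator sketch showing the kind of structural checks a 9-rule
// schema validation might perform (paths well-formed, models resolvable).
function validateDSL(dsl: SpecDSL): string[] {
  const errors: string[] = [];
  const modelNames = new Set(dsl.models.map((m) => m.name));
  for (const ep of dsl.endpoints) {
    if (!ep.path.startsWith("/")) errors.push(`bad path: ${ep.path}`);
    if (!modelNames.has(ep.returns)) errors.push(`unknown model: ${ep.returns}`);
  }
  return errors;
}
```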
Generates file-by-file in dependency order. Each completed file's exports are cached and injected into subsequent prompts — eliminating cross-task hallucinations.
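A minimal sketch of the export-cache idea, assuming a prompt-per-file loop (the prompt wording and data shapes here are invented for illustration):

```typescript
// Sketch: completed files' exports are cached and injected into the
// prompt for each subsequent file, so the model cannot invent imports.
interface GenTask { file: string }

function buildPrompt(task: GenTask, exportCache: Map<string, string[]>): string {
  const known = [...exportCache.entries()]
    .map(([file, names]) => `${file}: ${names.join(", ")}`)
    .join("\n");
  return `Generate ${task.file}.\n` +
    `Already-generated exports (use these, do not invent others):\n${known}`;
}

// Simulated pipeline step: an earlier file has contributed its exports.
const cache = new Map<string, string[]>();
cache.set("src/api/task.ts", ["fetchTasks", "createTask"]);
const prompt = buildPrompt({ file: "src/stores/taskStore.ts" }, cache);
```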
Runs npm test / lint / tsc, parses errors by file, and sends targeted AI fixes with DSL context. Errors are repaired in dependency order, so upstream fixes resolve downstream failures in fewer cycles.
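Parsing errors by file might look like the following sketch, assuming `tsc`'s common `path(line,col): error TSxxxx: message` diagnostic format (the grouping logic is illustrative, not ai-spec's actual parser):

```typescript
// Sketch: group tsc-style diagnostics by file so each file gets one
// targeted fix prompt.
function groupErrorsByFile(output: string): Map<string, string[]> {
  const byFile = new Map<string, string[]>();
  const re = /^(.+?)\((\d+),(\d+)\): error (TS\d+): (.+)$/;
  for (const line of output.split("\n")) {
    const m = re.exec(line.trim());
    if (!m) continue; // skip non-diagnostic lines
    const [, file, , , code, msg] = m;
    if (!byFile.has(file)) byFile.set(file, []);
    byFile.get(file)!.push(`${code}: ${msg}`);
  }
  return byFile;
}
```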
Pass 1: architecture & spec compliance. Pass 2: implementation correctness & edge cases. Pass 3: blast radius, complexity score, breaking change risk.
Scores on 4 dimensions: compliance (30%) + DSL coverage (25%) + compile (20%) + review (25%). Linked to prompt hash — tracks quality over time with zero AI calls.
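The weighted score reduces to one line of arithmetic. A minimal sketch, assuming each sub-score is already normalized to 0–100 (the input shape is invented; the real sub-scores come from the pipeline's own measurements):

```typescript
interface HarnessInputs {
  compliance: number;  // 0–100, weight 30%
  dslCoverage: number; // 0–100, weight 25%
  compile: number;     // 0–100, weight 20%
  review: number;      // 0–100, weight 25%
}

// Deterministic weighted sum; no AI call involved.
function harnessScore(s: HarnessInputs): number {
  return 0.30 * s.compliance + 0.25 * s.dslCoverage + 0.20 * s.compile + 0.25 * s.review;
}
```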
Every feature addresses a real pain point in AI-assisted development.
Self-evolving knowledge base (§1–§9) that auto-injects into every prompt. Scans routes, middleware, schema, and conventions on init. Grows smarter with every review via §9 lesson accumulation.
ai-spec init

Human-readable Markdown spec for engineers to review and align on. Machine-readable SpecDSL JSON for tools to consume. Both versioned, both auditable. Codegen, tests, and exports all share one contract.
Spec + DSL

DSL Gap Loop: detects sparse contracts before codegen and triggers targeted spec enrichment. Review→DSL Loop: structural review issues feed back into the contract — so the next run starts cleaner.
Self-correcting

Record real AI responses on first run. Replay them deterministically in subsequent runs — zero API calls, zero cost. Iterate on pipeline logic and UI without burning tokens.
ai-spec create --vcr-record

Human review happens at the right moment: after the spec is clear and the DSL is valid, but before any code is written. Abort means zero disk residue. Proceed means every step has a verified contract to follow.
[Gate] checkpoint

Every successful import fix is appended to a ledger. On the next codegen run, a "DO NOT REPEAT" section is automatically injected into prompts — preventing the same hallucination from ever recurring.
v0.54+ zero-cost learning

Every run gets a unique RunId. Before any file is written, the original content is snapshotted. One command restores your entire repo to pre-run state — precise to the file, precise to the run.
ai-spec restore <runId>

Gemini, Claude, OpenAI, DeepSeek, Qwen, GLM, MiniMax, Doubao, MiMo. Mix and match: use one model for spec generation, another for codegen. Per-run provider override supported.
--provider --codegen-provider

The only pipeline that wires your backend and frontend together — automatically.
After frontend generation, the cross-stack verifier scans every API call in the frontend code and checks it against the backend DSL. Phantom routes (hallucinated endpoints), method mismatches, and string-concatenated paths are all detected and reported before you push.
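The core of such a check is a set comparison between detected frontend calls and the backend contract. A simplified sketch (the real verifier also handles string-concatenated paths, which this version does not attempt):

```typescript
// Sketch: flag frontend API calls with no matching backend endpoint
// (phantom route) or a matching path but wrong HTTP method.
interface Call { method: string; path: string }

function verifyCalls(frontendCalls: Call[], backendEndpoints: Call[]): string[] {
  const issues: string[] = [];
  for (const call of frontendCalls) {
    const samePath = backendEndpoints.filter((e) => e.path === call.path);
    if (samePath.length === 0) {
      issues.push(`phantom route: ${call.method} ${call.path}`);
    } else if (!samePath.some((e) => e.method === call.method)) {
      issues.push(`method mismatch: ${call.method} ${call.path}`);
    }
  }
  return issues;
}
```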
The SpecDSL isn't just for codegen — it powers your entire development workflow.
DSL → production-ready YAML or JSON. Plug directly into Postman, Swagger UI, or any SDK generator.
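The mapping into an OpenAPI `paths` object can be sketched as below; real output would also carry schemas, parameters, and response definitions, and the endpoint shape here is an assumption:

```typescript
// Sketch: DSL endpoints -> OpenAPI 3 paths object.
interface Endpoint { method: string; path: string; summary: string }

function toOpenApiPaths(
  endpoints: Endpoint[]
): Record<string, Record<string, { summary: string }>> {
  const paths: Record<string, Record<string, { summary: string }>> = {};
  for (const ep of endpoints) {
    paths[ep.path] ??= {};
    // OpenAPI keys operations by lowercase HTTP method.
    paths[ep.path][ep.method.toLowerCase()] = { summary: ep.summary };
  }
  return paths;
}
```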
DSL → Express mock server + MSW handlers + Vite proxy config. Frontend development without waiting for the backend.
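Emitting MSW handlers from the DSL is essentially code generation as string templating. A sketch assuming the MSW v2 `http`/`HttpResponse` API, with placeholder handler bodies (the real generator would fill in mock data from the DSL models):

```typescript
// Sketch: generate MSW v2 handler source text from DSL endpoints.
interface Endpoint { method: string; path: string }

function toMswHandlers(endpoints: Endpoint[]): string {
  const handlers = endpoints
    .map(
      (e) =>
        `  http.${e.method.toLowerCase()}('${e.path}', () => HttpResponse.json({})),`
    )
    .join("\n");
  return `import { http, HttpResponse } from 'msw';\n\nexport const handlers = [\n${handlers}\n];\n`;
}
```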
DSL → typed interfaces, request/response types, and API endpoint constants. Shared across frontend and backend.
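Deriving shared endpoint constants is a small transform over the DSL. A sketch under an assumed naming convention (camelCase endpoint names mapped to SCREAMING_SNAKE_CASE keys; the real generator's convention may differ):

```typescript
// Sketch: DSL endpoints -> a constants map shared by frontend and backend.
interface Endpoint { name: string; path: string }

function toEndpointConstants(endpoints: Endpoint[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (const e of endpoints) {
    // listTasks -> LIST_TASKS (illustrative key convention).
    out[e.name.replace(/([a-z])([A-Z])/g, "$1_$2").toUpperCase()] = e.path;
  }
  return out;
}
```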
Generate a static HTML quality dashboard. Track harness scores, compliance rates, and review trends across all runs.
Every step is visible, every decision is auditable. No black box — you see exactly what's happening, what scored how, and what was fixed automatically.
[1/10] Loading project context...
       Constitution : ✔ found (§1–§9)
       Tech stack   : vue · vite · pinia
[2/10] Generating spec with glm/glm-4.5...
       ✔ Spec generated   ✔ 8 tasks
[3.4/10] Spec quality assessment...
       Coverage [██████████████████░░] 9/10
       Clarity  [████████████████░░░░] 8/10
[Gate] Approval Gate — awaiting decision
       ✔ Approved — continuing...
[DSL]  Extracting structured contract...
       ✔ DSL valid — Models: 3  Endpoints: 7
[6/10] Code generation (8 files)...
       ✔ service · src/api/task.ts
       ✔ api     · src/stores/taskStore.ts
       ✔ view    · src/views/TaskList.vue
       ████████████████████ 100%
[8/10] ⚠ 3 errors — auto-fixing cycle 1...
       ✔ All errors resolved in 1 cycle
[9/10] 3-pass code review...
       Pass 1 ✔ Architecture aligned
       Pass 2 ✔ Implementation correct
       Score [████████████████░░░░] 8.2/10
[10/10] Harness Self-Evaluation...
       Total [██████████████████░░] 92/100
       ✔ 2 lessons → constitution §9
RunId: 20260409-143022-a7f2
ai-spec turns code generation quality into data — comparable, trackable, and improvable over time.
Track quality across all runs. See if your pipeline is improving.
Every stage is timed and logged to .ai-spec-logs/<runId>.json.
The harness score is deterministic — no AI calls after generation completes.
Don't like the result? One command restores all modified files to their pre-run state.
Use any combination of providers. Mix a reasoning model for spec generation with a fast model for codegen.
Install globally, set your API key, register a repo, and start shipping.