Hard gates, TDD, security scanning, and human approval —
so your AI agent ships production-quality code, not untested prototypes.
Why MeowKit
Models provide intelligence. MeowKit provides the constraints,
safety gates, and repeatable workflows that production engineering demands.
No code ships without an approved plan and a passing review. Every change must clear both stops — no bypasses, no self-approval.
Strict test-first discipline when enabled, fast spikes when you need speed. The harness adapts without relaxing quality standards.
Prompt injection defense across input, instruction, context, and output layers. Untrusted content is data — never instructions.
Process
Every task follows the same enforced sequence — orient, plan, test, build, review, ship, reflect. Two hard gates block shipping unreviewed or untested code.
Orient: Detect task domain, classify complexity, assign model tier and agents.
Plan: Scope-adaptive plan with acceptance criteria. No code until the plan is approved.
Test: Write failing tests first when TDD is enabled. Correctness proof before implementation.
Build: Implement against the approved plan and passing tests. File ownership enforced.
Review: Adversarial structural audit across 5 dimensions. Security scan for BLOCK patterns.
Ship: PR creation, conventional commit, deploy pipeline. Only after Gate 2 clears.
Reflect: Capture lessons, update memory files, run retrospective. Knowledge persists.
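The ordering and gating described above can be pictured with a small sketch. This is not MeowKit's implementation (the text describes the harness as pure prompt engineering); the `Task` class, the `run_phase` function, and the `GateBlocked` exception are hypothetical names used only to illustrate how a fixed phase order plus two approval flags blocks a premature ship.

```python
from dataclasses import dataclass, field

# Phase names come from the text; everything else here is illustrative.
PHASES = ["orient", "plan", "test", "build", "review", "ship", "reflect"]


class GateBlocked(Exception):
    """Raised when a phase is attempted out of order or before its gate clears."""


@dataclass
class Task:
    description: str
    plan_approved: bool = False  # Gate 1: set once a human approves the plan
    review_passed: bool = False  # Gate 2: set once the review reports no FAIL
    completed: list = field(default_factory=list)


def run_phase(task: Task, phase: str) -> None:
    """Record a phase as done, refusing anything that skips a step or a gate."""
    if len(task.completed) == len(PHASES):
        raise GateBlocked("all phases are already complete")
    expected = PHASES[len(task.completed)]
    if phase != expected:
        raise GateBlocked(f"expected {expected!r}, got {phase!r}")
    index = PHASES.index(phase)
    # Gate 1: nothing after "plan" runs until the plan is approved.
    if index > PHASES.index("plan") and not task.plan_approved:
        raise GateBlocked("Gate 1: plan not approved")
    # Gate 2: "ship" and later phases need a passing review.
    if index >= PHASES.index("ship") and not task.review_passed:
        raise GateBlocked("Gate 2: review not passed")
    task.completed.append(phase)
```

In this sketch, calling `run_phase(task, "test")` right after `"plan"` raises `GateBlocked` until `task.plan_approved` is set, and `"ship"` is refused until `task.review_passed` is set. The approval flags are set from outside the function, which mirrors the no-self-approval rule.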
See the difference
# No plan, no gates, no tests
claude "add user auth to the API"
# AI ships code directly:
✗ No approved spec
✗ No failing tests first
✗ No security scan
✗ No review gate
→ untested code in production
# Enforced 7-phase workflow
npx meowkit "add user auth"
# Harness enforces:
✓ Plan approved at Gate 1
✓ Tests written first (TDD)
✓ Security scan — no BLOCKs
✓ Review passed at Gate 2
→ production-quality PR
Capabilities
Dedicated agents for planning, security, review, testing, documentation, and more — each with scoped file ownership and model-tier routing.
From database migrations to multimodal AI, frontend design to CTF research — skills activate only when the task demands them.
Lessons, fixes, review patterns, and architecture decisions persist across sessions. The harness learns from every run.
Domain complexity CSV classifies every task into TRIVIAL, STANDARD, or COMPLEX — routing the right model and scaffolding density automatically.
Multi-agent deliberation for architectural decisions. Multiple agents argue different positions before a decision is made.
Gate 2 runs parallel reviewers across correctness, security, design, scope, and craft. Any FAIL blocks the ship.
Pure prompt engineering — no SDK required. Works offline, works in any Claude Code environment, no vendor lock-in.
Opt-in test-first enforcement with RED → GREEN → REFACTOR gates. Self-healing loop with 3-attempt cap and human escalation.
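The self-healing loop with its 3-attempt cap can be sketched as a bounded retry. This is a minimal illustration under assumptions, not the harness's actual logic: `self_heal`, `run_tests`, and `attempt_fix` are hypothetical names, and the sketch reads "3-attempt cap" as at most three fix attempts before a human is asked to step in.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # the cap stated in the text


def self_heal(
    run_tests: Callable[[], bool],
    attempt_fix: Callable[[int], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> str:
    """Return "green" once tests pass, or "escalate" when the cap is reached."""
    if run_tests():
        return "green"
    for attempt in range(1, max_attempts + 1):
        attempt_fix(attempt)  # hypothetical hook: apply one candidate fix
        if run_tests():
            return "green"
    # Cap reached without a passing run: hand the problem to a human.
    return "escalate"


# Usage: tests fail twice, then pass after the second fix.
results = iter([False, False, True])
fixes = []
outcome = self_heal(lambda: next(results), fixes.append)
print(outcome, fixes)  # green [1, 2]
```

Bounding the loop keeps an agent from thrashing indefinitely on a failing test. After the cap, the decision returns to a person instead of being retried.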
Quick Start
npx meowkit init
Scaffolds the harness into your project.
npx meowkit setup
Choose your workflow modes and agents.
/mk:cook "add feature X"
The 7-phase pipeline enforces the rest.
Ship better. Now.
The harness is free, open-source, and works inside Claude Code today. No sign-up. No external dependencies. Just discipline.