AI Dev Kit

Production-ready AI development workflow scaffold featuring 59 skills, 33 agents, hooks, and rules for professional-grade workflows.


What Is This?

AI Dev Kit is a complete plugin-oriented developer workspace for AI coding assistants. It ships everything needed to run professional-grade engineering workflows (TDD, code review, security audit, CI/CD, ML pipelines, infrastructure-as-code, code quality analysis, and debt tracking) across Claude Code, Codex, OpenCode, Gemini CLI, GitHub Copilot CLI, and Qwen Code.

Think of it as a system prompt that scales to 59 specialized skills and 33 domain agents, with lifecycle hooks and automated validation.


Quick Start

Claude Code

/plugin marketplace add noah-sheldon/ai-dev-kit
/plugin install ai-dev-kit@ai-dev-kit

Or install directly via HTTPS:

/plugin marketplace add https://github.com/noah-sheldon/ai-dev-kit
/plugin install ai-dev-kit@ai-dev-kit

Codex

codex extensions add https://github.com/noah-sheldon/ai-dev-kit

Or clone and discover automatically:

git clone https://github.com/noah-sheldon/ai-dev-kit

The marketplace at .agents/plugins/marketplace.json will be discovered automatically.

OpenCode

cd .opencode/plugins && npm install && npm run build

Or install via HTTPS:

npx opencode plugin install https://github.com/noah-sheldon/ai-dev-kit

Gemini CLI

gemini extensions link .    # run from a local clone; link takes a local path

Or install via HTTPS directly:

gemini extensions install https://github.com/noah-sheldon/ai-dev-kit

GitHub Copilot CLI

copilot plugin marketplace add https://github.com/noah-sheldon/ai-dev-kit
copilot plugin install ai-dev-kit@ai-dev-kit

Or install via HTTPS shorthand:

copilot plugin install https://github.com/noah-sheldon/ai-dev-kit

Qwen Code

qwen extensions install https://github.com/noah-sheldon/ai-dev-kit

Or install locally:

qwen extensions install /path/to/ai-dev-kit

Manual Install (any harness)

./install.sh

What's Inside

ai-dev-kit/
├── agents/              33 specialized agents (planner, architect, code-reviewer, debt analyzers, ...)
├── skills/              59 skill playbooks (TDD, security, ML, infra, web, data, ...)
├── commands/            41 workflow commands and shims
├── hooks/               lifecycle automation (pre-tool, post-tool, session events)
├── rules/               language-specific guidance (common, python, typescript, web)
├── manifests/           install manifests for deterministic setup
├── schemas/             JSON schemas for validation
├── docs/                architecture, design decisions, troubleshooting
├── examples/            reusable templates
├── scripts/             install, validate, sync, and template helpers
├── tests/               smoke tests and surface validation
├── .claude-plugin/      Claude Code plugin manifest + marketplace
├── .codex-plugin/       Codex plugin manifest
├── .agents/             Codex marketplace catalog
├── .gemini/             Gemini CLI extension
└── .opencode/           OpenCode project config + TypeScript plugin

Agents (33)

Agent                         When to Use
planner                       Complex feature work: breaks requirements into phased, mergeable plans
architect                     System design, component boundaries, API contracts
tdd-guide                     Writing tests first (RED/GREEN/REFACTOR loop)
code-reviewer                 Reviewing diffs for correctness, regressions, quality
security-reviewer             Auth, secrets, input validation, OWASP review
ai-judge                      Rubric-based validation of plans and outputs
build-error-resolver          Fixing TypeScript, Python, and build pipeline errors
e2e-runner                    End-to-end test authoring and execution
refactor-cleaner              Cleanup, modernization, tech debt paydown
doc-updater                   Syncing docs with code changes
docs-lookup                   Finding and referencing documentation
python-reviewer               Python-specific code review (Pandas, FastAPI, SQLAlchemy)
database-reviewer             Schema, migration, and query review
git-agent-coordinator         Branch coordination, merges, PR orchestration
ml-engineer                   ML/LLMOps: RAG, evals, model training, deployment
chrome-ext-developer          WXT and Chrome extension development
data-engineer                 ETL, data quality, pipeline architecture
infra-as-code-specialist      IaC, CI/CD, deployment pipelines
observability-telemetry       Logs, metrics, traces, dashboard setup
multi-agent-project-manager   Multi-workflow orchestration, backlog, priority queue (never stops)
workflow-auditor              Health checks, stuck detection, quality gate trending, anomaly reporting
reddit-researcher             Reddit sentiment, production war stories, community consensus
codebase-analyzer             Codebase structure, dependency, and complexity analysis
codebase-learner              Learning and understanding unfamiliar codebases
code-quality-analyzer         Static analysis, code quality metrics, and standards enforcement
test-debt-analyzer            Identifying and tracking test coverage gaps
security-debt-analyzer        Security vulnerabilities and debt tracking
performance-debt-analyzer     Performance bottlenecks and optimization opportunities
dependency-debt-analyzer      Outdated dependencies and upgrade paths
architecture-debt-analyzer    Architectural issues and design debt
process-debt-analyzer         Workflow inefficiencies and process bottlenecks
documentation-debt-analyzer   Documentation gaps and staleness detection
technical-debt-analyzer       Overall technical debt assessment and prioritization

Skills (59)

Core Engineering

agentic-engineering api-design api-integrations backend-patterns frontend-patterns frontend-design hexagonal-architecture coding-standards codebase-onboarding

Testing & Quality

tdd-workflow code-review security-review security-scan e2e-testing python-testing eval-harness verification-loop

AI / ML

claude-api openai-api langchain-llamaindex mlops-workflow mlops-rag pytorch-patterns deep-research exa-search search-first iterative-retrieval autonomous-agent-harness autonomous-loops continuous-agent-loop context-prune token-budget-advisor prompt-optimizer mcp-server-patterns

Data

data-pipelines data-pipelines-ai database-migrations postgres-patterns document-processing

Infrastructure

aws-devops aws-deployment docker-patterns deployment-patterns ci-pipeline github-ops observability-telemetry multi-agent-git-workflow

Workflow & Operations

architecture-decision-records dmux-workflows documentation-lookup git-workflow skill-authoring backlog-management workflow-status

Code Quality & Analysis

codebase-report technical-debt-report

Specialized

wxt-chrome-extension


Commands (41)

Build & quality: build-fix code-review doctor eval ml-review quality-gate review review-pr test-coverage validate verify

Workflow: checkpoint feature-dev plan project-template promote resume-session save-session sessions skill-create skill-health

Git & multi-agent: git-agent multi-agent-status

DevEx: context-budget context-prune install uninstall update-codemaps update-docs

Hook automation: hookify hookify-configure hookify-list hookify-help

Loop & continuous: loop-start loop-status

ML: e2e launch

Learning: learn learn-eval


Cross-Platform Support

Platform     Manifest                       Marketplace                       Install
Claude Code  .claude-plugin/plugin.json     .claude-plugin/marketplace.json   /plugin marketplace add noah-sheldon/ai-dev-kit
Codex        .codex-plugin/plugin.json      .agents/plugins/marketplace.json  Plugin Directory (after clone)
Gemini CLI   .gemini/gemini-extension.json  .gemini/GEMINI.md                 gemini extensions link .
OpenCode     .opencode/opencode.json        .opencode/plugins/ (npm)          npm install opencode-ai-dev-kit
Copilot CLI  .github-copilot/plugin.json    .github-copilot/marketplace.json  copilot plugin install noah-sheldon/ai-dev-kit:.github-copilot
Qwen Code    qwen-extension.json            .qwen/marketplace.json            qwen extensions install noah-sheldon/ai-dev-kit
Cursor       .cursor/                       -                                 Manual context pack

Core Principles

  1. Agent-first — delegate domain work to the right specialist.
  2. Test-driven — write tests before implementation when behavior changes.
  3. Security-first — validate inputs, avoid unsafe defaults, never hardcode secrets.
  4. Plan-before-execute — break complex work into phases with the planner.
  5. Model fallback — if a requested model is unavailable, use the default and continue.
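The model-fallback principle can be illustrated with a small sketch. The function name, the return shape, and the placeholder model names below are assumptions made for illustration; they are not taken from the kit's code.

```javascript
// Hypothetical helper illustrating principle 5 (model fallback).
// Nothing here is the kit's actual API; names are placeholders.
function resolveModel(requested, availableModels, defaultModel) {
  // Use the requested model only when it is currently available.
  if (requested && availableModels.includes(requested)) {
    return { model: requested, fellBack: false };
  }
  // Otherwise continue with the default instead of failing the task.
  return { model: defaultModel, fellBack: true };
}

const available = ["model-a", "model-b"]; // placeholder model names
console.log(resolveModel("model-a", available, "model-b"));
console.log(resolveModel("model-x", available, "model-b"));
```

Returning a `fellBack` flag alongside the model lets a caller log the substitution rather than hiding it.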

Workflow

Request → Planner → Architect → Domain Agents → AI-Judge → Implementation → Code Review → Security Review → Merge

  1. Plan complex work with the planner agent before touching code.
  2. Write tests first — use tdd-guide for RED/GREEN/REFACTOR discipline.
  3. Implement with domain specialists (python-reviewer, ml-engineer, etc.).
  4. Validate with ai-judge — rubric covers completeness, correctness, security, feasibility, testability.
  5. Review with code-reviewer and security-reviewer before merge.
  6. Ship with git-agent-coordinator for clean branch management.
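One way to picture the workflow is as an ordered list of gated stages, where a failure at any stage stops the run before merge. The sketch below takes its stage names from the workflow line above; the runner, the `runStage` callback, and the result shape are hypothetical and only model the ordering.

```javascript
// Stage names follow the workflow line; the runner itself is a hypothetical sketch.
const STAGES = [
  "planner",
  "architect",
  "domain-agents",
  "ai-judge",
  "implementation",
  "code-review",
  "security-review",
  "merge",
];

// Run stages in order and stop at the first one that does not pass.
function runPipeline(stages, runStage) {
  const completed = [];
  for (const stage of stages) {
    const result = runStage(stage);
    if (!result.ok) {
      return { merged: false, completed, failedAt: stage, reason: result.reason };
    }
    completed.push(stage);
  }
  return { merged: true, completed, failedAt: null, reason: null };
}

// Stubbed run in which the security review rejects the change.
const outcome = runPipeline(STAGES, (stage) =>
  stage === "security-review"
    ? { ok: false, reason: "hardcoded secret found" }
    : { ok: true }
);
console.log(outcome.failedAt, outcome.completed.length);
```

Because each stage is a gate, a rejected security review means the merge stage never runs.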

MCP Servers (Optional)

MCP servers are not bundled with this kit to avoid corporate network/proxy issues. Add your own MCP servers as needed:

Server                Purpose
github                GitHub API access, PR management
context7              Live documentation lookup
exa                   Neural web search
memory                Persistent memory across sessions
playwright            Browser automation & E2E testing
sequential-thinking   Step-by-step reasoning

Configure MCP servers in your own .mcp.json at the project root.
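As a rough illustration, a project-level .mcp.json could be generated as shown below. It assumes the common `mcpServers` layout with `command` and `args` fields, and it assumes the two npm package names shown. Treat both as assumptions and check them against each server's own documentation before use.

```javascript
// Illustrative only: the package names and launch commands are assumptions
// to verify against each MCP server's documentation.
const mcpConfig = {
  mcpServers: {
    memory: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-memory"],
    },
    "sequential-thinking": {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-sequential-thinking"],
    },
  },
};

// Print the JSON that would be saved as .mcp.json at the project root.
const mcpJson = JSON.stringify(mcpConfig, null, 2);
console.log(mcpJson);
```

Servers that need credentials (for example a GitHub token) should read them from environment variables rather than from this file.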


Requirements

  • Claude Code v2.1+ (hooks auto-load by convention)
  • Node.js 24+ LTS (for scripts and validation)
  • Git (for agent coordination and version control)

FAQ

Q: How do I adapt this for my project? Start from docs/examples/project-guidelines-template.md and customize the skills you need.

Q: What if a model is unavailable? Every agent has a fallback_model setting. If the specified model is down, it falls back to the workspace default.

Q: How do I validate everything is correct?

npm test                    # smoke tests
node scripts/validate-surface.js  # production surface check
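To give a feel for what a surface check might verify, here is a minimal sketch that compares entry counts against the numbers this README advertises. It does not reproduce scripts/validate-surface.js; the function name and the input shape are hypothetical.

```javascript
// Counts advertised in this README; the check itself is a hypothetical sketch.
const EXPECTED = { skills: 59, agents: 33, commands: 41 };

// Compare actual entry counts with the expected ones and list any mismatches.
// In a real script the counts would come from reading the directories on disk.
function checkSurface(actual, expected = EXPECTED) {
  const problems = [];
  for (const [name, want] of Object.entries(expected)) {
    const got = actual[name] ?? 0;
    if (got !== want) {
      problems.push(`${name}: expected ${want}, found ${got}`);
    }
  }
  return problems;
}

console.log(checkSurface({ skills: 59, agents: 33, commands: 39 }));
```

An empty result means every count matches; otherwise each mismatch is reported by name.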

Q: Can I install only certain skills? Yes — the kit supports selective install. Each skill is self-contained. Copy only what you need.
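A selective install can be thought of as a copy plan over skill directories. The sketch below assumes each skill lives at skills/<name>, as the directory tree suggests. The function, the target directory, and the plan format are hypothetical, and the sketch only builds the plan without touching the file system.

```javascript
// Hypothetical planner: builds a list of copy operations for chosen skills.
// Assumes a skills/<name> layout; the target directory is a placeholder.
function planSelectiveInstall(selected, available, targetDir) {
  const known = new Set(available);
  const copies = [];
  const unknown = [];
  for (const name of selected) {
    if (known.has(name)) {
      copies.push({ from: `skills/${name}`, to: `${targetDir}/skills/${name}` });
    } else {
      // Report unknown names instead of silently skipping them.
      unknown.push(name);
    }
  }
  return { copies, unknown };
}

const plan = planSelectiveInstall(
  ["tdd-workflow", "security-review", "not-a-skill"],
  ["tdd-workflow", "security-review", "code-review"],
  "my-project"
);
console.log(plan);
```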

Q: How do I contribute? See CONTRIBUTING.md. Run npm test before submitting.


Docs

Document                    Purpose
Catalog                     Full inventory of all skills, agents, commands
Command-Agent Map           Which agent handles each command
Skill Development Guide     How to write new skills
Troubleshooting             Common issues and workarounds
Architecture Improvements   Design recommendations
Selective Install Design    How partial installs work
Session Adapter Contract    Session state specification

Security

  • Never hardcode secrets — use environment variables or secret managers.
  • Validate all external input at boundaries.
  • Prefer least privilege and explicit allowlists.
  • Review any change touching auth, secrets, or shell execution with security-reviewer.
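The first three bullets can be sketched in a few lines. The helper names, the environment variable name, and the allowlist contents below are hypothetical examples and are not part of the kit.

```javascript
// Hypothetical helpers illustrating the bullets above.

// Read a secret from the environment and fail loudly if it is missing,
// rather than falling back to a hardcoded value.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Explicit allowlist: anything not listed is rejected by default.
const ALLOWED_COMMANDS = new Set(["git", "npm", "node"]);

// Validate external input at the boundary before it reaches shell execution.
function assertAllowedCommand(command) {
  if (typeof command !== "string" || !ALLOWED_COMMANDS.has(command)) {
    throw new Error(`Command not allowlisted: ${String(command)}`);
  }
  return command;
}

console.log(assertAllowedCommand("git"));
console.log(requireEnv("API_TOKEN", { API_TOKEN: "example-value" }));
```

Rejecting by default keeps the failure mode safe: a new command has to be added to the allowlist deliberately.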

See SECURITY.md for the full policy.


License

MIT — see LICENSE.

Author

Noah Sheldon (noahsheldon.dev)