

Use this skill for analytics and data-science workflow setup, exploratory analysis, notebook-first EDA, repo normalization for analysis projects, experiment comparison, AutoML, causal analysis, and promotion from ad hoc exploration into reusable pipelines. Trigger when the user asks for analysis best practices, how to structure an analytics repo, how to organize notebooks and runs, whether to use marimo or Quarto/qmd, how to handle experiment sweeps, how to compare models, or how to make analysis reproducible. Also trigger on phrases such as analytic workbench, EDA, exploratory analysis, notebook workflow, analytics pipeline, reproducible analysis, experiment sweep, hyperparameter comparison, comparison table, marimo, Quarto, qmd, Hamilton, sf-hamilton, dataflow, DAG driver, Hydra, DVC, Kedro, MLflow, AutoML, PyCaret, causal analysis, feature engineering, or model review.

Instrument agentic LLM apps built on the Claude Agent SDK (claude-agent-sdk) and/or LangGraph with Arize Phoenix and OpenInference — tracing, evaluation, annotations, experiments, cost tracking, and self-hosting. Use when the user mentions Phoenix, arize-phoenix, openinference, LLM observability, LLM-as-judge evals, tracing Claude Agent SDK `query()` / `ClaudeSDKClient` calls, tool-use observability, tracing LangGraph nodes/edges, or debugging latency/cost/quality of an agent.

Convert UI designs into structured JSONC spec packages before code is written, especially for constrained platforms like extensions, dashboards, desktop shells, and mobile apps. Use for design handoff and design-to-spec workflows. Outputs specs, not implementation code.

Design and implement end-to-end tests using BDD/Gherkin scenarios and browser automation. Use this skill when the user wants to write E2E tests, define user journeys, create acceptance tests for a web app, set up browser testing infrastructure, or convert requirements into executable Gherkin scenarios. Also use when the user asks about testing strategy, wants to add E2E tests to an existing project, or mentions Cucumber, BDD, Gherkin, Playwright, Cypress, or browser testing.
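The core BDD mechanic this skill relies on is binding Gherkin step text to executable functions. Frameworks like behave or pytest-bdd do this for real; the sketch below is a minimal plain-Python illustration of the idea, not any framework's actual API, and the step patterns and cart scenario are invented for the example:

```python
import re

# Registry mapping step-text patterns to handler functions, mimicking
# how BDD frameworks bind Gherkin lines to step definitions.
STEPS = []

def step(pattern):
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r"a cart with (\d+) items?")
def given_cart(ctx, n):
    ctx["items"] = int(n)

@step(r"the user adds (\d+) items?")
def when_add(ctx, n):
    ctx["items"] += int(n)

@step(r"the cart should contain (\d+) items?")
def then_check(ctx, n):
    assert ctx["items"] == int(n), f"cart has {ctx['items']}, expected {n}"

def run_scenario(lines):
    """Execute Given/When/Then lines against the registered steps."""
    ctx = {}
    for line in lines:
        # Drop the Gherkin keyword; match the remainder against patterns.
        text = line.split(" ", 1)[1]
        for pattern, fn in STEPS:
            m = pattern.fullmatch(text)
            if m:
                fn(ctx, *m.groups())
                break
        else:
            raise LookupError(f"no step matches: {text}")
    return ctx

scenario = [
    "Given a cart with 2 items",
    "When the user adds 3 items",
    "Then the cart should contain 5 items",
]
```

In a real project the same binding happens through the framework's decorators, and the scenario text lives in `.feature` files rather than Python lists.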

Use this skill for LangGraph, Deep Agents, LangChain agents built on LangGraph, MCP-to-LangGraph tool bridging, stateful workflows, subgraphs, subagents, interrupts, checkpointing, streaming, and multi-agent orchestration. Trigger when code imports langgraph, deepagents, langchain_mcp_adapters, langchain.agents, or when the user asks for agent graphs, orchestration, durable execution, HITL, or LangGraph architecture and patterns.

Launch and manage ralph-orchestrator planner-builder-reviewer loops for autonomous multi-step implementation. Use this skill whenever the user says "ralph loop", "ralph orchestrate", "ralph run", wants to delegate work to a plan/build/review cycle, mentions phase plans, wants to configure loop iterations (max activations), hat workflows, cost budgets, or guardrails. Also trigger when the user asks to "orchestrate", "delegate to ralph", "launch a loop", "reduce max to N", or references the planner/builder/reviewer pattern. Covers project setup, spec writing, tmux launch, loop monitoring, steering, and ceremony.
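The planner-builder-reviewer cycle with a capped activation budget can be sketched in plain Python. This is a hypothetical illustration of the control flow only, not ralph-orchestrator's real API; the function names and state shape are invented:

```python
def run_loop(plan, build, review, max_activations=6):
    """Hypothetical planner -> builder -> reviewer loop with an
    activation budget (invented sketch, not a real orchestrator API)."""
    state = {"tasks": plan(), "done": [], "activations": 0}
    while state["tasks"] and state["activations"] < max_activations:
        task = state["tasks"].pop(0)
        artifact = build(task)          # builder produces an artifact
        state["activations"] += 1       # each build consumes budget
        if review(artifact):
            state["done"].append(artifact)
        else:
            state["tasks"].append(task)  # rejected work returns to the queue
    return state
```

"Reduce max to N" in this sketch corresponds to lowering `max_activations`, which bounds how many build cycles run before the loop stops regardless of remaining tasks.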

Recover a prior Claude Code session from natural-language hints, search Claude history by topic/date/project, and import the useful context into the current conversation. Use this for Claude session handoff, transcript recovery, context transfer into Codex or another agent, and continuing after Claude hit a usage or rate limit.

Use this skill for general Robot Framework work: authoring `.robot` suites, tasks, keywords, variables, resource files, execution, dry runs, tags, Rebot/Libdoc usage, and Python test-library patterns. Trigger when the user mentions Robot Framework, `.robot` files, keywords, libraries, resource files, tasks, listeners, Libdoc, Rebot, or Robot Framework syntax and execution.
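The Python test-library pattern mentioned above is just a class whose public methods become keywords. A minimal sketch (the library name and keywords here are invented for illustration):

```python
class CartLibrary:
    """Minimal Robot Framework test library: each public method becomes
    a keyword (`Add Item`, `Item Count Should Be`). Arguments arrive as
    strings when the keyword is called from a .robot file."""

    ROBOT_LIBRARY_SCOPE = "TEST"  # a fresh instance for every test case

    def __init__(self):
        self._items = []

    def add_item(self, name):
        self._items.append(name)

    def item_count_should_be(self, expected):
        actual = len(self._items)
        if actual != int(expected):
            raise AssertionError(f"expected {expected} items, found {actual}")
```

In a suite this would be used as `Library    CartLibrary.py` in the Settings section, then `Add Item    apple` and `Item Count Should Be    1` in a test body, assuming the file is importable from the suite's library search path.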

Spec-driven development: a plan → go → review → project loop with spec lifecycle states, YAML frontmatter, an AI-operated feature projection grounded in code reality, and an AI-operated refresh of project documentation against that same grounding. Use for planning features, implementing from specs, refining specs, regenerating the project-level feature ledger, bringing README / docs/ / CHANGELOG up to date with specs and code, tracking which features exist across specs, and resuming work. Trigger on requests mentioning specs, requirements/design/tasks, spec-help, spec-plan, spec-project, spec-docs, feature ledger, FEATURES.md, spec-link, `.kiro`, `specs/`, refreshing or syncing docs from specs, updating README to match reality, or any ask to audit, project, or refresh the state of features and docs in the project. IMPORTANT: Never edit spec files without first reading this skill. Never hand-edit FEATURES.md — it is a derived projection. Never regenerate managed doc sections without reading `references/docs-refresh-playbook.md`.

Design and implement distinctive, production-grade screen UI for vibe-coding workflows: loose prompts, taste-driven iteration, and fast frontend polishing that still respects usability, layout discipline, and real code constraints. Use this skill whenever the user wants a frontend to feel more polished, modern, premium, cohesive, bold, or simply "better", including landing pages, dashboards, app screens, forms, lists, onboarding, settings, and component restyling. Also use it for vague requests like "make this UI nicer", "give this more personality", "make it feel expensive", or "vibe code this screen." Preserve existing design systems when they are good; introduce a clearer visual and interaction system when they are weak or absent.

Umbrella skill for agent work discipline across development, analysis, and documentation: inspect the repo before restructuring, keep durable truth in repo artifacts instead of chat memory, co-evolve specs/design/steering/user docs with code, apply sound coding patterns, verify work honestly, avoid shortcuts, work efficiently with subagents without hallucinating, and keep moving through the next concrete work item when the human is away. References cover coding patterns, AI-authored code review, and artifact co-evolution. Trigger when the user asks for workflow discipline, coding patterns, doc/artifact maintenance, code review of AI-authored code, project hygiene, execution guardrails, repo normalization, or when a task risks drifting across architecture, storage, specs, continuity, or tooling boundaries.