Victoria Duan

@Victoriakaey
16 published skills, 0 installs

16 results

Victoria Duan
Collection

build-reliable-agents

0
Victoria Duan
Skill

agent-architecture

Use at the start of any LLM agent project, or when reconsidering an existing architecture. Guides decisions across four layers: workflow vs agent, single-agent vs multi-agent, tool-use vs specialized nodes, and retrieval strategy. Each decision has concrete tradeoffs and a recommended default.

0
Victoria Duan
Skill

agent-integration

Use when connecting an LLM agent to a full-stack application, external API, or third-party platform. Covers four integration patterns (REST, WebSocket/SSE, Webhook, Message Queue), interface design, reliability, security, and observability. Framework-agnostic — guides you to the right pattern for your situation, then gives concrete implementation direction for your chosen stack.

0
Victoria Duan
Skill

ai-system-design

Use when a user has an idea for a product or feature that might involve AI, but doesn't know where to start or how to design the system. Guides non-technical users through a conversational process to clarify their idea, decide where AI fits, and produce a system design they can understand — and that Claude can use to start building. One question at a time. Never assume technical knowledge.

0
Victoria Duan
Skill

code-review

Use when reviewing code — either Claude reviews your code and produces a structured report, or Claude guides you through reviewing someone else's code. Default mode: Claude performs the review and produces a report organized by severity. Second mode: guided self-review with a structured checklist and probing questions. Covers correctness, security, performance, maintainability, and AI-specific concerns for LLM applications.

0
Victoria Duan
Skill

critic-judge-design

Use when designing any LLM-as-Judge, Critic, or Evaluator node. Covers input structure, output schema, chain-of-thought ordering, single-pass vs multi-stage tradeoffs, and known failure modes. Prevents the most common design mistakes that cause Critic nodes to be unreliable.

0
Victoria Duan
Skill

database-design

Use when designing a database schema for a new system, or when an existing schema needs to be revised. Two modes: (1) general schema design for any application, (2) AI-specific schema design for systems with conversation history, embeddings, agent state, or LLM outputs. Guides you through data modeling decisions, type choices, indexing, and AI-specific storage patterns.

0
Victoria Duan
Skill

devops

Use when deploying an application to production, setting up CI/CD, or moving from local development to a hosted environment. Guides you through deployment target selection (PaaS, Docker+VPS, cloud platform), containerization, environment configuration, CI/CD pipeline design, monitoring, and rollback strategy. Covers both simple and complex deployment scenarios.

0
Victoria Duan
Skill

experiment-driven-development

Use at the start of any implementation task on an LLM system. A 10-step process from evidence collection to git commit. Prevents the most common failure mode in LLM development: writing code based on assumptions instead of observed system behavior. Use for any change — prompt tuning, node logic, routing, architecture — not just experiments.

0
Victoria Duan
Skill

harness-design

Use when designing or improving the action space, observation format, tool boundaries, and evaluation signals of an LLM agent. Covers the five layers of harness design: what tools the agent has, what it sees after acting, how each tool is scoped, how behavior is evaluated, and how to iterate when the agent misbehaves. Prevents the most common harness failures — agents that pick wrong tools, ignore critical information, or loop without progress.

0
Victoria Duan
Skill

memory-system

Use when designing how an agent remembers and uses information across turns, sessions, or runs. Covers the four types of agent memory (in-context, episodic, semantic, procedural), when to use each, how to manage context window limits, retrieval strategies, and memory decay. Cross-references the database-design skill for storage schema patterns.

0
Victoria Duan
Skill

model-selection

Use when choosing a model for an LLM application, deciding between API and local deployment, or evaluating whether fine-tuning is needed. Guides you through the deployment decision (API vs local), model selection based on task requirements and constraints, configuration setup, and the prompt engineering vs fine-tuning decision. Records the user's setup for future reference.

0
Victoria Duan
Skill

problem-exploration

Use when facing a technical problem with multiple possible solutions, or when a previous approach failed and you need to reason about alternatives. Structures the space of options before committing to any one. Core output: a brainstorm log with options, tradeoffs, a chosen approach, and unresolved gaps that must be answered before implementation starts. Use before experiment-driven-development Step 4 when the design space is large or uncertain.

0
Victoria Duan
Skill

prompt-change-management

Use before making any prompt change in an LLM system. Covers the full cycle: pre-change documentation, change scoping, post-change validation, and rollback protocol. Prevents silent regressions caused by prompt changes with unexpected downstream effects.

0
Victoria Duan
Skill

prompt-design

Use before writing any new LLM component prompt, or when an existing prompt is producing unstable, inconsistent, or wrong outputs. Two modes: (1) design from scratch by defining component boundaries and structure before writing, (2) diagnose an existing prompt to find the root cause of failure. Covers all dimensions of prompt quality: job definition, input structure, output schema, rules, few-shot examples, system/user split, token budget, and prompt decay over time.

0
Victoria Duan
Skill

regression-testing

Use when comparing two versions of an LLM system, debugging non-deterministic behavior, or establishing a baseline before making changes. Covers a controlled comparison protocol, determinism verification, and a bisect protocol for finding which change caused a regression.

0