Matan Tsach

@matantsach
3 published skills · 0 installs

3 results

Matan Tsach
Collection

snapeval

0
Matan Tsach
Skill

create-evals

Set up evaluations for an AI skill from scratch — designs test scenarios, writes evals.json, and runs the first benchmark. Use when no evals exist yet and the user wants to evaluate, test, benchmark, or review a skill. Triggers on "evaluate my skill", "test my skill", "set up evals", "how good is my skill", "benchmark this skill", "create evals for", or any request to assess skill quality when there is no existing evals/evals.json file.

0
Matan Tsach
Skill

run-evals

Run and iterate on existing skill evaluations. Use when evals/evals.json already exists and the user wants to run evals, re-evaluate after skill changes, check results, compare iterations, add/modify eval cases, or gate CI with thresholds. Triggers on "run evals", "re-eval", "how did it do", "check results", "compare iterations", "run benchmarks", or any eval-related request when evals already exist.

0