Tal Muskal

@tmuskal
7 published skills
0 installs

Tal Muskal
Collection

Claude Code Self-Benchmarking Marketplace

0
Tal Muskal
Skill

Browse Tests

Explore available ARC-AGI environments - lists games, shows details with ASCII grid visualization, and displays historical scores

0
Tal Muskal
Skill

Compare Runs

Compare two or more ARC-AGI benchmark runs - shows score deltas, config changes, and trends to track improvement or regression

0
Tal Muskal
Skill

Cross Harness

Cross-harness benchmarking - generate instructions for Codex/Gemini/OpenCode, import results, and compare across harnesses

0
Tal Muskal
Skill

report

Generate and display comprehensive reports from completed ARC-AGI benchmark runs - shows scores, per-game breakdowns, and performance analysis

0
Tal Muskal
Skill

Run Benchmark

Execute benchmark runs against ARC-AGI games - plays games with Claude Code as the agent and records scores

0
Tal Muskal
Skill

setup

Set up the ARC-AGI benchmarking environment - installs dependencies, configures API access, and verifies the setup works

0