
anthropic-evaluations

This skill should be used when the user asks to "create evals", "evaluate an agent", "build evaluation suite", or mentions agent testing, graders, or benchmarks. Also suggest when building coding agents, conversational agents, or research agents that need quality assurance.

Maintainer: dwmkerr
Updated: 1/19/2026
Stars: 1
Forks: 0
Quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

After installing, you can use this skill by running the following command in your terminal:

skills use anthropic-evaluations