anthropic-evaluations

This skill should be used when the user asks to "create evals", "evaluate an agent", or "build evaluation suite", or mentions agent testing, graders, or benchmarks. Also suggest it when the user is building coding agents, conversational agents, or research agents that need quality assurance.

Maintainer: dwmkerr
Updated: 1/19/2026
Stars: 1
Forks: 0
Quick start

Installation and usage

Install
$ install --globalskills.sh
Usage

After installation, you can use this skill by running the following command in your terminal:

skills use anthropic-evaluations