
testing, testing-security

anthropic-evaluations

Name: anthropic-evaluations
Author: dwmkerr

This skill should be used when the user asks to "create evals", "evaluate an agent", or "build evaluation suite", or mentions agent testing, graders, or benchmarks. Also suggest it when building coding agents, conversational agents, or research agents that need quality assurance.


Maintainer: dwmkerr

Updated: 1/19/2026


Quick start

Installation and usage

Installation

$ install --globalskills.sh

Usage

After installation, you can use this skill by running the following command in your terminal:

skills use anthropic-evaluations