
testing, testing-security

anthropic-evaluations

Name: anthropic-evaluations
Author: dwmkerr

This skill should be used when the user asks to "create evals", "evaluate an agent", "build evaluation suite", or mentions agent testing, graders, or benchmarks. Also suggest it when the user is building coding agents, conversational agents, or research agents that need quality assurance.


maintainer

dwmkerr

Updated 1/19/2026


quick start

Installation and usage

Install

$ install --globalskills.sh

Usage

After installation, you can use this skill by running the following command in your terminal:

skills use anthropic-evaluations