sales-marketing, business
agent-evaluation
Testing and benchmarking LLM agents, including behavioral testing, capability assessment, reliability metrics, and production monitoring. Even top agents achieve less than 50% on real-world benchmarks.
maintainer
sickn33
Updated 4/7/2026
Stars
32093
Forks
5340
quick start
Installation and usage
Installation
$ install --globalskills.sh
Usage
After installation, you can use this skill by running the following command in your terminal:
skills use agent-evaluation