home/categories/academic/greyhaven-ai-claude-code-config-grey-haven-plugins-core-skills-evaluation-skill-md
academicresearch
grey-haven-evaluation
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
maintainer
greyhaven-ai
업데이트됨 1/10/2026
스타
16
포크
2
quick start
Installation and usage
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
설치
$ install --globalskills.sh
사용법
설치 후 터미널에서 다음 명령을 실행하여 이 스킬을 사용할 수 있습니다:
skills use grey-haven-evaluation