agent-evaluation-mlflow

Implement agent evaluation and safety gates using MLflow 3.x. Use for creating LLM-as-Judge scorers, evaluation datasets, quality gates, tracing, and continuous evaluation. Triggers on "evaluate agent", "MLflow scorer", "LLM judge", "safety evaluation", "quality gate", "agent testing", "hallucination detection", or when implementing spec/010-agent-evaluation.md requirements.

View source: machine-learning

maintainer

raphaelmansuy

Updated 12/19/2025

quick start

Installation and usage

Install

$ install --globalskills.sh

Usage

After installation, you can use this skill by running the following command in your terminal:

skills use agent-evaluation-mlflow