
mlflow-evaluation

MLflow 3 GenAI evaluation for agent development. Use when (1) writing mlflow.genai.evaluate() code, (2) creating @scorer functions, (3) building evaluation datasets from traces, (4) using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), (5) analyzing traces for latency/errors/architecture, (6) optimizing agent context/prompts/token usage, (7) debugging evaluation failures. Covers the full eval workflow: trace analysis -> dataset building -> scorer creation -> evaluation execution.

Maintainer: databricks-solutions
Updated: 1/19/2026
Stars: 5
Forks: 5
quick start

Installation and usage


Installation
$ install --globalskills.sh
Usage

After installation, you can use this skill by running the following command in your terminal:

skills use mlflow-evaluation
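
As a rough illustration of the workflow named in the description (dataset building -> scorer creation -> evaluation execution), the following is a minimal sketch and not code taken from the skill itself. The scorer name `contains_expected`, the `expected_substring` key, the sample rows, and `predict_fn` are all hypothetical. The sketch assumes MLflow 3 is installed for the final step and falls back to a no-op decorator otherwise, so the scoring logic can still be exercised on its own.

```python
# Minimal sketch, assuming MLflow 3's mlflow.genai API is available.
# All names below (contains_expected, expected_substring, predict_fn,
# the sample rows) are hypothetical and chosen only for illustration.
try:
    import mlflow
    from mlflow.genai.scorers import scorer
except ImportError:
    # Fallback so the plain scoring logic can run without MLflow installed.
    mlflow = None

    def scorer(fn):
        return fn


def contains_expected(outputs, expectations):
    """Return True when the output mentions the expected substring."""
    return expectations["expected_substring"].lower() in str(outputs).lower()


# Wrap the plain function as a custom scorer for mlflow.genai.evaluate().
contains_expected_scorer = scorer(contains_expected)

# A tiny hand-written evaluation dataset: each row has inputs and expectations.
eval_data = [
    {
        "inputs": {"question": "What is MLflow tracing?"},
        "expectations": {"expected_substring": "mlflow"},
    },
    {
        "inputs": {"question": "How do I log a model with MLflow?"},
        "expectations": {"expected_substring": "log a model"},
    },
]


def predict_fn(question: str) -> str:
    """Stand-in for an agent; a real agent would call a model here."""
    return f"MLflow answer to: {question}"


def run_evaluation():
    """Run the evaluation; requires MLflow 3 and is not called automatically."""
    if mlflow is None:
        raise RuntimeError("MLflow is not installed")
    return mlflow.genai.evaluate(
        data=eval_data,
        predict_fn=predict_fn,
        scorers=[contains_expected_scorer],
    )
```

Keeping the scoring logic in a plain function makes it easy to check against a few rows before handing the wrapped scorer to `mlflow.genai.evaluate()`; calling `run_evaluation()` is left to the reader because it needs an MLflow environment.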