evaluation

Name: evaluation
Author: NVIDIA

Evaluates accuracy of quantized or unquantized LLMs using NeMo Evaluator Launcher (NEL). Triggers on "evaluate model", "benchmark accuracy", "run MMLU", "evaluate quantized model", "accuracy drop", "run nel". Handles deployment, config generation, and evaluation execution. Not for quantizing models (use ptq) or deploying/serving models (use deployment).

View Source machine-learning

maintainer

NVIDIA

Updated 4/3/2026

Stars

2429

Forks

344

quick start

Installation and usage

Installation

$ install --globalskills.sh

Usage

Once installed, you can use this skill by running the following command in your terminal:

skills use evaluation