verl-rl-training

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

Maintainer: Orchestra-Research
Updated 1/29/2026
Stars: 6563
Forks: 515
Quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

After installation, you can use this skill by running the following command in your terminal:

skills use verl-rl-training