home/categories/machine-learning/orchestra-research-ai-research-skills-06-post-training-verl-skill-md
machine-learningdata-ai
verl-rl-training
Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.
maintainer
Orchestra-Research
अपडेट किया गया 1/29/2026
स्टार
6563
फोर्क
515
quick start
Installation and usage
Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.
इंस्टॉलेशन
$ install --globalskills.sh
उपयोग
इंस्टॉल करने के बाद, आप टर्मिनल में यह कमांड चलाकर इस स्किल का उपयोग कर सकते हैं:
skills use verl-rl-training