home/categories/machine-learning/letta-ai-skills-letta-benchmarks-trajectory-only-torch-pipeline-parallelism-skill-md
machine-learningdata-ai

torch-pipeline-parallelism

This skill provides guidance for implementing PyTorch pipeline parallelism for distributed training of large language models. It should be used when implementing pipeline parallel training loops, partitioning transformer models across GPUs, or working with AFAB (All-Forward-All-Backward) scheduling patterns. The skill covers model partitioning, inter-rank communication, gradient flow management, and common pitfalls in distributed training implementations.

letta-ai
maintainer
letta-ai
업데이트됨 1/19/2026
스타
31
포크
5
quick start

Installation and usage

This skill provides guidance for implementing PyTorch pipeline parallelism for distributed training of large language models. It should be used when implementing pipeline parallel training loops, partitioning transformer models across GPUs, or working with AFAB (All-Forward-All-Backward) scheduling patterns. The skill covers model partitioning, inter-rank communication, gradient flow management, and common pitfalls in distributed training implementations.

설치
$ install --globalskills.sh
사용법

설치 후 터미널에서 다음 명령을 실행하여 이 스킬을 사용할 수 있습니다:

skills use torch-pipeline-parallelism