home/categories/framework-internals/nvidia-tensorrt-llm-claude-skills-kernel-triton-writing-skill-md
framework-internalsdevelopment

kernel-triton-writing

ONLY for OpenAI Triton (@triton.jit) kernel development. NEVER use for CUDA C++ kernels, TileIR, or profiling tools (ncu, nsys). The user's request must involve Triton explicitly. Covers Triton-specific patterns: fused elementwise, reductions (softmax, LayerNorm, RMSNorm), tiled GEMM with triton.autotune, and flash attention. Workflow: design, write, verify (with fast-path for explicit requests).

NVIDIA
maintainer
NVIDIA
Обновлено 4/8/2026
Звёзды
13335
Форки
2271
quick start

Installation and usage

ONLY for OpenAI Triton (@triton.jit) kernel development. NEVER use for CUDA C++ kernels, TileIR, or profiling tools (ncu, nsys). The user's request must involve Triton explicitly. Covers Triton-specific patterns: fused elementwise, reductions (softmax, LayerNorm, RMSNorm), tiled GEMM with triton.autotune, and flash attention. Workflow: design, write, verify (with fast-path for explicit requests).

Установка
$ install --globalskills.sh
Использование

После установки вы можете использовать этот skill, выполнив следующую команду в терминале:

skills use kernel-triton-writing