kernel-cute-writing

Name: kernel-cute-writing
Author: NVIDIA

Write and implement GPU kernels using NVIDIA CuTe DSL (CUTLASS 4.x Python API) — NOT for Triton, CUDA C++, or conceptual explanations. Trigger only when the user wants to write or implement a kernel, not when asking questions about CuTe DSL concepts or layouts. CuTe DSL uses cute.jit/cute.kernel decorators and cutlass.cute imports. Covers element-wise kernels, GEMM patterns, reductions, memory hierarchy (global/shared/register/TMA), MMA tensor core operations, software pipelining, and framework integration.

Ver código fuente framework-internals

maintainer

NVIDIA

Actualizado 4/8/2026

Estrellas

13335

Forks

2271

quick start

Installation and usage

Instalación

$ install --globalskills.sh

Uso

Después de instalarlo, puedes usar este skill ejecutando el siguiente comando en tu terminal:

skills use kernel-cute-writing