debugging tools

perf-nsight-compute-analysis

Analyze ncu (NVIDIA Nsight Compute) profiling output: SOL% bottleneck classification, roofline analysis, occupancy diagnosis, memory hierarchy analysis, warp stall analysis, metric interpretation, and programmatic .ncu-rep report analysis. NOT for kernel writing or code generation, Nsight Systems (nsys), host-side profiling, or system-level profiling.

Maintainer: NVIDIA
Updated: 4/8/2026
Stars: 13335
Forks: 2271
Quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

After installation, you can use this skill by running the following command in your terminal:

skills use perf-nsight-compute-analysis