k8s-gpu-pod-troubleshooter

A comprehensive diagnostic skill for troubleshooting GPU pod scheduling and allocation issues in Kubernetes clusters using HAMi (Heterogeneous AI Computing Virtualization Middleware). It identifies GPU resource constraints, webhook configuration problems, device plugin issues, and scheduler policy misconfigurations to provide actionable remediation guidance.

Maintainer: Project-HAMi
Updated: 3/2/2026
Stars: 3256
Forks: 506
Quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

After installation, run the following command in your terminal to use this skill:

skills use k8s-gpu-pod-troubleshooter
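
For orientation, the four areas named in the description (GPU resource constraints, webhook configuration, device plugin, scheduler policy) can each be inspected by hand with standard kubectl commands. The sketch below is not taken from the skill itself: the function name diagnostic_commands, the kube-system namespace, the assumption that the device plugin runs as a DaemonSet, and the nvidia.com/gpu resource name are all illustrative assumptions that may not match a given HAMi installation. It only builds read-only commands and prints them, so it can be run without a cluster.

```python
import shlex


def diagnostic_commands(pod, namespace="default", system_namespace="kube-system"):
    """Return read-only kubectl commands grouped by the area they inspect.

    Hypothetical helper for illustration; it is not part of the skill.
    The system namespace and the GPU resource name are assumptions.
    """
    return {
        # Pod status and per-node allocatable GPUs (resource name assumed).
        "gpu_resources": [
            ["kubectl", "describe", "pod", pod, "-n", namespace],
            ["kubectl", "get", "nodes", "-o",
             r"custom-columns=NAME:.metadata.name,"
             r"GPU:.status.allocatable.nvidia\.com/gpu"],
        ],
        # Admission webhooks registered in the cluster.
        "webhook": [
            ["kubectl", "get", "mutatingwebhookconfigurations"],
        ],
        # Device plugin, assumed here to be deployed as a DaemonSet.
        "device_plugin": [
            ["kubectl", "get", "daemonsets", "-n", system_namespace],
        ],
        # Scheduling events recorded for the pod.
        "scheduler": [
            ["kubectl", "get", "events", "-n", namespace,
             "--field-selector", f"involvedObject.name={pod}"],
        ],
    }


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing touches a cluster.
    for area, commands in diagnostic_commands("my-gpu-pod").items():
        print(f"# {area}")
        for argv in commands:
            print(shlex.join(argv))
```

Running the script prints one group of commands per area; they can then be pasted into a terminal that has access to the cluster, after adjusting the pod name and namespaces.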