containersdevops
k8s-gpu-pod-troubleshooter
A comprehensive diagnostic skill for troubleshooting GPU pod scheduling and allocation issues in Kubernetes clusters using HAMi (Heterogeneous AI Computing Virtualization Middleware). It identifies GPU resource constraints, webhook configuration problems, device plugin issues, and scheduler policy misconfigurations to provide actionable remediation guidance.
maintainer
Project-HAMi
更新于 3/2/2026
星标
3256
分支
506
quick start
Installation and usage
A comprehensive diagnostic skill for troubleshooting GPU pod scheduling and allocation issues in Kubernetes clusters using HAMi (Heterogeneous AI Computing Virtualization Middleware). It identifies GPU resource constraints, webhook configuration problems, device plugin issues, and scheduler policy misconfigurations to provide actionable remediation guidance.
安装
$ install --globalskills.sh
使用
安装后,您可以通过在终端运行以下命令来使用此技能:
skills use k8s-gpu-pod-troubleshooter