category focus

Frameworks

Deep dive into framework internals.

1580 skills (all categories)

framework-internals

634

typescript-pro

Master TypeScript with advanced types, generics, and strict type safety. Handles complex type systems, decorators, and enterprise-grade patterns. Use PROACTIVELY for TypeScript architecture, type inference optimization, or advanced typing patterns.

rmyndharis

development

open

framework-internals

634

legacy-modernizer

Refactor legacy codebases, migrate outdated frameworks, and implement gradual modernization. Handles technical debt, dependency updates, and backward compatibility. Use PROACTIVELY for legacy system updates, framework migrations, or technical debt reduction.

rmyndharis

development

open

framework-internals

625

weight-conversion

Converting PyTorch model weights to Keras h5 format for keras_cv_attention_models

leondgarse

development

open

framework-internals

614

validate-binaries

Validate voxtype binaries for CPU instruction contamination. Use when checking release binaries for AVX-512 or GFNI instruction leaks that would crash on older CPUs.

peteonrails

development

open

framework-internals

605

typescript

This skill should be used when the user asks to "optimize TypeScript performance", "speed up tsc compilation", "configure tsconfig.json", "fix type errors", "improve async patterns", or encounters TS errors (TS2322, TS2339, "is not assignable to"). Also triggers on .ts, .tsx, .d.ts file work involving type definitions, module organization, or memory management. Does NOT cover TypeScript basics, framework-specific patterns, or testing.

stablyai

development

open

framework-internals

564

cuda-kernels

Provides guidance for writing and benchmarking optimized CUDA kernels for NVIDIA GPUs (H100, A100, T4) targeting HuggingFace diffusers and transformers libraries. Supports models like LTX-Video, Stable Diffusion, LLaMA, Mistral, and Qwen. Includes integration with HuggingFace Kernels Hub (get_kernel) for loading pre-compiled kernels. Includes benchmarking scripts to compare kernel performance against baseline implementations.

huggingface

development

open

framework-internals

560

sequence-packing

Operational guide for enabling packed sequences and long-context config paths in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA-NeMo

development

open

framework-internals

560

expert-parallel-overlap

Validate and use MoE expert-parallel communication overlap in Megatron-Bridge, including overlap_moe_expert_parallel_comm, delay_wgrad_compute, and flex dispatcher backends such as DeepEP and HybridEP.

NVIDIA-NeMo

development

open

framework-internals

538

data-flow-analysis-framework

Design and implement data-flow analyses for compiler optimization

a5c-ai

development

open

framework-internals

538

nccl-communication

NVIDIA Collective Communications Library integration for multi-GPU operations. Initialize NCCL communicators, execute collective operations, configure communication topologies, profile collective performance, and support RCCL for AMD compatibility.

a5c-ai

development

open

framework-internals

538

cutlass-triton

High-performance kernel template libraries and DSLs. Generate CUTLASS GEMM configurations, implement Triton kernel definitions, configure epilogue operations, tune tile sizes and warp arrangements, and benchmark against cuBLAS.

a5c-ai

development

open

framework-internals

538

cuda-toolkit

Deep integration with NVIDIA CUDA toolkit for kernel development, compilation, and debugging. Execute nvcc compilation with optimization flags analysis, generate and validate CUDA kernel code, analyze PTX/SASS assembly output, and configure execution parameters.

a5c-ai

development

open

framework-internals

538

opencl-runtime

Cross-vendor OpenCL runtime management and kernel development. Query platforms/devices, generate portable OpenCL C kernel code, handle vendor-specific extensions, manage contexts and command queues, compile and cache programs.

a5c-ai

development

open

framework-internals

538

cuda-graphs

Expert skill for CUDA Graph capture and optimization for reduced launch overhead. Capture CUDA operations into graphs, instantiate and execute graph instances, update graph node parameters, profile graph vs stream execution, design graph-friendly kernel patterns, and optimize launch latency for inference.

a5c-ai

development

open

framework-internals

538

cublas-cudnn

Expert integration with NVIDIA GPU-accelerated math libraries. Configure cuBLAS tensor core operations, generate cuBLAS GEMM calls, integrate cuDNN layers, handle algorithm selection, and support mixed-precision operations.

a5c-ai

development

open

framework-internals

538