avoiding-non-null-assertions
Avoid the non-null assertion operator (!) and use type-safe alternatives instead
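As a rough illustration of what "type-safe alternatives" to the non-null assertion operator can look like, the sketch below uses explicit narrowing, optional chaining, nullish coalescing, and an assertion function. The `User` type, the `users` map, and all function names are hypothetical and exist only for this example; they are not taken from any particular codebase.

```typescript
// Hypothetical data model, for illustration only.
interface User {
  name: string;
  email?: string;
}

const users = new Map<string, User>([
  ["ada", { name: "Ada", email: "ada@example.com" }],
  ["bob", { name: "Bob" }],
]);

// Instead of `users.get(id)!.email!.split("@")[1]`, narrow explicitly
// and let optional chaining handle the possibly-missing email.
function emailDomain(id: string): string | undefined {
  const user = users.get(id);
  if (user === undefined) {
    return undefined;
  }
  return user.email?.split("@")[1];
}

// Nullish coalescing supplies a default instead of asserting presence.
function displayEmail(id: string): string {
  return users.get(id)?.email ?? "(no email)";
}

// When absence really is a bug, an assertion function fails loudly at
// runtime and still narrows the type for the compiler.
function assertDefined<T>(
  value: T | undefined | null,
  message: string
): asserts value is T {
  if (value === undefined || value === null) {
    throw new Error(message);
  }
}

function requireUser(id: string): User {
  const user = users.get(id);
  assertDefined(user, `unknown user: ${id}`);
  return user; // narrowed to User here, no `!` needed
}
```

Which alternative fits depends on whether a missing value is an expected case (return `undefined` or a default) or a programming error (throw with a descriptive message).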
Debug PyTorch issues systematically. Use when encountering tensor errors, CUDA out of memory errors, gradient problems like NaN loss or exploding gradients, shape mismatches between layers, device conflicts between CPU and GPU, autograd graph issues, DataLoader problems, dtype mismatches, or training instabilities in deep learning workflows.
ChimeraX bundle/extension development reference
Set up a Python development environment with Astral's tools (uv, Ruff, ty). Use when creating projects, managing dependencies, improving code quality, or configuring CI/CD.
Fix common PyTorch bugs: percentile calculations, LayerNorm for Conv1d, buffer edge cases. Trigger when writing PyTorch code for RL or neural networks.
Use when improving data loading, validation, and IO boundaries in Python research code.
Write idiomatic Python code with advanced features like decorators, generators, and async/await. Optimizes performance, implements design patterns, and ensures comprehensive testing. Use for ML training, analytics tools, performance profiling, or any Python heavy lifting.
Universal functions (ufuncs) for vectorization, including reductions, in-place operations, and custom Python-function wrapping. Triggers: ufunc, vectorize, reduce, accumulate, frompyfunc, in-place.
Protocol for improving intelligence, precision, and safety in coding and debugging.
Exporting PyTorch models to ONNX format for cross-platform deployment. Includes handling dynamic axes, graph optimization in ONNX Runtime, and INT8 model quantization. (onnx, onnxruntime, torch.onnx.export, dynamic_axes, constant-folding, edge-deployment)
Assert CPU-only runtime inside a container using PyTorch (torch.cuda.is_available()==False) and optional env var checks. Use to prevent accidental GPU execution during CPU smoke tests.
WebGPU fundamentals for high-performance canvas rendering. Covers device initialization, buffer management, WGSL shaders, render pipelines, compute shaders, and web component integration. Use when building GPU-accelerated graphics, particle systems, or compute-intensive visualizations.
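Unified development orchestrator: deeply integrates the Superpowers methodology framework with the AI code factory to create a unified development workflow that combines TDD discipline, SoT compliance, and automated code generation.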
Guides adding new Higher Inductive Types to the ComputationalPaths library. Use when creating new HITs, defining fundamental group (pi1) calculations, implementing encode-decode proofs, or adding new topological spaces.
Optimizes TensorFlow.js WebGPU backend, WebLLM offline inference, and implements memory safety for heavy ML models
Audit and remove unnecessary abstractions from the codebase. Use when user wants to simplify, reduce dependencies, or eliminate complexity that slows down AI agents.
Use profunctor optics from the Collimator library for Lean 4. Use when working with lenses, prisms, traversals, or nested data access.
[ORCHESTRATOR] Runs full performance optimization loop for Truffle languages. Entry: User wants to optimize performance. Process: Determines current state, executes appropriate phase (1→2→3→4), loops until performance meets expectations. IMPORTANT: Single entry point for all optimization work.
Distributed training strategies including DistributedDataParallel (DDP) and Fully Sharded Data Parallel (FSDP). Covers multi-node setup, checkpointing, and process management using torchrun. (ddp, fsdp, distributeddataparallel, torchrun, nccl, rank, process-group)
Apply appropriate type hints for ML/PyTorch code. Use when adding type annotations to ML code or addressing mypy errors.
Expert embedded systems engineer specializing in microcontroller programming, RTOS development, and hardware optimization. Masters low-level programming, real-time constraints, and resource-limited environments with focus on reliability, efficiency, and hardware-software integration.