manage-memories
Memory layer operations for persistent session storage
Automatic code optimizer that uses GEPA evolutionary analysis to improve quality, performance, and maintainability.
Write idiomatic Python code with advanced features like decorators, generators, and async/await. Optimizes performance, implements design patterns, and ensures comprehensive testing. Use for ML training, analytics tools, performance profiling, or any Python heavy lifting.
Optimize PyTorch with torch.compile (TorchDynamo/Inductor), focusing on compile overhead, graph breaks, and benchmark methodology. Use when speeding up PyTorch models or debugging compile behavior; triggers: torch.compile, torchdynamo, inductor, graph break, pytorch optimization.
Convert Python loops to vectorized PyTorch tensor operations for performance. This skill should be used when optimizing computational bottlenecks in PyTorch code during Phase 4 performance optimization.
Fundamental NumPy operations including ndarray creation, dtypes, shape manipulation, and basic operations with a focus on memory alignment and data views. Triggers: numpy, ndarray, dtype, reshape, memory alignment, array-creation.
A template for skills that include executable code for deterministic operations.
Fixes type mismatch errors by adding appropriate casts or conversions.
Create FFI bindings between Lean 4 and C code. Use when working with foreign functions, native libraries, Metal, or system APIs.
Advanced 4-bit quantization techniques using Unsloth and BitsAndBytes for extreme VRAM efficiency (triggers: QLoRA, 4-bit, load_in_4bit, bnb-4bit, VRAM optimization, dynamic quantization).
Analyze the competitive and adjacent solution landscape to surface differentiation opportunities.
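Provides a hardware/software co-design thinking framework for embedded systems, covering five dimensions: hardware layer, software architecture, resource constraints, real-time behavior, and testing and debugging. Use when designing embedded applications, reviewing IoT systems, or taking a system-wide view of how an MCU/MPU and its software work together. Supports embedded-specific decisions such as bare-metal vs. RTOS selection, power optimization, memory budgeting, interrupt response, and OTA updates.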
GPU-safe parallel processing patterns for KINTSUGI to prevent OOM crashes and ensure Jupyter-compatible progress output
Exploiting custom interpreters and virtual machines
Step-by-step guide for adding new feature modules to the renderer process using feature-based architecture
Deep dive into memory layout, including strides, C vs Fortran order, and zero-copy view generation via stride tricks. Triggers: strides, C-order, Fortran-order, memory locality, stride_tricks.
Guides adding new Higher Inductive Types to the ComputationalPaths library. Use when creating new HITs, defining fundamental group (pi1) calculations, implementing encode-decode proofs, or adding new topological spaces.
Expert in low-level optimization. Its mission is to guarantee a constant 60 FPS, efficient memory management via Isolates, and full offline availability.
Build OBS Studio plugins for Windows using MSVC or MinGW. Covers Visual Studio setup, .def file exports, Windows linking (ws2_32, comctl32), platform-specific sources, and DLL verification. Use when building OBS plugins natively on Windows or troubleshooting Windows builds.
Create new vision detector plugins following Bob The Skull's detector architecture. Use when adding new detectors like object detection, pose estimation, gesture recognition, or any computer vision detector.
Solves linear systems (Ax=b) using NumPy, JAX, PyTorch, and Lineax. Covers dense, sparse, and specialized (tridiagonal) solvers.
High-performance C development for data-intensive systems, with explicit emphasis on time-indexed / log-structured in-memory engines (e.g., Timelog-class designs). Use when building advanced data structures, algorithms, or libraries in C with focus on: memory efficiency, cache locality, immutable segment layouts, atomic publication, snapshot reads, SIMD/bit operations, and (future) Python bindings. Applies to: custom allocators, paged/segment storage, compression and bitmaps, index structures, single-writer/multi-reader concurrency patterns, background maintenance (flush/compaction), and performance-critical library development.