readiness-check
Classifies pre-run readiness (proceed/modify/skip) using recent load, recovery, and self-reported signals.
WHEN: machine learning/deep learning code review, PyTorch/TensorFlow patterns, model training optimization, MLOps checks. WHAT: model architecture review, training patterns, data pipeline checks, GPU optimization, experiment tracking. WHEN NOT: data analysis only → python-data-reviewer; general Python → python-reviewer.
Model-trader version compatibility protocol: Embed version metadata in checkpoints, validate at load time. Trigger when: (1) training and live trading versions diverge, (2) models fail to load, (3) action interpretation issues.
Systematically search hyperparameter space. Use when tuning learning rate, batch size, or other hyperparameters.
Master exploration strategies: ε-greedy, UCB, curiosity-driven exploration, RND, and other intrinsic-motivation methods.
Debug TensorFlow and Keras issues systematically. This skill helps diagnose and resolve machine learning problems including tensor shape mismatches, GPU/CUDA detection failures, out-of-memory errors, NaN/Inf values in loss functions, vanishing/exploding gradients, SavedModel loading errors, and data pipeline bottlenecks. Provides tf.debugging assertions, TensorBoard profiling, eager execution debugging, and version compatibility guidance.
7-action space with position sizing (25/50/75%) + small account simulation. Trigger when: (1) model needs sizing decisions, (2) training for <$25K accounts, (3) upgrading obs_dim 5600→5900.
Configure and optimize 16-bit Low-Rank Adaptation (LoRA) and Rank-Stabilized LoRA (rsLoRA) for efficient LLM fine-tuning. Triggers: lora, qlora, rslora, rank selection, lora_alpha, lora_dropout, target_modules.
Master REINFORCE, PPO, and TRPO: direct policy optimization, from vanilla policy gradients to trust-region methods.
Rigorous A/B/C testing framework for empirically evaluating reasoning patterns. Use when you need data-driven pattern selection, want to quantify trade-offs between patterns, or need to validate claims about which cognitive methodology performs best. Enables scientific measurement of quality, cost, and time trade-offs across ToT, BoT, SRC, HE, AR, DR, AT, RTR, and NDF patterns.
Strategies for evaluating agents in production - sampling, baselines, and regression detection
Model serving engine for PyTorch. Focuses on MAR packaging, custom handlers for preprocessing/inference, and management of multi-GPU worker scaling. (torchserve, mar-file, handler, basehandler, model-archiver, inference-api)
Bayesian regression models including linear, logistic, Poisson, negative binomial, and robust regression with Stan and JAGS implementations.
Add account state (P&L, win rate, drawdown) to RL observations + drawdown penalty in rewards. Trigger when: (1) model needs account awareness, (2) training should penalize drawdowns, (3) upgrading obs_dim 5300→5600.
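Deep learning model architecture design expert. Use when the user asks about "model design", "network structure", "module decomposition", "data flow", "loss function", or similar topics, or when the model structure has not yet been specified in a training problem.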
Run Microsoft's eval-recipes benchmarks to validate amplihack improvements against baseline agents. Activates when testing with eval-recipes, running evals, or benchmarking changes.
Decomposed reasoning with explicit confidence scoring. Use for complex decisions, debugging failures, and architectural choices where tracking uncertainty prevents wasted effort.
Train RL models across multiple timeframes with resampling. Trigger when: (1) multi-timeframe training, (2) resampling data, (3) creating 1Hour/4Hour models.
Debug Scikit-learn issues systematically. Use when encountering model errors like NotFittedError, shape mismatches between train and test data, NaN/infinity value errors, pipeline configuration issues, convergence warnings from optimizers, cross-validation failures due to class imbalance, data leakage causing suspiciously high scores, or preprocessing errors with ColumnTransformer and feature alignment.
Detects unsafe training load spikes (>20-30% week-over-week) and emits safety flags. Use in nightly background jobs or when reviewing weekly training volume with conservative adjustment recommendations.
Run and monitor ATFT-GAT-FAN training loops, hyper-parameter sweeps, and safety modes on A100 GPUs.
Interactive diagnostic workflow for training problems. Use when training is failing, loss is stuck, gradients explode, NaN occurs, or convergence is poor.
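
The load-spike entry above (flagging week-over-week increases of more than 20-30%) can be illustrated with a minimal sketch. The function name `load_spike_flag`, the flag labels, and the handling of a missing baseline are assumptions made for illustration. Only the 20% and 30% thresholds come from the description; this is not the skill's actual implementation.

```python
def load_spike_flag(prev_week_load, this_week_load, caution=0.20, unsafe=0.30):
    """Classify the week-over-week change in training load.

    Returns "ok", "caution", or "unsafe" (hypothetical labels).
    Loads can be in any consistent unit, e.g. km or minutes.
    """
    if prev_week_load <= 0:
        # No baseline to compare against. Assumed conservative choice:
        # treat any new load as worth a caution flag.
        return "caution" if this_week_load > 0 else "ok"

    # Fractional change relative to the previous week.
    change = (this_week_load - prev_week_load) / prev_week_load

    if change > unsafe:
        return "unsafe"
    if change > caution:
        return "caution"
    return "ok"


# Example: 40 km last week, 50 km this week is a 25% increase.
print(load_spike_flag(40, 50))
```

Two thresholds are used so that the 20-30% band yields a softer flag than a jump above 30%. A nightly job could call this once per athlete and attach the result to its recommendations.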