data-cleaning
Data cleaning, preprocessing, and quality assurance techniques
Track and evaluate AI predictions over time to assess accuracy. Use when reviewing past predictions to determine if they came true, failed, or remain uncertain.
Comprehensive guidance for interpreting backtest results and detecting overfitting (project)
Select appropriate AI/ML models based on capability matching, benchmarks, cost-performance tradeoffs, and deployment constraints.
Plan explainable AI (XAI) requirements including SHAP, LIME, attention visualization, and regulatory explainability needs.
Expert in observing, benchmarking, and optimizing AI agents. Specializes in token usage tracking, latency analysis, and quality evaluation metrics. Use when optimizing agent costs, measuring performance, or implementing evals. Triggers include "agent performance", "token usage", "latency optimization", "eval", "agent metrics", "cost optimization", "agent benchmarking".
AI self-improvement analyst. Tracks AI agent mistakes, analyzes failure patterns, and proposes system improvements. Implements a continuous learning loop for trading system enhancement.
Use this skill when coordinating multiple AI models (Claude, GPT, Gemini, DeepSeek) for competitive trading analysis, running parallel model execution, aggregating decisions, or managing model performance tracking. Essential for multi-model trading arena operations.
Expert guidance on machine learning and feature engineering for fantasy football player projection models. Use this skill when building predictive models, engineering features from player statistics, selecting appropriate ML algorithms, or addressing sports-specific ML challenges. Covers feature engineering patterns, model selection frameworks, validation strategies, and interpretability techniques for fantasy football analytics.
Expert in statistical analysis, predictive modeling, machine learning, and data storytelling to drive business insights.
Use when "statistical modeling", "A/B testing", "experiment design", "causal inference", "predictive modeling", or asking about "hypothesis testing", "feature engineering", "data analysis", "pandas", "scikit-learn"
Interactive data exploration and visualization skill. Use when users ask to visualize data, analyze datasets, create charts, or explore data files (CSV, Excel, Parquet, JSON). This skill guides through data exploration, proposes visualization strategies based on data characteristics, creates interactive Plotly charts in marimo notebooks, and generates analytical conclusions.
Training monitoring dashboard setup with TensorBoard and Weights & Biases (WandB) including real-time metrics tracking, experiment comparison, hyperparameter visualization, and integration patterns. Use when setting up training monitoring, tracking experiments, visualizing metrics, comparing model runs, or when user mentions TensorBoard, WandB, training metrics, experiment tracking, or monitoring dashboard.
Provides in-depth knowledge and best practices for gokart, a Python machine learning pipeline tool developed by M3. Use when reviewing gokart task designs, implementing type-safe ML pipelines, writing gokart tests, or working with gokart and Pandera integration. Trigger when user mentions gokart, TaskOnKart, TaskInstanceParameter, Pandera DataFrames with gokart, or requests code review for ML pipeline tasks.
Guide through complete Kaggle competition workflow from TODO updates to submission. Use PROACTIVELY when user starts new experiments, prepares submissions, or asks about competition workflow. Keywords: Kaggle, 提出, submission, workflow, ワークフロー, コンペ
Expert in data annotation. Use for ML labeling, annotation workflows, and quality control.
Data science and machine learning platform functions for the East language (TypeScript types). Use when writing East programs that need optimization (MADS, Optuna, SimAnneal, Scipy), machine learning (XGBoost, LightGBM, NGBoost, Torch MLP, Lightning, GP), ML utilities (Sklearn preprocessing, metrics, splits), conformal prediction (MAPIE), or model explainability (SHAP). Triggers for: (1) Writing East programs with @elaraai/east-py-datascience, (2) Derivative-free optimization with MADS, (3) Bayesian optimization with Optuna, (4) Discrete/combinatorial optimization with SimAnneal, (5) Gradient boosting with XGBoost or LightGBM, (6) Probabilistic predictions with NGBoost or GP, (7) Neural networks with Torch MLP or Lightning, (8) Data preprocessing and metrics with Sklearn, (9) Conformal prediction intervals with MAPIE, (10) Model explainability with SHAP.
Master SOTA data prep for Kaggle comps: automated EDA (Sweetviz), cleaning (Pyjanitor), and feature selection (Polars + XGBoost) for medium datasets (100MB–5GB) in Colab.
Conducts Exploratory Data Analysis (EDA) on datasets. Use when the user asks to "explore", "clean", or "visualize" a new CSV or dataset.
BigQuery query optimization, feature engineering, and BigQuery ML models for recommendation systems. Use when writing optimized SQL queries, implementing partitioning/clustering, creating materialized views, engineering user/product/interaction features, handling schema evolution, setting up streaming with Storage Write API, or building Matrix Factorization models.
Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
Protecting personal and sensitive data throughout the machine learning lifecycle, from training to inference.
Elite AI/ML Senior Engineer with 20+ years experience. Transforms Claude into a world-class AI researcher and engineer capable of building production-grade ML systems, LLMs, transformers, and computer vision solutions. Use when: (1) Building ML/DL models from scratch or fine-tuning, (2) Designing neural network architectures, (3) Implementing LLMs, transformers, attention mechanisms, (4) Computer vision tasks (object detection, segmentation, GANs), (5) NLP tasks (NER, sentiment, embeddings), (6) MLOps and production deployment, (7) Data preprocessing and feature engineering, (8) Model optimization and debugging, (9) Clean code review for ML projects, (10) Choosing optimal libraries and frameworks. Triggers: "ML", "AI", "deep learning", "neural network", "transformer", "LLM", "computer vision", "NLP", "TensorFlow", "PyTorch", "sklearn", "train model", "fine-tune", "embedding", "CNN", "RNN", "LSTM", "attention", "GPT", "BERT", "diffusion", "GAN", "object detection", "segmentation".