home/categories/machine-learning

category focus

Machine Learning

Training models and neural networks.

1987 个技能all categories

sorting

stars

current ordering strategy

query

all entries

refine the visible subset

machine-learning

cross-validation-setup

Cross Validation Setup - Auto-activating skill for ML Training. Triggers on: cross validation setup, cross validation setup Part of the ML Training skill category.

jeremylongshore

data-ai

open

machine-learning

gradient-clipping-helper

Gradient Clipping Helper - Auto-activating skill for ML Training. Triggers on: gradient clipping helper, gradient clipping helper Part of the ML Training skill category.

jeremylongshore

data-ai

open

machine-learning

setting-up-experiment-tracking

This skill automates the setup of machine learning experiment tracking using tools like MLflow or Weights & Biases (W&B). It is triggered when the user requests to "track experiments", "setup experiment tracking", "initialize MLflow", or "integrate W&B". The skill configures the necessary environment, initializes the tracking server (if needed), and provides code snippets for logging experiment parameters, metrics, and artifacts. It helps ensure reproducibility and simplifies the comparison of different model runs.

jeremylongshore

data-ai

open

machine-learning

setting-up-experiment-tracking

jeremylongshore

data-ai

open

machine-learning

domain-ml

Use when building ML/AI apps in Rust. Keywords: machine learning, ML, AI, tensor, model, inference, neural network, deep learning, training, prediction, ndarray, tch-rs, burn, candle, 机器学习, 人工智能, 模型推理

actionbook

data-ai

open

machine-learning

974

hugging-face-trackio

Track and visualize ML training experiments with Trackio. Use when logging metrics during training (Python API) or retrieving/analyzing logged metrics (CLI). Supports real-time dashboard visualization, HF Space syncing, and JSON output for automation.

huggingface

data-ai

open

machine-learning

974

hugging-face-evaluation

Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.

huggingface

data-ai

open

machine-learning

974

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

huggingface

data-ai

open

machine-learning

950

esm

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

wu-yc

data-ai

open

machine-learning

950

scvi-tools

Deep generative models for single-cell omics. Use when you need probabilistic batch correction (scVI), transfer learning, differential expression with uncertainty, or multi-modal integration (TOTALVI, MultiVI). Best for advanced modeling, batch effects, multimodal data. For standard analysis pipelines use scanpy.

wu-yc

data-ai

open

machine-learning

950

umap-learn

UMAP dimensionality reduction. Fast nonlinear manifold learning for 2D/3D visualization, clustering preprocessing (HDBSCAN), supervised/parametric UMAP, for high-dimensional data.

wu-yc

data-ai

open

machine-learning

950

aeon

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

wu-yc

data-ai

open

machine-learning

950

pymc-bayesian-modeling

Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.

wu-yc

data-ai

open

machine-learning

950

scikit-learn

Machine learning in Python with scikit-learn. Use when working with supervised learning (classification, regression), unsupervised learning (clustering, dimensionality reduction), model evaluation, hyperparameter tuning, preprocessing, or building ML pipelines. Provides comprehensive reference documentation for algorithms, preprocessing techniques, pipelines, and best practices.

wu-yc

data-ai

open

machine-learning

950

scikit-survival

Comprehensive toolkit for survival analysis and time-to-event modeling in Python using scikit-survival. Use this skill when working with censored survival data, performing time-to-event analysis, fitting Cox models, Random Survival Forests, Gradient Boosting models, or Survival SVMs, evaluating survival predictions with concordance index or Brier score, handling competing risks, or implementing any survival analysis workflow with the scikit-survival library.

wu-yc

data-ai

open

machine-learning

950

shap

Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model.

wu-yc

data-ai

open

machine-learning

950

transformers

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

wu-yc

data-ai

open

machine-learning

950

pyhealth

Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR), clinical prediction tasks (mortality, readmission, drug recommendation), medical coding systems (ICD, NDC, ATC), physiological signals (EEG, ECG), healthcare datasets (MIMIC-III/IV, eICU, OMOP), or implementing deep learning models for healthcare applications (RETAIN, SafeDrug, Transformer, GNN).

wu-yc

data-ai

open

machine-learning

946

mhc-algorithm

Implement mHC (Manifold-Constrained Hyper-Connections) for stabilizing deep network training. Use when implementing residual connection improvements with doubly stochastic matrices via Sinkhorn-Knopp algorithm. Based on DeepSeek's 2025 paper (arXiv:2512.24880).

benchflow-ai

data-ai

open

machine-learning

946

first-order-model-fitting

Fit first-order dynamic models to experimental step response data and extract K (gain) and tau (time constant) parameters.

benchflow-ai

data-ai

open

machine-learning

946

scipy-curve-fit

Use scipy.optimize.curve_fit for nonlinear least squares parameter estimation from experimental data.

benchflow-ai

data-ai

open

machine-learning

946

nanogpt-training

Train GPT-2 scale models (~124M parameters) efficiently on a single GPU. Covers GPT-124M architecture, tokenized dataset loading (e.g., HuggingFace Hub shards), modern optimizers (Muon, AdamW), mixed precision training, and training loop implementation.

benchflow-ai

data-ai

open

machine-learning

946

imc-tuning-rules

Calculate PI/PID controller gains using Internal Model Control (IMC) tuning rules for first-order systems.

benchflow-ai

data-ai

open

machine-learning

946

excitation-signal-design

Design effective excitation signals (step tests) for system identification and parameter estimation in control systems.

benchflow-ai

data-ai

open

Page 24 / 83