Data & AI

Machine learning, LLMs, and data processing.

9743 skills

machine-learning

exps-logistic

Documentation for the logistic regression MI estimation experiment (exps_logistic)

TerryTong-Git

data-ai

machine-learning

network-architecture-sizing

PPO network architecture sizing for trading models. Trigger: (1) model files are unexpectedly small/large, (2) choosing hidden_dims for training, (3) balancing model capacity vs inference speed.

smith6jt-cop

data-ai

machine-learning

Refactor scikit-learn and machine learning code to improve maintainability, reproducibility, and adherence to best practices. This skill transforms working ML code into production-ready pipelines that prevent data leakage and ensure reproducible results. It addresses preprocessing outside pipelines, missing random_state parameters, improper cross-validation, and custom transformers that do not follow sklearn API conventions. Implements proper Pipeline and ColumnTransformer patterns, systematic hyperparameter tuning, and appropriate evaluation metrics.

SnakeO

data-ai

machine-learning

training-archive-gating

Mandatory training archive with model gating (APPROVED/REVIEW/DROP). Trigger when: (1) training run completes, (2) need to decide which models to deploy, (3) want historical training reference, (4) need checkpoint recommendations for overfitting.

smith6jt-cop

data-ai

machine-learning

dpo

Direct Preference Optimization for learning from preference pairs. Covers DPOTrainer, preference dataset preparation, implicit reward modeling, and beta tuning for stable preference learning without explicit reward models. Includes thinking quality patterns.

atrawog

data-ai

machine-learning

kaggle-api-expert

Expert agent for Kaggle API authentication, dataset management, and running Kaggle notebooks on Texas Tech HPCC. Specializes in connecting Jupyter notebooks to Kaggle API and submitting to code competitions. Always checks VPN connection first before HPCC operations.

sweeden-ttu

data-ai

machine-learning

pytorch-geometric

Library for Graph Neural Networks (GNNs). Covers MessagePassing layers, modular aggregation schemes, and handling large graphs via mini-batching with disjoint graph representation. (pyg, messagepassing, gnn, gcn, gat, edge_index, knn_graph, global_mean_pool)

cuba6112

data-ai

machine-learning

training-improvements-v245

Training improvements: LR warmup, validation intervals, reward weights. Trigger when: (1) training unstable in early epochs, (2) need more validation visibility, (3) model too conservative.

smith6jt-cop

data-ai

machine-learning

export-wizard

Master coordinator for model export. Guides the user through selecting the right format and initiating the process.

innV0

data-ai

machine-learning

pytorch-lightning

High-level training framework for PyTorch that abstracts boilerplate while maintaining flexibility. Includes the Trainer, LightningModule, and support for multi-GPU scaling and reproducibility. (lightning, pytorch-lightning, lightningmodule, trainer, callback, ddp, fast_dev_run, seed_everything)

cuba6112

data-ai

machine-learning

credit-model-validation-banking

Automates the validation of credit risk models in banking. Used for the full validation cycle, from loading a pickled model and analyzing the data to generating a detailed report with metrics (AUC, Gini, Recall, Precision, F1, KS, PSI, CSI), visualizations, and compliance with Kazakhstan's regulatory requirements.

00060633

data-ai

machine-learning

arxiv-learn

INTERNAL MODULE - Use `arxiv learn` instead. This module provides the learn pipeline implementation for the arxiv skill.

grahama1970

data-ai

machine-learning

few-shot-learning-finance

Use when implementing models that learn from minimal data or need to adapt to new market regimes rapidly. Covers episodic learning, context sets, support and query sequences, zero-shot vs few-shot learning, meta-learning for finance, transfer learning across assets and regimes, and quick adaptation to market changes.

Donaldshen27

data-ai

machine-learning

describe-image

Uses a local model to describe something about an image

richardanaya

data-ai

machine-learning

rl-foundations

Master RL theory - MDPs, value functions, Bellman equations, value/policy iteration, TD

tachyon-beep

data-ai

machine-learning

marker-engine-rl

Extends the Marker-Engine skill with SFT/RL fine-tuning using LeanDeep 4.0; loads markers from Supabase/ZIP and learns a policy for precise, contextualized marker application under strict bottom-up logic.

DYAI2025

data-ai

machine-learning

ai-engineering-skill

Practical guide for building production ML systems based on Chip Huyen's AI Engineering book. Use when users ask about model evaluation, deployment strategies, monitoring, data pipelines, feature engineering, cost optimization, or MLOps. Covers metrics, A/B testing, serving patterns, drift detection, and production best practices.

odewahn

data-ai

machine-learning

training-resilience

Fix PPO training early-stop issues. Trigger when: (1) impossible drawdown values (>100%), (2) training stops too early, (3) need adaptive recovery instead of hard stop.

smith6jt-cop

data-ai

machine-learning

trl

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

evalstate

data-ai

machine-learning

training-mlps

A skill for defining and training Multi-Layer Perceptrons (MLPs) using Flax NNX.

yonesuke

data-ai

machine-learning

reward-shaping-engineering

Master reward function design - potential-based shaping, hacking patterns, validation

tachyon-beep

data-ai

machine-learning

recommendation-ml

ML recommendation system development with collaborative filtering (Matrix Factorization), content-based filtering, and hybrid approaches. Use when building recommendation models, implementing Feast feature stores, setting up MLflow model registry, handling cold-start problems for new users/products, implementing diversity with MMR algorithm, or adding exploration with Thompson Sampling/epsilon-greedy bandits.

ilorozco11

data-ai

machine-learning

model-explainability-and-interpretability

Techniques and tools for understanding how machine learning models make decisions and explaining those decisions to stakeholders.

AmnadTaowsoam

data-ai

machine-learning

model-bias-and-fairness

Identifying, measuring, and mitigating algorithmic bias to ensure equitable outcomes in AI systems.

AmnadTaowsoam

data-ai
