domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills
data-engineering
1.4K

stream-db

Stream-backed reactive database with @durable-streams/state. createStreamDB() with schema and stream options, db.preload() lazy initialization, db.collections for TanStack DB collections, optimistic actions with onMutate and mutationFn, db.utils.awaitTxId() for transaction confirmation, control events (snapshot-start, snapshot-end, reset), db.close() cleanup, re-exported TanStack DB operators (eq, gt, and, or, count, sum, avg, min, max).

durable-streams
data-ai
data-engineering
1.4K

yjs-sync

Yjs CRDT sync over durable streams with @durable-streams/y-durable-streams. DurableStreamsProvider setup, document stream and awareness stream config, transport modes (SSE vs long-poll), provider lifecycle (connect, disconnect, destroy), synced/status/error events, lib0 VarUint8Array framing, awareness heartbeat. Requires yjs, y-protocols, lib0 peer dependencies. Load when integrating Yjs collaborative editing with durable streams.

durable-streams
data-ai
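
The yjs-sync entry mentions lib0 VarUint8Array framing. As a rough illustration only, the sketch below assumes that framing means a variable-length unsigned integer length prefix (7 payload bits per byte, high bit set while more bytes follow) followed by the raw bytes. The function names `write_var_uint`, `frame`, and `read_frames` are hypothetical and are not part of lib0 or @durable-streams/y-durable-streams.

```python
from typing import List


def write_var_uint(n: int) -> bytes:
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("varuint cannot encode negative numbers")
    out = bytearray()
    while n > 0x7F:
        out.append(0x80 | (n & 0x7F))  # high bit marks "more bytes follow"
        n >>= 7
    out.append(n)
    return bytes(out)


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so frames can be concatenated."""
    return write_var_uint(len(payload)) + payload


def read_frames(buf: bytes) -> List[bytes]:
    """Split a buffer of concatenated frames back into payloads."""
    frames = []
    pos = 0
    while pos < len(buf):
        length = 0
        shift = 0
        while True:
            if pos >= len(buf):
                raise ValueError("truncated length prefix")
            byte = buf[pos]
            pos += 1
            length |= (byte & 0x7F) << shift
            shift += 7
            if byte < 0x80:  # high bit clear: last byte of the length
                break
        end = pos + length
        if end > len(buf):
            raise ValueError("truncated payload")
        frames.append(buf[pos:end])
        pos = end
    return frames
```

Length-prefixed framing like this lets several binary updates share one stream message without a delimiter byte that could collide with payload content.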
llm-ai
1.4K

deploy

Deploy the Simba chat widget to a client website. Use when embedding the widget, connecting to Simba cloud or local, and configuring appearance.

GitHamza0206
data-ai
llm-ai
1.4K

deepinit

Deep codebase initialization with hierarchical AGENTS.md documentation

Yeachan-Heo
data-ai
llm-ai
1.4K

ultrawork

Activate maximum performance mode with parallel agent orchestration for high-throughput task completion

Yeachan-Heo
data-ai
data-analysis
1.4K

upload-parity-experiments

Create or reuse Hugging Face dataset PRs for `harborframework/parity-experiments` and upload Harbor parity/oracle result folders efficiently with sparse checkout, raw git pushes, and Git LFS.

harbor-framework
data-ai
machine-learning
1.4K

e2e-auto

Run a fixed OTA regression flow for `examples/v0.81.0` with `agent-device`. Use when the caller wants a built-in end-to-end scenario instead of defining one manually: deploy a known-good OTA bundle, verify a visible UI change plus the deployed `bundleId`, then deploy an intentionally crashing OTA bundle and verify rollback to the previous stable bundle with `RECOVERED` and `crashedBundleId` evidence.

gronxb
data-ai
data-analysis
1.4K

report-generate

Search the web and produce a structured research report, emphasizing conclusions, evidence, risks, and recommendations.

limecloud
data-ai
data-analysis
1.4K

research

Web information retrieval and trend research (prioritizes citable conclusions over piles of raw snippets).

limecloud
data-ai
llm-ai
1.4K

context-engineering

Master context engineering for AI agent systems. Use when designing agent architectures, debugging context failures, optimizing token usage, implementing memory systems, building multi-agent coordination, evaluating agent performance, or developing LLM-powered pipelines. Covers context fundamentals, degradation patterns, optimization techniques (compaction, masking, caching), compression strategies, memory architectures, multi-agent patterns, LLM-as-Judge evaluation, tool design, and project development.

mrgoonie
data-ai
data-analysis
1.3K

correlation-analyzer

Correlation Analyzer - Auto-activating skill for Data Analytics. Triggers on: correlation analyzer. Part of the Data Analytics skill category.

foryourhealth111-pixel
data-ai
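
The correlation-analyzer entry does not say which statistic the skill computes. As a hedged illustration of what a correlation analysis usually involves, here is a minimal Pearson correlation sketch in plain Python. The function name `pearson` is hypothetical and is not taken from the skill.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered products and squares; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

The result lies between -1 and 1. Values near either end indicate a strong linear relationship, while values near 0 indicate little linear relationship.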
data-analysis
1.3K

data-artist

Create beautiful data visualizations with mathematical elegance, color theory, and narrative design - the "Data is Beautiful" aesthetic.

foryourhealth111-pixel
data-ai
data-analysis
1.3K

datacommons-client

Work with Data Commons, a platform providing programmatic access to public statistical data from global sources. Use this skill when working with demographic data, economic indicators, health statistics, environmental data, or any public datasets available through Data Commons. Applicable for querying population statistics, GDP figures, unemployment rates, disease prevalence, geographic entity resolution, and exploring relationships between statistical entities.

foryourhealth111-pixel
data-ai
data-analysis
1.3K

hypothesis-generation

Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.

foryourhealth111-pixel
data-ai
data-analysis
1.3K

matlab

MATLAB and GNU Octave numerical computing for matrix operations, data analysis, visualization, and scientific computing. Use when writing MATLAB/Octave scripts for linear algebra, signal processing, image processing, differential equations, optimization, statistics, or creating scientific visualizations. Also use when the user needs help with MATLAB syntax, functions, or wants to convert between MATLAB and Python code. Scripts can be executed with MATLAB or the open-source GNU Octave interpreter.

foryourhealth111-pixel
data-ai
data-engineering
1.3K

data-exploration-visualization

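Automated data exploration and visualization tool providing a complete EDA solution, from data loading to professional report generation. Supports multiple chart types, intelligent data diagnostics, model evaluation, and HTML report generation. Suited to data analysis projects in healthcare, finance, e-commerce, and other domains.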

foryourhealth111-pixel
data-ai
data-engineering
1.3K

data-quality-checker

Data Quality Checker - Auto-activating skill for Data Pipelines. Triggers on: data quality checker. Part of the Data Pipelines skill category.

foryourhealth111-pixel
data-ai
data-engineering
1.3K

polars

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

foryourhealth111-pixel
data-ai
data-engineering
1.3K

vaex

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.

foryourhealth111-pixel
data-ai
data-engineering
1.3K

xan

High-performance CSV processing with xan CLI for large tabular datasets, streaming transformations, and low-memory pipelines.

foryourhealth111-pixel
data-ai
data-engineering
1.3K

zarr-python

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

foryourhealth111-pixel
data-ai
machine-learning
1.3K

confusion-matrix-generator

Confusion Matrix Generator - Auto-activating skill for ML Training. Triggers on: confusion matrix generator. Part of the ML Training skill category.

foryourhealth111-pixel
data-ai
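
The confusion-matrix-generator entry gives no implementation detail. As a minimal sketch of what a confusion matrix is, the code below counts (true label, predicted label) pairs using only the standard library. The function name `confusion_matrix` and its return shape are assumptions made for illustration, not the skill's interface.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def confusion_matrix(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Optional[List[Hashable]] = None,
) -> Tuple[List[Hashable], List[List[int]]]:
    """Return (labels, matrix) where matrix[i][j] counts samples whose
    true label is labels[i] and whose predicted label is labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if labels is None:
        # Assumes the labels are mutually sortable (e.g. all strings).
        labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix
```

Rows correspond to true classes and columns to predicted classes, so correct predictions fall on the diagonal.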
machine-learning
1.3K

data-normalization-tool

Data Normalization Tool - Auto-activating skill for ML Training. Triggers on: data normalization tool. Part of the ML Training skill category.

foryourhealth111-pixel
data-ai
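
The data-normalization-tool entry does not name the normalization methods it applies. Two common choices in ML training are min-max scaling and z-score standardization. The sketch below shows both in plain Python, and the function names `min_max_scale` and `z_score` are hypothetical.

```python
import statistics
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values: Sequence[float]) -> List[float]:
    """Center values on the mean and divide by the population std dev."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

Min-max scaling preserves the shape of the distribution but is sensitive to outliers, while z-scores are unbounded and are often preferred when features have very different spreads.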
machine-learning
1.3K

esm

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

foryourhealth111-pixel
data-ai
Page 79 / 406