domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills

sorting: stars

data-analysis

115

review-writer

Synthesizes paper notes and comparison matrices into a structured academic literature review with BibTeX citations.

gy-hou

data-ai

open

data-analysis

115

paper-analyzer

Extracts structured information from academic papers and generates standardized notes and a literature comparison matrix.

gy-hou

data-ai

open

data-analysis

115

sc-pseudotime

Single-cell pseudotime and lineage inference after clustering, with DPT, Palantir, VIA, CellRank, or Slingshot plus post-hoc trajectory gene ranking.

TianGzlab

data-ai

open

data-analysis

115

Spatial statistics for spatial transcriptomics using neighborhood enrichment, Ripley's statistics, co-occurrence, Moran/Geary autocorrelation, local Moran, Getis-Ord Gi*, bivariate Moran, and spatial graph centrality summaries.

TianGzlab

data-ai

open

data-analysis

115

genomics-alignment

Alignment statistics from SAM/BAM files: mapping rate, MAPQ distribution, insert size, duplicate rate, proper pair rate. Mirrors samtools-flagstat.

TianGzlab

data-ai

open

data-engineering

115

real-time-streaming

Use this skill when building real-time data pipelines, stream processing jobs, or change data capture systems. Triggers on tasks involving Apache Kafka (producers, consumers, topics, partitions, consumer groups, Connect, Streams), Apache Flink (DataStream API, windowing, checkpointing, stateful processing), event sourcing implementations, CDC with Debezium, stream processing patterns (windowing, watermarks, exactly-once semantics), and any pipeline that processes unbounded data in motion rather than data at rest.

AbsolutelySkilled

data-ai

open

data-engineering

115

spatial-raw-processing

Process barcoded spatial transcriptomics FASTQ pairs with st_pipeline, preserve upstream artifacts, convert the counts matrix into a standardized raw_counts.h5ad, and hand off cleanly to spatial-preprocess.

TianGzlab

data-ai

open

data-engineering

115

data-warehousing

Use this skill when designing data warehouses, building star or snowflake schemas, implementing slowly changing dimensions (SCDs), writing analytical SQL for Snowflake or BigQuery, creating fact and dimension tables, or planning ETL/ELT pipelines for analytics. Triggers on dimensional modeling, surrogate keys, conformed dimensions, warehouse architecture, data vault, partitioning strategies, materialized views, and any task requiring OLAP schema design or warehouse query optimization.

AbsolutelySkilled

data-ai

open

data-engineering

115

data-quality

Use this skill when implementing data validation, data quality monitoring, data lineage tracking, data contracts, or Great Expectations test suites. Triggers on schema validation, data profiling, freshness checks, row-count anomalies, column drift, expectation suites, contract testing between producers and consumers, lineage graphs, data observability, and any task requiring data integrity enforcement across pipelines.

AbsolutelySkilled

data-ai

open

data-engineering

115

data-pipelines

Use this skill when building data pipelines, ETL/ELT workflows, or data transformation layers. Triggers on Airflow DAG design, dbt model creation, Spark job optimization, streaming vs batch architecture decisions, data ingestion, data quality checks, pipeline orchestration, incremental loads, CDC (change data capture), schema evolution, and data warehouse modeling. Acts as a senior data engineer advisor for building reliable, scalable data infrastructure.

AbsolutelySkilled

data-ai

open

data-engineering

115

analytics-engineering

Use this skill when building dbt models, designing semantic layers, defining metrics, creating self-serve analytics, or structuring a data warehouse for analyst consumption. Triggers on dbt project setup, model layering (staging, intermediate, marts), ref() and source() usage, YAML schema definitions, metrics definitions, semantic layer configuration, dimensional modeling, slowly changing dimensions, data testing, and any task requiring analytics engineering best practices.

AbsolutelySkilled

data-ai

open

data-engineering

115

research-vault

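Persists research output to an Obsidian vault and maintains a paper pool index. Supports daily research logs, paper cards, review archiving, cross-project paper deduplication, and fast retrieval.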

gy-hou

data-ai

open

data-engineering

115

tracking-live-gtm

Use when the user wants to inspect the real live GTM runtime before schema generation or compare multiple live GTM containers.

jtrackingai

data-ai

open

data-engineering

114

data-analysis

Analyze CSV and tabular data, create summaries, and generate insights.

chrispangg

data-ai

open

data-engineering

114

exploring-data

Exploratory data analysis using ydata-profiling. Use when users upload .csv/.xlsx/.json/.parquet files or request "explore data", "analyze dataset", "EDA", "profile data". Generates interactive HTML or JSON reports with statistics, visualizations, correlations, and quality alerts.

oaustegard

data-ai

open

data-analysis

114

charting-vega-lite

Create interactive data visualizations using Vega-Lite declarative JSON grammar. Supports 20+ chart types (bar, line, scatter, histogram, boxplot, grouped/stacked variations, etc.) via templates and programmatic builders. Use when users upload data for charting, request specific chart types, or mention visualizations. Produces portable JSON specs with inline data islands that work in Claude artifacts and can be adapted for production.

oaustegard

data-ai

open

llm-ai

114

uloop-get-provider-details

Get Unity Search provider details via uloop CLI. Use when you need to: (1) Discover available search providers, (2) Understand search capabilities and filters, (3) Configure searches with specific provider options.

hatayama

data-ai

open

data-analysis

114

content-experimentation-best-practices

Content experimentation and A/B testing guidance covering experiment design, hypotheses, metrics, sample size, statistical foundations, CMS-managed variants, and common analysis pitfalls. Use this skill when planning experiments, setting up variants, choosing success metrics, interpreting statistical results, or building experimentation workflows in a CMS or frontend stack.

sanity-io

data-ai

open

data-analysis

114

charting

Select the right Python charting library (seaborn, matplotlib, graphviz) and produce publication-quality static visualizations. Use when creating charts, plots, graphs, diagrams, heatmaps, visualizations from data, or when choosing between matplotlib/seaborn/graphviz. Also triggers for network diagrams, flowcharts, dependency trees, state machines, and entity-relationship diagrams. For interactive browser-rendered charts or uploaded data exploration, defer to charting-vega-lite instead.

oaustegard

data-ai

open

data-analysis

113

data-visualization

Data visualization with chart selection, color theory, and annotation best practices. Covers chart types (bar, line, scatter, heatmap), axes rules, and storytelling with data. Use for: charts, graphs, dashboards, reports, presentations, infographics, data stories. Triggers: data visualization, chart, graph, data chart, bar chart, line chart, scatter plot, data viz, visualization, dashboard chart, infographic data, data presentation, chart design, plot, heatmap, pie chart alternative

NeverSight

data-ai

open

data-analysis

113

openclaw-stock-skill

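Queries daily bars, minute bars, financial indicators, and other stock data through the API provided by data.diemeng.chat; supports A-shares and other markets.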

NeverSight

data-ai

open

data-analysis

113

data-visualization

NeverSight

data-ai

open

data-analysis

113

csv-data-analyst

Analyze CSV files, generate summary statistics, and create visualizations using Python and pandas. Use when the user uploads, attaches, or references a CSV file, asks to summarize or analyze tabular data, requests insights from CSV data, or wants to understand data structure and quality.

NeverSight

data-ai

open

data-analysis

113

sdmx-explorer

Guided, interactive exploration of statistical data via SDMX providers (Eurostat, OECD, ECB, World Bank, ISTAT, and others) using the opensdmx CLI. Use this skill whenever the user asks a question about statistics that could be answered with SDMX data: demographics, economy, employment, births, deaths, population, prices, trade, health, agriculture, or any other topic. Also use it when the user mentions a specific dataflow ID they want to explore. The skill guides the user step by step: discovers relevant datasets, proposes the most meaningful candidates, explores the schema using real constraints (not codelists), explains the dataset structure, and invites the user to make informed filter choices before fetching any data.

NeverSight

data-ai

open
