Data Engineering

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
3

build-only-validate-capability-flow

Validate a capability flow specification against schema constraints. Use after designing a process to ensure it conforms to framework rules. Triggers on "validate capability flow", "check process spec", "verify schema compliance".

pidster
pidster
data-ai
open
data-engineering
3

data-quality

Implement data validation rules, quality metrics, and data cleansing strategies

dasien
dasien
data-ai
open
data-engineering
3

data-engineering

ETL pipelines, Apache Spark, data warehousing, and big data processing. Use for building data pipelines, processing large datasets, or data infrastructure.

pluginagentmarketplace
pluginagentmarketplace
data-ai
open
data-engineering
3

validating-database-integrity

Use when you need to ensure database integrity through comprehensive data validation. This skill validates data types, ranges, formats, referential integrity, and business rules. Trigger with phrases like "validate database data", "implement data validation rules", "enforce data integrity constraints", or "validate data formats".

BbgnsurfTech
BbgnsurfTech
data-ai
open
data-engineering
2

data-modeling

Dimensional modeling, normalization, and schema design for analytics.

timequity
timequity
data-ai
open
data-engineering
2

td-data-profiling

Comprehensive data profiling and quality assessment using Teradata ClearScape Analytics descriptive statistics functions

teradata-labs
teradata-labs
data-ai
open
data-engineering
2

cleaning-data

Systematic data quality remediation - detect duplicates/outliers/inconsistencies, design cleaning strategy, execute transformations, verify results (component skill for DataPeeker analysis sessions)

tilmon-engineering
tilmon-engineering
data-ai
open
data-engineering
2

duckdb-quadruple-interleave

Chaotic interleaving across local DuckDB databases modeled as coupled quadruple pendula. Random walks both BETWEEN databases and WITHIN tables for context injection.

plurigrid
plurigrid
data-ai
open
data-engineering
2

airflow

Airflow DAG patterns, KubernetesPodOperator, and debugging. Triggers on "dag", "airflow", "task", "operator", "KPO", "scheduler", "XCom".

pypeaday
pypeaday
data-ai
open
data-engineering
2

agentdb-state-manager

Persistent state management using AgentDB (DuckDB) for workflow analytics and checkpoints. Provides a read-only analytics cache synchronized from TODO_*.md files, enabling complex dependency graph queries, historical workflow metrics, context checkpoint storage/recovery, and state transition analysis. Use when gathering and analyzing data for workflow state tracking. Triggers on "analyze workflow", "query state", "checkpoint", "workflow metrics".

stharrold
stharrold
data-ai
open
data-engineering
2

dst-check-freshness

Check data freshness and age for DST tables in DuckDB. Use when determining if data needs refreshing or validating data currency before analysis.

mikkelkrogsholm
mikkelkrogsholm
data-ai
open
data-engineering
2

duck-agent

DuckDB file discovery agent with verified absolute paths

plurigrid
plurigrid
data-ai
open
data-engineering
2

oracle

Use the @steipete/oracle CLI to bundle a prompt plus the right files and get a second-model review (API or browser) for debugging, refactors, design checks, or cross-validation.

LarsEckart
LarsEckart
data-ai
open
data-engineering
2

pulse-mcp-stream

Layer 1 Real-Time Social Stream Monitoring via MCP with DuckDB persistence

plurigrid
plurigrid
data-ai
open
data-engineering
marketplace
2

adapter-assistant

Complete adapter lifecycle assistant for LimaCharlie. Supports External Adapters (cloud-managed), Cloud Sensors (SaaS/cloud integrations), and On-prem USP adapters. Dynamically researches adapter types from local docs and GitHub usp-adapters repo. Creates, validates, deploys, and troubleshoots adapter configurations. Handles parsing rules (Grok, regex), field mappings, credential setup, and multi-adapter configs. Use when setting up new data sources (Okta, S3, Azure Event Hub, syslog, webhook, etc.), troubleshooting ingestion issues, or managing adapter deployments.

refractionPOINT
refractionPOINT
data-ai
open
data-engineering
2

scalardb-sizing-estimator

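A skill for estimating the architecture, sizing, and configuration of ScalarDB Cluster and ScalarDB Analytics. Starting from performance requirements, availability requirements, and the cloud environment, it estimates the overall configuration, including the number of ScalarDB Cluster pods, the Kubernetes configuration, backend databases, API gateway, and monitoring systems. When ScalarDB Analytics is used, EMR/Databricks sizing is included as well. When to use: - "I want to estimate ScalarDB sizing", "I want to build a ScalarDB environment" - "I want to decide on a ScalarDB Cluster configuration", "I want to calculate ScalarDB costs" - "ScalarDB configuration for development/test/staging/production environments" - Production environment design including CI/CD, Blue/Green, and Canary Deployment - "I want to use ScalarDB Analytics", "I want to build an analytical query environment" - "I want to estimate EMR/Databricks sizing" Output: estimate results in Markdown format plus a report in HTML format. Costs: quoted in both USD and JPY (exchange rate stated).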

wfukatsu
wfukatsu
data-ai
open
data-engineering
2

airflow

Airflow DAG patterns, KubernetesPodOperator, and debugging. Use on 'dag', 'airflow', 'task', 'operator', 'KPO', 'scheduler', 'XCom'.

pypeaday
pypeaday
data-ai
open
data-engineering
2

dst-data

Fetch actual data from Danmarks Statistik API and store in DuckDB. Use when user wants to download and store specific DST table data for analysis.

mikkelkrogsholm
mikkelkrogsholm
data-ai
open
data-engineering
2

spark-basics

PySpark fundamentals for distributed data processing.

timequity
timequity
data-ai
open
data-engineering
2

anonymise

Anonymise CSV files by removing personal identifying information and adding datetime stamps. Use when user wants to process a new CSV file or strip PII from data.

sofer
sofer
data-ai
open
data-engineering
2

cobol-migration-analyzer

Analyzes legacy COBOL programs and JCL jobs to assist with migration to modern Java applications. Extracts business logic, identifies dependencies, generates migration reports, and creates Java implementation strategies. Use when working with mainframe migration, COBOL analysis, legacy system modernization, JCL workflows, or when users mention COBOL to Java conversion, analyzing .cbl/.CBL/.cob files, working with copybooks, or planning Java service implementations from COBOL programs.

DauQuangThanh
DauQuangThanh
data-ai
open
data-engineering
2

build-graph

GraphDB build agent - builds a RyuGraph database from the ubiquitous language and code analysis results. Invoke with /build-graph [target path].

wfukatsu
wfukatsu
data-ai
open
data-engineering
2

entropy-sequencer

Layer 5 Interaction Interleaving for Maximum Information Gain with DuckDB

plurigrid
plurigrid
data-ai
open