Data Eng.

ETL pipelines and big data infrastructure.

1541 skills, sorted by stars
data-engineering
414

airflow-dag-patterns

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

Dokhacgiakhoa
data-ai
data-engineering
414

data-engineer

Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms. Use PROACTIVELY for data pipeline design, analytics infrastructure, or modern data stack implementation.

Dokhacgiakhoa
data-ai
data-engineering
414

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

Dokhacgiakhoa
data-ai
data-engineering
414

data-quality-frameworks

Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.

Dokhacgiakhoa
data-ai
data-engineering
414

database-admin

Expert database administrator specializing in modern cloud databases, automation, and reliability engineering. Masters AWS/Azure/GCP database services, Infrastructure as Code, high availability, disaster recovery, performance optimization, and compliance. Handles multi-cloud strategies, container databases, and cost optimization. Use PROACTIVELY for database architecture, operations, or reliability engineering.

Dokhacgiakhoa
data-ai
data-engineering
414

database-architect

Expert database architect specializing in data layer design from scratch, technology selection, schema modeling, and scalable database architectures. Masters SQL/NoSQL/TimeSeries database selection, normalization strategies, migration planning, and performance-first design. Handles both greenfield architectures and re-architecture of existing systems. Use PROACTIVELY for database architecture, technology selection, or data modeling decisions.

Dokhacgiakhoa
data-ai
data-engineering
414

database-migration

MASTER DB: Zero-Downtime, Schema Design (3NF), SQL/NoSQL.

Dokhacgiakhoa
data-ai
data-engineering
414

dbt-transformation-patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

Dokhacgiakhoa
data-ai
data-engineering
414

error-debugging-error-analysis

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

Dokhacgiakhoa
data-ai
data-engineering
414

error-diagnostics-error-analysis

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

Dokhacgiakhoa
data-ai
data-engineering
414

nosql-expert

Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems.

Dokhacgiakhoa
data-ai
data-engineering
414

production-code-audit

Autonomously deep-scan the entire codebase line by line, understand its architecture and patterns, then systematically transform it to production-grade, corporate-level professional quality with optimizations.

Dokhacgiakhoa
data-ai
data-engineering
414

scala-pro

Master enterprise-grade Scala development with functional programming, distributed systems, and big data processing. Expert in Apache Pekko, Akka, Spark, ZIO/Cats Effect, and reactive architectures. Use PROACTIVELY for Scala system design, performance optimization, or enterprise integration.

Dokhacgiakhoa
data-ai
data-engineering
414

spark-optimization

Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.

Dokhacgiakhoa
data-ai
data-engineering
414

tdd-orchestrator

Master TDD orchestrator specializing in red-green-refactor discipline, multi-agent workflow coordination, and comprehensive test-driven development practices. Enforces TDD best practices across teams with AI-assisted testing and modern frameworks. Use PROACTIVELY for TDD implementation and governance.

Dokhacgiakhoa
data-ai
data-engineering
414

temporal-python-pro

Master Temporal workflow orchestration with Python SDK. Implements durable workflows, saga patterns, and distributed transactions. Covers async/await, testing strategies, and production deployment. Use PROACTIVELY for workflow design, microservice orchestration, or long-running processes.

Dokhacgiakhoa
data-ai
data-engineering
414

production-code-audit

Autonomously deep-scan the entire codebase line by line, understand its architecture and patterns, then systematically transform it to production-grade, corporate-level professional quality with optimizations.

Dokhacgiakhoa
data-ai
data-engineering
411

commit

Stage and commit changes with intelligent Conventional Commits message generation and commit-splitting analysis.

joshukraine
data-ai
data-engineering
401

work

Coordinate non-trivial engineering work with the right level of analysis, planning, delegation, implementation, and validation. Use for multi-step requests, multi-file changes, or work that benefits from explicit execution strategy.

TechDufus
data-ai
data-engineering
397

table-filler

Fill `outline/tables_index.md` from `outline/table_schema.md` + evidence packs (NO PROSE in cells; citation-backed rows). **Trigger**: table filler, fill tables, evidence-first tables, index tables, 表格填充, 索引表. **Use when**: table schema exists and evidence packs are ready; you want a compact, citation-backed index table to support later writing and Appendix table curation. **Skip if**: `outline/tables_index.md` already exists and is refined (>=2 tables; citations in rows; no placeholders). **Network**: none. **Guardrail**: do not invent facts; every row must include citations; do not write paragraph cells.

WILLOSCAR
data-ai
data-engineering
397

extraction-form

Extract study data into a structured table (`papers/extraction_table.csv`) using the protocol’s extraction schema. **Trigger**: extraction form, extraction table, data extraction, 信息提取, 提取表. **Use when**: the systematic review has moved from screening into extraction (C3) and the included papers need to be recorded field by field in a CSV to support later synthesis. **Skip if**: `papers/screening_log.csv` does not exist yet or the protocol is not locked. **Network**: none. **Guardrail**: fill fields strictly according to the schema; do not write narrative synthesis at this stage (that is the job of `synthesis-writer`).

WILLOSCAR
data-ai