Data Engineering

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
0

lakehouse-patterns

Comprehensive guide to data lakehouse architecture, which combines data lake flexibility with data warehouse performance, using Delta Lake, Iceberg, and Hudi.

AmnadTaowsoam
data-ai
open
data-engineering
0

oracle

Use the @steipete/oracle CLI to bundle a prompt plus the right files and get a second-model review (API or browser) for debugging, refactors, design checks, or cross-validation.

doubleflannel
data-ai
open
data-engineering
0

pm-04-clean-filter

Apply cleaning and filtering actions based on data quality decisions and generate filtered log artefacts.

Wattysaid
data-ai
open
data-engineering
0

data-catalog-entry

Create standardized metadata for data assets. Use when documenting new datasets, building data catalogs, improving data discoverability, or creating data dictionaries for teams.

nimrodfisher
data-ai
open
data-engineering
0

clojure-charred

High-performance JSON and CSV parsing library for Clojure. Use when working with JSON or CSV data in Clojure and you need fast, efficient parsing/writing with a clojure.data.json/clojure.data.csv compatible API.

Ramblurr
data-ai
open
data-engineering
0

clickhouse-cloud-connection

Test and validate ClickHouse Cloud connection using clickhouse-connect for gapless-crypto-clickhouse. Use when validating connectivity, troubleshooting connection issues, or verifying environment configuration. Includes version check and query validation.

terrylica
data-ai
open
data-engineering
0

awkward-array

Guidance for working with Awkward Array 2.0 jagged arrays and records in Python. Use when building or debugging `awkward` workflows, including record construction with `ak.zip`, adding fields with `ak.with_field`, filtering/aggregation, combinatorics (`ak.cartesian`/`ak.combinations`), `argmin`/`argmax` slicing, flattening, sorting, and NumPy interop or common Awkward pitfalls.

gordonwatts
data-ai
open
data-engineering
0

data-quality-checks

Comprehensive guide to data quality validation, testing frameworks, anomaly detection, and data observability for production data pipelines

AmnadTaowsoam
data-ai
open
data-engineering
0

data-modeler

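A skill for automating data modeling based on the immutable data model. It uses the blackboard pattern to proceed step by step from entity extraction to ER diagram generation.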

tis-abe-akira
data-ai
open
data-engineering
0

agentdb-state-manager

Persistent state management using AgentDB (DuckDB) for workflow analytics and checkpoints. Provides read-only analytics cache synchronized from TODO_*.md files, enabling:
- Complex dependency graph queries
- Historical workflow metrics
- Context checkpoint storage/recovery
- State transition analysis
Use when: Data gathering and analysis for workflow state tracking
Triggers: "analyze workflow", "query state", "checkpoint", "workflow metrics"

stharrold
data-ai
open
data-engineering
0

cql-type-system-schema-handling

Implement and deserialize all CQL types including primitives (int, text, timestamp, uuid, varint, decimal), collections (list, set, map), tuples, UDTs (user-defined types), and frozen types. Use when working with CQL type deserialization, schema validation, collection parsing, UDT handling, or type-correct data generation.

pmcfadin
data-ai
open
data-engineering
0

data-engineer

Data Engineer Agent. Responsible for building ETL pipelines, data warehouses, and data lakes.

shaul1991
data-ai
open
data-engineering
0

data-pipeline

GenStage, Broadway, and Flow for Elixir data pipelines

layeddie
data-ai
open
data-engineering
0

duckdb-remote-parquet-query

Query remote Parquet files over HTTP with DuckDB httpfs, without downloading them in full. Leverage column pruning, row filtering, and range requests for efficient bandwidth usage. Use for crypto/trading data distribution and analytics.

terrylica
data-ai
open
data-engineering
0

data-migration-expert

Use this agent when reviewing database migrations, schema changes, or data transformations. Specializes in validating ID mappings, checking for swapped values, and verifying rollback safety. Triggers on requests like "migration review", "schema change validation".

jovermier
data-ai
open
data-engineering
0

coding-conventions

Field naming conventions for the Job Aggregator project. Use this skill when encountering type errors related to field names (camelCase vs snake_case), database constraint violations, or data mapping issues between Python/TypeScript/PostgreSQL.

beetz12
data-ai
open
data-engineering
0

openspec-sync-specs

Sync delta specs from a change to main specs. Use when the user wants to update main specs with changes from a delta spec, without archiving the change.

austinmoody
data-ai
open
data-engineering
0

implementing-io-pipelines

Implements high-performance streaming using System.IO.Pipelines in .NET. Use when building network protocols, parsing binary data, or processing large streams efficiently.

christian289
data-ai
open
data-engineering
0

memory-delta

Auto-execute when the "[MEMORY_KEEPER_DELTA]" trigger is detected.

ZipperBagCoffee
data-ai
open
data-engineering
0

bigquery-ethereum-data-acquisition

Workflow for acquiring historical Ethereum blockchain data using Google BigQuery free tier. Empirically validated for cost estimation, streaming downloads, and DuckDB integration. Use when planning bulk historical data acquisition or comparing data source options for blockchain network metrics.

terrylica
data-ai
open
data-engineering
0

pm-02-ingest-profile

Ingest the event log, normalise schema, and generate an initial data profile with notebook and manifest updates.

Wattysaid
data-ai
open
data-engineering
0

generate-dataset-synth

Instructions for generating synthetic airline data with Synth CLI and loading it into SQLite.

rusmirbecirovic
data-ai
open
data-engineering
0

kinesis-stream-processor

AWS Kinesis expert. Use for stream processing, real-time data, and Kinesis patterns.

dengineproblem
data-ai
open