Data Eng.

ETL pipelines and big data infrastructure.

1541 skills

data-engineering

247

dbt-transformation-patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

aiskillstore

data-ai

open

data-engineering

247

airflow-dag-patterns

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

aiskillstore

data-ai

open

data-engineering

247

agentdb-advanced-features

Master advanced AgentDB features including QUIC synchronization, multi-database management, custom distance metrics, hybrid search, and distributed systems integration. Use when building distributed AI systems, multi-agent coordination, or advanced vector search applications.

aiskillstore

data-ai

open

data-engineering

247

backend-models

Define and configure database models with proper naming, relationships, timestamps, data types, constraints, and validation. Use this skill when creating or editing model files in app/Models/, Eloquent model classes, model relationships (hasMany, belongsTo, etc.), database table structures, model attributes and casts, model factories, or seeders. Use when working on model validation logic, database constraints, foreign key relationships, indexes, scopes, accessors, mutators, or any ORM-related model configuration.

aiskillstore

data-ai

open

data-engineering

247

docker-workflow

Comprehensive Docker containerization workflow covering multi-stage builds, docker-compose orchestration, image optimization, debugging, and production best practices. Use when containerizing applications, setting up development environments, or deploying with Docker.

aiskillstore

data-ai

open

data-engineering

247

looker-studio-bigquery

Design and configure Looker Studio dashboards with BigQuery data sources. Use when creating analytics dashboards, connecting BigQuery to visualization tools, or optimizing data pipeline performance. Handles BigQuery connections, custom SQL queries, scheduled queries, dashboard design, and performance optimization.

aiskillstore

data-ai

open

data-engineering

245

dpdata-cli

Convert and manipulate atomic simulation data formats using dpdata CLI. Use when converting between DFT/MD output formats (VASP, LAMMPS, QE, CP2K, Gaussian, ABACUS, etc.), preparing training data for DeePMD-kit, or working with DeePMD formats. Supports 50+ formats including deepmd/raw, deepmd/comp, deepmd/npy, deepmd/hdf5.

deepmodeling

data-ai

open

data-engineering

225

cashclaw-data-scraper

Extracts structured data from websites and APIs, delivering clean datasets in multiple formats. Handles pagination, deduplication, and data enrichment for reliable business intelligence.

ertugrulakben

data-ai

open

data-engineering

217

clickhouse-best-practices

ClickHouse schema design, query optimization, and operational best practices for production deployments.

duyet

data-ai

open

data-engineering

217

storage-optimization

Compression codecs, TTL policies, tiered storage, part management, and disk space optimization.

duyet

data-ai

open

data-engineering

217

cia-data-integration

Citizen Intelligence Agency data model integration, Riksdag data processing, and political entity mapping.

Hack23

data-ai

open

data-engineering

217

classification-framework-enforcement

Data classification enforcement, sensitivity labeling, and handling controls per classification level for the CIA platform

Hack23

data-ai

open

data-engineering

217

data-pipeline-engineering

Data pipeline design, ETL processes, Spring Integration patterns, batch processing for political data

Hack23

data-ai

open

data-engineering

217

data-protection

Data classification (CIA triad), GDPR privacy by design, encryption standards, data lifecycle management

Hack23

data-ai

open

data-engineering

216

tables-data-specialist

Data table accessibility specialist for web applications. Use when building or reviewing any data table, sortable table, grid, spreadsheet-like interface, comparison table, pricing table, or any tabular data display. Covers proper markup, scope, caption, headers, sortable columns, responsive patterns, and ARIA grid/treegrid roles. Applies to any web framework or vanilla HTML/CSS/JS.

Community-Access

data-ai

open

data-engineering

216

ci-integration

CI/CD accessibility pipeline patterns, axe-core CLI configuration, SARIF output, PR annotations, baseline management, and multi-platform CI templates. Reference data for CI accessibility setup.

Community-Access

data-ai

open

data-engineering

215

notes

Expert help with the meganote system - cross-tool note capture, daily notes, and obsidian.nvim integration. Covers Hammerspoon, Shade, nvim, and the full capture → daily note linking pipeline.

megalithic

data-ai

open

data-engineering

213

temporal-parameter-staleness

Type: Thought-template (instantiate before use). Research basis: Cached parameters in multi-step operations become stale when governance changes them mid-operation.

PlamenTSV

data-ai

open

data-engineering

213

openspec-sync-specs

Sync delta specs from a change to main specs. Use when the user wants to update main specs with changes from a delta spec, without archiving the change.

SAP

data-ai

open

data-engineering

211

super-dev-core

Super Dev pipeline governance for research-first, commercial-grade AI coding delivery

shangyankeji

data-ai

open

data-engineering

211

super-dev

Super Dev pipeline governance for research-first, commercial-grade AI coding delivery

shangyankeji

data-ai

open

data-engineering

211

super-dev

Super Dev pipeline governance: research-first, commercial-grade AI coding delivery with 10 expert roles, quality gates, and audit artifacts.

shangyankeji

data-ai

open

data-engineering

211

vault-curator

Process raw inbound content (emails, voice memos, notes) into structured Obsidian vault records with proper frontmatter, wikilinks, and file placement.

ssdavidai

data-ai

open
