Data Engineering

ETL pipelines and big data infrastructure.

1541 skills

data-engineering

tableau-expert

Expert-level Tableau Desktop/Server, calculated fields, LOD expressions, dashboards, data blending, and performance optimization

personamanagmentlayer

data-ai

data-engineering

Lightning-fast DataFrame library written in Rust for high-performance data manipulation and analysis. Use when the user wants blazing-fast data transformations, works with large datasets or lazy evaluation pipelines, or needs better performance than pandas. Ideal for ETL, data wrangling, aggregations, joins, and reading and writing CSV, Parquet, and JSON files.

silvainfm

data-ai

data-engineering

check-health

Run comprehensive data health checks on the database, reporting on data quality issues, dead letters, duplicates, and anomalies. Use when you need to assess overall data quality, identify issues requiring attention, or perform daily data quality monitoring.

harehimself

data-ai

data-engineering

db-seeder

This skill should be used when seeding databases with realistic fake data for development, testing, or staging environments. Supports PostgreSQL, MySQL, SQLite, MongoDB with ORM-based seeding (SQLAlchemy, Django, Prisma) and Faker library for generating realistic test data. Use when the user needs to populate databases with sample data, create test fixtures, or set up development/staging environments with realistic data.

AIA-11-HN-MIB

data-ai

data-engineering

nixtla-schema-mapper

Analyzes data sources and generates Nixtla-compatible schema transformations. Infers column mappings, creates transformation modules for CSV/SQL/Parquet/dbt sources, generates schema contracts, and validates data quality. Activates when the user needs data transformation, schema mapping, column inference, or Nixtla format conversion.

intent-solutions-io

data-ai

data-engineering

yaml-workflow-executor

Execute data processing workflows defined in YAML configuration files. Supports data loading, transformation, validation, and reporting pipelines.

vamseeachanta

data-ai

data-engineering

auto-migration

Automatically detect schema changes and manage EF Core migrations.

ademceper

data-ai

data-engineering

zrl-data-ingestion

This skill should be used when ingesting market data from multiple sources (APIs, databases, files) into Zipline-Reloaded. It provides standardized patterns for data normalization, quality checks, and pipeline integration.

JeanBaissari

data-ai

data-engineering

streams

Master Node.js streams for memory-efficient processing of large datasets, real-time data handling, and building data pipelines

pluginagentmarketplace

data-ai

data-engineering

data-lake-management

Data Lake architecture and management including medallion architecture (bronze/silver/gold zones), data catalog with AWS Glue, partitioning strategies, schema evolution, data quality, governance, cost optimization, S3 lifecycle policies, data retention, compliance, query optimization with Athena, data formats (Parquet, ORC, Avro), incremental processing, CDC patterns, and production best practices for scalable data lakes.

b3-competition

data-ai

data-engineering

field-extraction-parsing

Extract structured fields from unstructured log data using OPAL parsing functions. Covers extract_regex() for pattern matching with type casting, split() for delimited data, parse_json() for JSON logs, and JSONPath for navigating parsed structures. Use when you need to convert raw log text into queryable fields for analysis, filtering, or aggregation.

rustomax

data-ai

data-engineering

url-parameter-parser

Parse URLs in CSV files and extract query parameters as new columns. Use when working with CSV files containing URLs that need parameter extraction and analysis.

feed-mob

data-ai

data-engineering

data-migration-expert

Use this agent when reviewing PRs that touch database migrations, data backfills, or any code that transforms production data. This agent validates ID mappings against production reality, checks for swapped values, verifies rollback safety, and ensures data integrity during schema changes. Essential for any migration that involves ID mappings, column renames, or data transformations. <example>Context: The user has a PR with database migrations that involve ID mappings. user: "Review this PR that migrates from action_id to action_module_name" assistant: "I'll use the data-migration-expert agent to validate the ID mappings and migration safety" <commentary>Since the PR involves ID mappings and data migration, use the data-migration-expert to verify the mappings match production and check for swapped values.</commentary></example> <example>Context: The user has a migration that transforms enum values. user: "This migration converts status integers to string enums" assistant: "Let me have the data-migration-exper

i3ringit

data-ai

data-engineering

Master data engineering, ETL/ELT, data warehousing, SQL optimization, and analytics. Use when building data pipelines, designing data systems, or working with large datasets.

pluginagentmarketplace

data-ai

data-engineering

data-warehousing

Snowflake, BigQuery, Redshift, dimensional modeling, and modern data warehouse architecture

pluginagentmarketplace

data-ai

data-engineering

supabase-realtime

Comprehensive guide for implementing Supabase Realtime features with best practices, scalable patterns, and migration strategies. Use when building realtime features in Supabase applications including messaging, notifications, presence, live updates, collaborative features, or migrating from postgres_changes to broadcast. Covers client setup, database triggers with realtime.broadcast_changes, RLS authorization, naming conventions, and performance optimization.

antonpme

data-ai

data-engineering

tg-validation

Data validation patterns for the World of Darkness Django application including database constraints, model validators, and atomic transactions. Use when implementing XP/freebie spending transactions, adding database constraints to models, writing clean() validation methods, or ensuring data integrity for character stats.

charlesmsiegel

data-ai

data-engineering

airflow-expert

Expert-level Apache Airflow orchestration, DAGs, operators, sensors, XComs, task dependencies, and scheduling

personamanagmentlayer

data-ai

data-engineering

databricks-expert

Expert-level Databricks platform, Apache Spark, Delta Lake, MLflow, notebooks, and cluster management

personamanagmentlayer

data-ai

data-engineering

postgresql-replication

PostgreSQL streaming replication: setup, monitoring, and failover

pluginagentmarketplace

data-ai

data-engineering

schema-patterns

Effect Schema conventions and patterns. Triggers on Schema class creation, tagged unions, enums, type guards, or test fixtures using Effect Schema.

jasonkuhrt

data-ai

data-engineering

polars

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but the data still fits in memory. Offers lazy evaluation, parallel execution, and an Apache Arrow backend. Best for 1-100 GB datasets, ETL pipelines, and as a faster pandas replacement. For larger-than-RAM data, use dask or vaex.

hxk622

data-ai

data-engineering

data-warehousing

Master dimensional data modeling including star schema design, slowly changing dimensions, fact tables, and data warehouse architecture

pluginagentmarketplace

data-ai

data-engineering

db-query

This skill enables querying Spanner databases through the AfterShip DSP API. It uses the go-admin-automizely-cli library to obtain authentication tokens and execute SQL queries against Spanner databases in different environments.

virgoC0der

data-ai
