Data Eng.

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
23

change-impact-analyzer

Analyzes impact of proposed changes on existing systems (brownfield projects) with delta spec validation.
Trigger terms: change impact, impact analysis, brownfield, delta spec, change proposal, change management, existing system analysis, integration impact, breaking changes, dependency analysis, affected components, migration plan, risk assessment, brownfield change.
Provides comprehensive change analysis for existing systems:
- Affected component identification
- Breaking change detection
- Dependency graph updates
- Integration point impact
- Database migration analysis
- API compatibility checks
- Risk assessment and mitigation strategies
- Migration plan recommendations
Use when: proposing changes to existing systems, analyzing brownfield integration, or validating delta specifications.

nahisaho
data-ai
data-engineering
23

kafka-cli-tools

Expert knowledge of Kafka CLI tools (kcat, kcli, kaf, kafkactl). Auto-activates on keywords kcat, kafkacat, kcli, kaf, kafkactl, kafka cli, kafka command line, produce message, consume topic, list topics, kafka metadata. Provides command examples, installation guides, and tool comparisons.

anton-abyzov
data-ai
data-engineering
23

confluent-kafka-connect

Kafka Connect integration expert. Covers source and sink connectors, JDBC, Elasticsearch, S3, Debezium CDC, SMT (Single Message Transforms), connector configuration, and data pipeline patterns. Activates for kafka connect, connectors, source connector, sink connector, jdbc connector, debezium, smt, data pipeline, cdc.

anton-abyzov
data-ai
data-engineering
20

model-builder

Creates dbt models with proper layering (staging, marts), incremental strategies, and documentation. Use when creating dbt models, organizing data transformations, or implementing incremental models.

armanzeroeight
data-ai
data-engineering
20

data-quality-checker

Implement data quality checks, validation rules, and monitoring. Use when ensuring data quality, validating data pipelines, or implementing data governance.

armanzeroeight
data-ai
data-engineering
20

test-generator

Generates dbt tests including schema tests, data quality tests, and freshness checks. Use when adding tests to dbt models or implementing data quality validation.

armanzeroeight
data-ai
data-engineering
18

data-transform

Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.

vuralserhat86
data-ai
data-engineering
18

data-structure-checker

This skill should be used when reading any tabular data file (Excel, CSV, Parquet, ODS). It automatically detects and fixes common data issues including multi-level headers, encoding problems, empty rows/columns, and data type mismatches. Returns a clean DataFrame ready for analysis with zero user intervention.

aws-samples
data-ai
data-engineering
18

drizzle-orm

Type-safe ORM for Cloudflare D1 databases using Drizzle. Provides patterns for schema definition, migrations, and type-safe queries. Prevents transaction errors and schema mismatches. Includes templates for strict TypeScript usage.

vuralserhat86
data-ai
data-engineering
17

polars-expertise

This skill should be used when the user asks about Polars DataFrame library (Apache Arrow) for Python or Rust. Triggers: "polars expressions", "lazy vs eager", "scan_parquet streaming", "convert pandas to polars", "pyspark to polars", "kdb to polars", "group_by_dynamic", "rolling_mean", "polars window functions", "asof join", "polars GPU", "polars parquet", "LazyFrame". Time series: OHLCV resampling, rolling windows, financial data patterns. Performance: native expressions over map_elements, early projection, categorical types, streaming.

DeevsDeevs
data-ai
data-engineering
17

data-profiler

Generate comprehensive data profiles for DataFrames. Use for EDA, data discovery, and understanding dataset characteristics.

majesticlabs-dev
data-ai
data-engineering
17

test-fixture-generator

Generate synthetic test data with edge cases for ETL pipeline testing.

majesticlabs-dev
data-ai
data-engineering
17

data-validation

Data validation patterns and pipeline helpers. Custom validation functions, schema evolution, and test assertions.

majesticlabs-dev
data-ai
data-engineering
17

pydantic-validation

Record-level data validation using Pydantic models. Field validators, model validators, and batch validation patterns.

majesticlabs-dev
data-ai
data-engineering
17

pandera-validation

DataFrame schema validation using pandera. Schema definitions, column checks, and decorator-based validation.

majesticlabs-dev
data-ai
data-engineering
17

litestream-coder

This skill guides configuring Litestream for continuous SQLite backup in Rails 8+ apps. Use when setting up production backups for SQLite databases (Solid Queue, Solid Cache, Solid Cable).

majesticlabs-dev
data-ai
data-engineering
17

parquet-coder

Columnar file patterns including partitioning, predicate pushdown, and schema evolution.

majesticlabs-dev
data-ai
data-engineering
17

csv-wrangler

Handle messy CSVs with encoding detection, delimiter inference, and malformed row recovery.

majesticlabs-dev
data-ai
data-engineering
17

pandas-coder

DataFrame manipulation with chunked processing, memory optimization, and vectorized operations.

majesticlabs-dev
data-ai
data-engineering
17

etl-incremental-patterns

Incremental data loading patterns including backfill strategies, CDC, timestamp-based loads, and pipeline orchestration.

majesticlabs-dev
data-ai
data-engineering
16

etl-patterns

ETL workflow patterns, data pipeline architecture, and ingestion strategies for a Somali dialect classifier. Covers source integration, transformation logic, staging patterns, and load strategies. Auto-invokes when discussing data pipelines, ETL, ingestion workflows, or data processing architecture.

ilyasibrahim
data-ai
data-engineering
15

servicenow-table-api

Manages ServiceNow tables. Use for CRUD operations on any table. Triggers: generic data operations.

Knuckles-Team
data-ai
data-engineering
14

table-filters

Designs optimal filtering UX for data tables. Use when building a table that needs filters - analyzes the data columns and determines the best filter type for each. Outputs a unified filter field with inline header filters.

rohunvora
data-ai
data-engineering
14

dart-drift

Complete guide for using drift database library in Dart applications (CLI, server-side, non-Flutter). Use when building Dart apps that need local SQLite database storage or PostgreSQL connection with type-safe queries, reactive streams, migrations, and efficient CRUD operations. Includes setup with sqlite3 package, PostgreSQL support with drift_postgres, connection pooling, and server-side patterns.

MADTeacher
data-ai