
Data Engineering

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
1

data-engineer

Build ETL pipelines, data warehouses, and streaming architectures. Implements Spark jobs, Airflow DAGs, and Kafka streams. Use PROACTIVELY for data pipeline design or analytics infrastructure.

sidetoolco
data-ai
data-engineering
1

graphql-resolvers

Write efficient resolvers with DataLoader, batching, and N+1 prevention

pluginagentmarketplace
data-ai
data-engineering
1

filtering-event-datasets

Filter and search event datasets (logs) using OPAL. Use when you need to find specific log events by text search, regex patterns, or field values. Covers contains(), the tilde operator ~, field comparisons, boolean logic, and limit for sampling results. Does NOT cover aggregation (see the aggregating-event-datasets skill).

rustomax
data-ai
data-engineering
1

python-data-engineering

Comprehensive Python data engineering patterns for AWS Data Lake, including PySpark, Pandas, Apache Airflow, AWS Glue, ETL pipelines, data quality, schema management, performance optimization, FastAPI services, streaming with Kafka/Kinesis, data validation with Great Expectations, testing strategies, error handling, logging, and production deployment on AWS EMR and Glue.

b3-competition
data-ai
data-engineering
1

gcp-bq-data-loading

Use when loading data into BigQuery from CSV, JSON, Avro, Parquet files, Cloud Storage, or local files. Covers bq load command, source formats, schema detection, incremental loading, and handling parsing errors.

FunnelEnvy
data-ai
data-engineering
1

erpnext-errors-database

Error handling patterns for ERPNext/Frappe database operations. Use when handling DoesNotExistError, DuplicateEntryError, transaction failures, and query errors. Covers retry patterns and data integrity. V14/V15/V16 compatible. Triggers: database error, DoesNotExistError, DuplicateEntryError, transaction failed, query error.

OpenAEC-Foundation
data-ai
data-engineering
1

nixtla-contract-schema-mapper

Transforms prediction market data to Nixtla format (unique_id, ds, y). Maps arbitrary column names to required schema. Validates date and numeric types. Use when preparing prediction market datasets for Nixtla forecasting tools. Trigger with "convert to Nixtla format", "schema mapping", "transform data".

intent-solutions-io
data-ai
data-engineering
0

blockchain-data-collection-validation

Empirical validation workflow for blockchain data collection pipelines before production implementation. Use when validating data sources, testing DuckDB integration, building POC collectors, or verifying complete fetch-to-storage pipelines for blockchain data.

terrylica
data-ai
data-engineering
0

data-analysis

Data analysis workflows and patterns for exploring, transforming, and visualizing data. Use when working with data, creating reports, or when users mention "data analysis", "analyze data", "data exploration", or "reporting".

IHKREDDY
data-ai
data-engineering
0

data-analysis

Executive-grade data analysis with pandas/polars and McKinsey-quality visualizations. Use when analyzing data, building dashboards, creating investor presentations, or calculating SaaS metrics.

ScientiaCapital
data-ai
data-engineering
0

running-eda-process

Runs Exploratory Data Analysis (EDA) following the mandatory validation workflow. Use when performing data analysis, exploring datasets, validating data quality, or when the user mentions EDA, data exploration, sanity checks, or data validation. Always run before main analysis queries.

nimrodfisher
data-ai
data-engineering
0

bpa-rules

This skill should be used when the user asks to "create a BPA rule", "write a Best Practice Analyzer rule", "improve a BPA expression", "fix expression for BPA", "analyze BPA annotations", "check model for best practices", "audit BPA rules", "discover BPA rules", "list all BPA rules", "validate BPA rules", or mentions Tabular Editor BPA rules. Provides guidance for creating, improving, auditing, and understanding Best Practice Analyzer rules for Power BI semantic models.

data-goblin
data-ai
data-engineering
0

freshness-latency-slos

See the main Data Freshness and Latency skill for comprehensive coverage of freshness monitoring and SLO tracking.

AmnadTaowsoam
data-ai
data-engineering
0

data-quality-monitoring

Techniques and tools for ensuring the accuracy, completeness, and reliability of data across the pipeline.

AmnadTaowsoam
data-ai
data-engineering
0

data-lineage

Mapping the flow of data from source to destination for transparency, impact analysis, and troubleshooting.

AmnadTaowsoam
data-ai
data-engineering
0

process-mining-assistant

Perform an end-to-end process mining analysis via a command-line workflow that progressively ingests, profiles, cleans, mines and reports on event logs using PM4Py. The workflow generates stage-based artefacts (including versioned notebooks) and pauses at decision checkpoints so the user can validate findings and choose how to proceed.

Wattysaid
data-ai
data-engineering
0

data-freshness-and-latency

Monitoring and optimizing how quickly data flows through pipelines and ensuring it meets timeliness requirements.

AmnadTaowsoam
data-ai
data-engineering
0

polars

Expert guidance for Polars dataframe manipulation in Python. Use this skill when working with dataframes, data processing, ETL pipelines, or any task involving tabular data manipulation. Provides best practices, performance optimization patterns, and comprehensive API usage for the Polars library.

iKiok
data-ai
data-engineering
0

duckdb-data-explorer

This skill should be used when performing local data exploration, profiling, quality analysis, or transformation tasks using DuckDB. It handles CSV, Parquet, and JSON files, provides automated data quality reports, supports complex JSON transformations, and generates interactive HTML reports for data analysis.

alexismanuel
data-ai
data-engineering
0

dataql-analysis

Analyze data files using SQL queries with DataQL. Use when working with CSV, JSON, Parquet, Excel files or when the user mentions data analysis, filtering, aggregation, or SQL queries on files.

adrianolaselva
data-ai
data-engineering
0

eda

Exploratory Data Analysis for tabular data. Use this skill when analyzing value distributions, checking for missing data, computing correlations, examining class balance, or generating data quality reports.

argythana
data-ai
data-engineering
0

knack-data-cleaner

Ensures accuracy for HTI compliance and performance dashboards through data validation, deduplication, normalization, and integrity checks. Critica...

willsigmon
data-ai
data-engineering
0

polars

Use when the user mentions "Polars", "fast dataframe", "lazy evaluation", or "Arrow backend", or asks about a "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", or the "expression API".

eyadsibai
data-ai
data-engineering
0

dc-query-building

Build semantic queries with measures, dimensions, filters, and time dimensions for Drizzle Cube.

cliftonc
data-ai