
Data Engineering

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
314

sf-datacloud-prepare

Salesforce Data Cloud Prepare phase. TRIGGER when: user creates or manages Data Cloud data streams, DLOs, transforms, or Document AI configurations, or asks about ingestion into Data Cloud. DO NOT TRIGGER when: the task is connection setup only (use sf-datacloud-connect), DMOs and identity resolution (use sf-datacloud-harmonize), or query/search work (use sf-datacloud-retrieve).

Jaganpro
data-ai
data-engineering
314

sf-datacloud

Salesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase `sf data360` workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching sf-datacloud-* skill), the task is STDM/session tracing/parquet telemetry (use sf-ai-agentforce-observability), standard CRM SOQL (use sf-soql), or Apex implementation (use sf-apex).

Jaganpro
data-ai
data-engineering
313

create-transformer

Use when creating a new walkerOS transformer. Example-driven workflow for validation, enrichment, or redaction transformers.

elbwalker
data-ai
data-engineering
313

understanding-transformers

Use when working with transformers, understanding event validation/enrichment/redaction, or learning about transformer chaining. Covers interface, return values, and pipeline integration.

elbwalker
data-ai
data-engineering
313

understanding-mapping

Use when transforming events at any point in the flow (source→collector or collector→destination), configuring data/map/loop/condition, or understanding value extraction. Covers all mapping strategies.

elbwalker
data-ai
data-engineering
312

clickhouse-system-queries

Query ClickHouse system tables to inspect query logs, monitor cluster health, check replication status, and analyze slow queries. Use when the user mentions "system tables", "query_log", "ClickHouse monitoring", "cluster status", "slow queries", or asks to diagnose ClickHouse operational issues.

FrankChen021
data-ai
data-engineering
312

sql-expert

Expert system for generating, validating, and optimizing ClickHouse SQL. Use this when the user needs data, queries, or analysis.

FrankChen021
data-ai
data-engineering
312

cascadeflow

OpenClaw-native domain cascading. Use when users need cost/latency reduction via cascading, domain-aware model assignment, OpenClaw-native event handling, and command setup including /model cflow and optional /cascade stats commands.

lemony-ai
data-ai
data-engineering
310

tracing-upstream-lineage

Trace upstream data lineage. Use when the user asks where data comes from, what feeds a table, upstream dependencies, data sources, or needs to understand data origins.

astronomer
data-ai
data-engineering
310

tracing-downstream-lineage

Trace downstream data lineage and impact analysis. Use when the user asks what depends on this data, what breaks if something changes, downstream dependencies, or needs to assess change risk before modifying a table or DAG.

astronomer
data-ai
data-engineering
310

analyzing-data

Queries the data warehouse and answers business questions about data. Handles questions requiring database/warehouse queries, including "who uses X", "how many Y", "show me Z", "find customers", "what is the count", data lookups, metrics, trends, or SQL analysis.

astronomer
data-ai
data-engineering
310

airflow-hitl

Use when the user needs human-in-the-loop workflows in Airflow (approval/reject, form input, or human-driven branching). Covers ApprovalOperator, HITLOperator, HITLBranchOperator, HITLEntryOperator. Requires Airflow 3.1+. Does not cover AI/LLM calls (see airflow-ai).

astronomer
data-ai
data-engineering
310

airflow

Queries, manages, and troubleshoots Apache Airflow using the af CLI. Covers listing DAGs, triggering runs, reading task logs, diagnosing failures, debugging DAG import errors, checking connections, variables, pools, and monitoring health. Also routes to sub-skills for writing DAGs, debugging, deploying, and migrating Airflow 2 to 3. Use when user mentions "Airflow", "DAG", "DAG run", "task log", "import error", "parse error", "broken DAG", or asks to "trigger a pipeline", "debug import errors", "check Airflow health", "list connections", "retry a run", or any Airflow operation. Do NOT use for warehouse/SQL analytics on Airflow metadata tables — use analyzing-data instead.

astronomer
data-ai
data-engineering
310

annotating-task-lineage

Annotate Airflow tasks with data lineage using inlets and outlets. Use when the user wants to add lineage metadata to tasks, specify input/output datasets, or enable lineage tracking for operators without built-in OpenLineage extraction.

astronomer
data-ai
data-engineering
310

authoring-dags

Workflow and best practices for writing Apache Airflow DAGs. Use when the user wants to create a new DAG, write pipeline code, or asks about DAG patterns and conventions. For testing and debugging DAGs, see the testing-dags skill.

astronomer
data-ai
data-engineering
310

blueprint

Define reusable Airflow task group templates with Pydantic validation and compose DAGs from YAML. Use when creating blueprint templates, composing DAGs from YAML, validating configurations, or enabling no-code DAG authoring for non-engineers.

astronomer
data-ai
data-engineering
310

checking-freshness

Quick data freshness check. Use when the user asks if data is up to date, when a table was last updated, if data is stale, or needs to verify data currency before using it.

astronomer
data-ai
data-engineering
310

cosmos-dbt-core

Use when turning a dbt Core project into an Airflow DAG/TaskGroup using Astronomer Cosmos. Does not cover dbt Fusion. Before implementing, verify dbt engine, warehouse, Airflow version, execution environment, DAG vs TaskGroup, and manifest availability.

astronomer
data-ai
data-engineering
310

cosmos-dbt-fusion

Use when running a dbt Fusion project with Astronomer Cosmos. Covers Cosmos 1.11+ configuration for Fusion on Snowflake/Databricks with ExecutionMode.LOCAL. Before implementing, verify dbt engine is Fusion (not Core), warehouse is supported, and local execution is acceptable. Does not cover dbt Core.

astronomer
data-ai
data-engineering
310

creating-openlineage-extractors

Create custom OpenLineage extractors for Airflow operators. Use when the user needs lineage from unsupported or third-party operators, wants column-level lineage, or needs complex extraction logic beyond what inlets/outlets provide.

astronomer
data-ai
data-engineering
310

deploying-airflow

Deploy Airflow DAGs and projects. Use when the user wants to deploy code, push DAGs, set up CI/CD, deploy to production, or asks about deployment strategies for Airflow.

astronomer
data-ai
data-engineering
310

managing-astro-local-env

Manage a local Airflow environment with the Astro CLI (Docker and standalone modes). Use when the user wants to start, stop, or restart Airflow, view logs, query the Airflow API, troubleshoot, or fix environment issues. For project setup, see setting-up-astro-project.

astronomer
data-ai
data-engineering
310

migrating-airflow-2-to-3

Guide for migrating Apache Airflow 2.x projects to Airflow 3.x. Use when the user mentions Airflow 3 migration, upgrade, compatibility issues, breaking changes, or wants to modernize their Airflow codebase. If you detect Airflow 2.x code that needs migration, prompt the user and ask if they want you to help upgrade. Always load this skill as the first step for any migration-related request.

astronomer
data-ai
data-engineering
310

warehouse-init

Initialize warehouse schema discovery. Generates .astro/warehouse.md with all table metadata for instant lookups. Run once per project, refresh when schema changes. Use when user says "/astronomer-data:warehouse-init" or asks to set up data discovery.

astronomer
data-ai