domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills
data-engineering
310

tracing-upstream-lineage

Trace upstream data lineage. Use when the user asks where data comes from, what feeds a table, upstream dependencies, data sources, or needs to understand data origins.

astronomer
astronomer
data-ai
open
data-engineering
310

tracing-downstream-lineage

Trace downstream data lineage and impact analysis. Use when the user asks what depends on this data, what breaks if something changes, downstream dependencies, or needs to assess change risk before modifying a table or DAG.

astronomer
astronomer
data-ai
open
data-engineering
310

analyzing-data

Queries data warehouse and answers business questions about data. Handles questions requiring database/warehouse queries including "who uses X", "how many Y", "show me Z", "find customers", "what is the count", data lookups, metrics, trends, or SQL analysis.

astronomer
astronomer
data-ai
open
data-engineering
310

airflow-hitl

Use when the user needs human-in-the-loop workflows in Airflow (approval/reject, form input, or human-driven branching). Covers ApprovalOperator, HITLOperator, HITLBranchOperator, HITLEntryOperator. Requires Airflow 3.1+. Does not cover AI/LLM calls (see airflow-ai).

astronomer
astronomer
data-ai
open
data-engineering
310

airflow

Queries, manages, and troubleshoots Apache Airflow using the af CLI. Covers listing DAGs, triggering runs, reading task logs, diagnosing failures, debugging DAG import errors, checking connections, variables, pools, and monitoring health. Also routes to sub-skills for writing DAGs, debugging, deploying, and migrating Airflow 2 to 3. Use when user mentions "Airflow", "DAG", "DAG run", "task log", "import error", "parse error", "broken DAG", or asks to "trigger a pipeline", "debug import errors", "check Airflow health", "list connections", "retry a run", or any Airflow operation. Do NOT use for warehouse/SQL analytics on Airflow metadata tables — use analyzing-data instead.

astronomer
astronomer
data-ai
open
data-engineering
310

annotating-task-lineage

Annotate Airflow tasks with data lineage using inlets and outlets. Use when the user wants to add lineage metadata to tasks, specify input/output datasets, or enable lineage tracking for operators without built-in OpenLineage extraction.

astronomer
astronomer
data-ai
open
data-engineering
310

authoring-dags

Workflow and best practices for writing Apache Airflow DAGs. Use when the user wants to create a new DAG, write pipeline code, or asks about DAG patterns and conventions. For testing and debugging DAGs, see the testing-dags skill.

astronomer
astronomer
data-ai
open
data-engineering
310

blueprint

Define reusable Airflow task group templates with Pydantic validation and compose DAGs from YAML. Use when creating blueprint templates, composing DAGs from YAML, validating configurations, or enabling no-code DAG authoring for non-engineers.

astronomer
astronomer
data-ai
open
data-engineering
310

checking-freshness

Quick data freshness check. Use when the user asks if data is up to date, when a table was last updated, if data is stale, or needs to verify data currency before using it.

astronomer
astronomer
data-ai
open
data-engineering
310

cosmos-dbt-core

Use when turning a dbt Core project into an Airflow DAG/TaskGroup using Astronomer Cosmos. Does not cover dbt Fusion. Before implementing, verify dbt engine, warehouse, Airflow version, execution environment, DAG vs TaskGroup, and manifest availability.

astronomer
astronomer
data-ai
open
data-engineering
310

cosmos-dbt-fusion

Use when running a dbt Fusion project with Astronomer Cosmos. Covers Cosmos 1.11+ configuration for Fusion on Snowflake/Databricks with ExecutionMode.LOCAL. Before implementing, verify dbt engine is Fusion (not Core), warehouse is supported, and local execution is acceptable. Does not cover dbt Core.

astronomer
astronomer
data-ai
open
data-engineering
310

creating-openlineage-extractors

Create custom OpenLineage extractors for Airflow operators. Use when the user needs lineage from unsupported or third-party operators, wants column-level lineage, or needs complex extraction logic beyond what inlets/outlets provide.

astronomer
astronomer
data-ai
open
data-engineering
310

deploying-airflow

Deploy Airflow DAGs and projects. Use when the user wants to deploy code, push DAGs, set up CI/CD, deploy to production, or asks about deployment strategies for Airflow.

astronomer
astronomer
data-ai
open
data-engineering
310

managing-astro-local-env

Manage local Airflow environment with Astro CLI (Docker and standalone modes). Use when the user wants to start, stop, or restart Airflow, view logs, query the Airflow API, troubleshoot, or fix environment issues. For project setup, see setting-up-astro-project.

astronomer
astronomer
data-ai
open
data-engineering
310

migrating-airflow-2-to-3

Guide for migrating Apache Airflow 2.x projects to Airflow 3.x. Use when the user mentions Airflow 3 migration, upgrade, compatibility issues, breaking changes, or wants to modernize their Airflow codebase. If you detect Airflow 2.x code that needs migration, prompt the user and ask if they want you to help upgrade. Always load this skill as the first step for any migration-related request.

astronomer
astronomer
data-ai
open
data-engineering
310

warehouse-init

Initialize warehouse schema discovery. Generates .astro/warehouse.md with all table metadata for instant lookups. Run once per project, refresh when schema changes. Use when user says "/astronomer-data:warehouse-init" or asks to set up data discovery.

astronomer
astronomer
data-ai
open
machine-learning
309

sample-scaffolder

Takes a skill that has been submitted as a PR and scaffolds it into the sample format the repository expects as its standard.

pnp
pnp
data-ai
open
data-engineering
307

dotfile-brainstorm

Use when you want to build a project but don't have a spec yet and need to brainstorm the idea into a design doc structured for pipeline DOT generation. Use when starting from a vague idea, project concept, or feature request that needs to become a headless autonomous build pipeline.

harperreed
harperreed
data-ai
open
machine-learning
307

perf-check

Run a Maestro-style performance assessment for hotspots, regressions, and optimization planning.

josstei
josstei
data-ai
open
data-analysis
305

reporting

Guidelines for formatting reports using HTML details/summary tags.

githubnext
githubnext
data-ai
open
data-analysis
305

proprietary-data-generator

Create original surveys, benchmarks, and aggregated data nobody else has. Automate data collection for content moats. Triggers on: "create original data", "proprietary data", "survey design", "benchmark study", "original research", "data-driven content", "create a survey", "industry benchmark", "aggregated data", "unique data", "first-party data", "data moat", "generate research data", "create a study", "original statistics", "data nobody else has", "competitive data advantage".

Affitor
Affitor
data-ai
open
data-engineering
305

hipdnn-codegen

Generate hipDNN operation boilerplate from a FlatBuffer schema. Use when the user wants to add a new operation type to hipDNN, or generate descriptor/packer/unpacker code.

ROCm
ROCm
data-ai
open
llm-ai
304

system-prompt-writer

This skill should be used when writing or improving system prompts for AI agents, providing expert guidance based on Anthropic's context engineering principles.

aws-samples
aws-samples
data-ai
open