domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills

data-engineering

fastapi-backend-template

FastAPI with PostgreSQL, async SQLAlchemy 2.0, Alembic, and Docker.

rebyteai-template

data-ai

open

data-engineering

Build scalable data pipelines, ETL/ELT processes, and data infrastructure. Use when: (1) designing data architectures or lakehouse patterns, (2) building Spark/Kafka/Flink/Beam pipelines, (3) optimizing Snowflake/BigQuery/Redshift queries, (4) implementing Airflow/Prefect/Dagster orchestration, (5) setting up data quality frameworks, (6) cost-optimizing data platforms.

robertlupo1997

data-ai

open

data-engineering

ontology-phase-1-ingest

Phase 1 of Ontology Builder Pipeline. Ingests and catalogs all input materials from _input/ folder. Use when starting ontology building process or when processing new input documents for domain analysis.

a4b-corporation

data-ai

open

data-engineering

django-import-export

Use when working with django-import-export.

nibuno

data-ai

open

data-engineering

oni-calculator

Oxygen Not Included production chain calculator with SQLite database extracted from decompiled game source

lawless-m

data-ai

open

data-engineering

airbyte-connection-setup

Airbyte expert. Use for setting up ETL/ELT pipelines, connectors, and data synchronization.

dengineproblem

data-ai

open

data-engineering

dbt-transformations

ALWAYS USE when working with dbt models, SQL transformations, tests, snapshots, or macros. Use IMMEDIATELY when editing dbt_project.yml, profiles.yml, or creating SQL models. MUST be loaded before any transform-layer work. Enforces dbt owns SQL principle - never parse, validate, or transform SQL in Python.

Obsidian-Owl

data-ai

open

data-engineering

duckdb-lakehouse

ALWAYS USE when building data lakehouse with DuckDB compute, configuring dbt-duckdb with Polaris plugin, or designing catalog-first architecture in floe-platform. Use IMMEDIATELY when reading/writing Iceberg tables via Polaris catalog, creating Dagster assets with DuckDB, or connecting to REST catalogs with inline credentials. Provides research steps for DuckDB + Dagster + Iceberg/Polaris integration patterns.

Obsidian-Owl

data-ai

open

data-engineering

debug-parse

Isolate and test parser behavior on specific text snippets to debug pattern matching, validate regex patterns against edge cases, understand which extraction rules triggered, and test parser changes before full deployment without running the complete pipeline or database commit. Use this skill when: (1) Debugging why parser misinterpreted a specific line or exercise description, (2) Testing new regex patterns against edge cases before adding to parser, (3) Validating parser changes on isolated examples without full workflow, (4) Understanding which parsing rule triggered for specific input text, or (5) Developing and testing new extraction patterns in isolation

zohar-ui

data-ai

open

data-engineering

senior-data-engineer

World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.

nimeshgurung

data-ai

open

data-engineering

data-expert

Performs data analysis and engineering tasks with a senior-level perspective, focusing on data quality, migration pipelines, SQL optimization, and business insights. Triggers when tasks involve database migrations, ETL, data validation, or analytical queries.

cesaramirez

data-ai

open

data-engineering

pm-07-conformance

Evaluate conformance of the event log to discovered models and generate deviation artefacts.

Wattysaid

data-ai

open

data-engineering

elt-modeling

Comprehensive guide to ELT (Extract, Load, Transform) modeling patterns, dimensional modeling, fact and dimension tables, and data warehouse design

AmnadTaowsoam

data-ai

open

data-engineering

fof-preflight

Diff-aware guardrail checker for Fear-of-Falling (FOF) changes; fails closed on raw data edits, Kxx intro/req_cols mismatches, and output discipline risks.

Tupatuko2023

data-ai

open

data-engineering

database-changes

Make database schema changes in IdeaForge. Triggers: create migration, add table/column, modify column type, add index, use JSONB, use pgvector. File-based migrations with raw SQL, no ORM.

Holo00

data-ai

open

data-engineering

atft-pipeline

Manage J-Quants ingestion, feature graph generation, and cache hygiene for the ATFT-GAT-FAN dataset pipeline.

wer-inc

data-ai

open

data-engineering

dc-cube-definition

Create and configure Drizzle Cube semantic layer cube definitions with proper security context, measures, dimensions, and joins.

cliftonc

data-ai

open

data-engineering

sync-check

Use when you need to verify that generated files are in sync with their source files.

younwony

data-ai

open

data-engineering

reading-plan-designer

Design and implement Bible reading plans for the KR92 Bible Voice project. Use when: (1) creating new reading plans (7-day, 30-day, yearly), (2) adding daily readings to existing plans, (3) generating reading plan SQL migrations, (4) understanding the reading plan data model, (5) designing reading sequences (chronological, topical, book-based), (6) validating reading reference formats. Triggers: "reading plan", "lukusuunnitelma", "daily readings", "create plan", "add readings".

Spectaculous-Code

data-ai

open

data-engineering

kafka-streaming

Comprehensive guide to Apache Kafka for real-time data streaming including topics, producers, consumers, stream processing, and production best practices

AmnadTaowsoam

data-ai

open

data-engineering

database-manager

Comprehensive database management workflow that orchestrates database architecture, schema design, performance optimization, and data governance. Handles everything from database design and implementation to performance tuning, backup strategies, and data migration.

ajianaz

data-ai

open

data-engineering

dagster-orchestration

ALWAYS USE when working with Dagster assets, resources, IO managers, schedules, sensors, or dbt integration. CRITICAL for: @asset decorators, @dbt_assets, DbtCliResource, ConfigurableResource, IO managers, partitions. Enforces CATALOG-AS-CONTROL-PLANE architecture - ALL Iceberg writes via catalog (Polaris/Glue). Provides pluggable orchestration patterns abstractable to Airflow/Prefect. Compute abstraction: DuckDB (default), Spark, Snowflake - all via dbt.

Obsidian-Owl

data-ai

open

data-engineering

dbt-model-builder

Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.

zazu-22

data-ai

open

data-engineering

supabase-seeding

Database seeding toolkit for Supabase projects. Use when: (1) Creating seed data files, (2) Populating lookup/reference tables, (3) Generating test data, (4) Bulk loading data with COPY, (5) Running seed files against database, (6) Managing large seed files with DVC

ninyawee

data-ai

open