fastapi-backend-template
FastAPI with PostgreSQL, async SQLAlchemy 2.0, Alembic, and Docker.
Build scalable data pipelines, ETL/ELT processes, and data infrastructure. Use when: (1) designing data architectures or lakehouse patterns, (2) building Spark/Kafka/Flink/Beam pipelines, (3) optimizing Snowflake/BigQuery/Redshift queries, (4) implementing Airflow/Prefect/Dagster orchestration, (5) setting up data quality frameworks, (6) cost-optimizing data platforms.
Phase 1 of Ontology Builder Pipeline. Ingests and catalogs all input materials from _input/ folder. Use when starting ontology building process or when processing new input documents for domain analysis.
Oxygen Not Included production chain calculator with SQLite database extracted from decompiled game source
Airbyte expert. Use for configuring ETL/ELT pipelines, connectors, data synchronization, and data pipelines.
ALWAYS USE when working with dbt models, SQL transformations, tests, snapshots, or macros. Use IMMEDIATELY when editing dbt_project.yml, profiles.yml, or creating SQL models. MUST be loaded before any transform-layer work. Enforces the "dbt owns SQL" principle: never parse, validate, or transform SQL in Python.
ALWAYS USE when building data lakehouse with DuckDB compute, configuring dbt-duckdb with Polaris plugin, or designing catalog-first architecture in floe-platform. Use IMMEDIATELY when reading/writing Iceberg tables via Polaris catalog, creating Dagster assets with DuckDB, or connecting to REST catalogs with inline credentials. Provides research steps for DuckDB + Dagster + Iceberg/Polaris integration patterns.
Isolate and test parser behavior on specific text snippets to debug pattern matching, validate regex patterns against edge cases, understand which extraction rules triggered, and test parser changes before full deployment without running the complete pipeline or database commit. Use this skill when: (1) debugging why the parser misinterpreted a specific line or exercise description, (2) testing new regex patterns against edge cases before adding them to the parser, (3) validating parser changes on isolated examples without the full workflow, (4) understanding which parsing rule triggered for a specific input text, or (5) developing and testing new extraction patterns in isolation
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.
Performs data analysis and engineering tasks with a senior-level perspective, focusing on data quality, migration pipelines, SQL optimization, and business insights. Triggers when tasks involve database migrations, ETL, data validation, or analytical queries.
Evaluate conformance of the event log to discovered models and generate deviation artefacts.
Comprehensive guide to ELT (Extract, Load, Transform) modeling patterns, dimensional modeling, fact and dimension tables, and data warehouse design
Diff-aware guardrail checker for Fear-of-Falling (FOF) changes; fails closed on raw data edits, Kxx intro/req_cols mismatches, and output discipline risks.
Make database schema changes in IdeaForge. Triggers: create migration, add table/column, modify column type, add index, use JSONB, use pgvector. File-based migrations with raw SQL, no ORM.
Manage J-Quants ingestion, feature graph generation, and cache hygiene for the ATFT-GAT-FAN dataset pipeline.
Create and configure Drizzle Cube semantic layer cube definitions with proper security context, measures, dimensions, and joins.
Design and implement Bible reading plans for the KR92 Bible Voice project. Use when: (1) creating new reading plans (7-day, 30-day, yearly), (2) adding daily readings to existing plans, (3) generating reading plan SQL migrations, (4) understanding the reading plan data model, (5) designing reading sequences (chronological, topical, book-based), (6) validating reading reference formats. Triggers: "reading plan", "lukusuunnitelma", "daily readings", "create plan", "add readings".
Comprehensive guide to Apache Kafka for real-time data streaming including topics, producers, consumers, stream processing, and production best practices
Comprehensive database management workflow that orchestrates database architecture, schema design, performance optimization, and data governance. Handles everything from database design and implementation to performance tuning, backup strategies, and data migration.
ALWAYS USE when working with Dagster assets, resources, IO managers, schedules, sensors, or dbt integration. CRITICAL for: @asset decorators, @dbt_assets, DbtCliResource, ConfigurableResource, IO managers, partitions. Enforces CATALOG-AS-CONTROL-PLANE architecture - ALL Iceberg writes via catalog (Polaris/Glue). Provides pluggable orchestration patterns abstractable to Airflow/Prefect. Compute abstraction: DuckDB (default), Spark, Snowflake - all via dbt.
Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.
Database seeding toolkit for Supabase projects. Use when: (1) Creating seed data files, (2) Populating lookup/reference tables, (3) Generating test data, (4) Bulk loading data with COPY, (5) Running seed files against database, (6) Managing large seed files with DVC