tableau-expert
Expert-level Tableau Desktop/Server, calculated fields, LOD expressions, dashboards, data blending, and performance optimization
Fast DataFrame library written in Rust for high-performance data manipulation and analysis. Use when the user wants fast data transformations, is working with large datasets, needs lazy evaluation pipelines, or needs better performance than pandas. Ideal for ETL, data wrangling, aggregations, joins, and reading/writing CSV, Parquet, and JSON files.
Run comprehensive data health checks on the database, reporting on data quality issues, dead letters, duplicates, and anomalies. Use when you need to assess overall data quality, identify issues requiring attention, or perform daily data quality monitoring.
This skill should be used when seeding databases with realistic fake data for development, testing, or staging environments. Supports PostgreSQL, MySQL, SQLite, and MongoDB, with ORM-based seeding (SQLAlchemy, Django, Prisma) and the Faker library for generating realistic test data. Use when the user needs to populate databases with sample data, create test fixtures, or set up development/staging environments.
Analyzes data sources and generates Nixtla-compatible schema transformations. Infers column mappings, creates transformation modules for CSV/SQL/Parquet/dbt sources, generates schema contracts, and validates data quality. Activates when user needs data transformation, schema mapping, column inference, or Nixtla format conversion.
Execute data processing workflows defined in YAML configuration files. Supports data loading, transformation, validation, and reporting pipelines.
Automatically detect schema changes and manage EF Core migrations
This skill should be used when ingesting market data from multiple sources (APIs, databases, files) into Zipline-Reloaded. It provides standardized patterns for data normalization, quality checks, and pipeline integration.
Data Lake architecture and management including medallion architecture (bronze/silver/gold zones), data catalog with AWS Glue, partitioning strategies, schema evolution, data quality, governance, cost optimization, S3 lifecycle policies, data retention, compliance, query optimization with Athena, data formats (Parquet, ORC, Avro), incremental processing, CDC patterns, and production best practices for scalable data lakes.
Extract structured fields from unstructured log data using OPAL parsing functions. Covers extract_regex() for pattern matching with type casting, split() for delimited data, parse_json() for JSON logs, and JSONPath for navigating parsed structures. Use when you need to convert raw log text into queryable fields for analysis, filtering, or aggregation.
Parse URLs in CSV files and extract query parameters as new columns. Use when working with CSV files containing URLs that need parameter extraction and analysis.
Use this agent when reviewing PRs that touch database migrations, data backfills, or any code that transforms production data. This agent validates ID mappings against production reality, checks for swapped values, verifies rollback safety, and ensures data integrity during schema changes. Essential for any migration that involves ID mappings, column renames, or data transformations. <example>Context: The user has a PR with database migrations that involve ID mappings. user: "Review this PR that migrates from action_id to action_module_name" assistant: "I'll use the data-migration-expert agent to validate the ID mappings and migration safety" <commentary>Since the PR involves ID mappings and data migration, use the data-migration-expert to verify the mappings match production and check for swapped values.</commentary></example> <example>Context: The user has a migration that transforms enum values. user: "This migration converts status integers to string enums" assistant: "Let me have the data-migration-exper
Master data engineering, ETL/ELT, data warehousing, SQL optimization, and analytics. Use when building data pipelines, designing data systems, or working with large datasets.
Snowflake, BigQuery, Redshift, dimensional modeling, and modern data warehouse architecture
Comprehensive guide for implementing Supabase Realtime features with best practices, scalable patterns, and migration strategies. Use when building realtime features in Supabase applications including messaging, notifications, presence, live updates, collaborative features, or migrating from postgres_changes to broadcast. Covers client setup, database triggers with realtime.broadcast_changes, RLS authorization, naming conventions, and performance optimization.
Data validation patterns for the World of Darkness Django application including database constraints, model validators, and atomic transactions. Use when implementing XP/freebie spending transactions, adding database constraints to models, writing clean() validation methods, or ensuring data integrity for character stats.
Expert-level Apache Airflow orchestration, DAGs, operators, sensors, XComs, task dependencies, and scheduling
Expert-level Databricks platform, Apache Spark, Delta Lake, MLflow, notebooks, and cluster management
PostgreSQL streaming replication - setup, monitoring, failover
Effect Schema conventions and patterns. Triggers on Schema class creation, tagged unions, enums, type guards, or test fixtures using Effect Schema.
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but the data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100 GB datasets, ETL pipelines, and as a faster pandas replacement. For larger-than-RAM data, use Dask or Vaex.
Master dimensional data modeling including star schema design, slowly changing dimensions, fact tables, and data warehouse architecture