domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills
data-engineering
1

linking-hms-to-hecras

Links HEC-HMS watershed models to HEC-RAS river models by extracting HMS DSS results and preparing them for RAS boundary condition import. Handles flow hydrograph export, spatial referencing (HMS outlets to RAS cross sections), DSS pathname formatting, quality validation, and time series alignment. Use when setting up HMS→RAS workflows, exporting HMS results for RAS, preparing upstream boundary conditions, or coordinating watershed-to-river integrated modeling. Leverages shared RasDss infrastructure for consistent DSS operations across both tools. Trigger keywords: HMS to RAS, link HMS RAS, boundary condition, upstream BC, watershed to river, integrated model, export HMS, DSS pathname, spatial matching, hydrograph.

gpt-cmdr
data-ai
open
data-engineering
1

read-avro-files

Extracts and displays JSON data from Apache Avro files. Use this when the user wants to read, convert, or view the contents of an .avro file. Automatically deserializes nested JSON fields for better readability.

ranjanpoudel1234
data-ai
open
data-engineering
1

continuity-ledger

Creates or updates a continuity ledger to preserve state across clears.

tfunk1030
data-ai
open
data-engineering
1

big-data

Apache Spark, Hadoop, distributed computing, and large-scale data processing for petabyte-scale workloads

pluginagentmarketplace
data-ai
open
data-engineering
1

snowflake-semantic-views

Create, alter, and validate Snowflake semantic views using the Snowflake CLI (`snow`). Use when asked to build or troubleshoot semantic views or semantic layer definitions with CREATE/ALTER SEMANTIC VIEW, to validate semantic view DDL against Snowflake via the CLI, or to guide Snowflake CLI installation and connection setup.

MiguelElGallo
data-ai
open
machine-learning
1

data-engineering

ML data engineering covering data pipelines, data quality, collection strategies, storage, and versioning for machine learning systems.

doanchienthangdev
data-ai
open
data-engineering
1

processing-data

Processes CSV files and pandas DataFrames. Use when working with CSV files, tabular data, spreadsheets, or when the user asks to query, analyze, or manipulate structured data.

binome-dev
data-ai
open
data-engineering
1

queue-worker

Bull queue setup, worker processes, job lifecycle, error handling, retries, progress reporting, Redis connection management, and queue patterns for OCR and knowledge processing

Intellifill
data-ai
open
data-engineering
1

containerization

Docker, Kubernetes, container orchestration, and cloud-native deployment for data applications

pluginagentmarketplace
data-ai
open
data-engineering
1

oracle-consultation

Use before invoking Oracle to ensure appropriate usage of this expensive reasoning resource

trash-panda-v91-beta
data-ai
open
data-engineering
1

schema-validator

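Validates that page instances are consistent with their Schema definitions (types, constraints, security) and gives a publish-readiness verdict. Invoke when saving a page, running pre-release checks, or performing batch health checks.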

AIYAZONE
data-ai
open
data-engineering
1

snowflake-semanticview

Create, alter, and validate Snowflake semantic views using Snowflake CLI (snow). Use when asked to build or troubleshoot semantic views/semantic layer definitions with CREATE/ALTER SEMANTIC VIEW, to validate semantic-view DDL against Snowflake via CLI, or to guide Snowflake CLI installation and connection setup.

webmaxru
data-ai
open
data-engineering
1

bsee-data-extractor

Extract and process BSEE (Bureau of Safety and Environmental Enforcement) data including production, WAR (Well Activity Reports), and APD (Application for Permit to Drill) data. Use for querying production data, well activities, drilling permits, completions, and workovers by API number, block, lease, or field with automatic data normalization and caching.

vamseeachanta
data-ai
open
data-engineering
1

workspace-isolation-audit

Use when asked to audit or fix Supabase queries to ensure every query filters by workspace_id and workspace access is validated.

CleanExpo
data-ai
open
data-engineering
1

virtualization-and-large-data

Implement high-performance virtualization strategies for massive datasets without sacrificing UX.

harborgrid-justin
data-ai
open
data-engineering
1

data-engineer

Build ETL pipelines, data warehouses, and streaming architectures. Implements Spark jobs, Airflow DAGs, and Kafka streams. Use PROACTIVELY for data pipeline design or analytics infrastructure.

sidetoolco
data-ai
open
machine-learning
1

nixtla-prod-pipeline-generator

Transforms forecasting experiments into production-ready inference pipelines with Airflow, Prefect, or cron orchestration. Generates ETL tasks, monitoring, error handling, and deployment configs. Activates when user needs to deploy forecasts to production, schedule batch inference, operationalize models, or create production pipelines.

intent-solutions-io
data-ai
open
data-engineering
1

graphql-resolvers

Write efficient resolvers with DataLoader, batching, and N+1 prevention

pluginagentmarketplace
data-ai
open
data-engineering
1

filtering-event-datasets

Filter and search event datasets (logs) using OPAL. Use when you need to find specific log events by text search, regex patterns, or field values. Covers contains(), the tilde operator ~, field comparisons, boolean logic, and limit for sampling results. Does NOT cover aggregation (see the aggregating-event-datasets skill).

rustomax
data-ai
open
data-engineering
1

python-data-engineering

Comprehensive Python data engineering patterns for AWS Data Lake, including PySpark, Pandas, Apache Airflow, AWS Glue, ETL pipelines, data quality, schema management, performance optimization, FastAPI services, streaming with Kafka/Kinesis, data validation with Great Expectations, testing strategies, error handling, logging, and production deployment on AWS EMR and Glue.

b3-competition
data-ai
open
data-engineering
1

gcp-bq-data-loading

Use when loading data into BigQuery from CSV, JSON, Avro, Parquet files, Cloud Storage, or local files. Covers bq load command, source formats, schema detection, incremental loading, and handling parsing errors.

FunnelEnvy
data-ai
open
data-engineering
1

erpnext-errors-database

Error handling patterns for ERPNext/Frappe database operations. Use when handling DoesNotExistError, DuplicateEntryError, transaction failures, and query errors. Covers retry patterns and data integrity. V14/V15/V16 compatible. Triggers: database error, DoesNotExistError, DuplicateEntryError, transaction failed, query error.

OpenAEC-Foundation
data-ai
open
data-engineering
1

nixtla-contract-schema-mapper

Transforms prediction market data to Nixtla format (unique_id, ds, y). Maps arbitrary column names to required schema. Validates date and numeric types. Use when preparing prediction market datasets for Nixtla forecasting tools. Trigger with "convert to Nixtla format", "schema mapping", "transform data".

intent-solutions-io
data-ai
open
machine-learning
1

huggingface-model-trainer

Train and fine-tune LLMs using HuggingFace TRL, Transformers, and cloud GPU infrastructure with SFT, DPO, GRPO methods

frankxai
data-ai
open