domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills
data-analysis
31

log-summary-date-ranges

Guidance for analyzing log files and generating summary reports with counts aggregated across multiple date ranges and severity levels. This skill applies when tasks involve parsing log files by date, counting occurrences by severity (ERROR, WARNING, INFO), and outputting structured CSV summaries across time periods like "today", "last 7 days", or "last 30 days".

letta-ai
data-ai
data-engineering
31

multi-source-data-merger

This skill provides guidance for merging data from multiple heterogeneous sources (CSV, JSON, Parquet, XML, etc.) into unified output formats with conflict detection and resolution. Use when tasks involve combining data from different file formats, field mapping between schemas, priority-based conflict resolution, or generating merged datasets with conflict reports.

letta-ai
data-ai
data-engineering
31

apache-airflow-orchestration

Complete guide for Apache Airflow orchestration including DAGs, operators, sensors, XComs, task dependencies, dynamic workflows, and production deployment

manutej
data-ai
data-engineering
31

apache-spark-data-processing

Complete guide for Apache Spark data processing including RDDs, DataFrames, Spark SQL, streaming, MLlib, and production deployment

manutej
data-ai
data-engineering
31

sparql-university

Guidance for writing SPARQL queries against RDF/Turtle datasets, particularly for university or academic data. This skill should be used when tasks involve querying RDF data with SPARQL, filtering entities based on multiple criteria, aggregating results, or working with Turtle (.ttl) files.

letta-ai
data-ai
data-engineering
31

sparql-university

Guidance for writing and verifying SPARQL queries against RDF datasets, particularly university/academic ontologies. This skill should be used when tasks involve querying RDF data with SPARQL, working with academic datasets (students, professors, departments, courses), or performing complex graph pattern matching with filters and aggregations.

letta-ai
data-ai
data-engineering
31

kafka-stream-processing

Complete guide for Apache Kafka stream processing including producers, consumers, Kafka Streams, connectors, schema registry, and production deployment

manutej
data-ai
data-engineering
31

multi-source-data-merger

This skill provides guidance for merging data from multiple heterogeneous sources (JSON, CSV, Parquet, XML, etc.) into a unified dataset. Use this skill when tasks involve combining records from different file formats, applying field mappings, resolving conflicts based on priority rules, or generating merged outputs with conflict reports. Applicable to ETL pipelines, data consolidation, and record deduplication scenarios.

letta-ai
data-ai
data-engineering
31

reshard-c4-data

Guidance for data resharding tasks that involve reorganizing files across directory structures with constraints on file sizes and directory contents. This skill applies when redistributing datasets, splitting large files, or reorganizing data into shards while maintaining constraints like maximum files per directory or maximum file sizes. Use when tasks involve resharding, data partitioning, or directory-constrained file reorganization.

letta-ai
data-ai
data-engineering
31

dbt-data-transformation

Complete guide for dbt data transformation including models, tests, documentation, incremental builds, macros, packages, and production workflows

manutej
data-ai
llm-ai
31

extract-transcripts

Extract readable transcripts from Claude Code and Codex CLI session JSONL files

0xBigBoss
data-ai
machine-learning
31

hf-model-inference

Guidance for setting up HuggingFace model inference services with Flask APIs. This skill applies when downloading HuggingFace models, creating inference endpoints, or building ML model serving APIs. Use for tasks involving transformers library, model caching, and REST API creation for ML models.

letta-ai
data-ai
llm-ai
31

mteb-retrieve

This skill provides guidance for semantic similarity retrieval tasks using embedding models (e.g., MTEB benchmarks, document ranking). It should be used when computing embeddings for documents/queries, ranking documents by similarity, or identifying top-k similar items. Covers data preprocessing, model selection, similarity computation, and result verification.

letta-ai
data-ai
llm-ai
31

learning-sdk-integration

Integration patterns and best practices for adding persistent memory to LLM agents using the Letta Learning SDK

letta-ai
data-ai
machine-learning
31

train-fasttext

This skill provides guidance for training FastText text classification models with constraints on accuracy and model size. It should be used when training fastText supervised models, optimizing model size while maintaining accuracy thresholds, or tuning hyperparameters for text classification tasks.

letta-ai
data-ai
llm-ai
31

super-dev

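Top-tier AI development team (God-Tier). Orchestrates 10 elite experts (PM, architecture, UI, UX, security, code, DBA, QA, DevOps, RCA) to deliver commercial-grade R&D assets. Includes built-in chain-of-thought (CoT) reasoning and a real-time market intelligence system.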

shangyankeji
data-ai
llm-ai
31

path-tracing

Guide for reverse-engineering and recreating programmatically-generated ray-traced images. This skill should be used when tasks involve analyzing a target image to determine rendering parameters, implementing path tracing or ray tracing algorithms, matching scene geometry and lighting, or achieving high similarity scores between generated and target images.

letta-ai
data-ai
machine-learning
31

train-fasttext

Guidance for training FastText text classification models with constraints on model size and accuracy. This skill should be used when training FastText models, optimizing hyperparameters, or balancing trade-offs between model size and classification accuracy.

letta-ai
data-ai
machine-learning
31

mteb-leaderboard

Guidance for querying ML model leaderboards and benchmarks (MTEB, HuggingFace, embedding benchmarks). This skill applies when tasks involve finding top-performing models on specific benchmarks, comparing model performance across leaderboards, or answering questions about current benchmark standings. Covers strategies for accessing live leaderboard data, handling temporal requirements, and avoiding common pitfalls with outdated sources.

letta-ai
data-ai
llm-ai
31

letta-fleet-management

Manage Letta AI agent fleets declaratively with a kubectl-style CLI. Use when creating, updating, or managing multiple Letta agents with shared configurations, memory blocks, tools, and folders.

letta-ai
data-ai
llm-ai
31

langchain-orchestration

Comprehensive guide for building production-grade LLM applications using LangChain's chains, agents, memory systems, RAG patterns, and advanced orchestration

manutej
data-ai
llm-ai
31

model-configuration

SDK/API patterns for configuring LLM models on Letta agents. Use when setting model handles, adjusting temperature/tokens, configuring provider-specific settings (reasoning, extended thinking), or setting up custom endpoints.

letta-ai
data-ai
machine-learning
31

hf-model-inference

Guidance for deploying HuggingFace models as inference APIs/services. This skill applies when tasks involve downloading pre-trained models from HuggingFace Hub, creating REST APIs for model inference, building Flask/FastAPI services around ML models, or setting up sentiment analysis, text classification, or other NLP inference endpoints.

letta-ai
data-ai
llm-ai
31

letta-development-guide

Comprehensive guide for developing Letta agents, including architecture selection, memory design, model selection, and tool configuration. Use when building or troubleshooting Letta agents.

letta-ai
data-ai