home/categories/data-engineering/red-hat-ai-innovation-team-sdg-hub-claude-skills-synthetic-data-generation-skill-md
data-engineeringdata-ai

synthetic-data-generation

Generate synthetic data using sdg_hub with composable blocks and YAML flows. Use when the user wants to create training datasets, generate QA pairs, run data generation pipelines, build custom flows, produce synthetic data from documents, use agent frameworks for data generation, or distill MCP tool-use traces. Supports pre-built flows, custom Python scripts, and YAML flow authoring with 20+ blocks, agent connectors (Langflow, LangGraph), MCP tool-use, and 100+ LLM providers via LiteLLM.

Red-Hat-AI-Innovation-Team
maintainer
Red-Hat-AI-Innovation-Team
Mis à jour 4/10/2026
Étoiles
128
Forks
53
quick start

Installation and usage

Generate synthetic data using sdg_hub with composable blocks and YAML flows. Use when the user wants to create training datasets, generate QA pairs, run data generation pipelines, build custom flows, produce synthetic data from documents, use agent frameworks for data generation, or distill MCP tool-use traces. Supports pre-built flows, custom Python scripts, and YAML flow authoring with 20+ blocks, agent connectors (Langflow, LangGraph), MCP tool-use, and 100+ LLM providers via LiteLLM.

Installation
$ install --globalskills.sh
Utilisation

Après l'installation, vous pouvez utiliser ce skill en exécutant la commande suivante dans votre terminal :

skills use synthetic-data-generation