home/categories/data-engineering/hamelsmu-evals-skills-skills-generate-synthetic-data-skill-md
data-engineeringdata-ai

generate-synthetic-data

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.

hamelsmu
maintainer
hamelsmu
अपडेट किया गया 3/3/2026
स्टार
1105
फोर्क
117
quick start

Installation and usage

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.

इंस्टॉलेशन
$ install --globalskills.sh
उपयोग

इंस्टॉल करने के बाद, आप टर्मिनल में यह कमांड चलाकर इस स्किल का उपयोग कर सकते हैं:

skills use generate-synthetic-data