data-engineering, data-ai

synthetic-data-generation

Generate realistic synthetic data with Faker and Spark, including non-linear distributions and integrity constraints, and save it to Databricks. Use this skill when creating test data, demo datasets, or synthetic tables.

Maintainer: databricks-solutions
Updated 1/19/2026
Stars: 5
Forks: 5
Quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

Once installed, you can use this skill by running the following command in your terminal:

skills use synthetic-data-generation
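
The description mentions non-linear distributions and integrity constraints. The sketch below illustrates both ideas on a small scale. It is not the skill's own code: it deliberately uses only Python's standard `random` module instead of Faker and Spark so that it runs without dependencies, and the table names, column names, and parameters (`make_customers`, `make_orders`, the tier weights, the log-normal settings) are hypothetical choices made for illustration.

```python
import random


def make_customers(n, rng):
    """Build n parent rows with a skewed (non-uniform) tier distribution."""
    tiers = ["free", "pro", "enterprise"]
    return [
        {
            "customer_id": i,
            # Weighted choice: most customers are on the free tier.
            "tier": rng.choices(tiers, weights=[70, 25, 5])[0],
        }
        for i in range(1, n + 1)
    ]


def make_orders(customers, n, rng):
    """Build n child rows whose customer_id always exists in customers."""
    ids = [c["customer_id"] for c in customers]
    # 1/rank weights: a few customers account for a large share of orders.
    weights = [1.0 / rank for rank in range(1, len(ids) + 1)]
    orders = []
    for order_id in range(1, n + 1):
        orders.append(
            {
                "order_id": order_id,
                # Sampling from existing ids keeps the foreign key valid.
                "customer_id": rng.choices(ids, weights=weights)[0],
                # Log-normal amounts: many small orders, a long tail of large ones.
                "amount": round(rng.lognormvariate(3.5, 0.8), 2),
            }
        )
    return orders


rng = random.Random(42)  # fixed seed so the dataset is reproducible
customers = make_customers(100, rng)
orders = make_orders(customers, 1000, rng)
```

Sampling child keys from the already generated parent keys is what guarantees referential integrity here; generating both tables independently would not. In a Spark session, lists of dictionaries like these could be turned into DataFrames with `spark.createDataFrame` and written with `DataFrame.write.saveAsTable`, though whether the skill follows that path is not stated on this page.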