JSON Schema Faker Skill
Generate realistic fake data from JSON schemas with guaranteed validity.
Quick Start
When you have a JSON schema and need realistic sample data:
node scripts/generate-fake-data.mjs /path/to/schema.json 25
This generates 25 valid records conforming to your schema.
Requirements
Install dependencies first:
npm install @json-schema-faker/core ajv
How It Works
- Parse schema: Loads and validates your JSON schema
- Configure faker: Sets up realistic data generation with custom rules
- Generate records: Creates 1-50 records using faker
- Validate output: Validates each record against the schema
- Return JSON: Outputs only valid records as pretty-printed JSON
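The generate and validate steps above can be sketched as a bounded retry loop. This is a minimal illustration, not the script's actual code: `generate`, `isValid`, and `maxAttempts` are hypothetical stand-ins for the faker call, the schema validator, and whatever retry limit the script uses.

```javascript
// Sketch of a generate-then-validate loop with a bounded number of retries.
// `generate` and `isValid` are hypothetical stand-ins for the faker call and
// the schema validator; neither name is taken from the actual script.
function generateValidRecords(generate, isValid, count, maxAttempts = 5) {
  const records = [];
  const warnings = [];
  for (let i = 0; i < count; i++) {
    let accepted = false;
    for (let attempt = 0; attempt < maxAttempts && !accepted; attempt++) {
      const candidate = generate();
      if (isValid(candidate)) {
        records.push(candidate);
        accepted = true;
      }
    }
    if (!accepted) {
      // Invalid candidates are dropped; only a warning is kept.
      warnings.push(`Could not generate valid record ${i + 1} after ${maxAttempts} attempts`);
    }
  }
  return { records, warnings };
}

// Usage with toy stand-ins: a counter-based generator and an "even ids only" check.
let nextId = 0;
const result = generateValidRecords(
  () => ({ id: nextId++ }),
  (record) => record.id % 2 === 0,
  3
);
console.log(JSON.stringify(result.records, null, 2));
```

Keeping the validator as the final gate means an invalid candidate never reaches the output, at the cost of possibly returning fewer records than requested.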
Features
Schema Compliance: Every generated record is validated against the schema; only records that pass validation are returned.
Realistic Data: Uses faker.js with domain-specific extensions (currency codes, country codes, time windows) to produce realistic data beyond simple random strings.
Customizable: Specify 1-50 records. The script caps larger requests at 50.
Error Handling: Handles generation failures gracefully; reports warnings for records that couldn't be generated after multiple attempts.
Usage
Basic Generation
node scripts/generate-fake-data.mjs schema.json 10
Generates 10 records from schema.json, outputs JSON to stdout.
Specify Record Count
node scripts/generate-fake-data.mjs schema.json 50
The record count is capped at 50. If the count is omitted, it defaults to 10.
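The default and the cap described above could be resolved from the command-line argument roughly as follows. The function name `resolveCount` and the choice to reject non-numeric or non-positive input are assumptions for illustration; the actual script may handle bad input differently.

```javascript
// Sketch of resolving the record count from a CLI argument.
// DEFAULT_COUNT and MAX_COUNT mirror the documented behaviour (10 and 50);
// resolveCount itself is a hypothetical helper, not the script's real code.
const DEFAULT_COUNT = 10;
const MAX_COUNT = 50;

function resolveCount(arg) {
  // No argument given: fall back to the documented default.
  if (arg === undefined) return DEFAULT_COUNT;
  const parsed = Number.parseInt(arg, 10);
  // Assumption: invalid or non-positive input is an error rather than silently defaulted.
  if (Number.isNaN(parsed) || parsed < 1) {
    throw new Error(`Record count must be a positive integer, got "${arg}"`);
  }
  // Requests above the cap are reduced to the cap.
  return Math.min(parsed, MAX_COUNT);
}

console.log(resolveCount(undefined), resolveCount('25'), resolveCount('200'));
```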
Save to File
node scripts/generate-fake-data.mjs schema.json 30 > output.json
Improving Data Quality
The faker library generates realistic data based on schema hints. To improve data quality:
- Add format hints: Use "format": "email", "format": "date", "format": "uri"
- Add faker directives: Use "faker": "name.firstName" to guide generation
- Use enums: For constrained fields, provide valid options
- Set constraints: Use minimum, maximum, minLength, maxLength appropriately
- Avoid overly restrictive patterns: Very complex regex patterns may cause generation to fail
See references/faker-customization.md for detailed customization options and available faker methods.
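As an illustration of the hints listed above, here is a small schema written as a JavaScript object. The schema and its field names (`guestSchema`, `firstName`, `checkIn`, and so on) are hypothetical; only the keywords themselves come from the list above.

```javascript
// Hypothetical schema combining format hints, a faker directive,
// an enum, and numeric/length constraints.
const guestSchema = {
  type: 'object',
  required: ['firstName', 'email', 'status', 'age'],
  properties: {
    firstName: { type: 'string', faker: 'name.firstName' }, // faker directive
    email: { type: 'string', format: 'email' },             // format hints
    website: { type: 'string', format: 'uri' },
    checkIn: { type: 'string', format: 'date' },
    status: { type: 'string', enum: ['pending', 'confirmed', 'cancelled'] },
    age: { type: 'integer', minimum: 18, maximum: 99 },     // numeric bounds
    notes: { type: 'string', minLength: 0, maxLength: 200 }, // length bounds
  },
};

// Print the schema as JSON so it can be saved to a file and passed to the script.
console.log(JSON.stringify(guestSchema, null, 2));
```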
Example: Accommodation Listings
Given a schema for accommodation listings (with faker hints for currency and country), the skill generates realistic data like:
[
  {
    "id": "listing_ashdod_001",
    "title": "Beachfront Apartment",
    "type": "apartment",
    "location": {
      "city": "Ashdod",
      "countryCode": "IL",
      "distanceToBeachMinutes": 5,
      "coordinates": { "lat": 31.8, "lng": 34.65 }
    },
    "rating": { "score": 8.5, "count": 45 },
    "pricePreview": {
      "currency": "USD",
      "nightly": 180,
      "total": 540,
      "nights": 3
    },
    ...
  }
]
Troubleshooting
"Could not generate valid record": The schema may have constraints that conflict (e.g., minLength > maxLength). Review your schema constraints.
"Failed to parse schema": Ensure the schema file is valid JSON.
Too few records generated: Some schemas have tight constraints. Increase attempts or relax schema constraints.
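To track down the conflicting constraints mentioned above, a small pre-check can walk the schema before generation. This is a sketch under stated assumptions: `findConflicts` is a hypothetical helper, not part of the skill, and it only inspects `properties` and a single-schema `items`, ignoring `$ref`, `allOf`, and similar keywords.

```javascript
// Hypothetical helper: reports min/max pairs where the lower bound exceeds the upper.
// Only `properties` and single-schema `items` are traversed.
function findConflicts(schema, path = '#') {
  const conflicts = [];
  if (schema === null || typeof schema !== 'object') return conflicts;

  const pairs = [
    ['minLength', 'maxLength'],
    ['minimum', 'maximum'],
    ['minItems', 'maxItems'],
  ];
  for (const [lo, hi] of pairs) {
    if (typeof schema[lo] === 'number' && typeof schema[hi] === 'number' && schema[lo] > schema[hi]) {
      conflicts.push(`${path}: ${lo} (${schema[lo]}) > ${hi} (${schema[hi]})`);
    }
  }

  // Recurse into object properties.
  for (const [key, child] of Object.entries(schema.properties ?? {})) {
    conflicts.push(...findConflicts(child, `${path}/properties/${key}`));
  }
  // Recurse into array items when `items` is a single schema.
  if (schema.items && !Array.isArray(schema.items)) {
    conflicts.push(...findConflicts(schema.items, `${path}/items`));
  }
  return conflicts;
}

const report = findConflicts({
  type: 'object',
  properties: {
    name: { type: 'string', minLength: 10, maxLength: 5 },
    age: { type: 'integer', minimum: 0, maximum: 120 },
  },
});
console.log(report);
```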
Advanced Customization
To add custom faker methods beyond the built-in ones (currencyCode, countryCode, timeWindow), edit the setupFaker() function in scripts/generate-fake-data.mjs.
Example: Add a custom "accommodationType" method:
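faker.accommodationType = () => {
  const types = ['apartment', 'house', 'hotel', 'bnb']
  // Faker v8 removed faker.random.arrayElement; on older Faker
  // releases, use that name instead of faker.helpers.arrayElement.
  return faker.helpers.arrayElement(types)
}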
Then use in your schema:
{
  "type": "string",
  "faker": "accommodationType"
}