json-schema-rs (contribution guide)
Supported and unsupported features are documented in the README. This skill focuses on how to contribute to the crate.
Purpose and Philosophy
The json-schema-rs crate has two goals:
- •Generate Rust structs from JSON Schema files (current)
- •Validate data against a JSON Schema file (future)
Design Principles
- •Testability-first:
generate_to_writer<W: Write>is the core API. Tests useVec<u8>orCursor<Vec<u8>>without file I/O. - •Deterministic output: Uses
BTreeMapfor alphabetical struct and field ordering. Same input always produces same output. - •Custom JSON Schema structs: Only model fields needed for the PoC. Use
serde deserialization with
#[serde(default)]for optional fields. - •Custom error enum:
JsonSchemaGenErrorwith manualDebug,Display,Error, andFromimpls. Do not use thiserror.
Architecture
For public API and feature set, see the README.
flowchart TD
subgraph PublicAPI [Public API]
FromFile[generate_from_file]
end
subgraph Core [Core - Testable]
ToWriter[generate_to_writer]
end
subgraph Inputs [Inputs]
SchemaFile[JSON Schema file]
SchemaStr[JSON Schema string]
end
subgraph Outputs [Outputs]
FileOut[File on disk]
VecOut["Vec u8 in memory"]
end
SchemaFile --> FromFile
FromFile --> ToWriter
SchemaStr --> ToWriter
ToWriter --> FileOut
ToWriter --> VecOut
- •
generate_to_writeris the single source of truth.generate_from_fileis a thin wrapper that reads the schema file and callsgenerate_to_writerwith aFile. - •schema.rs:
JsonSchemastruct withBTreeMap<String, Box<JsonSchema>>for recursive properties,r#enum: Option<Vec<serde_json::Value>>for enum support. - •codegen.rs:
collect_structs(recursive traversal),emission_order(topological sort for nested-first output),emit_struct(writes Rust code),emit_enum(writes Rust enums). Enums emitted first (alphabetically), then structs. - •Naming: Struct name from
titleif present and non-empty, else PascalCase of property key. Field sanitization: replace-with_. Add#[serde(rename = "...")]when Rust field name differs from JSON key.
Code organization
| Module | Responsibility |
|---|---|
lib.rs | Public API: generate_to_writer, generate_from_file. Re-exports JsonSchemaGenError, JsonSchema. |
schema.rs | Serde-deserializable JSON Schema types. Add new schema keywords here when extending support. |
codegen.rs | Struct/enum collection, emission order (topological sort), and Rust code emission. Extend collect_structs and emit_* when adding new types. |
error.rs | JsonSchemaGenError enum and From impls. No thiserror. |
main.rs | CLI binary (json-schema-gen); reads input path, writes output path. |
Where to add new support
To support a new JSON Schema keyword or type:
- •schema.rs: Add fields to
JsonSchemaif the new keyword must be parsed. Use#[serde(default)]andOptionso extra keys are ignored. - •codegen.rs: In
collect_structs(and helpers), handle the new type/keyword—e.g. match on the newtypevalue or read the new field. If the new type produces new structs or enums, ensure they are collected and included in emission order. Extendemit_struct/emit_enumor add newemit_*functions if you introduce new output shapes. - •Tests: Add or extend tests in
tests/integration-tests.rs(inlined schema + expected output). Always updatetests/schemas/complex-schema.jsonandtests/schemas/complex-schema-expected.rsso the file-based test covers the new behavior. Use fullassert_eq!(expected, actual); no partial checks. - •README example and examples directory: Keep the README’s example (the
JSON Schema and "Generated Rust" blocks) and the examples directory in sync.
When adding a new feature, update both the README example and
examples/readme_example_schema.json/examples/readme_example.rsto demonstrate it. The README example and the examples directory should stay consistent with each other and with supported features.
Contribution Guidelines
Testing
- •Inlined tests: Most tests have input and expected output inlined in the test method (no file loading).
- •File-based test: One test (complex schema) reads schema and expected output from files; generates actual output in memory; compares expected vs actual. When implementing new features, always update the complex schema and its expected output so the file-based test exercises the new behavior.
- •Unit tests: Add
#[cfg(test)]tests in codegen.rs (or other modules) for feature-specific logic where it makes sense. Unit tests should always be exhaustive: aim for one unit test per code path for the feature (e.g. one test for each possible outcome or branch), plus opposite pairings such as success vs failure, enabled vs disabled, bounds present vs absent, or fallback vs non-fallback. - •Assertions: Always compare full expected output against full actual
output. Never use partial checks like
actual.contains(...)or!actual.contains(...)—useassert_eq!(expected, actual)with complete expected strings. - •Integration tests: Tests live in
tests/integration-tests.rsand use the public API.
Code Conventions
- •Run
make lint testbefore completing any changes. - •Use
#[expect]not#[allow]for Clippy overrides. - •Never fail silently; log errors internally (customer-facing message can differ).
- •Follow existing patterns: custom Error enum, BTreeMap for ordering, explicit type annotations on all variables.
Adding New JSON Schema Support
- •Add fields to
JsonSchemain schema.rs only when needed for the PoC. - •Use
#[serde(default)]andOptionso extra keys in the JSON are ignored. - •For unsupported types, ignore rather than fail (PoC behavior; future versions will hard-fail).
Reference implementations
When implementing a new feature, always check how the json_schema, schemafy, and typify Rust crates implement it (docs and/or source). Compare our approach to theirs to see if our idea is better or worse; document the choice in the skill (e.g. under a "learned" subsection) or in the PR. This reduces duplicated design work and keeps the crate aligned with ecosystem conventions where appropriate.
- •json_schema: Schema representation and validation (not codegen). Use for
how schema concepts are modeled (e.g.
SimpleTypes::IntegervsNumber). - •schemafy: JSON Schema → Rust codegen; simple fixed type mappings.
- •typify: JSON Schema → Rust codegen; uses
minimum/maximumfor constraint-based integer type selection (u8, i32, etc.).
Post-Feature Knowledge Capture
At the end of development for a new feature, add any knowledge learned during that work session back to this skill. Examples: edge cases, design decisions, pitfalls, or conventions that would help future contributors. This keeps the skill up to date and reduces repeated discovery.
Required vs Optional (learned)
- •
requiredis an array of property names at each object schema level. When absent, all properties are optional per JSON Schema spec. Whenrequired: [], all properties are optional. - •Explicit optional (recognized but ignored): The per-property
optionalkeyword is parsed in schema.rs and explicitly ignored in codegen; required vs optional is determined only by the object-levelrequiredarray. This is for future-proofing: strict adherence to the JSON Schema spec and/or settings (e.g. allow non-standard fields, or interpret them) may be added later. - •File-based expected output:
complex-schema-expected.rsmust end with a trailing newline to matchgenerate_to_writeroutput (each struct ends withwriteln!(writer)?). - •Field ordering: BTreeMap yields alphabetical order by property key (e.g.,
optional_fieldbeforerequired_field).
Enum support (learned)
- •
enumin JSON Schema: array of allowed values. Only string enums supported; non-string values fall back toString. - •Variant naming:
to_rust_variant_nameproduces PascalCase (first char uppercase, rest lowercase per word). Invalid identifiers (e.g.,"123") getEprefix. - •Collision handling: When multiple JSON values map to same Rust variant
name (e.g.,
"PENDING","pending","Pending"all ->Pending), append_0,_1,_2to all colliding variants. - •Determinism: Sort enum values alphabetically before processing. Ensures
["markdown", "plain"]and["plain", "markdown"]produce identical output. - •Deduplication: Duplicate JSON values (e.g.,
["a", "a"]) deduplicated to one variant. - •Raw string literals in tests: Use
r#"..."#notr"..."when expected output contains]"(e.g.,#[serde(rename = "x")]) to avoid Rust parsing the]"as end of string.
Numbers (learned)
- •Mapping: We use
minimumandmaximum(when both present and valid) to choose the smallest integer type (i8, u8, i16, u16, i32, u32, i64, u64) or float type (f32 vs f64). Fallback: no/min/max or invalid →i64for integer,f64for number. No validation; type selection only. Float selection is range-based; f32 may lose precision for some decimals. - •Arrays:
resolve_array_item_typeuseschoose_integer_type/choose_number_typeforitems.typeofinteger/number.
Default support (learned)
- •Schema model: Use
DefaultKeyword(Absent | Present(Value)) in schema.rs to preserve JSONnull. Serde deserializesOption<Value>with null asNone, losing the distinction between absent key and"default": null. - •Two strategies:
UseTypeDefault→#[serde(default)]when the schema value equals the type's Default (false, 0, 0.0, "", [], null for optional).Custom { fn_name, rust_expr }→ generated function +#[serde(default = "fn")]for literal defaults. - •Optional + null: For optional fields with
default: null, use#[serde(default)]so missing key yieldsNone. - •Emission order: Enums first, then default functions (they may reference enums), then structs.
- •Custom default function: Returns
Some(expr)for optional fields,exprfor required. Must be emitted before the struct that uses it. - •Out of scope: Object defaults, non-empty array defaults.
Description support (learned)
- •Normalization: Empty or whitespace-only
descriptionis treated as absent; usenormalize_description()so we do not emit blank///lines. - •Multi-line: Emit one
///line per line of text viadescription.trim().lines(). Field doc comments use a 4-space prefix (emit_doc_comment(..., " ")) so they align with the field line. - •Placement: Object schema
description→ struct doc; enum schemadescription→ enum doc; property schemadescription→ field doc (same schema used for the property, including nested object descriptions on the field that references that object).
CLI and Makefile (for contributors)
Use these to run the generator locally when testing changes.
Binary
- •Name:
json-schema-gen - •Usage:
./target/release/json-schema-gen <input_path> <output_path> - •Args: Input path to JSON Schema file, output path for generated Rust file.
Makefile Target
make json_schema_gen input=<path-to-json-schema> output=<path-to-rust-output>
- •Requires both
inputandoutputvariables. Fails with usage help if either is missing. - •Builds in release mode.
- •Runs the binary with the provided paths after building.
Example:
make json_schema_gen input=json-schema-rs/tests/schemas/complex-schema.json output=output.rs
Key Files
| File | Purpose |
|---|---|
json-schema-rs/src/lib.rs | Public API: generate_to_writer, generate_from_file |
json-schema-rs/src/codegen.rs | Struct collection, emission order, code generation |
json-schema-rs/src/schema.rs | Custom JSON Schema structs (serde-deserializable) |
json-schema-rs/src/error.rs | JsonSchemaGenError enum |
json-schema-rs/src/main.rs | CLI binary |
json-schema-rs/tests/integration-tests.rs | Integration tests |
json-schema-rs/tests/schemas/complex-schema.json | File-based test input; update when adding features |
json-schema-rs/tests/schemas/complex-schema-expected.rs | File-based test expected output; update when adding features |
json-schema-rs/examples/readme_example_schema.json | Example schema; keep in sync with README example and readme_example.rs |
json-schema-rs/examples/readme_example.rs | Example binary (inlined schema); keep in sync with README and schema file |