Data Sanity Check
When to use
Use after data loading, feature generation, or merging datasets.
Inputs
- DataFrame(s) or array(s)
- Expected schema (optional)
Steps
- Report shapes and dtypes.
- Check null counts and percentages.
- Validate value ranges for critical columns.
- Detect duplicates on key columns.
- Flag unexpected columns or missing required fields.
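
The steps above can be sketched with pandas roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `sanity_check`, its parameters (`key_cols`, `ranges`, `required_cols`), and the layout of the returned report are all hypothetical, and the expected schema is assumed to be a plain list of required column names.

```python
import pandas as pd


def sanity_check(df, key_cols=None, ranges=None, required_cols=None):
    """Hypothetical helper: run the listed checks and return a report dict."""
    key_cols = key_cols or []
    ranges = ranges or {}              # assumed form: {column: (min, max)}, inclusive
    required_cols = required_cols or []

    report = {
        # Shape and dtypes.
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        # Null counts and percentages per column.
        "null_counts": df.isna().sum().to_dict(),
        "null_pct": (df.isna().mean() * 100).round(2).to_dict(),
    }

    # Range violations: count non-null values outside [lo, hi].
    violations = {}
    for col, (lo, hi) in ranges.items():
        if col in df.columns:
            values = df[col].dropna()
            violations[col] = int(((values < lo) | (values > hi)).sum())
    report["range_violations"] = violations

    # Duplicates on key columns: rows that repeat an earlier key.
    present_keys = [c for c in key_cols if c in df.columns]
    report["duplicate_rows"] = (
        int(df.duplicated(subset=present_keys).sum()) if present_keys else 0
    )

    # Schema: missing required columns and columns not in the expected list.
    report["missing_columns"] = [c for c in required_cols if c not in df.columns]
    report["unexpected_columns"] = (
        [c for c in df.columns if c not in required_cols] if required_cols else []
    )
    return report


# Example with made-up data.
df = pd.DataFrame(
    {"id": [1, 2, 2, 4], "age": [25, None, 130, 40], "extra": ["a", "b", "c", "d"]}
)
report = sanity_check(
    df,
    key_cols=["id"],
    ranges={"age": (0, 120)},
    required_cols=["id", "age", "name"],
)
print(report)
```

Returning a plain dict keeps the checks separate from any pass/fail decision, so the same report can feed both a printed summary and a later failure check.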
Outputs
- Summary table (shape, dtypes)
- Anomaly report (nulls, range violations, duplicates)
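
One possible shape for the summary table is a small DataFrame with one row per input column. The sketch below assumes pandas; the helper name `summary_table` and the chosen columns are illustrative, not a prescribed format.

```python
import pandas as pd


def summary_table(df):
    """Hypothetical helper: one row per column with dtype and null statistics."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "null_count": df.isna().sum(),
            "null_pct": (df.isna().mean() * 100).round(2),
        }
    )


# Example with made-up data.
df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, None, 0.9]})
summary = summary_table(df)
print(f"shape: {df.shape}")
print(summary)
```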
Failure conditions
- Critical nulls or schema mismatch
- Out-of-range values beyond tolerance
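
The failure conditions could be turned into an explicit pass/fail step along these lines. This is a sketch only: the helper `find_failures`, the report keys it reads, and the interpretation of "tolerance" as a maximum number of out-of-range values per column are assumptions, since the text above does not define them.

```python
def find_failures(report, critical_cols, max_range_violations=0):
    """Hypothetical helper: return failure messages; an empty list means pass."""
    failures = []

    # Schema mismatch: any required column that is absent.
    for col in report.get("missing_columns", []):
        failures.append(f"schema mismatch: required column '{col}' is missing")

    # Critical nulls: any null in a column marked as critical.
    null_counts = report.get("null_counts", {})
    for col in critical_cols:
        count = null_counts.get(col, 0)
        if count > 0:
            failures.append(f"critical nulls: column '{col}' has {count} null value(s)")

    # Out-of-range values beyond the (assumed) per-column tolerance.
    for col, count in report.get("range_violations", {}).items():
        if count > max_range_violations:
            failures.append(
                f"range violation: column '{col}' has {count} value(s) out of range"
            )
    return failures


# Example with a made-up report dict.
report = {
    "missing_columns": ["name"],
    "null_counts": {"id": 0, "age": 1},
    "range_violations": {"age": 1},
}
failures = find_failures(report, critical_cols=["id", "age"])
for message in failures:
    print(message)
```

A caller can then decide whether to raise an exception or stop a pipeline when `failures` is non-empty.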