Skill: dedupe-entity
Goal
Merge duplicate companies across sources and name variants.
Inputs
- staging/leads_enriched.parquet
Outputs
- outputs/leads_master.csv
- outputs/dedupe_audit.csv (records why each merge was made)
Procedure
- Normalize names by stripping legal suffixes (e.g. Inc, Ltd, GmbH)
- Blocking keys: country + website_domain OR normalized_name_prefix
- Score candidate pairs with a Splink model, then send low-confidence merges to human review
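
The first two steps can be sketched in plain Python to show the intended behaviour before any Splink configuration. This is a minimal illustration under assumptions, not the pipeline's actual code: the suffix list, the `name` and `country` field names, the 4-character prefix length, and the helper names are all hypothetical. It also reads the blocking rule as two alternatives, (country + website_domain) or (country + normalized_name_prefix), which is one possible interpretation of the step above.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumed, non-exhaustive list of legal suffixes; extend for your sources.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "llc", "ltd", "limited", "corp",
    "corporation", "co", "gmbh", "ag", "sa", "bv", "plc",
}


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    text = name.lower().replace(".", "")   # "S.A." -> "sa"
    text = re.sub(r"[^\w\s]", " ", text)   # remaining punctuation -> space
    tokens = text.split()
    # Keep at least one token so a name like "Co" is not emptied.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def blocking_keys(record: dict, prefix_len: int = 4) -> set:
    """Return the blocking keys for one record (hypothetical field names)."""
    country = (record.get("country") or "").strip().upper()
    domain = (record.get("website_domain") or "").strip().lower()
    prefix = normalize_name(record.get("name") or "")[:prefix_len]
    keys = set()
    if country and domain:
        keys.add(("domain", country, domain))
    if country and prefix:
        keys.add(("prefix", country, prefix))
    return keys


def candidate_pairs(records: list) -> set:
    """Pair up record indices that share at least one blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        for key in blocking_keys(rec):
            blocks[key].append(idx)
    pairs = set()
    for members in blocks.values():
        pairs.update(combinations(members, 2))
    return pairs


if __name__ == "__main__":
    sample = [
        {"name": "Acme, Inc.", "country": "us", "website_domain": "acme.com"},
        {"name": "ACME Corporation", "country": "US", "website_domain": "Acme.com"},
        {"name": "Acme GmbH", "country": "DE", "website_domain": "acme.de"},
    ]
    print(candidate_pairs(sample))
```

Only the pairs produced by blocking would be passed on for scoring; pairs in different countries are never compared under this reading, so cross-border duplicates would need an additional rule.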