AgentSkillsCN

Join Safety

加入安全防护

SKILL.md

name: join-safety description: Prevent join explosions and silent definition changes by adding key + rowcount assertions around merges.

When to use

Any time you change a join, filter, crosswalk, aggregation, or key definition.

Checklist

  • State intended keys (left + right).
  • Assert uniqueness on the side(s) expected to be unique.
  • If many-to-many is possible: aggregate first OR build an explicit crosswalk.
  • Log before/after row counts and the share of unmatched keys.
  • Add a regression test or a small invariant check (e.g., totals within tolerance).

R helpers (preferred pattern)

Add (or reuse) tiny helpers in R/:

  • assert_unique(dt, keys, label)
  • assert_row_change(n_before, n_after, upper_mult, label)
  • summarize_join(left, right, keys, label)

Output to logs

Always log:

  • input paths
  • key uniqueness results
  • row counts
  • unmatched rates

“Stop and ask” triggers

If any of these happen, stop and ask the user:

  • row count increases by > 5–10% unexpectedly
  • uniqueness fails where it “should” hold
  • unmatched share rises substantially