Fix
Intent
Make risky or unclear code safe with the smallest sound, validated change.
Inputs
- •User request text.
- •Repo state (code + tests + scripts).
- •Validation signal (failing test/repro/log) OR a proof hook you create.
Outputs (chat)
- •Emit the exact sections in
Deliverable format (chat).
Hard rules (MUST / MUST NOT)
- •MUST default to review + implement (unless review-only is requested).
- •MUST triage and present findings in severity order: crash > corruption > logic.
- •MUST NOT claim done without a passing validation signal.
- •MUST NOT do product/feature work.
- •MUST NOT do intentional semantic changes without clarifying, except correctness tightening.
- •MUST resolve trade-offs using
Default policy (non-interactive). - •MUST produce a complete finding record for every acted-on issue:
- •counterexample
- •invariant_before
- •invariant_after
- •fix
- •proof
Default policy (non-interactive)
Use these defaults to maximize autonomy (avoid asking).
Priority order (highest first):
- •correctness + data safety + security
- •compatibility for PROVEN_USED behavior
- •performance
Definitions:
- •PROVEN_USED = behavior with evidence (see checklist below).
- •NON_PROVEN_USED = everything else.
PROVEN_USED evidence checklist (deterministic)
Goal: classify behavior as PROVEN_USED vs NON_PROVEN_USED without asking.
Algorithm:
- •Enumerate affected behavior tokens:
- •API symbols (functions/types/methods).
- •CLI flags/commands.
- •config keys / env vars.
- •file/wire formats (field names, JSON keys).
- •error codes/messages consumed by callers.
- •Collect evidence in this order (stop at first match):
- •User repro/signal references the token.
- •Tests assert on the token/behavior.
- •Docs/README/examples describe or demonstrate the token/behavior.
- •Repo callsites use the token (non-test, non-doc).
- •IF any evidence exists, THEN mark PROVEN_USED; ELSE mark NON_PROVEN_USED.
Evidence scan procedure (repo-local):
- •Tests: search
tests/,test/,__tests__/,spec/, and files matching*test*/*.spec.*. - •Docs/examples: search
README*,docs/,examples/, and*.md. - •Callsites: search remaining source files.
Tooling hint: prefer rg -n "<token>" with --glob filters.
Externally-used surface checklist (deterministic)
Treat a surface as external if ANY is true:
- •Token appears in docs/README/examples.
- •Token is a CLI flag/command, config key, env var, or file format field.
- •Token is exported/public API (examples: Rust
pub, TS/JSexport, Python in__all__, Go exported identifier).
If external use is plausible but uncertain:
- •Prefer an additive compatibility path (wrapper/alias/adapter) if small.
- •Ask only if a breaking change is unavoidable.
Compatibility rules:
- •MUST preserve PROVEN_USED behavior unless it is unsafe (crash/corruption/security). If unsafe, tighten and return a clear error.
- •For NON_PROVEN_USED behavior, treat it as undefined; tightening is allowed.
API/migration rules:
- •IF a fix touches an externally-used surface (use
Externally-used surface checklist), THEN prefer additive/backward-compatible changes (new option/new function/adapter) over breaking changes. - •IF a breaking change is unavoidable, THEN stop and ask.
Performance rules:
- •MUST avoid obvious asymptotic regressions in hot paths.
- •IF performance impact is plausible but unmeasurable locally,
THEN choose the smallest correctness-first fix and record the risk in
Residual risks / open questions(do not ask).
Autonomy gate (conviction)
Proceed without asking only if ALL are true:
- •SIGNAL: you have (or can create) a local repro/signal without product ambiguity.
- •CONTRACT: you can derive contract from repo evidence OR characterize current behavior without product ambiguity.
- •INVARIANT: you can state invariant_before and invariant_after.
- •DIFF: the change is localized and reviewable for the chosen strategy.
- •PROOF: at least one validation signal passes after changes.
If ANY gate fails:
- •Apply Defaults + contract derivation.
- •Ask only if still blocked.
Correctness tightening (default)
Definition: tightening = rejecting invalid/ambiguous states earlier, removing silent failure paths, or making undefined behavior explicit.
Rule:
- •IF tightening prevents crash/corruption/security issue, THEN apply it (even if PROVEN_USED), return a clear error, and record the compatibility impact.
- •ELSE IF tightening affects only NON_PROVEN_USED inputs/states, THEN apply it without asking.
- •ELSE (tightening might change PROVEN_USED behavior):
- •prefer a backward-compatible shape (adapter/additive API), OR
- •lock current behavior with a characterization test,
- •ask only if compatibility cannot be preserved.
Allowed tightening moves (examples only):
- •Parse/coerce: implicit coercion/partial parse -> explicit parse + error (JS implicit string->number; Rust
parse().ok()fallback). - •Fallbacks: warn+fallback/default-without-signal -> explicit error OR explicit "unknown"/"invalid".
- •Lossy conversion: truncation/clamp/wrap -> checked conversion + error (Rust
as; C casts; Go int conversions). - •Partial updates: multi-step mutation -> atomic/transactional boundary OR explicit partial result.
- •Ignored errors: ignore -> propagate with context; if intentionally ignored -> compensating signal (assert/test/metric/log).
- •Sentinels:
-1/0/""/None-> richer returns.
Defaults (use only when semantics-preserving for valid inputs)
Ownership / lifetimes
- •MUST inventory in-slice resources: allocations, handles, sockets, locks, txns, temp files.
- •For each resource, record: acquire_site, owner, release_action.
- •MUST ensure every exit releases exactly once.
- •MUST free/release ONLY via the proven owner/allocator/handle.
- •Never assume a "free" is safe without proving allocator/ownership.
- •Arena/bump allocators: per-object
freeis safe only if explicitly documented as permitted/no-op.
Validation signal selection
Algorithm:
- •IF the user provided a command, use it.
- •ELSE choose the cheapest local signal (no network) in this order:
- •README/QUICKSTART
- •
scripts/check/scripts/test - •
Makefile/justfile/Taskfile.yml
- •ELSE create a proof hook in this order:
- •focused regression/characterization test
- •boundary assertion that fails loudly
- •scoped + rate-limited diagnostic log tied to ONE invariant
Clarify before changes
Stop and ask ONLY if any is true:
- •Behavior is contradictory/product-sensitive AND cannot be derived from tests/docs/callsites AND cannot be characterized safely.
- •A breaking API change or irreversible migration is unavoidable (no backward-compatible path).
- •No local signal/proof hook can be found or created without product ambiguity.
Finding record schema (internal)
For every issue you act on, construct this record before editing:
- •location:
file:line - •severity:
crash|corruption|logic - •issue: <one sentence>
- •counterexample: <input/timeline>
- •invariant_before: <what is allowed/assumed today>
- •invariant_after: <what becomes guaranteed/rejected>
- •fix: <smallest sound fix that removes the bug-class>
- •proof: <test/assert/log + validation command + result>
Workflow (algorithm)
0) Preflight
- •Determine mode: review-only vs fix.
- •Define slice: entrypoint, inputs, outputs, state.
- •Select validation signal (or create proof hook).
1) Contract + baseline
- •Determine PROVEN_USED behavior:
- •Apply
PROVEN_USED evidence checklistto the behavior tokens affected by the slice.
- •Apply
- •Derive contract without asking (in this order):
- •tests that exercise the slice
- •docs/README/examples
- •callsites in the repo
- •if none: add a characterization test for current behavior
- •Write contract (1 sentence): "Working means …".
- •Run baseline signal once; record result.
2) Create initial findings
- •Enumerate candidate failure modes for the slice.
- •Rank crash > corruption > logic.
- •For each issue you will act on, create a finding record.
3) Mandatory scans (add/upgrade findings)
Run all scans for the touched slice.
3a) Unsoundness scan
For each hazard class that applies:
- •Identify the first failure point.
- •Provide a concrete counterexample (input/timeline).
- •Specify the smallest sound fix that removes the bug-class.
Hazard classes:
- •Nullability/uninitialized state; sentinel values.
- •Ownership/lifetime: leaks, double-free, use-after-close, lock not released.
- •Concurrency/order/time: races, lock ordering, reentrancy, timeouts/retries, TOCTOU.
- •Bounds/arithmetic: indexing/slicing, overflow/underflow, signedness.
- •Persistence/atomicity: partial writes, torn updates, non-transactional multi-step updates.
- •Encoding/units: bytes vs chars, timezone/locale, unit mismatches, lossy normalization.
- •Error handling: swallowed errors, missing context, silent fallback.
Language cues (examples only):
- •Rust:
unsafe,unwrap/expect,ascasts. - •JS/TS:
any, implicit coercions,undefinedflows. - •Python: bare
except, truthiness-based branching. - •C/C++: raw pointers, unchecked casts, manual
malloc/free.
3b) Invariant strengthening scan
For each invariant you can name:
- •State invariant_before and invariant_after.
- •Choose the strongest enforcement feasible:
- •compile-time/typestate
- •construction-time parser/smart constructor (return refined value)
- •runtime boundary checks
- •assertions/logs (last resort)
- •Prefer "parse/refine once at the boundary" over scattered validation.
- •Attach a proof hook (test/assert/log) that fails loudly on violation.
Discovery cues:
- •Same check duplicated across sites.
- •"should never happen" comments.
- •Silent defaulting/fallback.
- •Flag/boolean combos that only make sense in some states.
3c) Footgun scan + defusal
Trigger: you touched an API surface OR a caller can plausibly misuse the code.
For top-ranked misuse paths:
- •Provide minimal misuse snippet + surprising behavior.
- •Defuse by changing the surface:
- •options structs / named params
- •split booleans into enums or separate functions
- •explicit units/encodings
- •richer returns (no sentinels)
- •separate pure computation from effects
- •Lock with a regression test or boundary assertion.
3d) Complexity scan (risk-driven)
Rule: reshape only when it reduces risk and improves auditability of invariants/ownership.
Algorithm:
- •Identify hotspots: nesting, branch soup, cross-file hops, mixed responsibilities.
- •Separate essential vs incidental.
- •Apply in order: flatten → rename → extract → replace branches with data.
- •If simplification requires a missing invariant, strengthen invariants first.
- •Avoid premature abstraction; duplication is cheaper than the wrong abstraction.
4) Implement fixes (per finding)
For findings in severity order:
- •Implement the smallest sound fix that removes the bug-class.
- •Apply correctness tightening when allowed.
- •Ensure invariants + ownership hold on all paths.
- •Keep diff reviewable; avoid drive-by refactors.
5) Close the loop (required)
- •Run the chosen validation signal.
- •IF it fails:
- •update findings with the new counterexample,
- •apply the smallest additional fix,
- •re-run the SAME signal.
- •Repeat until the signal passes.
Deliverable format (chat)
Output exactly these sections.
Contract
- • <one sentence>
Findings (severity order) For each finding:
- •
<file:line>—<crash|corruption|logic>— <issue>- •Counterexample: <input/timeline>
- •Invariant (before): <what was assumed/allowed>
- •Invariant (after): <what is now guaranteed/rejected>
- •Fix: <smallest sound fix summary>
- •Proof: <test/assert/log + validation command> -> <ok/fail>
Changes applied
- •<file> — <rationale>
Validation
- •<cmd> -> <ok/fail>
Residual risks / open questions
- • <bullets>
Pitfalls
- •Vague advice without locations.
- •"Nice-to-have" refactors that don’t reduce risk.
- •Feature creep.
- •Asking for risk tolerance instead of applying the default policy.
- •Tightening presented as optional.
- •Ownership fixes without proven allocator/owner.
Activation cues
- •"resolve" / "fix" / "crash" / "data corruption"
- •"footgun" / "misuse" / "should never happen"
- •"invariant" / "lifetime" / "nullable surprise"
- •"too complex" / "branch soup" / "cross-file hops"