Skill: cc-data-organization

STOP - Priority 1: Never Skip

Item	Why Critical
No magic numbers in business logic	Source of silent bugs
Currency uses integer cents, never float	Financial bugs are lawsuits
No float == comparisons	Non-deterministic failures
Variables initialized before use	Undefined behavior
Boolean naming is unambiguous	Logic inversion bugs

Skipping Priority 1 items is NEVER acceptable. They represent latent defects that will manifest later.

Modes

CHECKER

Purpose: Execute data organization checklists against code

Triggers:
•"review my variable declarations"
•"check for magic numbers"
•"review data type usage"
•"check my variable names"

Non-Triggers:
•"what type should I use for X" -> APPLIER
•"how should I name this variable" -> APPLIER
•"fix these magic numbers" -> TRANSFORMER

Checklist: See checklists.md

Metrics: See hard-data.md for Span/Live Time measures (goal: minimize both)

Output Format:

| Item | Status | Evidence | Location |
|------|--------|----------|----------|

Severity:
•VIOLATION: Fails checklist item
•WARNING: Partial compliance
•PASS: Meets requirement

APPLIER

Purpose: Guide data type selection, variable naming, and structure design

Triggers:
•"what data type should I use for..."
•"how should I name this variable"
•"best practice for enums/constants"
•"how should I organize this data"

Non-Triggers:
•"review my types" -> CHECKER
•"fix this" -> TRANSFORMER
•"audit my code" -> CHECKER

Produces: Type recommendations, naming conventions, enum patterns, constant definitions, structure designs

Constraints:
•[p.308] Eliminate semantic literals - Replace business values (86400, 12, 0.07) with named constants. Loop bounds 0, 1 and array indices are typically fine.
•[p.295] For currency: integer cents or BCD, never float
•[p.306] Enums (language-dependent):
    •C/C++: Reserve 0 for invalid, define First/Last bounds
    •TypeScript string enums: No zero-reservation needed (no uninitialized risk)
    •Rust/Kotlin: Leverage exhaustive matching instead of bounds checks
•[p.259] Minimize scope: Declare variables in innermost block where all usages occur. Balance with testability—sometimes slightly wider scope enables testing.
•[p.263] Names describe the entity clearly: Reader should understand purpose without searching for definition. Examples: d (bad) → data (vague) → userData (better) → validatedUserSubmission (good for complex entity)
•[p.279] Problem Orientation: names refer to problem domain (employeeData, printerReady), not computing (inputRec, bitFlag)
•[p.263] Name length heuristic: 2-4 words, long enough to describe purpose, short enough to scan. Research shows 10-16 chars minimizes debugging effort [Gorla et al. 1990], but this is guidance, not a hard rule.
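The currency and named-constant constraints above can be sketched together as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: `TAX_RATE_BASIS_POINTS`, `taxCents`, and `formatCents` are hypothetical names, the 7% rate reuses the 0.07 literal from the list, and rounding to the nearest cent with `Math.round` is an assumed policy that a real system would take from its business rules.

```typescript
// Hypothetical constants: a 7% rate stored as integer basis points,
// so no fractional literal appears in the money arithmetic.
const TAX_RATE_BASIS_POINTS = 700;
const BASIS_POINTS_PER_WHOLE = 10_000;
const CENTS_PER_DOLLAR = 100;

// Tax on an integer amount of cents, rounded to the nearest cent.
// Assumed rounding policy: Math.round.
function taxCents(amountCents: number): number {
  if (Math.floor(amountCents) !== amountCents) {
    throw new RangeError("amountCents must be an integer number of cents");
  }
  return Math.round((amountCents * TAX_RATE_BASIS_POINTS) / BASIS_POINTS_PER_WHOLE);
}

// Convert to a display string only at the output boundary.
function formatCents(amountCents: number): string {
  const sign = amountCents < 0 ? "-" : "";
  const absolute = Math.abs(amountCents);
  const dollars = Math.floor(absolute / CENTS_PER_DOLLAR);
  const cents = absolute % CENTS_PER_DOLLAR;
  const paddedCents = cents < 10 ? "0" + cents : String(cents);
  return `${sign}$${dollars}.${paddedCents}`;
}

const priceCents = 1999;
const totalCents = priceCents + taxCents(priceCents); // 1999 + 140 = 2139
```

Here `formatCents(totalCents)` yields `"$21.39"`. The amounts stay integers from input to storage; the only rounding happens in one named place, which makes the rounding policy reviewable.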

TRANSFORMER

Purpose: Fix data organization violations

Triggers:
•CHECKER findings with VIOLATION status
•"replace magic numbers with constants"
•"fix float comparison"
•"refactor these globals"

Non-Triggers:
•Large refactorings beyond data organization -> cc-refactoring-guidance
•Control flow restructuring -> cc-control-flow-quality

Input -> Output:
•Magic 86400 -> SECONDS_PER_DAY = 86400
•if (a == b) floats -> if (Math.abs(a-b) < EPSILON)
•true, false, true params -> enum values
•Unstructured variables -> grouped structure
•Direct global access -> access routines

Preserves: Behavior, unrelated code

Verification: Re-run CHECKER; VIOLATION count = 0
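A few of the TRANSFORMER rewrites might look like the sketch below. The names `nearlyEqual`, `saveReport`, `Overwrite`, `Compression`, and `DAYS_PER_WEEK` are hypothetical, the `EPSILON` value is an assumed tolerance, and a const object with a union type stands in for an enum here; a TypeScript string enum would serve the same purpose.

```typescript
// Magic 86400 -> named constant.
const SECONDS_PER_DAY = 86_400;
const DAYS_PER_WEEK = 7;

// Assumed absolute tolerance; the right value depends on the data's magnitude.
const EPSILON = 1e-9;

// if (a == b) on floats -> tolerance comparison.
function nearlyEqual(a: number, b: number, epsilon: number = EPSILON): boolean {
  return Math.abs(a - b) < epsilon;
}

// true, false, true params -> named values (hypothetical names).
const Overwrite = { Allow: "allow", Forbid: "forbid" } as const;
type Overwrite = (typeof Overwrite)[keyof typeof Overwrite];

const Compression = { On: "on", Off: "off" } as const;
type Compression = (typeof Compression)[keyof typeof Compression];

// Before the rewrite this call site read: saveReport(data, false, true)
function saveReport(data: string, overwrite: Overwrite, compression: Compression): string {
  return `saved ${data.length} chars (overwrite=${overwrite}, compression=${compression})`;
}

// Summing 0.1 ten times shows why the == form fails.
function sumTenths(): number {
  let sum = 0;
  for (let i = 0; i < 10; i++) {
    sum += 0.1;
  }
  return sum;
}

const secondsPerWeek = DAYS_PER_WEEK * SECONDS_PER_DAY;
const summary = saveReport("q3 totals", Overwrite.Forbid, Compression.On);
```

With IEEE 754 doubles, `sumTenths() === 1.0` is false while `nearlyEqual(sumTenths(), 1.0)` is true. An absolute tolerance suits values near 1; for very large or very small magnitudes a relative tolerance is usually the better choice.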

Rationalization Counters

Excuse	Reality
"Everyone knows what 12 means"	Named constants aid maintenance [Glass 1991]
"Floats are close enough for =="	0.1 added 10 times gives 0.9999999999999999 in IEEE 754 double precision, not 1.0
"Magic numbers are faster to type"	Debugging hard-coded literals takes far longer
"I don't need custom types"	One typedef change vs hundreds of declarations
"Short names are faster to type"	Code read far more than written; favor read-time convenience
"Global variables are more convenient"	Convenience writing trades against difficulty reading, debugging, modifying

Sunk Cost Counters

For resisting changes to "working" code:

Excuse	Reality
"It works, why change it?"	Violations are latent defects; "works" means "hasn't failed yet"
"I already invested time in this"	Time invested in bad code is lost regardless; fix now or pay more later
"Refactoring will break things"	Code with violations is already broken; you just haven't discovered how yet
"Currency has always used floats here"	Every penny calculation is a potential lawsuit
"We've had no bugs from these magic numbers"	You have likely had such bugs and attributed them to other causes
"The code passed review before"	Past reviews missed issues; evidence now shows violations

Success-Bias Warning

Past success does NOT predict future safety.

Violations that "worked for years" fail when:

•Edge cases finally occur (currency rounding in new scenarios)
•Scale changes (global variable contention under load)
•Maintenance happens (magic numbers misunderstood by new developers)
•Requirements shift (hard-coded values need changing)

Every checklist item applies regardless of past success. "Worked until it didn't" examples fill bug databases.

Modern Data Types Coverage

Beyond Code Complete's C-era focus:

Concurrent Access

When data may be accessed from multiple threads/async contexts:

•Identify shared state - Mark variables accessed across thread boundaries
•Access routines are mandatory - Never expose shared data directly
•Consider immutability - Immutable data eliminates race conditions by design
•Document thread safety - Comment whether type/routine is thread-safe
•Violations: Data races, torn reads, lost updates
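The access-routine and immutability points can be illustrated with a small sketch. `InventoryStore` and its methods are hypothetical names, and the example assumes a single-threaded event loop (async callbacks, not worker threads with shared memory, which would need locks or atomics instead).

```typescript
// Shape of the shared data; Readonly turns accidental writes into compile errors.
type InventoryState = Readonly<{ itemCount: number }>;

// Hypothetical store: shared state is reachable only through the access
// routines below. Thread safety: safe across async callbacks on one event
// loop; NOT safe across worker threads.
class InventoryStore {
  private state: InventoryState = Object.freeze({ itemCount: 0 });

  // Read access: callers get a frozen snapshot, never a mutable reference.
  snapshot(): InventoryState {
    return this.state;
  }

  // Write access: the read-modify-write runs synchronously, so no await
  // can interleave between the read and the write (no lost update).
  addItems(count: number): InventoryState {
    this.state = Object.freeze({ itemCount: this.state.itemCount + count });
    return this.state;
  }
}

const store = new InventoryStore();
const snapshotBefore = store.snapshot();
store.addItems(2);
store.addItems(3);
const snapshotAfter = store.snapshot();
```

Each write replaces the state object rather than mutating it, so `snapshotBefore` still reports 0 items after the updates while `snapshotAfter` reports 5. Readers holding an old snapshot never see a half-applied change.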

Nullable/Optional Types

Modern languages use Option<T>, Maybe, T? instead of null pointers:

•Prefer non-nullable by default - Make nullability explicit and intentional
•Handle all cases - Exhaustive matching on Option/Maybe types
•Avoid null as "not found" - Use Option types or result types instead
•Document null semantics - When null is valid, document what it means
•C-style pointer guidance still applies to unsafe code
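As one way to make "not found" explicit and force every case to be handled, the sketch below models an option type as a discriminated union. The `Option` type, the `some` and `none` helpers, and the lookup table are all hypothetical; libraries and other languages provide their own equivalents.

```typescript
// Hypothetical Option type as a discriminated union.
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

const none: Option<never> = { kind: "none" };

function some<T>(value: T): Option<T> {
  return { kind: "some", value };
}

// Illustrative data source; a missing key reads as undefined.
const userNamesById: { [id: number]: string | undefined } = { 1: "Ada", 2: "Grace" };

// "Not found" is part of the return type instead of a null return value.
function findUserName(id: number): Option<string> {
  const name = userNamesById[id];
  return name === undefined ? none : some(name);
}

function describeUser(id: number): string {
  const result = findUserName(id);
  switch (result.kind) {
    case "some":
      return `user ${id} is ${result.value}`;
    case "none":
      return `user ${id} not found`;
    default: {
      // Exhaustiveness check: adding a new variant to Option makes this
      // assignment fail to compile until the switch handles it.
      const unreachable: never = result;
      return unreachable;
    }
  }
}
```

`describeUser(1)` returns `"user 1 is Ada"` and `describeUser(99)` returns `"user 99 not found"`. The caller cannot read `value` without first checking `kind`, which is the property a bare `null` return lacks.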

Temporal Data

Dates and times are a common bug source:

•Store timestamps in UTC - Convert to local only for display
•Use timezone-aware types - Never use naive datetime for user-facing data
•Be explicit about precision - Seconds, milliseconds, nanoseconds?
•Name with time unit - timeoutMs, durationSeconds, not just timeout
•Avoid magic time values - 86400 → SECONDS_PER_DAY
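A short sketch of these conventions, using only the built-in `Date` API. The function names (`addDays`, `toStoredString`, `formatForDisplay`) and the sample timestamp are hypothetical; the example assumes millisecond precision is sufficient.

```typescript
const MS_PER_SECOND = 1000;
const SECONDS_PER_DAY = 86_400;
const MS_PER_DAY = SECONDS_PER_DAY * MS_PER_SECOND;

// Stored form: milliseconds since the Unix epoch, UTC. The unit and the
// zone are both in the name. (Date.UTC months are zero-based: 0 = January.)
const createdAtUtcMs = Date.UTC(2024, 0, 15, 12, 0, 0);

// Arithmetic stays on the UTC timeline.
function addDays(utcMs: number, days: number): number {
  return utcMs + days * MS_PER_DAY;
}

// Persist as an ISO 8601 string with an explicit Z (UTC) suffix.
function toStoredString(utcMs: number): string {
  return new Date(utcMs).toISOString();
}

// Convert to a local zone only at the display boundary.
function formatForDisplay(utcMs: number, timeZone: string): string {
  return new Date(utcMs).toLocaleString("en-US", { timeZone });
}

const expiresAtUtcMs = addDays(createdAtUtcMs, 30);
```

`toStoredString(expiresAtUtcMs)` gives `"2024-02-14T12:00:00.000Z"`, and `formatForDisplay(expiresAtUtcMs, "America/New_York")` renders the same instant in a local zone. Note that `addDays` adds fixed 24-hour periods; "same wall-clock time N days later" in a zone with daylight saving is a different operation and needs calendar-aware arithmetic.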

Security-Sensitive Data

Secrets, tokens, API keys require special handling:

•Clear from memory after use - Don't leave secrets in variables longer than needed
•Never log sensitive data - Redact in all log statements
•Use dedicated types - SecureString, SensitiveData wrappers
•Limit scope aggressively - Shortest possible lifetime
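One possible shape for a dedicated secret type is sketched below. `createSecret` and the `Secret` interface are hypothetical, not an existing library API. The value lives in a closure so that string conversion and JSON serialization only ever produce a redaction marker. JavaScript strings are immutable, so `dispose` can only drop the reference; it cannot guarantee the bytes are erased from memory.

```typescript
const REDACTED = "[REDACTED]";

// Hypothetical wrapper interface for secrets, tokens, and API keys.
interface Secret {
  reveal(): string;   // explicit, greppable access to the raw value
  dispose(): void;    // drop the reference as soon as the value is no longer needed
  toString(): string; // string conversion is redacted
  toJSON(): string;   // JSON.stringify output is redacted
}

function createSecret(initialValue: string): Secret {
  // Held in a closure, so it is not an enumerable property of the object.
  let value: string | null = initialValue;
  return {
    reveal(): string {
      if (value === null) {
        throw new Error("secret already disposed");
      }
      return value;
    },
    dispose(): void {
      value = null;
    },
    toString(): string {
      return REDACTED;
    },
    toJSON(): string {
      return REDACTED;
    },
  };
}

const apiToken = createSecret("example-token-value");
const logLine = "calling service with token " + String(apiToken);
const serialized = JSON.stringify({ token: apiToken });
```

Here `logLine` ends in `[REDACTED]` and `serialized` is `{"token":"[REDACTED]"}`, so accidental logging or serialization does not leak the value; only an explicit `reveal()` call does, and it throws after `dispose()`.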

Chain

After	Next
Data organization verified	cc-control-flow-quality (CHECKER)