This skill focuses on logical modeling — what entities exist and how they relate. For physical implementation (DDL, indexes, migrations), use the appropriate database skill after completing the model.

Process Overview

1. Discovery: Entity Extraction

From Requirements to Entities

code

"Users can create posts. Posts have tags. Users comment on posts."
→ Nouns: Users, Posts, Tags, Comments

Critical Questions (Ask Before Modeling)

2. Analysis: Relationships

Cardinality Decision Matrix

Referential Actions

Rule: Default to RESTRICT. Use CASCADE only when child is worthless without parent.

Junction Table Decisions

3. Normalization Strategy

Quick Reference

Form	Rule	Violation Example	Fix
1NF	Atomic values	`tags: "a,b,c"`	Separate table
2NF	No partial dependencies	`order_items.product_name`	Move to products
3NF	No transitive dependencies	`orders.customer_city`	Move to customers

Decision: Normalize vs Denormalize

Safe Denormalization Patterns

Rule: Every denormalized field MUST document: source, sync method, staleness tolerance, recovery procedure.

4. Common Patterns (Logical Design)

See PATTERNS.md for decision guides. See PATTERNS_POSTGRESQL.md or PATTERNS_SQLITE.md for implementation.

Pattern	Use When	Key Decision
Soft deletes	Retention requirements, undo needed	Unique constraint handling
Audit trail	Compliance, debugging	Level of detail needed
Polymorphic	Comments/likes across multiple parent types	Integrity vs flexibility
Hierarchical	Categories, org charts, nested structures	Read vs write frequency
State machine	Entity has lifecycle stages	Valid transitions
Temporal	Price history, effective dating	Overlap prevention
Multi-tenancy	SaaS with multiple customers	Isolation level

5. Scaling Considerations

Partitioning Triggers

ID Strategy (Pick ONE for entire project)

Note: SQLite uses INTEGER PRIMARY KEY for auto-increment. PostgreSQL uses BIGINT + GENERATED ALWAYS AS IDENTITY.

6. Output Format

Required Deliverables

Template

markdown

## Target Database: [PostgreSQL / SQLite]

## Entities

### User
- id: PK
- email: unique, required, immutable after verification
- name: required, mutable
- created_at: immutable, auto

### Order  
- id: PK
- user_id: FK → User, required
  - ON DELETE: RESTRICT (orders are business records)
- status: enum(pending|confirmed|shipped|delivered|cancelled)
- total: decimal, required, immutable after confirmation

## Relationships

- User 1:N Order
- Order N:M Product (via order_items junction)

## Patterns Selected

| Pattern | Decision | See Implementation |
|---------|----------|-------------------|
| Soft delete | Yes, for User and Order | PATTERNS_[DB].md |
| Audit trail | Level 2 (actor tracking) | PATTERNS_[DB].md |
| ID strategy | UUIDv7 | PATTERNS_[DB].md |

## Decisions

| Decision | Rationale |
|----------|-----------|
| RESTRICT on Order→User | Orders are legal records, can't delete user with order history |
| Immutable order.total | Matches invoice, prevents post-hoc disputes |

## Denormalization

| Field | Source | Sync | Staleness OK |
|-------|--------|------|--------------|
| user.order_count | COUNT(orders) | Trigger | Yes (display only) |

## Next Step

Implement using `postgresql` or `sqlite` skill with PATTERNS_[DB].md reference.

Question	If No
Has its own identity?	It's an attribute
Has multiple attributes?	It's just a column
Queried independently?	Consider embedding
Changes independently?	Embed with parent

Action	Use When	Example
RESTRICT	Default choice - deletion should be explicit	User with orders
CASCADE	Child meaningless without parent	Order → Order items
SET NULL	Child can exist independently	Post → Author left

Pattern	When	Sync	Risk
Cached aggregates	Displayed often, expensive to compute	Trigger	Count drift
Snapshot at event	Historical accuracy required	Copy once, never update	None
Materialized path	Hierarchical queries frequent	Update on parent change	Path corruption

Signal	Consider
Table > 100M rows	Horizontal partitioning
Some columns rarely accessed	Vertical partitioning
Clear time-based access pattern	Time-based partitioning

Type	Use When
Auto-increment	Single database, simple setup
UUID v4	Distributed systems, no ordering needed
UUID v7	Distributed + need time-based sorting

data-modeling

Data Modeling