Data Modeling Skill
Purpose
Create clear data models that document data structures, relationships, and business rules for system design and development.
When to Use
- •Designing new database structures
- •Documenting existing data for analysis
- •Defining data requirements in FRS
- •Integration design between systems
- •Data migration planning
Data Modeling Concepts
Entity Relationship Diagram (ERD)
Entities: Objects/things we store data about (e.g., Customer, Order, Product) Attributes: Properties of entities (e.g., customer_name, order_date) Relationships: How entities relate to each other
Relationship Types
One-to-One (1:1) Example: User ↔ UserProfile
- •One user has exactly one profile
- •One profile belongs to exactly one user
One-to-Many (1:N) Example: Customer → Orders
- •One customer can have many orders
- •One order belongs to one customer
Many-to-Many (M:N) Example: Products ↔ Categories
- •One product can be in many categories
- •One category can have many products
- •Requires junction/bridge table
Cardinality Notation (Crow's Foot)
code
||──────|| : One and only one (1:1) ||──────< : One-to-Many (1:N) >──────< : Many-to-Many (M:N) o|──────< : Zero or one to Many ||──────o< : One to Zero or Many
ERD Examples
E-commerce ERD (Mermaid)
mermaid
erDiagram
CUSTOMER ||--o{ ORDER : places
CUSTOMER {
uuid customer_id PK
string email UK
string first_name
string last_name
string phone
datetime created_at
}
ORDER ||--|{ ORDER_ITEM : contains
ORDER {
uuid order_id PK
uuid customer_id FK
string order_number UK
decimal subtotal
decimal tax
decimal shipping
decimal total
string status
datetime order_date
}
ORDER_ITEM {
uuid item_id PK
uuid order_id FK
uuid product_id FK
int quantity
decimal unit_price
decimal total_price
}
PRODUCT ||--o{ ORDER_ITEM : "included in"
PRODUCT {
uuid product_id PK
string sku UK
string name
text description
decimal price
int stock_quantity
string status
}
PRODUCT }|--|| CATEGORY : "belongs to"
CATEGORY {
uuid category_id PK
string name
uuid parent_id FK
}
CRM ERD
mermaid
erDiagram
ACCOUNT ||--o{ CONTACT : has
ACCOUNT ||--o{ OPPORTUNITY : has
ACCOUNT {
uuid account_id PK
string name
string industry
string website
int employee_count
decimal annual_revenue
uuid owner_id FK
}
CONTACT {
uuid contact_id PK
uuid account_id FK
string first_name
string last_name
string email UK
string phone
string title
}
OPPORTUNITY }o--|| CONTACT : "primary contact"
OPPORTUNITY {
uuid opportunity_id PK
uuid account_id FK
uuid contact_id FK
string name
string stage
decimal amount
date close_date
int probability
uuid owner_id FK
}
LEAD {
uuid lead_id PK
string first_name
string last_name
string email
string company
string source
int score
string status
}
Data Dictionary
Template
| Attribute | Data Type | Size | Required | Default | Description | Validation |
|---|---|---|---|---|---|---|
| customer_id | UUID | - | Yes | Auto-gen | Unique identifier | UUID format |
| VARCHAR | 255 | Yes | - | Customer email | Valid email | |
| status | ENUM | - | Yes | 'active' | Account status | active, inactive, suspended |
Example: Order Entity
| Attribute | Type | Size | Required | Default | Description | Rules |
|---|---|---|---|---|---|---|
| order_id | UUID | - | Yes | Auto | Primary key | Unique |
| order_number | VARCHAR | 20 | Yes | Generated | Display number | Format: ORD-YYYYMMDD-XXXX |
| customer_id | UUID | - | Yes | - | FK to Customer | Must exist |
| order_date | DATETIME | - | Yes | NOW() | When order placed | Cannot be future |
| status | ENUM | - | Yes | 'pending' | Order status | pending, processing, shipped, delivered, cancelled |
| subtotal | DECIMAL | 10,2 | Yes | 0.00 | Sum of items | >= 0 |
| tax | DECIMAL | 10,2 | Yes | 0.00 | Calculated tax | >= 0 |
| shipping_cost | DECIMAL | 10,2 | Yes | 0.00 | Shipping fee | >= 0 |
| total | DECIMAL | 10,2 | Yes | - | Final total | = subtotal + tax + shipping |
| shipping_address | JSON | - | Yes | - | Delivery address | Valid address |
| billing_address | JSON | - | Yes | - | Billing address | Valid address |
| notes | TEXT | - | No | NULL | Order notes | Max 2000 chars |
| created_at | DATETIME | - | Yes | NOW() | Record created | Immutable |
| updated_at | DATETIME | - | Yes | NOW() | Last modified | Auto-update |
Normalization
First Normal Form (1NF)
- •Eliminate repeating groups
- •Each cell contains single value
- •Each record is unique
❌ Bad: customer_phones = "123-456, 789-012" ✅ Good: Separate phone table with customer_id FK
Second Normal Form (2NF)
- •Meet 1NF
- •No partial dependencies (all non-key attributes depend on entire primary key)
Third Normal Form (3NF)
- •Meet 2NF
- •No transitive dependencies (non-key attributes don't depend on other non-key attributes)
❌ Bad: Order has customer_email (depends on customer_id, not order) ✅ Good: Get customer_email via Customer table join
Domain-Specific Data Patterns
E-commerce
- •Products with variants (SKU per variant)
- •Hierarchical categories
- •Shopping cart → Order transition
- •Address normalization
- •Price history tracking
ERP
- •Chart of Accounts structure
- •Multi-company data isolation
- •Master data (customer, vendor, product)
- •Transaction tables with journals
- •Audit trails
CRM
- •Lead → Contact → Account conversion
- •Activity logging (calls, emails, meetings)
- •Opportunity → Quote → Order pipeline
- •Campaign → Member → Response tracking
CDP
- •Customer identity resolution
- •Event/behavioral data (time-series)
- •Profile attributes (unified)
- •Segment membership
- •Consent tracking
Best Practices
✅ Do:
- •Use consistent naming conventions (snake_case)
- •Include audit fields (created_at, updated_at, created_by)
- •Define primary keys explicitly
- •Document foreign key relationships
- •Include data types and constraints
- •Consider soft deletes vs. hard deletes
- •Plan for data growth
❌ Don't:
- •Store calculated values (unless for performance)
- •Use ambiguous names
- •Skip documentation
- •Ignore data validation rules
- •Forget about NULL handling
Tools
- •Figma: Visual ERD design
- •Mermaid: Code-based diagrams in docs
- •dbdiagram.io: Quick ERD creation
- •Lucidchart: Professional diagrams
Next Steps
After data modeling:
- •Review with technical team
- •Include in FRS documentation
- •Create migration scripts
- •Plan data validation rules
- •Design API contracts based on data model
References
- •Database Normalization (1NF, 2NF, 3NF)
- •Entity Relationship Modeling
- •Data Dictionary Standards