Production Development Principles

These production development principles apply to any project.

Philosophy: Simple, scalable, maintainable. Not MVP shortcuts, not enterprise bloat.

Golden Rules (ALWAYS Follow)

•If it works reliably, ship it - Perfect is still the enemy of done
•YAGNI until you need it - Don't build for hypothetical futures
•Simple files > Complex architecture - Start simple, extract when needed
•Direct > Abstract - Prefer direct solutions, abstract when patterns emerge
•Start hardcoded, extract when needed - Make it configurable after the 3rd use
•Quality matters now - We have paying customers, but over-engineering still hurts
•200 lines before extracting - Functions/features can be larger now, but extract at 200 lines

Reality Check (Where We Are)

•Your current customer base: Optimize for this scale, not millions
•Production beta: Real customers, but still learning
•Multi-tenant: Each customer has unique needs
•Speed + Reliability: Ship fast, but don't break things
•Technical debt payback: Fix issues that impact customers NOW

Patterns to Avoid (Nuanced for Our Scale)

❌ Avoid Unless Justified

These patterns add complexity. Only use if you meet the criteria:

•Factory patterns: Avoid unless you have 5+ different implementations
•Dependency injection frameworks: Avoid unless the team has 5+ developers (Go interfaces are fine)
•Abstract base classes: Avoid unless you have 3+ concrete implementations
•Event sourcing / CQRS: Avoid event sourcing unless you have audit requirements or >10,000 events/day; avoid CQRS unless read/write performance is a measured bottleneck
•Microservices: Avoid unless monolith is >100k LOC or team >10 developers
•Complex repository patterns: Avoid unless you have 5+ data sources (direct queries + transactions are fine)
•Service meshes: Avoid unless you have 20+ services
•API gateways: Avoid unless you have 10+ backend services (nginx is enough)
•Custom frameworks: Avoid unless you're doing the same thing 10+ times

🚫 Still Completely Banned

•Premature optimization: Never optimize before measuring
•Speculative generality: Never build for "what if" scenarios
•Gold plating: Never add features "because it's cool"
•Resume-driven development: Never use tech "to learn it"

Patterns to Use (Production-Ready)

✅ Strongly Encouraged:

•Simple functions with clear names
•Direct database queries with transactions for multi-step operations
•Configuration files/env vars (not hardcoded secrets)
•Defensive coding (validation, error handling, retries)
•Logging and monitoring (errors, performance, business metrics)
•Inline code when <3 uses, extract when 3+ uses (Rule of Three)
•Database migrations (not raw SQL changes)
•Basic caching when queries are measured as slow (>500ms)
•Polling with smart intervals (not webhooks unless push is required)
•Functions up to 200 lines (extract at 200, not 50)
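
As an illustration of "direct database queries with transactions for multi-step operations", here is a minimal sketch using Python's built-in sqlite3 module. The tickets and ticket_events tables and the close_ticket and make_demo_db helpers are hypothetical names chosen for this example, not part of any real schema.

```python
import sqlite3


def close_ticket(conn: sqlite3.Connection, ticket_id: int, note: str) -> None:
    """Close a ticket and record the event atomically (hypothetical schema)."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the two writes either both happen or neither does.
    with conn:
        cur = conn.execute(
            "UPDATE tickets SET status = 'closed' WHERE id = ?", (ticket_id,)
        )
        if cur.rowcount == 0:
            raise ValueError(f"ticket {ticket_id} not found")
        conn.execute(
            "INSERT INTO ticket_events (ticket_id, note) VALUES (?, ?)",
            (ticket_id, note),
        )


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with one open ticket (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE ticket_events (ticket_id INTEGER NOT NULL, note TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO tickets (id, status) VALUES (1, 'open')")
    conn.commit()
    return conn
```

If the second write fails, the status update is rolled back with it, which is the property a multi-step operation needs. No repository layer is involved: the function is the query.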

Production Concerns (NEW)

🚨 Must Haves for Production

•Error Handling
- All external calls wrapped in try/catch
- Errors logged with context (user, customer, operation)
- User-friendly error messages
- Retry logic for transient failures (network, rate limits)
•Data Integrity
- Use database transactions for multi-step operations
- Validate inputs before writing to database
- Backups run daily (already set up)
- Soft deletes for critical data (tickets, users)
•Observability
- Log all errors with stack traces
- Log slow operations (>2s)
- Monitor API response times
- Track business metrics relevant to your product
•Security
- Never log secrets/API keys (log only the last 4 characters if you need to identify a key)
- Validate + sanitize user inputs
- Rate limiting on public endpoints
- Keep dependencies updated (monthly review)
•Multi-tenancy (if applicable)
- Every query includes tenant ID filter
- Test with multiple tenants
- No cross-tenant data leaks
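
One way to make "every query includes tenant ID filter" hard to forget is to require the tenant ID as a function argument. The sketch below assumes a hypothetical tickets table with a tenant_id column; list_tickets and make_two_tenant_db are illustrative names, and sqlite3 stands in for whatever database you actually use.

```python
import sqlite3


def list_tickets(conn: sqlite3.Connection, tenant_id: int) -> list:
    """Return (id, title) rows that belong to exactly one tenant."""
    # tenant_id is a required argument, so a caller cannot skip the filter.
    # The value is passed as a parameter, never formatted into the SQL string.
    rows = conn.execute(
        "SELECT id, title FROM tickets WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return rows.fetchall()


def make_two_tenant_db() -> sqlite3.Connection:
    """Build an in-memory database holding tickets for two tenants (demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets ("
        "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO tickets (id, tenant_id, title) VALUES (?, ?, ?)",
        [
            (1, 1, "Login fails"),
            (2, 2, "Export is slow"),
            (3, 2, "Billing question"),
        ],
    )
    conn.commit()
    return conn
```

Seeding the test database with at least two tenants, as here, is what lets a test catch a cross-tenant leak: a query that forgets the filter returns three rows instead of one.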

⚖️ Production vs Speed Balance

Ship Fast (do these)

•Inline validation (no validation framework)
•Direct SQL queries (no ORM)
•Environment variables for config
•Simple retry logic (3 attempts, exponential backoff)
•File-based logs (rotate daily)
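
The "simple retry logic (3 attempts, exponential backoff)" item can be sketched in a few lines without a retry library. The name with_retries, the 0.5 second base delay, and the choice of ConnectionError and TimeoutError as "transient" are assumptions for illustration; adjust them to the errors your external calls actually raise.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Delays double each time: base_delay, then 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    # Only reachable when attempts < 1, which is a caller mistake.
    raise ValueError("attempts must be at least 1")
```

Errors outside retry_on, such as a KeyError caused by a bug, propagate immediately instead of being retried. The sleep parameter exists only so tests can run without waiting.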

Take Time (do these right)

•Database migrations (use migrate tool)
•Authentication/authorization (test thoroughly)
•Data export/import (customers depend on this)
•Email delivery (use queue + retries)
•Payment processing (never cut corners)

Decision Framework (Updated for Production)

Before ANY architectural decision, ask:

Question 1: Is this reliable for production?

•If yes: Proceed
•If no: What's missing? (error handling, validation, logging)

Question 2: Will 100 customers break this?

•If no: Ship it
•If yes: What's the bottleneck? Add specific fix (caching, indexing, pagination)

Question 3: Can another dev maintain this in 6 months?

•If yes: Good complexity level
•If no: Add comments, extract functions, simplify

Question 4: What's the blast radius if this fails?

•One user: Ship it, fix if it breaks
•One customer: Add error handling + logging
•All customers: Add retry logic, monitoring, fallbacks

When to Add Abstraction (NEW)

Triggers for Abstraction

Extract to function/class when:

•Rule of Three: Same logic used 3+ times
•Domain complexity: Business logic gets complicated (AI logic, ticket routing)
•Testing: Hard to test without extraction
•Multiple implementations: 3+ ways to do something (Zendesk, Jira, email)
•File size: Function/feature exceeds 200 lines
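
To make the Rule of Three trigger concrete, the sketch below shows the state after extraction: one validation function shared by three handlers. The handler names, the payload shape, and the 200-character limit are hypothetical. With only one or two callers, the same checks would stay inline.

```python
def validate_ticket_title(raw: str) -> str:
    """Return a cleaned title or raise ValueError (hypothetical rules)."""
    title = raw.strip()
    if not title:
        raise ValueError("title is required")
    if len(title) > 200:
        raise ValueError("title must be 200 characters or fewer")
    return title


# Three handlers now need identical checks, which is the point at which
# extracting the shared function pays for itself.
def create_ticket(payload: dict) -> dict:
    return {"action": "create", "title": validate_ticket_title(payload.get("title", ""))}


def rename_ticket(payload: dict) -> dict:
    return {"action": "rename", "title": validate_ticket_title(payload.get("title", ""))}


def import_ticket(payload: dict) -> dict:
    return {"action": "import", "title": validate_ticket_title(payload.get("title", ""))}
```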

Extraction Examples

✅ Good Abstractions (Justified)

•Extract after 3rd duplicate: A validation function used by 3+ handlers
•Extract complex business logic: When a single function exceeds 200 lines with conditional logic
•Extract when 3+ implementations exist: e.g., EmailProvider, SlackProvider, TeamsProvider — three implementations justify an interface
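
A minimal sketch of the three-provider case, using typing.Protocol as the interface. EmailProvider, SlackProvider, and TeamsProvider come from the example above; the NotificationProvider name, the send signature, and the notify helper are assumptions, and each provider returns a string instead of calling a real service so the sketch stays self-contained.

```python
from typing import Protocol


class NotificationProvider(Protocol):
    """The shared shape that three real implementations justify."""

    def send(self, recipient: str, message: str) -> str: ...


class EmailProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"email to {recipient}: {message}"


class SlackProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"slack to {recipient}: {message}"


class TeamsProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"teams to {recipient}: {message}"


# A plain dict is enough for lookup; no factory class is needed.
PROVIDERS = {
    "email": EmailProvider(),
    "slack": SlackProvider(),
    "teams": TeamsProvider(),
}


def notify(channel: str, recipient: str, message: str) -> str:
    provider: NotificationProvider | None = PROVIDERS.get(channel)
    if provider is None:
        raise ValueError(f"unknown channel: {channel}")
    return provider.send(recipient, message)
```

The providers do not inherit from anything. They satisfy the protocol by having a matching send method, which keeps the abstraction as thin as the rule intends.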

❌ Still Over-Engineering

•Abstract factories when you only have 1 implementation
•Generic repository patterns when direct queries work fine
•Configuration managers when environment variables are enough
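
For the last point, environment variables plus one small loader function usually cover what a configuration manager would. The variable names DATABASE_URL, REQUEST_TIMEOUT_SECONDS, and DEBUG, and the Settings fields, are hypothetical; this is a sketch of the idea, not a prescribed layout.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    request_timeout_seconds: float
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables (hypothetical names)."""
    # Required value: fail fast at startup with a clear message if missing.
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")
    return Settings(
        database_url=database_url,
        # Optional values fall back to defaults kept next to their use.
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "10")),
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict as env makes the loader testable without touching the real process environment.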

Simplicity Checkpoints (Updated)

Before Starting

• Is this the simplest RELIABLE approach?
• Do you need this for your current customer base (not 10,000 customers)?
• Can this be 1-5 files?
• Is error handling included?
• Is this easily testable?

During Implementation

• Am I adding abstraction before 3rd use?
• Am I creating >10 files? (Consolidate related logic)
• Did I add error handling + logging?
• Would another dev understand this in 6 months?
• Is this function >200 lines? (Extract if yes)

Before Committing

• Does this handle failures gracefully?
• Are errors logged with context?
• Is complex logic tested (unit tests)?
• Can I deploy this without breaking existing customers?

Scaling Triggers (When to Refactor)

Refactor When You Hit These Limits

•Performance (actual, not hypothetical)
- API responses >2s consistently
- Database queries >500ms
- Memory usage growing unbounded
- CPU consistently >70%
•Maintainability (team pain)
- Same bug appears 3+ times (extract + fix once)
- Code duplicated 5+ times (extract + reuse)
- New feature takes 2x longer than expected
- Onboarding new dev takes >1 week
•Scale (customer impact)
- Customer count exceeding what your current architecture handles
- Request volume exceeding what your database/server can handle
- Database size requiring optimization or sharding
•Customer complaints (real problems)
- Specific feature requested by 5+ customers
- Same issue reported 3+ times
- Security concern raised by customer
- Competitor has feature we don't
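
The performance triggers in this list only count once they are measured. One low-effort way to collect that evidence is a decorator that logs any call slower than a threshold. The name log_if_slow and the "slow_ops" logger name are assumptions; the 2 second default mirrors the ">2s" figure used in this document.

```python
import functools
import logging
import time

logger = logging.getLogger("slow_ops")


def log_if_slow(threshold_seconds: float = 2.0):
    """Decorator: warn when the wrapped call takes longer than the threshold."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised, so slow failures
                # are recorded too.
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    logger.warning(
                        "%s took %.3fs (threshold %.1fs)",
                        func.__name__,
                        elapsed,
                        threshold_seconds,
                    )

        return wrapper

    return decorator
```

Applied as `@log_if_slow()` on a request handler or `@log_if_slow(threshold_seconds=0.5)` on a query helper, the resulting log lines show whether "consistently >2s" or ">500ms" is actually happening before any refactor starts.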

Don't Refactor For

•"Clean code" principles (if it works reliably)
•Hypothetical scale (wait until you're at 80% of a measured limit)
•Latest framework/library (unless security fix)
•Personal preferences (consistency > perfection)

Mantras (Updated for Production)

•"Simple + Reliable beats complex + perfect"
•"Scale when you hit limits, not before"
•"Make it work, make it right, make it fast - IN THAT ORDER"
•"Abstract after 3rd duplicate, not before"
•"Add what you need, remove what you don't"
•"Customers don't care about architecture"
•"200 lines before extracting, not 50"

When to Add "Enterprise" Patterns

Use an enterprise pattern ONLY when you meet its minimum requirements:

Pattern	Minimum Requirements
Factory Pattern	5+ different implementations
DI Framework	Team of 5+ developers
Microservices	Monolith >100k LOC OR team >10 developers
Event Sourcing	Audit requirement OR >10k events/day
CQRS	Read/write performance measured as bottleneck
Service Mesh	20+ microservices
API Gateway	10+ backend services
Repository Pattern	5+ different data sources

Until you hit these thresholds: Keep it simple

The Prime Directive (Updated)

Build the simplest reliable thing that works for your current customer base. Then ship it.

If you find yourself:

•Creating >10 files for a feature
•Writing >200 lines without extracting
•Thinking about "1000+ customer scalability"
•Adding abstraction before 3rd use
•Building generic frameworks

STOP and ask:

"What's the simplest RELIABLE way to make this work for 100 customers?"

Remember

You're not building for:

•❌ Millions of users (unless you actually have them)
•❌ Fortune 500 enterprise (unless you are one)
•❌ Infinite scale (you need finite, measured scale)

You're building for:

•✅ Your actual current user/customer count
•✅ Fast iteration based on real feedback
•✅ Reliable service for paying customers
•✅ Maintainable codebase that your team can work on

Ship working, reliable code. Ship it fast. Iterate based on customer feedback.