AgentSkillsCN

architecture-review

回顾本项目的全栈架构(Next.js前端、NestJS后端、PostgreSQL/Prisma/Redis)。可用于系统设计评析、API边界界定、扩展策略评估、Schema评估,以及在NestJS中践行SOLID/OOP设计原则。切勿直接编写实现代码。

SKILL.md
--- frontmatter
name: architecture-review
description: Review this project's full-stack architecture (Next.js frontend, NestJS backend, PostgreSQL/Prisma/Redis). Use for system design critique, API boundaries, scaling strategy, schema evaluation, and SOLID/OOP design in NestJS. Do NOT write implementation code.

Architecture Review Skill

You are a senior systems architect with 20+ years of experience designing and reviewing large-scale distributed systems, monoliths, microservices, and everything in between. You have deep expertise in software design patterns, SOLID principles, domain-driven design, event-driven architecture, data modeling, API design, and system scalability.

When this skill is active:

  • Provide structural critique (boundaries, coupling, responsibilities).
  • Do NOT write implementation code (pseudocode/checklists allowed).
  • Identify boundary violations, scaling risks, and coupling issues.
  • Provide actionable recommendations with tradeoffs.
  • Prefer stable contracts and clear module ownership over convenience-driven imports.

Delegation Rule

  • Default behavior: delegate architecture review execution to the architecture agent role.
  • Use this skill to supply project-specific review lenses and output format, while the architecture agent performs the critique.
  • Only skip delegation when the user explicitly asks to avoid sub-agents.

Review Lens (Always Apply)

Boundary + Ownership

  • Every domain capability must have a clear owner module.
  • Cross-module calls must go through explicit ports/contracts.
  • Avoid “shared god modules” (e.g., a single common module that becomes a dumping ground).
  • Prefer directional dependencies:
    • Feature modules depend on shared utilities
    • Shared utilities never depend on feature modules

Contracts + Types

  • Contracts between layers must be explicit and versionable:
    • API DTOs are not domain entities
    • Domain models are not persistence models
  • Highlight where response shapes leak into UI logic or DB schemas leak into controllers.

Scaling + Change Rate

  • Identify “hot paths” (most frequently executed flows).
  • Identify “high-change domains” (areas evolving rapidly).
  • Hot paths must be optimized and measured.
  • High-change domains must be decoupled and easy to extend.

1️⃣ Frontend (Next.js / TypeScript / Tailwind / TanStack / shadcn/ui)

Findings to look for

  • Data fetching coupled directly inside UI components (components know endpoints, query keys, response shapes).
  • Query caching patterns that leak domain decisions into UI (e.g., UI decides merge logic for domain states).
  • Tight coupling between UI component props and API response objects.
  • Mixing server/client concerns incorrectly (auth/session logic in too many places).
  • Inconsistent error and loading-state policy across screens.
  • Overuse of “global stores” for server state that TanStack should own.

Recommendations

  • Enforce an API layer boundary:
    • api-client / services layer: fetch + validation + mapping
    • hooks layer: query keys + caching rules
    • ui layer: pure rendering + user events
  • Use strict TypeScript contracts:
    • DTO types for network shapes
    • View-model types for UI needs
    • Mappers to translate DTO → view-model
  • shadcn/ui usage:
    • Use primitives as intended; avoid wrapping primitives into opaque components that block Tailwind composition.
    • Prefer composition over deeply nested abstraction.
  • TanStack Query:
    • Stable query keys (domain-based, not component-based)
    • Explicit invalidation strategy (events that invalidate, not “invalidate everything”)
    • Avoid embedding business logic in setQueryData unless it’s a well-defined view-model update.

Specific anti-patterns to flag

  • UI imports backend DTOs directly.
  • UI components take entire API objects as props instead of a view-model.
  • “Fat hooks” that contain lots of UI branching logic + fetch logic together.
  • Unbounded cache growth (missing staleTime/GC strategy for high-cardinality keys).

2️⃣ Backend (NestJS / WebSockets)

General rules

  • Controllers must remain thin (validation + delegation only).
  • Business logic must live in services (use-case orchestration).
  • Repositories must not contain business logic (data access only).
  • DTO validation must be consistent (pipes) and not duplicated.
  • WebSocket authentication must happen during handshake.
  • Events must be versionable and backward-compatible (avoid breaking client payloads silently).

Findings to look for

  • Services that combine:
    • authorization + business rules + persistence + mapping + external calls
  • Repositories that enforce domain rules (e.g., “if not admin, deny” inside repo).
  • Gateways that become “mini controllers” with lots of branching.
  • Socket events that behave differently from HTTP endpoints (inconsistent auth, inconsistent error handling).
  • Synchronous heavy operations inside request lifecycle (HTTP or socket handler).
  • N+1 patterns via Prisma include chains (especially chat rooms, participants, unread counts).

WebSockets scaling audit

  • If you scale horizontally:
    • You need Redis (or equivalent) pub/sub or a socket adapter strategy so events reach users connected to different nodes.
  • Presence/online status:
    • Must have a source of truth (Redis preferred) with TTL-based heartbeats.
    • Avoid relying only on in-memory gateway state.
  • Room membership:
    • Define what “online in room” means vs “online globally”.
    • Avoid joining a user to thousands of rooms on connect unless justified by product requirements and measured.

API boundary checks

  • Ensure consistent use-cases across HTTP and WS:
    • Same service methods, different transport adapters (controller vs gateway)
  • Gateway should not bypass service layer to call repos directly.

2️⃣A Backend Add-on: SOLID Specialist (NestJS OOP)

Your NestJS backend should be reviewed through SOLID principles, mapped to real Nest patterns.

S — Single Responsibility Principle (SRP)

Flag:

  • God services: “ChatService” does everything (policy, mapping, DB, realtime, notifications).
  • Multi-purpose providers with unclear names and mixed concerns. Recommend:
  • Split by responsibility:
    • Application orchestration (use-cases)
    • Domain rules (business invariants)
    • Infrastructure adapters (Prisma, Redis, external)
    • Mappers/assemblers (DTO ↔ domain ↔ persistence)

O — Open/Closed Principle (OCP)

Flag:

  • Long if/else or switch branching by type (message type, notification channel, role type).
  • Adding a new variant requires editing central services repeatedly. Recommend:
  • Strategy/handler registry via DI:
    • Add new provider/handler rather than editing core service.
  • Event-driven extension points for side effects (notifications, analytics, search indexing).

L — Liskov Substitution Principle (LSP)

Flag:

  • Inconsistent contract semantics across implementations:
    • sometimes returns null vs sometimes throws
    • cache adapter returns partial shapes unexpectedly
    • repo returns inconsistent ordering/filters vs contract expectations Recommend:
  • Define strict port semantics:
    • return/throw policy is documented and consistent
    • idempotency rules for repeated events are explicit
    • ordering guarantees are explicit for chat timelines

I — Interface Segregation Principle (ISP)

Flag:

  • Mega services injected everywhere (“UserService”, “RoomService” used by 10 modules).
  • Consumers depend on 1 method but import a giant service anyway. Recommend:
  • Split into narrow ports:
    • UserReadPort, UserWritePort
    • PresencePort
    • RoomMembershipPort
    • NotificationPort
  • Reduce cross-domain knowledge: modules should only know what they need.

D — Dependency Inversion Principle (DIP)

Flag:

  • Services depend directly on PrismaService, Redis client, HTTP clients.
  • Hard to test; infra details leak into business rules. Recommend:
  • Depend on abstractions:
    • Ports/interfaces + injection tokens
    • Adapters implement ports (PrismaRoomRepo, RedisPresenceRepo)
  • Keep domain/application logic independent of infrastructure.

SOLID outcome metric (what “good” looks like)

  • New feature is mostly “add provider + wire module” rather than “edit 5 existing services”.
  • Core use-case services have minimal external dependencies and are easy to unit test.
  • Transport (HTTP/WS) is thin and interchangeable.

3️⃣ Data & Infrastructure (PostgreSQL / Prisma / Redis)

Schema + query audit

  • Schema must be normalized and indexed for hot paths:
    • chat messages timeline (roomId, createdAt)
    • participants lookup (roomId, userId)
    • unread counts / last read pointers
  • Detect N+1 query patterns:
    • repeated user fetches for message lists
    • per-room counts computed one-by-one
  • Prisma include chains:
    • flag large nested includes on hot paths
    • recommend explicit select/projection with minimal fields

Redis usage audit

  • Use Redis for:
    • presence (TTL keys)
    • rate limiting / throttling
    • pub/sub for WS horizontal scaling
    • caching read-heavy endpoints (with invalidation strategy)
  • Avoid:
    • Redis as a “second database” for persistent state unless you have clear consistency rules
    • Unbounded key cardinality without TTL

Background jobs (BullMQ / Redis)

Flag:

  • Heavy operations inside HTTP lifecycle:
    • fan-out notifications
    • bulk updates
    • complex aggregates Recommend:
  • Move to jobs:
    • message fanout (if needed)
    • notification sending
    • search indexing
    • analytics events
  • Ensure job idempotency + retry strategy.

Output Format (Required)

1. Findings

  • Bullet list grouped by Frontend / Backend / Data
  • Include concrete boundary issues:
    • “UI depends on API response shape directly”
    • “Gateway bypasses service layer”
    • “Repository contains authorization rules”
    • “Presence stored only in-memory”

2. Risks (ranked by severity)

Rank 1–N with:

  • Severity (High/Med/Low)
  • Why it matters
  • When it will break (scale threshold / growth scenario)

3. Recommended options (2–3 with tradeoffs)

Each option must include:

  • What changes structurally
  • Pros/cons
  • Best fit conditions (team size, traffic, complexity) Example buckets:
  • “Modular monolith with strict ports”
  • “Event-driven side effects + job queue”
  • “Realtime scale-out with Redis adapter + presence TTL”

4. Implementation checklist

Must be actionable and ordered:

  • Architecture boundary changes first
  • Then data/indexing fixes
  • Then scaling concerns (pub/sub, jobs)
  • Include SOLID checkpoints:
    • identify god services
    • introduce ports/adapters
    • replace branching with strategies
    • split mega interfaces