Purpose
Design and implement complete B2B automation workflows that are production-ready, observable, and safe, following global automation rules (Rule 3) and agent-friendly conventions (Rule 2).
Prerequisites
- •✅ Business requirement clearly defined
- •✅ Global rules reviewed (Rule 1, Rule 2, Rule 3)
- •✅ Repository documentation current: @mcp-repo-scan
- •✅ Target systems and integrations identified
- •✅ Risk level assessed (Low/Medium/High)
Design Phase
1. Requirements Analysis
Steps:
- •Extract business context and success criteria
- •Identify triggers, systems touched, and data flows
- •Assess risk level and safety requirements
- •Determine human-in-the-loop needs
2. Architecture Design
Following Rule 3 principles:
- •Separation of concerns: Triggers → Business logic → Persistence
- •Idempotency: Design safe retry mechanisms
- •Durability: Queue-based processing for long-running work
- •Isolation: External integration modules
- •Observability: Structured logging and metrics
- •Safety: Human checkpoints and auditability
3. Workflow Documentation
Create workflow spec:
- •Use template: .windsurf/workflows/work-flow-md.md
- •Complete all sections (no blanks allowed)
- •Get human approval for High/Medium risk workflows
- •Save as: docs/workflows/<workflow-name>.md
Implementation Phase
4. Core Implementation
For each component:
Triggers
- •Webhook handlers with validation
- •Scheduled jobs with cron expressions
- •Message bus listeners
- •UI action handlers
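As an illustration of a webhook handler with validation, here is a minimal sketch. The payload shape (`OrderWebhook`), the handler name, and the `enqueue` callback are all hypothetical and not part of this skill. The point is only that the trigger validates and hands off, leaving business logic to a later stage.

```typescript
// Hypothetical payload shape; the real schema depends on the sending system.
interface OrderWebhook {
  eventId: string;
  orderId: string;
  amountCents: number;
}

interface HandlerResult {
  status: number;
  body: string;
}

// Type guard: checks every field before the payload reaches business logic.
function isOrderWebhook(value: unknown): value is OrderWebhook {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.eventId === "string" && v.eventId.length > 0 &&
    typeof v.orderId === "string" && v.orderId.length > 0 &&
    typeof v.amountCents === "number" &&
    Number.isInteger(v.amountCents) && v.amountCents >= 0
  );
}

// The trigger only parses, validates, and enqueues; it runs no business logic.
function handleOrderWebhook(
  rawBody: string,
  enqueue: (job: OrderWebhook) => void,
): HandlerResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: "invalid JSON" };
  }
  if (!isOrderWebhook(parsed)) {
    return { status: 400, body: "invalid payload" };
  }
  enqueue(parsed);
  return { status: 202, body: "accepted" };
}
```

Returning 202 rather than 200 signals that the work was accepted for later processing, which fits the queue-based design described in this skill.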
Business Logic
- •Pure or mostly-pure functions
- •Clear error handling
- •Input validation and sanitization
- •Decision logic and branching
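The business-logic points above can be sketched as a pure function that returns a tagged decision instead of performing side effects. The refund scenario, the `decideRefund` name, and the limit parameter are assumptions made up for this example.

```typescript
// Hypothetical input for an illustrative refund workflow.
interface RefundRequest {
  orderId: string;
  amountCents: number;
}

// Discriminated union: every branch of the decision is explicit and typed.
type Decision =
  | { kind: "auto_approve" }
  | { kind: "needs_human_review"; reason: string }
  | { kind: "reject"; reason: string };

// Pure: the result depends only on the arguments, with no I/O and no clock.
function decideRefund(req: RefundRequest, autoApproveLimitCents: number): Decision {
  if (!Number.isInteger(req.amountCents) || req.amountCents <= 0) {
    return { kind: "reject", reason: "amount must be a positive integer" };
  }
  if (req.amountCents <= autoApproveLimitCents) {
    return { kind: "auto_approve" };
  }
  return {
    kind: "needs_human_review",
    reason: `amount exceeds auto-approve limit of ${autoApproveLimitCents} cents`,
  };
}
```

Because the function has no side effects, its edge cases can be unit-tested without mocks, and the caller decides how to persist or act on each `Decision`.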
Persistence
- •Repository/DAO patterns
- •Database schemas and migrations
- •Cache strategies
- •Transaction handling
External Integrations
- •Dedicated integration modules
- •Timeout and retry logic
- •Secret management (environment variables)
- •Error handling and fallbacks
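One possible shape for the timeout and retry logic in an integration module is shown below. The helper names `withTimeout` and `callWithRetry` and the option names are assumptions for this sketch, not an existing library API.

```typescript
// Rejects if the wrapped promise does not settle within `ms` milliseconds.
// Note: this stops waiting but does not cancel the underlying operation.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

interface RetryOptions {
  attempts: number;    // total tries, including the first one
  timeoutMs: number;   // per-attempt timeout
  baseDelayMs: number; // delay before the first retry; doubles each time
}

// Bounded retries with exponential backoff; rethrows the last error.
async function callWithRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown = new Error("attempts must be at least 1");
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      return await withTimeout(operation(), opts.timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < opts.attempts) {
        const delayMs = opts.baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A real module would typically retry only on errors known to be transient and would pair retries with idempotent requests, since a timed-out call may still have succeeded on the remote side.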
5. Queue and Job Infrastructure
Following Rule 3 queue semantics:
- •Job structure with clear schemas
- •Bounded retries with backoff
- •Dead-letter queue (DLQ) setup
- •Concurrency controls and locks
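To make the queue semantics above concrete, here is an in-memory sketch of bounded retries with backoff and a dead-letter queue. The `InMemoryQueue` class and `backoffDelayMs` helper are hypothetical. A production system would use a durable queue, and this sketch omits concurrency controls and locks.

```typescript
interface Job<P> {
  id: string;
  payload: P;
  attempts: number; // how many times the job has been tried so far
}

// Exponential backoff with a cap: delay before the retry that follows
// attempt number `attempt` (1-based).
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

class InMemoryQueue<P> {
  readonly pending: Job<P>[] = [];
  readonly deadLetter: Job<P>[] = [];
  private readonly maxAttempts: number;

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  enqueue(id: string, payload: P): void {
    this.pending.push({ id, payload, attempts: 0 });
  }

  // Processes one job. Returns the delay to wait before its retry, or null
  // when no retry is scheduled (queue empty, success, or dead-lettered).
  processNext(handler: (payload: P) => void): number | null {
    const job = this.pending.shift();
    if (!job) return null;
    job.attempts += 1;
    try {
      handler(job.payload);
      return null;
    } catch {
      if (job.attempts >= this.maxAttempts) {
        this.deadLetter.push(job); // retries exhausted: park for inspection
        return null;
      }
      this.pending.push(job); // requeue for another attempt
      return backoffDelayMs(job.attempts);
    }
  }
}
```

Keeping failed jobs in the DLQ rather than dropping them preserves the payload and attempt count for the manual intervention steps in the runbook.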
6. Observability Implementation
Logging:
- •Structured JSON logging
- •Correlation IDs across flows
- •Key events (start/finish/errors)
- •Non-sensitive metadata only
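The logging points above could look roughly like the following. The `buildLogLine` function, the event names, and the list of sensitive keys are assumptions for illustration; a real project would define its own redaction policy.

```typescript
type LogLevel = "info" | "warn" | "error";

// Field names treated as sensitive in this sketch; a real list is project-specific.
const SENSITIVE_KEYS = new Set(["password", "token", "apiKey", "email"]);

// Builds one JSON log line. The correlation ID ties together every event
// emitted by a single workflow run.
function buildLogLine(
  level: LogLevel,
  event: string,
  correlationId: string,
  metadata: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  const safe: Record<string, unknown> = {};
  for (const key of Object.keys(metadata)) {
    safe[key] = SENSITIVE_KEYS.has(key) ? "[REDACTED]" : metadata[key];
  }
  // Core fields are spread last so metadata cannot overwrite them.
  return JSON.stringify({
    ...safe,
    timestamp: now.toISOString(),
    level,
    event,
    correlationId,
  });
}

// Usage: log the start and finish of a run under the same correlation ID.
console.log(buildLogLine("info", "workflow.start", "corr-123", { orderId: "o1" }));
console.log(buildLogLine("info", "workflow.finish", "corr-123", { durationMs: 42 }));
```

This redaction only inspects top-level keys; nested objects would need a recursive pass or, more simply, a rule that only whitelisted fields are ever logged.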
Metrics:
- •Throughput and success rates
- •Latency measurements
- •Error rates by type
- •Queue depths and processing times
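As a rough sketch of how these metrics might be collected in-process, the hypothetical `WorkflowMetrics` class below tracks success rate, average latency, and error counts by type. In practice these values would be exported to whatever metrics backend the project uses; none is assumed here.

```typescript
interface MetricsSnapshot {
  total: number;
  successRate: number;      // 0..1; 0 when nothing has been recorded
  averageLatencyMs: number; // 0 when nothing has been recorded
  errorsByType: Record<string, number>;
}

class WorkflowMetrics {
  private succeeded = 0;
  private failed = 0;
  private totalLatencyMs = 0;
  private readonly errorCounts = new Map<string, number>();

  recordSuccess(latencyMs: number): void {
    this.succeeded += 1;
    this.totalLatencyMs += latencyMs;
  }

  recordFailure(errorType: string, latencyMs: number): void {
    this.failed += 1;
    this.totalLatencyMs += latencyMs;
    this.errorCounts.set(errorType, (this.errorCounts.get(errorType) ?? 0) + 1);
  }

  snapshot(): MetricsSnapshot {
    const total = this.succeeded + this.failed;
    const errorsByType: Record<string, number> = {};
    this.errorCounts.forEach((count, type) => {
      errorsByType[type] = count;
    });
    return {
      total,
      successRate: total === 0 ? 0 : this.succeeded / total,
      averageLatencyMs: total === 0 ? 0 : this.totalLatencyMs / total,
      errorsByType,
    };
  }
}
```

Averages hide tail latency, so a real setup would usually record a histogram or percentiles as well.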
7. Safety and Guardrails
Human-in-the-loop:
- •Approval checkpoints for high-risk actions
- •Clear action summaries and rationale
- •Audit trails for all changes
Guardrails:
- •No destructive operations without confirmation
- •Security/permission changes flagged
- •Bulk operations require safeguards
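The checkpoints and guardrails above can be sketched as a gate in front of execution. Everything named here (`ProposedAction`, `runGuarded`, the bulk threshold of 100 records) is a hypothetical illustration, not a prescribed interface.

```typescript
type RiskLevel = "Low" | "Medium" | "High";

interface ProposedAction {
  summary: string;   // what will happen, in plain language
  rationale: string; // why the automation wants to do it
  risk: RiskLevel;
  destructive: boolean;
  affectsPermissions: boolean;
  affectedRecords: number;
}

interface AuditEntry {
  summary: string;
  outcome: "executed" | "blocked";
  approvedBy: string | null;
}

// Hypothetical threshold above which an operation counts as "bulk".
const BULK_THRESHOLD = 100;

function requiresApproval(action: ProposedAction): boolean {
  return (
    action.risk === "High" ||
    action.destructive ||
    action.affectsPermissions ||
    action.affectedRecords > BULK_THRESHOLD
  );
}

// Runs `execute` only if the action is safe or a human has approved it.
// Every outcome, including a blocked one, is appended to the audit trail.
function runGuarded(
  action: ProposedAction,
  approvedBy: string | null,
  execute: () => void,
  audit: AuditEntry[],
): boolean {
  if (requiresApproval(action) && approvedBy === null) {
    audit.push({ summary: action.summary, outcome: "blocked", approvedBy: null });
    return false;
  }
  execute();
  audit.push({ summary: action.summary, outcome: "executed", approvedBy });
  return true;
}
```

Recording blocked attempts alongside executed ones gives reviewers a full picture of what the automation tried to do, not only what it did.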
Testing Phase
8. Test Strategy
Unit Tests:
- •Business logic edge cases
- •Error handling paths
- •Validation logic
- •Integration module mocks
Integration Tests:
- •End-to-end workflow simulation
- •External service mocking
- •Queue processing tests
- •Database transaction tests
Idempotency Tests:
- •Duplicate request handling
- •Retry behavior verification
- •Partial failure recovery
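A minimal sketch of what idempotency tests might check is given below, using a hypothetical in-memory `IdempotentProcessor` and a hand-rolled `check` helper instead of any particular test framework. It covers duplicate requests and a retry after a failed first attempt.

```typescript
// Hypothetical processor: runs the work once per idempotency key and
// returns the stored result on replays. Failed attempts store nothing.
class IdempotentProcessor {
  private readonly results = new Map<string, string>();

  process(idempotencyKey: string, work: () => string): string {
    const cached = this.results.get(idempotencyKey);
    if (cached !== undefined) return cached;
    const result = work(); // may throw; nothing is recorded in that case
    this.results.set(idempotencyKey, result);
    return result;
  }
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`assertion failed: ${message}`);
}

function testDuplicateRequestRunsWorkOnce(): void {
  const processor = new IdempotentProcessor();
  let calls = 0;
  const work = (): string => {
    calls += 1;
    return `receipt-${calls}`;
  };
  const first = processor.process("req-1", work);
  const second = processor.process("req-1", work);
  check(calls === 1, "work runs once for a duplicated key");
  check(first === second, "replay returns the original result");
}

function testRetryAfterFailureRunsWorkAgain(): void {
  const processor = new IdempotentProcessor();
  let calls = 0;
  const flaky = (): string => {
    calls += 1;
    if (calls === 1) throw new Error("transient failure");
    return "ok";
  };
  let threw = false;
  try {
    processor.process("req-2", flaky);
  } catch {
    threw = true;
  }
  check(threw, "first attempt surfaces the failure");
  check(processor.process("req-2", flaky) === "ok", "retry succeeds");
  check(calls === 2, "failed attempt was not cached");
}

testDuplicateRequestRunsWorkOnce();
testRetryAfterFailureRunsWorkAgain();
```

A real workflow would keep the key store in durable storage and would also need to handle two concurrent requests with the same key, which this single-threaded sketch does not exercise.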
9. Test Implementation
Test locations:
- •Unit: tests/unit/workflows/<name>/
- •Integration: tests/integration/workflows/<name>/
- •E2E: tests/e2e/workflows/<name>/
Test commands:
```bash
npm test                  # Unit tests
npm run test:integration  # Integration tests
npm run test:e2e          # E2E tests
```
Documentation Phase
10. Documentation Updates
Required docs:
- •docs/workflows/<name>.md - Complete workflow spec
- •docs/OPERATIONS.md - Runbook entries
- •docs/plans/<name>-implementation.md - Implementation plan
- •API documentation for new endpoints
- •Update docs/DOC-MAP.md with new components
11. Runbook Creation
Operations runbook:
- •Health check procedures
- •Common failure modes and responses
- •Manual intervention steps
- •Monitoring and alerting setup
Quality Gates
12. Pre-deployment Checklist
Code Quality:
- •✅ TypeScript strict mode compliance
- •✅ All tests passing
- •✅ Code coverage > 80%
- •✅ No linting errors
- •✅ Security scan passed
Architecture Compliance:
- •✅ Follows Rule 1 (Qwen coding standards)
- •✅ Follows Rule 2 (Repo conventions)
- •✅ Follows Rule 3 (Automation principles)
- •✅ No secrets in code
- •✅ Proper error handling
Operational Readiness:
- •✅ Observability hooks implemented
- •✅ Alerts configured
- •✅ Runbooks documented
- •✅ Rollback plan defined
13. Deployment
Staged rollout:
- •Deploy to dev environment
- •Run smoke tests: @run-tests-and-fix
- •Deploy to staging (if exists)
- •Final integration tests
- •Production deployment with monitoring
Post-deployment:
- •Monitor key metrics for 24 hours
- •Verify all alerts working
- •Update documentation with real-world learnings
- •Retrospective and improvements
References to Other Skills
- •@mcp-repo-scan - Initial repository analysis
- •@mcp-change-plan - Create implementation plan
- •@mcp-implement-plan - Execute implementation steps
- •@reengine-coding-agent - Code writing and refactoring
- •@run-tests-and-fix - Test execution and bug fixing
Global Rules Compliance
This skill explicitly follows:
- •Rule 1: Uses Qwen3-Coder-Next for coding tasks with test-driven approach
- •Rule 2: Maintains agent-friendly repo structure and conventions
- •Rule 3: Implements automation with idempotency, queues, observability, and safety