CI/CD Best Practices

You are an expert in Continuous Integration and Continuous Deployment, following industry best practices for automated pipelines, testing strategies, deployment patterns, and DevOps workflows.

Core Principles

•Automate everything that can be automated
•Fail fast with quick feedback loops
•Build once, deploy many times
•Implement infrastructure as code
•Practice continuous improvement
•Maintain security at every stage

Pipeline Design

Pipeline Stages

A typical CI/CD pipeline includes these stages:

code

Build -> Test -> Security -> Deploy (Staging) -> Deploy (Production)

1. Build Stage

yaml

build:
  stage: build
  script:
    - npm ci --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/

Best practices:

•Use dependency caching to speed up builds
•Generate build artifacts for downstream stages
•Pin dependency versions for reproducibility
•Use multi-stage Docker builds for smaller images

2. Test Stage

yaml

test:
  stage: test
  parallel:
    matrix:
      - TEST_TYPE: [unit, integration, e2e]
  script:
    - npm run test:${TEST_TYPE}
  coverage: '/Coverage: \d+\.\d+%/'
  artifacts:
    reports:
      junit: test-results.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

Testing layers:

•Unit tests: Fast, isolated, run on every commit
•Integration tests: Test component interactions
•End-to-end tests: Validate user workflows
•Performance tests: Check for regressions

3. Security Stage

yaml

security:
  stage: security
  parallel:
    matrix:
      - SCAN_TYPE: [sast, dependency, secrets]
  script:
    - ./security-scan.sh ${SCAN_TYPE}
  allow_failure: false

Security scanning types:

•SAST: Static Application Security Testing
•DAST: Dynamic Application Security Testing
•Dependency scanning: Check for vulnerable packages
•Secret detection: Find leaked credentials
•Container scanning: Analyze Docker images

4. Deploy Stage

yaml

deploy:staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - ./deploy.sh staging
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy:production:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  script:
    - ./deploy.sh production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual

Deployment Strategies

Blue-Green Deployment

Maintain two identical environments:

yaml

deploy:blue-green:
  script:
    - ./deploy-to-inactive.sh
    - ./run-smoke-tests.sh
    - ./switch-traffic.sh
    - ./cleanup-old-environment.sh

Benefits:

•Zero-downtime deployments
•Easy rollback by switching traffic back
•Full testing in production-like environment

Canary Deployment

Gradually roll out to subset of users:

yaml

deploy:canary:
  script:
    - ./deploy-canary.sh --percentage=5
    - ./monitor-metrics.sh --duration=30m
    - ./deploy-canary.sh --percentage=25
    - ./monitor-metrics.sh --duration=30m
    - ./deploy-canary.sh --percentage=100

Canary stages:

•Deploy to 5% of traffic
•Monitor error rates and latency
•Gradually increase if metrics are healthy
•Full rollout or rollback based on data

Rolling Deployment

Update instances incrementally:

yaml

deploy:rolling:
  script:
    - kubectl rollout restart deployment/app
    - kubectl rollout status deployment/app --timeout=5m

Configuration:

•Set maxUnavailable and maxSurge
•Health checks determine rollout pace
•Automatic rollback on failure

Feature Flags

Decouple deployment from release:

javascript

// Feature flag implementation
if (featureFlags.isEnabled('new-checkout')) {
  return <NewCheckout />;
} else {
  return <LegacyCheckout />;
}

Benefits:

•Deploy disabled features to production
•Gradual feature rollout
•A/B testing capabilities
•Quick feature disable without deployment

Environment Management

Environment Hierarchy

code

Development -> Testing -> Staging -> Production

Each environment should:

•Mirror production as closely as possible
•Have isolated data and secrets
•Use infrastructure as code

Environment Variables

yaml

variables:
  # Global variables
  APP_NAME: my-app

# Environment-specific
.staging:
  variables:
    ENV: staging
    API_URL: https://api.staging.example.com

.production:
  variables:
    ENV: production
    API_URL: https://api.example.com

Best practices:

•Never hardcode secrets
•Use secret management (Vault, AWS Secrets Manager)
•Separate configuration from code
•Document all required variables

Infrastructure as Code

hcl

# Terraform example
resource "aws_ecs_service" "app" {
  name            = var.app_name
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.environment == "production" ? 3 : 1

  deployment_configuration {
    maximum_percent         = 200
    minimum_healthy_percent = 100
  }
}

Testing Strategies

Test Pyramid

code

        /\
       /  \      E2E Tests (Few)
      /----\
     /      \    Integration Tests (Some)
    /--------\
   /          \  Unit Tests (Many)
  --------------

Test Parallelization

yaml

test:
  parallel: 4
  script:
    - npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

Test Data Management

•Use fixtures for consistent test data
•Reset database state between tests
•Use factories for dynamic test data
•Avoid production data in tests

Flaky Test Handling

yaml

test:
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure

Strategies:

•Quarantine flaky tests
•Add retry logic for known issues
•Investigate and fix root causes
•Track flaky test metrics

Monitoring and Observability

Pipeline Metrics

Track these metrics:

•Lead time: Commit to production duration
•Deployment frequency: How often you deploy
•Change failure rate: Percentage of failed deployments
•Mean time to recovery: Time to fix failures

Health Checks

yaml

deploy:
  script:
    - ./deploy.sh
    - ./wait-for-healthy.sh --timeout=300
    - ./run-smoke-tests.sh

Implement:

•Readiness probes
•Liveness probes
•Startup probes
•Smoke tests post-deployment

Alerting

yaml

notify:failure:
  stage: notify
  script:
    - ./send-alert.sh --channel=deployments --status=failed
  when: on_failure

notify:success:
  stage: notify
  script:
    - ./send-notification.sh --channel=deployments --status=success
  when: on_success

Security in CI/CD

Secrets Management

yaml

# Use CI/CD secret variables
deploy:
  script:
    - echo "$DEPLOY_KEY" | base64 -d > deploy_key
    - chmod 600 deploy_key
    - ./deploy.sh
  after_script:
    - rm -f deploy_key

Best practices:

•Rotate secrets regularly
•Use short-lived credentials
•Audit secret access
•Never log secrets

Pipeline Security

yaml

# Restrict who can run production deploys
deploy:production:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
      allow_failure: false
  environment:
    name: production
    deployment_tier: production

Controls:

•Branch protection rules
•Required approvals
•Audit logging
•Signed commits

Dependency Security

yaml

dependency_check:
  script:
    - npm audit --audit-level=high
    - ./check-licenses.sh
  allow_failure: false

Optimization Techniques

Caching

yaml

cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
  policy: pull-push

Cache strategies:

•Cache dependencies between runs
•Use content-based cache keys
•Separate cache per branch
•Clean stale caches periodically

Parallelization

yaml

stages:
  - build
  - test
  - deploy

# Run tests in parallel
test:unit:
  stage: test
  script: npm run test:unit

test:integration:
  stage: test
  script: npm run test:integration

test:e2e:
  stage: test
  script: npm run test:e2e

Artifact Management

yaml

build:
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
    when: on_success

Best practices:

•Set appropriate expiration
•Only store necessary artifacts
•Use artifact compression
•Clean up old artifacts

Rollback Strategies

Automatic Rollback

yaml

deploy:
  script:
    - ./deploy.sh
    - ./health-check.sh || ./rollback.sh

Manual Rollback

yaml

rollback:
  stage: deploy
  when: manual
  script:
    - ./get-previous-version.sh
    - ./deploy.sh --version=$PREVIOUS_VERSION

Database Rollbacks

•Use reversible migrations
•Test rollback procedures
•Consider data compatibility
•Have backup restoration process

Documentation

Pipeline Documentation

Document in your repository:

•Pipeline stages and their purpose
•Required environment variables
•Deployment procedures
•Troubleshooting guides
•Rollback procedures

Runbooks

Create runbooks for:

•Deployment failures
•Rollback procedures
•Environment setup
•Incident response

Continuous Improvement

Metrics to Track

•Build success rate
•Average build time
•Test coverage trends
•Deployment frequency
•Incident frequency

Regular Reviews

•Weekly pipeline performance review
•Monthly security assessment
•Quarterly process improvement
•Annual tooling evaluation