CI Pipeline Skill

This skill runs the full CI pipeline with detailed feedback, auto-fixing lint issues, and reviewing test coverage.

Arguments

•--eval - Include make test-eval (runs Claude tool call evaluation tests, takes a long time)
•--skip-docs - Skip documentation validation
•--coverage-only - Only run test coverage analysis

Pipeline Steps

Execute each step in order. Stop on failure unless the step is auto-fixable.

Step 1: Install Dependencies

bash

poetry install

Verify the virtual environment is set up correctly.

Step 2: Format Code

bash

poetry run ruff format src/ tests/

This auto-formats all code. Report what files were changed (if any).

Step 3: Lint with Auto-Fix

bash

poetry run ruff check --fix src/ tests/

If there are fixable issues, they will be auto-fixed. After auto-fix, run again without --fix to see remaining issues:

bash

poetry run ruff check src/ tests/

If there are remaining unfixable issues, report them and stop.

Step 4: Type Checking

bash

poetry run mypy --warn-unreachable --warn-redundant-casts --warn-unused-ignores src/

Report any type errors. Stop if there are errors.

Step 5: Run Unit Tests with Coverage

bash

poetry run pytest tests/unit/ -v --cov=honeycomb --cov-report=term-missing --cov-branch

After tests complete:

•Check overall coverage percentage - current baseline is ~38% (CLI and integration code is tested via live tests)
•Identify uncovered lines - focus on models and validation code (should be 80%+)
•
Check for new functionality without tests:
- •Use git diff --name-only HEAD~5 to find recently changed files
- •Cross-reference with coverage report
- •Flag any new code in src/honeycomb/models/ or src/honeycomb/validation/ without 80%+ coverage

Coverage expectations by module:

•models/ and validation/: Should be 80%+ (unit testable)
•resources/: 30-50% typical (integration-heavy, tested via live tests)
•cli/: Low coverage expected (tested manually)
•tools/executor.py: 40-50% (tested via eval tests)

Step 6: Validate Documentation (requires direnv)

bash

direnv exec . poetry run python scripts/validate_docs_examples.py
direnv exec . poetry run pytest tests/integration/test_doc_examples.py -v

Important: This step requires environment variables from .envrc. If direnv is not configured, skip with a warning.

After validation:

•
Check for docs-code mismatch:
- •Use git diff --name-only HEAD~5 to find changed source files
- •Check if corresponding docs in docs/usage/ were updated
- •Flag functionality changes without docs updates

Step 7: (Optional) Claude Tool Evaluation Tests

Only run if --eval argument is provided:

bash

rm -rf tests/integration/.tool_call_cache/*.json
direnv exec . poetry run pytest tests/integration/test_claude_tools_eval.py -v -n 4

Warning: This step takes a long time (5-10 minutes) and requires ANTHROPIC_API_KEY.

Post-Pipeline Analysis

After all steps complete, provide a summary:

Coverage Analysis

•Overall coverage: Report the percentage
•Coverage gaps: List top 5 files with lowest coverage
•New code coverage: For files changed in recent commits, report their coverage

Documentation Sync Check

Compare changed files with docs:

bash

# Get changed source files
git diff --name-only HEAD~5 -- 'src/honeycomb/*.py' 'src/honeycomb/**/*.py'

# Get changed doc files
git diff --name-only HEAD~5 -- 'docs/**/*.md'

Flag any source files in resources/, models/, or builders/ that changed without corresponding doc updates.

Test Coverage for New Features

For each new or significantly changed file:

•Check if it has test coverage
•If coverage < 70%, flag it
•Suggest what tests might be needed

Example Output Format

code

CI Pipeline Results
==================

Step 1: Install Dependencies ✓
Step 2: Format Code ✓ (2 files reformatted)
Step 3: Lint ✓ (3 issues auto-fixed)
Step 4: Type Check ✓
Step 5: Unit Tests ✓ (831 passed, 38% overall coverage)
Step 6: Documentation ✓
Step 7: Eval Tests ✓ (320 passed) [if --eval]

Coverage Analysis
-----------------
Overall: 38% (baseline for this project)

Models/Validation (target 80%+):
  ✓ src/honeycomb/models/tool_inputs.py: 99%
  ✓ src/honeycomb/models/tags_mixin.py: 100%
  ✓ src/honeycomb/validation/boards.py: 90%

Resources (30-50% expected):
  - src/honeycomb/resources/markers.py: 24%
  - src/honeycomb/resources/columns.py: 27%

Documentation Sync
------------------
Changed source files without doc updates:
  - src/honeycomb/resources/triggers.py (modified)
    Consider updating: docs/usage/triggers.md

New Code Without Tests
----------------------
  - src/honeycomb/models/new_feature.py: New file, 45% coverage
    Missing tests for: validate_feature(), get_feature()

Failure Handling

•Lint failures (unfixable): Stop and report. User must fix manually.
•Type errors: Stop and report. User must fix.
•Test failures: Stop and report failed tests with tracebacks.
•Doc validation failures: Report but continue if --skip-docs not set.

Quick Commands

bash

# Full CI (default)
/ci

# CI with eval tests (for tool call changes)
/ci --eval

# Skip docs validation
/ci --skip-docs

# Just coverage analysis
/ci --coverage-only