Watch Mode

Use this skill when the user wants continuous regression monitoring during development. Watch mode observes file changes and automatically re-runs evalview check, debouncing bursts of changes so that a rapid series of saves triggers a single run.

What this does

EvalView's watch mode uses watchdog to monitor directories for changes to files with the following extensions: .py, .yaml, .yml, .json, .md, .txt, .toml, .cfg, .ini. When a change is detected, it runs a regression check via the gate() API and displays a live scorecard with pass/fail status, score deltas, tool changes, and streak tracking.
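The filter-then-debounce behavior described above can be sketched in plain Python. This is only an illustration of the idea, not EvalView's implementation: the Debouncer class, its method names, and the injected clock are hypothetical, and the real watch mode relies on watchdog events rather than polling.

```python
import os
import time

# Extensions listed in the paragraph above.
WATCHED_EXTENSIONS = {".py", ".yaml", ".yml", ".json", ".md", ".txt", ".toml", ".cfg", ".ini"}


def is_watched(path: str) -> bool:
    """Return True if the file's extension is one of the watched ones."""
    return os.path.splitext(path)[1] in WATCHED_EXTENSIONS


class Debouncer:
    """Hypothetical helper: collapse a burst of change events into one trigger."""

    def __init__(self, interval: float = 2.0, clock=time.monotonic):
        self.interval = interval      # quiet period in seconds (2.0 mirrors the documented default)
        self._clock = clock           # injectable so the logic can be tested without sleeping
        self._last_event = None       # time of the most recent unhandled change

    def notify(self, path: str) -> None:
        """Record a change; files with other extensions are ignored."""
        if is_watched(path):
            self._last_event = self._clock()

    def should_run(self) -> bool:
        """True once `interval` seconds have passed with no new changes."""
        if self._last_event is None:
            return False
        if self._clock() - self._last_event >= self.interval:
            self._last_event = None   # reset so one burst yields exactly one run
            return True
        return False
```

A caller would invoke notify() for every file event and poll should_run() periodically, starting a check only when it returns True. Each new change restarts the quiet period, which is why several quick saves produce a single check.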

How to start watch mode

Watch mode is a CLI command (not an MCP tool). Help the user run it:

evalview watch

Common options

• --quick: Skip the LLM judge and run deterministic checks only ($0 cost, completes in under a second)
• --path src/ --path tests/: Watch specific directories; repeat the flag for each directory (default: current directory)
• --test "my-test": Only check a specific test by name
• --test-dir tests/evalview: Path to the test cases directory (default: tests)
• --interval 1: Debounce interval in seconds (default: 2.0)
• --fail-on REGRESSION,TOOLS_CHANGED: Comma-separated statuses that count as failure (default: REGRESSION)
• --sound: Ring the terminal bell on regression
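To make the --fail-on semantics concrete, here is a minimal sketch of how a comma-separated status list could be turned into a failure test. The function names are hypothetical and the whitespace trimming is an assumption. The status names are the ones that appear in this document, and EvalView's own parsing may differ.

```python
from typing import Set


def parse_fail_on(value: str = "REGRESSION") -> Set[str]:
    """Split a comma-separated status list into a set of status names.

    The default mirrors the documented default of --fail-on.
    Trimming whitespace around each name is an assumption.
    """
    return {part.strip() for part in value.split(",") if part.strip()}


def is_failure(status: str, fail_on: Set[str]) -> bool:
    """A result counts as a failure only if its status is in the set."""
    return status in fail_on


# With the default, a tool change is reported but does not fail the run.
default_set = parse_fail_on()
strict_set = parse_fail_on("REGRESSION,TOOLS_CHANGED,OUTPUT_CHANGED")
```

Under this reading, adding more statuses to --fail-on only widens the set of results treated as failures. It does not change which checks are run.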

Examples

# Basic: watch everything, full checks
evalview watch

# Fast development loop: no LLM judge, 1-second debounce
evalview watch --quick --interval 1

# Watch specific directories and one test
evalview watch --path src/ --path tests/ --test "calculator-division"

# Strict mode: fail on any behavioral change
evalview watch --fail-on REGRESSION,TOOLS_CHANGED,OUTPUT_CHANGED --sound

Prerequisites

Watch mode requires the watchdog package. If it is not installed, install EvalView with the watch extra:

# Quote the requirement so shells such as zsh do not treat the brackets as a glob pattern
pip install "evalview[watch]"
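Before suggesting the install command, it can help to confirm whether watchdog is actually missing. The following sketch uses only the standard library. The has_module helper is a hypothetical name, and the check should be run with the same Python interpreter that runs evalview.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if not has_module("watchdog"):
    # Only a hint for the user; nothing is installed automatically.
    print('watchdog is missing; try: pip install "evalview[watch]"')
```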

Notes

• Watch mode automatically excludes .evalview/, .git/, venv/, node_modules/, __pycache__/, and other common non-source directories.
• The initial check runs immediately on startup, before watching begins.
• Results include a live scorecard with pass counts, regression counts, health percentage, and streak info.
• --quick mode is ideal for tight development loops: it costs nothing and completes in under a second.
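The directory exclusion in the first note can be pictured as a check on each component of a changed file's path. This is a sketch under stated assumptions: the set below contains only the directories named above (the actual list is longer), and is_excluded is a hypothetical helper, not part of EvalView.

```python
from pathlib import PurePath

# Only the directories named in the notes; the real exclusion list includes others.
EXCLUDED_DIRS = {".evalview", ".git", "venv", "node_modules", "__pycache__"}


def is_excluded(path: str) -> bool:
    """True if any component of the path is an excluded directory name."""
    return any(part in EXCLUDED_DIRS for part in PurePath(path).parts)
```

Matching on path components rather than on a prefix means a nested directory such as src/__pycache__/ is skipped as well as a top-level one.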