Scope Completeness Protocol
This skill ensures comprehensive coverage in batch operations, preventing missed items and incomplete processing.
Core Principle
Process ALL items, not just obvious ones
Before any batch operation:
- •Use comprehensive glob patterns to find ALL matching items
- •List all items explicitly: "Found N items: [list]"
- •Check multiple locations (root, subdirectories, dot-directories)
- •Verify completeness: "Processed N/N items"
The Complete Scope Workflow
Step 1: Define Scope
Clearly define what needs processing:
python
# Define scope parameters
scope_definition = {
'file_patterns': ['*.py', '*.js', '*.tsx'],
'directories': ['src/', 'tests/', 'scripts/', '.github/'],
'exclude_patterns': ['*.pyc', '__pycache__', 'node_modules'],
'include_hidden': True, # Don't forget dot directories!
}
print("Scope Definition:")
print(f" Patterns: {scope_definition['file_patterns']}")
print(f" Directories: {scope_definition['directories']}")
print(f" Include hidden: {scope_definition['include_hidden']}")
Step 2: Comprehensive Discovery
Find ALL items matching scope:
bash
# Use multiple search strategies to ensure completeness echo "=== Discovery Phase ===" # Strategy 1: fd (file finder) echo "Using fd..." fd -t f -e py -e js -e tsx . > files_fd.txt # Strategy 2: ripgrep files echo "Using ripgrep..." rg --files | grep -E '\.(py|js|tsx)$' > files_rg.txt # Strategy 3: find command echo "Using find..." find . -type f \( -name "*.py" -o -name "*.js" -o -name "*.tsx" \) > files_find.txt # Combine and deduplicate cat files_fd.txt files_rg.txt files_find.txt | sort -u > all_files.txt # Count and verify TOTAL_FILES=$(wc -l < all_files.txt) echo "Found $TOTAL_FILES files total"
Step 3: Create Explicit Inventory
List ALL items before processing:
python
def create_inventory(patterns, directories):
"""Create comprehensive inventory of items to process."""
import glob
from pathlib import Path
inventory = {
'files': [],
'directories': [],
'by_type': {},
'by_location': {}
}
# Search in all specified directories
for directory in directories:
# Check if directory exists
if not Path(directory).exists():
print(f"⚠️ Directory not found: {directory}")
continue
for pattern in patterns:
# Search in main directory
found = glob.glob(f"{directory}/**/{pattern}", recursive=True)
inventory['files'].extend(found)
# Don't forget hidden directories!
hidden_found = glob.glob(f"{directory}/**/.*/{pattern}", recursive=True)
inventory['files'].extend(hidden_found)
# Remove duplicates and sort
inventory['files'] = sorted(set(inventory['files']))
# Categorize by type
for file in inventory['files']:
ext = Path(file).suffix
if ext not in inventory['by_type']:
inventory['by_type'][ext] = []
inventory['by_type'][ext].append(file)
# Report
print(f"\n📋 INVENTORY COMPLETE")
print(f"Total files: {len(inventory['files'])}")
print("\nBy type:")
for ext, files in inventory['by_type'].items():
print(f" {ext}: {len(files)} files")
print("\nFirst 10 files:")
for file in inventory['files'][:10]:
print(f" - {file}")
if len(inventory['files']) > 10:
print(f" ... and {len(inventory['files']) - 10} more")
return inventory
Step 4: Process with Progress Tracking
Process items with explicit progress reporting:
python
def process_with_tracking(items, operation):
"""Process items with detailed progress tracking."""
total = len(items)
processed = 0
failed = []
skipped = []
print(f"\n🔄 PROCESSING {total} items")
print("=" * 50)
for i, item in enumerate(items, 1):
try:
# Show progress every 10 items or for small batches
if i % 10 == 0 or total < 50:
print(f"Progress: {i}/{total} ({i*100//total}%)")
# Apply operation
result = operation(item)
if result == 'skipped':
skipped.append(item)
else:
processed += 1
except Exception as e:
print(f"❌ Failed: {item} - {e}")
failed.append((item, str(e)))
# Final report
print("\n" + "=" * 50)
print("📊 PROCESSING COMPLETE")
print(f" ✅ Processed: {processed}/{total}")
print(f" ⏭️ Skipped: {len(skipped)}")
print(f" ❌ Failed: {len(failed)}")
if failed:
print("\nFailed items:")
for item, error in failed[:5]:
print(f" - {item}: {error}")
if len(failed) > 5:
print(f" ... and {len(failed) - 5} more")
return processed, skipped, failed
Step 5: Verify Completeness
Ensure nothing was missed:
bash
# Verification checklist
echo "=== COMPLETENESS VERIFICATION ==="
# 1. Check item count matches
EXPECTED=$(wc -l < all_files.txt)
PROCESSED=$(wc -l < processed_files.txt)
if [ "$EXPECTED" -eq "$PROCESSED" ]; then
echo "✅ All $EXPECTED files processed"
else
echo "❌ Mismatch: Expected $EXPECTED, Processed $PROCESSED"
# Find what was missed
comm -23 <(sort all_files.txt) <(sort processed_files.txt) > missed_files.txt
echo "Missed files:"
cat missed_files.txt
fi
# 2. Check common hiding spots
echo "Checking hidden directories..."
fd -H -t f -e py -e js -e tsx . | grep -E '/\.' > hidden_files.txt
if [ -s hidden_files.txt ]; then
echo "⚠️ Found files in hidden directories:"
head -5 hidden_files.txt
fi
# 3. Check commonly missed locations
COMMON_MISSED=(".github" ".vscode" "docs" "examples" "samples")
for dir in "${COMMON_MISSED[@]}"; do
if [ -d "$dir" ]; then
count=$(find "$dir" -type f \( -name "*.py" -o -name "*.js" \) | wc -l)
echo " $dir: $count relevant files"
fi
done
Common Scope Pitfalls
Pitfall 1: Missing Hidden Directories
bash
# ❌ BAD: Misses .github/, .vscode/, etc find . -name "*.py" # ✅ GOOD: Includes hidden directories find . -name "*.py" -o -path "*/.*/*.py"
Pitfall 2: Incomplete Glob Patterns
bash
# ❌ BAD: Misses .jsx, .mjs, .cjs files fd -e js # ✅ GOOD: Comprehensive patterns fd -e js -e jsx -e mjs -e cjs -e ts -e tsx
Pitfall 3: Not Checking All Locations
python
# ❌ BAD: Only checks src/
files = glob.glob("src/**/*.py")
# ✅ GOOD: Checks all relevant directories
locations = ["src/", "tests/", "scripts/", "tools/", ".github/"]
files = []
for loc in locations:
files.extend(glob.glob(f"{loc}**/*.py", recursive=True))
Batch Operation Strategies
Strategy 1: Parallel Processing
python
# Process in parallel for performance
from concurrent.futures import ThreadPoolExecutor
def batch_process_parallel(items, operation, max_workers=4):
with ThreadPoolExecutor(max_workers=max_workers) as executor:
results = list(executor.map(operation, items))
return results
Strategy 2: Chunked Processing
python
# Process in chunks for large batches
def chunk_process(items, chunk_size=100):
for i in range(0, len(items), chunk_size):
chunk = items[i:i+chunk_size]
print(f"Processing chunk {i//chunk_size + 1}: {len(chunk)} items")
process_chunk(chunk)
Strategy 3: Safe Mode Processing
python
# Dry run first, then actual processing
def safe_batch_process(items, operation):
# Dry run
print("🔍 DRY RUN - No changes will be made")
issues = []
for item in items:
if not validate_item(item):
issues.append(item)
if issues:
print(f"⚠️ Found {len(issues)} potential issues")
if not confirm("Continue anyway?"):
return
# Actual processing
print("🚀 EXECUTING - Making changes")
process_with_tracking(items, operation)
Progress Reporting Format
Use this standard format for batch operations:
code
================================= BATCH OPERATION: [Description] ================================= Scope: [patterns/criteria] Total items: N Discovery Phase: ✓ Found N files in src/ ✓ Found N files in tests/ ✓ Found N files in .github/ Total: N items Processing: [████████████░░░░░░░] 75% (750/1000) Current: processing file_xyz.py Elapsed: 2m 30s Estimated: 50s remaining Summary: ✅ Processed: 950/1000 ⏭️ Skipped: 30 (unchanged) ❌ Failed: 20 (see errors.log) Verification: ✓ All items in scope processed ✓ No files missed ✓ Results validated =================================
Scripts
Scope Discovery Tool
See scripts/discover_scope.py for automated scope discovery
Batch Processor
See scripts/batch_processor.py for parallel batch processing
References
- •Glob pattern guide: references/glob_patterns.md
- •Batch operation best practices: references/batch_operations.md