Code Quality Fix All

Fix code quality issues identified in a code quality review. This skill systematically addresses issues found by the code-quality-review-all skill for ANY code quality topic, with validation and testing at each step.

Expected Arguments

When invoked, this skill expects the path to a code quality topic as an argument (e.g., agent_artefacts/code_quality/private_api_imports).

If not provided, the skill will ask the user for the topic path. Within the topic path, there are several files:

•README.md - contains description of the issue and examples of how to fix it
•results.json - contains list of all identified issues
•SUMMARY.md - contains summary of the identified issues

Filters and options are specified interactively after the skill starts by using the AskUserQuestion tool to present options unless specified otherwise in arguments.

•
Which issue types to target:
- •all
- •specific types
- •Fix complexity level (easy only, medium and below, or all)
•Which evaluations to fix (all, specific ones, evaluations with small number of issues)
•Maximum number of issues to fix in this run

Workflow

Phase 1: Understanding the Topic and Planning

•
Read topic documentation
- •
  Read the README.md to understand:
  - •What code quality issue this topic addresses
  - •Why it matters (stability, maintainability, etc.)
  - •How to detect the issue
  - •How to fix the issue (fix patterns, examples)
- •Read results.json to get all identified issues
- •Identify which issues are in scope based on arguments
•
Analyze and categorize issues
- •
  Analyze fix complexity based on:
  - •issue_description
  - •suggested_fix from results.json
  - •Fix examples in README.md
- •
  Classify as:
  - •Easy: Single-line changes, clear fix pattern in README
  - •Medium: Multi-line changes, well-documented fix approach
  - •Hard: No clear fix pattern, requires research or copying code
- •Group issues by evaluation and issue type
- •Generate statistics for presenting to user
•
Ask user for filtering preferences
- •
  Use AskUserQuestion tool to ask:
  - •Which evaluations to fix? (all / specific ones / most affected)
  - •Which issue types to target? (all / specific types)
  - •Fix complexity level? (easy only / easy+medium / all)
  - •Max issues per run? (all / limit to specific number)
- •Apply filters based on user responses
- •
  Present filtered plan with:
  - •Number of issues to fix
  - •Breakdown by evaluation and issue type
  - •Complexity distribution
  - •Ask for final confirmation to proceed
•
Validate understanding of fixes
- •
  For each unique issue type in scope:
  - •Check if README.md documents how to fix it
  - •Look for "Good Examples" and "Bad Examples" sections
  - •Check "suggested_fix" field in results.json
- •
  If fix approach is unclear for any issue type:
  - •Research the correct approach
  - •Update <topic>/README.md with findings
  - •Ask user for guidance if still uncertain

Phase 2: Pre-Fix Validation

For each issue to be fixed:

•
Read and understand context
- •Read the entire file containing the issue (not just the line)
- •Understand how the problematic code is used
- •Look for related issues in the same file
- •Check for patterns that might affect the fix (e.g., multiple occurrences)
- •Identify any cascading changes needed (related imports, type hints, etc.)
•
Validate the suggested fix
- •Review the "suggested_fix" from results.json
- •Check against fix patterns in README.md
- •Verify the fix won't break functionality
- •
  For complex fixes:
  - •Check if dependencies/alternatives actually exist
  - •Validate that replacement code follows same patterns
  - •Consider edge cases
•
Estimate change scope
- •Count how many lines will change for this fix
- •Identify if cascading changes are needed
- •Determine if multiple files need updating
- •
  If changes exceed 100 lines for a single issue:
  - •Alert user with:
    
    •Issue details
    
    •Why the change is large
    
    •What will change
  - •Get explicit approval before proceeding

Phase 3: Applying Fixes

•Apply fixes systematically

•Create a new branch to apply fixes to, with a name like agent/<short_description_of_issue>
•Process one evaluation at a time
•Within each evaluation, group by issue type
•
For each fix:
- •Use Edit tool to apply the change
- •Follow the suggested_fix guidance
- •Apply fix patterns from README.md
- •Handle related issues in same file together
- •Add comments if the fix requires it (e.g., copied code attribution)
•Track what was fixed

•
Verify changes compile/parse
- •
  After fixing each file, validate:
  - •File is syntactically valid (Python can parse it)
  - •No obvious import errors introduced
  - •Code follows repository patterns
- •
  If validation fails:
  - •Investigate the issue
  - •Attempt to fix validation error
  - •Rollback change if cannot be resolved
•
Track progress
- •
  Maintain list of:
  - •Issues successfully fixed (file, line, issue type)
  - •Issues that couldn't be fixed (with reasons)
  - •Evaluations that have been modified
  - •Files that were changed

Phase 4: Testing and Validation

•
Run linting
- •Run repository's linter on modified files (ruff, flake8, mypy, etc.)
- •
  Check for:
  - •Import errors
  - •Type checking errors
  - •Style violations introduced
- •Fix any linting issues that result from changes
- •If linting issues can't be fixed, document them
•
Run unit tests
- •
  Identify test files for each modified evaluation
- •
  Run unit tests for affected evaluations using pytest:
  
  Basic test commands:
  bash
  # Install relevant packages in the event of import failure uv sync --extra test # Run tests for a specific evaluation uv run pytest tests/<evaluation_name>/ # Run a specific test file uv run pytest tests/test_file.py # Run a specific test uv run pytest tests/test_file.py::TestClass::test_method # Run slow tests (excluded by default) uv run pytest --runslow tests/ # Skip dataset download tests uv run pytest -m 'not dataset_download' tests/ # Run only slow tests uv run pytest -m slow tests/
  Test markers to be aware of:
  - •
    @pytest.mark.slow - Tests taking >10 seconds
  - •
    @pytest.mark.dataset_download - Tests that download datasets
  - •
    @pytest.mark.docker - Tests using Docker
  - •
    @pytest.mark.huggingface - HuggingFace-related tests
  - •
    Focus on tests for the specific evaluation
  - •
    Look for test failures or errors
- •
  IMPORTANT: Do NOT run full evaluations (they take too long) unless user explicitly requests it
•
Handle test failures
- •
  For each test failure:
  - •Read test output carefully
  - •Determine if failure is caused by the fix
  - •Check if it's a pre-existing failure
- •
  If caused by fix:
  - •Try to adjust the fix to make tests pass
  - •If cannot be resolved, rollback the change
  - •Document the issue for user review
- •
  If pre-existing:
  - •Note it but don't block on it
  - •Inform user

Phase 5: Re-Review and Handle Remaining Issues

•
Update results.json with fix status
- •
  For each issue that was fixed, add "fix_status" field after "suggested_fix":
  json
  { ... "suggested_fix": "...", "fix_status": "fixed - please review" }
- •
  For issues that couldn't be fixed, add explanation:
  json
  "fix_status": "not fixed - reason: ..."
- •
  IMPORTANT: Do NOT remove any entries from results.json - only add/update "fix_status"
- •
  The code-quality-review-all skill owns results.json and is responsible for removing entries
•
Re-run code quality review
- •IMPORTANT: Use Task tool to spawn subagent running code-quality-review-all skill
- •Pass the same topic path
- •This will update results.json with current state
- •
  Compare results before and after to identify:
  - •Issues that are now resolved (no longer appear)
  - •New issues that may have been introduced
  - •Issues that still remain despite fix attempts
•
Fix remaining issues if in scope
- •
  For each new or remaining in-scope issue:
  - •Investigate why previous fix didn't work
  - •Attempt alternative fix approach
  - •Update "fix_status" with attempt results
- •Repeat this process until no more in-scope issues can be fixed
•
Update topic's README.md
- •Add any knowledge that you have discovered that will be useful in detecting or fixing topic-related issues in the future
- •Do not remove examples of bad code or patterns that were fixed - they will be useful in future reviews and fixes of future evaluations.
•
Update SUMMARY.md
- •
  Add a "Recent Fixes" section with:
  - •Date of fix run
  - •Number of issues fixed
  - •Which evaluations were updated
- •Keep historical data (don't remove past information)
- •Update recommendations to reflect remaining work
•
Run markdown linters
- •Use uv run pre-commit run markdownlint-fix to fix markdown linting issues

Phase 6: Create PR Description and Present Results

•
Create/Update PR description (cumulative)
- •
  Read existing PR_DESCRIPTION.md if it exists (from previous runs)
- •
  Cumulative tracking: PR description represents ALL changes from branch base, not just this run
- •
  If PR_DESCRIPTION.md exists:
  - •Parse existing content to extract previous runs' data
  - •Append information from this run
  - •Update cumulative statistics
- •
  If PR_DESCRIPTION.md doesn't exist (first run):
  - •Create new file
- •
  Format for GitHub/GitLab pull request with:
  - •Summary: Brief overview of the code quality topic and total fixes (2-3 sentences)
  - •Overall Changes (cumulative from all runs):
    
    •Total issues fixed across all runs by type
    
    •Total evaluations affected
    
    •Total files modified
  - •Fix Sessions: List each run session with:
    
    •Date/time of run
    
    •Issues fixed in that session
    
    •Complexity level targeted (easy/medium/all)
  - •Fixed Issues (cumulative): Table or list with all file paths and issue types from all runs
  - •Testing (from latest run):
    
    •Which tests were run
    
    •Pass/fail status
    
    •Any test issues encountered
  - •Remaining Issues (current state):
    
    •Count of issues still open
    
    •Brief note on what remains
  - •Review Notes (cumulative):
    
    •Any complications or special considerations from any run
    
    •Areas that need extra attention during review
- •
  Use proper markdown formatting for PR readability
- •
  IMPORTANT: Do NOT commit PR_DESCRIPTION.md - it's only for creating the PR
- •
  Example structure:
  markdown
  ## Summary Fix private API imports code quality issues across evaluations. ## Overall Changes - Total issues fixed: 25 - Evaluations affected: 8 ## Fix Sessions ### Session 1: 2026-01-18 10:30 (Easy issues) - Fixed 10 easy issues - Targeted: Easy complexity, All evaluations ### Session 2: 2026-01-18 14:15 (Medium issues) - Fixed 15 medium issues - Targeted: Medium complexity, Specific evaluations ## Fixed Issues [Table of all fixed issues from all sessions] ## Testing [Latest test results] ## Remaining Issues 6 issues remain (4 hard, 2 require investigation) ## Review Notes - Session 1: All tests passed - Session 2: One test required adjustment in fortress/scorer.py
•
Present results to user
- •
  Show high-level summary for this run:
  - •X issues fixed in this session
  - •Y issues remain
  - •Z tests passed
- •
  Show cumulative progress from PR_DESCRIPTION.md:
  - •Total issues fixed across all runs
  - •Number of fix sessions completed
- •Show before/after statistics from SUMMARY.md
- •List modified files from this run
- •Display content of PR_DESCRIPTION.md for user review
- •Do NOT automatically commit - let user review changes
•
Offer next steps
- •
  Create commit and PR: Offer to:
  - •Commit all changes (source files, results.json, SUMMARY.md)
  - •Create pull request with description from PR_DESCRIPTION.md
  - •Note: PR_DESCRIPTION.md itself is NOT committed (it's just for PR description)
  - •The PR description includes cumulative changes from all fix sessions on this branch
- •
  Run more fixes: If issues remain, suggest running skill again with different filters
  - •Running again will append to PR_DESCRIPTION.md, creating cumulative tracking
  - •This allows iterative fixing: easy issues first, then medium, then hard
- •Manual review needed: List any issues that require manual attention

Important Guidelines

Safety First

•Never batch all fixes blindly - Validate each fix type before applying en masse
•Always read before editing - Understand context before changing code
•Verify fixes don't break functionality - Run tests incrementally
•Be conservative - Skip fixes you're uncertain about rather than risk breaking code
•Get approval for large changes - Alert user when fixes exceed 100 lines
•Have rollback strategy - Be able to revert if fixes cause problems

Context is Critical

•Understand the quality issue - Read README.md thoroughly
•Understand why code was written that way - There might be good reasons
•Look for patterns - Similar issues often need similar fixes
•Check related code - Fixes might require updating nearby code
•Read existing comments - Developers might have documented why they used certain patterns

Validation at Every Step

•Verify fix patterns from README - Don't guess how to fix
•Check suggested_fix in results.json - Use provided guidance
•Validate changes compile - Ensure code parses after changes
•Run linters - Catch style and import issues
•Run tests - Detect regressions immediately
•Re-run review - Verify fixes actually resolve issues

Communication

•Show plan before executing - Let user see what will be fixed
•Alert for large changes - Get approval for fixes >10 lines
•Report uncertainties - Flag issues where fix approach is unclear
•Show progress - Keep user informed during fixes
•Explain failures - Document why certain issues couldn't be fixed
•Provide detailed reports - Create comprehensive fix reports

What NOT to Do

•Don't assume you know how to fix - Always consult README.md and results.json
•Don't remove entries from results.json - Only add/update "fix_status" field
•Don't replace PR_DESCRIPTION.md - Append to it to maintain cumulative history across runs
•Don't run full evaluations - Only run unit tests (evaluations are slow)
•Don't commit automatically - Let user review changes first
•Don't commit PR_DESCRIPTION.md - It's only for creating the PR
•Don't fix issues you can't validate - Skip rather than risk breaking
•Don't ignore test failures - Investigate or rollback
•Don't make unrelated changes - Only fix the specific quality issues
•Don't assume all issues of same type are identical - Context matters