QuantConnect Validation Skill (Phase 5)

Purpose: Walk-forward validation for Phase 5 robustness testing before deployment.

Progressive Disclosure: This primer contains essentials only. Full details are available via python SCRIPTS/qc_validate.py --help.

When to Use This Skill

Load when:

•Running /qc-validate command
•Testing out-of-sample performance
•Evaluating strategy robustness
•Making deployment decisions

Tool: Use python SCRIPTS/qc_validate.py for walk-forward validation

Walk-Forward Validation Overview

Purpose: Detect overfitting and confirm that the strategy generalizes to new data.

Approach:

•Training (in-sample): Develop/optimize on 80% of data
•Testing (out-of-sample): Validate on remaining 20%
•Compare: Measure performance degradation

Example (5-year backtest 2019-2023):

•In-sample: 2019-2022 (4 years) - Training period
•Out-of-sample: 2023 (1 year) - Testing period
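The split in this example can be sketched in a few lines of Python. This is a minimal illustration only: the function name `split_date_range` and the day-based rounding are assumptions, and qc_validate.py may compute its split differently.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, train_fraction: float = 0.80):
    """Chronological split: returns ((is_start, is_end), (oos_start, oos_end)).

    Hypothetical helper for illustration; not taken from qc_validate.py.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    total_days = (end - start).days
    if total_days < 2:
        raise ValueError("date range is too short to split")
    # Last in-sample day; OOS starts the next day so the periods never overlap.
    is_end = start + timedelta(days=round(total_days * train_fraction))
    oos_start = is_end + timedelta(days=1)
    return (start, is_end), (oos_start, end)


in_sample, out_of_sample = split_date_range(date(2019, 1, 1), date(2023, 12, 31))
print("In-sample:    ", in_sample[0], "to", in_sample[1])
print("Out-of-sample:", out_of_sample[0], "to", out_of_sample[1])
```

The split is strictly chronological (no shuffling), so the out-of-sample period always lies after the training period.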

Key Metrics

1. Performance Degradation

Formula: (IS Sharpe - OOS Sharpe) / IS Sharpe

Degradation	Quality	Decision
< 15%	Excellent	Deploy with confidence
15-30%	Acceptable	Deploy but monitor
30-40%	Concerning	Escalate to human
> 40%	Severe	Abandon (overfit)

Key Insight: Degradation below 15% indicates a robust strategy that generalizes well.
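As a rough sketch, the formula and the threshold table translate to the Python below. The function names are hypothetical, and the handling of exact boundary values (15%, 30%, 40%) is an assumption, since the table does not say which band a boundary value belongs to.

```python
def performance_degradation(is_sharpe: float, oos_sharpe: float) -> float:
    """(IS Sharpe - OOS Sharpe) / IS Sharpe, returned as a fraction (0.15 == 15%)."""
    if is_sharpe <= 0:
        # Assumption: the ratio is only meaningful for a positive IS Sharpe.
        raise ValueError("IS Sharpe must be positive")
    return (is_sharpe - oos_sharpe) / is_sharpe


def degradation_quality(degradation: float) -> str:
    """Map a degradation fraction onto the quality bands from the table."""
    if degradation < 0.15:
        return "Excellent"
    if degradation <= 0.30:  # assumption: exactly 30% counts as Acceptable
        return "Acceptable"
    if degradation <= 0.40:  # assumption: exactly 40% counts as Concerning
        return "Concerning"
    return "Severe"


d = performance_degradation(is_sharpe=1.5, oos_sharpe=0.6)
print(f"{d:.0%} -> {degradation_quality(d)}")
```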

2. Robustness Score

Formula: OOS Sharpe / IS Sharpe

Score	Quality	Interpretation
> 0.75	High	Strategy robust across periods
0.60-0.75	Moderate	Acceptable but monitor
< 0.60	Low	Strategy unstable

Key Insight: A score above 0.75 indicates the strategy maintains its performance out-of-sample.
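The robustness score can be sketched the same way. The function names are hypothetical. Note that, by the two formulas, robustness is simply 1 minus the degradation fraction, so the two metrics always move together.

```python
def robustness_score(is_sharpe: float, oos_sharpe: float) -> float:
    """OOS Sharpe / IS Sharpe."""
    if is_sharpe <= 0:
        # Assumption: the ratio is only meaningful for a positive IS Sharpe.
        raise ValueError("IS Sharpe must be positive")
    return oos_sharpe / is_sharpe


def robustness_quality(score: float) -> str:
    """Map a robustness score onto the quality bands from the table."""
    if score > 0.75:
        return "High"
    if score >= 0.60:
        return "Moderate"
    return "Low"


score = robustness_score(is_sharpe=0.97, oos_sharpe=0.89)
print(f"{score:.2f} -> {robustness_quality(score)}")
```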

Quick Usage

Run Walk-Forward Validation

```bash
# From hypothesis directory with iteration_state.json
python SCRIPTS/qc_validate.py run --strategy strategy.py

# Custom split ratio (default 80/20)
python SCRIPTS/qc_validate.py run --strategy strategy.py --split 0.70
```

What it does:

•Reads project_id from iteration_state.json
•Splits date range (80/20 default)
•Runs in-sample backtest
•Runs out-of-sample backtest
•Calculates degradation and robustness
•Saves results to PROJECT_LOGS/validation_result.json

Analyze Results

```bash
python SCRIPTS/qc_validate.py analyze --results PROJECT_LOGS/validation_result.json
```

Output:

•Performance comparison table
•Degradation percentage
•Robustness assessment
•Deployment recommendation

Decision Integration

After validation, the decision framework evaluates:

DEPLOY_STRATEGY (Deploy with confidence):

•Degradation < 15% AND
•Robustness > 0.75 AND
•OOS Sharpe > 0.7

PROCEED_WITH_CAUTION (Deploy but monitor):

•Degradation < 30% AND
•Robustness > 0.60 AND
•OOS Sharpe > 0.5

ABANDON_HYPOTHESIS (Too unstable):

•Degradation > 40% OR
•Robustness < 0.5 OR
•OOS Sharpe < 0

ESCALATE_TO_HUMAN (Borderline):

•Results don't clearly fit above criteria
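The four outcomes above can be expressed as a single function. This is a sketch under stated assumptions: the function name `phase5_decision` is hypothetical, and the evaluation order (abandon first, then deploy, then caution, otherwise escalate) is one reasonable reading of the criteria, not necessarily how the actual decision framework is implemented.

```python
def phase5_decision(degradation: float, robustness: float, oos_sharpe: float) -> str:
    """Apply the Phase 5 thresholds; degradation is a fraction (0.15 == 15%)."""
    # Any single abandon condition is enough (OR).
    if degradation > 0.40 or robustness < 0.5 or oos_sharpe < 0:
        return "ABANDON_HYPOTHESIS"
    # Deployment requires all three conditions (AND).
    if degradation < 0.15 and robustness > 0.75 and oos_sharpe > 0.7:
        return "DEPLOY_STRATEGY"
    if degradation < 0.30 and robustness > 0.60 and oos_sharpe > 0.5:
        return "PROCEED_WITH_CAUTION"
    # Anything that fits none of the above is borderline.
    return "ESCALATE_TO_HUMAN"


print(phase5_decision(degradation=0.082, robustness=0.92, oos_sharpe=0.89))
print(phase5_decision(degradation=0.35, robustness=0.65, oos_sharpe=0.60))
```

With this ordering, a result such as 35% degradation lands in ESCALATE_TO_HUMAN, which matches the "Concerning" band in the degradation table.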

Best Practices

1. Time Splits

•Standard: 80/20 (4 years training, 1 year testing)
•Conservative: 70/30 (more OOS testing)
•Very Conservative: 60/40 (extensive testing)

Minimum OOS period: 6 months (1 year preferred)

2. Never Peek at Out-of-Sample

CRITICAL RULE: Never adjust strategy based on OOS results.

•OOS is for testing only
•Adjusting based on OOS defeats validation purpose
•If you adjust, OOS becomes in-sample

3. Check Trade Count

Both periods need sufficient trades:

•In-sample: Minimum 30 trades (50+ preferred)
•Out-of-sample: Minimum 10 trades (20+ preferred)

Too few trades = unreliable validation.
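A trade-count check along these lines could run before any decision is made. The helper name `trade_count_issues` and the message wording are hypothetical; only the thresholds come from this primer.

```python
from typing import List

MIN_IS_TRADES, PREFERRED_IS_TRADES = 30, 50
MIN_OOS_TRADES, PREFERRED_OOS_TRADES = 10, 20


def trade_count_issues(is_trades: int, oos_trades: int) -> List[str]:
    """Return human-readable issues; an empty list means both counts are fine."""
    issues = []
    if is_trades < MIN_IS_TRADES:
        issues.append(f"in-sample: {is_trades} trades, below minimum {MIN_IS_TRADES}")
    elif is_trades < PREFERRED_IS_TRADES:
        issues.append(f"in-sample: {is_trades} trades, below preferred {PREFERRED_IS_TRADES}")
    if oos_trades < MIN_OOS_TRADES:
        issues.append(f"out-of-sample: {oos_trades} trades, below minimum {MIN_OOS_TRADES}")
    elif oos_trades < PREFERRED_OOS_TRADES:
        issues.append(f"out-of-sample: {oos_trades} trades, below preferred {PREFERRED_OOS_TRADES}")
    return issues


for issue in trade_count_issues(is_trades=120, oos_trades=3):
    print(issue)
```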

4. Compare Multiple Metrics

Don't just look at Sharpe:

•Sharpe Ratio degradation
•Max Drawdown increase
•Win Rate change
•Profit Factor degradation
•Trade Count consistency

All metrics should degrade by similar amounts for a robust strategy.
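One way to compare metrics on a common scale is to express each as a relative worsening from IS to OOS, as sketched below. The metric keys, the helper names, and the use of a simple max-minus-min spread are all assumptions for illustration; this primer does not define how "similarly" is measured.

```python
# Direction of each metric: True means a higher value is better.
HIGHER_IS_BETTER = {
    "sharpe": True,
    "max_drawdown": False,
    "win_rate": True,
    "profit_factor": True,
}


def metric_degradation(is_metrics: dict, oos_metrics: dict) -> dict:
    """Relative worsening per metric; positive means OOS is worse than IS."""
    result = {}
    for name, higher_is_better in HIGHER_IS_BETTER.items():
        if name not in is_metrics or name not in oos_metrics:
            continue
        is_value, oos_value = is_metrics[name], oos_metrics[name]
        if is_value == 0:
            continue  # relative change is undefined for a zero baseline
        change = (is_value - oos_value) / abs(is_value)
        result[name] = change if higher_is_better else -change
    return result


def degradation_spread(degradations: dict) -> float:
    """Gap between the most and least degraded metric (0.0 if empty)."""
    values = list(degradations.values())
    return max(values) - min(values) if values else 0.0


report = metric_degradation(
    {"sharpe": 0.97, "max_drawdown": 0.18},
    {"sharpe": 0.89, "max_drawdown": 0.22},
)
for name, value in report.items():
    print(f"{name}: {value:+.1%}")
print(f"spread: {degradation_spread(report):.1%}")
```

A large spread suggests one metric is deteriorating much faster than the others and deserves a closer look, even when Sharpe degradation alone looks acceptable.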

Common Issues

Severe Degradation (> 40%)

Cause: Strategy overfit to in-sample period

Example:

•IS Sharpe: 1.5 → OOS Sharpe: 0.6
•Degradation: 60%

Decision: ABANDON_HYPOTHESIS

Fix for next hypothesis: Simplify the strategy (fewer parameters) and use a longer training period

Different Market Regimes

Cause: IS was a bull market, OOS was a bear market

Example (regime labels are illustrative):

•2019-2022 (bull): Sharpe 1.2
•2023 (bear): Sharpe -0.3

Decision: Not necessarily overfit, but not robust across regimes. Because OOS Sharpe < 0, the decision framework still returns ABANDON_HYPOTHESIS.

Fix: Test across multiple regimes, add regime detection

Low Trade Count in OOS

Cause: Strategy stops trading in OOS period

Example:

•IS: 120 trades → OOS: 3 trades

Decision: ESCALATE_TO_HUMAN (insufficient OOS data: 3 trades is below the 10-trade minimum)

Integration with /qc-validate

The /qc-validate command workflow:

•Read iteration_state.json for project_id and parameters
•Load this skill for validation approach
•Modify strategy for time splits (80/20)
•Run in-sample and OOS backtests
•Calculate degradation and robustness
•Evaluate using decision framework
•Update iteration_state.json with results
•Git commit with validation summary

Reference Documentation

Need implementation details? All reference documentation is accessible via --help:

```bash
python SCRIPTS/qc_validate.py --help
```

This is the only way to access the complete reference documentation.

Topics covered in --help:

•Walk-forward validation methodology
•Performance degradation thresholds
•Monte Carlo validation techniques
•PSR/DSR statistical metrics
•Common errors and fixes
•Phase 5 decision criteria

The primer above covers 90% of use cases. Use --help for edge cases and detailed analysis.

Related Skills

•quantconnect - Core strategy development
•quantconnect-backtest - Phase 3 backtesting (qc_backtest.py:**)
•quantconnect-optimization - Phase 4 optimization (qc_optimize.py:**)
•decision-framework - Decision thresholds
•backtesting-analysis - Metric interpretation

Key Principles

•OOS is sacred - Never adjust strategy based on OOS results
•Degradation < 15% is excellent - Strategy generalizes well
•Robustness > 0.75 is target - Maintains performance OOS
•Trade count matters - Need sufficient trades in both periods
•Multiple metrics - All should degrade similarly for robustness

Example Decision

```
In-Sample (2019-2022):
  Sharpe: 0.97, Drawdown: 18%, Trades: 142

Out-of-Sample (2023):
  Sharpe: 0.89, Drawdown: 22%, Trades: 38

Degradation: 8.2% (< 15%)
Robustness: 0.92 (> 0.75)

→ DEPLOY_STRATEGY (minimal degradation, high robustness)
```

Version: 2.0.0 (Progressive Disclosure)
Last Updated: November 13, 2025
Lines: ~190 (was 463)
Context Reduction: 59%