Validation & Governance Skill

Purpose

Define the go/no-go gates, data quality checks, staleness windows, IC thresholds, and governance requirements that every pipeline run must satisfy before producing output. This skill encodes Wake Robin's fail-closed philosophy: uncertain or stale data triggers exclusion, not graceful degradation.

Preconditions

  • Pipeline runs MUST have an explicit as_of_date parameter (never datetime.now()).
  • All validation uses Decimal arithmetic where scores are involved.
  • PIT cutoff: source_date <= as_of_date - 1 day (standard) or source_date < as_of_date - 2 days (strict mode).

Gate 1: Point-in-Time (PIT) Enforcement

| Rule | Formula | Consequence |
| --- | --- | --- |
| Standard PIT | source_date <= as_of_date - 1 day | Data admitted |
| Strict PIT | source_date < as_of_date - 2 days | Extra buffer for intraday data |
| Lookahead | age_days < 0 (future data) | Reject unconditionally |

Every record must pass PIT admissibility before entering any scoring module. There are no exceptions.
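A minimal sketch of the PIT check, assuming date-granularity inputs. The function names `pit_cutoff` and `is_pit_admissible` are hypothetical and are not taken from common/pit_enforcement.py. With whole dates, the strict rule `source_date < as_of_date - 2 days` is equivalent to `source_date <= as_of_date - 3 days`.

```python
from datetime import date, timedelta


def pit_cutoff(as_of_date: date, strict: bool = False) -> date:
    """Latest admissible source_date (inclusive) for a given as_of_date."""
    # Standard: source_date <= as_of_date - 1 day.
    # Strict:   source_date <  as_of_date - 2 days, i.e. <= as_of_date - 3 days.
    return as_of_date - timedelta(days=3 if strict else 1)


def is_pit_admissible(source_date: date, as_of_date: date, strict: bool = False) -> bool:
    """True only if the record passes the PIT rule.

    Lookahead (source_date after as_of_date) is rejected automatically,
    because the cutoff always lies before as_of_date.
    """
    return source_date <= pit_cutoff(as_of_date, strict)
```

Because the cutoff is a pure function of as_of_date, it can be computed once per run and logged, as the pre-run checklist requires.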


Gate 2: Data Staleness (Phase-Dependent)

Financial Data

| Level | Age (days) | Action | Penalty |
| --- | --- | --- | --- |
| PASS | <= 60 | Green light | 1.0x |
| WARN | 60-90 | Log warning | 1.0x |
| SOFT_GATE | 90-120 | Apply penalty | 0.5x |
| HARD_GATE | > 120 | Exclude | 0.0x |

Trial Data - Phase 3

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 90 | 1.0x |
| WARN | 90-120 | 1.0x |
| SOFT_GATE | 120-180 | 0.6x |
| HARD_GATE | > 180 | Exclude |

Trial Data - Phase 2

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 180 | 1.0x |
| WARN | 180-270 | 1.0x |
| SOFT_GATE | 270-365 | 0.7x |
| HARD_GATE | > 365 | Exclude |

Trial Data - Phase 1

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 270 | 1.0x |
| WARN | 270-365 | 1.0x |
| SOFT_GATE | 365-545 | 0.8x |
| HARD_GATE | > 545 | Exclude |

Market Data

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 3 | 1.0x |
| WARN | 3-5 | 1.0x |
| SOFT_GATE | 5-10 | 0.3x |
| HARD_GATE | > 10 | Exclude |

Short Interest Data (FINRA 2-week lag built in)

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 20 | 1.0x |
| WARN | 20-30 | 1.0x |
| SOFT_GATE | 30-45 | 0.5x |
| HARD_GATE | > 45 | Exclude |

13F Holdings Data (45-day SEC filing lag)

| Level | Age (days) | Penalty |
| --- | --- | --- |
| PASS | <= 60 | 1.0x |
| WARN | 60-90 | 1.0x |
| SOFT_GATE | 90-135 | 0.4x |
| HARD_GATE | > 135 | Exclude |

SEC_13F_FILING_LAG_DAYS: 45 (built-in constant).
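The staleness tables share one shape, so they can be driven from a single lookup. The sketch below is illustrative only: the `Band` type, the dictionary keys, and `classify_staleness` are hypothetical names, not the contents of common/staleness_gates.py. It also assumes each band's upper bound is inclusive (for example, financial data aged exactly 90 days is WARN), which is consistent with PASS being `<=` and HARD_GATE being `>` but is not stated explicitly in the tables.

```python
from decimal import Decimal
from typing import NamedTuple, Tuple


class Band(NamedTuple):
    pass_max: int          # age <= pass_max -> PASS
    warn_max: int          # age <= warn_max -> WARN
    soft_max: int          # age <= soft_max -> SOFT_GATE, otherwise HARD_GATE
    soft_penalty: Decimal  # multiplier applied at SOFT_GATE


# Thresholds copied from the tables above; the keys are hypothetical labels.
BANDS = {
    "financial": Band(60, 90, 120, Decimal("0.5")),
    "trial_phase_3": Band(90, 120, 180, Decimal("0.6")),
    "trial_phase_2": Band(180, 270, 365, Decimal("0.7")),
    "trial_phase_1": Band(270, 365, 545, Decimal("0.8")),
    "market": Band(3, 5, 10, Decimal("0.3")),
    "short_interest": Band(20, 30, 45, Decimal("0.5")),
    "holdings_13f": Band(60, 90, 135, Decimal("0.4")),
}


def classify_staleness(data_type: str, age_days: int) -> Tuple[str, Decimal]:
    """Return (level, score multiplier) for a record of the given age."""
    if age_days < 0:
        raise ValueError("negative age is lookahead; reject at the PIT gate")
    band = BANDS[data_type]  # unknown data types raise KeyError (fail closed)
    if age_days <= band.pass_max:
        return "PASS", Decimal("1.0")
    if age_days <= band.warn_max:
        return "WARN", Decimal("1.0")
    if age_days <= band.soft_max:
        return "SOFT_GATE", band.soft_penalty
    return "HARD_GATE", Decimal("0.0")  # caller must exclude the record
```

A HARD_GATE result carries a zero multiplier, but the intent of the tables is exclusion, so a caller would drop the record rather than score it at zero.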


Gate 3: Data Quality Hard Gates

| Gate | Threshold | Action |
| --- | --- | --- |
| Financial data age | > 90 days | Exclude from scoring |
| Market data age | > 7 days | Exclude from scoring |
| Trial data age | > 30 days | Exclude from scoring |
| Liquidity (ADV) | < $500,000/day | Exclude ticker |
| Price (penny stock) | < $5.00 | Exclude ticker |
| Market field coverage | < 80% fields present | Exclude ticker |
| Financial field coverage | < 50% fields present | Issue warning |

Gate 4: Circuit Breakers

| Condition | Threshold | Action |
| --- | --- | --- |
| Records failing validation | > 20% | Log warning |
| Records failing validation | > 50% | Fail entire pipeline |
| Minimum records for check | < 10 | Skip circuit breaker check |

Circuit breakers prevent silent data corruption from propagating through the pipeline.
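As a rough illustration of the thresholds above, the check can be written with integer comparisons so that no float division is involved. The function name and the returned status strings are hypothetical; on "FAIL" the caller is expected to abort the run.

```python
MIN_RECORDS_FOR_CHECK = 10


def check_circuit_breaker(total_records: int, failed_records: int) -> str:
    """Return "SKIPPED", "OK", "WARN", or "FAIL" for a validation batch."""
    if total_records < MIN_RECORDS_FOR_CHECK:
        return "SKIPPED"  # too few records for the ratio to be meaningful
    # failed / total > 50%  <=>  failed * 2 > total
    if failed_records * 2 > total_records:
        return "FAIL"
    # failed / total > 20%  <=>  failed * 5 > total
    if failed_records * 5 > total_records:
        return "WARN"
    return "OK"
```

Both thresholds are strict inequalities, so a batch with exactly 20% failures is still "OK" and one with exactly 50% failures is "WARN".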


Gate 5: Input Validation

| Validation | Rule | Default |
| --- | --- | --- |
| Ticker format | ^[A-Z]{1,5}$ | Max 5 uppercase alpha |
| Minimum date | >= 1990-01-01 | Historical cutoff |
| Cash | Non-negative | Required, non-negative |
| Market cap | Positive | Required positive |
| Maximum runway | <= 1200 months | 100-year cap |
| Valid records % | >= 10% must pass | Minimum threshold |
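The rules in this table can be sketched as a per-record validator plus a batch-level check. The record field names (`ticker`, `source_date`, `cash`, `market_cap`, `runway_months`), the error labels, and both function names are assumptions made for illustration; runway is treated here as optional. `fullmatch` is used instead of `^...$` so that a trailing newline is also rejected.

```python
import re
from datetime import date
from decimal import Decimal
from typing import List

TICKER_PATTERN = re.compile(r"[A-Z]{1,5}")
MIN_DATE = date(1990, 1, 1)
MAX_RUNWAY_MONTHS = Decimal("1200")


def _finite_decimal(value) -> bool:
    """True for Decimal values that are neither NaN nor infinite."""
    return isinstance(value, Decimal) and value.is_finite()


def validate_record(record: dict) -> List[str]:
    """Return the failed checks; an empty list means the record is valid."""
    errors = []
    if not TICKER_PATTERN.fullmatch(str(record.get("ticker", ""))):
        errors.append("ticker_format")
    source_date = record.get("source_date")
    if type(source_date) is not date or source_date < MIN_DATE:
        errors.append("min_date")
    cash = record.get("cash")
    if not _finite_decimal(cash) or cash < 0:
        errors.append("cash")
    market_cap = record.get("market_cap")
    if not _finite_decimal(market_cap) or market_cap <= 0:
        errors.append("market_cap")
    runway = record.get("runway_months")  # assumed optional
    if runway is not None and (not _finite_decimal(runway) or runway > MAX_RUNWAY_MONTHS):
        errors.append("runway_months")
    return errors


def batch_passes(total_records: int, valid_records: int) -> bool:
    """At least 10% of records must be valid; an empty batch fails closed."""
    return total_records > 0 and valid_records * 10 >= total_records
```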

Gate 6: Score Bounds Validation

All scores must fall within [0, 100]:

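| Score Field | Min | Max | Module |
| --- | --- | --- | --- |
| financial_score | 0.0 | 100.0 | Module 2 |
| clinical_score | 0.0 | 100.0 | Module 4 |
| catalyst_score | 0.0 | 100.0 | Module 3 |
| score_blended | 0.0 | 100.0 | Module 3 v2 |
| composite_score | 0.0 | 100.0 | Module 5 |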

Any score outside [0, 100] is a pipeline error. Fail-closed.
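One way to express this gate is a function that lists every offending field, so the pipeline can fail with a complete error message. This is a sketch under assumptions: `score_violations` is a hypothetical name, and non-Decimal or non-finite values are treated as violations because the preconditions require Decimal arithmetic for scores.

```python
from decimal import Decimal
from typing import List

SCORE_FIELDS = (
    "financial_score",
    "clinical_score",
    "catalyst_score",
    "score_blended",
    "composite_score",
)
SCORE_MIN, SCORE_MAX = Decimal("0"), Decimal("100")


def score_violations(row: dict) -> List[str]:
    """Return the score fields in `row` that violate the [0, 100] bound."""
    bad = []
    for field in SCORE_FIELDS:
        if field not in row:
            continue  # a missing score is a coverage issue, not a bounds issue
        value = row[field]
        in_bounds = (
            isinstance(value, Decimal)
            and value.is_finite()  # rejects NaN and Infinity before comparing
            and SCORE_MIN <= value <= SCORE_MAX
        )
        if not in_bounds:
            bad.append(field)
    return bad
```

A non-empty result would be treated as a pipeline error rather than clamped into range, in keeping with the fail-closed rule.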


Gate 7: Weight Sum Validation

Module 5 component weights must sum to 1.0 within tolerance:

| Constraint | Expected | Tolerance |
| --- | --- | --- |
| Weight sum | 1.0 | +/- 0.01 |

Weights outside tolerance are a configuration error. Fail-closed.
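A short sketch of the weight check in Decimal arithmetic. The function name and the weight keys in the usage line are hypothetical, and the tolerance is assumed to be inclusive (a sum of exactly 1.01 passes).

```python
from decimal import Decimal
from typing import Mapping

EXPECTED_SUM = Decimal("1.0")
TOLERANCE = Decimal("0.01")


def weights_sum_ok(weights: Mapping[str, Decimal]) -> bool:
    """True if the component weights sum to 1.0 within +/- 0.01."""
    # Start the sum from Decimal("0") so the result never becomes an int or float.
    total = sum(weights.values(), Decimal("0"))
    return abs(total - EXPECTED_SUM) <= TOLERANCE


# Usage with made-up component names:
example = {"financial": Decimal("0.40"), "clinical": Decimal("0.35"), "catalyst": Decimal("0.25")}
ok = weights_sum_ok(example)
```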


Gate 8: Module Coverage Minimums

| Module | Minimum Coverage | Action if Below |
| --- | --- | --- |
| Module 2 (Financial) | 80% of universe | Warning |
| Module 3 (Catalyst) | 80% of universe | Warning |
| Module 4 (Clinical) | 80% of universe | Warning |

Gate 9: Severity System

Severity Levels

| Level | Meaning | Score Multiplier | Action |
| --- | --- | --- | --- |
| NONE | Healthy | 1.0 | Include |
| SEV1 | Caution | 0.90 (10% penalty) | Include with flag |
| SEV2 | Warning | 0.50 (50% penalty) | Include, soft gate |
| SEV3 | Critical | 0.00 | Exclude |

SEV3 is a hard gate. The ticker is removed from the rankable universe.
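The severity table maps directly to a multiplier lookup. In this sketch, `apply_severity` is a hypothetical helper that returns `None` for SEV3 so that an excluded ticker cannot be mistaken for a ticker with a score of zero; unknown severity labels raise, which is one reading of fail-closed.

```python
from decimal import Decimal
from typing import Optional

SEVERITY_MULTIPLIER = {
    "NONE": Decimal("1.0"),
    "SEV1": Decimal("0.90"),
    "SEV2": Decimal("0.50"),
    "SEV3": Decimal("0.00"),
}


def apply_severity(score: Decimal, severity: str) -> Optional[Decimal]:
    """Return the penalized score, or None if the ticker must be excluded."""
    if severity not in SEVERITY_MULTIPLIER:
        raise ValueError(f"unknown severity: {severity!r}")
    if severity == "SEV3":
        return None  # hard gate: remove from the rankable universe
    return score * SEVERITY_MULTIPLIER[severity]
```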


Gate 10: Pipeline Health Status

| Component | Coverage Threshold | Status if Below |
| --- | --- | --- |
| catalyst_raw | 10% | DEGRADED |
| momentum | 0% | OPTIONAL |
| smart_money | 0% | OPTIONAL |
| market_data | 0% | OPTIONAL |

Run Status Classification:

  • OK: All thresholds met
  • DEGRADED: Optional components below threshold
  • FAIL: Critical catalyst pipeline broken (< 5% with events)

IC Quality Benchmarks

Information Coefficient Thresholds

| Quality | IC Range | Classification | Action |
| --- | --- | --- | --- |
| Excellent | IC > 0.05 | Institutional-grade | Deploy |
| Good | IC 0.03-0.05 | Tradeable | Use with confidence |
| Weak | IC 0.01-0.03 | Needs enhancement | Monitor |
| Noise | 0 <= IC < 0.01 | No predictive power | Abandon signal |
| Negative | IC < 0 | Inverted signal | Investigate inversion |
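The benchmark bands can be sketched as a simple classifier. `classify_ic` is a hypothetical name. Two boundary choices are assumptions: Noise is taken to cover 0 up to (but not including) 0.01, since negative values have their own band, and the shared boundary 0.03 is assigned to Good.

```python
from decimal import Decimal


def classify_ic(ic: Decimal) -> str:
    """Map an information coefficient to its quality band."""
    if ic < Decimal("0"):
        return "Negative"   # inverted signal: investigate
    if ic < Decimal("0.01"):
        return "Noise"      # no predictive power
    if ic < Decimal("0.03"):
        return "Weak"       # needs enhancement
    if ic <= Decimal("0.05"):
        return "Good"       # tradeable
    return "Excellent"      # IC > 0.05
```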

IC Measurement Constants

| Constant | Value | Purpose |
| --- | --- | --- |
| MIN_OBS_IC | 10 | Minimum observations for IC calculation |
| MIN_OBS_TSTAT | 20 | Minimum for t-statistic |
| MIN_OBS_BOOTSTRAP | 30 | Minimum for bootstrap CI |
| MIN_ROLLING_WINDOW | 12 weeks | Minimum rolling window |
| BOOTSTRAP_ITERATIONS | 1000 | Bootstrap resampling count |
| TSTAT_THRESHOLD_95 | 2.0 | 95% confidence |
| TSTAT_THRESHOLD_99 | 2.58 | 99% confidence |
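To show how the observation minimum and the t-statistic thresholds might interact, here is a sketch that uses the textbook t-statistic for a correlation coefficient, t = IC * sqrt((n - 2) / (1 - IC^2)). This formula is an assumption: backtest/ic_measurement.py may compute the statistic differently (for example, from a time series of periodic ICs). Floats are used because this is analysis code rather than a scoring path.

```python
import math
from typing import Optional

MIN_OBS_TSTAT = 20
TSTAT_THRESHOLD_95 = 2.0
TSTAT_THRESHOLD_99 = 2.58


def ic_tstat(ic: float, n_obs: int) -> Optional[float]:
    """t-statistic of a correlation-style IC, or None if there is too little data."""
    if n_obs < MIN_OBS_TSTAT:
        return None  # below the minimum, report nothing rather than a noisy number
    if abs(ic) >= 1.0:
        return math.copysign(math.inf, ic)  # perfect correlation: avoid dividing by zero
    return ic * math.sqrt((n_obs - 2) / (1.0 - ic * ic))


def is_significant(ic: float, n_obs: int, threshold: float = TSTAT_THRESHOLD_95) -> bool:
    """True if |t| exceeds the threshold; insufficient data is never significant."""
    t = ic_tstat(ic, n_obs)
    return t is not None and abs(t) > threshold
```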

Forward Return Horizons

| Horizon | Trading Days |
| --- | --- |
| 1w | 5 |
| 2w | 10 |
| 1m | 20 |
| 1.5m | 30 |
| 3m | 60 |
| 4.5m | 90 |

Market Cap Buckets (IC Analysis)

| Bucket | Range |
| --- | --- |
| MICRO | < $300M |
| SMALL | $300M - $1B |
| MID | $1B - $5B |
| LARGE | > $5B |

Regime Data Staleness Haircuts

| Data Age | Confidence Multiplier |
| --- | --- |
| <= 2 days | 1.00 (full) |
| 3-5 days | 0.85 (15% haircut) |
| 6-10 days | 0.65 (35% haircut) |
| > 10 days | 0.00 (force UNKNOWN regime) |
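The haircut schedule above translates into a small step function. The function name is hypothetical, and age is assumed to be measured in whole days.

```python
from decimal import Decimal


def regime_confidence_multiplier(age_days: int) -> Decimal:
    """Confidence multiplier for regime data of the given age in days."""
    if age_days < 0:
        raise ValueError("negative age indicates lookahead")
    if age_days <= 2:
        return Decimal("1.00")
    if age_days <= 5:
        return Decimal("0.85")
    if age_days <= 10:
        return Decimal("0.65")
    return Decimal("0.00")  # caller should force the UNKNOWN regime
```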

Production Hardening Limits

File Size Limits

| File Type | Max Size |
| --- | --- |
| JSON files | 100 MB |
| Config files | 10 MB |
| Checkpoint files | 50 MB |

Operation Timeouts

| Operation | Timeout |
| --- | --- |
| File read | 60 seconds |
| Module execution | 600 seconds (10 min) |
| Full pipeline | 3600 seconds (1 hour) |

Logging Sanitization

| Limit | Value |
| --- | --- |
| List items logged | 10 max |
| String length logged | 200 chars max |
| Blocked patterns | api_key, password, secret, token, credential, ssn, account_number, cusip |
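A possible shape for a log sanitizer that enforces these limits is sketched below. It assumes blocked patterns are matched as case-insensitive substrings of dictionary keys, which the table does not specify; the `sanitize` name and the "[REDACTED]" placeholder are likewise hypothetical.

```python
MAX_LIST_ITEMS = 10
MAX_STRING_LENGTH = 200
BLOCKED_PATTERNS = (
    "api_key", "password", "secret", "token",
    "credential", "ssn", "account_number", "cusip",
)


def sanitize(value, key: str = ""):
    """Return a copy of `value` that is safe to write to a log."""
    # Redact by key name first, whatever the type of the value.
    if any(pattern in key.lower() for pattern in BLOCKED_PATTERNS):
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: sanitize(v, str(k)) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value[:MAX_LIST_ITEMS]]
    if isinstance(value, str):
        return value[:MAX_STRING_LENGTH]
    return value
```

Substring matching errs on the side of over-redaction (a key such as `token_count` would also be hidden), which fits the fail-closed stance but may be stricter than the actual implementation.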

Determinism Enforcement

| Setting | Required Value | Purpose |
| --- | --- | --- |
| force_deterministic_timestamps | true | No datetime.now() |
| sort_output_keys | true | Reproducible JSON |
| include_content_hashes | true | Integrity verification |
| random_seed | 42 | Reproducible randomization |

Determinism Rules

  1. Same inputs MUST produce byte-identical outputs
  2. All JSON serialization uses sorted keys
  3. All list operations use deterministic sort keys
  4. Content hashes (SHA256) included in every output for verification
  5. No external API calls during scoring (stdlib only)
  6. All timestamps derived from as_of_date, never from wall clock
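Rules 1, 2, and 4 can be illustrated with stdlib-only canonical serialization and hashing. This is a sketch: the helper names are hypothetical, and serializing Decimal values through `default=str` is an assumption about how scores are written out.

```python
import hashlib
import json


def canonical_json(payload) -> str:
    """Serialize with sorted keys and fixed separators so equal inputs give equal text."""
    return json.dumps(
        payload,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
        default=str,  # e.g. Decimal and date values become strings
    )


def content_hash(payload) -> str:
    """SHA-256 of the canonical serialization, prefixed like parameters_hash."""
    digest = hashlib.sha256(canonical_json(payload).encode("utf-8")).hexdigest()
    return "sha256:" + digest
```

Because key order no longer affects the serialized bytes, recomputing `content_hash` on a reloaded output gives a direct check for the post-run determinism item.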

Governance Metadata Requirements

Every pipeline output MUST include:

```json
{
  "_governance": {
    "run_id": "<deterministic-hash>",
    "score_version": "<version>",
    "schema_version": "<version>",
    "parameters_hash": "sha256:<hash>",
    "pit_cutoff": "<ISO-date>",
    "as_of_date": "<ISO-date>"
  }
}
```
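One way to populate this block deterministically is sketched below. How the real pipeline derives `run_id` is not specified here; hashing as_of_date, the two versions, and the parameters hash together is an assumption, as are the function names and the 16-character truncation. The PIT cutoff uses the standard rule (as_of_date minus one day).

```python
import hashlib
import json
from datetime import date, timedelta


def _sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_governance(as_of_date: date, params: dict,
                     score_version: str, schema_version: str) -> dict:
    """Build the _governance block from run inputs only (no wall clock, no randomness)."""
    params_json = json.dumps(params, sort_keys=True, separators=(",", ":"), default=str)
    parameters_hash = "sha256:" + _sha256_hex(params_json)
    # Hypothetical run_id recipe: hash of everything that defines the run.
    run_seed = "|".join([as_of_date.isoformat(), score_version, schema_version, parameters_hash])
    return {
        "_governance": {
            "run_id": _sha256_hex(run_seed)[:16],
            "score_version": score_version,
            "schema_version": schema_version,
            "parameters_hash": parameters_hash,
            "pit_cutoff": (as_of_date - timedelta(days=1)).isoformat(),
            "as_of_date": as_of_date.isoformat(),
        }
    }
```

Two runs with the same as_of_date, versions, and parameters produce the same run_id, which keeps the metadata itself consistent with the byte-identical output rule.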

Audit Stages

| Stage | When |
| --- | --- |
| INIT | Pipeline initialization |
| LOAD | Data loading |
| ADAPT | Data transformation |
| FEATURES | Feature engineering |
| RISK | Risk calculation |
| SCORE | Scoring execution |
| REPORT | Report generation |
| FINAL | Pipeline completion |

Audit Status Values

| Status | Meaning |
| --- | --- |
| OK | Stage passed |
| FAIL | Stage failed |
| SKIP | Stage skipped |

Standard Error Codes

| Code | Description |
| --- | --- |
| MISSING_INPUT | Required input not found |
| SCHEMA_MISMATCH | Schema validation failed |
| HASH_ERROR | Integrity check failed |
| PARAMS_MISSING | Parameters incomplete |
| MAPPING_MISSING | Mapping not found |
| VALIDATION_ERROR | Data validation failed |
| UNKNOWN_ERROR | Unclassified error |

Schema Version Support

| Module | Supported Versions |
| --- | --- |
| module_1 | 1.0.0 |
| module_2 | 1.0.0 |
| module_3 | dynamic, m3catalyst_vnext_20260111 |
| module_4 | 1.0.0 |
| module_5 | 1.0.0, 1.1.0 |

Enhancement Engine Confidence Thresholds

| Engine | Confidence Gate | Effect Below Gate |
| --- | --- | --- |
| PoS | 0.40 | PoS weight -> 0 |
| Momentum | 0.50 | Momentum not meaningful |
| Smart Money | 0.50 | Smart money signal excluded |
| Valuation | 0.40 | Valuation fallback to sector |

Pre-Run Checklist

Before executing a pipeline run, verify:

  1. as_of_date is explicitly provided (never derived from wall clock)
  2. All input files exist and are within size limits
  3. PIT cutoff is computed and logged
  4. Schema versions match expected versions
  5. Weight sums are within tolerance
  6. No float arithmetic in scoring paths (only Decimal)
  7. No datetime.now() calls in any module
  8. No random module usage without explicit seed
  9. Audit log writer is initialized
  10. Run ID is deterministically generated

Post-Run Checklist

After a pipeline run completes, verify:

  1. All output scores are within [0, 100]
  2. Governance metadata is present in every output file
  3. Content hashes match recomputed hashes (determinism check)
  4. No SEV3 tickers appear in ranked output
  5. Coverage metrics are logged (per-module and per-signal)
  6. Circuit breaker did not trip silently
  7. Staleness penalties were applied where required
  8. Audit log contains entries for all stages (INIT through FINAL)

Source Files

| Component | File |
| --- | --- |
| Data Quality Gates | common/data_quality.py |
| Staleness Gates | common/staleness_gates.py |
| PIT Enforcement | common/pit_enforcement.py |
| Input Validation | common/input_validation.py |
| Integration Contracts | common/integration_contracts.py |
| Schema Validation | common/schema_validation.py |
| Production Hardening | common/production_hardening.py |
| Robustness Utilities | common/robustness.py |
| IC Measurement | backtest/ic_measurement.py |
| Audit Log | governance/audit_log.py |
| Pipeline Config | config.yml |