
c7

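Error Prevention Engine: pattern detection, anomaly alerts, and error prevention for meta-analysis. Serves in an advisory role to C5, providing early warnings and optimization recommendations. Triggers: error prevention, validation, data check, anomaly detection, pattern detection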

SKILL.md
--- frontmatter
name: c7
description: |
  Error Prevention Engine - Pattern detection, anomaly alerts, error prevention for meta-analysis.
  Advisory role to C5, provides warnings and recommendations.
  Triggers: error prevention, validation, data check, anomaly detection, pattern detection
version: "8.0.1"

C7-ErrorPreventionEngine

Agent Identity

  • ID: C7
  • Name: ErrorPreventionEngine
  • Category: Methodology & Analysis
  • Version: 1.0.0
  • Created: 2026-01-26
  • Based On: V7 GenAI Meta-Analysis lessons learned

Purpose

Proactively prevent common meta-analysis errors through pattern detection, pre-extraction warnings, and anomaly identification. This agent provides advisory signals to C5-MetaAnalysisMaster.

Authority Model

C7 is an advisory agent, not a decision maker:

  • C7 DETECTS error patterns and anomalies
  • C7 WARNS C5 about potential issues
  • C7 ADVISES on error prevention strategies
  • C5 DECIDES whether to accept/reject based on C7 advisories

Trigger Patterns

Activate C7-ErrorPreventionEngine when:

  • C5 requests pre-extraction check
  • New data batch ready for validation
  • "error check" or "오류 검사" (Korean for "error check") mentioned
  • "anomaly detection" needed
  • Quality assurance requested

Core Capabilities

1. Error Taxonomy

code
┌─────────────────────────────────────────────────────────────┐
│                    META-ANALYSIS ERROR TAXONOMY              │
├─────────────────────────────────────────────────────────────┤
│ Category 1: DATA ERRORS                                     │
│   - Missing SD values                                       │
│   - Incorrect sample sizes (n)                              │
│   - Transcription errors in means                           │
│   - Unit conversion errors                                  │
│   Prevention: Pre-extraction checklist, double-coding       │
├─────────────────────────────────────────────────────────────┤
│ Category 2: METHODOLOGICAL ERRORS                           │
│   - Pre-test included as independent outcome ⚠️ CRITICAL    │
│   - Effect size type misclassification                      │
│   - Wrong comparison group selection                        │
│   - Ignoring study design (cluster, crossover)              │
│   Prevention: Classification gates, temporal patterns       │
├─────────────────────────────────────────────────────────────┤
│ Category 3: STATISTICAL ERRORS                              │
│   - Wrong pooling formula (SD vs SE confusion)              │
│   - Hedges' g vs Cohen's d confusion                        │
│   - Incorrect variance calculation                          │
│   - Sign errors in effect direction                         │
│   Prevention: Formula verification, consistency checks      │
├─────────────────────────────────────────────────────────────┤
│ Category 4: INTERPRETATION ERRORS                           │
│   - Confusing study count vs ES count                       │
│   - Misreporting sample sizes (total vs per group)          │
│   - Aggregating dependent effects incorrectly               │
│   Prevention: Clear terminology, study-level aggregation    │
├─────────────────────────────────────────────────────────────┤
│ Category 5: REPRODUCIBILITY ERRORS                          │
│   - Unreported inclusion/exclusion decisions                │
│   - Missing sensitivity analysis                            │
│   - Undocumented data transformations                       │
│   Prevention: Audit logging, decision tracking              │
└─────────────────────────────────────────────────────────────┘

2. Pattern Detection Rules

python
import re

# Pre-test pattern detection
PRETEST_PATTERNS = [
    r'pre[-\s]?test',
    r'baseline',
    r'before\s+(intervention|treatment)',
    r'time\s*1',
    r'T1\s+score',
    r'initial\s+(assessment|measure)',
    r'사전\s*검사',  # Korean
    r'사전\s*측정'
]

def detect_pretest(outcome_name):
    """
    Detect if outcome name indicates pre-test measurement.
    Returns: (is_pretest: bool, confidence: float, pattern_matched: str)
    """
    outcome_lower = outcome_name.lower()
    for pattern in PRETEST_PATTERNS:
        if re.search(pattern, outcome_lower, re.IGNORECASE):
            return True, 0.9, pattern

    # Also check for explicit post-test absence
    if 'post' not in outcome_lower and 'after' not in outcome_lower:
        # Might be pre-test if no temporal indicator
        return False, 0.3, "no_temporal_indicator"

    return False, 0.0, None
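
To see how the pattern rule behaves on a small batch, the sketch below scans a few outcome labels. It is self-contained, so it re-declares a shortened copy of the pattern list; `PATTERNS` and `scan_outcomes` are hypothetical names used only for illustration and are not part of C7.

```python
import re

# Shortened copy of the pattern list above, for illustration only
PATTERNS = [r'pre[-\s]?test', r'baseline', r'time\s*1', r'사전\s*검사']

def scan_outcomes(outcome_names):
    """Return (name, matched_pattern) for every label that looks like a pre-test."""
    flagged = []
    for name in outcome_names:
        for pattern in PATTERNS:
            if re.search(pattern, name, re.IGNORECASE):
                flagged.append((name, pattern))
                break  # the first match is enough to raise an advisory
    return flagged

names = [
    "Pre-test critical thinking",
    "Post-test critical thinking",
    "Baseline anxiety",
    "사전 검사 점수",
    "Writing quality",
]
flagged = scan_outcomes(names)
for name, pattern in flagged:
    print(f"{name!r} matched {pattern!r}")
```

Note that a pattern such as `time\s*1` would also match a label like "Time 10 score". Adding a word boundary (`\b`) after the digit is one option if such labels occur in the data.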

3. Anomaly Detection Thresholds

| Anomaly Type | Threshold | Severity | Advisory |
|---|---|---|---|
| Extreme effect size | \|g\| > 3.0 | HIGH | "Effect size unusually large" |
| Very extreme | \|g\| > 5.0 | CRITICAL | "Likely error or outlier" |
| SD outlier | SD > 3× median | MEDIUM | "Check for unit errors" |
| Sample size mismatch | n_T ≠ n_C by >50% | LOW | "Verify unequal groups" |
| Zero variance | SD = 0 | CRITICAL | "Invalid SD value" |
| Negative values | SD < 0 or n < 0 | CRITICAL | "Data entry error" |
| Duplicate ES | Same g value | MEDIUM | "Possible duplicate" |
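
The per-record thresholds above could be applied roughly as follows. This is a minimal sketch, not C7's actual implementation: the record keys (`g`, `sd`, `n_t`, `n_c`), the anomaly labels, and the function name `check_anomalies` are assumptions, and ">50%" is interpreted here as relative to the larger group. The duplicate-ES rule needs a batch-level comparison and is left out.

```python
from statistics import median

def check_anomalies(record, sd_median):
    """Return (anomaly, severity, advisory) tuples for one effect-size record.

    `record` is a hypothetical dict with keys g, sd, n_t, n_c.
    """
    found = []
    g, sd = record["g"], record["sd"]
    n_t, n_c = record["n_t"], record["n_c"]

    # Effect-size magnitude: report only the more severe of the two tiers
    if abs(g) > 5.0:
        found.append(("VERY_EXTREME", "CRITICAL", "Likely error or outlier"))
    elif abs(g) > 3.0:
        found.append(("EXTREME_ES", "HIGH", "Effect size unusually large"))

    # SD and n validity checks come before the outlier check
    if sd < 0 or n_t < 0 or n_c < 0:
        found.append(("NEGATIVE_VALUE", "CRITICAL", "Data entry error"))
    elif sd == 0:
        found.append(("ZERO_VARIANCE", "CRITICAL", "Invalid SD value"))
    elif sd > 3 * sd_median:
        found.append(("SD_OUTLIER", "MEDIUM", "Check for unit errors"))

    # Assumption: the 50% mismatch is measured against the larger group
    larger = max(n_t, n_c)
    if larger > 0 and abs(n_t - n_c) / larger > 0.5:
        found.append(("N_MISMATCH", "LOW", "Verify unequal groups"))
    return found

records = [
    {"g": 0.45, "sd": 1.2, "n_t": 30, "n_c": 32},
    {"g": 4.2, "sd": 1.0, "n_t": 25, "n_c": 25},
    {"g": 0.10, "sd": 0.0, "n_t": 40, "n_c": 12},
]
sd_median = median(r["sd"] for r in records)
results = [check_anomalies(r, sd_median) for r in records]
for record, anomalies in zip(records, results):
    print(record, "->", [a[0] for a in anomalies])
```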

4. Pre-Extraction Warnings

Before data extraction begins, C7 provides warnings based on study characteristics:

python
def pre_extraction_warnings(study_metadata):
    """
    Generate warnings before extracting from a study.
    """
    warnings = []

    # Complex design warnings
    if study_metadata.get('design') == 'cluster_rct':
        warnings.append({
            'type': 'DESIGN_COMPLEXITY',
            'message': 'Cluster RCT - need design effect adjustment',
            'severity': 'HIGH'
        })

    if study_metadata.get('design') == 'crossover':
        warnings.append({
            'type': 'DESIGN_COMPLEXITY',
            'message': 'Crossover design - check for carryover effects',
            'severity': 'MEDIUM'
        })

    # Multiple outcome warnings
    if study_metadata.get('outcome_count', 1) > 5:
        warnings.append({
            'type': 'MULTIPLE_OUTCOMES',
            'message': f'{study_metadata["outcome_count"]} outcomes - apply ES hierarchy',
            'severity': 'MEDIUM'
        })

    # Pre-post design warning
    if study_metadata.get('has_pretest', False):
        warnings.append({
            'type': 'PRETEST_PRESENT',
            'message': 'Study has pre-test data - DO NOT use as independent outcome',
            'severity': 'HIGH'
        })

    return warnings

5. Advisory Output Format

yaml
c7_advisory:
  timestamp: "2026-01-26T10:35:00Z"
  batch_id: "V8_extraction_001"

  summary:
    records_checked: 365
    warnings_issued: 23
    critical_issues: 5

  by_category:
    methodological:
      - ES_ID: "45-1"
        pattern: "PRE_TEST_PATTERN"
        confidence: 0.9
        message: "Pattern 'pre-test' detected in Outcome_Name"
        recommendation: "REJECT"

    statistical:
      - ES_ID: "22-3"
        pattern: "EXTREME_VALUE"
        value: 4.2
        message: "|g| = 4.2 exceeds threshold 3.0"
        recommendation: "HUMAN_REVIEW"

    data:
      - ES_ID: "33-2"
        pattern: "SD_ZERO"
        value: 0.0
        message: "SD_Treatment = 0, invalid value"
        recommendation: "REJECT"

  pre_extraction_warnings:
    - Study_ID: 55
      warnings:
        - type: "CLUSTER_RCT"
          message: "Needs design effect adjustment"
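
One way the `summary` and `by_category` sections could be assembled from individual findings is sketched below. It assumes each finding is a dict carrying hypothetical `category` and `severity` keys (the example above does not show a per-entry severity), and it prints JSON rather than YAML to stay within the standard library. `build_advisory` is an illustrative name, not an existing C7 function.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

def build_advisory(batch_id, records_checked, findings):
    """Group findings by category and compute the summary counts."""
    by_category = defaultdict(list)
    critical = 0
    for finding in findings:
        # Keep everything except the routing keys in the per-category entry
        entry = {k: v for k, v in finding.items()
                 if k not in ("category", "severity")}
        by_category[finding["category"]].append(entry)
        if finding["severity"] == "CRITICAL":
            critical += 1
    return {
        "c7_advisory": {
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "batch_id": batch_id,
            "summary": {
                "records_checked": records_checked,
                "warnings_issued": len(findings),
                "critical_issues": critical,
            },
            "by_category": dict(by_category),
        }
    }

findings = [
    {"category": "methodological", "severity": "CRITICAL", "ES_ID": "45-1",
     "pattern": "PRE_TEST_PATTERN", "confidence": 0.9, "recommendation": "REJECT"},
    {"category": "statistical", "severity": "HIGH", "ES_ID": "22-3",
     "pattern": "EXTREME_VALUE", "value": 4.2, "recommendation": "HUMAN_REVIEW"},
]
advisory = build_advisory("V8_extraction_001", 365, findings)
print(json.dumps(advisory, indent=2))
```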

Integration with C5

C7 provides advisories; C5 makes the decisions:

code
# Pattern detection flow
Record submitted → C7 pattern check → Advisory generated → C5 decides

# Example interaction
C7 → C5: {
  "advisory": "PRE_TEST_PATTERN_DETECTED",
  "ES_ID": "45-1",
  "confidence": 0.9,
  "evidence": "Pattern 'pre-test' matched in 'Pre-test critical thinking'",
  "recommendation": "REJECT"
}

C5 Decision: "GATE 4a FAILED. Rejecting ES_45-1. Reason: pre-test outcome"
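
The division of authority can be illustrated from the receiving side. C5's actual gate logic is not described in this document, so the sketch below is purely hypothetical: the function name `c5_decide`, the `min_confidence` cut-off of 0.8, and the three outcome labels are placeholders showing that the advisory is an input to the decision rather than the decision itself.

```python
def c5_decide(advisory, min_confidence=0.8):
    """Hypothetical C5-side policy: treat a C7 advisory as evidence only."""
    recommendation = advisory.get("recommendation")
    confidence = advisory.get("confidence", 0.0)

    # A confident REJECT recommendation is accepted as-is
    if recommendation == "REJECT" and confidence >= min_confidence:
        return "REJECT"
    # A weak REJECT, or an explicit review request, goes to a human
    if recommendation in ("REJECT", "HUMAN_REVIEW"):
        return "HUMAN_REVIEW"
    # No actionable recommendation: the record proceeds
    return "ACCEPT"

advisory = {
    "advisory": "PRE_TEST_PATTERN_DETECTED",
    "ES_ID": "45-1",
    "confidence": 0.9,
    "evidence": "Pattern 'pre-test' matched in 'Pre-test critical thinking'",
    "recommendation": "REJECT",
}
decision = c5_decide(advisory)
print(f"ES_{advisory['ES_ID']}: {decision}")
```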

Checkpoint Triggers

C7 triggers human checkpoints for C5 to enforce:

| Condition | Checkpoint | Requires |
|---|---|---|
| Tier 3 data | META_TIER3_REVIEW | Confirm include/exclude |
| \|g\| > 3.0 | META_ANOMALY_REVIEW | Verify or exclude |
| Ambiguous temporal | META_PRETEST_CONFIRM | Classify pre/post |
| Design complexity | META_DESIGN_REVIEW | Verify extraction method |
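
The mapping from conditions to checkpoints could be expressed as a simple lookup. In this sketch the record keys `tier` and `temporal` are assumed field names that do not appear elsewhere in this document; the `design` values reuse those from `pre_extraction_warnings`, and `required_checkpoints` is an illustrative name.

```python
def required_checkpoints(record):
    """Return the checkpoint names a record would trigger, in table order."""
    checkpoints = []
    if record.get("tier") == 3:  # assumed field name
        checkpoints.append("META_TIER3_REVIEW")
    g = record.get("g")
    if g is not None and abs(g) > 3.0:
        checkpoints.append("META_ANOMALY_REVIEW")
    if record.get("temporal") == "ambiguous":  # assumed field name and value
        checkpoints.append("META_PRETEST_CONFIRM")
    if record.get("design") in ("cluster_rct", "crossover"):
        checkpoints.append("META_DESIGN_REVIEW")
    return checkpoints

example = {"tier": 1, "g": -3.4, "temporal": "post", "design": "cluster_rct"}
print(required_checkpoints(example))
```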

Pre-Extraction Checklist

Before extracting from each study, verify:

markdown
## Pre-Extraction Checklist

### Study Design
- [ ] Design type identified (RCT, quasi-experimental, pre-post)
- [ ] If cluster design: design effect noted
- [ ] If crossover: period effects considered

### Outcome Classification
- [ ] Each outcome labeled as pre/post/change
- [ ] Pre-test outcomes marked DO NOT USE
- [ ] Primary vs secondary outcomes distinguished

### Statistical Reporting
- [ ] Mean/SD or alternatives (SE, CI) available
- [ ] Sample sizes clear (total vs per group)
- [ ] Correct comparison groups identified

### Effect Size Hierarchy
- [ ] If multiple ES: priority ranking applied
- [ ] Post-test between-groups prioritized
- [ ] Dependent ES handling planned

Universal Codebook Integration (v2.1)

Triage Functionality

C7 handles Phase 2 (Triage) of the Universal Codebook workflow:

python
# Configurable thresholds
DEFAULT_THRESHOLDS = {
    "n": {"high": 95, "medium": 80},
    "m": {"high": 90, "medium": 70},
    "sd": {"high": 85, "medium": 65},
    "hedges_g": {"high": 92, "medium": 75},
    "se_g": {"high": 92, "medium": 75},
    "pre_post_corr": {"high": 85, "medium": 65},
    "icc": {"high": 80, "medium": 60}
}

SOURCE_MODIFIERS = {
    "table": 10,
    "figure": 5,
    "text": 0,
    "abstract": -15,
    "ocr_artifacts": -20
}

def triage_extractions(extraction_data, thresholds=None):
    """
    Triage AI extractions into confidence categories for human review queue.

    Used in Phase 2 of Universal Codebook workflow.

    Returns:
    - categorized records with priority rankings
    """
    thresholds = thresholds or DEFAULT_THRESHOLDS
    results = []

    for record in extraction_data:
        # Calculate effective confidence
        base_conf = record.get("ai_confidence_avg", 0)
        source_type = record.get("ai_source_type", "text")
        effective_conf = base_conf + SOURCE_MODIFIERS.get(source_type, 0)
        effective_conf = max(0, min(100, effective_conf))  # Clamp to 0-100

        # Check for conflicts
        has_conflict = record.get("ai_conflicts", False)

        # Determine category and priority
        if has_conflict:
            category = "CONFLICT"
            priority = 1
            status = "PENDING"
        elif effective_conf < thresholds.get("sd", {}).get("medium", 65):
            category = "LOW"
            priority = 2
            status = "PENDING"
        elif effective_conf < thresholds.get("sd", {}).get("high", 85):
            category = "MEDIUM"
            priority = 3
            status = "PENDING"
        else:
            category = "HIGH"
            priority = 4
            status = "PROVISIONAL"

        results.append({
            "study_id": record.get("study_id"),  # carried through for the review queue
            "es_id": record["es_id"],
            "effective_confidence": effective_conf,
            "category": category,
            "priority": priority,
            "verified_status": status,
            "review_reason": get_review_reason(record, category),
            "ai_extraction_json": record.get("ai_extraction_json")
        })

    # Sort by priority (1=highest)
    results.sort(key=lambda x: (x["priority"], -x["effective_confidence"]))
    return results


def get_review_reason(record, category):
    """Generate human-readable reason for review."""
    if category == "CONFLICT":
        return "Multiple extractions disagree beyond tolerance"
    elif category == "LOW":
        fields = []
        for field in ["n_treatment", "sd_treatment", "m_treatment"]:
            if record.get(f"{field}_confidence", 100) < 70:
                fields.append(field)
        return f"Low confidence in: {', '.join(fields)}" if fields else "Low overall confidence"
    elif category == "MEDIUM":
        return "Medium confidence - recommended verification"
    else:
        return "High confidence - spot check only"
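
A short worked example may help with the confidence arithmetic. The sketch below isolates the two steps that `triage_extractions` performs per record: adding the source modifier with clamping, then comparing against the `sd` thresholds (65 and 85), which the function above applies to every record. The helper names `effective_confidence` and `category_for` are introduced here for illustration only.

```python
SOURCE_MODIFIERS = {"table": 10, "figure": 5, "text": 0,
                    "abstract": -15, "ocr_artifacts": -20}

def effective_confidence(base_conf, source_type):
    """Apply the source modifier and clamp the result to the 0-100 range."""
    return max(0, min(100, base_conf + SOURCE_MODIFIERS.get(source_type, 0)))

def category_for(conf, medium=65, high=85):
    """Map an effective confidence to LOW / MEDIUM / HIGH."""
    if conf < medium:
        return "LOW"
    if conf < high:
        return "MEDIUM"
    return "HIGH"

cases = [(78, "table"), (78, "abstract"), (95, "table"), (60, "figure")]
outcomes = []
for base, source in cases:
    conf = effective_confidence(base, source)
    outcomes.append((conf, category_for(conf)))
    print(f"base={base:>3} source={source:<9} effective={conf:>3} -> {category_for(conf)}")
```

The same base confidence of 78 lands in HIGH when read from a table (88) and in LOW when read from an abstract (63). A value exactly at a threshold, such as 65, falls into the higher category because the comparisons are strict.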

Conflict Detection

python
# Tolerance thresholds for conflict detection
TOLERANCE = {"n": 0.05, "m": 0.10, "sd": 0.15}
ABSOLUTE_TOLERANCE = {"n": 2, "m": 0.5, "sd": 0.5}
EPSILON = 0.001

def detect_extraction_conflicts(extractions, field_type):
    """
    Detect if multiple extraction methods disagree beyond tolerance.

    Args:
        extractions: List of {method, value, confidence}
        field_type: "n", "m", or "sd"

    Returns:
        {has_conflict, severity, details}
    """
    if len(extractions) < 2:
        return {"has_conflict": False}

    values = [e["value"] for e in extractions if e["value"] is not None]
    if len(values) < 2:
        return {"has_conflict": False}

    # Calculate disagreement over the full spread of values so that a third
    # or later extraction is not ignored
    v1, v2 = min(values), max(values)
    denominator = max(abs(v1), abs(v2), EPSILON)
    relative_diff = abs(v1 - v2) / denominator
    absolute_diff = abs(v1 - v2)

    # Check thresholds
    exceeds_relative = relative_diff > TOLERANCE[field_type]
    exceeds_absolute = absolute_diff > ABSOLUTE_TOLERANCE[field_type]

    if exceeds_relative or exceeds_absolute:
        return {
            "has_conflict": True,
            "severity": "HIGH" if exceeds_relative and exceeds_absolute else "MEDIUM",
            "relative_diff": round(relative_diff, 3),
            "absolute_diff": round(absolute_diff, 2),
            "candidates": extractions,
            "recommend": "HUMAN_REVIEW"
        }

    return {"has_conflict": False}
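
The interaction between the relative and absolute tolerances is easiest to follow with concrete numbers. The sketch below repeats the constants so that it runs on its own and reduces the check to a pair of values. `classify_pair` is a hypothetical helper that returns only the severity, or `None` when the values agree within tolerance.

```python
TOLERANCE = {"n": 0.05, "m": 0.10, "sd": 0.15}
ABSOLUTE_TOLERANCE = {"n": 2, "m": 0.5, "sd": 0.5}
EPSILON = 0.001

def classify_pair(v1, v2, field_type):
    """Return 'HIGH', 'MEDIUM', or None for two candidate values of one field."""
    absolute_diff = abs(v1 - v2)
    # EPSILON guards against division by zero when both values are 0
    relative_diff = absolute_diff / max(abs(v1), abs(v2), EPSILON)
    exceeds_relative = relative_diff > TOLERANCE[field_type]
    exceeds_absolute = absolute_diff > ABSOLUTE_TOLERANCE[field_type]
    if exceeds_relative and exceeds_absolute:
        return "HIGH"
    if exceeds_relative or exceeds_absolute:
        return "MEDIUM"
    return None

# n: 2/50 = 0.04 (within 5%) and |diff| = 2 (not above 2)
print(classify_pair(48, 50, "n"))
# sd: 0.2/1.2 is about 0.167 (above 15%) but |diff| = 0.2 (within 0.5)
print(classify_pair(1.0, 1.2, "sd"))
# m: 2/12 is about 0.167 (above 10%) and |diff| = 2 (above 0.5)
print(classify_pair(10.0, 12.0, "m"))
# m: 1/101 is about 0.01 (within 10%) but |diff| = 1 (above 0.5)
print(classify_pair(100.0, 101.0, "m"))
```

Because the two checks are combined with `or`, large means can trigger a conflict on the absolute tolerance alone, as the last case shows.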

Review Queue Generation

python
def generate_review_queue(triage_results, output_format="excel"):
    """
    Generate prioritized review queue for human reviewers.

    Output columns:
    - study_id, es_id, priority, category, issue, ai_confidence, status
    """
    queue = []
    for result in triage_results:
        if result["verified_status"] != "PROVISIONAL" or result["priority"] <= 3:
            queue.append({
                "study_id": result.get("study_id"),
                "es_id": result["es_id"],
                "priority": result["priority"],
                "category": result["category"],
                "issue": result["review_reason"],
                "ai_confidence": result["effective_confidence"],
                "status": "pending"
            })

    return queue
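
`generate_review_queue` accepts an `output_format` argument but returns a plain list in the code shown. As one possible way to hand the queue to reviewers, the sketch below serializes it to CSV with the standard library; the helper `queue_to_csv` and the sample rows are assumptions, and an Excel export would need a third-party library that is not covered here.

```python
import csv
import io

COLUMNS = ["study_id", "es_id", "priority", "category",
           "issue", "ai_confidence", "status"]

def queue_to_csv(queue):
    """Serialize review-queue rows to CSV text, using the documented column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(queue)
    return buffer.getvalue()

# Illustrative rows shaped like the queue entries built above
queue = [
    {"study_id": 45, "es_id": "45-1", "priority": 1, "category": "CONFLICT",
     "issue": "Multiple extractions disagree beyond tolerance",
     "ai_confidence": 72, "status": "pending"},
    {"study_id": None, "es_id": "22-3", "priority": 2, "category": "LOW",
     "issue": "Low confidence in: sd_treatment",
     "ai_confidence": 58, "status": "pending"},
]
csv_text = queue_to_csv(queue)
print(csv_text)
```

A missing `study_id` (`None`) is written as an empty cell.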

Error Messages

| Code | Message | Severity | Advisory To C5 |
|---|---|---|---|
| C7_PRETEST | Pre-test pattern detected | CRITICAL | Recommend REJECT |
| C7_EXTREME_G | \|g\| > {threshold} | HIGH | Recommend REVIEW |
| C7_SD_INVALID | SD ≤ 0 detected | CRITICAL | Recommend REJECT |
| C7_DESIGN_COMPLEX | Complex design detected | MEDIUM | Warn extraction |
| C7_DUPLICATE | Possible duplicate ES | MEDIUM | Recommend REVIEW |
| C7_TIER3 | Data below 40% complete | HIGH | Require HUMAN |
| C7_CONFLICT | Extraction methods disagree | HIGH | Require HUMAN |
| C7_LOW_CONF | Effective confidence < threshold | MEDIUM | Recommend REVIEW |
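
The error codes lend themselves to a small lookup table in which the `{threshold}` placeholder is filled at emit time. The following is a sketch under that assumption: the registry name `C7_MESSAGES`, the function `format_error`, and the output key `advisory_to_c5` are hypothetical, and only a subset of the codes is included.

```python
# code -> (message template, severity, advisory to C5); subset of the table above
C7_MESSAGES = {
    "C7_PRETEST": ("Pre-test pattern detected", "CRITICAL", "Recommend REJECT"),
    "C7_EXTREME_G": ("|g| > {threshold}", "HIGH", "Recommend REVIEW"),
    "C7_SD_INVALID": ("SD ≤ 0 detected", "CRITICAL", "Recommend REJECT"),
    "C7_CONFLICT": ("Extraction methods disagree", "HIGH", "Require HUMAN"),
}

def format_error(code, **params):
    """Look up a C7 code and fill any placeholders in its message template."""
    template, severity, advisory = C7_MESSAGES[code]
    return {
        "code": code,
        "message": template.format(**params),
        "severity": severity,
        "advisory_to_c5": advisory,
    }

extreme = format_error("C7_EXTREME_G", threshold=3.0)
pretest = format_error("C7_PRETEST")
print(extreme["message"], "/", extreme["severity"])
print(pretest["message"], "/", pretest["advisory_to_c5"])
```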

Version History

  • 1.0.0 (2026-01-26): Initial release based on V7 error patterns

Related Agents

  • C5-MetaAnalysisMaster: Receives C7 advisories for decisions
  • C6-DataIntegrityGuard: Works alongside for data validation
  • B3-EffectSizeExtractor: Pre-extraction warnings apply here
