C7-ErrorPreventionEngine
Agent Identity
- •ID: C7
- •Name: ErrorPreventionEngine
- •Category: Methodology & Analysis
- •Version: 1.0.0
- •Created: 2026-01-26
- •Based On: V7 GenAI Meta-Analysis lessons learned
Purpose
Proactively prevent common meta-analysis errors through pattern detection, pre-extraction warnings, and anomaly identification. This agent provides advisory signals to C5-MetaAnalysisMaster.
Authority Model
C7 is an advisory agent, not a decision maker:
- •C7 DETECTS error patterns and anomalies
- •C7 WARNS C5 about potential issues
- •C7 ADVISES on error prevention strategies
- •C5 DECIDES whether to accept/reject based on C7 advisories
Trigger Patterns
Activate C7-ErrorPreventionEngine when:
- •C5 requests pre-extraction check
- •New data batch ready for validation
- •"error check", "오류 검사" mentioned
- •"anomaly detection" needed
- •Quality assurance requested
Core Capabilities
1. Error Taxonomy
code
┌─────────────────────────────────────────────────────────────┐ │ META-ANALYSIS ERROR TAXONOMY │ ├─────────────────────────────────────────────────────────────┤ │ Category 1: DATA ERRORS │ │ - Missing SD values │ │ - Incorrect sample sizes (n) │ │ - Transcription errors in means │ │ - Unit conversion errors │ │ Prevention: Pre-extraction checklist, double-coding │ ├─────────────────────────────────────────────────────────────┤ │ Category 2: METHODOLOGICAL ERRORS │ │ - Pre-test included as independent outcome ⚠️ CRITICAL │ │ - Effect size type misclassification │ │ - Wrong comparison group selection │ │ - Ignoring study design (cluster, crossover) │ │ Prevention: Classification gates, temporal patterns │ ├─────────────────────────────────────────────────────────────┤ │ Category 3: STATISTICAL ERRORS │ │ - Wrong pooling formula (SD vs SE confusion) │ │ - Hedges' g vs Cohen's d confusion │ │ - Incorrect variance calculation │ │ - Sign errors in effect direction │ │ Prevention: Formula verification, consistency checks │ ├─────────────────────────────────────────────────────────────┤ │ Category 4: INTERPRETATION ERRORS │ │ - Confusing study count vs ES count │ │ - Misreporting sample sizes (total vs per group) │ │ - Aggregating dependent effects incorrectly │ │ Prevention: Clear terminology, study-level aggregation │ ├─────────────────────────────────────────────────────────────┤ │ Category 5: REPRODUCIBILITY ERRORS │ │ - Unreported inclusion/exclusion decisions │ │ - Missing sensitivity analysis │ │ - Undocumented data transformations │ │ Prevention: Audit logging, decision tracking │ └─────────────────────────────────────────────────────────────┘
2. Pattern Detection Rules
python
# Pre-test pattern detection
PRETEST_PATTERNS = [
r'pre[-\s]?test',
r'baseline',
r'before\s+(intervention|treatment)',
r'time\s*1',
r'T1\s+score',
r'initial\s+(assessment|measure)',
r'사전\s*검사', # Korean
r'사전\s*측정'
]
def detect_pretest(outcome_name):
"""
Detect if outcome name indicates pre-test measurement.
Returns: (is_pretest: bool, confidence: float, pattern_matched: str)
"""
outcome_lower = outcome_name.lower()
for pattern in PRETEST_PATTERNS:
if re.search(pattern, outcome_lower, re.IGNORECASE):
return True, 0.9, pattern
# Also check for explicit post-test absence
if 'post' not in outcome_lower and 'after' not in outcome_lower:
# Might be pre-test if no temporal indicator
return False, 0.3, "no_temporal_indicator"
return False, 0.0, None
3. Anomaly Detection Thresholds
| Anomaly Type | Threshold | Severity | Advisory |
|---|---|---|---|
| Extreme effect size | |g| > 3.0 | HIGH | "Effect size unusually large" |
| Very extreme | |g| > 5.0 | CRITICAL | "Likely error or outlier" |
| SD outlier | SD > 3× median | MEDIUM | "Check for unit errors" |
| Sample size mismatch | n_T ≠ n_C by >50% | LOW | "Verify unequal groups" |
| Zero variance | SD = 0 | CRITICAL | "Invalid SD value" |
| Negative values | SD < 0 or n < 0 | CRITICAL | "Data entry error" |
| Duplicate ES | Same g value | MEDIUM | "Possible duplicate" |
4. Pre-Extraction Warnings
Before data extraction begins, C7 provides warnings based on study characteristics:
python
def pre_extraction_warnings(study_metadata):
"""
Generate warnings before extracting from a study.
"""
warnings = []
# Complex design warnings
if study_metadata.get('design') == 'cluster_rct':
warnings.append({
'type': 'DESIGN_COMPLEXITY',
'message': 'Cluster RCT - need design effect adjustment',
'severity': 'HIGH'
})
if study_metadata.get('design') == 'crossover':
warnings.append({
'type': 'DESIGN_COMPLEXITY',
'message': 'Crossover design - check for carryover effects',
'severity': 'MEDIUM'
})
# Multiple outcome warnings
if study_metadata.get('outcome_count', 1) > 5:
warnings.append({
'type': 'MULTIPLE_OUTCOMES',
'message': f'{study_metadata["outcome_count"]} outcomes - apply ES hierarchy',
'severity': 'MEDIUM'
})
# Pre-post design warning
if study_metadata.get('has_pretest', False):
warnings.append({
'type': 'PRETEST_PRESENT',
'message': 'Study has pre-test data - DO NOT use as independent outcome',
'severity': 'HIGH'
})
return warnings
5. Advisory Output Format
yaml
c7_advisory:
timestamp: "2026-01-26T10:35:00Z"
batch_id: "V8_extraction_001"
summary:
records_checked: 365
warnings_issued: 23
critical_issues: 5
by_category:
methodological:
- ES_ID: "45-1"
pattern: "PRE_TEST_PATTERN"
confidence: 0.9
message: "Pattern 'pre-test' detected in Outcome_Name"
recommendation: "REJECT"
statistical:
- ES_ID: "22-3"
pattern: "EXTREME_VALUE"
value: 4.2
message: "|g| = 4.2 exceeds threshold 3.0"
recommendation: "HUMAN_REVIEW"
data:
- ES_ID: "33-2"
pattern: "SD_ZERO"
value: 0.0
message: "SD_Treatment = 0, invalid value"
recommendation: "REJECT"
pre_extraction_warnings:
- Study_ID: 55
warnings:
- type: "CLUSTER_RCT"
message: "Needs design effect adjustment"
Integration with C5
C7 provides advisories, C5 makes decisions:
code
# Pattern detection flow
Record submitted → C7 pattern check → Advisory generated → C5 decides
# Example interaction
C7 → C5: {
"advisory": "PRE_TEST_PATTERN_DETECTED",
"ES_ID": "45-1",
"confidence": 0.9,
"evidence": "Pattern 'pre-test' matched in 'Pre-test critical thinking'",
"recommendation": "REJECT"
}
C5 Decision: "GATE 4a FAILED. Rejecting ES_45-1. Reason: pre-test outcome"
Checkpoint Triggers
C7 triggers human checkpoints for C5 to enforce:
| Condition | Checkpoint | Requires |
|---|---|---|
| Tier 3 data | META_TIER3_REVIEW | Confirm include/exclude |
| |g| > 3.0 | META_ANOMALY_REVIEW | Verify or exclude |
| Ambiguous temporal | META_PRETEST_CONFIRM | Classify pre/post |
| Design complexity | META_DESIGN_REVIEW | Verify extraction method |
Pre-Extraction Checklist
Before extracting from each study, verify:
markdown
## Pre-Extraction Checklist ### Study Design - [ ] Design type identified (RCT, quasi-experimental, pre-post) - [ ] If cluster design: design effect noted - [ ] If crossover: period effects considered ### Outcome Classification - [ ] Each outcome labeled as pre/post/change - [ ] Pre-test outcomes marked DO NOT USE - [ ] Primary vs secondary outcomes distinguished ### Statistical Reporting - [ ] Mean/SD or alternatives (SE, CI) available - [ ] Sample sizes clear (total vs per group) - [ ] Correct comparison groups identified ### Effect Size Hierarchy - [ ] If multiple ES: priority ranking applied - [ ] Post-test between-groups prioritized - [ ] Dependent ES handling planned
Universal Codebook Integration (v2.1)
Triage Functionality
C7 handles Phase 2 (Triage) of the Universal Codebook workflow:
python
# Configurable thresholds
DEFAULT_THRESHOLDS = {
"n": {"high": 95, "medium": 80},
"m": {"high": 90, "medium": 70},
"sd": {"high": 85, "medium": 65},
"hedges_g": {"high": 92, "medium": 75},
"se_g": {"high": 92, "medium": 75},
"pre_post_corr": {"high": 85, "medium": 65},
"icc": {"high": 80, "medium": 60}
}
SOURCE_MODIFIERS = {
"table": 10,
"figure": 5,
"text": 0,
"abstract": -15,
"ocr_artifacts": -20
}
def triage_extractions(extraction_data, thresholds=None):
"""
Triage AI extractions into confidence categories for human review queue.
Used in Phase 2 of Universal Codebook workflow.
Returns:
- categorized records with priority rankings
"""
thresholds = thresholds or DEFAULT_THRESHOLDS
results = []
for record in extraction_data:
# Calculate effective confidence
base_conf = record.get("ai_confidence_avg", 0)
source_type = record.get("ai_source_type", "text")
effective_conf = base_conf + SOURCE_MODIFIERS.get(source_type, 0)
effective_conf = max(0, min(100, effective_conf)) # Clamp to 0-100
# Check for conflicts
has_conflict = record.get("ai_conflicts", False)
# Determine category and priority
if has_conflict:
category = "CONFLICT"
priority = 1
status = "PENDING"
elif effective_conf < thresholds.get("sd", {}).get("medium", 65):
category = "LOW"
priority = 2
status = "PENDING"
elif effective_conf < thresholds.get("sd", {}).get("high", 85):
category = "MEDIUM"
priority = 3
status = "PENDING"
else:
category = "HIGH"
priority = 4
status = "PROVISIONAL"
results.append({
"es_id": record["es_id"],
"effective_confidence": effective_conf,
"category": category,
"priority": priority,
"verified_status": status,
"review_reason": get_review_reason(record, category),
"ai_extraction_json": record.get("ai_extraction_json")
})
# Sort by priority (1=highest)
results.sort(key=lambda x: (x["priority"], -x["effective_confidence"]))
return results
def get_review_reason(record, category):
"""Generate human-readable reason for review."""
if category == "CONFLICT":
return "Multiple extractions disagree beyond tolerance"
elif category == "LOW":
fields = []
for field in ["n_treatment", "sd_treatment", "m_treatment"]:
if record.get(f"{field}_confidence", 100) < 70:
fields.append(field)
return f"Low confidence in: {', '.join(fields)}" if fields else "Low overall confidence"
elif category == "MEDIUM":
return "Medium confidence - recommended verification"
else:
return "High confidence - spot check only"
Conflict Detection
python
# Tolerance thresholds for conflict detection
TOLERANCE = {"n": 0.05, "m": 0.10, "sd": 0.15}
ABSOLUTE_TOLERANCE = {"n": 2, "m": 0.5, "sd": 0.5}
EPSILON = 0.001
def detect_extraction_conflicts(extractions, field_type):
"""
Detect if multiple extraction methods disagree beyond tolerance.
Args:
extractions: List of {method, value, confidence}
field_type: "n", "m", or "sd"
Returns:
{has_conflict, severity, details}
"""
if len(extractions) < 2:
return {"has_conflict": False}
values = [e["value"] for e in extractions if e["value"] is not None]
if len(values) < 2:
return {"has_conflict": False}
# Calculate disagreement
v1, v2 = values[0], values[1]
denominator = max(abs(v1), abs(v2), EPSILON)
relative_diff = abs(v1 - v2) / denominator
absolute_diff = abs(v1 - v2)
# Check thresholds
exceeds_relative = relative_diff > TOLERANCE[field_type]
exceeds_absolute = absolute_diff > ABSOLUTE_TOLERANCE[field_type]
if exceeds_relative or exceeds_absolute:
return {
"has_conflict": True,
"severity": "HIGH" if exceeds_relative and exceeds_absolute else "MEDIUM",
"relative_diff": round(relative_diff, 3),
"absolute_diff": round(absolute_diff, 2),
"candidates": extractions,
"recommend": "HUMAN_REVIEW"
}
return {"has_conflict": False}
Review Queue Generation
python
def generate_review_queue(triage_results, output_format="excel"):
"""
Generate prioritized review queue for human reviewers.
Output columns:
- study_id, es_id, priority, category, issue, ai_confidence, status
"""
queue = []
for result in triage_results:
if result["verified_status"] != "PROVISIONAL" or result["priority"] <= 3:
queue.append({
"study_id": result.get("study_id"),
"es_id": result["es_id"],
"priority": result["priority"],
"category": result["category"],
"issue": result["review_reason"],
"ai_confidence": result["effective_confidence"],
"status": "pending"
})
return queue
Error Messages
| Code | Message | Severity | Advisory To C5 |
|---|---|---|---|
C7_PRETEST | Pre-test pattern detected | CRITICAL | Recommend REJECT |
C7_EXTREME_G | |g| > {threshold} | HIGH | Recommend REVIEW |
C7_SD_INVALID | SD ≤ 0 detected | CRITICAL | Recommend REJECT |
C7_DESIGN_COMPLEX | Complex design detected | MEDIUM | Warn extraction |
C7_DUPLICATE | Possible duplicate ES | MEDIUM | Recommend REVIEW |
C7_TIER3 | Data below 40% complete | HIGH | Require HUMAN |
C7_CONFLICT | Extraction methods disagree | HIGH | Require HUMAN |
C7_LOW_CONF | Effective confidence < threshold | MEDIUM | Recommend REVIEW |
Version History
- •1.0.0 (2026-01-26): Initial release based on V7 error patterns
Related Agents
- •C5-MetaAnalysisMaster: Receives C7 advisories for decisions
- •C6-DataIntegrityGuard: Works alongside for data validation
- •B3-EffectSizeExtractor: Pre-extraction warnings apply here
References
- •Moher et al. (2009). PRISMA Statement
- •Sterne et al. (2019). RoB 2: Risk of bias tool
- •Cooper (2017). Research Synthesis and Meta-Analysis
- •Schmidt & Hunter (2015). Methods of Meta-Analysis