ESI Extraction Skill

Overview

Extracts structured clinical facts from patient records using an LLM. This skill performs Phase 1 of the hybrid ESI (Emergency Severity Index) classification pipeline: transforming unstructured text into validated, machine-readable facts.

When to Use

  • Extracting vital signs from clinical notes
  • Converting patient vignettes to structured data
  • Pre-processing for deterministic ESI logic (see esi-composition skill)
  • Building datasets for rule-based decision trees

Input/Output Schema

Input:

  • patient_record (string): Clinical note, ED triage form, or patient vignette

Output:

  • extracted_facts (object): Validated facts matching schemas/extraction_schema.json
  • confidence (float): Extraction confidence from 0.0 to 1.0, calculated as (number of extracted fields) / (total number of possible fields)
  • status (string): "success" or "warning" ("warning" if confidence < 0.5 or there are more than 3 validation errors)
  • validation_errors (array): Schema violations and sanity check failures
  • error_count (int): Number of validation errors
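
The confidence formula and the status rule above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the flat `facts` dictionary, the `EXPECTED_FIELDS` list, and both function names are assumptions made for the example; the real field set is defined by schemas/extraction_schema.json.

```python
from typing import Any, Dict, List

# Hypothetical field list for illustration only; the authoritative set
# lives in schemas/extraction_schema.json.
EXPECTED_FIELDS: List[str] = [
    "systolic_bp", "diastolic_bp", "heart_rate", "respiratory_rate",
    "oxygen_saturation", "temperature", "chief_complaint", "pain_level",
]


def compute_confidence(facts: Dict[str, Any],
                       expected_fields: List[str] = EXPECTED_FIELDS) -> float:
    """Return (extracted fields count) / (total possible fields)."""
    if not expected_fields:
        return 0.0
    # A field counts as extracted when it is present and not null.
    extracted = sum(1 for name in expected_fields if facts.get(name) is not None)
    return extracted / len(expected_fields)


def compute_status(confidence: float, error_count: int) -> str:
    """Apply the documented rule: warn below 0.5 confidence or above 3 errors."""
    if confidence < 0.5 or error_count > 3:
        return "warning"
    return "success"
```

Under these assumptions, a record that yields five of the eight listed fields scores 0.625 and, with no validation errors, still reports "success".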

Quick Start

python
# Use the esi-extraction skill to extract facts from a patient record
from src.skills import SkillRegistry, SkillExecutor

registry = SkillRegistry()
executor = SkillExecutor(registry)

result = executor.execute(
    skill_name="esi-extraction",
    inputs={
        "patient_record": "42-year-old male with chest pain and shortness of breath. BP 180/110, HR 105, RR 22"
    }
)

if result.is_success():
    facts = result.output["extracted_facts"]
    print(f"Vital Signs: {facts['vital_signs']}")
    print(f"Risk Factors: {facts['risk_factors']}")

Extracted Facts Structure

The skill extracts and validates the following categories:

Vital Signs

  • systolic_bp (int): Systolic blood pressure mmHg
  • diastolic_bp (int): Diastolic blood pressure mmHg
  • heart_rate (int): Beats per minute
  • respiratory_rate (int): Breaths per minute
  • oxygen_saturation (float): Percentage (0-100)
  • temperature (float): Celsius or Fahrenheit (inferred)
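
Since the temperature unit is inferred rather than stated, one plausible heuristic is to rely on the fact that physiologically possible body temperatures in Celsius and Fahrenheit do not overlap. The sketch below is an assumption about how such inference could work, not the skill's documented behavior; the function names and the range limits are hypothetical.

```python
from typing import Optional


def infer_temperature_unit(value: float) -> str:
    """Guess the unit from plausible human body-temperature ranges."""
    if 25.0 <= value <= 45.0:
        return "celsius"
    if 77.0 <= value <= 113.0:  # the same range expressed in Fahrenheit
        return "fahrenheit"
    return "unknown"


def to_celsius(value: float) -> Optional[float]:
    """Normalize a reading to Celsius, or return None if the unit is unclear."""
    unit = infer_temperature_unit(value)
    if unit == "celsius":
        return value
    if unit == "fahrenheit":
        return round((value - 32.0) * 5.0 / 9.0, 1)
    return None
```

A value that falls outside both ranges is better reported in ambiguous_fields than silently converted.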

Symptoms

  • chief_complaint (string): Primary reason for visit
  • pain_level (int): 0-10 scale (if explicitly mentioned as numeric)

Risk Factors

  • high_risk_keywords (array): ["chest pain", "difficulty breathing", "confusion", ...]
  • trauma_indicators (bool): Recent injury or accident
  • infectious_signs (bool): Fever, infection markers
  • allergies (array): Known allergies
  • medications (array): Current medications

Resource Requirements

  • requires_imaging (bool): Likely needs CT, X-ray, ultrasound
  • requires_lab (bool): Needs blood work, urinalysis
  • requires_monitoring (bool): Continuous vital monitoring

Data Quality

  • extraction_confidence (float): 0.0-1.0 score
  • missing_fields (array): Fields not found in record
  • ambiguous_fields (array): Fields requiring clarification

Best Practices

Input Formatting

  • Ensure patient records are reasonably clean (typed medical notes rather than handwritten scans)
  • Include vital signs if available; the skill will infer normal ranges if they are missing
  • Put the chief complaint in the first sentence for best extraction

Output Validation

  • Always check the confidence score; a value below 0.7 indicates an uncertain extraction
  • Review missing_fields to understand data gaps
  • Compare extracted vital_signs to input for sanity-checking
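
A simple way to carry out the checks above is to flag low-confidence results and to confirm that every extracted number actually occurs in the source text. The helper names below are hypothetical, and the literal-number comparison is a deliberately crude assumption: it would flag a temperature that was converted between units, for example.

```python
import re
from typing import Any, Dict, List

REVIEW_THRESHOLD = 0.7  # threshold taken from the guidance above


def needs_review(confidence: float) -> bool:
    """True when the extraction should be treated as uncertain."""
    return confidence < REVIEW_THRESHOLD


def unverified_vitals(patient_record: str,
                      vital_signs: Dict[str, Any]) -> List[str]:
    """Return names of vital signs whose value is not found in the record."""
    numbers_in_text = {
        float(match) for match in re.findall(r"\d+(?:\.\d+)?", patient_record)
    }
    return [
        name
        for name, value in vital_signs.items()
        if value is not None and float(value) not in numbers_in_text
    ]
```

Any name returned by `unverified_vitals` points at a value the LLM may have invented or altered, which is a good candidate for manual review.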

Error Handling

  • Skill retries up to 2 times with exponential backoff
  • If JSON parsing fails, check raw_extraction field
  • Timeout is 30 seconds; large documents may need splitting
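
The retry behavior described above (up to 2 retries with exponential backoff) follows a common pattern, sketched here for readers who want to wrap their own calls the same way. The function name, the 1-second base delay, and the injectable `sleep` parameter are assumptions; the skill's real delays are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T],
                      max_retries: int = 2,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with delays of 1s, 2s, 4s, ..."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the defaults, a call is attempted at most three times (one initial attempt plus two retries), waiting 1 second and then 2 seconds between attempts.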

Caching

  • Results are cached for 1 hour by default; disable caching in config if needed
  • The cache key is based on patient_record content, not metadata
  • Clear the cache between runs when re-processing the same patient with updated records
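
A content-based cache key with a one-hour expiry can be illustrated as below. This is a sketch under assumptions: the SHA-256 hashing, the in-memory dictionary, and the `ExtractionCache` class are hypothetical and may differ from how the skill actually stores results.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 3600  # 1 hour, matching the documented default


def cache_key(patient_record: str) -> str:
    """Derive the key from record content only, ignoring any metadata."""
    return hashlib.sha256(patient_record.encode("utf-8")).hexdigest()


class ExtractionCache:
    """Minimal in-memory cache with time-based expiry."""

    def __init__(self, ttl_seconds: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, patient_record: str, result: Any) -> None:
        self._entries[cache_key(patient_record)] = (self._clock(), result)

    def get(self, patient_record: str) -> Optional[Any]:
        key = cache_key(patient_record)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return result

    def clear(self) -> None:
        self._entries.clear()
```

Because the key depends only on the text, two requests with identical record text but different metadata would share one cached result in this sketch.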

Examples

Example 1: Clear vital signs

code
Input: "74-year-old female, alert, brought in by husband. VS: 168/92, HR 88, RR 16, O2 98%, Temp 37.2C. Complains of chest pain x 2 hours."

Output:
{
  "vital_signs": {
    "systolic_bp": 168,
    "diastolic_bp": 92,
    "heart_rate": 88,
    "respiratory_rate": 16,
    "oxygen_saturation": 98.0,
    "temperature": 37.2
  },
  "symptoms": {
    "chief_complaint": "chest pain",
    "pain_level": null,
    "pain_location": "chest",
    "symptom_onset": "2 hours ago",
    "symptom_duration": "2 hours"
  },
  "confidence": 0.95
}

Example 2: Incomplete vitals

code
Input: "8-year-old boy brought by mother. Very pale and lethargic. Rapid breathing. No vitals available. Mother reports fever starting yesterday evening."

Output:
{
  "vital_signs": {
    "systolic_bp": null,
    "respiratory_rate": null,
    "temperature": null
  },
  "risk_factors": {
    "infectious_signs": true
  },
  "data_quality": {
    "missing_fields": ["systolic_bp", "diastolic_bp", "heart_rate", "oxygen_saturation"],
    "ambiguous_fields": ["temperature"]
  },
  "confidence": 0.65
}

Troubleshooting

Issue: Confidence score too low (< 0.7)

  • Cause: Incomplete or unclear patient record
  • Solution: Request more detailed clinical notes; consider manual review

Issue: JSON parsing error

  • Cause: LLM returned non-JSON output
  • Solution: Review raw_extraction field; retry with different LLM model/temperature
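
When reviewing raw_extraction, it is often possible to recover the JSON object even if the LLM wrapped it in extra prose. The helper below is a hypothetical sketch of that recovery step, assuming raw_extraction is a plain string; it is not part of the skill's API.

```python
import json
from typing import Any, Dict, Optional


def parse_raw_extraction(raw: str) -> Optional[Dict[str, Any]]:
    """Try strict parsing first, then fall back to the outermost {...} span."""
    candidates = [raw]
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        candidates.append(raw[start:end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):  # only a JSON object is a usable result
            return parsed
    return None
```

If this returns None, the output contains no recoverable object and a retry with a different model or temperature is the more promising route.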

Issue: Timeout (>30 seconds)

  • Cause: Very long patient record or slow LLM
  • Solution: Split record into sections; increase timeout in config
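
Splitting a long record can be done on paragraph boundaries so that each section stays below a size limit. The function name and the 2000-character default below are assumptions chosen for illustration; a suitable limit depends on the LLM and the configured timeout.

```python
from typing import List


def split_record(text: str, max_chars: int = 2000) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to the skill separately; how the resulting facts are merged is left open here.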

See Also

  • REFERENCE.md: Detailed parameter descriptions and edge cases
  • EXAMPLES.md: Complete examples with input/output
  • schemas/extraction_schema.json: Full JSON schema for validation
  • esi-composition: Deterministic ESI logic that consumes extracted facts