ESI Extraction Skill

Overview

Extracts structured clinical facts from patient records using an LLM. This skill performs Phase 1 of the hybrid ESI classification pipeline: transforming unstructured text into validated, machine-readable facts.

When to Use

•Extracting vital signs from clinical notes
•Converting patient vignettes to structured data
•Pre-processing for deterministic ESI logic (see esi-composition skill)
•Building datasets for rule-based decision trees

Input/Output Schema

Input:

•patient_record (string): Clinical note, ED triage form, or patient vignette

Output:

•extracted_facts (object): Validated facts matching schemas/extraction_schema.json
•confidence (float): Extraction confidence from 0.0 to 1.0, calculated as (number of extracted fields) / (total number of possible fields)
•status (string): "success" or "warning" ("warning" if confidence is below 0.5 or there are more than 3 validation errors)
•validation_errors (array): Schema violations and sanity check failures
•error_count (int): Number of validation errors
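
The confidence and status rules above can be sketched in a few lines. This is only an illustration, assuming a flat dictionary of facts in which None means "not extracted"; the function names extraction_confidence and extraction_status are hypothetical and not part of the skill's API.

```python
from typing import Any, Mapping


def extraction_confidence(facts: Mapping[str, Any], total_fields: int) -> float:
    """Fraction of possible fields that were extracted (assumes None = not extracted)."""
    if total_fields <= 0:
        raise ValueError("total_fields must be positive")
    extracted = sum(1 for value in facts.values() if value is not None)
    return min(extracted / total_fields, 1.0)


def extraction_status(confidence: float, error_count: int) -> str:
    """Documented rule: warning if confidence < 0.5 or more than 3 validation errors."""
    return "warning" if confidence < 0.5 or error_count > 3 else "success"


# Four of six vital-sign fields present, no validation errors.
facts = {
    "systolic_bp": 180,
    "diastolic_bp": 110,
    "heart_rate": 105,
    "respiratory_rate": 22,
    "oxygen_saturation": None,
    "temperature": None,
}
confidence = extraction_confidence(facts, total_fields=len(facts))
print(round(confidence, 2), extraction_status(confidence, error_count=0))
```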

Quick Start

python

# Use the esi-extraction skill to extract facts from a patient record
from src.skills import SkillRegistry, SkillExecutor

registry = SkillRegistry()
executor = SkillExecutor(registry)

result = executor.execute(
    skill_name="esi-extraction",
    inputs={
        "patient_record": "42-year-old male with chest pain and shortness of breath. BP 180/110, HR 105, RR 22"
    }
)

if result.is_success():
    facts = result.output["extracted_facts"]
    print(f"Vital Signs: {facts['vital_signs']}")
    print(f"Risk Factors: {facts['risk_factors']}")

Extracted Facts Structure

The skill extracts and validates the following categories:

Vital Signs

•systolic_bp (int): Systolic blood pressure mmHg
•diastolic_bp (int): Diastolic blood pressure mmHg
•heart_rate (int): Beats per minute
•respiratory_rate (int): Breaths per minute
•oxygen_saturation (float): Percentage (0-100)
•temperature (float): Celsius or Fahrenheit (inferred)
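
How the temperature unit is inferred is not specified here. One plausible heuristic, shown purely as an assumption, is to decide by magnitude, since plausible body temperatures in Celsius and Fahrenheit do not overlap. The threshold of 50 and the helper names below are illustrative choices, not the skill's actual logic.

```python
def infer_temperature_unit(value: float) -> str:
    """Guess the unit from magnitude alone (assumed heuristic, threshold is illustrative)."""
    # Body temperatures in Celsius sit far below 50; Fahrenheit readings sit far above it.
    return "C" if value < 50.0 else "F"


def to_celsius(value: float) -> float:
    """Normalize a reading to Celsius, converting only when the guess is Fahrenheit."""
    if infer_temperature_unit(value) == "F":
        return round((value - 32.0) * 5.0 / 9.0, 1)
    return value
```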

Symptoms

•chief_complaint (string): Primary reason for visit
•pain_level (int): 0-10 scale (if explicitly mentioned as numeric)

Risk Factors

•high_risk_keywords (array): ["chest pain", "difficulty breathing", "confusion", ...]
•trauma_indicators (bool): Recent injury or accident
•infectious_signs (bool): Fever, infection markers
•allergies (array): Known allergies
•medications (array): Current medications

Resource Requirements

•requires_imaging (bool): Likely needs CT, X-ray, ultrasound
•requires_lab (bool): Needs blood work, urinalysis
•requires_monitoring (bool): Continuous vital monitoring

Data Quality

•extraction_confidence (float): 0.0-1.0 score
•missing_fields (array): Fields not found in record
•ambiguous_fields (array): Fields requiring clarification
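
As a rough sketch of how missing_fields could be derived for one category, the snippet below compares a section of extracted facts against the field names listed under Vital Signs. It assumes that absent or None values count as missing; find_missing_fields is a hypothetical helper, not part of the skill.

```python
from typing import Any, Mapping, Sequence

# Field names taken from the Vital Signs list in this document.
VITAL_SIGN_FIELDS = (
    "systolic_bp",
    "diastolic_bp",
    "heart_rate",
    "respiratory_rate",
    "oxygen_saturation",
    "temperature",
)


def find_missing_fields(section: Mapping[str, Any], expected: Sequence[str]) -> list:
    """Return expected field names that are absent or None, in schema order."""
    return [name for name in expected if section.get(name) is None]


vitals = {"systolic_bp": 180, "diastolic_bp": 110, "heart_rate": 105, "respiratory_rate": 22}
missing = find_missing_fields(vitals, VITAL_SIGN_FIELDS)
```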

Best Practices

Input Formatting

•Ensure patient records are reasonably clean (medical notes, not handwritten scans)
•Include vital signs if available; skill will infer normal ranges if missing
•Include chief complaint in first sentence for best extraction

Output Validation

•Always check the confidence score; a value below 0.7 indicates an uncertain extraction, even when status is still "success" (the "warning" threshold is 0.5)
•Review missing_fields to understand data gaps
•Compare extracted vital_signs to input for sanity-checking
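
One simple way to sanity-check vital signs against the input is to confirm that every extracted number actually appears in the record text. The sketch below is a hypothetical helper (unverified_vitals is not part of the skill) and deliberately naive: it compares numeric values only and ignores which label each number belongs to.

```python
import re
from typing import Any, Mapping


def unverified_vitals(record: str, vital_signs: Mapping[str, Any]) -> list:
    """Return vital-sign fields whose extracted value never appears in the record text."""
    # Normalize numbers so that "98", "98.0" and 98.0 all compare equal.
    numbers_in_record = {
        f"{float(token):g}" for token in re.findall(r"\d+(?:\.\d+)?", record)
    }
    return [
        name
        for name, value in vital_signs.items()
        if value is not None and f"{float(value):g}" not in numbers_in_record
    ]


record = "VS: 168/92, HR 88, RR 16, O2 98%, Temp 37.2C. Complains of chest pain x 2 hours."
vitals = {
    "systolic_bp": 168,
    "diastolic_bp": 92,
    "heart_rate": 88,
    "respiratory_rate": 16,
    "oxygen_saturation": 98.0,
    "temperature": 37.2,
}
suspect = unverified_vitals(record, vitals)
```

Any field returned by this check is worth a manual look, since the value may have been inferred rather than read from the note.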

Error Handling

•Skill retries up to 2 times with exponential backoff
•If JSON parsing fails, check the raw_extraction field for the unparsed LLM output
•Timeout is 30 seconds; large documents may need splitting
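
The retry behavior described above (up to 2 retries with exponential backoff) follows a common pattern, sketched below. The base delay of 1 second, the function name call_with_retries, and the injectable sleep parameter are assumptions for illustration; the skill's real delays and exception handling may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to `retries` times with delays of base_delay * 2**attempt."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

With the defaults this makes at most three attempts in total, waiting 1 second and then 2 seconds between them.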

Caching

•Results cached for 1 hour by default; disable in config if needed
•Cache key based on patient_record content, not metadata
•Clear the cache between runs when a patient's record has been updated
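
To make the content-based cache key concrete, here is a minimal in-memory sketch. It assumes a SHA-256 hash of the record text as the key and a 1-hour time-to-live; the ExtractionCache class and its methods are hypothetical and not the skill's actual cache implementation.

```python
import hashlib
import time
from typing import Any, Callable, Optional


class ExtractionCache:
    """In-memory cache keyed by a hash of the record text, with a time-to-live."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    @staticmethod
    def key_for(patient_record: str) -> str:
        # The key depends only on the record content, never on metadata.
        return hashlib.sha256(patient_record.encode("utf-8")).hexdigest()

    def get(self, patient_record: str) -> Optional[Any]:
        key = self.key_for(patient_record)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired entry
            return None
        return value

    def put(self, patient_record: str, value: Any) -> None:
        self._entries[self.key_for(patient_record)] = (self._clock(), value)

    def clear(self) -> None:
        self._entries.clear()
```

Under this scheme, any change to the record text produces a different key, while two identical notes share one cached result regardless of who or what they are attached to.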

Examples

Example 1: Clear vital signs

code

Input: "74-year-old female, alert, brings in husband. VS: 168/92, HR 88, RR 16, O2 98%, Temp 37.2C. Complains of chest pain x 2 hours."

Output:
{
  "vital_signs": {
    "systolic_bp": 168,
    "diastolic_bp": 92,
    "heart_rate": 88,
    "respiratory_rate": 16,
    "oxygen_saturation": 98.0,
    "temperature": 37.2
  },
  "symptoms": {
    "chief_complaint": "chest pain",
    "pain_level": null,
    "pain_location": "chest",
    "symptom_onset": "2 hours ago",
    "symptom_duration": "2 hours"
  },
  "confidence": 0.95
}

Example 2: Incomplete vitals