Statistical Analysis Plan (SAP) Parsing Skill
Purpose
Extract analysis specifications from SAP documents for automated TLF generation.
Input
- SAP PDF document
- Protocol metadata (optional, for cross-validation)
Output Schema
```json
{
  "study_id": "string",
  "sap_version": "string",
  "primary_analysis": {
    "endpoint": "string",
    "method": "string (ANCOVA, MMRM, etc.)",
    "model": "string (model specification)",
    "missing_data": "string (LOCF, MMRM, MI, etc.)",
    "alpha": "number",
    "hypothesis": "string (one-sided, two-sided)"
  },
  "populations": {
    "ITT": "string (definition)",
    "Safety": "string (definition)",
    "Per_Protocol": "string (definition)",
    "Efficacy": "string (definition)"
  },
  "analysis_visits": [
    {
      "visit": "string",
      "window_low": "number (days)",
      "window_high": "number (days)"
    }
  ],
  "subgroups": ["string"],
  "multiplicity_adjustment": "string",
  "sensitivity_analyses": ["string"]
}
```
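As a rough illustration of how a downstream step might confirm that a parsed result has the shape described by the schema, the sketch below reports required keys that are absent. The helper name `missing_keys` and the choice to treat every listed key as required are assumptions, not part of the skill definition.

```python
# Hypothetical helper: report which schema keys are absent from a parsed result.
REQUIRED_TOP_LEVEL = {
    "study_id", "sap_version", "primary_analysis", "populations",
    "analysis_visits", "subgroups", "multiplicity_adjustment",
    "sensitivity_analyses",
}
REQUIRED_PRIMARY = {"endpoint", "method", "model", "missing_data", "alpha", "hypothesis"}


def missing_keys(result: dict) -> list[str]:
    """Return dotted paths of required keys that are absent from `result`."""
    missing = sorted(REQUIRED_TOP_LEVEL - set(result))
    primary = result.get("primary_analysis")
    if isinstance(primary, dict):
        # Only inspect nested keys when the parent object is present.
        missing += sorted(
            f"primary_analysis.{key}" for key in REQUIRED_PRIMARY - set(primary)
        )
    return missing
```

An empty list means every expected key is present; it says nothing about whether the values themselves are correct.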
Extraction Instructions
Primary Analysis
- Find "Primary Efficacy Analysis" or "Primary Endpoint Analysis"
- Extract statistical method (ANCOVA, MMRM, mixed model)
- Identify covariates and stratification factors
- Note missing data handling approach
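The first step above, locating the section heading, could be sketched as follows once the PDF has been converted to plain text. The function name `find_primary_section` and the optional leading section number (such as "9.1") are assumptions about how a given SAP is formatted; a table-of-contents entry would also match, so a real pipeline would likely need to skip those.

```python
import re
from typing import Optional

# Heading variants named in the instructions above; the optional leading
# section number (e.g. "9.1") is an assumption about SAP formatting.
PRIMARY_HEADING = re.compile(
    r"^\s*(?:\d+(?:\.\d+)*\.?\s+)?Primary (?:Efficacy|Endpoint) Analysis\b",
    re.IGNORECASE | re.MULTILINE,
)


def find_primary_section(sap_text: str) -> Optional[str]:
    """Return text from the first matching heading onward, or None if absent."""
    match = PRIMARY_HEADING.search(sap_text)
    if match is None:
        return None
    # Drop blank lines that the leading \s* may have swallowed before the heading.
    return sap_text[match.start():].lstrip()
```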
Analysis Populations
- Locate "Analysis Populations" or "Study Populations" section
- Extract exact definitions for each population
- Note flag variable names if mentioned (ITTFL, SAFFL, etc.)
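Flag variable names such as ITTFL and SAFFL can often be picked out of the population section by pattern. The sketch below assumes flags are upper-case names of at most eight characters ending in "FL"; that pattern and the function name `find_population_flags` are illustrative assumptions, and matches should still be checked against the SAP text.

```python
import re

# Assumption: population flags are upper-case names of at most 8 characters
# ending in "FL" (ITTFL, SAFFL, ...), as in the examples given above.
FLAG_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{1,5}FL\b")


def find_population_flags(section_text: str) -> list[str]:
    """Return distinct flag-like names in order of first appearance."""
    seen: dict[str, None] = {}
    for name in FLAG_PATTERN.findall(section_text):
        seen.setdefault(name, None)  # dict preserves insertion order
    return list(seen)
```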
Visit Windows
- Find "Analysis Windows" or "Visit Windowing" table
- Extract target day and window bounds
- Note rules for multiple observations within window
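To make the meaning of the extracted bounds concrete, the sketch below maps a study day to a visit using the `window_low` and `window_high` fields from the output schema. It assumes both bounds are inclusive and expressed as study days, which the SAP may define differently; the function name `assign_visit` is hypothetical.

```python
from typing import Optional


def assign_visit(study_day: int, windows: list[dict]) -> Optional[str]:
    """Return the visit whose window contains `study_day`, or None.

    Assumes inclusive bounds expressed as study days.
    """
    for window in windows:
        if window["window_low"] <= study_day <= window["window_high"]:
            return window["visit"]
    return None


windows = [
    {"visit": "Week 4", "window_low": 22, "window_high": 35},
    {"visit": "Week 8", "window_low": 50, "window_high": 63},
]
```

When several observations fall in the same window, which one is kept (closest to target day, last, average) is exactly the rule the third bullet asks to record; this sketch does not choose one.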
Multiplicity
- Identify section on "Multiple Comparisons" or "Multiplicity"
- Extract adjustment method (Hochberg, Bonferroni, etc.)
- Note which comparisons are adjusted
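A simple keyword scan can serve as a first pass at the adjustment method. In the sketch below, only Hochberg and Bonferroni come from the text above; the other entries in `KNOWN_METHODS` and the function name `find_adjustment_methods` are assumptions added for illustration, and the list would need to be extended for a given study.

```python
import re

# Method names to look for; the first two come from the text above, the rest
# are common procedures added here as an assumption.
KNOWN_METHODS = [
    "Hochberg", "Bonferroni", "Holm", "Dunnett", "gatekeeping", "fixed-sequence",
]


def find_adjustment_methods(section_text: str) -> list[str]:
    """Return known method names mentioned in the text, in KNOWN_METHODS order."""
    return [
        name
        for name in KNOWN_METHODS
        if re.search(rf"\b{re.escape(name)}\b", section_text, re.IGNORECASE)
    ]
```

A keyword hit only shows that a method is mentioned; which comparisons it applies to still has to be read from the surrounding text.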
Example Prompts
For Primary Analysis
```
Extract the primary analysis specifications including:
- Statistical method
- Full model specification
- Missing data approach
- Significance level
```
For Population Definitions
```
List all analysis populations with their exact definitions. Include the variable/flag name used to identify each population.
```
For Visit Windows
```
Extract the visit windowing rules as a table:
| Visit | Target Day | Window (days) |
```
Validation Rules
- Primary endpoint must match the protocol
- Population definitions should be unambiguous
- Alpha level is typically 0.05 (two-sided) or 0.025 (one-sided)
- Visit windows should not overlap
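Three of these rules lend themselves to a mechanical check; whether a population definition is unambiguous still needs human review. The sketch below is one possible shape for such a check, assuming the output schema's field names, inclusive window bounds, and a loose (case- and whitespace-insensitive) endpoint comparison. The names `validate_sap` and `protocol_endpoint` are hypothetical.

```python
from typing import Optional

# Typical alpha per hypothesis type, as stated in the rules above.
TYPICAL_ALPHA = {"two-sided": 0.05, "one-sided": 0.025}


def _normalize(text: object) -> str:
    """Lower-case and collapse whitespace for a loose comparison."""
    return " ".join(str(text).lower().split())


def validate_sap(result: dict, protocol_endpoint: Optional[str] = None) -> list[str]:
    """Return human-readable warnings; an empty list means no rule fired."""
    warnings = []
    primary = result.get("primary_analysis", {})

    # Primary endpoint must match the protocol (only checked when supplied).
    if protocol_endpoint is not None:
        if _normalize(primary.get("endpoint", "")) != _normalize(protocol_endpoint):
            warnings.append("primary endpoint does not match protocol")

    # Atypical alpha levels are flagged for review rather than rejected.
    expected = TYPICAL_ALPHA.get(primary.get("hypothesis"))
    alpha = primary.get("alpha")
    if expected is not None and alpha is not None and abs(alpha - expected) > 1e-9:
        warnings.append(f"alpha {alpha} is atypical for a {primary['hypothesis']} test")

    # Windows sorted by lower bound must not share any day (inclusive bounds).
    visits = sorted(result.get("analysis_visits", []), key=lambda v: v["window_low"])
    for prev, cur in zip(visits, visits[1:]):
        if cur["window_low"] <= prev["window_high"]:
            warnings.append(f"windows overlap: {prev['visit']} and {cur['visit']}")
    return warnings
```

Returning warnings instead of raising keeps the alpha rule advisory, in line with the word "typically" in the rule itself.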