Analyze EDSL Results
Load an EDSL Results object from Expected Parrot (by UUID) or from a local file (results.json.gz), export documentation files, and generate a comprehensive analysis report.
Usage
/edsl-analyze-results <uuid-or-path>
Examples:
/edsl-analyze-results 123e4567-e89b-12d3-a456-426614174000
/edsl-analyze-results ./my_experiment/results.json.gz
After loading the results, the skill will ask you to choose an analysis focus (full analysis, summary statistics, cross-tabulation, or a specific custom focus).
Workflow
1. Parse the Input
Determine if the input is:
- UUID: A 36-character UUID (e.g., 123e4567-e89b-12d3-a456-426614174000)
- File path: A path ending in .json.gz or .json
If unclear, use AskUserQuestion to clarify.
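The check above can be sketched as a small classifier. This is a minimal sketch, not part of the skill itself: the function name `classify_input` and its return labels are hypothetical, and the UUID test relies on the standard library's `uuid` module.

```python
import uuid


def classify_input(arg: str) -> str:
    """Return "uuid", "path", or "unknown" for a skill argument (hypothetical helper)."""
    arg = arg.strip()
    # A canonical UUID is 36 characters: 32 hex digits plus 4 hyphens.
    if len(arg) == 36:
        try:
            uuid.UUID(arg)
            return "uuid"
        except ValueError:
            pass
    # Local results files end in .json.gz or .json.
    if arg.endswith((".json.gz", ".json")):
        return "path"
    # Anything else is ambiguous; the skill would ask the user to clarify.
    return "unknown"


print(classify_input("123e4567-e89b-12d3-a456-426614174000"))
print(classify_input("./my_experiment/results.json.gz"))
print(classify_input("results"))
```

An "unknown" result corresponds to the case where AskUserQuestion is used to clarify.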
2. Load the Results
from edsl import Results
# Load by UUID (from Expected Parrot cloud)
results = Results.pull("123e4567-e89b-12d3-a456-426614174000")
# OR load from local file
results = Results.load("path/to/results") # .json.gz extension optional
3. Create Output Directory
Create a directory for the analysis outputs using sequential numbering:
import os
import glob
# Find existing analysis directories and get next number
existing = glob.glob("./analysis_*")
existing_nums = []
for d in existing:
    try:
        num = int(d.split("_")[-1])
        existing_nums.append(num)
    except ValueError:
        pass
next_num = max(existing_nums, default=0) + 1
output_dir = f"./analysis_{next_num}"
os.makedirs(output_dir, exist_ok=True)
4. Export Documentation Files
Export three core documentation files:
# Get survey from results
survey = results.survey
# 1. Export survey as markdown
survey_md = survey.to_markdown()
with open(f"{output_dir}/survey.md", "w") as f:
    f.write(survey_md)
# 2. Export survey as mermaid diagram
# Note: Sanitize HTML tags for mermaid v11+ compatibility
import re
survey_mermaid = survey.to_mermaid()
# Remove HTML tags that cause syntax errors in newer mermaid versions
survey_mermaid = re.sub(r'<b>|</b>|<br/>', '\n', survey_mermaid)
survey_mermaid = re.sub(r'\n+', '\n', survey_mermaid) # Clean up multiple newlines
with open(f"{output_dir}/survey.mermaid", "w") as f:
    f.write(survey_mermaid)
# 3. Export results as CSV
results_csv = results.to_csv()
results_csv.write(f"{output_dir}/results.csv")
# OR: with open(f"{output_dir}/results.csv", "w") as f: f.write(results_csv.text)
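To see what the two mermaid substitutions do, the sketch below applies them to a made-up diagram fragment. The sample string is hypothetical; the actual output of `survey.to_mermaid()` may look different. Note that the tags are replaced by line breaks rather than simply deleted, and runs of line breaks are then collapsed to one.

```python
import re

# Hypothetical node label, for illustration only.
raw = 'flowchart TD\n    q1["<b>q1</b><br/>How do you feel?"]\n    q1 --> q2'

# Same two substitutions as in the export step.
clean = re.sub(r'<b>|</b>|<br/>', '\n', raw)
clean = re.sub(r'\n+', '\n', clean)

print(clean)
```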
5. Initial Data Exploration
Before analysis, explore the data structure:
import pandas as pd
# Load and examine the CSV
df = pd.read_csv(f"{output_dir}/results.csv")
# Get column info
print(f"Shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
# Identify question answer columns (answer.*)
answer_cols = [c for c in df.columns if c.startswith('answer.')]
print(f"Answer columns: {answer_cols}")
# Identify agent columns (agent.*)
agent_cols = [c for c in df.columns if c.startswith('agent.')]
print(f"Agent columns: {agent_cols}")
# Identify scenario columns (scenario.*)
scenario_cols = [c for c in df.columns if c.startswith('scenario.')]
print(f"Scenario columns: {scenario_cols}")
6. Ask About Analysis Focus
After loading the data and exploring its structure, always use AskUserQuestion to ask about the analysis focus. This ensures the report is tailored to the user's needs:
Question: "What would you like me to focus on in the analysis?"
Header: "Focus"
Options:
1. "Full analysis (Recommended)" - "Comprehensive analysis covering all questions, response distributions, and key findings"
2. "Summary statistics" - "Basic descriptive statistics and response distributions only"
3. "Cross-tabulation" - "Focus on relationships between variables (scenarios × responses, traits × responses)"
4. "Specific focus" - "I have a particular question or hypothesis to investigate"
IMPORTANT: Always ask this question, even if the user provided a query with the UUID. The question helps clarify what type of analysis they want. If they select "Specific focus", follow up to get their specific question or hypothesis.
7. Generate Analysis Report
Create a comprehensive report.md with:
Structure
# Results Analysis Report

## Survey Overview
[Describe survey structure, questions, and flow - reference survey.md for details]

## Data Summary
- Number of responses: N
- Agent traits collected: [list]
- Scenarios tested: [list]

## Detailed Results

### Q1: [Question Name]
[Question text and type]
[Response distribution table]
[Visualization for this question]
[Interpretation of results]

### Q2: [Question Name]
[Same pattern - table, visualization, interpretation together]

## Key Findings
[Main insights from the data]

## Cross-Tabulations (if applicable)
[Relationships between variables - only include agent breakdowns if agents have meaningful names, not UUIDs]

## Files Generated
| File | Description |
|------|-------------|
| [survey.md](survey.md) | Survey documentation |
| [survey.mermaid](survey.mermaid) | Survey flow diagram |
| [results.csv](results.csv) | Raw results data |
| [report.html](report.html) | This report (HTML) |
IMPORTANT: When listing files in the report, always use relative hyperlinks (e.g., [survey.md](survey.md)) so users can click through to the files.
IMPORTANT:
- Do NOT include mermaid diagrams in the report (they often don't render correctly in HTML output). The mermaid file is still exported separately for reference.
- Only include per-agent analysis if agents have meaningful names (not UUIDs). Check if agent names look like UUIDs (36-character strings with hyphens in pattern 8-4-4-4-12) and skip agent breakdowns if so.
IMPORTANT: Place each visualization immediately after its corresponding question's data table, not in a separate section at the end. This keeps the analysis coherent and easy to follow.
Generate Visualizations (Inline with Questions)
For each question, generate and save a visualization, then reference it in the report immediately after the question's statistics:
import matplotlib.pyplot as plt
# For each question, generate chart and include in report immediately
# Assumes report (str), df, answer_cols, and output_dir from the earlier steps
for col in answer_cols:
    question_name = col.replace('answer.', '')
    value_counts = df[col].value_counts()
    # Add question section to report
    report += f"### {question_name}\n\n"
    report += "[Response distribution table here]\n\n"
    # Generate and save chart
    if len(value_counts) <= 20:
        fig, ax = plt.subplots(figsize=(10, 6))
        value_counts.plot(kind='bar', ax=ax)
        ax.set_title(f'Response Distribution: {question_name}')
        ax.set_xlabel('Response')
        ax.set_ylabel('Count')
        plt.tight_layout()
        chart_path = f"{question_name}_distribution.png"
        plt.savefig(f'{output_dir}/{chart_path}', dpi=150)
        plt.close()
        # Include chart immediately after question data
        report += f"![{question_name} distribution]({chart_path})\n\n"
    report += "[Interpretation of this question's results]\n\n"
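The "[Response distribution table here]" placeholder can be filled by a small helper like the one below. This is a sketch under stated assumptions: `distribution_table` is a hypothetical name, it takes a plain mapping of response to count (for a pandas Series, `value_counts.to_dict()` would produce one), and it computes percentages over the counted responses only, so missing answers are excluded from the denominator.

```python
def distribution_table(counts: dict) -> str:
    """Render response counts as a markdown table with percentages (hypothetical helper)."""
    total = sum(counts.values())
    lines = [
        "| Response | Count | Percentage |",
        "|----------|-------|------------|",
    ]
    for value, count in counts.items():
        # Guard against an empty question to avoid dividing by zero.
        pct = count / total * 100 if total else 0.0
        lines.append(f"| {value} | {count} | {pct:.1f}% |")
    return "\n".join(lines) + "\n\n"


# Made-up counts for illustration.
print(distribution_table({"Yes": 6, "No": 3, "Unsure": 1}))
```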
8. Save All Outputs
Ensure all files are saved to the output directory:
output_dir/
├── survey.md        # Survey in markdown format
├── survey.mermaid   # Survey flow diagram
├── results.csv      # Full results data
├── report.md        # Analysis report
├── report.html      # Styled HTML report
├── *.png            # Visualization files
└── analysis.py      # Optional: reproducible analysis script
9. Generate HTML Report with Pandoc
After saving report.md, convert it to a styled HTML report using pandoc:
# CSS file location
CSS_FILE="$HOME/tools/ep/skills/assets/report.css"
# Generate HTML report (no --metadata title to avoid duplicate with markdown h1)
pandoc "${output_dir}/report.md" \
-o "${output_dir}/report.html" \
--css="${CSS_FILE}" \
--standalone
Or in Python:
import subprocess
import os
css_file = os.path.expanduser("~/tools/ep/skills/assets/report.css")
# Note: Don't use --metadata title= since report.md already has # heading
subprocess.run([
"pandoc",
f"{output_dir}/report.md",
"-o", f"{output_dir}/report.html",
f"--css={css_file}",
"--standalone"
], check=True)
print(f"Generated: {output_dir}/report.html")
Note: The mermaid diagram is exported separately as survey.mermaid but not embedded in the report due to rendering issues in HTML output.
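Because the HTML step depends on an external tool, it can help to check that pandoc is installed before calling it. The sketch below wraps the same pandoc invocation in a hypothetical helper, `render_html`, and uses the standard library's `shutil.which` to look for the executable; the skip-and-return-False behavior is an assumption about what is desirable, not something the skill prescribes.

```python
import shutil
import subprocess


def render_html(md_path: str, html_path: str, css_path: str) -> bool:
    """Convert markdown to HTML with pandoc; return False if pandoc is not on PATH."""
    if shutil.which("pandoc") is None:
        print("pandoc not found on PATH; skipping HTML report")
        return False
    # Same arguments as the pandoc command shown above.
    subprocess.run(
        ["pandoc", md_path, "-o", html_path, f"--css={css_path}", "--standalone"],
        check=True,
    )
    return True
```

A caller would use it as `render_html(f"{output_dir}/report.md", f"{output_dir}/report.html", css_file)` and fall back to the markdown report when it returns False.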
Complete Example Script
"""
EDSL Results Analysis Script
"""
from edsl import Results
import pandas as pd
import matplotlib.pyplot as plt
import os
import re
from datetime import datetime
# === CONFIGURATION ===
# Modify this to load your results
RESULTS_UUID = "123e4567-e89b-12d3-a456-426614174000" # Or use file path
# RESULTS_PATH = "./results.json.gz"
# === LOAD RESULTS ===
results = Results.pull(RESULTS_UUID)
# results = Results.load(RESULTS_PATH)
# === CREATE OUTPUT DIRECTORY ===
import glob
existing = glob.glob("./analysis_*")
existing_nums = []
for d in existing:
    try:
        num = int(d.split("_")[-1])
        existing_nums.append(num)
    except ValueError:
        pass
next_num = max(existing_nums, default=0) + 1
output_dir = f"./analysis_{next_num}"
os.makedirs(output_dir, exist_ok=True)
# === EXPORT DOCUMENTATION ===
survey = results.survey
# Survey markdown
with open(f"{output_dir}/survey.md", "w") as f:
    f.write(survey.to_markdown())
# Survey mermaid (sanitize HTML tags for mermaid v11+ compatibility)
survey_mermaid = survey.to_mermaid()
survey_mermaid = re.sub(r'<b>|</b>|<br/>', '\n', survey_mermaid)
survey_mermaid = re.sub(r'\n+', '\n', survey_mermaid)
with open(f"{output_dir}/survey.mermaid", "w") as f:
    f.write(survey_mermaid)
# Results CSV
results_csv = results.to_csv()
results_csv.write(f"{output_dir}/results.csv")
# === LOAD DATA FOR ANALYSIS ===
df = pd.read_csv(f"{output_dir}/results.csv")
# Identify column types
answer_cols = [c for c in df.columns if c.startswith('answer.')]
agent_cols = [c for c in df.columns if c.startswith('agent.')]
scenario_cols = [c for c in df.columns if c.startswith('scenario.')]
# === HELPER: Check if string looks like a UUID ===
def is_uuid(s):
    """Check if a string looks like a UUID (8-4-4-4-12 hex pattern)."""
    if not isinstance(s, str):
        return False
    uuid_pattern = r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
    return bool(re.match(uuid_pattern, s.lower()))
# Check if agents have meaningful names (not UUIDs)
has_meaningful_agents = False
if 'agent.agent_name' in df.columns:
    agent_names = df['agent.agent_name'].dropna().unique()
    has_meaningful_agents = len(agent_names) > 0 and not all(is_uuid(str(name)) for name in agent_names)
# === GENERATE REPORT ===
report = f"""# Results Analysis Report

Generated: {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}

## Survey Overview

See `survey.md` for the full survey documentation and `survey.mermaid` for the flow diagram.

## Data Summary

- **Total responses**: {len(df)}
- **Questions**: {len(answer_cols)}
- **Agent traits**: {len(agent_cols)} ({', '.join(agent_cols) if agent_cols else 'None'})
- **Scenario variables**: {len(scenario_cols)} ({', '.join(scenario_cols) if scenario_cols else 'None'})

## Response Distributions

"""
# Add distribution for each answer column
for col in answer_cols:
    question_name = col.replace('answer.', '')
    value_counts = df[col].value_counts()
    report += f"### {question_name}\n\n"
    report += "| Response | Count | Percentage |\n"
    report += "|----------|-------|------------|\n"
    for val, count in value_counts.items():
        pct = count / len(df) * 100
        report += f"| {val} | {count} | {pct:.1f}% |\n"
    report += "\n"
    # Generate chart
    if len(value_counts) <= 20:  # Only plot if reasonable number of categories
        fig, ax = plt.subplots(figsize=(10, 6))
        value_counts.plot(kind='bar', ax=ax)
        ax.set_title(f'Response Distribution: {question_name}')
        ax.set_xlabel('Response')
        ax.set_ylabel('Count')
        plt.xticks(rotation=45, ha='right')
        plt.tight_layout()
        chart_path = f"{question_name}_distribution.png"
        plt.savefig(f'{output_dir}/{chart_path}', dpi=150)
        plt.close()
        # Embed the chart immediately after the table
        report += f"![{question_name} distribution]({chart_path})\n\n"
# Only add agent analysis if agents have meaningful names
if has_meaningful_agents:
    report += """## Analysis by Agent

"""
    # Add per-agent breakdowns here
    for col in answer_cols:
        question_name = col.replace('answer.', '')
        crosstab = pd.crosstab(df['agent.agent_name'], df[col], normalize='index') * 100
        report += f"### {question_name} by Agent\n\n"
        report += crosstab.to_markdown() + "\n\n"
report += """## Key Findings

[Add key findings based on the analysis]

## Methodology Notes

This analysis was generated from EDSL Results data. The survey was administered to AI agents
using the Expected Parrot platform.
"""
# Save report
with open(f"{output_dir}/report.md", "w") as f:
    f.write(report)
# Generate HTML report with pandoc (no --metadata title to avoid duplicate)
import subprocess
css_file = os.path.expanduser("~/tools/ep/skills/assets/report.css")
subprocess.run([
"pandoc",
f"{output_dir}/report.md",
"-o", f"{output_dir}/report.html",
f"--css={css_file}",
"--standalone"
], check=True)
print(f"Analysis complete! Output saved to: {output_dir}/")
print(f" - survey.md")
print(f" - survey.mermaid")
print(f" - results.csv")
print(f" - report.md")
print(f" - report.html")
Output Files
| File | Description |
|---|---|
| survey.md | Human-readable survey documentation with questions, options, and rules |
| survey.mermaid | Mermaid diagram showing survey flow and skip logic |
| results.csv | Full results data in CSV format for analysis |
| report.md | Comprehensive analysis report with findings and visualizations |
| report.html | Styled HTML report (via pandoc with Expected Parrot CSS) |
| *.png | Charts and visualizations referenced in the report |
| analysis.py | (Optional) Reproducible Python script for the analysis |
Tips
- Check survey.mermaid separately to understand skip logic before analyzing
- Look for patterns in agent traits vs. responses (only if agents have meaningful names, not UUIDs)
- Compare responses across scenarios (if scenarios were used)
- The answer.* columns contain question responses
- The agent.* columns contain agent trait values
- The scenario.* columns contain scenario variable values
- Use comment.* columns to see free-text explanations (if available)
- Per-agent breakdowns are automatically skipped when agent names are UUIDs (not meaningful for analysis)
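To read free-text explanations next to the answers they belong to, each answer column can be matched with its comment column. The sketch below works on column names only. It assumes comment columns are named `comment.<question>_comment`; that naming is an assumption to verify against the header of your own results.csv, and `pair_answers_with_comments` is a hypothetical helper.

```python
def pair_answers_with_comments(columns):
    """Map each answer.<name> column to its comment column, or None if absent.

    Assumes comment columns are named comment.<name>_comment (unverified
    assumption; check the actual CSV header).
    """
    available = set(columns)
    pairs = {}
    for col in columns:
        if col.startswith("answer."):
            name = col[len("answer."):]
            candidate = f"comment.{name}_comment"
            pairs[col] = candidate if candidate in available else None
    return pairs


# Hypothetical column names for illustration.
cols = ["agent.age", "answer.mood", "answer.score", "comment.mood_comment"]
print(pair_answers_with_comments(cols))
```

With a DataFrame loaded from results.csv, the same function can be called as `pair_answers_with_comments(list(df.columns))`.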
Common Analysis Patterns
Cross-tabulation by Scenario
# Compare responses across scenarios
pd.crosstab(df['scenario.condition'], df['answer.question_name'], normalize='index')
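As a worked illustration of the cross-tabulation, the sketch below runs it on a tiny made-up DataFrame. The data is invented; the column names simply mirror the placeholders used above.

```python
import pandas as pd

# Made-up data: two scenarios, two responses each.
toy = pd.DataFrame({
    "scenario.condition": ["control", "control", "treatment", "treatment"],
    "answer.question_name": ["Yes", "No", "Yes", "Yes"],
})

# normalize="index" converts each row to shares that sum to 1 per scenario.
shares = pd.crosstab(
    toy["scenario.condition"], toy["answer.question_name"], normalize="index"
)
print(shares)
```

Each row of `shares` is one scenario, so values are comparable across scenarios even when the groups differ in size.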
Agent Trait Analysis
# First check if agents have meaningful names (not UUIDs)
import re
def is_uuid(s):
    uuid_pattern = r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
    return bool(re.match(uuid_pattern, str(s).lower()))
# Only analyze by agent if names are meaningful
if not all(is_uuid(name) for name in df['agent.agent_name'].dropna().unique()):
    # Print the per-agent response shares (the result is otherwise discarded in a script)
    print(df.groupby('agent.agent_name')['answer.question_name'].value_counts(normalize=True))
Response Correlation
# For numeric responses
df[[c for c in answer_cols if df[c].dtype in ['int64', 'float64']]].corr()
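An alternative way to pick the numeric answer columns is pandas' `select_dtypes`, which matches every numeric dtype rather than only `int64` and `float64`. The sketch below uses made-up data and hypothetical column names to show the idea.

```python
import pandas as pd

# Made-up data: two numeric answers and one categorical answer.
toy = pd.DataFrame({
    "answer.rating": [1, 2, 3, 4],
    "answer.score": [2.0, 4.0, 6.0, 8.0],
    "answer.color": ["red", "blue", "red", "blue"],
})

answer_cols = [c for c in toy.columns if c.startswith("answer.")]
# Keep only numeric answer columns; the text column is dropped.
numeric = toy[answer_cols].select_dtypes(include="number")
corr = numeric.corr()
print(corr)
```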