CLI Session Recorder

Record CLI sessions and generate shareable feedback with human-readable summaries and machine-readable data.

⚠️ MANDATORY: Comprehensive Output

EVERY saved report MUST include the complete exchanges array with:

•All user prompts (verbatim)
•All tool calls with full parameters AND results
•All assistant responses
•All errors with context
•Timing information

This is NOT optional. A report without detailed exchanges is incomplete and useless for debugging. Never save a summary-only report.

Workflow

Starting a Recording

When the user wants to record their session:

•Generate a unique session ID (use timestamp or UUID)
•Detect the current model from context
•Get git branch if in a repository: git branch --show-current
•Begin capturing all exchanges

python

from resources.session_recorder import SessionRecorder
import subprocess

# Get git branch
try:
    branch = subprocess.run(
        ["git", "branch", "--show-current"],
        capture_output=True, text=True, check=True
    ).stdout.strip()
except:
    branch = None

# Initialize recorder with optional debug log parsing
recorder = SessionRecorder(
    session_id=f"session-{datetime.now().strftime('%Y%m%d-%H%M%S')}",
    model="claude-sonnet-4-20250514",  # from context
    log_dir=".logs"  # Optional: path to Copilot debug logs
)
recorder.start_recording(git_branch=branch, capture_environment=True)

During the Session (VERBOSE CAPTURE)

IMPORTANT: Capture EVERY exchange with full verbosity. This is critical for debugging and feedback analysis.

For EACH exchange, you MUST capture:

•The exact user prompt (verbatim)
•
ALL tool calls with:
- •Full tool name
- •Complete parameters object
- •Full result/output (truncate only if >2000 chars)
- •Timing information
- •Retry counts
•The complete assistant response
•Any errors with full context
•Exchange timing

python

# When user sends a prompt - capture EXACTLY what they typed
recorder.add_user_prompt("help me deploy my app to azure")

# For EVERY tool call - capture full details
recorder.start_tool_call()  # Start timing
recorder.add_tool_call(
    name="bash",
    parameters={
        "command": "ls -la && cat index.html",
        "description": "List files and view content"
    },
    result="total 8\n-rw-r--r--  1 user  staff  123 Jan 28 index.html\n<!DOCTYPE html>...",
    retry_count=0
)

recorder.start_tool_call()
recorder.add_tool_call(
    name="glob",
    parameters={"pattern": "**/*.{json,yaml,yml}"},
    result=".github/workflows/run_test_deploy.yml",
    retry_count=0
)

recorder.start_tool_call()
recorder.add_tool_call(
    name="bash",
    parameters={"command": "az group create --name rg-myapp --location eastus2"},
    result='{"id": "/subscriptions/.../resourceGroups/rg-myapp", "location": "eastus2", "name": "rg-myapp"}',
    retry_count=0
)

# Capture COMPLETE assistant response
recorder.add_assistant_response(
    "I detected a static HTML site. I'll deploy it to Azure Static Web Apps.\n\n"
    "Created resource group `rg-myapp` and deploying..."
)

# When errors occur - capture FULL context
recorder.add_error(
    error_type="ToolError",
    message="SWA CLI artifact folder constraint - current directory cannot be identical to artifact folder",
    context={
        "tool": "bash",
        "command": "swa deploy . --deployment-token ...",
        "resolution": "Created dist folder and copied files"
    }
)

Verbose Capture Checklist

For each turn, ensure you record:

Item	Required	Details
User prompt	✅ Yes	Exact text, no summarization
Tool name	✅ Yes	Full tool identifier
Tool parameters	✅ Yes	Complete parameters object
Tool result	✅ Yes	Full output (truncate >2000 chars with "[truncated]")
Tool timing	✅ Yes	Use `start_tool_call()` before each call
Tool errors	✅ Yes	Include if tool failed
Assistant response	✅ Yes	Complete response text
Exchange timestamp	✅ Auto	Captured automatically

Stopping Recording

When the user says "stop recording" or "save feedback":

•Stop the recorder
•Scrub sensitive data
•Ask for summary information
•Generate the feedback file with FULL exchange data

python

recorder.stop_recording()
recorder.scrub_sensitive_data()

Generating Feedback (ALWAYS COMPREHENSIVE)

⚠️ CRITICAL: The generated feedback MUST always include the complete exchanges array with all tool calls, parameters, and results. This is mandatory, not optional.

When stopping the recording, auto-generate the summary by analyzing the recorded session:

•Task Summary: Infer from the first user prompt and overall conversation flow
•Problems: Identify from errors, retries, tool failures, or user corrections
•Outcome: Determine from final exchanges - did the task complete successfully?

Then present to user for confirmation:

code

I've generated a summary of this session:

**Task**: Fix login authentication bug causing 401 errors
**Problems**: 
- Initial grep search returned too many results
- Had to inspect multiple files before finding the issue
**Outcome**: Success - Bug identified and fixed in src/auth.py

Accept this summary, or provide your own?

If user accepts, proceed. If they provide edits, use their version.

python

from resources.format_feedback import format_feedback, save_feedback

# Auto-generated from session analysis
feedback = format_feedback(
    session=recorder.get_session_data(),
    task_summary="User attempted to fix authentication bug...",  # AI-generated
    problems=["Initial search returned too many results"],        # AI-detected
    outcome="Success - Bug fixed in auth.py"                      # AI-inferred
)

filepath = save_feedback(feedback, output_dir="./feedback/")
print(f"Saved to: {filepath}")

Sharing Options

Present the user with sharing choices:

1. Local file (default)

python

from resources.format_feedback import save_feedback
filepath = save_feedback(feedback)

2. GitHub Gist

python

from resources.share_gist import create_gist
url = create_gist(filepath, public=False)

3. GitHub Issue

python

from resources.share_issue import create_issue
url = create_issue("owner/repo", filepath, labels=["feedback"])

Commands Reference

Both explicit commands and natural language work. Use explicit commands if natural language isn't giving good results.

Command	Natural Language	Action
`/cli-session-recorder start`	"record this", "start recording"	Begin capturing session
`/cli-session-recorder stop`	"stop recording"	Stop and auto-generate summary
`/cli-session-recorder save`	"save feedback"	Save to local file
`/cli-session-recorder save --gist`	"share as gist"	Upload to GitHub Gist
`/cli-session-recorder save --issue`	"create issue", "report this"	Create GitHub Issue
`/cli-session-recorder status`	"show recording status"	Report if recording is active

Summary Generation

When stopping a recording, analyze the session to auto-generate:

Field	How to Infer
Task	First user prompt + overall goal from conversation
Problems	Errors, tool failures, retries, user corrections, "that's not right"
Outcome	Final state - did task complete? User satisfaction signals

Always present the generated summary and ask: "Accept this summary, or provide your own?"

Output Format

See resources/feedback_format.md for the complete output specification.

Required Output Format

The output MUST follow this exact structure with human-readable sections followed by structured data:

markdown

## Copilot CLI Session Feedback

## Session Info

| Field | Value |
|-------|-------|
| **Session ID** | `session-uuid-here` |
| **Duration** | ~10 minutes (HH:MM - HH:MM UTC) |
| **Date** | YYYY-MM-DD |
| **OS** | Darwin 24.6.0 (macOS) |
| **Terminal** | iTerm.app |
| **Copilot Version** | 0.0.400-0 |

## Environment

- **Python**: 3.14.2
- **Node.js**: v22.18.0
- **Git**: 2.50.1
- **GitHub CLI**: 2.83.2
- **Azure CLI**: Installed
- **Azure Developer CLI (azd)**: 1.23.2

## Task Summary

**Goal**: [What the user was trying to accomplish]

**Outcome**: ✅ SUCCESS / ❌ FAILED / ⚠️ PARTIAL - [Brief description]

---

## Detailed Tool Calls & Responses

### Exchange 1: [Exchange Title]

**User Prompt**: "[Exact user input]"

#### Tool Call 1.1: [tool_name]

Command: [command or description] Parameters: { "key": "value" } Result: [full output] Exit Code: 0

code


#### Tool Call 1.2: [tool_name]

Parameters: { "path": "/full/path/to/file" } Result: [file contents or action taken]

code


### Exchange 2: [Exchange Title]

**User Prompt**: "[Exact user input]"

#### Tool Call 2.1: [tool_name] (FAILED)

Command: [command that failed] Result: ERROR: [error message] Exit Code: 1

code


#### Tool Call 2.2: [tool_name] (FIX)

Action: [What was done to fix the error] Result: [Success confirmation]

code


---

## Files Created

| File | Purpose | Size |
|------|---------|------|
| `filename.py` | Description | ~1.2 KB |

## Resources Deployed (if applicable)

| Resource Type | Name | Region |
|--------------|------|--------|
| Resource Group | rg-name | eastus2 |

## Errors & Fixes

| Error | Location | Fix Applied |
|-------|----------|-------------|
| Error message | file.ext | What was done |

## Statistics

| Metric | Value |
|--------|-------|
| Total Tool Calls | 35 |
| bash commands | 18 |
| file creates | 7 |
| file edits | 3 |
| Errors encountered | 2 |
| Errors fixed | 2 |

---

_Generated by Copilot CLI Session Recorder_

Guidelines

•Always scrub sensitive data before sharing - the recorder does this automatically but verify no secrets leak
•Generate meaningful summaries - don't just echo back what happened, synthesize the key points
•Be specific about problems - vague feedback isn't helpful to developers
•Include context - working directory, git branch, and model info help reproduce issues
•Respect user privacy - don't share without explicit consent

Verbose/Debug Mode Best Practices

To ensure maximum detail in recordings:

Capture Everything

•User Input: Record the exact prompt text, not a summary
•Tool Invocations: Every tool call with full parameters
•Tool Responses: Complete output (truncate only if necessary)
•Errors: Full error messages with stack traces if available
•Timing: How long each operation took

What Verbose Output Looks Like

For each exchange, the output should show:

code

Exchange #2:
├── User: "help me deploy my app to azure"
├── Tools:
│   ├── bash(command="ls -la && cat index.html") → "total 8\n-rw-r--r--..."
│   ├── glob(pattern="**/*.{json,yaml,yml}") → ".github/workflows/..."
│   ├── bash(command="az account show") → "shboyer subscription"
│   ├── bash(command="az group create...") → "Created rg-..."
│   ├── bash(command="swa deploy .") → ERROR: "artifact folder constraint"
│   └── bash(command="mkdir -p dist && cp...") → "deployed to https://..."
├── Assistant: "Detected static HTML site..."
└── Duration: 2m 15s

Common Issues That Reduce Verbosity

Issue	Problem	Fix
Summarized prompts	"User asked about deployment"	Record exact: "help me deploy my app to azure"
Missing tool params	`{...}` or empty	Include full parameters object
Truncated results	"..."	Keep full output, truncate only >2000 chars
Missing errors	Error not recorded	Always call `add_error()` on failures
No timing	`null` duration	Always call `start_tool_call()` before each tool

Debug Log Integration

If Copilot debug logs are available (typically in ~/.copilot/logs/), the recorder can parse them:

python

recorder = SessionRecorder(
    session_id="session-123",
    model="claude-sonnet-4-20250514",
    log_dir="~/.copilot/logs"  # Parse debug logs for extra detail
)

This adds debug_logs to the output with API calls, timing, token usage, and errors from the log files.