CLI Session Recorder
Record CLI sessions and generate shareable feedback with human-readable summaries and machine-readable data.
⚠️ MANDATORY: Comprehensive Output
EVERY saved report MUST include the complete exchanges array with:
- •All user prompts (verbatim)
- •All tool calls with full parameters AND results
- •All assistant responses
- •All errors with context
- •Timing information
This is NOT optional. A report without detailed exchanges is incomplete and useless for debugging. Never save a summary-only report.
Workflow
Starting a Recording
When the user wants to record their session:
- •Generate a unique session ID (use timestamp or UUID)
- •Detect the current model from context
- •Get git branch if in a repository:
git branch --show-current - •Begin capturing all exchanges
from resources.session_recorder import SessionRecorder
import subprocess
# Get git branch
try:
branch = subprocess.run(
["git", "branch", "--show-current"],
capture_output=True, text=True, check=True
).stdout.strip()
except:
branch = None
# Initialize recorder with optional debug log parsing
recorder = SessionRecorder(
session_id=f"session-{datetime.now().strftime('%Y%m%d-%H%M%S')}",
model="claude-sonnet-4-20250514", # from context
log_dir=".logs" # Optional: path to Copilot debug logs
)
recorder.start_recording(git_branch=branch, capture_environment=True)
During the Session (VERBOSE CAPTURE)
IMPORTANT: Capture EVERY exchange with full verbosity. This is critical for debugging and feedback analysis.
For EACH exchange, you MUST capture:
- •The exact user prompt (verbatim)
- •ALL tool calls with:
- •Full tool name
- •Complete parameters object
- •Full result/output (truncate only if >2000 chars)
- •Timing information
- •Retry counts
- •The complete assistant response
- •Any errors with full context
- •Exchange timing
# When user sends a prompt - capture EXACTLY what they typed
recorder.add_user_prompt("help me deploy my app to azure")
# For EVERY tool call - capture full details
recorder.start_tool_call() # Start timing
recorder.add_tool_call(
name="bash",
parameters={
"command": "ls -la && cat index.html",
"description": "List files and view content"
},
result="total 8\n-rw-r--r-- 1 user staff 123 Jan 28 index.html\n<!DOCTYPE html>...",
retry_count=0
)
recorder.start_tool_call()
recorder.add_tool_call(
name="glob",
parameters={"pattern": "**/*.{json,yaml,yml}"},
result=".github/workflows/run_test_deploy.yml",
retry_count=0
)
recorder.start_tool_call()
recorder.add_tool_call(
name="bash",
parameters={"command": "az group create --name rg-myapp --location eastus2"},
result='{"id": "/subscriptions/.../resourceGroups/rg-myapp", "location": "eastus2", "name": "rg-myapp"}',
retry_count=0
)
# Capture COMPLETE assistant response
recorder.add_assistant_response(
"I detected a static HTML site. I'll deploy it to Azure Static Web Apps.\n\n"
"Created resource group `rg-myapp` and deploying..."
)
# When errors occur - capture FULL context
recorder.add_error(
error_type="ToolError",
message="SWA CLI artifact folder constraint - current directory cannot be identical to artifact folder",
context={
"tool": "bash",
"command": "swa deploy . --deployment-token ...",
"resolution": "Created dist folder and copied files"
}
)
Verbose Capture Checklist
For each turn, ensure you record:
| Item | Required | Details |
|---|---|---|
| User prompt | ✅ Yes | Exact text, no summarization |
| Tool name | ✅ Yes | Full tool identifier |
| Tool parameters | ✅ Yes | Complete parameters object |
| Tool result | ✅ Yes | Full output (truncate >2000 chars with "[truncated]") |
| Tool timing | ✅ Yes | Use start_tool_call() before each call |
| Tool errors | ✅ Yes | Include if tool failed |
| Assistant response | ✅ Yes | Complete response text |
| Exchange timestamp | ✅ Auto | Captured automatically |
Stopping Recording
When the user says "stop recording" or "save feedback":
- •Stop the recorder
- •Scrub sensitive data
- •Ask for summary information
- •Generate the feedback file with FULL exchange data
recorder.stop_recording() recorder.scrub_sensitive_data()
Generating Feedback (ALWAYS COMPREHENSIVE)
⚠️ CRITICAL: The generated feedback MUST always include the complete exchanges array with all tool calls, parameters, and results. This is mandatory, not optional.
When stopping the recording, auto-generate the summary by analyzing the recorded session:
- •Task Summary: Infer from the first user prompt and overall conversation flow
- •Problems: Identify from errors, retries, tool failures, or user corrections
- •Outcome: Determine from final exchanges - did the task complete successfully?
Then present to user for confirmation:
I've generated a summary of this session: **Task**: Fix login authentication bug causing 401 errors **Problems**: - Initial grep search returned too many results - Had to inspect multiple files before finding the issue **Outcome**: Success - Bug identified and fixed in src/auth.py Accept this summary, or provide your own?
If user accepts, proceed. If they provide edits, use their version.
from resources.format_feedback import format_feedback, save_feedback
# Auto-generated from session analysis
feedback = format_feedback(
session=recorder.get_session_data(),
task_summary="User attempted to fix authentication bug...", # AI-generated
problems=["Initial search returned too many results"], # AI-detected
outcome="Success - Bug fixed in auth.py" # AI-inferred
)
filepath = save_feedback(feedback, output_dir="./feedback/")
print(f"Saved to: {filepath}")
Sharing Options
Present the user with sharing choices:
1. Local file (default)
from resources.format_feedback import save_feedback filepath = save_feedback(feedback)
2. GitHub Gist
from resources.share_gist import create_gist url = create_gist(filepath, public=False)
3. GitHub Issue
from resources.share_issue import create_issue
url = create_issue("owner/repo", filepath, labels=["feedback"])
Commands Reference
Both explicit commands and natural language work. Use explicit commands if natural language isn't giving good results.
| Command | Natural Language | Action |
|---|---|---|
/cli-session-recorder start | "record this", "start recording" | Begin capturing session |
/cli-session-recorder stop | "stop recording" | Stop and auto-generate summary |
/cli-session-recorder save | "save feedback" | Save to local file |
/cli-session-recorder save --gist | "share as gist" | Upload to GitHub Gist |
/cli-session-recorder save --issue | "create issue", "report this" | Create GitHub Issue |
/cli-session-recorder status | "show recording status" | Report if recording is active |
Summary Generation
When stopping a recording, analyze the session to auto-generate:
| Field | How to Infer |
|---|---|
| Task | First user prompt + overall goal from conversation |
| Problems | Errors, tool failures, retries, user corrections, "that's not right" |
| Outcome | Final state - did task complete? User satisfaction signals |
Always present the generated summary and ask: "Accept this summary, or provide your own?"
Output Format
See resources/feedback_format.md for the complete output specification.
Required Output Format
The output MUST follow this exact structure with human-readable sections followed by structured data:
## Copilot CLI Session Feedback ## Session Info | Field | Value | |-------|-------| | **Session ID** | `session-uuid-here` | | **Duration** | ~10 minutes (HH:MM - HH:MM UTC) | | **Date** | YYYY-MM-DD | | **OS** | Darwin 24.6.0 (macOS) | | **Terminal** | iTerm.app | | **Copilot Version** | 0.0.400-0 | ## Environment - **Python**: 3.14.2 - **Node.js**: v22.18.0 - **Git**: 2.50.1 - **GitHub CLI**: 2.83.2 - **Azure CLI**: Installed - **Azure Developer CLI (azd)**: 1.23.2 ## Task Summary **Goal**: [What the user was trying to accomplish] **Outcome**: ✅ SUCCESS / ❌ FAILED / ⚠️ PARTIAL - [Brief description] --- ## Detailed Tool Calls & Responses ### Exchange 1: [Exchange Title] **User Prompt**: "[Exact user input]" #### Tool Call 1.1: [tool_name]
Command: [command or description] Parameters: { "key": "value" } Result: [full output] Exit Code: 0
#### Tool Call 1.2: [tool_name]
Parameters: { "path": "/full/path/to/file" } Result: [file contents or action taken]
### Exchange 2: [Exchange Title] **User Prompt**: "[Exact user input]" #### Tool Call 2.1: [tool_name] (FAILED)
Command: [command that failed] Result: ERROR: [error message] Exit Code: 1
#### Tool Call 2.2: [tool_name] (FIX)
Action: [What was done to fix the error] Result: [Success confirmation]
--- ## Files Created | File | Purpose | Size | |------|---------|------| | `filename.py` | Description | ~1.2 KB | ## Resources Deployed (if applicable) | Resource Type | Name | Region | |--------------|------|--------| | Resource Group | rg-name | eastus2 | ## Errors & Fixes | Error | Location | Fix Applied | |-------|----------|-------------| | Error message | file.ext | What was done | ## Statistics | Metric | Value | |--------|-------| | Total Tool Calls | 35 | | bash commands | 18 | | file creates | 7 | | file edits | 3 | | Errors encountered | 2 | | Errors fixed | 2 | --- _Generated by Copilot CLI Session Recorder_
Guidelines
- •Always scrub sensitive data before sharing - the recorder does this automatically but verify no secrets leak
- •Generate meaningful summaries - don't just echo back what happened, synthesize the key points
- •Be specific about problems - vague feedback isn't helpful to developers
- •Include context - working directory, git branch, and model info help reproduce issues
- •Respect user privacy - don't share without explicit consent
Verbose/Debug Mode Best Practices
To ensure maximum detail in recordings:
Capture Everything
- •User Input: Record the exact prompt text, not a summary
- •Tool Invocations: Every tool call with full parameters
- •Tool Responses: Complete output (truncate only if necessary)
- •Errors: Full error messages with stack traces if available
- •Timing: How long each operation took
What Verbose Output Looks Like
For each exchange, the output should show:
Exchange #2:
├── User: "help me deploy my app to azure"
├── Tools:
│ ├── bash(command="ls -la && cat index.html") → "total 8\n-rw-r--r--..."
│ ├── glob(pattern="**/*.{json,yaml,yml}") → ".github/workflows/..."
│ ├── bash(command="az account show") → "shboyer subscription"
│ ├── bash(command="az group create...") → "Created rg-..."
│ ├── bash(command="swa deploy .") → ERROR: "artifact folder constraint"
│ └── bash(command="mkdir -p dist && cp...") → "deployed to https://..."
├── Assistant: "Detected static HTML site..."
└── Duration: 2m 15s
Common Issues That Reduce Verbosity
| Issue | Problem | Fix |
|---|---|---|
| Summarized prompts | "User asked about deployment" | Record exact: "help me deploy my app to azure" |
| Missing tool params | {...} or empty | Include full parameters object |
| Truncated results | "..." | Keep full output, truncate only >2000 chars |
| Missing errors | Error not recorded | Always call add_error() on failures |
| No timing | null duration | Always call start_tool_call() before each tool |
Debug Log Integration
If Copilot debug logs are available (typically in ~/.copilot/logs/), the recorder can parse them:
recorder = SessionRecorder(
session_id="session-123",
model="claude-sonnet-4-20250514",
log_dir="~/.copilot/logs" # Parse debug logs for extra detail
)
This adds debug_logs to the output with API calls, timing, token usage, and errors from the log files.