Session Feedback

Evaluate AI session interaction logs and generate actionable feedback to help users improve their AI-assisted development workflow.

Overview

This skill reads the session reports and log files written by session-interaction-logger and turns them into a structured feedback report. The report highlights what the user did well, identifies missed opportunities, and gives concrete suggestions for future sessions.

Read Session Logs → Analyze Patterns → Evaluate Against Rubric → Generate Feedback Report

When to Use This Skill

•After completing a coding session to get retrospective feedback
•When asked to "review session", "evaluate logs", or "give feedback"
•Before starting a new session to review lessons from the last one
•When preparing for an AI-usage evaluation or audit

Input Sources

The skill reads from the following files (produced by session-interaction-logger):

| Source | Location | What It Provides |
|--------|----------|------------------|
| Session reports | `logs/copilot/session-report*.md` | High-level session summaries, timelines, decisions |
| Interactions log | `logs/copilot/interactions.log` | Full prompt/response detail (JSONL) |
| File changes log | `logs/copilot/file-changes.log` | What was created/modified and why (JSONL) |
| Commands log | `logs/copilot/commands.log` | Terminal commands executed (JSONL) |
| Decisions log | `logs/copilot/decisions.log` | Architectural decisions with rationale (JSONL) |

If structured JSONL logs are not available, the skill falls back to analyzing the Markdown session reports.
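
The fallback described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `read_jsonl` and `load_sources` and the shape of the returned dictionary are assumptions made for the example; only the file names come from the table.

```python
import json
from pathlib import Path

# File names taken from the Input Sources table.
JSONL_LOGS = ("interactions.log", "file-changes.log", "commands.log", "decisions.log")


def read_jsonl(path: Path) -> list:
    """Parse one JSON value per non-empty line, skipping lines that do not parse."""
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate a truncated or corrupt line
    return records


def load_sources(log_dir: Path) -> dict:
    """Prefer structured JSONL logs; fall back to Markdown session reports."""
    structured = {
        name: read_jsonl(log_dir / name)
        for name in JSONL_LOGS
        if (log_dir / name).is_file()
    }
    if structured:
        return {"mode": "jsonl", "logs": structured}
    # Hypothetical fallback shape: just the report paths, sorted by name.
    reports = sorted(log_dir.glob("session-report*.md"))
    return {"mode": "markdown", "reports": reports}
```

Calling `load_sources(Path("logs/copilot"))` returns the parsed logs when at least one JSONL file exists, and the list of report paths otherwise.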

Output

The skill writes a feedback report to:

logs/feedback/report-{YYYY-MM-DD}.md
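
As a small sketch, the output path can be built from a date like this. The helper name `feedback_report_path` is hypothetical, and the document does not say whether the date is the session date or the generation date, so the caller decides.

```python
from datetime import date
from pathlib import Path


def feedback_report_path(day: date, root: Path = Path("logs/feedback")) -> Path:
    """Build logs/feedback/report-{YYYY-MM-DD}.md for the given day."""
    # date.isoformat() yields the YYYY-MM-DD form used in the file name.
    return root / f"report-{day.isoformat()}.md"
```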

Evaluation Rubric

The feedback report evaluates the session across these dimensions:

1. Prompt Quality (Weight: 25%)

| Rating | Criteria |
|--------|----------|
| Excellent | Prompts are specific, provide context, reference files/docs, state desired outcome |
| Good | Prompts are clear but could include more context or constraints |
| Needs Work | Prompts are vague, ambiguous, or require multiple clarifications |

What to look for:

•Did the user reference design docs, files, or specific requirements?
•Were prompts self-contained or did they require many follow-up clarifications?
•Did the user specify constraints (language version, patterns, testing expectations)?

2. Planning & Specification (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | User requested a plan before implementation; reviewed and approved it |
| Good | Some planning occurred but implementation started without full review |
| Needs Work | No planning phase; jumped straight to implementation |

What to look for:

•Was a plan/spec document created before coding?
•Did the user review the plan and provide feedback?
•Were steps broken down into manageable pieces?

3. Iterative Refinement (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | User reviewed outputs, caught issues, requested fixes, iterated on quality |
| Good | Some iteration occurred but outputs were mostly accepted as-is |
| Needs Work | Outputs blindly accepted without review or testing |

What to look for:

•Did the user request changes or improvements after initial output?
•Were tests run and failures addressed?
•Did the user verify behavior matches expectations?

4. Decision Documentation (Weight: 15%)

| Rating | Criteria |
|--------|----------|
| Excellent | Key decisions documented with rationale and alternatives considered |
| Good | Some decisions noted but missing rationale or alternatives |
| Needs Work | No decision documentation; choices made without explanation |

What to look for:

•Were architectural choices explained?
•Were alternatives considered and trade-offs discussed?
•Can someone reading the logs understand why decisions were made?

5. Testing & Verification (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | Tests written alongside implementation; failures caught and fixed; coverage considered |
| Good | Tests added but not comprehensive; some verification performed |
| Needs Work | No tests or verification of AI-generated code |

What to look for:

•Were tests requested as part of implementation?
•Were build/test commands run to verify correctness?
•Were test failures analyzed and resolved?

Step-by-Step Workflow

Step 1: Gather Session Logs

•List all files under logs/copilot/
•Identify session report files (session-report*.md)
•Check for structured JSONL logs (interactions.log, decisions.log, etc.)
•If a specific date is requested, filter to that session; otherwise analyze the most recent
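
The selection logic in this step might look like the sketch below. It makes two assumptions that the document does not confirm: the requested date appears in the report file name (as in session-report-{date}.md), and "most recent" means the latest modification time. The function name `select_reports` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def select_reports(log_dir: Path, day: Optional[str] = None) -> list:
    """Return reports for the requested date, or only the most recent one."""
    reports = sorted(log_dir.glob("session-report*.md"))
    if day is not None:
        # Assumption: the date string (YYYY-MM-DD) is part of the file name.
        return [p for p in reports if day in p.name]
    if not reports:
        return []
    # Assumption: "most recent" is judged by file modification time.
    return [max(reports, key=lambda p: p.stat().st_mtime)]
```

If session dates are instead recorded inside the reports, the filter would need to read each file rather than match on its name.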

Step 2: Analyze Session Content

Read each log source and extract:

•Prompts: User requests — count, specificity, context provided
•Iterations: How many rounds of refinement occurred per task
•Decisions: Architectural choices, rationale, alternatives
•File changes: Volume, organization, whether tests accompanied code
•Commands: Build/test execution, success/failure patterns
•Clarifications: How often the AI needed to ask for more info
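
As one illustration of this extraction, the sketch below checks whether tests accompanied code among a list of changed file paths. The schema of file-changes.log is not specified here, so the example takes plain path strings; the test-file heuristic and the names `is_test_file` and `summarize_changes` are assumptions made for the example.

```python
from pathlib import PurePosixPath

TEST_DIRS = {"test", "tests", "__tests__"}


def is_test_file(path: str) -> bool:
    """Heuristic (an assumption): test directory or a conventional test file name."""
    p = PurePosixPath(path)
    name = p.name.lower()
    stem = name.split(".")[0]
    in_test_dir = any(part.lower() in TEST_DIRS for part in p.parts[:-1])
    return (
        in_test_dir
        or stem.startswith("test_")
        or stem.endswith("_test")
        or ".test." in name
        or ".spec." in name
    )


def summarize_changes(paths: list) -> dict:
    """Count changed files and report whether both tests and non-test files changed."""
    tests = [p for p in paths if is_test_file(p)]
    return {
        "total": len(paths),
        "tests": len(tests),
        "tests_accompany_code": 0 < len(tests) < len(paths),
    }
```

A summary like this only feeds the evidence for the rubric; the rating itself still rests on reading what the changes were and why.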

Step 3: Score Against Rubric

For each rubric dimension:

•Review the relevant evidence from the logs
•Assign a rating: Excellent, Good, or Needs Work
•Provide specific examples from the session to justify the rating
•Calculate a weighted overall score out of 100 using the dimension weights
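
The weighted score could be computed along these lines. The weights come from the rubric, but the mapping from ratings to points is not defined in this document: the `RATING_FACTOR` values below are placeholder assumptions, and `score_session` is a hypothetical name.

```python
# Weights from the rubric; they sum to 100.
WEIGHTS = {
    "Prompt Quality": 25,
    "Planning & Specification": 20,
    "Iterative Refinement": 20,
    "Decision Documentation": 15,
    "Testing & Verification": 20,
}

# Assumed fraction of a dimension's weight earned per rating (not from the source).
RATING_FACTOR = {"Excellent": 1.0, "Good": 0.7, "Needs Work": 0.4}


def score_session(ratings: dict) -> tuple:
    """Return (points per dimension, overall score out of 100)."""
    per_dimension = {
        dim: round(weight * RATING_FACTOR[ratings[dim]], 1)
        for dim, weight in WEIGHTS.items()
    }
    return per_dimension, round(sum(per_dimension.values()), 1)
```

With these placeholder factors, a session rated Excellent on every dimension scores 100, and the per-dimension points fill the Score column of the report table.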

Step 4: Generate Suggestions

Based on the evaluation, produce:

•Strengths: 2-4 things the user did well (with examples)
•Improvement Areas: 2-4 areas to focus on (with examples)
•Actionable Tips: 3-5 concrete things to try in the next session
•Prompt Templates: Example improved prompts based on patterns observed

Step 5: Write Feedback Report

Generate the report at logs/feedback/report-{YYYY-MM-DD}.md using the structure below.

Report Template

The generated feedback report follows this structure:

markdown

# Session Feedback Report

**Date**: {date}
**Session(s) Analyzed**: {session IDs or report file names}
**Generated By**: session-feedback skill

---

## Overall Score: {X}/100

| Dimension | Weight | Rating | Score |
|-----------|--------|--------|-------|
| Prompt Quality | 25% | {rating} | {score}/25 |
| Planning & Specification | 20% | {rating} | {score}/20 |
| Iterative Refinement | 20% | {rating} | {score}/20 |
| Decision Documentation | 15% | {rating} | {score}/15 |
| Testing & Verification | 20% | {rating} | {score}/20 |

---

## Strengths

{Bulleted list with specific examples from the session}

## Areas for Improvement

{Bulleted list with specific examples and why they matter}

## Actionable Suggestions for Next Session

{Numbered list of concrete things to try}

## Prompt Improvement Examples

{Before/after prompt examples based on patterns observed}

---

_Generated on {timestamp} by session-feedback skill_
_Source logs: logs/copilot/_

Script Usage

Generate a feedback report from the command line:

bash

.github/skills/session-feedback/scripts/generate-feedback-report.sh [YYYY-MM-DD]

•If a date is provided, the script looks for session-report-{date}.md
•If no date is provided, the script analyzes all available session reports
•Output is always written to logs/feedback/report-{date}.md

Best Practices

| Practice | Why |
|----------|-----|
| Run feedback after every session | Builds a habit of reflection and continuous improvement |
| Review suggestions before the next session | Primes you to apply improvements |
| Compare reports over time | Track your growth in AI collaboration skills |
| Share feedback reports with your team | Helps establish team-wide best practices |

References

•Evaluation rubric details: references/evaluation-rubric.md
•Session interaction logger skill: ../.github/skills/session-interaction-logger/SKILL.md