AgentSkillsCN

session-feedback

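Evaluates session interaction logs and provides actionable feedback on user-AI collaboration. Use when asked to "review session", "evaluate logs", "give feedback", "analyze interactions", "session feedback", or "improve workflow". Reads logs from logs/copilot/ and generates a feedback report at logs/feedback/report-{date}.md covering strengths, areas for improvement, and concrete suggestions for future sessions.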

SKILL.md
---
name: session-feedback
description: 'Evaluate session interaction logs and provide actionable feedback on user-AI collaboration. Use when asked to "review session", "evaluate logs", "give feedback", "analyze interactions", "session feedback", or "improve workflow". Reads logs from logs/copilot/ and generates a feedback report under logs/feedback/report-{date}.md with strengths, improvement areas, and concrete suggestions for future sessions.'
---

Session Feedback

Evaluate AI session interaction logs and generate actionable feedback to help users improve their AI-assisted development workflow.

Overview

This skill reads the session reports and log files produced by session-interaction-logger and produces a structured feedback report. The report highlights what the user did well, identifies missed opportunities, and provides concrete suggestions for future sessions.

code
Read Session Logs → Analyze Patterns → Evaluate Against Rubric → Generate Feedback Report

When to Use This Skill

  • After completing a coding session to get retrospective feedback
  • When asked to "review session", "evaluate logs", or "give feedback"
  • Before starting a new session to review lessons from the last one
  • When preparing for an AI-usage evaluation or audit

Input Sources

The skill reads from the following files (produced by session-interaction-logger):

| Source | Location | What It Provides |
|--------|----------|------------------|
| Session reports | logs/copilot/session-report*.md | High-level session summaries, timelines, decisions |
| Interactions log | logs/copilot/interactions.log | Full prompt/response detail (JSONL) |
| File changes log | logs/copilot/file-changes.log | What was created/modified and why (JSONL) |
| Commands log | logs/copilot/commands.log | Terminal commands executed (JSONL) |
| Decisions log | logs/copilot/decisions.log | Architectural decisions with rationale (JSONL) |

If structured JSONL logs are not available, the skill falls back to analyzing the Markdown session reports.
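
The fallback described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `read_jsonl` and `load_sources` and the `markdown_only` flag are assumptions made for the example; only the file names come from the table.

```python
import json
from pathlib import Path

# File names taken from the Input Sources table.
JSONL_LOGS = ("interactions.log", "file-changes.log", "commands.log", "decisions.log")


def read_jsonl(path: Path) -> list:
    """Parse one JSON value per non-empty line; skip lines that fail to parse."""
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate a truncated or corrupt line
    return records


def load_sources(log_dir: Path) -> dict:
    """Collect structured logs when present, plus the Markdown session reports."""
    structured = {}
    for name in JSONL_LOGS:
        path = log_dir / name
        if path.is_file():
            structured[name] = read_jsonl(path)
    reports = sorted(log_dir.glob("session-report*.md"))
    return {
        "structured": structured,
        "reports": reports,
        # Hypothetical flag: True means only the Markdown reports can be analyzed.
        "markdown_only": not structured,
    }
```

Calling `load_sources(Path("logs/copilot"))` would return whatever structured records exist and signal when the analysis has to rely on the Markdown reports alone.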

Output

A feedback report written to:

code
logs/feedback/report-{YYYY-MM-DD}.md
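
As a small illustration of the naming pattern above, the path could be built from a date like this. The helper name `feedback_report_path` and its `root` parameter are hypothetical and not part of the skill.

```python
from datetime import date
from pathlib import Path


def feedback_report_path(report_date: date, root: Path = Path(".")) -> Path:
    """Build logs/feedback/report-{YYYY-MM-DD}.md under `root`."""
    # date.isoformat() yields the YYYY-MM-DD form used in the file name.
    return root / "logs" / "feedback" / f"report-{report_date.isoformat()}.md"
```

For example, `feedback_report_path(date(2024, 3, 5))` points at `logs/feedback/report-2024-03-05.md` relative to the current directory.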

Evaluation Rubric

The feedback report evaluates the session across these dimensions:

1. Prompt Quality (Weight: 25%)

| Rating | Criteria |
|--------|----------|
| Excellent | Prompts are specific, provide context, reference files/docs, state desired outcome |
| Good | Prompts are clear but could include more context or constraints |
| Needs Work | Prompts are vague, ambiguous, or require multiple clarifications |

What to look for:

  • Did the user reference design docs, files, or specific requirements?
  • Were prompts self-contained or did they require many follow-up clarifications?
  • Did the user specify constraints (language version, patterns, testing expectations)?

2. Planning & Specification (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | User requested a plan before implementation; reviewed and approved it |
| Good | Some planning occurred but implementation started without full review |
| Needs Work | No planning phase; jumped straight to implementation |

What to look for:

  • Was a plan/spec document created before coding?
  • Did the user review the plan and provide feedback?
  • Were steps broken down into manageable pieces?

3. Iterative Refinement (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | User reviewed outputs, caught issues, requested fixes, iterated on quality |
| Good | Some iteration occurred but outputs were mostly accepted as-is |
| Needs Work | Outputs blindly accepted without review or testing |

What to look for:

  • Did the user request changes or improvements after initial output?
  • Were tests run and failures addressed?
  • Did the user verify behavior matches expectations?

4. Decision Documentation (Weight: 15%)

| Rating | Criteria |
|--------|----------|
| Excellent | Key decisions documented with rationale and alternatives considered |
| Good | Some decisions noted but missing rationale or alternatives |
| Needs Work | No decision documentation; choices made without explanation |

What to look for:

  • Were architectural choices explained?
  • Were alternatives considered and trade-offs discussed?
  • Can someone reading the logs understand why decisions were made?

5. Testing & Verification (Weight: 20%)

| Rating | Criteria |
|--------|----------|
| Excellent | Tests written alongside implementation; failures caught and fixed; coverage considered |
| Good | Tests added but not comprehensive; some verification performed |
| Needs Work | No tests or verification of AI-generated code |

What to look for:

  • Were tests requested as part of implementation?
  • Were build/test commands run to verify correctness?
  • Were test failures analyzed and resolved?

Step-by-Step Workflow

Step 1: Gather Session Logs

  1. List all files under logs/copilot/
  2. Identify session report files (session-report*.md)
  3. Check for structured JSONL logs (interactions.log, decisions.log, etc.)
  4. If a specific date is requested, filter to that session; otherwise analyze the most recent
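
One way to sketch the selection in this step is shown below. The function name `select_reports` is hypothetical, and treating "most recent" as "latest modification time" is an assumption; the skill does not say how recency is determined.

```python
from pathlib import Path
from typing import Optional


def select_reports(log_dir: Path, date: Optional[str] = None) -> list:
    """Pick session reports for `date` (YYYY-MM-DD), else the most recent one."""
    reports = sorted(log_dir.glob("session-report*.md"))
    if date is not None:
        # Keep only reports whose file name contains the requested date.
        return [p for p in reports if date in p.name]
    if not reports:
        return []
    # Assumption: "most recent" means the latest modification time on disk.
    return [max(reports, key=lambda p: p.stat().st_mtime)]
```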

Step 2: Analyze Session Content

Read each log source and extract:

  • Prompts: User requests — count, specificity, context provided
  • Iterations: How many rounds of refinement occurred per task
  • Decisions: Architectural choices, rationale, alternatives
  • File changes: Volume, organization, whether tests accompanied code
  • Commands: Build/test execution, success/failure patterns
  • Clarifications: How often the AI needed to ask for more info
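
A rough sketch of how a few of these signals might be extracted from parsed JSONL records follows. The skill does not document the log schema, so the field names `prompt` and `exit_code`, and the function name `summarize`, are purely hypothetical placeholders to adapt to the real records.

```python
def summarize(interactions: list, commands: list) -> dict:
    """Derive simple counts from parsed log records (lists of dicts)."""
    # Hypothetical field: each interaction record may carry a "prompt" string.
    prompts = [r["prompt"] for r in interactions if r.get("prompt")]
    # Hypothetical field: each command record may carry an integer "exit_code";
    # a missing value is treated as success.
    failed = [c for c in commands if c.get("exit_code", 0) != 0]
    avg_words = sum(len(p.split()) for p in prompts) / len(prompts) if prompts else 0.0
    return {
        "prompt_count": len(prompts),
        "avg_prompt_words": avg_words,
        "command_count": len(commands),
        "failed_command_count": len(failed),
    }
```

Counts like these are only evidence for the rubric; judging specificity or context still requires reading the prompts themselves.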

Step 3: Score Against Rubric

For each rubric dimension:

  1. Review the relevant evidence from the logs
  2. Assign a rating: Excellent, Good, or Needs Work
  3. Provide specific examples from the session to justify the rating
  4. Calculate a weighted overall score out of 100 by summing the per-dimension scores, each capped at that dimension's weight
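
The weighted score could be computed along these lines. The weights come from the rubric, but the skill does not state how a rating converts to points, so the `RATING_PERCENT` mapping (100/70/40) and the function name `score_session` are assumptions chosen for illustration only.

```python
# Weights from the rubric; they sum to 100.
WEIGHTS = {
    "Prompt Quality": 25,
    "Planning & Specification": 20,
    "Iterative Refinement": 20,
    "Decision Documentation": 15,
    "Testing & Verification": 20,
}

# Assumption: share of a dimension's weight earned by each rating.
RATING_PERCENT = {"Excellent": 100, "Good": 70, "Needs Work": 40}


def score_session(ratings: dict) -> tuple:
    """Return (per-dimension scores, overall score out of 100)."""
    per_dimension = {
        dim: weight * RATING_PERCENT[ratings[dim]] / 100
        for dim, weight in WEIGHTS.items()
    }
    return per_dimension, sum(per_dimension.values())
```

A missing dimension or an unknown rating raises `KeyError`, which surfaces incomplete evaluations instead of silently scoring them as zero.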

Step 4: Generate Suggestions

Based on the evaluation, produce:

  • Strengths: 2-4 things the user did well (with examples)
  • Improvement Areas: 2-4 areas to focus on (with examples)
  • Actionable Tips: 3-5 concrete things to try in the next session
  • Prompt Templates: Example improved prompts based on patterns observed

Step 5: Write Feedback Report

Generate the report at logs/feedback/report-{YYYY-MM-DD}.md using the structure below.

Report Template

The generated feedback report follows this structure:

markdown
# Session Feedback Report

**Date**: {date}
**Session(s) Analyzed**: {session IDs or report file names}
**Generated By**: session-feedback skill

---

## Overall Score: {X}/100

| Dimension | Weight | Rating | Score |
|-----------|--------|--------|-------|
| Prompt Quality | 25% | {rating} | {score}/25 |
| Planning & Specification | 20% | {rating} | {score}/20 |
| Iterative Refinement | 20% | {rating} | {score}/20 |
| Decision Documentation | 15% | {rating} | {score}/15 |
| Testing & Verification | 20% | {rating} | {score}/20 |

---

## Strengths

{Bulleted list with specific examples from the session}

## Areas for Improvement

{Bulleted list with specific examples and why they matter}

## Actionable Suggestions for Next Session

{Numbered list of concrete things to try}

## Prompt Improvement Examples

{Before/after prompt examples based on patterns observed}

---

_Generated on {timestamp} by session-feedback skill_
_Source logs: logs/copilot/_
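
To illustrate how the score table in the template might be filled in, here is a minimal rendering sketch. The function name `render_score_table` and the row tuple layout are hypothetical; only the column headers and cell shapes follow the template.

```python
def render_score_table(rows: list) -> str:
    """Render (dimension, weight, rating, score) tuples as the template's table."""
    lines = [
        "| Dimension | Weight | Rating | Score |",
        "|-----------|--------|--------|-------|",
    ]
    for dimension, weight, rating, score in rows:
        # ":g" drops a trailing ".0", so 20.0 prints as 20 and 17.5 stays 17.5.
        lines.append(f"| {dimension} | {weight}% | {rating} | {score:g}/{weight} |")
    return "\n".join(lines)
```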

Script Usage

Generate a feedback report from the command line:

bash
.github/skills/session-feedback/scripts/generate-feedback-report.sh [YYYY-MM-DD]

  • If a date is provided, it scans for session-report-{date}.md
  • If no date is provided, it analyzes all available session reports
  • Output is always written to logs/feedback/report-{date}.md

Best Practices

| Practice | Why |
|----------|-----|
| Run feedback after every session | Builds a habit of reflection and continuous improvement |
| Review suggestions before the next session | Primes you to apply improvements |
| Compare reports over time | Track your growth in AI collaboration skills |
| Share feedback reports with your team | Helps establish team-wide best practices |

References

  • Evaluation rubric details: references/evaluation-rubric.md
  • Session interaction logger skill: ../.github/skills/session-interaction-logger/SKILL.md