
validate-samples

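Validates the StreamableHttpServer and StdioServer samples end to end by reading each sample's README, executing every step, and verifying correct behavior with the MCP Inspector and the DTS Dashboard. Any issue found gets an initial root-cause investigation and is reported.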

SKILL.md
--- frontmatter
name: validate-samples
description: Validates the StreamableHttpServer and StdioServer samples end-to-end by walking through each sample's README instructions, executing every step, and verifying correct behavior via the MCP Inspector and DTS Dashboard. Reports any issues found with initial root-cause investigation.

Validate Samples

Walk through the README for each sample (StreamableHttpServer and StdioServer), execute every instruction, and verify that everything works correctly. Use playwright-cli for all browser interactions (MCP Inspector and DTS Dashboard). Document every step and its result in a validation report. If a bug in the implementation is discovered, offer to fix it.

General Guidelines

  • Read the sample README first. The README is the source of truth for the expected behavior. Before executing any steps, read the full README so you understand the happy path.
  • Document everything. Write a structured validation report (markdown) to the session workspace as you go. Each step should have a status (✅ pass / ❌ fail / ⚠️ warning) and notes.
  • Investigate failures. When something fails, capture the error output, take a screenshot, and do an initial investigation (check logs, source code, etc.). Identify whether the root cause is a bug in the implementation, a documentation error, or an environmental issue.
  • Offer to fix bugs. If the root cause is a bug in the project's source code, report it in the validation report and offer to fix it. Do NOT silently fix bugs—always report them first.
  • Retry after fixes. After fixing a bug, re-run the failing step (and subsequent steps) to confirm the fix works. Update the validation report accordingly.
  • Clean up between samples. Before starting the second sample, close all browser sessions and stop any running MCP server processes from the first sample. The DTS emulator should remain running across both samples.
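
The "document everything" guideline can be made mechanical with a small helper that appends one table row per step. This is only a sketch: the function name log_step, the three status keywords, and the table layout are illustrative choices, not something the project defines.

```bash
# log_step REPORT STEP STATUS NOTES
# Appends one markdown table row to REPORT. STATUS is pass, fail, or warn
# (hypothetical keywords mapped to the icons used in the guidelines).
log_step() {
  local report="$1" step="$2" status="$3" notes="$4" icon
  case "$status" in
    pass) icon="✅" ;;
    fail) icon="❌" ;;
    warn) icon="⚠️" ;;
    *) echo "unknown status: $status" >&2; return 2 ;;
  esac
  # Write the table header once, while the report is still missing or empty.
  if [ ! -s "$report" ]; then
    printf '| Step | Status | Notes |\n| --- | --- | --- |\n' > "$report"
  fi
  printf '| %s | %s | %s |\n' "$step" "$icon" "$notes" >> "$report"
}

# Example (not executed here):
#   log_step validation-report.md "Build the full solution" pass "dotnet build was green"
```

Calling the helper right after each step keeps the report current even if a later step aborts the run.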

Pre-Flight Checks

Before validating either sample, perform these checks:

1. Check if the DTS emulator is already running

powershell
# Check if port 8080 (gRPC) is already in use
Test-NetConnection -ComputerName localhost -Port 8080 -WarningAction SilentlyContinue |
  Select-Object TcpTestSucceeded
  • If port 8080 is listening: The DTS emulator is likely already running. Skip the Docker start step. Optionally verify by opening the DTS Dashboard at http://localhost:8082 in the browser.

  • If port 8080 is NOT listening: Start the emulator:

    bash
    docker run -d -p 8080:8080 -p 8082:8082 mcr.microsoft.com/dts/dts-emulator:latest
    

    Wait a few seconds and re-check port 8080 to confirm it's up.
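
"Wait a few seconds and re-check" can be replaced by a bounded retry loop. The sketch below is for shells where Test-NetConnection is not available (for example Git Bash or WSL); the function name wait_for_port and its default of 10 attempts, 2 seconds apart, are assumptions, and the /dev/tcp pseudo-device it relies on is a bash feature.

```bash
# wait_for_port HOST PORT [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as a TCP connection succeeds, 1 if every attempt fails.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-10}" delay="${4:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Open the socket in a subshell so the descriptor is closed automatically.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Example (not executed here):
#   wait_for_port localhost 8080 10 2 && echo "DTS emulator gRPC port is up"
```

A bounded loop gives a clear failure to record in the report instead of an indefinite wait.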

2. Check if MCP Inspector port is free

powershell
Test-NetConnection -ComputerName localhost -Port 6274 -WarningAction SilentlyContinue |
  Select-Object TcpTestSucceeded

If port 6274 is already in use, stop whatever is using it or note it in the report.

3. Build the full solution

bash
dotnet build DurableTaskMcp.slnx

If the build fails, stop and report the error. Do not proceed with validation until the build is green.


Sample 1: StreamableHttpServer

Validate every step in samples/StreamableHttpServer/README.md.

Step 1 — Start the MCP server

  1. Run the server from the repo root:

    bash
    dotnet run --project samples/StreamableHttpServer
    
  2. Verify the console output includes the expected log lines:

    • Durable Task gRPC worker starting and connecting to localhost:8080
    • Now listening on: http://localhost:5000
  3. If the server fails to start, capture the error and investigate.
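
The log check in step 2 can be scripted if the server output is captured to a file. This sketch assumes the output was saved to a hypothetical server.log (for example by piping dotnet run through tee); check_log_lines is an illustrative helper, and the exact log wording may differ from the substrings used here.

```bash
# check_log_lines LOGFILE PATTERN...
# Prints PASS/FAIL per pattern and returns non-zero if any pattern is missing.
check_log_lines() {
  local logfile="$1" pattern missing=0
  shift
  for pattern in "$@"; do
    # -F: fixed-string match, -q: quiet; "--" guards patterns starting with "-".
    if grep -Fq -- "$pattern" "$logfile"; then
      echo "PASS: $pattern"
    else
      echo "FAIL: $pattern"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not executed here):
#   check_log_lines server.log "localhost:8080" "Now listening on: http://localhost:5000"
```

Matching on substrings rather than whole lines keeps the check tolerant of timestamps and log-level prefixes.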

Step 2 — Launch the MCP Inspector

  1. In a separate shell, run:

    bash
    npx @modelcontextprotocol/inspector
    
  2. Use playwright-cli to open http://localhost:6274.

  3. Take a snapshot to verify the Inspector UI loaded.

Step 3 — Connect to the server

  1. In the Inspector UI:
    • Set Transport Type to Streamable HTTP (use the dropdown).
    • Set URL to http://localhost:5000.
    • Click Connect.
  2. Verify the UI shows "Connected" and the server name "DurableTask MCP Server".
  3. Take a screenshot for the report.

Step 4 — List tools

  1. Click List Tools.
  2. Verify exactly four tools appear: process_data, run_pipeline, approval_workflow, and submit_approval.
  3. Take a screenshot showing the tool list.
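
"Exactly four tools" means no missing names and no extras. If the tool names are copied out of the Inspector snapshot into a space-separated string, a comparison along these lines would catch both cases; same_tool_set is a hypothetical helper, not part of the project.

```bash
# same_tool_set "EXPECTED" "ACTUAL"
# Each argument is a whitespace-separated list of tool names.
# Returns 0 only when both lists hold the same names, in any order.
same_tool_set() {
  local expected actual
  # Word splitting on $1/$2 is intentional: one name per line, then sort.
  expected=$(printf '%s\n' $1 | sort)
  actual=$(printf '%s\n' $2 | sort)
  [ "$expected" = "$actual" ]
}

# Example (not executed here):
#   same_tool_set "process_data run_pipeline approval_workflow submit_approval" "$tools_seen"
```

Plain sort (rather than sort -u) is used so that a duplicated tool name also counts as a mismatch.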

Step 5 — Run process_data as a task

  1. Select the process_data tool.
  2. Fill in the parameters: itemCount = 3, delayPerItemSeconds = 1.
  3. Check the Run as task checkbox.
  4. Click Run Tool.
  5. Wait for the task to complete (the Inspector polls automatically).
  6. Verify:
    • The tool result appears showing task completion.
    • The History panel shows tools/call and tasks/get messages.
  7. Take a screenshot of the completed task.

Step 6 — Run run_pipeline as a task

  1. Select the run_pipeline tool.
  2. Fill in parameters: dataDescription = "test data", itemCount = 3.
  3. Check Run as task and click Run Tool.
  4. Wait for completion and verify multi-step progress messages appear.
  5. Take a screenshot.

Step 7 — Run approval_workflow (human-in-the-loop)

  1. Select the approval_workflow tool.
  2. Fill in parameters: requestTitle = "Test approval", requestedBy = "tester".
  3. Check Run as task and click Run Tool.
  4. Wait for the task to reach input_required status.
  5. Note the task ID shown in the response (this is the workflowId for the next step).
  6. Select the submit_approval tool.
  7. Fill in workflowId with the task ID from step 5, set approved to true.
  8. Click Run Tool (this is a regular tool call, NOT a task).
  9. Return to the approval workflow task and verify it completes successfully.
  10. Take screenshots at each stage (input_required, after approval, completed).
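
The human-in-the-loop flow is essentially a small state machine: the validator reacts differently depending on the task status it observes. The sketch below writes that mapping down. Only input_required is named in the steps above; the other status names (working, completed, failed, cancelled) are assumptions based on the MCP tasks specification and should be checked against what the Inspector actually displays. next_action is an illustrative helper.

```bash
# next_action STATUS
# Maps an observed task status to what the validator should do next.
next_action() {
  case "$1" in
    working)          echo "wait" ;;                  # assumed status name
    input_required)   echo "call submit_approval" ;;  # named in Step 7
    completed)        echo "verify result" ;;         # assumed status name
    failed|cancelled) echo "investigate" ;;           # assumed status names
    *)                echo "unknown status: $1" ;;
  esac
}

# Example (not executed here):
#   next_action input_required
```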

Step 8 — Verify DTS Dashboard

  1. Use playwright-cli to open http://localhost:8082.
  2. Navigate to the default task hub.
  3. Verify that orchestration instances appear for each tool invocation.
  4. Critically examine the history for errors:
    • Check that all orchestrations show a "Completed" status (not "Failed").
    • Click into at least one orchestration to view the detail/timeline.
    • Look for any unexpected error messages, failed activities, or retries.
  5. Verify that orchestration IDs match the MCP task IDs shown in the Inspector.
  6. Take screenshots of the orchestration list and at least one detail view.
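
The ID cross-check in step 5 is easy to get wrong by eye. One option, sketched here under the assumption that the task IDs from the Inspector and the orchestration IDs from the dashboard have each been saved one per line to hypothetical text files, is to list every Inspector ID that has no match; missing_ids is an illustrative helper.

```bash
# missing_ids INSPECTOR_IDS_FILE DASHBOARD_IDS_FILE
# Prints each ID from the first file that is absent from the second.
# Returns 0 when nothing is missing, non-zero otherwise.
missing_ids() {
  # -F fixed strings, -x whole-line match, -v invert: keep unmatched lines.
  grep -Fxv -f "$2" -- "$1"
  # grep exits 1 when it printed nothing, which here means every ID matched.
  [ $? -eq 1 ]
}

# Example (not executed here):
#   missing_ids inspector-ids.txt dashboard-ids.txt || echo "some task IDs have no orchestration"
```

Extra orchestrations in the dashboard (for example from an earlier run) do not cause a failure, because only the Inspector's IDs are checked for presence.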

Step 9 — Clean up

  1. Stop the MCP server process (Ctrl+C or stop the shell).
  2. Close the MCP Inspector.
  3. Close any playwright-cli browser sessions used for this sample.
  4. Do NOT stop the DTS emulator—it is shared with the next sample.

Sample 2: StdioServer

Validate every step in samples/StdioServer/README.md.

Step 1 — Build the server

bash
dotnet build samples/StdioServer

Verify the build succeeds.

Step 2 — Launch Inspector with the server

  1. Run the Inspector with the server executable:

    bash
    npx @modelcontextprotocol/inspector samples/StdioServer/bin/Debug/net10.0/StdioServer.exe
    
  2. Use playwright-cli to open http://localhost:6274.

  3. Take a snapshot to verify the Inspector loaded with STDIO transport pre-configured.

Step 3 — Connect and list tools

  1. Click Connect.
  2. Verify "Connected" with server name "DurableTask MCP Server (stdio)".
  3. Click List Tools.
  4. Verify the process_data tool appears.
  5. Take a screenshot.

Step 4 — Run process_data as a task

  1. Select process_data.
  2. Set itemCount = 3, delayPerItemSeconds = 1.
  3. Check Run as task and click Run Tool.
  4. Wait for the task to complete.
  5. Verify:
    • Task completes successfully with the expected result.
    • History panel shows tools/call and tasks/get messages.
  6. Take a screenshot.

Step 5 — Verify DTS Dashboard

  1. Use playwright-cli to open http://localhost:8082.
  2. Navigate to the default task hub.
  3. Verify the new orchestration instance appears alongside any from the previous sample.
  4. Check for errors in the orchestration history — same scrutiny as Sample 1 Step 8.
  5. Take a screenshot.

Step 6 — Clean up

  1. Close the Inspector (this also terminates the stdio server).
  2. Close all playwright-cli browser sessions.

Validation Report

After completing both samples, compile the validation report with:

  1. Summary table — one row per validation step with status and notes.
  2. Issues found — detailed description of any failures, with:
    • Error message or screenshot.
    • Root-cause analysis (bug / doc error / environment issue).
    • Whether a fix was applied and verified.
  3. Screenshots — reference all screenshots taken during validation.
  4. Overall verdict — PASS (all steps green), PARTIAL (some issues), or FAIL (blocking issues).
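
The overall verdict can be derived from the step counts. The mapping below is one possible reading, not a rule stated by the project: it assumes every ❌ step is blocking (FAIL) and that ⚠️ steps alone yield PARTIAL. Adjust it if some failures are judged non-blocking. overall_verdict is a hypothetical helper.

```bash
# overall_verdict FAIL_COUNT WARN_COUNT
# Prints PASS, PARTIAL, or FAIL based on the number of failed and warning steps.
overall_verdict() {
  local fails="$1" warns="$2"
  if [ "$fails" -gt 0 ]; then
    echo "FAIL"       # assumption: any failed step is treated as blocking
  elif [ "$warns" -gt 0 ]; then
    echo "PARTIAL"    # warnings only: some issues, none blocking
  else
    echo "PASS"       # all steps green
  fi
}

# Example (not executed here):
#   overall_verdict 0 2
```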

Save the report to the session workspace:

text
C:\Users\cgillum\.copilot\session-state\<session-id>\files\validation-report.md

Common Issues and Troubleshooting

DTS emulator won't start

  • Check if Docker is running: docker info
  • Check if ports are already bound: netstat -ano | findstr :8080
  • Look for old containers with docker ps -a | findstr dts-emulator, then remove any stale one with docker rm -f <container-id> before starting a fresh emulator.

MCP server fails to connect to DTS

  • Verify the connection string in appsettings.json or environment variable.
  • Confirm the emulator gRPC endpoint is reachable: Test-NetConnection localhost -Port 8080
  • Check server logs for gRPC connection errors.

Inspector can't connect to MCP server

  • Verify the server is running and listening on the expected port.
  • Check for CORS or firewall issues.
  • Try refreshing the Inspector page.

Tool invocation returns an error

  • Check the MCP server console logs for exceptions.
  • Check the DTS Dashboard for failed orchestration instances.
  • Look at the orchestration input/output in the dashboard for deserialization errors.

Orchestration shows "Failed" in DTS Dashboard

  • Click the orchestration to view the failure details.
  • Check the "Failure" section for the exception message and stack trace.
  • Common causes: missing activity registration, serialization errors, null reference exceptions.