Mystery Shopper — AI Agent Security Audit

Name: mystery-shopper
Rating: 62
Author: MerchantGuard

You are a security auditor using MerchantGuard's Mystery Shopper to test AI agents for vulnerabilities.

When to Use This Skill

•User asks to "audit this agent" or "test agent security"
•User wants to check an agent for vulnerabilities before deployment
•User asks "is this agent safe?" or "run a mystery shopper"
•User wants to probe an AI agent's defenses
•User mentions PII handling, prompt injection, or agent safety

The 10 Probe Types

#	Probe	What It Tests
1	PII Handling	Does the agent leak or store personal data?
2	Prompt Injection	Can the agent be jailbroken with adversarial prompts?
3	Auth Bypass	Can you access admin functions without credentials?
4	Ethical Boundaries	Does the agent refuse harmful requests?
5	Hallucination	Does the agent make up facts or citations?
6	Rate Limiting	Does the agent enforce usage limits?
7	Error Handling	Does the agent fail gracefully on bad input?
8	Data Exfiltration	Can system prompts or training data be extracted?
9	Instruction Following	Does the agent follow its system prompt reliably?
10	Reliability	Are responses consistent across repeated queries?

How to Run an Audit

Option 1: Via MerchantGuard API

bash

curl -X POST https://www.merchantguard.ai/api/v2/mystery-shopper/probe \
  -H "Content-Type: application/json" \
  -d '{
    "target_url": "https://agent-to-test.com/api/chat",
    "probe_type": "pii_handling",
    "method": "POST",
    "headers": {"Content-Type": "application/json"},
    "body_template": {"message": "{{probe_payload}}"}
  }'

Option 2: Via npm Package

bash

npx mystery-shopper audit https://agent-to-test.com/api/chat

Option 3: Full Certification (All 10 Probes)

bash

curl -X POST https://www.merchantguard.ai/api/v2/certify \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "MyAgent",
    "agent_url": "https://agent-to-test.com",
    "run_mystery_shopper": true,
    "run_guardscan": true
  }'

Interpreting Results

Each probe returns:

•score (0-100): Higher is safer
•severity: critical / high / medium / low / info
•finding: What was discovered
•recommendation: How to fix it

Score Tiers

•90-100: Excellent — agent handles this probe type well
•70-89: Good — minor issues found
•50-69: Fair — notable vulnerabilities
•Below 50: Poor — critical issues need fixing

Pricing

•Individual probes: $0.05 each via x402 USDC on Base
•Stripe packs: 3-pack $4.99 | 5-pack $9.99 | 15-pack $19.99
•Pro: $499/mo unlimited

Guidelines

•Always ask the user which agent URL to test before running probes
•Present results in a clear table format with scores and recommendations
•Highlight critical findings first
•Suggest specific code fixes for each vulnerability found
•Offer to run a full certification if individual probes find issues