AgentSkillsCN

ai-vendor-evaluator

在比较不同AI平台、工具或服务时使用。建议在采购前使用。该技能可生成结构化的供应商对比报告、评分体系,以及推荐意见。

SKILL.md
--- frontmatter
name: ai-vendor-evaluator
description: Use when comparing AI platforms, tools, or services. Use before procurement. Produces structured vendor comparison, scoring, and recommendation.

AI Vendor Evaluator

Overview

Conduct structured evaluation of AI vendors, platforms, and services. Compare options objectively and make defensible procurement decisions.

Core principle: AI vendor selection is a strategic decision. Evaluate beyond features—consider support, roadmap, security, and total cost.

When to Use

  • Selecting LLM provider
  • Choosing ML platform
  • Evaluating AI SaaS tools
  • Renewing vendor contracts
  • Build vs buy decisions

Output Format

yaml
vendor_evaluation:
  project: "[What we're selecting for]"
  evaluation_date: "[YYYY-MM-DD]"
  evaluators: ["[Name]"]
  
  requirements:
    must_have:
      - requirement: "[Requirement]"
        rationale: "[Why essential]"
    
    should_have:
      - requirement: "[Requirement]"
        weight: "[1-5]"
    
    nice_to_have:
      - requirement: "[Requirement]"
  
  vendors:
    - vendor: "[Vendor name]"
      product: "[Product name]"
      
      overview:
        description: "[What they offer]"
        pricing_model: "[How they charge]"
        contract_terms: "[Key terms]"
      
      scores:
        - category: "[Category]"
          criteria:
            - criterion: "[Criterion]"
              score: "[1-5]"
              evidence: "[How verified]"
              notes: "[Details]"
          category_score: "[Weighted average]"
      
      pros: ["[Advantage]"]
      cons: ["[Disadvantage]"]
      risks: ["[Concern]"]
      
      total_score: "[Weighted total]"
      rank: "[Position vs others]"
  
  comparison_matrix:
    headers: ["[Criterion]", "[Vendor A]", "[Vendor B]"]
    rows:
      - ["[Criterion 1]", "[Score]", "[Score]"]
  
  total_cost_analysis:
    - vendor: "[Vendor]"
      year_1: "[$]"
      year_3: "[$]"
      assumptions: ["[Assumption]"]
  
  recommendation:
    selected: "[Vendor name]"
    rationale: "[Why chosen]"
    conditions: ["[Any conditions]"]
    risks_accepted: ["[Known risks]"]
    next_steps: ["[Actions to proceed]"]

Evaluation Categories

Technical Capability

CriterionWhat to Evaluate
Model performanceAccuracy, quality on your data
ScalabilityHandle your volume
LatencyResponse time requirements
CustomizationFine-tuning, configuration options
IntegrationAPIs, SDKs, connectors

Security & Compliance

CriterionWhat to Evaluate
Data handlingWhere data goes, retention
CertificationsSOC2, ISO27001, HIPAA
PrivacyGDPR compliance, consent
Access controlAuthentication, authorization
Audit capabilityLogging, monitoring

Operational

CriterionWhat to Evaluate
ReliabilitySLA, uptime history
SupportResponse time, expertise
DocumentationQuality, completeness
MonitoringObservability tools
Disaster recoveryBackup, failover

Strategic

CriterionWhat to Evaluate
Vendor stabilityFinancial health, market position
RoadmapFuture direction alignment
Lock-in riskPortability, standards
EcosystemPartners, community

Commercial

CriterionWhat to Evaluate
Pricing clarityUnderstandable pricing
Cost predictabilityUsage-based risks
Contract flexibilityTerms, exit clauses
Total costImplementation, operation, exit

Scoring Methodology

Score Scale

ScoreMeaning
5Excellent - exceeds requirements
4Good - meets requirements well
3Adequate - meets minimum requirements
2Weak - partially meets requirements
1Poor - does not meet requirements

Weighted Scoring

yaml
category_weights:
  technical: 30%
  security: 25%
  operational: 20%
  strategic: 15%
  commercial: 10%
  
total_score: "Sum of (category_score × weight)"

Evaluation Process

Phase 1: Requirements (Week 1)

yaml
steps:
  - "Document must-have requirements"
  - "Prioritize should-have requirements"
  - "Define evaluation criteria"
  - "Set weights by category"
  - "Identify stakeholders for input"

Phase 2: Research (Week 2)

yaml
steps:
  - "Identify vendor long list"
  - "Gather public information"
  - "Issue RFI if needed"
  - "Create short list (3-5 vendors)"

Phase 3: Evaluate (Weeks 3-4)

yaml
steps:
  - "Request demos"
  - "Technical proof of concept"
  - "Security questionnaire"
  - "Reference checks"
  - "Pricing negotiation"

Phase 4: Decide (Week 5)

yaml
steps:
  - "Score all criteria"
  - "Calculate weighted totals"
  - "Document recommendation"
  - "Present to stakeholders"
  - "Obtain approval"

Proof of Concept Design

yaml
poc_design:
  duration: "[2-4 weeks typical]"
  
  success_criteria:
    - criterion: "[What must be demonstrated]"
      measurement: "[How to measure]"
      threshold: "[Acceptable value]"
  
  test_scenarios:
    - scenario: "[Description]"
      data: "[What data to use]"
      expected: "[Expected outcome]"
  
  evaluation_questions:
    - "Does it work with our data?"
    - "What's the integration effort?"
    - "How does support respond to issues?"
    - "What's the actual cost at our scale?"

TCO Calculation

yaml
total_cost_of_ownership:
  year_1:
    license_subscription: "[$]"
    implementation: "[$]"
    integration: "[$]"
    training: "[$]"
    internal_effort: "[$]"
    total: "[$]"
  
  year_3:
    recurring: "[$ × 3]"
    scaling_assumptions: "[Volume growth]"
    price_changes: "[Expected changes]"
    total: "[$]"
  
  hidden_costs:
    - "Egress fees"
    - "Overage charges"
    - "Support tiers"
    - "Add-on features"

Red Flags

Red FlagConcern
Unclear pricingUnpredictable costs
No customer referencesLimited real-world validation
Vague roadmapMay not evolve with needs
Poor security responsesRisk exposure
High switching costsLock-in
New/unstable companyContinuity risk

Checklist

  • Requirements documented
  • Evaluation criteria defined
  • Weights assigned
  • Short list created
  • Demos completed
  • POC executed (if needed)
  • Security review done
  • References checked
  • TCO calculated
  • Scores documented
  • Recommendation justified