Skill: ALW (Assumptions, Limitations, Weaknesses)

Purpose

Surface and formalize the assumptions embedded in the model, the limitations of its design and scope, and the weaknesses that could lead to model failure or misuse.

This skill makes implicit risk explicit.

Inputs

Required IR fields:

  • methodology outputs
  • code evidence snippets
  • commentary_md

Skill data inputs:

  • alw_taxonomy.yaml (common assumption/limitation categories)

Outputs

Structured lists of:

  • Assumptions (model, data, market, numerical)
  • Limitations (scope, coverage, realism)
  • Weaknesses (failure modes, sensitivities, brittleness)
  • Ranked failure modes with brief impact descriptions

Rules

Evidence & uncertainty (non-negotiable)

  • Every material, non-trivial claim must cite the evidence ids that support it.
  • If a claim cannot be supported, write “Not evidenced” and record it in unknowns with:
    • question
    • why it matters
    • what evidence would resolve it
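
The evidence rule above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the names `Claim`, `Unknown`, and `render_claim`, the field names, and the `ev-001` id format are all assumptions made for the example.

```python
from dataclasses import dataclass, field

NOT_EVIDENCED = "Not evidenced"


@dataclass
class Unknown:
    # Hypothetical record mirroring the three sub-bullets above.
    question: str            # what is not known
    why_it_matters: str      # consequence of leaving it unresolved
    resolving_evidence: str  # what evidence would resolve it


@dataclass
class Claim:
    text: str
    evidence_ids: list = field(default_factory=list)  # e.g. ["ev-001"] (assumed id format)


def render_claim(claim, unknowns, why_it_matters, resolving_evidence):
    """Return the claim with its evidence ids, or 'Not evidenced' plus a logged unknown."""
    if claim.evidence_ids:
        return f"{claim.text} [{', '.join(claim.evidence_ids)}]"
    # No supporting evidence: do not assert the claim, record an unknown instead.
    unknowns.append(Unknown(
        question=f"Is this supported: {claim.text}",
        why_it_matters=why_it_matters,
        resolving_evidence=resolving_evidence,
    ))
    return NOT_EVIDENCED
```

An unsupported claim therefore never appears in the output as fact; it leaves a trace in `unknowns` that a reviewer can follow up on.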

Taxonomy & specificity

  • Distinguish assumptions (conditions the model takes as true) from limitations (bounds on what the model covers or can do); they are not the same.
  • Tag each ALW item using categories from alw_taxonomy.yaml (or use custom:<reason>).
  • Avoid generic boilerplate; tailor to this model and its interfaces.
  • Absence of evidence is itself a weakness: capture it explicitly as “Not evidenced”.
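
A tag check along the lines of the tagging rule above might look like the sketch below. The category names in `TAXONOMY` are invented placeholders standing in for the contents of alw_taxonomy.yaml, which are not shown here, and `is_valid_tag` is a hypothetical helper.

```python
# Placeholder categories; the real ones would be loaded from alw_taxonomy.yaml.
TAXONOMY = frozenset({
    "assumption.data",
    "assumption.market",
    "limitation.scope",
    "weakness.sensitivity",
})

CUSTOM_PREFIX = "custom:"


def is_valid_tag(tag, taxonomy=TAXONOMY):
    """Accept a taxonomy category, or custom:<reason> with a non-empty reason."""
    if tag.startswith(CUSTOM_PREFIX):
        reason = tag[len(CUSTOM_PREFIX):].strip()
        return bool(reason)
    return tag in taxonomy
```

Requiring a non-empty reason keeps `custom:` from becoming a way to skip the taxonomy without explanation.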

Actionable weaknesses format

  • Weaknesses must be actionable and include:
    • trigger
    • failure mechanism
    • impact
    • detection (how it would be noticed)
    • mitigation idea (may be “unknown” if not evidenced)
  • Ranked failure modes must state the ranking basis (impact × likelihood × detectability, at least qualitatively).
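
One way to hold the weakness fields and produce a ranking with a stated basis is sketched below. The three-level scale, the field names, and the convention that a higher `detection_difficulty` means the failure is harder to notice are assumptions for illustration; the skill only requires that the basis be stated, at least qualitatively.

```python
from dataclasses import dataclass

# Assumed qualitative scale; any ordered scale would serve.
SCALE = {"low": 1, "medium": 2, "high": 3}
RANKING_BASIS = "impact x likelihood x detection difficulty (qualitative, 1-3 each)"


@dataclass
class Weakness:
    trigger: str
    failure_mechanism: str
    impact: str
    detection: str             # how it would be noticed
    impact_level: str          # "low" | "medium" | "high"
    likelihood: str            # "low" | "medium" | "high"
    detection_difficulty: str  # "high" means hard to notice
    mitigation: str = "unknown"  # allowed when no mitigation is evidenced


def risk_score(w):
    return SCALE[w.impact_level] * SCALE[w.likelihood] * SCALE[w.detection_difficulty]


def ranked_failure_modes(weaknesses):
    """Sort by descending score and attach the ranking basis to every entry."""
    ordered = sorted(weaknesses, key=risk_score, reverse=True)
    return [
        {"rank": i, "trigger": w.trigger, "score": risk_score(w), "basis": RANKING_BASIS}
        for i, w in enumerate(ordered, start=1)
    ]
```

The numeric score is only an ordering aid; the qualitative levels and the basis string are what a reader of the output should rely on.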

JSON / schema contract

  • Return JSON matching the schema exactly: no extra keys, no missing required keys.
  • Use explicit null/sentinel only where allowed by the schema.
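
The "no extra keys, no missing required keys" rule can be checked mechanically. The sketch below covers top-level keys only and uses the standard library; the key names in `REQUIRED_KEYS` are guesses based on the Outputs section, since the actual schema is not reproduced here.

```python
import json

# Assumed top-level keys; replace with the keys the real schema requires.
REQUIRED_KEYS = frozenset({
    "assumptions",
    "limitations",
    "weaknesses",
    "ranked_failure_modes",
    "unknowns",
})


def check_top_level_keys(raw_json, required=REQUIRED_KEYS):
    """Return (missing, extra) top-level keys for a JSON object string."""
    payload = json.loads(raw_json)
    if not isinstance(payload, dict):
        raise ValueError("top-level JSON value must be an object")
    keys = set(payload)
    missing = sorted(required - keys)
    extra = sorted(keys - required)
    return missing, extra
```

A response passes this check only when both returned lists are empty. Nested objects and value types would need a full schema validator, which this sketch does not attempt.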

System Prompt

You are performing a model risk analysis to identify assumptions, limitations, and weaknesses. Your goal is to reduce surprise and support safe use of the model.

User Prompt Template

From the IR and methodology:

  1. Identify explicit and implicit assumptions.
  2. Identify structural and practical limitations.
  3. Identify weaknesses and plausible failure modes.
  4. Rank the most material weaknesses by potential impact.

Return JSON matching the schema exactly.

Post-run Checks

  • Each category (assumptions, limitations, weaknesses) is populated or explicitly justified as empty.
  • Failure modes are specific to this model, not generic.
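
The first check can be automated roughly as shown below. The `empty_justifications` field is a hypothetical place to record why a category is empty; the real schema may carry that justification elsewhere. The second check (specific versus generic failure modes) is a judgment call and is left to a reviewer here.

```python
CATEGORIES = ("assumptions", "limitations", "weaknesses")


def unjustified_empty_categories(result):
    """List categories that are empty and carry no written justification."""
    # "empty_justifications" is an assumed field name, not part of a known schema.
    justifications = result.get("empty_justifications") or {}
    problems = []
    for name in CATEGORIES:
        populated = bool(result.get(name))
        justified = bool(str(justifications.get(name, "")).strip())
        if not populated and not justified:
            problems.append(name)
    return problems
```

An empty return value means the first post-run check passes.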