AgentSkillsCN

ai-bias-auditor

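Use when identifying fairness issues in AI systems. Recommended before deployment and periodically afterward. This skill produces bias assessment reports, fairness metrics, mitigation strategies, and audit documentation.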

SKILL.md
--- frontmatter
name: ai-bias-auditor
description: Use when identifying fairness issues in AI systems. Use before deployment and periodically after. Produces bias assessment, fairness metrics, mitigation strategies, and audit documentation.

AI Bias Auditor

Overview

Identify and address fairness issues in AI systems. Conduct structured bias audits, measure disparate impact, and recommend mitigation strategies.

Core principle: Fairness is not optional. Proactive bias auditing protects users and the organization.

When to Use

  • Before deploying an AI system
  • After significant model changes
  • Periodic audits of production systems
  • Responding to bias complaints
  • Regulatory compliance requirements

Output Format

yaml
bias_audit:
  system: "[System name]"
  audit_date: "[YYYY-MM-DD]"
  auditor: "[Name/Team]"
  audit_type: "[Pre-deployment | Periodic | Complaint-driven]"
  
  scope:
    model_purpose: "[What the model does]"
    decision_type: "[Recommendation | Automation | Scoring]"
    affected_population: "[Who is impacted]"
    protected_attributes:
      - attribute: "[e.g., gender, race, age]"
        available_in_data: [true | false]
        proxy_risk: "[Potential proxies]"
  
  methodology:
    data_analyzed:
      training_data: [true | false]
      production_data: [true | false]
      sample_size: "[N]"
      time_period: "[Date range]"
    
    metrics_used:
      - metric: "[Metric name]"
        definition: "[How calculated]"
        threshold: "[Acceptable range]"
    
    techniques:
      - "[Disparate impact analysis]"
      - "[Intersectional analysis]"
  
  findings:
    summary: "[Overall assessment]"
    risk_level: "[High | Medium | Low | None detected]"
    
    by_attribute:
      - attribute: "[Protected attribute]"
        groups_compared: ["[Group A]", "[Group B]"]
        
        metrics:
          - metric: "[Metric name]"
            group_a: "[Value]"
            group_b: "[Value]"
            disparity: "[Ratio or difference]"
            threshold: "[Acceptable]"
            status: "[Pass | Fail | Review]"
        
        finding: "[Interpretation]"
        severity: "[High | Medium | Low | None]"
    
    intersectional:
      - groups: "[e.g., gender + age]"
        finding: "[What was found]"
        severity: "[Level]"
  
  root_cause_analysis:
    potential_sources:
      - source: "[Data | Model | Feature | Label]"
        description: "[How it introduces bias]"
        confidence: "[High | Medium | Low]"
  
  mitigation:
    recommendations:
      - recommendation: "[What to do]"
        priority: "[High | Medium | Low]"
        effort: "[Estimate]"
        expected_impact: "[How it helps]"
    
    if_deployed:
      - "[Monitoring requirement]"
      - "[Human oversight requirement]"
  
  compliance:
    regulations_considered: ["[Regulation]"]
    documentation_provided: ["[What's documented]"]
    
  verdict:
    deploy_recommendation: "[Approve | Approve with conditions | Do not deploy]"
    conditions: ["[If conditional approval]"]
    next_audit: "[When]"

Fairness Metrics

Group Fairness Metrics

| Metric | Definition | Threshold |
| --- | --- | --- |
| Demographic Parity | Equal positive prediction rates across groups | Ratio ≥ 0.8 |
| Equalized Odds | Equal TPR and FPR across groups | Difference < 0.1 |
| Predictive Parity | Equal precision across groups | Ratio ≥ 0.8 |
| Calibration | Predicted probabilities match observed outcome rates equally well in each group | Similar calibration curves |
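
As a rough illustration of how the quantities behind these metrics could be computed, the sketch below derives per-group selection rate, TPR, FPR, and precision from binary labels and predictions. The function names `group_rates` and `equalized_odds_difference` are hypothetical, and taking the larger of the TPR and FPR gaps is one common way to summarize equalized odds, not the only one.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, TPR, FPR, and precision for binary (0/1) data."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f" = prediction correct or not, "p"/"n" = predicted positive or negative
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    rates = {}
    for g, c in counts.items():
        n = c["tp"] + c["fp"] + c["tn"] + c["fn"]
        rates[g] = {
            "selection_rate": ratio(c["tp"] + c["fp"], n),
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),
            "precision": ratio(c["tp"], c["tp"] + c["fp"]),
        }
    return rates


def equalized_odds_difference(rates, group_a, group_b):
    """Largest absolute gap in TPR or FPR between two groups (assumed summary)."""
    a, b = rates[group_a], rates[group_b]
    return max(abs(a["tpr"] - b["tpr"]), abs(a["fpr"] - b["fpr"]))
```

The demographic parity and predictive parity ratios follow by dividing the smaller group's `selection_rate` or `precision` by the larger one.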

Disparate Impact Ratio

code
DI Ratio = (Favorable outcome rate for protected group) / 
           (Favorable outcome rate for reference group)

The reference group is usually the majority or most-favored group.

Interpretation:
- DI ≥ 0.8: Generally acceptable (80% rule)
- DI ≥ 0.6 and < 0.8: Needs review
- DI < 0.6: Likely problematic
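
The ratio and its interpretation bands can be sketched in a few lines. The function and parameter names here are hypothetical, and treating a ratio of exactly 0.8 as passing is an assumption of this sketch that follows the usual reading of the 80% rule.

```python
def disparate_impact_ratio(protected_favorable, protected_total,
                           reference_favorable, reference_total):
    """Favorable outcome rate of the protected group divided by that of the reference group."""
    protected_rate = protected_favorable / protected_total
    reference_rate = reference_favorable / reference_total
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes; ratio is undefined")
    return protected_rate / reference_rate


def interpret_di(di):
    """Map a DI ratio to the interpretation bands listed above."""
    if di >= 0.8:  # assumption: exactly 0.8 counts as acceptable
        return "Generally acceptable"
    if di >= 0.6:
        return "Needs review"
    return "Likely problematic"
```

For example, 45 favorable outcomes out of 100 for the protected group against 60 out of 100 for the reference group gives a ratio of about 0.75, which falls in the "Needs review" band.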

Statistical Parity Difference

code
SPD = |P(Ŷ=1|A=0) - P(Ŷ=1|A=1)|

Interpretation:
- SPD < 0.05: Minimal disparity
- SPD 0.05-0.10: Moderate disparity
- SPD > 0.10: Significant disparity
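
A minimal sketch of the SPD formula, assuming binary predictions (Ŷ as 0/1) and a binary protected attribute (A as 0/1). The function name is hypothetical.

```python
def statistical_parity_difference(y_pred, protected):
    """|P(Y_hat=1 | A=0) - P(Y_hat=1 | A=1)| for 0/1 predictions and a 0/1 attribute."""
    rate = {}
    for a in (0, 1):
        preds = [p for p, g in zip(y_pred, protected) if g == a]
        if not preds:
            raise ValueError(f"no samples with A={a}")
        # Mean of 0/1 predictions is the positive prediction rate
        rate[a] = sum(preds) / len(preds)
    return abs(rate[0] - rate[1])
```

With positive prediction rates of 0.50 for A=0 and 0.25 for A=1, the function returns 0.25, which the bands above classify as a significant disparity.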

Bias Sources

| Source | Examples | Detection |
| --- | --- | --- |
| Historical bias | Past discrimination encoded in labels | Compare to fair baseline |
| Representation bias | Some groups underrepresented | Check training data distribution |
| Measurement bias | Different measurement quality by group | Audit data collection |
| Aggregation bias | One model for heterogeneous groups | Test per-group performance |
| Proxy variables | Features correlated with protected attributes | Correlation analysis |
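
The correlation analysis for proxy variables might look like the following sketch. It assumes numeric features and a 0/1 protected attribute; the names `pearson` and `flag_proxies` and the 0.5 cutoff are illustrative assumptions, not values from this skill. Linear correlation will miss nonlinear proxies and proxies formed by combinations of features.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def flag_proxies(features, protected, threshold=0.5):
    """Return features whose |correlation| with the protected attribute meets the cutoff.

    features: dict mapping feature name -> list of numeric values
    protected: list of 0/1 values aligned with the feature lists
    threshold: assumed cutoff; tune it to the audit's risk tolerance
    """
    flagged = {}
    for name, values in features.items():
        r = pearson(values, protected)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged
```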

Audit Process

Phase 1: Scope Definition

yaml
scoping:
  questions:
    - "What decisions does this system inform?"
    - "Who is affected by these decisions?"
    - "What are the relevant protected attributes?"
    - "What harm could result from biased decisions?"
    - "What legal/regulatory requirements apply?"

Phase 2: Data Analysis

yaml
data_analysis:
  steps:
    - "Document protected attribute distribution"
    - "Identify potential proxy variables"
    - "Check for missing data patterns by group"
    - "Analyze historical label quality by group"
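
The first and third data analysis steps can be sketched as below, assuming records are plain dictionaries and that a missing value is represented by `None` or an absent key. The function names and the record layout are hypothetical.

```python
from collections import Counter, defaultdict


def group_distribution(groups):
    """Share of records belonging to each protected-attribute group."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def missing_rate_by_group(records, group_key, field):
    """Share of records per group where `field` is None or absent."""
    missing = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        g = record[group_key]
        totals[g] += 1
        if record.get(field) is None:
            missing[g] += 1
    return {g: missing[g] / totals[g] for g in totals}
```

A large gap in missing rates between groups is a hint of measurement bias and is worth recording in the audit findings.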

Phase 3: Model Analysis

yaml
model_analysis:
  steps:
    - "Calculate fairness metrics by group"
    - "Perform intersectional analysis"
    - "Test on held-out diverse dataset"
    - "Analyze feature importance by group"
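
Intersectional analysis can be approximated by computing selection rates over combinations of attributes rather than one attribute at a time. This sketch assumes dictionary records with a 0/1 prediction under a hypothetical `pred` key; `min_size` is an assumed guard against drawing conclusions from very small subgroups.

```python
from collections import defaultdict


def intersectional_selection_rates(records, attributes,
                                   prediction_key="pred", min_size=1):
    """Positive prediction rate for every observed combination of `attributes`.

    Subgroups with fewer than `min_size` records are omitted from the result.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        key = tuple(record[a] for a in attributes)
        totals[key] += 1
        positives[key] += record[prediction_key]
    return {k: positives[k] / totals[k] for k in totals if totals[k] >= min_size}
```

Calling it with `["gender"]` and then with `["gender", "age"]` on the same records shows whether a model that looks balanced on a single attribute still treats a specific subgroup differently.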

Phase 4: Reporting

yaml
reporting:
  elements:
    - "Executive summary with risk level"
    - "Detailed findings with evidence"
    - "Root cause hypotheses"
    - "Prioritized recommendations"
    - "Compliance documentation"

Mitigation Strategies

Pre-Processing

| Strategy | When to Use |
| --- | --- |
| Resampling | Underrepresented groups |
| Reweighting | Imbalanced impact |
| Data augmentation | Limited diverse examples |
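
One common reweighting scheme, often called reweighing, assigns each training example the weight P(A=a)·P(Y=y) / P(A=a, Y=y), so that group membership and label become statistically independent in the weighted data. The sketch below is an illustration of that scheme under the assumption of categorical groups and labels. It is not necessarily the method this skill intends, and the function name is hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-example weights w = P(A=a) * P(Y=y) / P(A=a, Y=y)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    # Counts are used directly: (n_a / n) * (n_y / n) / (n_ay / n) = n_a * n_y / (n * n_ay)
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]
```

The resulting list can be passed as sample weights to any training routine that accepts them. Combinations that are rare in the data (for example, positive labels in a disadvantaged group) receive weights above 1.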

In-Processing

| Strategy | When to Use |
| --- | --- |
| Fairness constraints | Optimize with fairness objective |
| Adversarial debiasing | Remove protected info from embeddings |
| Regularization | Penalize group disparity |

Post-Processing

| Strategy | When to Use |
| --- | --- |
| Threshold adjustment | Equalize acceptance rates |
| Calibration | Align predictions by group |
| Human review | High-stakes edge cases |
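
Threshold adjustment can be sketched as choosing a separate score cutoff per group so that each group is accepted at roughly the same rate. The function names and the `target_rate` parameter are hypothetical. Tied scores can push the realized acceptance rate above the target, and whether group-specific thresholds are permissible depends on the applicable regulations.

```python
def threshold_for_rate(scores, target_rate):
    """Score cutoff such that about `target_rate` of scores satisfy score >= cutoff."""
    ranked = sorted(scores, reverse=True)
    k = round(target_rate * len(ranked))  # number of examples to accept
    if k <= 0:
        return float("inf")  # accept nobody
    return ranked[k - 1]


def group_thresholds(scores, groups, target_rate):
    """Per-group cutoffs that approximately equalize acceptance rates."""
    by_group = {}
    for score, g in zip(scores, groups):
        by_group.setdefault(g, []).append(score)
    return {g: threshold_for_rate(vals, target_rate) for g, vals in by_group.items()}
```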

Documentation Requirements

For Regulatory Compliance

yaml
compliance_documentation:
  model_card:
    - "Intended use and limitations"
    - "Training data description"
    - "Fairness evaluation results"
    - "Known biases and mitigations"
  
  audit_trail:
    - "Audit methodology"
    - "Data sources examined"
    - "Metrics calculated"
    - "Findings and decisions"
  
  ongoing_monitoring:
    - "Metrics tracked post-deployment"
    - "Alert thresholds"
    - "Review frequency"
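
The alert thresholds in an ongoing monitoring plan could be checked with something like the sketch below. The metric names, the `(low, high)` bound format, and the `check_alerts` function are assumptions for illustration; the example bounds reuse the 0.8 ratio and 0.10 difference values from the Fairness Metrics section.

```python
def check_alerts(metrics, thresholds):
    """Compare tracked metrics against allowed ranges and return alert messages.

    metrics: dict mapping metric name -> latest value
    thresholds: dict mapping metric name -> (low, high); None means unbounded
    """
    alerts = []
    for name, (low, high) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: metric missing from monitoring data")
        elif low is not None and value < low:
            alerts.append(f"{name}: {value} is below the minimum {low}")
        elif high is not None and value > high:
            alerts.append(f"{name}: {value} is above the maximum {high}")
    return alerts


# Hypothetical monitoring configuration
EXAMPLE_THRESHOLDS = {
    "di_ratio": (0.8, None),  # disparate impact ratio should stay at or above 0.8
    "spd": (None, 0.10),      # statistical parity difference should stay at or below 0.10
}
```

Running the check on each monitoring interval and recording the returned alerts gives the audit trail a record of when thresholds were crossed.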

Audit Checklist

  • Protected attributes identified
  • Proxy variables analyzed
  • Training data distribution documented
  • Fairness metrics calculated
  • Intersectional analysis performed
  • Root causes hypothesized
  • Mitigations recommended
  • Compliance documentation complete
  • Monitoring plan established
  • Next audit scheduled