AgentSkillsCN

causal-inference-engine

掌握因果推断技能,用于估算处理效应,深入理解业务数据中的因果关系。

SKILL.md
--- frontmatter
name: causal-inference-engine
description: Causal inference skill for estimating treatment effects and understanding causal relationships in business data
allowed-tools:
  - Read
  - Write
  - Glob
  - Grep
  - Bash
metadata:
  specialization: decision-intelligence
  domain: business
  category: forecasting
  priority: medium
  shared-candidate: true
  tools-libraries:
    - econml
    - dowhy
    - causalml
    - statsmodels

Causal Inference Engine

Overview

The Causal Inference Engine skill provides sophisticated methods for estimating causal effects from observational data. It enables business analysts to move beyond correlation to understand true cause-and-effect relationships, supporting evidence-based decision-making for interventions, policy changes, and strategic initiatives.

Capabilities

  • Propensity score matching
  • Inverse probability weighting
  • Difference-in-differences
  • Instrumental variables
  • Regression discontinuity
  • Synthetic control methods
  • Causal forest implementation
  • Sensitivity analysis to unobserved confounding

Used By Processes

  • A/B Testing and Experimentation Framework
  • Predictive Analytics Implementation
  • Win/Loss Analysis Program

Usage

Problem Definition

python
# Define causal question
causal_problem = {
    "treatment": "marketing_campaign",
    "outcome": "purchase_conversion",
    "confounders": ["customer_segment", "prior_purchases", "channel", "region"],
    "instruments": ["random_assignment_probability"],  # if available
    "effect_type": "ATE",  # Average Treatment Effect
    "heterogeneity": ["customer_segment", "tenure"]  # for CATE
}

Propensity Score Matching

python
# Propensity score configuration
psm_config = {
    "method": "propensity_score_matching",
    "estimator": "logistic_regression",
    "matching": {
        "method": "nearest_neighbor",
        "caliper": 0.1,
        "replacement": False,
        "ratio": 1
    },
    "balance_check": True,
    "covariates": ["age", "income", "prior_purchases", "engagement_score"]
}

Difference-in-Differences

python
# DiD configuration
did_config = {
    "method": "difference_in_differences",
    "treatment_group": "stores_with_intervention",
    "control_group": "stores_without_intervention",
    "pre_period": ["2023-01", "2023-06"],
    "post_period": ["2023-07", "2023-12"],
    "parallel_trends_test": True,
    "fixed_effects": ["store_id", "month"]
}

Causal Forest (Heterogeneous Effects)

python
# Causal forest for CATE
causal_forest_config = {
    "method": "causal_forest",
    "n_trees": 1000,
    "honest": True,
    "effect_modifiers": ["customer_segment", "tenure", "region"],
    "output": {
        "individual_effects": True,
        "confidence_intervals": True,
        "variable_importance": True
    }
}

Method Selection Guide

MethodWhen to UseAssumptions
Propensity ScoreSelection on observablesNo unmeasured confounding
Difference-in-DifferencesPre/post with control groupParallel trends
Regression DiscontinuityThreshold-based treatmentContinuity at threshold
Instrumental VariablesUnmeasured confounding existsValid instrument
Synthetic ControlAggregate-level interventionPre-treatment fit
Causal ForestHeterogeneous effectsUnconfoundedness

Input Schema

json
{
  "causal_problem": {
    "treatment": "string",
    "outcome": "string",
    "confounders": ["string"],
    "effect_type": "ATE|ATT|CATE"
  },
  "data": "dataframe or path",
  "method_config": {
    "method": "string",
    "parameters": "object"
  },
  "validation": {
    "refutation_tests": ["placebo", "subset", "random_common_cause"],
    "sensitivity_analysis": "boolean"
  }
}

Output Schema

json
{
  "effect_estimate": {
    "point_estimate": "number",
    "confidence_interval": ["number", "number"],
    "p_value": "number",
    "standard_error": "number"
  },
  "heterogeneous_effects": {
    "subgroup": {
      "effect": "number",
      "ci": ["number", "number"]
    }
  },
  "diagnostics": {
    "balance_statistics": "object",
    "parallel_trends_test": "object",
    "first_stage_f_stat": "number (IV)"
  },
  "refutation_results": {
    "test_name": {
      "original_effect": "number",
      "refuted_effect": "number",
      "passed": "boolean"
    }
  },
  "sensitivity": {
    "robustness_value": "number",
    "interpretation": "string"
  }
}

Best Practices

  1. Clearly articulate the causal question before analysis
  2. Draw a causal diagram (DAG) to identify confounders
  3. Check covariate balance after matching/weighting
  4. Perform sensitivity analysis to unmeasured confounding
  5. Use multiple refutation tests to validate results
  6. Report effect sizes with confidence intervals
  7. Be transparent about assumptions and limitations

Refutation Tests

TestWhat It Checks
Placebo TreatmentEffect should be zero with random treatment
Placebo OutcomeEffect should be zero with unrelated outcome
Subset ValidationEffect should hold in subsamples
Random Common CauseAdding random confounder shouldn't change effect

Integration Points

  • Feeds into Hypothesis Tracker for test results
  • Connects with Experimentation Manager agent
  • Supports Predictive Analyst for causal features
  • Integrates with Bayesian Network Analyzer for causal graphs