Causal Inference Engine

Overview

The Causal Inference Engine skill provides sophisticated methods for estimating causal effects from observational data. It enables business analysts to move beyond correlation to understand true cause-and-effect relationships, supporting evidence-based decision-making for interventions, policy changes, and strategic initiatives.

Capabilities

•Propensity score matching
•Inverse probability weighting
•Difference-in-differences
•Instrumental variables
•Regression discontinuity
•Synthetic control methods
•Causal forest implementation
•Sensitivity analysis to unobserved confounding

Used By Processes

•A/B Testing and Experimentation Framework
•Predictive Analytics Implementation
•Win/Loss Analysis Program

Usage

Problem Definition

python

# Define causal question
causal_problem = {
    "treatment": "marketing_campaign",
    "outcome": "purchase_conversion",
    "confounders": ["customer_segment", "prior_purchases", "channel", "region"],
    "instruments": ["random_assignment_probability"],  # if available
    "effect_type": "ATE",  # Average Treatment Effect
    "heterogeneity": ["customer_segment", "tenure"]  # for CATE
}

Propensity Score Matching

python

# Propensity score configuration
psm_config = {
    "method": "propensity_score_matching",
    "estimator": "logistic_regression",
    "matching": {
        "method": "nearest_neighbor",
        "caliper": 0.1,
        "replacement": False,
        "ratio": 1
    },
    "balance_check": True,
    "covariates": ["age", "income", "prior_purchases", "engagement_score"]
}

Difference-in-Differences

python

# DiD configuration
did_config = {
    "method": "difference_in_differences",
    "treatment_group": "stores_with_intervention",
    "control_group": "stores_without_intervention",
    "pre_period": ["2023-01", "2023-06"],
    "post_period": ["2023-07", "2023-12"],
    "parallel_trends_test": True,
    "fixed_effects": ["store_id", "month"]
}

Causal Forest (Heterogeneous Effects)

python

# Causal forest for CATE
causal_forest_config = {
    "method": "causal_forest",
    "n_trees": 1000,
    "honest": True,
    "effect_modifiers": ["customer_segment", "tenure", "region"],
    "output": {
        "individual_effects": True,
        "confidence_intervals": True,
        "variable_importance": True
    }
}

Method Selection Guide

Method	When to Use	Assumptions
Propensity Score	Selection on observables	No unmeasured confounding
Difference-in-Differences	Pre/post with control group	Parallel trends
Regression Discontinuity	Threshold-based treatment	Continuity at threshold
Instrumental Variables	Unmeasured confounding exists	Valid instrument
Synthetic Control	Aggregate-level intervention	Pre-treatment fit
Causal Forest	Heterogeneous effects	Unconfoundedness

Input Schema

json

{
  "causal_problem": {
    "treatment": "string",
    "outcome": "string",
    "confounders": ["string"],
    "effect_type": "ATE|ATT|CATE"
  },
  "data": "dataframe or path",
  "method_config": {
    "method": "string",
    "parameters": "object"
  },
  "validation": {
    "refutation_tests": ["placebo", "subset", "random_common_cause"],
    "sensitivity_analysis": "boolean"
  }
}

Output Schema

json

{
  "effect_estimate": {
    "point_estimate": "number",
    "confidence_interval": ["number", "number"],
    "p_value": "number",
    "standard_error": "number"
  },
  "heterogeneous_effects": {
    "subgroup": {
      "effect": "number",
      "ci": ["number", "number"]
    }
  },
  "diagnostics": {
    "balance_statistics": "object",
    "parallel_trends_test": "object",
    "first_stage_f_stat": "number (IV)"
  },
  "refutation_results": {
    "test_name": {
      "original_effect": "number",
      "refuted_effect": "number",
      "passed": "boolean"
    }
  },
  "sensitivity": {
    "robustness_value": "number",
    "interpretation": "string"
  }
}

Best Practices

•Clearly articulate the causal question before analysis
•Draw a causal diagram (DAG) to identify confounders
•Check covariate balance after matching/weighting
•Perform sensitivity analysis to unmeasured confounding
•Use multiple refutation tests to validate results
•Report effect sizes with confidence intervals
•Be transparent about assumptions and limitations

Refutation Tests

Test	What It Checks
Placebo Treatment	Effect should be zero with random treatment
Placebo Outcome	Effect should be zero with unrelated outcome
Subset Validation	Effect should hold in subsamples
Random Common Cause	Adding random confounder shouldn't change effect

Integration Points

•Feeds into Hypothesis Tracker for test results
•Connects with Experimentation Manager agent
•Supports Predictive Analyst for causal features
•Integrates with Bayesian Network Analyzer for causal graphs