Working with Causal Frameworks; Simulations for Counterexamples
This lecture is about evaluating causal estimators
By the end, you should be able to
Suppose that we have data on
Interested in causal effect of \(\bX_{it}\) on \(Y_{it}\)
Data is panel:
One way to estimate the effects of treatment on \(Y_{it}\) is to use random intercept/fixed effects estimators
Simple example: one-way FE estimator:
Construct \[\small \tilde{Y}_{it} = Y_{it} - \dfrac{1}{T}\sum_{s=1}^T Y_{is}, \quad \tilde{\bX}_{it} = \bX_{it} - \dfrac{1}{T}\sum_{s=1}^T \bX_{is}, \]
The one-way FE estimator is the OLS estimator in regression of \(\tilde{Y}_{it}\) on \(\tilde{\bX}_{it}\)
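As a minimal illustration (the function name and the simulated data below are just for the example), the within transformation and the resulting one-way FE estimate for a scalar regressor can be computed as:
import numpy as np

def one_way_fe(y, x):
    """One-way FE estimate for a scalar regressor.

    y, x: arrays of shape (n_units, n_periods).
    """
    y_tilde = y - y.mean(axis=1, keepdims=True)  # Y_it - (1/T) sum_s Y_is
    x_tilde = x - x.mean(axis=1, keepdims=True)  # X_it - (1/T) sum_s X_is
    # OLS of y_tilde on x_tilde without an intercept
    return (x_tilde * y_tilde).sum() / (x_tilde**2).sum()

# Quick check: unit effects correlated with x, true slope = 2
rng = np.random.default_rng(0)
alpha = rng.normal(size=(500, 1))
x = alpha + rng.normal(size=(500, 4))
y = alpha + 2.0 * x + rng.normal(size=(500, 4))
print(one_way_fe(y, x))  # close to 2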
Remember: properties of causal estimator only meaningful if one specifies an underlying causal model
For one-way FE estimator: typical setting is linear potential outcomes model
\[ Y_{it}^{\bx} = \alpha_i + \bbeta'\bx + U_{it} \tag{1}\]
Under
the one-way random intercept estimator \(\hat{\bbeta}^{FE}\) satisfies \[ \begin{aligned} \E[\hat{\bbeta}^{FE}] & = \bbeta, \\ \sqrt{N}(\hat{\bbeta}^{FE} - \bbeta) & \xrightarrow{d} N(0, \avar(\hat{\bbeta}^{FE})) \end{aligned} \]
If model correctly specified, suitable FE estimator will recover \(\bbeta\)
But model (1) not internally consistent:
More realistic to assume causal model \[ Y_{it}^{\bx} = \alpha_i + \bbeta_{i}'\bx + U_{it} \tag{2}\] Now \(\bbeta_i\) — also unit-specific
Can also consider even more general models with \(\bbeta_{it}\)
Suppose model (2) is true, but we still apply \(\hat{\bbeta}^{FE}\)
What is \(\hat{\bbeta}^{FE}\) actually estimating?
Can show that \[ \small \hat{\bbeta}^{FE} \xrightarrow{p} \E\left[ \left( \E\left[ \tilde{\bX}_i'\tilde{\bX}_i \right]\right)^{-1} \tilde{\bX}'_i\tilde{\bX}_i\bbeta_i \right] \tag{3}\] where \(\tilde{\bX}_i\) is the matrix with \(\tilde{\bX}_{it}'\) as rows
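For intuition, specializing (3) to a scalar regressor gives a weighted average of the individual slopes, with weights proportional to each unit's within variation in \(X_{it}\): \[ \hat{\beta}^{FE} \xrightarrow{p} \E\left[ w_i \beta_i \right], \qquad w_i = \dfrac{\sum_{t=1}^T \tilde{X}_{it}^2}{\E\left[\sum_{t=1}^T \tilde{X}_{it}^2\right]} \] The weights are nonnegative with mean one, so the limit can be far from \(\E[\beta_i]\) (even of the opposite sign) whenever \(w_i\) and \(\beta_i\) are correlated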
Is it possible that “controlling for heterogeneity” with FE estimators leads to more bias than not doing anything about heterogeneity?
\[ Y_i^x = \phi(x, U_i) \] = “If unit \(i\) with unobserved characteristics \(U_i\) receives treatment \(x\), their outcome is \(Y_i^x\)”
\(\phi\) may be known or (fully/partially) unknown
Same type of approach can be used with panel or time series settings. Also can consider settings where treatment of others matters
Potential outcomes allow defining causal effects:
Causal effect of moving from treatment \(\bx_1\) to \(\bx_2\) for unit \(i\) is defined as \[ \phi(\bx_2, U_i) - \phi(\bx_1, U_i) \]
Usually interested in some distributional features of causal effects:
Sometimes can learn individual treatment effects themselves
Two scenarios for studying a causal estimator:
In simulations:
Three metrics you typically see when evaluating causal estimators:
Distribution: usually for inference purposes (good normal approximation \(\Rightarrow\) good performance of usual asymptotic intervals)
So far:
Question is robustness:
Can FE estimator be somehow worse than not eliminating \(\alpha_i\) at all?
Again: want simplest possible example
Specify model in form \[ \begin{aligned} Y_{it} & = \alpha_i + (\beta + \alpha_i) X_{it} + U_{it}, \quad i=1, \dots, N, \quad t=1, 2 \end{aligned} \]
Now just need to specify \(\beta\) and distribution of \(\alpha_i\)
\(\Rightarrow\) two types of units:
Obvious that we need to include one-way random intercept/FE estimator
And
Remember: causal parameter must be specified based on the causal model, not on the estimator
In our case
Want a dramatic example of things going wrong for the FE estimator vis-à-vis \(\E[\beta_i]\)
Potential metric: distributions of estimators and their relation to \(\E[\beta_i]\)
Might be hard to get OLS exactly unbiased, depending on strict exogeneity
Good place to start looking — moment conditions that characterize limits of estimators
Distribution of \((X_{it}, U_{it})\) must be same for both estimators
Moment conditions tell us
Assume that \(U_{it}\) is standard normal and independent of everything else
In some settings it is important to explicitly model selection into treatment
Will solve numerically as part of DGP generation
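As a purely illustrative sketch of what “solving numerically” can look like (the actual moment conditions live in dgp.moment_info and are not reproduced here; the moments and target values below are hypothetical): encode the implied moments as a function of the candidate parameters and minimize the squared deviation from the targets
import numpy as np
from scipy.optimize import minimize

def implied_moments(params):
    """Hypothetical map from candidate DGP parameters to implied moments."""
    mu, sigma = params
    return np.array([mu, mu**2 + sigma**2])  # e.g. E[X] and E[X^2]

target_moments = np.array([0.5, 1.25])  # hypothetical target values

def objective(params):
    """Sum of squared deviations of implied moments from their targets."""
    diff = implied_moments(params) - target_moments
    return float(diff @ diff)

result = minimize(objective, x0=np.array([0.0, 1.0]), method="Nelder-Mead")
print(result.x)  # parameters that (approximately) hit the target moments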
Again have a small targeted simulation \(\Rightarrow\) will structure using functions
├── dgp
│ ├── constants.py
│ ├── data_generation.py
│ ├── __init__.py
│ └── moment_info.py
├── exp.ipynb
├── gmm
│ ├── __init__.py
│ └── solver.py
├── main.py
├── README.md
│ └── ...
└── simulation_infrastructure
├── constants.py
├── __init__.py
├── plotting.py
└── runner_functions.py
Again have code with tighter coupling but less infrastructure to manage
Will structure some things differently this time to give broader perspective
main.py
"""
Entry point for running simulation on bias of FE estimators under heterogeneity.
Causal model:
Y_{it}^x = alpha_i + beta_i * x + U_{it}
Goal of simulation: construct an example of distribution under which:
1. The OLS estimator in regression of Y_{it} on X_{it} is unbiased for
average coefficient value E[beta_i]
2. The one-way random intercept/fixed effects estimator has the wrong sign
relative to $E[beta_i]$ with high probability
Usage:
python main.py
Output:
Kernel density plots of distributions of the two estimators
"""
import os
from concurrent.futures import ProcessPoolExecutor, as_completed
import pandas as pd
from tqdm import tqdm
from dgp.constants import (
BETA_MEAN,
N_VALUES,
)
# from utils.combine_results import combine_results
from dgp.moment_info import (
constraints,
param_initial_guess,
process_mu_sigma_params,
sim_moment_conditions,
)
from gmm.solver import GMMSolver
from simulation_infrastructure.constants import (
N_REPLICATIONS,
OUTPUT_DIR,
SEEDS,
)
from simulation_infrastructure.plotting import plot_kdes
from simulation_infrastructure.runner_functions import run_simulation_for_seed
# Ensure output directory exists
os.makedirs(OUTPUT_DIR, exist_ok=True)
def main() -> None:
"""Main function to run simulations in parallel and combine results."""
# Select parameters of DGP by enforcing desired moment conditions
solver_dgp_params = GMMSolver(
sim_moment_conditions,
param_initial_guess,
constraints,
process_func=process_mu_sigma_params,
)
solver_dgp_params.minimize()
mu_sigma_params = solver_dgp_params.process_solution()
# Run simulations in parallel
all_results = []
with ProcessPoolExecutor() as executor:
futures = {
executor.submit(
run_simulation_for_seed,
seed,
N_REPLICATIONS,
N_VALUES,
BETA_MEAN,
mu_sigma_params,
)
for seed in SEEDS
}
# Collect results as they complete, with a progress bar
for future in tqdm(
as_completed(futures), total=len(futures), desc="Running simulations"
):
result = future.result()
all_results.append(result)
# Combine results and export plots
sim_results = pd.concat(all_results)
plot_kdes(sim_results, OUTPUT_DIR)
if __name__ == "__main__":
    main()
Construct parameters \((\mu_{\cdot}, \sigma_{\cdot}, \rho_{\cdot})\) by GMM
Use GMMSolver to find some valid parameter values
Then just collect the found parameters and pass them to the data generator function
Try varying the target values of the moments and comparing the resulting estimator distributions
Estimation done with pyfixest
Still want to use parallelism to make use of multiple cores
Here: use a process-based approach: spawn fully separate Python processes that each do part of the work, then combine the results
simulation_infrastructure.runner_functions
import numpy as np
import pandas as pd
import pyfixest as pf
from dgp.data_generation import generate_data
def run_simulation_for_seed(
seed: int,
n_replications: int,
n_values: np.ndarray,
beta_mean: float,
mu_sigma_params: dict[str, np.ndarray | np.floating],
):
"""
Runs simulations for OLS vs. FE for given seed.
Args:
seed (int): Random seed for reproducibility.
n_replications (int): Number of replications per seed.
n_values (np.array): Cross-sectional sample sizes to simulate.
beta_mean (float): Average coefficient value for generating data.
mu_sigma_params (dict[str, np.ndarray]): Dictionary containing:
- "mu_plus" (np.ndarray): Mean for covariates when effect is +1.
- "mu_minus" (np.ndarray): Mean for covariates when effect is -1.
- "sigma_plus" (np.ndarray): Covariance for X when effect is +1.
- "sigma_minus" (np.ndarray): Covariance for X when effect is -1.
Returns:
pd.DataFrame: Monte Carlo results for given seed.
"""
results = []
for n_units in n_values:
for replication in range(n_replications):
# Generate data
data = generate_data(
n_units,
beta_mean,
mu_sigma_params,
seed=seed + replication,
)
fit_no_effect = pf.feols(
"outcome ~ covariate", data=data, drop_intercept=True
)
fit_effect = pf.feols("outcome ~ covariate|Unit", data=data)
# Collect results
results.append(
{
"seed": seed,
"replication": replication,
"n_units": n_units,
"model": "No Fixed Effects",
"coef_est": fit_no_effect.coef().iloc[0],
"ci_lower": fit_no_effect.confint().iloc[0, 0],
}
)
results.append(
{
"seed": seed,
"replication": replication,
"n_units": n_units,
"model": "With Fixed Effects",
"coef_est": fit_effect.coef().iloc[0],
"ci_lower": fit_effect.confint().iloc[0, 0],
}
)
# Return results as a dataframe
    return pd.DataFrame(results)
Another difference:
Both FE and OLS estimators applied to same dataset in each step
Before: separate scenario for each estimator
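For reference, dgp.data_generation.generate_data is not shown above; a minimal sketch consistent with the runner's docstring and the two-type model might look as follows (the type probabilities, the choice \(\alpha_i = \beta_i\), and the exact parameter layout are assumptions, not the course code):
import numpy as np
import pandas as pd

def generate_data(n_units, beta_mean, mu_sigma_params, seed=0):
    """Hypothetical sketch: two unit types with slopes +1 and -1, T = 2 periods."""
    rng = np.random.default_rng(seed)
    # P(beta_i = +1) chosen so that E[beta_i] = beta_mean
    p_plus = (1 + beta_mean) / 2
    beta_i = np.where(rng.random(n_units) < p_plus, 1.0, -1.0)
    alpha_i = beta_i  # assumption: intercept equal to the unit's slope
    # Covariates over t = 1, 2: type-specific bivariate normal
    x = np.empty((n_units, 2))
    for sign, mu_key, sigma_key in [
        (1.0, "mu_plus", "sigma_plus"),
        (-1.0, "mu_minus", "sigma_minus"),
    ]:
        mask = beta_i == sign
        x[mask] = rng.multivariate_normal(
            mu_sigma_params[mu_key], mu_sigma_params[sigma_key], size=int(mask.sum())
        )
    # Outcomes: Y_it = alpha_i + beta_i * X_it + U_it, U_it ~ N(0, 1)
    u = rng.standard_normal((n_units, 2))
    y = alpha_i[:, None] + beta_i[:, None] * x + u
    # Long format with the column names the runner expects
    return pd.DataFrame(
        {
            "Unit": np.repeat(np.arange(n_units), 2),
            "time": np.tile([1, 2], n_units),
            "covariate": x.ravel(),
            "outcome": y.ravel(),
        }
    )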
Our simulation
In line with our goal, most appropriate summary — visualize distributions of estimators vs. \(\E[\beta_i]\)
Allows us to see how often their sign \(\neq\) sign of \(\E[\beta_i]\)
Recall: \(\E[\beta_i] = -0.25\) (dotted vertical line)
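For completeness, a minimal sketch of what a plotting helper like plot_kdes could look like (the actual simulation_infrastructure.plotting module is not shown, so the details are assumptions):
import os

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

BETA_MEAN = -0.25  # E[beta_i], marked with a dotted vertical line

def plot_kdes(sim_results: pd.DataFrame, output_dir: str) -> None:
    """Kernel density plots of the two estimators, one figure per sample size."""
    for n_units, df in sim_results.groupby("n_units"):
        fig, ax = plt.subplots()
        sns.kdeplot(data=df, x="coef_est", hue="model", ax=ax)
        ax.axvline(BETA_MEAN, linestyle=":", color="black")
        ax.set_xlabel("Estimate")
        ax.set_title(f"N = {n_units}")
        fig.savefig(os.path.join(output_dir, f"kde_n_{n_units}.png"))
        plt.close(fig)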
Results exactly as wanted:
Accounting for the (correctly specified!) random intercept \(\alpha_i\) leads to much worse performance than not accounting for any coefficient heterogeneity
Our simulation: example of what could go wrong
Two practical conclusions:
In this lecture we
Evaluating Causal Estimators