Suppose that we observe an IID sample \(X_1, \dots, X_N\), where \[
X_i \sim N(\mu, \sigma^2)
\] We estimate \(\mu\) and \(\sigma^2\) with their maximum likelihood estimators
A \((1-\alpha)\times 100\%\) confidence set for \(\theta\) (\(\theta\in\R^p\)) is a random set \(S(X_1, \dots, X_N)\subseteq \R^p\) such that \[
P(\theta \in S(X_1, \dots, X_N)) = 1-\alpha
\]
\(S(\cdot, \cdots)\) is an asymptotic \((1-\alpha)\times 100\%\) confidence set for \(\theta\) if \(\lim_{N\to\infty} P(\theta \in S(X_1, \dots, X_N)) = 1-\alpha\)
\(P(\theta \in S(X_1, \dots, X_N))\) is the coverage of \(S\)
Confidence Sets as Set Estimators
Suppose that we are interested in some \(\btheta\in \R^p\)
Confidence sets can be viewed as set estimators — return a whole set of values in \(\R^p\) as a collection of guesses for \(\btheta\)
Confidence sets are interval estimators with coverage guarantees
Contrast with point estimators (e.g. OLS, IV), which give only one point
Familiar Example: CI Based on Asymptotic Normality
Proposition 1 Suppose that \(\sqrt{N}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \avar(\hat{\theta}))\). Let \(\widehat{\avar}(\hat{\theta})\xrightarrow{p} \avar(\hat{\theta})\).
The confidence interval \[
S = \left[ \hat{\theta} - z_{1-\alpha/2} \sqrt{ \frac{\widehat{\avar}(\hat{\theta})}{N} }, \hat{\theta}+ z_{1-\alpha/2} \sqrt{ \frac{\widehat{\avar}(\hat{\theta})}{N} } \right]
\tag{1}\] has asymptotic coverage \((1-\alpha)\times 100\%\)
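As a quick illustration, here is a minimal sketch of computing the interval in (1); the function name and inputs (a point estimate, a consistent estimate of the asymptotic variance, and the sample size) are illustrative:

```python
import numpy as np
from scipy.stats import norm


def asymptotic_ci(theta_hat: float, avar_hat: float, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Wald-type interval from Equation (1)."""
    z_crit = norm.ppf(1 - alpha / 2)  # z_{1-alpha/2}
    se = np.sqrt(avar_hat / n)        # standard error of theta_hat
    return theta_hat - z_crit * se, theta_hat + z_crit * se
```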
Connection to Tests: Test Inversion
There is an equivalent way to construct the confidence interval (1)
Recall the \(t\)-statistic for \(H_0: \theta = c\): \(t(c) = \sqrt{N}(\hat{\theta} - c)\big/\sqrt{\widehat{\avar}(\hat{\theta})}\)
\(S\) is the set of all \(c\) for which the \(t\)-test does not reject: \(S = \{c : |t(c)| \leq z_{1-\alpha/2}\}\)
This is an example of test inversion and of the equivalence between testing and confidence intervals
Interpretation of Confidence Interval
Recall interpretation of \(95\%\) coverage:
Suppose we draw many samples of the same size
In ~95% of the samples the CI will fall “on top” of the true parameter value
Notice: interpretation aligns with what we actually do in Monte Carlo
Metrics for Evaluating Confidence Intervals
There are two key metrics for evaluating confidence intervals in simulations
Coverage: the actual probability that the interval contains the true parameter
(Average) length
Generally prefer shorter intervals with coverage that is close to the nominal level
Metric: Actual Coverage
How do you evaluate coverage?
Draw many datasets and in each sample \(s\)
Compute CI
Check if the CI includes the true parameter and record inclusion as a variable \(D_s \in \{0, 1\}\) (1 for included)
Report estimated coverage as the average of \(D_s\) over \(s\)
Metric: Average Length
Similar story for length:
Draw many datasets and in each sample \(s\)
Compute CI = \([\hat{a}_s, \hat{b}_s]\)
Record \(\text{Length}_s = \hat{b}_s - \hat{a}_s\)
Report estimated average of \(\text{Length}_s\) over \(s\)
Can also report other characteristics of length distribution (e.g. quantiles)
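Both procedures fit in a few lines. Here is a minimal self-contained sketch, assuming the simple setting of estimating a normal mean with a Wald interval (the parameter values are illustrative, not part of the variance simulation below):

```python
import numpy as np
from scipy.stats import norm


def coverage_and_length(mu=0.0, sigma=1.0, n_obs=50, n_sims=10_000, alpha=0.05, seed=1):
    """Monte Carlo estimates of coverage and average length for a CI for the mean."""
    rng = np.random.default_rng(seed)
    z_crit = norm.ppf(1 - alpha / 2)
    covered, lengths = [], []
    for _ in range(n_sims):
        data = rng.normal(mu, sigma, n_obs)
        se = data.std(ddof=0) / np.sqrt(n_obs)
        lower, upper = data.mean() - z_crit * se, data.mean() + z_crit * se
        covered.append(lower <= mu <= upper)  # D_s: 1 if the CI contains mu
        lengths.append(upper - lower)         # Length_s
    return np.mean(covered), np.mean(lengths)


print(coverage_and_length())  # estimated coverage should be close to 0.95
```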
Simulation Implementation
Completing Simulation Design
Recap: Simulation Setting
Let’s come back to today’s example. We already have:
DGP of variables (normal with different variances)
Estimators: maximum likelihood
Confidence interval to evaluate: based on asymptotic normality of \(\hat{\sigma}^2\)
Still need to choose remaining DGP parameters: mean \(\mu\), sample size \(N\), and values of \(\sigma\)
Choosing the Remaining Parameters
Can set \[
\mu = 0
\]
For sample sizes:
Consider two: \(N=20\) and \(N=100\)
If there are problems with the asymptotic approximation, they should be less pronounced at larger sample sizes
For \(\sigma\): logarithmic grid between \(10^{-3}\) and \(10\)
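Such a grid is one line with NumPy; a minimal sketch (the number of grid points, 50 here, is a free choice):

```python
import numpy as np

# 50 logarithmically spaced points between 10^(-3) and 10^1
sigma_grid = np.logspace(-3, 1, 50)
```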
Implementing The Simulation
Approach to Code: Functions
Our simulation
Quite small
Unlikely to need many more DGPs or different estimators
Today will organize as a small collection of functions
Contrast with the modular approach of the previous two examples
The entry point main.py runs the simulation using functions from sim_infrastructure.core
and makes plots using functions from sim_infrastructure.plotting
core Simulation Module
sim_infrastructure/core.py
"""Core simulation logic for evaluating confidence intervals for variance.This module contains the core functions:- mle_variance: maximum likelihood estimation of variance.- asymptotic_ci_variance: construction of asymptotic confidence intervals.- run_simulations: running Monte Carlo simulations for coverage and interval length."""import numpy as npfrom scipy.stats import normdef mle_variance(data: np.ndarray) -> np.floating:"""Estimate the variance of a normal distribution using MLE."""return np.var(data, ddof=0) # MLE for variance is the sample variance with ddof=0def asymptotic_ci_variance(data: np.ndarray, alpha: float=0.05) ->tuple[float, float]:"""Construct an asymptotic (Wald-type) confidence interval for the variance.""" n =len(data) var_hat = mle_variance(data) se = np.sqrt(2* var_hat**2/ n) z_crit = norm.ppf(1- alpha /2) lower =max(0, var_hat - z_crit * se) # Ensure lower bound is non-negative upper = var_hat + z_crit * sereturn lower, upperdef run_simulations( true_vars: np.ndarray, n_simulations: int=1000, n_obs_list: list[int] = [20, 100], alpha: float=0.05, seed: int=1,) ->dict[int, dict[float, dict]]:""" Run Monte Carlo simulations for each true variance and sample size. Args: true_vars (np.ndarray): true variance values to simulate. n_simulations (int): number of Monte Carlo simulations per true variance. n_obs_list (list[int]): list of sample sizes to simulate. alpha (float): significance level for confidence intervals. seed (int): Random seed for reproducibility. Returns: Nested dictionary: {n_obs: {true_var: {'coverage': ..., 'errors': ..., 'lengths': ...}}} """ rng = np.random.default_rng(seed) results = {n_obs: {} for n_obs in n_obs_list}# Loop over sample sizesfor n_obs in n_obs_list:# Loop over true variablesfor true_var in true_vars: errors = [] lengths = [] coverage = []for _ inrange(n_simulations):# Draw data and run estimation data = rng.normal(0, np.sqrt(true_var), n_obs) var_hat = mle_variance(data)# Record estimation error in variance errors.append(var_hat - true_var)# Construct CI and check if variance belongs to it ci_lower, ci_upper = asymptotic_ci_variance(data, alpha) lengths.append((ci_upper - ci_lower) / true_var) coverage.append(ci_lower <= true_var <= ci_upper) results[n_obs][true_var] = {'coverage': np.mean(coverage),'errors': np.array(errors),'lengths': np.array(lengths) }return results
main.py
main.py
"""Entry point for running simulations and generating plots.Simulation 3: evaluating confidence intervals for variance of the normaldistribution based on the maximum likelihood estimator. Simultions considercoverage and (scaled) length properties as true variance decreases to zero.This script:- Sets up simulation parameters.- Runs the core simulation.- Generates and saves all plots."""from pathlib import Pathimport numpy as npfrom sim_infrastructure.core import run_simulationsfrom sim_infrastructure.plotting import plot_coverage_and_lengths, plot_error_kdes# Simulation parameters: note upper case for constantsTRUE_VARS = np.logspace(-3, 1, 50)N_OBS_LIST = [20, 100]N_SIMS =50000CI_ALPHA =0.05SEED =1OUTPUT_DIR = Path() /"results"def main():# Run simulations results = run_simulations(TRUE_VARS, N_SIMS, N_OBS_LIST, CI_ALPHA, SEED)# Generate plots plot_coverage_and_lengths(results, OUTPUT_DIR) plot_error_kdes(results, OUTPUT_DIR)if__name__=="__main__": main()
Core Organization: Discussion
Tightly coupled code
Sacrifice extensibility
But much easier to write, much less infrastructure
Fine for our small targeted simulation
Inside the simulation: store CI coverage and length, plus the estimation error of the maximum likelihood estimator
Simulation Results
Core Results
Visualizing Core Results
Our simulation produces coverage and length for each value of \(\sigma\)
How to look at these results?
Can visualize them graphically with coverage and length plots
For length: scale by the magnitude of the target value to get a “normalized” view
Coverage Plot
Length Plot
Interpretation of Simulation Results
For the core question:
Remarkably stable coverage and length properties
No dependence on \(\sigma\)
\(\Rightarrow\) no issue with \(\sigma\approx 0\)
But we do see coverage issues for \(N=20\) despite much longer intervals (around 87% actual coverage at a 95% nominal level). Why?
Why Coverage Issues for Smaller Samples?
Potential Explanations of Bad Coverage for \(N=20\)
Why the coverage issues for \(N=20\)?
Normal approximation might be poor (incorrect shape of interval around its center)
MLE estimator is biased downward (incorrect center of the interval)
Can look at the distribution of estimation errors and how it compares to a normal distribution with the same variance (sketched below)
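A minimal sketch of such a comparison plot, assuming the errors array produced by run_simulations (the actual sim_infrastructure.plotting module is not shown in these slides, so the names here are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde, norm


def plot_error_density(errors: np.ndarray) -> None:
    """Compare the density of estimation errors to a mean-zero normal reference."""
    grid = np.linspace(errors.min(), errors.max(), 200)
    kde = gaussian_kde(errors)  # smoothed density of the simulated errors
    # Normal reference centered at zero with the same variance as the errors
    reference = norm.pdf(grid, loc=0, scale=errors.std())
    plt.plot(grid, kde(grid), label="Estimation errors (KDE)")
    plt.plot(grid, reference, linestyle="--", label="Normal, same variance")
    plt.legend()
    plt.show()
```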
Example Distribution Plot for \(N=20\)
Example Distribution Plot for \(N=100\)
Discussions of Distributional Approximation
From looking at the two plots:
The shape is not perfect for \(N=20\): longer right tail, some asymmetry
Bias is present: noticeable shift to the left
But much better approximation for \(N=100\)
\(\Rightarrow\) Coverage issues are due to small sample effects
Recap and Conclusions
Recap
In this lecture we
Reviewed key concepts for confidence intervals
Identified key metrics: coverage and length
Saw a practical small simulation organized around tightly coupled functions