Good Simulation Code I: Starting with Functions

Starting with a Monolithic Design for Small Simulations

Vladislav Morozov

Introduction

Lecture Info

Learning Outcomes

This lecture is about a starting approach to writing clean Monte Carlo code


By the end, you should be able to

  • Specify a basic simulation design
  • Start with a simple function-based simulation
  • Understand the importance of setting an RNG seed
  • Recognize the limitations of a monolithic design

References

Statistics:

  • Chapter 4 of Hansen (2022) for OLS
  • Chapter 14 of Hansen (2022) for time series basics

Programming

  • Chapters 16-18 of Lutz (2025) on functions in Python
  • Chapters 2-3 of Hillard (2020) on basics of design

Example Of This Block

Context: Bias of OLS Estimator

Consider the simple causal linear model:

\[ Y_{t}^x = \beta_0 + \beta_1 x + U_t, \quad t=1, \dots, T \]
where \(Y_t^x\) is the potential outcome under \(x\) for observation \(t\)


Basic econometrics tells us that the OLS estimator from regressing \(Y_t\) on \((1, X_t)\) is

  • Unbiased if \(\E[U_t|X_1, \dots, X_T] = 0\)
  • Possibly biased otherwise

Question of the Block: Does Bias Matter?

Why should a user of OLS care about this bias?

If the bias is large, it can lead to

  • Improperly centered confidence intervals
  • Wrong sign of point estimate

\(\Rightarrow\) both can lead to incorrect policy conclusions


Statistical question of this block: how big is the bias?

Answer Approaches

Remember: three ways to answer statistical questions:

  • Theory
  • Simulations
  • Empirical


This case: theory and empirical evaluation do not work

Need to do simulations

Why Not Theory?

Theory not fully satisfactory: expression for bias depends on the DGP for \((X_t, U_t)\): \[ \E[\hat{\bbeta}] - \bbeta= \E\left[ (\bW'\bW)^{-1}\bW'\bU \right] \] where \(\bW\) is the \(T\times 2\) matrix with \(t\)th row given by \((1, X_t)\)
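For context, the expression follows from one line of algebra: stacking the model as \(\bW\bbeta + \bU\) and plugging it into the OLS formula gives

\[ \hat{\bbeta} = (\bW'\bW)^{-1}\bW'(\bW\bbeta + \bU) = \bbeta + (\bW'\bW)^{-1}\bW'\bU, \]

and taking expectations yields the bias expression above.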

  • Bao and Ullah (2007): asymptotic order \(O(1/T)\)
  • Unclear if actual magnitude “important in practice”

Why Not Use Real Data?

Empirical data validation is not an option in either causal or predictive settings:

  • Can never measure the true bias, even if we believe in the linear model
  • In predictive settings: don’t even care about \(\hat{\bbeta}\), only about predicted \(\hat{Y}_t\)


We have to resort to simulations

Questions of the Block: Simulations

Technical question of the block: how to think about and structure simulations in general?


This course block — talk mostly about code design

  • Our three-step simulation anatomy
  • Design sketches: monolithic (today) and more extensible (next lecture). Then organization and orchestration

Specifying Our Simulation

Choosing Context

Simulation Requirements

Recall that simulations should be

  • Realistic: need to think about relevant real world scenario to emulate

  • Targeted:

    • Be specific in terms of target metrics
    • Focus on DGPs that cause differences in target metrics

Remember: simulations necessarily limited in scope — can only do finite number of experiments \(\Rightarrow\) need to focus

Choosing Scenario: Time Series

As an example, suppose we care about time series


What’s relevant?

  • Possibility of dependence across time
  • Different strength of dependence
  • Different lengths of time series
  • (More advanced): data drift/time-varying DGP

We’ll include the first three

Metrics

For simplicity, we will consider just one explicit metric:

Our metric: bias of the OLS estimator of \(\beta_1\):

\[ \text{Bias}(\hat{\beta}_1) = \E[\hat{\beta}_1] - \beta_1 \]

Other metrics:

  • Coverage of CIs based on the OLS estimator
  • Proportion of incorrect signs (\(\mathrm{sgn}(\hat{\beta}_1)\neq \mathrm{sgn}(\beta_1)\))
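As a preview, such metrics are simple summaries of the per-simulation output. A minimal sketch, assuming a list of errors \(\hat{\beta}_1 - \beta_1\) like the one our simulation function returns below (CI coverage would additionally require per-simulation standard errors, which we do not store in this lecture):

import numpy as np


def summarize_errors(errors: list[float], beta1: float) -> dict[str, float]:
    """Compute the bias and the share of wrong-signed estimates from per-simulation errors."""
    errors = np.asarray(errors)
    estimates = errors + beta1  # recover the beta1 estimates from the stored errors
    return {
        "bias": float(errors.mean()),
        "share_wrong_sign": float(np.mean(np.sign(estimates) != np.sign(beta1))),
    }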

Specifying DGPs

The Problem of DGP Choice

We now have to choose data generating processes that reflect

  • Dynamics of different strength
  • Different sample sizes

Ideally: would specify based on some reference empirical datasets

Here: talk in a very stylized manner

Static vs. Dynamic

First dimension: include both static and dynamic cases

Static:

\[ \small Y_{t} = \beta_0 + 0.5 X_{t} + U_{t} \]

Covariate \(X_t\) independent from \(U_t\) and over time

Dynamic:

\[ \small Y_{t} = \beta_0 + \beta_1 Y_{t-1} + U_{t} \]

Strength of dependence controlled by the value of \(\beta_1\):

\[ \small \beta_1 \in \curl{0, 0.5, 0.95} \]

Bias appears in the dynamic setting but not in the static one, so comparing dynamic vs. static is targeted

Sample Size and Distributions of \(U_t, X_t, Y_0\)

Second and third design dimension:

  1. Sample sizes
  2. DGPs for variables not determined by model

To start with — keep things simple to focus on essentials:

  • One DGP per \(U_t, X_t, Y_0\): each \(N(0, 1)\)
  • \(T=50, 200\)

Different sample sizes: show how the bias changes with \(T\) (targeted)
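To make the specification concrete, here is a minimal sketch of drawing one dataset from each DGP (illustrative only; the seed and variable names are arbitrary, and the full simulation function comes next):

import numpy as np

rng = np.random.default_rng(1)
T, beta0, beta1 = 50, 0.0, 0.5

# Static DGP: X_t drawn independently of U_t and over time
x_static = rng.normal(size=T)
y_static = beta0 + beta1 * x_static + rng.normal(size=T)

# Dynamic DGP: AR(1) with Y_0 ~ N(0, 1)
y_dynamic = np.empty(T + 1)
y_dynamic[0] = rng.normal()
for t in range(1, T + 1):
    y_dynamic[t] = beta0 + beta1 * y_dynamic[t - 1] + rng.normal()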

Basic Functional Implementation

Implementation Approaches

Summary So Far

So far specified:

  • Metric
  • How each dataset should be generated
  • Size of each dataset


Now time to actually implement — but how?

Two Approaches

In this block we talk about two approaches:

  • Today: monolithic, based on a single function (good for starting and small simulations)
  • Next time: a more modular approach (better for larger simulations and easier to develop)


Today's code examples are illustrative

General Advice

How to not get overwhelmed — a good universal rule:

Start with the simplest pieces and iterate from there

  • Simulations are often a fairly complex piece of code
  • Do not try to write the whole thing in one go
  • Do not be afraid of rewriting and improving things later


Often: start with basic simulation function (today) and then refactor and split up (next time)

Basic Function

Practice: Starting with the Static Case

Our simplest thing: static model with a small sample \[ Y_{t} = \beta_0 + \beta_1 X_t + U_t, \quad T=50 \]

Remember simulation anatomy: for many datasets we

  1. Draw 50 points \((X_t, Y_t)\)
  2. Run OLS estimator \(\hat{\beta}_1\)
  3. Compute error \(\hat{\beta}_1-\beta_1\)

Simplest: wrap in a single function

Example: Basic Simulation Function

import numpy as np


def simulate_ols(
    n_sim: int = 1000, n_obs: int = 50, beta0: float = 0.0, beta1: float = 0.5
) -> list[float]:
    """Simulates OLS estimation for a static/exogenous DGP: y = beta0 + beta1*x + u.

    Args:
        n_sim (int): Number of simulations to run. Defaults to 1000.
        n_obs (int): Number of observations per simulation. Defaults to 50.
        beta0 (float): True intercept value. Defaults to 0.
        beta1 (float): True slope coefficient. Defaults to 0.5.

    Returns:
        list[float]: List of errors of beta1 estimates for each simulation.
    """

    # Create container to store results
    results_ols_errors = []

    # Initialize the RNG
    rng = np.random.default_rng()

    # Core simulation loop
    for _ in range(n_sim):
        # 1. Draw data
        x = rng.normal(size=n_obs)
        u = rng.normal(size=n_obs)
        y = beta0 + beta1 * x + u

        # 2. Run OLS estimation
        W = np.column_stack([np.ones(n_obs), x])
        beta_hat = np.linalg.inv(W.T @ W) @ W.T @ y

        # 3. Compute error
        results_ols_errors.append(beta_hat[1] - beta1)

    return results_ols_errors

Our First Results: Static Model

Now we can execute our simulation by calling the function with default arguments:

static_results = simulate_ols()
print(f"Bias (static DGP): {np.mean(static_results)}")
Bias (static DGP): -0.003325398379449356

Qualitatively:

  • Bias small relative to coefficients (recall \(\beta_1=0.5\))
  • Theory is confirmed

Distribution of MC Estimates

Can also take a look at the density of the estimates:
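The figure itself is not reproduced here; a minimal sketch of a comparable plot, assuming matplotlib is installed and static_results comes from the call above:

import matplotlib.pyplot as plt
import numpy as np

# Recover the beta1 estimates from the stored errors (true beta1 = 0.5)
estimates = np.array(static_results) + 0.5
plt.hist(estimates, bins=50, density=True)
plt.axvline(0.5, linestyle="--", label="True beta1 = 0.5")
plt.xlabel("OLS estimate of beta1")
plt.legend()
plt.show()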

Internal Reproducibility: Setting Seeds

Problem: Lack of Reproducibility

What happens if we run our code twice?

static_results_1 = simulate_ols()
static_results_2 = simulate_ols()
print(f"Bias (first run): {np.mean(static_results_1)}")
print(f"Bias (second run): {np.mean(static_results_2)}")
Bias (first run): 0.005406286046517284
Bias (second run): 0.0071337326148280525
  • Same qualitative result
  • Different numerical results

Our results are not reproducible

Why No Reproducibility and How to Fix?

Different data drawn by rng in different runs of the function


To get the same data, we need to use the following fact:

Give same seed = get the same sequence of numbers
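A quick check of this fact with NumPy's generator:

import numpy as np

# Two generators created with the same seed produce identical draws
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(rng_a.normal(size=3))
print(rng_b.normal(size=3))  # same three numbers as above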

Setting the RNG Seed

In practice: easy to provide a seed to our rng

def simulate_ols(
    n_sim: int = 1000,
    n_obs: int = 50,
    beta0: float = 0.0,
    beta1: float = 0.5,
    seed: int = 1,
) -> list[float]:
    """Simulates OLS estimation for a static/exogenous DGP: y = beta0 + beta1*x + u.

    Args:
        n_sim (int): Number of simulations to run. Defaults to 1000.
        n_obs (int): Number of observations per simulation. Defaults to 50.
        beta0 (float): True intercept value. Defaults to 0.
        beta1 (float): True slope coefficient. Defaults to 0.5.
        seed (int): random number generator seed. Defaults to 1.

    Returns:
        list[float]: List of errors of beta1 estimates for each simulation.
    """

    # Create container to store results
    results_ols_errors = []

    # Initialize the RNG
    rng = np.random.default_rng(seed)

    # Core simulation loop
    for _ in range(n_sim):
        # 1. Draw data
        x = rng.normal(size=n_obs)
        u = rng.normal(size=n_obs)
        y = beta0 + beta1 * x + u

        # 2. Run OLS estimation
        W = np.column_stack([np.ones(n_obs), x])
        beta_hat = np.linalg.inv(W.T @ W) @ W.T @ y

        # 3. Compute error
        results_ols_errors.append(beta_hat[1] - beta1)

    return results_ols_errors

Rerunning Simulations

Now rerunning with the same (default) seed gives the same results:

static_results_1 = simulate_ols()
static_results_2 = simulate_ols()
print(f"Bias (first run): {np.mean(static_results_1)}")
print(f"Bias (second run): {np.mean(static_results_2)}")
Bias (first run): -0.00541847302776387
Bias (second run): -0.00541847302776387


Always set seeds explicitly to help reproducibility

Expanding the Simulation

Adding a Dynamic DGP

Expanding the Simulation

Now need to expand simulations to add AR(1) design

Simplest way: just expand existing function

  • Add new DGP option "ar1"
  • Involves changing simulate_ols to have a new argument for dgp_type
    • Old behavior: dgp_type="static"
    • New behavior: dgp_type="ar1"
  • Also need to handle the observation lost to the lag \(Y_{t-1}\)

Expanded Simulation Function: With AR(1) DGP

def simulate_ols(
    n_sim: int = 1000,
    n_obs: int = 50,
    beta0: float = 0.0,
    beta1: float = 0.5,
    seed: int = 1,
    dgp_type: str = "static",
) -> list[float]:
    """Simulates OLS estimation for static or AR(1) DGP.
    Args:
        n_sim (int): Number of simulations to run. Defaults to 1000.
        n_obs (int): Number of observations per simulation. Defaults to 50.
        beta0 (float): True intercept value. Defaults to 0.
        beta1 (float): True slope coefficient. Defaults to 0.5.
        seed (int): random number generator seed. Defaults to 1.
        dgp_type (str): Type of DGP: "static" or "ar1". Defaults to "static".

    Returns:
        list[float]: List of errors of beta1 estimates for each simulation.
    """

    # Create container to store results
    results_ols_errors = []

    # Initialize the RNG
    rng = np.random.default_rng(seed)

    # Core simulation loop
    for _ in range(n_sim):
        # 1. Draw data
        if dgp_type == "static":
            x = rng.normal(size=n_obs)
            u = rng.normal(size=n_obs)
            y = beta0 + beta1 * x + u
        elif dgp_type == "ar1":
            y = np.zeros(n_obs + 1)
            u = rng.normal(size=n_obs + 1)
            y[0] = beta0 + u[0]
            for t in range(1, n_obs + 1):
                y[t] = beta0 + beta1 * y[t - 1] + u[t]
            # Use the last n_obs elements of y as the outcome and the first n_obs as the regressor
            x = y[:-1]  # x is y lagged (n_obs elements)
            y = y[1:]  # y[1:] is the dependent variable (n_obs elements)
        else:
            # Handle the case of giving the wrong DGP value
            raise ValueError("Invalid DGP choice")

        # 2. Run OLS estimation
        W = np.column_stack([np.ones(n_obs), x])
        beta_hat = np.linalg.inv(W.T @ W) @ W.T @ y

        # 3. Compute error
        results_ols_errors.append(beta_hat[1] - beta1)

    return results_ols_errors

Running AR(1) Simulation

Can now run our simulations with \(\beta_1 \in \curl{0, 0.5, 0.95}\):

ar_results_no_pers = simulate_ols(beta1=0, dgp_type="ar1")
ar_results_med_pers = simulate_ols(beta1=0.5, dgp_type="ar1")
ar_results_high_pers = simulate_ols(beta1=0.95, dgp_type="ar1")
print(f"Bias (no persistence): {np.mean(ar_results_no_pers):.3f}")
print(f"Bias (medium persistence): {np.mean(ar_results_med_pers):.3f}")
print(f"Bias (high persistence): {np.mean(ar_results_high_pers):.3f}")
Bias (no persistence): -0.028
Bias (medium persistence): -0.059
Bias (high persistence): -0.100

Conclusions:

  • More persistence = larger absolute bias
  • Direction: downward (underestimating)

MC Distribution Under Persistence
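The slide's figure is not reproduced here; a minimal sketch of a comparable comparison, assuming matplotlib is installed and the ar_results_* lists from the previous slide:

import matplotlib.pyplot as plt
import numpy as np

# Overlay the Monte Carlo distributions of the estimation error for each persistence level
for errors, label in [
    (ar_results_no_pers, "beta1 = 0"),
    (ar_results_med_pers, "beta1 = 0.5"),
    (ar_results_high_pers, "beta1 = 0.95"),
]:
    plt.hist(np.array(errors), bins=50, density=True, histtype="step", label=label)
plt.xlabel("Estimation error of beta1")
plt.legend()
plt.show()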

Effect of Sample Size

To check effect of sample size, use different values for n_obs:

ar_results_no_pers = simulate_ols(beta1=0, n_obs=200, dgp_type="ar1")
ar_results_med_pers = simulate_ols(beta1=0.5, n_obs=200, dgp_type="ar1")
ar_results_high_pers = simulate_ols(beta1=0.95, n_obs=200, dgp_type="ar1")
print(f"Bias (no persistence): {np.mean(ar_results_no_pers):.3f}")
print(f"Bias (medium persistence): {np.mean(ar_results_med_pers):.3f}")
print(f"Bias (high persistence): {np.mean(ar_results_high_pers):.3f}")
Bias (no persistence): -0.008
Bias (medium persistence): -0.017
Bias (high persistence): -0.023

Conclusions:

Bias decays as sample size increases

Limitations of Functional Approach

What’s Good About The Functional Approach

The monolithic function approach has a few pluses in our small simulation:

  • Was easy to create and design: just put all the components in one function in the same order
  • Everything is in one place: easy to look at the whole code


These advantages \(\Rightarrow\) why writing a simple function is usually a good starting place for new simulations

Problem: How To Expand Simulations?

By now: a working base simulation


But what if we want to expand our simulations?

  • Try more DGPs (e.g. heavier tails for innovations, different \(X\))
  • More sample sizes
  • Different models (not necessarily \(Y_t = \beta_0 + \beta_1 X_t + U_t\))

Limitations of Current Approach

We need a better approach to organizing simulations

  • Basic approach is fine if you just want to run one thing
  • But difficult if you want to expand more:
    • Adding new DGPs would mean expanding the simulation function even more
    • We would have to remember to run all the different cases ourselves (see the sketch below)

Becomes difficult to scale and maintain
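To see how this plays out, here is a hedged sketch of the manual orchestration the current design already requires (the scenario list is illustrative; every case must be written out and run by hand):

import numpy as np

# Every combination of DGP, coefficient, and sample size must be listed explicitly
scenarios = [
    {"dgp_type": "static", "beta1": 0.5, "n_obs": 50},
    {"dgp_type": "ar1", "beta1": 0.0, "n_obs": 50},
    {"dgp_type": "ar1", "beta1": 0.5, "n_obs": 50},
    {"dgp_type": "ar1", "beta1": 0.95, "n_obs": 50},
    {"dgp_type": "ar1", "beta1": 0.95, "n_obs": 200},
    # ... and one more entry for every new DGP, coefficient, or sample size
]
for scenario in scenarios:
    errors = simulate_ols(**scenario)
    print(scenario, f"bias: {np.mean(errors):.3f}")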

Recap and Conclusions

Recap


In this lecture we

  • Saw our first simulation
  • Talked about structuring simulation code in a single function + advantages and limitations of this approach
  • Discussed setting RNG seeds for reproducibility

Next Questions


  • How to use a more modular and loosely coupled design for larger simulations?
  • How to run many simulation scenarios at the same time?
  • How to apply these approaches to different statistical scenarios?

References

Bao, Yong, and Aman Ullah. 2007. “The Second-Order Bias and Mean Squared Error of Estimators in Time-series Models.” Journal of Econometrics 140 (2): 650–69. https://doi.org/10.1016/j.jeconom.2006.07.007.
Hansen, Bruce. 2022. Econometrics. Princeton, NJ: Princeton University Press.
Hillard, Dane. 2020. Practices of the Python Pro. Shelter Island, NY: Manning Publications Co.
Lutz, Mark. 2025. Learning Python: Powerful Object-Oriented Programming. Sixth edition. Santa Rosa, CA: O’Reilly.