11  Introduction to Nonparametric Models with Unobserved Heterogeneity

Summary and Learning Outcomes

This section introduces nonparametric models with unobserved heterogeneity and lays out the structure and goals for this block.

By the end of this section, you should be able to:

  • Explain limitations of linear models.
  • Write down generic nonparametric models with unobserved heterogeneity.
  • Understand identification challenges and the assumptions commonly used to address them.

11.1 Towards Nonparametric Models

11.1.1 Motivation

We began these notes by studying linear models with heterogeneous coefficients (2.7), a familiar and flexible starting point. As discussed in section 2, such models are widely used and arise naturally in empirical work.

But linearity is not innocent. Unless all covariates are binary, the assumption often lacks theoretical support and may be at odds with the data. Many economic settings suggest richer structures: preferences with satiation, production with non-constant returns to scale, or outcomes bounded by construction. Furthermore, differences between individuals may not be compressible into a finite-dimensional vector of heterogeneous coefficients. In such cases, linear models may be severely misspecified and lead to incorrect conclusions.

This motivates a shift. In this block, we move beyond linearity and consider nonparametric models with unobserved heterogeneity — a class that allows for far greater flexibility in how outcomes respond to both observed and unobserved variation. These models present new challenges but also offer a more powerful framework for accounting for unobserved differences.

11.1.2 Nonparametric Models

Nonparametric models address functional form concerns directly. Rather than imposing a specific shape on the relationship between \(y\) and the covariates \(\bx\), we assume only that this relationship is governed by an unknown function \(\phi\) of observed and unobserved variables. This leads to the following general setup:

  • In cross-sectional settings: \[ Y_i = \phi(\bX_i, A_i), \quad {}_{i=1,\dots, N}, \tag{11.1}\] where \(\bX_{i}\) includes the observed variables and \(A_i\) includes the unobserved components.

  • In panel data settings: \[ Y_{it} = \phi(\bX_{it}, A_i, U_{it}), \quad {}_{i=1,\dots, N}^{t= 1, \dots, T}, \tag{11.2}\] where both \(A_i\) and \(U_{it}\) are not observed.

Models (11.1) and (11.2) parallel and generalize models (2.3) and (2.4).

In both cases the nature of \((A_i, U_{it})\) is not restricted a priori. These unobserved components may include both finite-dimensional vectors (such as unobserved variables or coefficients) and infinite-dimensional objects (such as utility functions). In such a fully unrestricted setting, we can equivalently represent (11.1) and (11.2) as \[ Y_i = \phi_i(X_i), \quad \quad Y_{it} = \phi_{it}(X_{it}), \] respectively.
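To fix ideas, the following minimal simulation sketches model (11.1) for a hypothetical structural function \(\phi\) with two-dimensional unobserved heterogeneity; the particular \(\phi\) and the distributions below are illustrative assumptions, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# Hypothetical structural function: nonlinear in x, with
# two-dimensional unobserved heterogeneity A_i = (a1_i, a2_i).
def phi(x, a1, a2):
    return a1 * np.log1p(x) + a2

x = rng.exponential(size=N)          # observed covariate X_i
a1 = rng.normal(1.0, 0.5, size=N)    # unobserved slope-type component
a2 = rng.normal(0.0, 1.0, size=N)    # unobserved level-type component
y = phi(x, a1, a2)                   # Y_i = phi(X_i, A_i), model (11.1)

# The equivalent "individual function" representation Y_i = phi_i(X_i):
phi_i = lambda i, x_val: phi(x_val, a1[i], a2[i])
assert np.isclose(y[0], phi_i(0, x[0]))
```

Note that the same draw of \(x\) maps to different outcomes across individuals, which is exactly what the unrestricted representation \(Y_i = \phi_i(X_i)\) expresses.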

11.1.3 Object of Interest

As in the linear case, possible objects of interest include:

  • The full structural function \(\phi(\cdot, \cdot)\) or \(\phi(\cdot, \cdot, \cdot)\). This function fully describes the relationship between \(Y\) and \(\bX\) for all individuals. This corresponds to the problem of identifying individual treatment effects.
  • Some distributional features of “treatment effects” — changes in outcomes due to variation in \(\bX_{it}\), conditional on unobserved heterogeneity. In the context of model (11.2), these effects are given by \[ \begin{aligned} & \phi(\bx_2, A_i, U_{it}) - \phi(\bx_1, A_i, U_{it}),\\ & \partial_{\bx} \phi(\bx_0, A_i, U_{it}), \end{aligned} \tag{11.3}\] where \(\bx_0, \bx_1, \bx_2\) are some possible values for \(\bX_{it}\), and the marginal effect is considered if \(\phi\) is suitably differentiable in \(\bx\). The distributional features of interest may include average effects, variances, higher-order moments, or the full distribution.
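The two objects in (11.3) can be computed explicitly once \(\phi\) is known. The sketch below does so for a hypothetical \(\phi(x, a, u) = \tanh(ax) + u\) with scalar heterogeneity; since \(U_{it}\) enters additively in this particular example, it cancels from both effects, and the marginal effect has the closed form \(a\,(1 - \tanh(ax_0)^2)\).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5000

# Hypothetical smooth structural function with scalar A_i and U_it.
def phi(x, a, u):
    return np.tanh(a * x) + u

a = rng.normal(1.0, 0.3, size=N)
u = rng.normal(0.0, 1.0, size=N)

x1, x2, x0 = 0.5, 1.5, 1.0

# Individual treatment effects of moving X from x1 to x2 (first line of 11.3);
# U cancels here only because this phi happens to be additive in u.
te = phi(x2, a, u) - phi(x1, a, u)

# Individual marginal effects at x0 (second line of 11.3), in closed form:
me = a * (1 - np.tanh(a * x0) ** 2)   # d/dx tanh(a x) = a (1 - tanh(a x)^2)

# Distributional features of interest: averages, variances, quantiles, ...
print(te.mean(), te.var(), np.quantile(me, [0.1, 0.5, 0.9]))
```

The identification problem of this block is precisely that \(a\) and \(u\) are not observed, so these individual-level computations are infeasible with data alone.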

11.1.4 Common Issue

Unfortunately, models (11.1) and (11.2) require further assumptions to be useful, as discussed in section 1. Without assumptions, we cannot hope to identify counterfactual objects of interest, not even average effects.

In a strict sense, one may point out that (11.1) and (11.2) are not even models in the sense of offering testable predictions or supporting counterfactual analysis. (11.1) and (11.2) are just statements that \(\bX\) and \(Y\) are related through some function that may differ with \(i\) and \(t\). Such a statement is vacuously true. For example, without further assumptions one may take \(\phi(x, a) = a\) and \(A_i = Y_i\) in (11.1).
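The degenerate construction above can be made concrete: taking \(\phi(x, a) = a\) and \(A_i = Y_i\) "fits" any data set perfectly, illustrating that (11.1) without further assumptions has no empirical content.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=10)
y = rng.normal(size=10)          # arbitrary outcome data

# The vacuous "model": phi(x, a) = a with A_i = Y_i.
phi = lambda x, a: a
a = y.copy()

assert np.array_equal(phi(x, a), y)   # (11.1) holds for any (x, y) whatsoever
```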

Typically, such assumptions fall into two categories:

  1. Assumptions on the joint distribution of the observed and the unobserved components, including on the nature of \((A_i, U_{it})\).
  2. Assumptions on how unobserved components enter the equation.

11.2 Models of This Block

To make progress, we focus in these notes on two flexible but tractable special cases of the general nonparametric panel model (11.2). In both cases, we will assume that the outcome \(Y_{it}\) is continuous and that \(T=2\).

First Model

We will begin with the following model: \[ Y_{it} = \phi(X_{it}, A_i, U_{it}), \quad {}_{i=1, \dots, N}^{t=1, 2}, \tag{11.4}\] where for simplicity we assume that \(X_{it}\) is scalar. The variable \(X_{it}\) is assumed to be continuously distributed, and \(\phi\) is continuous in \(X_{it}\) for all values of \((A_i, U_{it})\).

We consider a version of (11.4) that is very general in terms of unobserved variables \((A_{i}, U_{it})\). In particular, we do not restrict

  • The dimension and the form of \(A_i\) and \(U_{it}\);
  • How the outcome depends on \((A_{i}, U_{it})\);
  • The dependence structure between \((A_i, U_{it})\) and \(X_{it}\).

At the same time, we impose a stationarity assumption on \(U_{it}\): its distribution is stable over time conditional on observed and unobserved covariates. This stability allows us to isolate changes in \(Y_{it}\) attributable to variation in \(X_{it}\).
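One immediate implication of stationarity can be checked by simulation: for a unit whose covariate does not change between periods, \(Y_{i1}\) and \(Y_{i2}\) have the same conditional distribution, so the average outcome change is zero. The sketch below verifies this for a hypothetical \(\phi\); the functional form and distributions are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000

def phi(x, a, u):
    # hypothetical structural function, nonlinear in all arguments
    return np.exp(a) * x + np.sin(u) * (1 + x ** 2)

a = rng.normal(size=N)
x = rng.uniform(0, 1, size=N)    # stayers: X_i1 = X_i2 = x

# Stationarity: U_i1 and U_i2 may depend on A_i (and X), but are drawn
# from the same conditional distribution in both periods.
u1 = rng.normal(a, 1.0)
u2 = rng.normal(a, 1.0)

y1 = phi(x, a, u1)
y2 = phi(x, a, u2)

# With X fixed and U stationary, the outcome change is mean zero:
print(abs((y2 - y1).mean()))     # close to 0
```

Conversely, a systematic nonzero mean of \(Y_{i2} - Y_{i1}\) among units with changing \(X_{it}\) can then be attributed to the variation in \(X_{it}\), which is the lever behind the stayer-based identification results discussed below.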

Second Model

After reaching the (probable) limits of identification with (11.4), we will consider a different flavor of model (11.2), where the time-varying unobserved component \(U_{it}\) is scalar and affects the outcome \(Y_{it}\) additively: \[ Y_{it} = \phi(X_{it}, A_i) + U_{it}, \quad {}_{i=1, \dots, N}^{t=1, 2} \tag{11.5}\] In contrast to model (11.4), we do not assume that \(U_{it}\) is stationary. Models (11.4) and (11.5) are hence non-nested. We continue to allow \(A_i\) to have unrestricted dimensionality and structure. It may also have a complex dependence structure with \(X_{it}\).
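The additive structure of (11.5) has a simple algebraic consequence worth recording: first differences equal the individual treatment effect plus a \(U\) difference, while the \(A_i\)-specific component \(\phi(\cdot, A_i)\) does not drop out unless \(\phi\) is linear. The sketch below checks this identity for a hypothetical \(\phi\) with vector-valued \(A_i\) and deliberately non-stationary \(U_{it}\).

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000

def phi(x, a):
    # hypothetical structural function, nonlinear in x, vector-valued A_i
    return a[:, 0] * np.sqrt(np.abs(x)) + a[:, 1]

a = rng.normal(size=(N, 2))
x1 = rng.normal(size=N)
x2 = x1 + rng.normal(size=N)       # X_i2 may depend on the past
u1 = rng.normal(size=N)
u2 = rng.normal(size=N) * 2.0      # U_it need not be stationary in (11.5)

y1 = phi(x1, a) + u1               # model (11.5), t = 1
y2 = phi(x2, a) + u2               # model (11.5), t = 2

# First difference = individual treatment effect + U difference:
te = phi(x2, a) - phi(x1, a)
assert np.allclose(y2 - y1, te + (u2 - u1))
```

This decomposition is what makes second moments of the effects (11.3) tractable in model (11.5) once restrictions on the dependence between \(U_{it}\) and future \(X_{it}\) are imposed.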

11.3 Plan for This Block

In this block, we will focus on models (11.4)-(11.5) and consider identification of some distributional features of treatment effects (11.3). Specifically,

  1. Average treatment and marginal effects for model (11.4):
    • Show that identifying average effects is more complicated than considering averages of the outcome directly.
    • Discuss heterogeneity bias, another form of confounding.
    • Show how stationarity assumptions on \(U_{it}\) allow us to identify the average effects for a population of stayers — units with \(X_{i1}=X_{i2}\) — without any further assumptions.
    • Consider two generalizations of the identification result: beyond the population of stayers and allowing some non-stationarity in the structural function.
  2. Variance of treatment and marginal effects in model (11.5): identify the variance of the effects (11.3) by requiring that \(U_{it}\) cannot depend on future values of \(\bX_{it}\).

In the next block, we will also revisit model (11.4) through the lens of quantile regression.

11.4 A Brief Classification of Nonparametric Models

In these notes we primarily focus on the general and powerful models (11.4)-(11.5). However, much work has gone into analyzing other special instances of (11.1) and (11.2). Before moving on to identification of average effects, we offer a brief taxonomy of nonparametric models with unobserved heterogeneity with some essential references. We organize the literature by the types of assumptions made:


Next Section

In the next section, we begin our analysis of average effects in model (11.4) and discuss why identification is more complex than analyzing average outcomes.