8 Interlude: Characteristic Functions and Deconvolution
To discuss the identification of the full distribution of the coefficients \(\bbeta_i\) in model (2.7), we need to review two key concepts: characteristic functions and the deconvolution approach. This brief section can be freely skipped if you are familiar with both.
8.1 Characteristic Functions
If \(\bV\) is a random \(T\)-vector, then the characteristic function \(\varphi_{\bV}: \R^T\to \C\) of \(\bV\) is defined as \[ \varphi_{\bV}(\bs) = \E[\exp(i\bs'\bV)]. \] See Durrett (2019) (or your favorite probability textbook) for general properties of characteristic functions.
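As a concrete example (a standard fact, not specific to this chapter): if the components of \(\bV\) are independent standard normal random variables, then \[ \varphi_{\bV}(\bs) = \E[\exp(i\bs'\bV)] = \prod_{t=1}^{T}\E[\exp(i s_t V_t)] = \prod_{t=1}^{T} e^{-s_t^2/2} = \exp\left(-\tfrac{1}{2}\bs'\bs\right), \] where the second equality uses independence across components and the third uses the scalar standard normal characteristic function. In general, \(\varphi_{\bV}(\boldsymbol{0}) = 1\) and \(|\varphi_{\bV}(\bs)|\leq 1\) for all \(\bs\).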
For our purposes, we will need the following three key properties:
The characteristic function uniquely determines the distribution.
Let \(\bV, \bU\) be two independent random vectors. Then the characteristic function of their sum \(\bV+\bU\) equals the product of the characteristic functions of \(\bV\) and \(\bU\): \[ \begin{aligned} \varphi_{\bV+\bU}(\bs) & = \E\left[e^{i\bs'(\bV+\bU)}\right] = \E\left[e^{i\bs'\bV}e^{i\bs'\bU} \right]\\ & = \E\left[e^{i\bs'\bV}\right]\E\left[e^{i\bs'\bU}\right]\\ & = \varphi_{\bV}(\bs) \varphi_{\bU}(\bs), \end{aligned} \tag{8.1}\] where the second line uses independence of \(\bV\) and \(\bU\).
Let \(\bbeta\) be a random \(p\)-vector and \(\bX\) a nonrandom \(T\times p\) matrix. Then \[ \begin{aligned} \varphi_{\bX\bbeta}(\bs) & = \E\left[\exp(i\bs'(\bX\bbeta)) \right] \\ & = \E\left[\exp(i(\bX'\bs)'\bbeta) \right]\\ & = \varphi_{\bbeta}(\bX'\bs). \end{aligned} \tag{8.2}\] A numerical check of properties (8.1) and (8.2) is sketched below.
Conditional characteristic functions may be defined analogously using conditional expectations in place of unconditional ones.
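To make properties (8.1) and (8.2) concrete, here is a minimal Monte Carlo sketch (assuming only NumPy; the normal and Laplace distributions, the sample size, and the evaluation point are illustrative choices, not part of the text). It approximates characteristic functions by sample averages of \(\exp(i\bs'\bV)\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, p = 200_000, 3, 2

def ecf(draws, s):
    """Empirical characteristic function of the rows of `draws`, evaluated at the point s."""
    return np.mean(np.exp(1j * draws @ s))

# Property (8.1): the CF of a sum of independent vectors is the product of their CFs.
V = rng.normal(size=(n, T))            # independent draws of V
U = rng.laplace(size=(n, T))           # independent draws of U
s = np.array([0.4, -1.0, 0.7])
print(ecf(V + U, s))                   # approximately equal to the product below
print(ecf(V, s) * ecf(U, s))

# Property (8.2): for a fixed matrix X, the CF of X beta at s equals the CF of beta at X's.
X = rng.normal(size=(T, p))            # treated as a fixed (nonrandom) matrix
beta = rng.normal(size=(n, p))
print(ecf(beta @ X.T, s))              # empirical CF of X beta at s
print(ecf(beta, X.T @ s))              # empirical CF of beta at X's
```

Property (8.1) holds only approximately in the sample, since it relies on independence; property (8.2) is an algebraic identity, so the last two numbers agree up to floating-point error even for the empirical characteristic function.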
8.2 Deconvolution
Property (8.1) is particularly useful for statistical applications. It forms the basis of an estimation and identification approach known as deconvolution. This approach will help us identify the distribution of the coefficients \(\bbeta_i\) in the next section.
At heart, deconvolution is simple. Suppose that we observe a random vector \(\bY\) that is generated as the sum of two independent random vectors \(\bV\) and \(\bU\). The distribution of \(\bU\) is known, while the distribution of \(\bV\) is the object of interest.
By property (8.1), the characteristic function of \(\bY\) satisfies \[ \varphi_{\bY}(\bs) = \varphi_{\bV}(\bs) \varphi_{\bU}(\bs). \] If \(\varphi_{\bU}(\bs)\neq 0\), we can divide and rearrange to obtain \[ \varphi_{\bV}(\bs) = \dfrac{\varphi_{\bY}(\bs) }{\varphi_{\bU}(\bs)}. \] The distribution of \(\bY\) is identified because \(\bY\) is observed, and the distribution of \(\bU\) is known by assumption; hence \(\varphi_{\bY}(\bs)\) and \(\varphi_{\bU}(\bs)\) are identified. It follows that the full function \(\varphi_{\bV}(\cdot)\) is also identified provided \(\varphi_{\bU}(\bs)\neq 0\) for all \(\bs\) (or at least \(\varphi_{\bU}(\bs)= 0\) for “not too many” \(\bs\), see Evdokimov and White (2012)). The distribution of \(\bV\) is then identified since characteristic functions uniquely determine distributions.
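The following sketch illustrates the deconvolution step numerically (assuming only NumPy; the exponential target, the Gaussian noise with known scale, the grid of evaluation points, and the sample size are all illustrative assumptions, not part of the text). The empirical characteristic function of \(\bY\) is divided by the known characteristic function of \(\bU\) and compared with the true characteristic function of \(\bV\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma_U = 500_000, 0.5

V = rng.exponential(scale=1.0, size=n)     # target distribution (treated as unknown)
U = rng.normal(scale=sigma_U, size=n)      # noise with known distribution
Y = V + U                                  # only Y is observed in practice

s_grid = np.linspace(-3.0, 3.0, 7)
phi_Y = np.array([np.exp(1j * s * Y).mean() for s in s_grid])  # empirical CF of Y
phi_U = np.exp(-0.5 * (sigma_U * s_grid) ** 2)                 # known N(0, sigma_U^2) CF, never zero
phi_V_hat = phi_Y / phi_U                                      # deconvolution step

phi_V_true = 1.0 / (1.0 - 1j * s_grid)     # CF of the Exponential(1) distribution
print(np.abs(phi_V_hat - phi_V_true).max())  # small for large n
```

The Gaussian characteristic function never vanishes, so the division is well defined at every point of the grid; where \(\varphi_{\bU}\) is close to zero, the division amplifies sampling error, which is why practical deconvolution estimators regularize this step.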
This identification strategy is called deconvolution. The name stems from the fact that the distribution of \(\bY\) is the convolution of the distributions of \(\bV\) and \(\bU\); extracting the distribution of \(\bV\) from the laws of \(\bY\) and \(\bU\) may be viewed as undoing that convolution.
Observe that the argument is nonparametric: it imposes no parametric assumptions on the distributions involved.
Next Section
In the next section, we will discuss identification of the distribution of the coefficients \(\bbeta_i\) in model (2.7) using the tools reviewed here.