Data Types

A Brief Classification

Vladislav Morozov

Introduction

Lecture Info

Learning Outcomes

This lecture offers a brief classification of data types


By the end, you should be able to

  • State the main ways we classify data
  • Describe the four main kinds of data in terms of observations

Textbook References

  • Section 1.3 in Wooldridge (2020)
  • Section 1.5 in Hansen (2022)

Classifying Data

How Do We Classify Data?

Main kinds of data classification:

  • Based on number of units and observations per unit
  • Based on sampling assumptions
  • Based on nature of data


All these differences affect identification, estimation, and inference

Kinds of Data by Observations


  • Cross-sectional data
  • Time series
  • Panel data
  • Repeated cross-sectional data
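
As a rough illustration of how the four kinds differ, the sketch below labels a dataset using only its (unit, period) pairs. The function name `classify_structure` and the identifiers in the examples are hypothetical, and the sketch assumes the second dimension is time, which need not hold for every panel.

```python
from collections import Counter


def classify_structure(records):
    """Label a dataset from its (unit_id, period) pairs, one pair per observation.

    A simplified sketch: it only looks at how many units there are,
    how often each unit is observed, and how many periods appear.
    """
    records = list(records)
    if not records:
        raise ValueError("need at least one observation")

    obs_per_unit = Counter(unit for unit, _ in records)
    periods = {period for _, period in records}

    if len(obs_per_unit) == 1:
        # A single unit: a time series if it is followed over several periods
        return "time series" if len(periods) > 1 else "single observation"
    if max(obs_per_unit.values()) > 1:
        # Several units, at least some of them observed more than once
        return "panel"
    # Several units, each observed exactly once
    return "cross-section" if len(periods) == 1 else "repeated cross-section"


# Illustrative identifiers only; no real data involved
print(classify_structure([("ES", 2024), ("FR", 2024), ("DE", 2024)]))
print(classify_structure([("ES", t) for t in range(1998, 2025)]))
print(classify_structure([(c, t) for c in ("ES", "FR") for t in (2023, 2024)]))
print(classify_structure([("hh1", 2014), ("hh2", 2014), ("hh3", 2016)]))
```

The four calls print `cross-section`, `time series`, `panel`, and `repeated cross-section`, in that order.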

Sampling Assumptions


  • From an infinitely large population:
    • Independent samples
    • Dependent samples (network or spatial dependence)
  • From a finite population

Data Nature

  • Numerical vectors
  • Images
  • Text
  • Functional data


This class deals mostly with numerical data, but the other kinds of data are also hugely important!

Types of Data by Observation

Cross-Sectional Data

Cross-sectional data is the simplest type of data:

  • One observation per unit
  • Time of collection not important


  • Example: unemployment in different European countries in the last quarter of 2024
  • Often indexed by \(i\), as in \((Y_i, \mathbf{X}_i)\)

Visual Example: Cross-Sectional Data

Time Series Data

Data collected over time:


One unit with multiple observations


  • Example: unemployment in Spain over 1998-2024
  • Indexed by time \(t\): \((Y_t, \mathbf{X}_t)\)

Visual Example of Time Series

Panel Data

Multidimensional kind of data:


Multiple units with multiple observations


  • Sometimes, but not always, a combination of cross-sections and time series
  • Can be two- \((Y_{it})\) or higher-dimensional \((Y_{ijt})\)

Panel Data Examples


  • Two-dimensional:
    • Unit \(\times\) time: unemployment in European country \(i\) over years \(t\)
    • Various groups: grades of student \(i\) under teacher \(j\)
  • Higher-dimensional: price of beer \(i\) in store \(j\) on day \(t\)
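
One possible way to picture the indexing is to store each observation under a tuple of its indices. The sketch below is only an illustration: all values are made up, and the names `y_it`, `y_ijt`, and `slice_panel` are hypothetical.

```python
# Made-up values, used purely to show the indexing

# Two-dimensional panel Y_it: key = (unit i, period t)
y_it = {
    ("A", 2023): 1.0,
    ("A", 2024): 1.5,
    ("B", 2023): 2.0,
    ("B", 2024): 2.5,
}

# Three-dimensional panel Y_ijt: key = (beer i, store j, day t)
y_ijt = {
    ("lager", "store_1", 1): 3.0,
    ("lager", "store_2", 1): 3.2,
    ("stout", "store_1", 1): 4.0,
    ("stout", "store_1", 2): 4.1,
}


def slice_panel(panel, position, value):
    """Return all entries whose index equals `value` at `position`."""
    return {key: y for key, y in panel.items() if key[position] == value}


# Fixing t in Y_it leaves a cross-section; fixing i leaves a time series
cross_section_2024 = slice_panel(y_it, 1, 2024)
series_for_a = slice_panel(y_it, 0, "A")

# Fixing the store j in Y_ijt leaves a beer-by-day panel for that store
store_1_prices = slice_panel(y_ijt, 1, "store_1")
```

Slicing a unit-by-time panel along one index recovers the simpler structures, which is the sense in which such a panel combines cross-sections and time series.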

Visual Example: Panel Data

Repeated Cross-Sectional Data

  • One observation per unit
  • Time of collection important


Example

  • Surveying different households in different years
  • Changes over time cannot be ignored in many contexts
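
A minimal sketch of what this structure looks like in practice, assuming made-up survey records of the form (household, year, spending). Every number here is illustrative, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (household_id, year, spending). Illustrative values only.
records = [
    ("h1", 2014, 100.0),
    ("h2", 2014, 120.0),
    ("h3", 2016, 130.0),
    ("h4", 2016, 150.0),
]

# Each household appears exactly once, so within-household changes
# cannot be computed from these data
household_ids = [hh for hh, _, _ in records]
assert len(household_ids) == len(set(household_ids))

# The collection year still matters: group by year before summarizing
by_year = defaultdict(list)
for _, year, spending in records:
    by_year[year].append(spending)

yearly_mean = {year: mean(values) for year, values in sorted(by_year.items())}
pooled_mean = mean(spending for _, _, spending in records)

print(yearly_mean)   # one average per survey year
print(pooled_mean)   # a single average that ignores the year
```

In this toy example the pooled average hides the difference between the two survey years, which is one reason the time of collection cannot simply be dropped.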

Data with Panel and Repeated Cross-Sectional Features


Rotating panels have features of both panel and repeated cross-sectional data

Example of Rotating Panel

Example: US consumer expenditure survey (quarterly)

  • Each household is asked about its spending for four consecutive quarters, then dropped
  • Each quarter, 2500 new households are recruited into the survey
    • 2014 and 2016: totally different households
    • 2014Q1 and 2014Q2: 7500 households in common, plus 2500 in each quarter that do not appear in the other
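
The overlap counts can be checked with a small sketch of the rotation scheme. It assumes an idealized design: the survey has been running for at least four quarters, exactly 2500 households enter each quarter, and none drops out early. The household labels and the function `sample_in_quarter` are hypothetical.

```python
COHORT_SIZE = 2500   # new households recruited each quarter
QUARTERS_IN = 4      # consecutive quarters each household is interviewed


def sample_in_quarter(q):
    """Households interviewed in quarter q (quarters numbered 0, 1, 2, ...).

    A household is labeled (entry_quarter, k): the k-th household recruited
    in entry_quarter. The cohorts that entered in quarters q-3, ..., q are
    all in the sample at q.
    """
    return {
        (entry, k)
        for entry in range(q - QUARTERS_IN + 1, q + 1)
        for k in range(COHORT_SIZE)
    }


q1 = sample_in_quarter(0)       # stands in for 2014Q1
q2 = sample_in_quarter(1)       # stands in for 2014Q2
q_2016 = sample_in_quarter(8)   # stands in for 2016Q1, eight quarters later

print(len(q1))            # 10000 households interviewed in a quarter
print(len(q1 & q2))       # 7500 in both consecutive quarters
print(len(q1 - q2))       # 2500 only in the earlier quarter
print(len(q2 - q1))       # 2500 only in the later quarter
print(len(q1 & q_2016))   # 0 in common two years apart
```

Consecutive quarters behave like a panel for the 7500 shared households, while samples far enough apart share no households and behave like repeated cross-sections.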

Recap and Conclusions

Recap

In this lecture we

  1. Discussed approaches to classifying data
  2. Discussed the four main kinds of data in terms of observation structure

References

Hansen, Bruce. 2022. Econometrics. Princeton, NJ: Princeton University Press.
Wooldridge, Jeffrey M. 2020. Introductory Econometrics: A Modern Approach. Seventh edition. Boston, MA: Cengage.