Introduction to Prediction. Learning Scenarios
This lecture is an introduction to prediction.
By the end, you should be able to
Statistics can be split into two blocks based on the overall goal: prediction and causal inference.
Prediction is studied by statistical learning (SL) and machine learning (ML).
Personal definition (Definition 1): both fields develop algorithms that can learn from data and generalize to unseen data.
Key goal of prediction — predicting well
Other goals:
Key goal in causal inference — correct identification
Causal settings: there is some true causal model, and the aim is to learn some of its features using identification arguments.
SL/ML: only weak reference to the underlying “true” model. Generally no identification work
For SL/ML the key metric is how well you predict on unseen data, that is, generalization or risk (next lecture).
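A minimal sketch of what predicting well on unseen data can look like in practice: fit on one part of the sample and measure the error on a held-out part. The simulated linear data, the 75/25 split, and the helper names `fit_line` and `mse` are assumptions made for illustration; the formal treatment of risk is left to the next lecture.

```python
import random

random.seed(0)

# Simulated data: y = 2x + 1 + noise (this data-generating process is an assumption).
n = 200
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 0.5) for x in xs]

# Hold out the last 25% of observations to play the role of unseen data.
n_train = int(0.75 * n)
x_train, y_train = xs[:n_train], ys[:n_train]
x_test, y_test = xs[n_train:], ys[n_train:]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x on the given sample."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


def mse(a, b, x, y):
    """Mean squared prediction error of the line a + b*x on a sample."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


# Fit only on the training part, then evaluate on both parts.
a_hat, b_hat = fit_line(x_train, y_train)
train_mse = mse(a_hat, b_hat, x_train, y_train)
test_mse = mse(a_hat, b_hat, x_test, y_test)
print(f"train MSE: {train_mse:.3f}, held-out MSE: {test_mse:.3f}")
```

The held-out error is the number that speaks to generalization; the training error is computed on the same data used for fitting and tends to be optimistic.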
Are the two fields totally disjoint?
No:
See Chernozhukov et al. (2024)
Hastie, Tibshirani, and Friedman (2009)
Shalev-Shwartz and Ben-David (2014)
Mohri, Rostamizadeh, and Talwalkar (2018)
Books on “core” machine learning methods in practice, mainly with scikit-learn
Géron (2023)
James et al. (2023)
Possibly skip TensorFlow in Géron (2023) in favor of PyTorch
SL/ML not monolithic: there are different learning scenarios based on
What kind of problems can you solve?
Domain of Application | Examples |
---|---|
Forecasting | Estimating the GDP in the current quarter |
Causal inference | Pre-estimating “first stage”/nuisance parameters |
Text or document classification | Assigning topics, determining whether contents are inappropriate, spam detection |
NLP | Part-of-speech tagging, named-entity recognition, context-free parsing, text summarization, chatbots |
Speech processing | Speech recognition, speech synthesis, speaker verification and identification |
Computer vision | Object recognition and identification, face detection, content-based image retrieval, optical character recognition, image segmentation |
Anomaly detection | Detecting credit card fraud |
Clustering | Segmenting clients into blocks and offering different marketing strategies |
Data visualization | Using dimensionality reduction |
Recommender systems | Suggesting next product to buy given purchase history |
All of these are also done by people with econ backgrounds.
Supervised settings: some observed output \(Y\):
Task | Type of Output | Examples |
---|---|---|
Classification | Categorical | Document classification |
Regression | Continuous | Nowcasting the GDP |
Ranking | Ordinal | Selecting the order of results in a search |
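
To make the difference between a categorical and a continuous \(Y\) concrete, here is a small sketch in which the same nearest-neighbour idea is used for both classification and regression. The toy data and the function names `k_nearest`, `knn_classify`, and `knn_regress` are hypothetical; k-nearest neighbours is used only because it is short to write, and ranking is not covered.

```python
from collections import Counter


def k_nearest(x_train, x_new, k):
    """Indices of the k training points closest to x_new (one-dimensional inputs)."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))
    return order[:k]


def knn_classify(x_train, labels, x_new, k=3):
    """Classification: predict the most common label among the k neighbours."""
    votes = Counter(labels[i] for i in k_nearest(x_train, x_new, k))
    return votes.most_common(1)[0][0]


def knn_regress(x_train, values, x_new, k=3):
    """Regression: predict the average output among the k neighbours."""
    idx = k_nearest(x_train, x_new, k)
    return sum(values[i] for i in idx) / len(idx)


# Toy data, made up for illustration.
x = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
y_cat = ["low", "low", "low", "high", "high", "high"]  # categorical Y
y_num = [0.1, 0.9, 2.1, 8.2, 8.9, 10.0]                # continuous Y

label_pred = knn_classify(x, y_cat, 1.5)
value_pred = knn_regress(x, y_num, 9.2)
print(label_pred, round(value_pred, 2))
```

Only the aggregation step differs between the two tasks: a majority vote for a categorical output, an average for a continuous one.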
Unsupervised settings: no obvious observed \(Y\):
Task | Type of Output | Examples |
---|---|---|
Clustering | Categorical | Identifying communities in a large social network |
Dimensionality reduction/manifold learning | Continuous | Preprocessing digital images in computer vision tasks |
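
As an illustration of the clustering row, the sketch below groups unlabeled points using a plain implementation of Lloyd's algorithm for k-means. The simulated data, the deterministic choice of initial centres, and the names `sq_dist`, `nearest`, and `kmeans` are assumptions for this example; dimensionality reduction is not shown.

```python
import random

random.seed(1)

# Two well-separated groups of 2-D points (simulated; no labels are used).
points = [(random.gauss(0.0, 0.5), random.gauss(0.0, 0.5)) for _ in range(50)]
points += [(random.gauss(5.0, 0.5), random.gauss(5.0, 0.5)) for _ in range(50)]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest(p, centers):
    """Index of the centre closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def kmeans(data, centers, n_iter=20):
    """Lloyd's algorithm: alternate assignment and centre-update steps."""
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        assign = [nearest(p, centers) for p in data]
        # Update step: each centre moves to the mean of its assigned points.
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, a in zip(data, assign) if a == j]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(old)  # leave an empty cluster's centre unchanged
        centers = updated
    # Return the centres together with assignments consistent with them.
    return centers, [nearest(p, centers) for p in data]


# Deterministic start for reproducibility: one initial centre from each end of the sample.
centers, clusters = kmeans(points, [points[0], points[-1]])
print(centers)
```

The algorithm never sees which group a point came from; the cluster labels are an output it constructs, which is what distinguishes this setting from classification.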
More concepts: self-supervised, active learning, reinforcement learning, etc.
Another axis: when new observations arrive, how should the model be updated?
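One possible answer is online learning, where the model is adjusted with each new observation instead of being refitted on the full sample. The sketch below uses a stochastic gradient step for a linear model with squared loss; the simulated data stream, the learning rate, and the function name `sgd_update` are assumptions for illustration.

```python
import random

random.seed(2)


def sgd_update(w, b, x, y, lr=0.05):
    """One online step: adjust (w, b) using only the newly arrived pair (x, y)."""
    err = (w * x + b) - y  # prediction error on the new observation
    # Gradient step for the squared loss 0.5 * err**2.
    return w - lr * err * x, b - lr * err


# Observations arrive one at a time from y = 3x - 1 + noise (assumed process).
w, b = 0.0, 0.0
for _ in range(5000):
    x = random.uniform(-1.0, 1.0)
    y = 3.0 * x - 1.0 + random.gauss(0.0, 0.1)
    w, b = sgd_update(w, b, x, y)

print(round(w, 2), round(b, 2))
```

Each update touches a single observation and keeps no history, in contrast to batch learning, where the model is refitted on all data collected so far. In scikit-learn, estimators such as `SGDRegressor` expose a `partial_fit` method that supports this pattern.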
In this lecture we: