Antoine Chambaz (MAP5, Université Paris Descartes, Paris)
Title: Targeted Machine Learning: how we can use machine learning for causal inference
Abstract: Coined by Mark van der Laan and Dan Rubin in 2006, targeted learning is a general approach to learning from data that reconciles machine learning and statistical inference. On the one hand, "machine learning" refers to the estimation of infinite-dimensional features of the law of the data, P, for instance a regression function. Machine learning algorithms are versatile and produce (possibly highly) data-adaptive estimators. Driven by the need to make accurate predictions, they pay little attention to assessing the uncertainty of those predictions. On the other hand, "statistical inference" refers to the estimation of finite-dimensional parameters of P, for instance a measure of association with a causal interpretation. It focuses on the construction of confidence regions or the development of hypothesis tests. Emphasis is placed on robustness (guaranteeing convergence to the truth under only mild and reasonable assumptions on P), efficiency (drawing as much information from the data as possible), and the control of asymptotic levels or type I error rates.
Targeted learning has been applied and studied in a great variety of contexts. Its analysis is framed within the theory of inference in semiparametric models.
We will illustrate how targeted learning unfolds in examples from causal analysis.