
 

Causal Inference – Emerging Areas of Research
and Thoughts for the Road Ahead
 

 Author: Madeline Roberts, MPH, PhD

Causal inference, the task of inferring what would have happened in the past had something been done differently, or what would happen in the future if a current course of action were altered, is one of the central aims of epidemiology. Donald Rubin has written masterfully on the conceptual and mathematical history of causal inference in epidemiology and statistics, beginning in 1925 with Sir Ronald Fisher’s proposal that randomization should be the basis for causal inference. Rubin also notes that the future of causal inference is nearly inextricable from modern computing, including artificial intelligence and machine learning.

This month we wanted to highlight an exciting article in the October 2022 issue of The American Journal of Epidemiology. “The Future of Causal Inference” presents a non-exhaustive, non-ranked list of ten areas of emergent research in causal inference that have been gaining traction in recent years.

It is noteworthy that the majority of these emerging areas of causal inference research are rooted in statistical learning methods. For current and rising epidemiologists, the pursuit of rigorous training and continued education in statistics and data analysis truly cannot be overemphasized. A synopsis of each area of emergent research follows.

1. High-dimensional data present a multitude of opportunities to explore a wider array of causal questions. However, they are subject to the “curse of dimensionality,” which brings statistical challenges and the problem of high-dimensional confounding. Emerging research in this area covers methods that consider both the treatment and the outcome during variable selection, in order to mitigate confounding when a large number of potential confounders are included.

2. Precision medicine, which the authors define as “using available data to determine what treatment is best for an individual and delivering it at the right time,” has benefitted from advances in statistical methodology. Developments in this area involve determining both the optimal monitoring plan and the optimal intervention based on feedback from that monitoring plan. Future aims will involve microrandomized trials and mobile technology to tailor both interventions and their delivery with even greater precision.

3. Causal machine learning aims to predict what would happen if a specific aspect of the world changed, rather than what will occur next in the world’s current state. This necessitates thoughtful study design and model selection before implementation.

4. Enriching randomized experiments with real-world data. While randomized experiments are the gold standard for study design, they can be cost-prohibitive and often include only a subgroup of the target population. One growing area of research, which the authors liken to a form of meta-analysis, focuses on how to combine evidence from both randomized experiments and observational studies. For example, one can take treatment effect point estimates from a randomized controlled trial (RCT) and use observational study evidence to evaluate differences between the participants included in and excluded from the RCT.

5. Algorithmic fairness and social responsibility. In view of evidence demonstrating bias in data sources and the resulting algorithms, new causal inference research emphasizes counterfactual thinking (such as asking whether changing one feature affects model prediction) and sensitivity analyses to evaluate whether bias exists from unmeasured factors.

6. Distributed learning. In view of the considerable computational workload that deep learning can necessitate, distributed learning can refer to distributing that workload across many machines to achieve scalability. It can also refer to a privacy-preserving approach for data held across different systems and populations, in which models are fit without sharing the raw, granular data.

7. Causal discovery involves, under specific assumptions, using statistical methods and computational algorithms to find causal relationships and identify a directed acyclic graph (DAG) from observational data.

8. Interference and spillover. Because the results of policy decisions in one area do not happen in isolation but rather spill over into neighboring areas, a growing area of causal inference research focuses on evaluating the causal effect of this spillover and defining estimands of interest.

9. Transportability concerns whether study results from one specific population can be carried over to another target population. Differences in factors between the two populations or geographic regions necessitate methods that can account for these differences, as well as generalize causal effects in complex settings.

10. Quasi-experimental devices aim to mitigate the ambiguity of associations in observational studies resulting from non-randomized treatment. Emerging developments in quasi-experimental devices include evidence factors, differential effects, and computerized construction of quasi-experiments. Future work in this area will aim to apply statistical inference in quasi-experimental devices as well as determine how to leverage quasi-experimental devices in making causal inferences about many treatments working together.
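
To make the privacy-preserving sense of distributed learning in item 6 a little more concrete, here is a minimal sketch in Python. It is not taken from the article: the two sites, their data, and the helper names (`SiteSummary`, `summarize`, `fit_from_summaries`) are all hypothetical. Each site computes a handful of totals locally, and a simple linear regression is then fit from those totals alone, so no individual-level record ever leaves a site.

```python
# Illustrative sketch only: the sites, data, and helper names are hypothetical.
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregate totals a site shares in place of its raw records."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def summarize(xs, ys):
    """Run locally at each site; only the returned totals leave the site."""
    return SiteSummary(
        n=len(xs),
        sum_x=sum(xs),
        sum_y=sum(ys),
        sum_xx=sum(x * x for x in xs),
        sum_xy=sum(x * y for x, y in zip(xs, ys)),
    )


def fit_from_summaries(summaries):
    """Least-squares fit of y = intercept + slope * x from combined totals."""
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Made-up exposure (x) and outcome (y) values held at two separate sites.
site_a = ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2])
site_b = ([4.0, 5.0], [8.1, 9.8])

intercept, slope = fit_from_summaries([summarize(*site_a), summarize(*site_b)])
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Because the totals simply add up across sites, the fit matches what one would get from pooling the raw records. Note that this is only a regression fit; whether the slope can be read causally still depends on the study design and assumptions discussed throughout this list.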

*

With computational advances and the proliferation of data in so many fields, perhaps there has never been a more exciting time to be an epidemiologist. Even with these advances in statistical computing, rigorous study design and the disclosure of assumptions remain critically important for responsible research.

 
