
Artificial Intelligence, Epidemiology, and Moving Toward Causal Inference

 

Author: Madeline Roberts, PhD, MPH

The artificial intelligence zeitgeist continues, marked by proliferating capabilities and applications along with the race to develop guardrails to keep pace with growth.

Some epidemiological and health applications for AI include:

♦ IBM Watson Health and the Explorys dataset: comprises anonymized data on 50 million patients, useful for evaluating disease history and progression in populations, among a multitude of other uses.

♦ HealthMap uses natural language processing (NLP) to sift through disparate data sources and produce a comprehensive, real-time surveillance system for diseases and public health threats.

♦ The CDC has used AI and machine learning (ML) for tuberculosis surveillance in chest X-rays.

♦ The CDC is also working toward employing AI for trend forecasting in opioid mortality, as well as using satellite imagery changes to target polio vaccine delivery in places like Nigeria.

AI is well suited to powerful predictive modeling; however, it is not necessarily skilled at causal inference or at differentiating between association and causality. A recent International Epidemiological Association article describes this difference well: “The key difference between AI and classic epidemiology is that the latter builds models based on explicit assumptions about what matters and how, so that the results can be directly interpretable, whereas AI builds algorithms in essence for predictive models discovered from the data, without necessarily understanding why.”
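
To make that distinction concrete, consider a minimal, purely illustrative simulation (Python with NumPy; the variable names and numbers are hypothetical, not drawn from any real study) in which a predictive fit finds a strong exposure-outcome association that is created entirely by a shared cause:

# Illustrative only: a purely predictive model happily exploits a
# non-causal association produced by a confounder.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
confounder = rng.normal(size=n)              # e.g., underlying disease severity
exposure = confounder + rng.normal(size=n)   # exposure has NO effect on outcome
outcome = 2.0 * confounder + rng.normal(size=n)

# A simple least-squares fit of outcome on exposure alone
slope = np.cov(exposure, outcome)[0, 1] / np.var(exposure)
print(f"Observed exposure-outcome association: {slope:.2f}")   # ~1.0, not 0

The association is real and genuinely useful for prediction, but intervening on the exposure would change nothing; a DAG-guided adjustment, sketched near the end of this piece, makes that visible.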

Other epidemiological concerns include the lack of a priori research questions or prespecified analytic plans, and bias and inequity stemming from training datasets that often underrepresent minority populations. A related issue is overfitting, which can lead to poor generalizability to populations or settings beyond the training dataset.
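
As a toy illustration of the overfitting concern (synthetic data, NumPy only; the polynomial degrees and noise level are arbitrary choices for demonstration), a model flexible enough to memorize a small training sample fits it almost perfectly yet performs worse on new data drawn from the same process:

# Illustrative only: an over-flexible model memorizes the training
# sample and generalizes poorly to new data.
import numpy as np

rng = np.random.default_rng(1)
x_train = rng.uniform(-1, 1, 15)
y_train = x_train + 0.3 * rng.normal(size=15)    # the true relationship is linear
x_test = rng.uniform(-1, 1, 200)
y_test = x_test + 0.3 * rng.normal(size=200)

for degree in (1, 10):                           # modest vs. over-flexible fit
    coefs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")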

In 2019, Obermeyer et al. published a landmark study identifying that, by using health costs as a proxy for health needs, a US commercial health algorithm miscalculated Black patients’ risk levels and their need for additional health care. Because fewer health care dollars are spent on Black patients than on White patients at the same level of need, the algorithm incorrectly concluded that equally sick Black patients were healthier than White patients.
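
The mechanism is easy to reproduce in a deliberately simplified simulation (entirely synthetic; the 30% spending gap and the group labels are hypothetical stand-ins, not figures from the study): when cost stands in for need as the training label, any group that receives less spending at the same level of need is scored as lower risk.

# Illustrative only: label-choice bias. Two groups with identical health
# needs, but group B historically receives less spending at equal need.
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
need = rng.gamma(shape=2.0, scale=1.0, size=n)   # true (unobserved) health need
group_b = rng.random(n) < 0.5                    # hypothetical group indicator

# Equal need, unequal spending: group B receives ~30% less care at the same need
cost = need * np.where(group_b, 0.7, 1.0)

# An algorithm trained to predict cost would, at best, reproduce cost itself
risk_score = cost

# Among patients with identical (high) true need, group B is scored lower
high_need = need > np.quantile(need, 0.9)
print("Mean risk score among equally high-need patients:")
print("  group A:", round(risk_score[high_need & ~group_b].mean(), 2))
print("  group B:", round(risk_score[high_need & group_b].mean(), 2))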

Last year, another influential study, by Dr. Anirban Basu, concluded that including a race correction feature in algorithms for clinical diagnostic purposes can help curb inequalities, but that including race in models for resource allocation can in essence perpetuate inequality, because such models can be predicated on “differential efforts across groups.”

Bias built into healthcare algorithms, as well as medical insurance decisions based solely or primarily on AI predictions, can have devastating repercussions for the people subject to them. Investigations into Medicare Advantage, particularly in the area of post-acute care (i.e., care after a stroke or a fall), have demonstrated patterns of Medicare Advantage insurers discontinuing payments for care based heavily or exclusively on algorithmic conclusions, despite clinician notes documenting the need for continued care. The patient accounts of payment termination for prescribed care based on algorithm determinations are disturbing, even more so considering these patients are seniors. There is now a class action lawsuit against UnitedHealthcare over Medicare Advantage denials.


These findings sparked bipartisan ire in late 2023, and came up again earlier this month, on February 8, at the Senate Committee on Finance hearing “AI and Healthcare: Promise and Pitfalls.” Dr. Ziad Obermeyer testified at this hearing (his written testimony here; recording of the Senate hearing here). Dr. Obermeyer’s written testimony is balanced, compelling, and well worth reading. While he remains optimistic that AI has the potential to improve health outcomes and reduce costs, he also warns:

“…without concerted effort from researchers, the private sector, and government, AI may be on a path to do more harm than good in health care…

AI learns from historical data, with all its biases and inequities, and encodes those past practices in policy. So those underserved patients whose claims have been denied by humans in our past datasets—often for unjust reasons—will have their claims denied by AI at scale, forever, unless we can re-align AI with our society’s goals.”

Similar concerns were expressed by former Google CEO Eric Schmidt in a recent article titled “How We Can Control AI,” in which he discussed the challenge of “encod[ing] human values” into AI models. As a potential way to keep AI in check, Schmidt described a proposal from a recent meeting of key AI players: develop competitive, innovative AI testing companies that are government-certified but whose developers and funding come from the private sector. AI model builders would pay to have their models tested and certified against safety standards.

Principled epidemiological approaches can both contribute to and benefit from the responsible use of AI in population health studies and interventions. Suggested methods for strengthening causal inference include directed acyclic graphs (DAGs) and Mendelian randomization; a brief sketch of the DAG idea follows.
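
To give a flavor of the DAG approach, here is a continuation of the earlier confounding sketch (again illustrative NumPy code under the same assumed data-generating process, not a full causal analysis). The DAG, in which the confounder causes both the exposure and the outcome, makes the backdoor path explicit, and a regression that adjusts for the confounder recovers the true null effect:

# Illustrative only: the DAG confounder -> exposure, confounder -> outcome
# implies that adjusting for the confounder closes the backdoor path.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
confounder = rng.normal(size=n)
exposure = confounder + rng.normal(size=n)       # still no causal effect
outcome = 2.0 * confounder + rng.normal(size=n)

# Regression of outcome on exposure AND confounder (backdoor adjustment)
X = np.column_stack([np.ones(n), exposure, confounder])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(f"Confounder-adjusted exposure effect: {beta[1]:.2f}")   # ~0.00

The DAG itself comes from subject-matter knowledge specified before the analysis, which is precisely the kind of explicit a priori assumption that classic epidemiology contributes.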

Epidemiology as a field exists to analyze disease distribution patterns and develop targeted interventions. Cross-disciplinary collaborations between epidemiologists and data informaticists are needed to continue making strides toward this end.  

 
