Society for Epidemiologic Research Mid-Year Meeting Notes: Innovative Data Science Applications in Epidemiology
Author: Madeline Roberts, PhD, MPH

One overarching theme that surfaced is that a key challenge in AI approaches is to clearly identify and understand the problem you are trying to solve. Other challenges include sourcing and curating data from which useful signals can be extracted, and selecting the appropriate tools, techniques, and frameworks to analyze those signals.

Dr. Alejandro Berlin gave a fantastic talk titled “Harnessing the Power of AI in Cancer Research: From Code to Clinic.” Dr. Berlin emphasized that significant effort needs to go into the not-very-glamorous work of thinking about, constructing, curating, and stewarding data assets. Throughout his talk he returned to the theme that technology needs to be human-centered, the key question being “How are we improving care for patients?” He emphasized that to keep the focus human-centered, researchers need to clearly identify what problem(s) they are trying to solve and who is going to see the benefit of the technology. The aim is not to showcase what elaborate, impressive things we can do with the technology, but rather to ask how we are demonstrably improving care, and for whom. Dr. Berlin challenged researchers to consider whether we are simply digitizing a process and making it “fancier,” or whether we are actually moving the bar and improving patient care and patient experience. He also discussed the importance of differentiating between knowledge (the prediction) and judgment (taking action): AI provides the tools and delivers predictions, while humans make judgments and take action, and the latter is critical in technology evaluation. He posits that highly curated data is the most essential element of all.

Two noteworthy articles Dr. Berlin referenced were, first, “Decoding biological age from face photographs using deep learning” by Zalay et al. (a preprint at the time of this writing). This study developed and validated FaceAge, a deep learning tool that estimates biological age from simple facial photographs. Trained on data from healthy patients and cancer patients, the authors found that, “on average, cancer patients look older than their chronological age, and looking older is correlated with worse overall survival,” which was assessed using Kaplan-Meier survival analysis. Dr. Berlin touched on the potential ethical implications of this study, such as the possibility of insurance companies using such a tool when assigning premiums. The second standout article was “All models are wrong and yours are useless: making clinical prediction models impactful for patients” by Florian Markowetz, which is as fun and as informative to read as it sounds.

Dr. Irene Chen delivered an outstanding keynote speech on Artificial Intelligence and Health Disparities. She began by discussing how audits of algorithms used in healthcare are increasingly providing evidence of bias. Some reasons for this include: (1) healthcare in its current form has existing health disparities, so data that reflects those disparities is generated and then fed into algorithms, and (2) genome-wide association studies (GWAS) are not reflective of the global population (96% of GWAS participants are of European descent). She gave the example of dermatology algorithms trained largely on images of fair-skinned patients, which demonstrate poorer performance on dark-skinned patients; one way to correct this is to augment existing datasets with images of darker skin (study: “Disparities in dermatology AI performance on a diverse, curated clinical image set” by Daneshjou et al.).
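To make that subgroup check concrete, here is a minimal, hypothetical sketch in Python (pandas) of the kind of stratified evaluation described above: computing a classifier’s sensitivity separately by skin-tone group. The data, column names, and groups are invented for illustration and are not drawn from the Daneshjou et al. study.

    import pandas as pd

    # Toy evaluation set: true diagnosis, model prediction, and a hypothetical skin-tone group.
    eval_df = pd.DataFrame({
        "true_label": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
        "predicted":  [1, 1, 0, 1, 0, 0, 0, 1, 0, 0],
        "skin_type":  ["I-II"] * 5 + ["V-VI"] * 5,
    })

    # Sensitivity (true-positive rate) computed within each skin-tone group.
    for skin_type, group in eval_df.groupby("skin_type"):
        positives = group[group["true_label"] == 1]
        sensitivity = (positives["predicted"] == 1).mean()
        print(f"Skin type {skin_type}: sensitivity = {sensitivity:.2f}")

A large gap between groups in a check like this is the signal that would prompt augmenting the training data with images from the under-represented group.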
Thus, if the training dataset is not representative, we should look for ways to amend this to achieve better and more equitable performance. Dr. Chen’s research focuses on what she calls the “ethical AI pipeline for medicine” (you can find a figure of the pipeline here). She discussed how bias can enter at each step along the way, and how researchers must consider how to make the data collection process more equitable, weighing elements such as power dynamics, representation, who consents to data collection, and which studies are ultimately funded. She noted that “biased systems and biased datasets create algorithmic bias.” Dr. Chen concluded that equity problems are both societal and computational in nature, and both of these facets need to be addressed. Some of Dr. Chen’s other research includes work on patient-centered reasons for treatment switching, which utilizes large language models (LLMs); Clustering Interval-Censored Time-Series for Disease Phenotyping; and Ethical Machine Learning in Health Care, the last of which addresses some of the social justice aspects of machine learning.

The 2024 SER Annual Meeting will be held June 18-21 in Austin, Texas. More information, including accommodations and submissions, can be found here. We hope to see you there! ■