Society for Epidemiologic Research Mid-Year Meeting Notes: Innovative Data Science Applications in Epidemiology
Author: Madeline Roberts, PhD, MPH

One overarching theme that surfaced is that a key challenge in AI approaches is to clearly identify and understand the problem you are trying to solve. Other challenges include sourcing and curating data from which useful signals can be extracted, and selecting the appropriate tools, techniques, and frameworks to analyze those signals.

Dr. Alejandro Berlin gave a fantastic talk titled “Harnessing the Power of AI in Cancer Research: From Code to Clinic.” Dr. Berlin emphasized that significant effort needs to go into the not-very-glamorous work of thinking about, constructing, curating, and stewarding data assets. Throughout his talk he returned to the theme that technology needs to be human-centered, the key question being “How are we improving care for patients?” He emphasized that to keep the focus human-centered, researchers need to clearly identify what problem(s) they are trying to solve and who is going to see the benefit of the technology. The aim is not to showcase what elaborate, impressive things we can do with the technology, but rather to ask how we are demonstrably improving care, and for whom. Dr. Berlin challenged researchers to consider whether we are simply digitizing a process and making it “fancier,” or whether we are actually moving the bar and improving patient care and patient experience. He also discussed the importance of differentiating between knowledge (the prediction) and judgment (taking action): AI provides the tools and delivers predictions, while humans make judgments and take action, and the latter is critical in technology evaluation. He posits that highly curated data is the most essential element of all.

Two noteworthy articles Dr. Berlin referenced were, first, “Decoding biological age from face photographs using deep learning” by Zalay et al. (a preprint at the time of this writing). This study developed and validated FaceAge, a deep learning tool that estimates biological age from simple facial photographs. Trained on data from healthy patients and cancer patients, the authors found that, “on average, cancer patients look older than their chronological age, and looking older is correlated with worse overall survival,” which was assessed using Kaplan-Meier survival analysis. Dr. Berlin touched on the potential ethical implications of this study, such as the possibility of insurance companies using such a tool when assigning premiums. The second standout article was “All models are wrong and yours are useless: making clinical prediction models impactful for patients” by Florian Markowetz, which is as fun and as informative to read as it sounds.

Dr. Irene Chen delivered an outstanding keynote speech on Artificial Intelligence and Health Disparities. She began by discussing how audits of algorithms used in healthcare are increasingly providing evidence of bias. Some reasons for this include: (1) healthcare in its current form has existing health disparities, so data that reflects those disparities is generated and then fed into algorithms, and (2) genome-wide association studies (GWAS) are not reflective of the global population (96% of GWAS participants are of European descent). She gave the example of dermatology algorithms trained largely on images of fair-skinned patients, which demonstrate poorer performance on dark-skinned patients; one way to correct this is to augment existing datasets with images of darker skin (study: “Disparities in dermatology AI performance on a diverse, curated clinical image set” by Daneshjou et al.).
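To make that subgroup check concrete, here is a minimal, hypothetical sketch in Python (pandas) of the kind of stratified evaluation described above: computing a classifier’s sensitivity separately by skin-tone group. The data, column names, and groups are invented for illustration and are not drawn from the Daneshjou et al. study.

    import pandas as pd

    # Toy evaluation set: true diagnosis, model prediction, and a hypothetical skin-tone group.
    eval_df = pd.DataFrame({
        "true_label": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
        "predicted":  [1, 1, 0, 1, 0, 0, 0, 1, 0, 0],
        "skin_type":  ["I-II"] * 5 + ["V-VI"] * 5,
    })

    # Sensitivity (true-positive rate) computed within each skin-tone group.
    for skin_type, group in eval_df.groupby("skin_type"):
        positives = group[group["true_label"] == 1]
        sensitivity = (positives["predicted"] == 1).mean()
        print(f"Skin type {skin_type}: sensitivity = {sensitivity:.2f}")

A large gap between groups in a check like this is the signal that would prompt augmenting the training data with images from the under-represented group.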
Thus, if the training dataset is not representative, we should look for ways to amend this to achieve better and more equitable performance. Dr. Chen’s research focuses on what she calls the “ethical AI pipeline for medicine” (you can find a figure of the pipeline here). She discussed how bias can enter at each step along the way, and how researchers must consider how to make the data collection process more equitable, weighing elements such as power dynamics, representation, who consents to data collection, and which studies are ultimately funded. She noted that “biased systems and biased datasets create algorithmic bias.” Dr. Chen concluded that equity problems are both societal and computational in nature, and both of these facets need to be addressed. Some of Dr. Chen’s other research includes work on patient-centered reasons for treatment switching, which utilizes large language models (LLMs); Clustering Interval-Censored Time-Series for Disease Phenotyping; and Ethical Machine Learning in Health Care, the last of which addresses some of the social justice aspects of machine learning.

The 2024 SER Annual Meeting will be held June 18-21 in Austin, Texas. More information, including accommodations and submissions, can be found here. We hope to see you there! ■