Q&A
With Dr. Maya Mathur
Author: Madeline Roberts, PhD, MPH

This month we were delighted to catch up with Dr. Maya Mathur, who gave several fantastic presentations covering causal inference and E-values at the recent SER conference. Dr. Mathur is an Assistant Professor in Stanford University's Quantitative Sciences Unit and the Department of Pediatrics. She also serves as the Associate Director of Stanford Data Science's Center for Open and Reproducible Science (CORES). We enjoyed following up with Dr. Mathur about everything from open science and p-hacking to advice for early career epidemiologists and data scientists.

EpiMonitor: Could you share a bit about your current research interests? Are there any projects about which you are particularly excited?

MM: For better or worse, I'm a greedy algorithm: I tend to jump into whatever research topic happens to most excite me at any given moment, so I do tend to jump around. On the epi side, I've been very interested recently in selection bias and missing data, and how graphical models can help straighten out these counterintuitive issues. I'm enjoying delving into some new-to-me theory and proof techniques based on graphical models; it's all very confusing and interesting, an addictive combination. I also have a longer-standing interest in evidence synthesis and meta-analysis. Recently I've been very interested in meta-analytic methods to address p-hacking, which, it turns out, is a lot more pernicious than what we usually conceptualize as publication bias.

EpiMonitor: A PubMed search shows a substantial increase in the number of publications mentioning E-values over the past ten years. Can you talk about the utility of E-values in causal inference and why you think there may have been such a rise in their use?

[Editor's note: an E-value can be defined as "the minimum strength of association on the risk ratio scale that an unmeasured confounder would need to have with both the exposure and the outcome, conditional on the measured covariates, to fully explain away a specific exposure-outcome association." (https://www.evalue-calculator.com/)]

MM: I think their utility is as an easy-to-apply sensitivity analysis that is conservative in the sense that it allows for worst-case confounding. E-values certainly have their limitations as well, particularly in relation to their conservatism, though I think Peng Ding and Tyler [VanderWeele, both co-authors on the website and R package for computing E-values] have been thoughtful about trying to convey this in their various papers. In terms of uptake, I think it helps that some top journals are encouraging or requiring their use, and there is software to easily calculate E-values, so it is not a big lift. It's also pretty interesting that approaches very similar to the E-value can be used for many other forms of bias that manifest as some kind of backdoor path in a graph; the theory Peng developed for the E-value is general and flexible in this way.
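For readers who want to try the calculation themselves, below is a minimal Python sketch of the closed-form E-value formula for risk ratios, E = RR + sqrt(RR * (RR - 1)); this is the same quantity reported by the calculator website above and by the EValue R package, but the code and example numbers here are purely illustrative.

```python
from math import sqrt

def e_value(rr, ci_limit=None):
    """E-value for a risk ratio: E = RR + sqrt(RR * (RR - 1)),
    after inverting ratios below 1 so the formula applies symmetrically.
    `ci_limit` is the confidence-interval limit closer to the null."""
    def transform(x):
        x = 1.0 / x if x < 1.0 else x        # work on the side above the null
        return x + sqrt(x * (x - 1.0))

    e_point = transform(rr)
    if ci_limit is None:
        return e_point
    # If the interval already crosses the null, no unmeasured confounding is
    # needed to explain the association away, so the CI E-value is 1.
    crosses_null = (rr > 1.0 and ci_limit <= 1.0) or (rr < 1.0 and ci_limit >= 1.0)
    return e_point, 1.0 if crosses_null else transform(ci_limit)

# Illustrative numbers only: observed RR = 1.8 with lower CI limit 1.3
print(e_value(1.8, 1.3))   # approximately (3.00, 1.92)
```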
EpiMonitor: Some of your work focuses on meta-analyses, which are useful in identifying patterns across the literature, but their utility can be restricted by publication bias. Can you talk about your work on p-hacking and the corresponding R package phacking, and how it addresses some of the issues arising from publication bias?

MM: What's fascinating about p-hacking is that we often talk about it alongside publication bias as if they're interchangeable, but they really are not. As traditionally conceived, publication bias is when there is a filter on which studies are published and available for inclusion in meta-analysis. But p-hacking is when investigators actually manipulate results within their studies, for example to obtain a significant result. The bias that arises from p-hacking is, it turns out, much harder to deal with statistically than the bias that arises from pure publication bias. The paper you're referring to provides some new meta-analytic methods that will be unbiased under many -- but not all -- forms of p-hacking and publication bias, unlike methods that only address publication bias. The R package and accompanying website (metabias.io) are largely thanks to my postdoc Mika Braginsky, and I think they go a long way toward making the methods accessible.
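To make the distinction between the two mechanisms concrete, here is a toy Python simulation (an editorial illustration, not Dr. Mathur's methods or the phacking package): under publication bias each study runs one honest analysis but only significant positive results reach the literature, whereas under p-hacking every study is published but reports the most favorable of several analyses. Either mechanism inflates a naive pooled estimate of a truly null effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n_studies, true_effect, se = 1000, 0.0, 0.1   # hypothetical settings: the true effect is null

# Publication bias: each study runs one analysis; only "significant"
# positive results (z > 1.96) are published and available for meta-analysis.
one_look = rng.normal(true_effect, se, n_studies)
published = one_look[one_look / se > 1.96]

# p-hacking: each study takes five noisy looks at the same null effect and
# reports the most favorable one; every study is published, but manipulated.
five_looks = rng.normal(true_effect, se, (n_studies, 5))
hacked = five_looks.max(axis=1)

print("true effect:                        ", true_effect)
print("naive pooled mean, publication bias:", round(published.mean(), 3))
print("naive pooled mean, p-hacking:       ", round(hacked.mean(), 3))
```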
EpiMonitor: Your personal website is a wealth of information with links to datasets, code, and lecture slide decks. You are clearly a proponent of open science! Can you talk a bit about what open science means to you and why it's important? Have you encountered any drawbacks?

MM: Yes, I believe that in general open science can lead to a much more reliable, efficient, and accurate scientific ecosystem. Some of the key tenets that I most strive for in my own work are making analysis code publicly available, making de-identified data available when permissible, and preregistering study hypotheses and protocols. There is increasingly compelling empirical evidence from other disciplines, especially the social sciences, that these kinds of practices can reduce errors and publication bias in the literature, and I hope to see our field update its norms and incentives accordingly. There can definitely be drawbacks to adopting open-science practices: for example, it's harder to p-hack and get attractive results to publish! On the other hand, I do find that pre-specifying analyses sharpens my scientific reasoning and that organizing data and code for public release encourages better, more reusable workflows. Sometimes there are surprises as well. In a paper Matt Fox and I wrote, we recount a story where the public repo containing study materials for one of my papers became -- to my great surprise -- pretty popular for re-use in other papers. This not only meant people were citing the original paper a lot more than I'd expected, but it was also very fun to see the creative new ideas and results that others generated from the study materials.

EpiMonitor: You completed your PhD in 2018. Do you have any advice for early career epidemiologists and data scientists?

MM: Yes, that's right, so probably my advice should be treated as untested and unreliable until further study! More seriously, for me an important guiding principle has been to keep at the forefront why I love being in academia in the first place and, relatedly, what I need in order to do work that contributes usefully to our canon of knowledge. Then I try to prioritize the most mission-critical things. I love going off into a hermit cave and struggling with a difficult concept or proof; this is the main reason I am in academia. It would be easy to let meetings and email constantly interrupt this kind of deep work time, but I try hard to prevent this from happening too often. Similarly, curiosity is vital to life in academia but can easily get pushed aside if we focus just on getting papers out and checking off tenure criteria. If I get interested in something, I try to just roll with it, even if it means writing a paper, or recently founding a lab, on a topic that is pretty out of the box. Last, of course, it's also helpful to try to have a life outside of work. I work strange hours, but I do preserve a significant amount of time to enjoy friendships and several hobbies, and I think this makes the rigors of academia sustainable in the long run.

Dr. Mathur's work, including links to publications, data repositories, and slide decks, can be found on her personal website: https://www.mayamathur.com/. ■