HOME    ABOUT    NEWS    JOB BANK     EVENTS    CONTACT

 

Epi Info™ Phase-Out
 Reflections on a Transformational Data Tool as Told by Developer

 

Author: Dr. Andrew Dean

Editor's Note:
Computational analysis and data visualization are integral to epidemiologic work, and current programs are the result of countless iterative improvements. The CDC's Office of Public Health Data, Surveillance, and Technology (OPHDST) recently announced the phase-out of Epi Info™, the public domain software for epidemiologic analyses, maps, and graphs. Epi Info™ has been critical to epidemiologic investigations, particularly in remote areas.

This month we had the privilege of interviewing the Epi Info™ developer and former Epidemic Intelligence Service (EIS) Officer, Dr. Andy Dean. Dr. Dean received many national and international awards recognizing his work on Epi Info™, but the one he most treasures was the first. In April 1987, the graduating EIS Class of 1985 awarded him the Dr. Phil Brachman teaching award for his role in creating Epi Info™, a tool that empowered dozens of subsequent EIS officers and other public health workers to go anywhere in the world with a portable computer or laptop and conduct lifesaving investigations. Dr. Dean later served as president of the Council of State and Territorial Epidemiologists (CSTE), and his peers there awarded him the Pumphandle Award ten years later, in 1995.

Drs. Andy Dean and Consuelo Beck-Sagué, husband and wife and both former EIS officers, shared their experiences developing and implementing this software. It’s a remarkable story of identifying a need and delivering a solution that benefited many epidemiologists and the communities they served.

This interview has been edited for length and clarity.


EpiMonitor: Can you talk about the lack of computational resources for epidemiologic fieldwork in the early/mid-eighties and what led you to develop Epi Info™?

Dean: In the early 1980s, epidemiologists had few choices for processing data and statistical testing, mostly SPSS or SAS. Both were expensive and not in the public domain. They were mostly designed for mainframe computing. They were complex and not very accessible. Often, epidemiologists would have to explain the analysis that they wanted to perform to a statistical or computer professional down the hall or in another facility or state, and since these analyses sometimes raised other questions, the work required multiple time-consuming rounds.

Beck-Sagué: Computer programmers and/or statisticians generally managed access to these resources. During the 1985 EIS class training, we spent most afternoons in our respective corners of CDC's Atlanta headquarters learning mainframe SAS and Job Control Language (JCL) and trying to apply them to the data that we were supposed to analyze.

Dean: Data entry required knowledge of the mainframe’s procedures.

Beck-Sagué: Data entry and analyses were as discouraging as one could imagine. Most of us were coming from clinical medicine and spending the mornings in the thrilling process of learning how to analyze epidemiologic data from outbreak investigations, surveillance, or other vital activities. Designing data collection and data-entry forms and conducting data analyses were challenging but exciting. But the weak and nightmarish link was the use of computational resources to achieve these ends.

I envisioned my first epidemic investigation, my first “EPI AID,” as working out a puzzle, solving a mystery. But I didn’t know enough to envision investigating an epidemic with computer resources on site until July 28, 1985, when Andy [Dr. Dean] presented the very earliest version of what would become Epi Info™.

EpiMonitor: I understand that you worked with your son, Jeffrey Dean, in developing some of the later iterations of Epi Info™, and I believe that your wife, Dr. Consuelo Beck-Sagué, conducted the first field test of Epi Info™. Can you talk about those experiences?

Dean: My son Jeff was about 16 years old when we moved from Minnesota, where I was state epidemiologist, to Atlanta to work at CDC. I had built a microcomputer from a kit when we lived in Minnesota. As a teenager, Jeff loved computing. He belonged to a computer network at the University of Minnesota that allowed young people to work on games data. During our first year in Atlanta, we needed a programmer. I designed the specs for software that would work like a word processor, allowing ordinary people (non-programmers) to draft a questionnaire with field options (e.g., <Y> for yes/no, continuous variables, multiple choice) that the program would then turn into a data-entry screen. The program would then convert the entered data into a dataset that the program could analyze. As I recall, Jeff designed and programmed much of the data-entry and analysis program in Pascal. The analysis program took the data from the data-entry module, calculated 2x2 tables, relative risks, and analyses of continuous and categorical data, and produced statistics, including significance testing. During the summer of 1985, Jeff produced about 200,000 lines of Pascal code that became part of Epi Info.
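
Editor's aside: for readers unfamiliar with the 2x2 table analysis Dr. Dean mentions, the following is a minimal Python sketch of how a relative risk can be computed from the four cells of such a table. It is not Epi Info code (which, as noted above, was written in Pascal); the function name `relative_risk` and the counts are hypothetical, chosen only for illustration.

```python
def relative_risk(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    """Relative risk (risk ratio) from the four cells of a 2x2 table.

    Hypothetical helper for illustration; not part of Epi Info.
    """
    exposed_total = exposed_ill + exposed_well
    unexposed_total = unexposed_ill + unexposed_well
    if exposed_total == 0 or unexposed_total == 0:
        raise ValueError("each exposure group needs at least one person")

    # Risk = proportion of each group that became ill.
    risk_exposed = exposed_ill / exposed_total
    risk_unexposed = unexposed_ill / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when no unexposed person is ill")

    return risk_exposed / risk_unexposed


# Hypothetical counts, for illustration only (not data from any real outbreak).
rr = relative_risk(exposed_ill=30, exposed_well=10, unexposed_ill=5, unexposed_well=35)
print(f"Relative risk: {rr:.1f}")
```

With these made-up counts, the risk of illness is 30/40 among the exposed and 5/40 among the unexposed, so the exposed group's risk is six times higher. A full analysis would also report a confidence interval and a significance test, which this sketch omits.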

Beck-Sagué: I was admitted to the Epidemic Intelligence Service (EIS) class of 1985 and assigned to the Division of STDs. On July 28, 1985, Andy’s collaborators explained components of the program, and Andy gave a demo of the alpha version.

There were over 100 of us EIS officers, and I was in the front row. Andy is very thin, not tall, and very soft-spoken; everyone was very quiet, looking at the screen where the output from his computer was being projected. Andy created a questionnaire in what looked like an ordinary word processor, “QES”, and made fields for different variables to investigate an epidemic of diarrhea after a picnic. Then, he went to the menu and picked “ENTER”, and the questionnaire turned into a data-entry screen. The blanks turned into empty fields. He entered information for a few participants in the picnic: age, sex, and yes or no for different foods. When he had a few in, he went to ANALYSIS and chose the “READ” command, which pulled up the newly created dataset in less than a second. The entered data had become a dataset ready to be analyzed. He demonstrated some of the ANALYSIS commands, and then used the READ command on an identical dataset where data had been entered for over 70 persons who attended the picnic. He performed various analyses, using very simple commands, culminating with the TABLES command, which showed that having eaten vanilla ice cream was significantly linked with being a case.

We all looked at each other, stunned. This process of converting a questionnaire into a data-entry screen, entering data, and analyzing the data, answering the major question of the investigation, had taken minutes. Months of JCL and “coding” into 0s and 1s, and analysis using SAS (PROC FREQ: TABLES, etc.) had become, literally, a summer afternoon’s fun. People got a little emotional, and when Andy opened the floor for questions, there was thunderous applause.

Four months later, in the Division of STDs, we were invited to investigate a cluster of deaths due to tertiary syphilis in young people in a county (“County A”) in State A, and a simultaneous surge in cases of early syphilis. I was the Division’s syphilis investigator. I dutifully put together the Epi 1 Memo that went to the Epidemiology Program Office to announce that CDC had been invited by the county to investigate the outbreak. The next day, my branch chief told me that, 100% coincidentally, Epi Info was ready for an alpha test. Andy wanted to accompany me on the syphilis investigation for the first field test of what would become Epi Info. To this day, Andy is 100% sure that he did not remember ever having seen me before.

We went together to County A in State A in January 1986. In a couple of minutes, we designed questionnaires, which became data-entry forms, to enter and analyze data from the medical examiner autopsy findings of everyone autopsied in the last year, and results of postmortem serum HIV and treponemal tests conducted at CDC. We also abstracted, entered and analyzed data on hundreds of cases of early syphilis that had been treated in the county in the last year. It was amazing. Since that experience, I have conducted at least 40 studies, including epidemics, surveillance, trials, and various others, using the succeeding versions of Epi Info. These studies were conducted in multiple states in the US, and various countries, including Egypt, Sierra Leone, Botswana, South Africa, the Dominican Republic, Haiti and Guatemala.

I was able to rapidly, onsite, help colleagues develop control measures and lifesaving interventions, and to better understand the needs and resources of their populations. Everywhere I went, I left copies of Epi Info, so that my colleagues there could conduct their own analyses. Within a few years, Epi Info became the most widely used epidemic investigation software for desktop, portable and laptop computers in the world.

Epi Info was readily translatable. Epidemiologists from China, Egypt, Italy, France and Spain worked with Andy to develop Epi Info versions in their languages. These collaborations turned into lifelong friendships. 

I loved this program so much I had to marry the inventor. I’m just kidding, but really, it is that remarkable. I still use it for everything. I am going to be 73 next year when Epi Info sunsets. I will probably sunset, too. It has been amazing. 

EpiMonitor: Could you describe some of the challenges you encountered in the development and implementation processes?

Dean: Challenges included the paucity of resources available to maintain and improve Epi Info through its different versions. Government interest in, and funding for, resources that are not money-makers varies with time and administrations, and there is generally a lot of competition for limited resources. As a result, we had to work on a shoestring budget, with a very small group of highly dedicated professionals.

Beck-Sagué: I would add the tendency, in the beginning, for people not to trust a tool that could be used by people without a lot of formal statistical and computer training. This distrust extended to professionals who were themselves initially afraid of working with computers. My experience has been that between 1985 and 2000, people changed. They got much more confident and experienced in the use of computers for professional activities, research and recreation, and it became much easier to train and collaborate with colleagues using Epi Info. The emergence of the internet reduced many of the barriers to the use of Epi Info, and the insistence on its being open-source and free helped to make it a worldwide phenomenon, including in resource-constrained developing countries and institutions.

CDC support for Epi Info™ will be available until Sept. 30, 2025. An annotated history can be found here. An excerpt from a message to the Epi Info™ user community on the CDC website reads: “This sunsetting decision is part of OPHDST's realignment of resources to focus on products that support our "One Public Health Approach" to data modernization…We recognize that Epi Info™ has been an integral part of users' public health work for nearly four decades. We appreciate the trust you have placed in Epi Info™ and CDC.”

[Photo] Dr. Consuelo Beck-Sagué during the first field trial of Epi Info

[Photo] Dr. Andrew Dean

[Photo] 1993 Conference on “Microcomputers and the Future of Epidemiology”: Dr. Andrew Dean (left) with translators:
Spanish: Juan Carlos Fernandez Merino
Chinese
French: Dr. Robert Freund
Indonesian: Dr. Pandu Riono
Arabic: Dr. Samy Sidki

[Photo] Dr. Jeffrey Dean
