Jul 21, 2008 (CIDRAP News) – In the history of infectious diseases, coincidence plays an extraordinary role. In 1706, Cotton Mather purchased a slave named Onesimus who happened to come from a tribe that practiced variolation, and so smallpox prevention was introduced to North America. In 1928, Alexander Fleming happened to leave a window open in his laboratory, and the contaminants that drifted into a dish of Staphylococcus aureus provided the raw material for the discovery of penicillin.
And in February 2003, a never-identified man in southern China emailed a query to an American teacher he knew from an Internet chat room, who happened to have been the neighbor of a US Navy epidemiologist. The epidemiologist, Dr. Stephen Cunnion, placed the relayed note on the electronic mailing list ProMED—and so the first notice of the international SARS epidemic was brought to the world, weeks before the Chinese government admitted the disease's existence.
Five years on, the example of that relayed note has inspired a broad-based effort to take the coincidence out of outbreak notification. It seeks to do by design what the never-named writer accomplished by happenstance: tap nontraditional sources of information, find and verify the earliest possible news of disease outbreaks, publicize the outbreaks, and possibly help contain them.
Dr. Larry Brilliant, one of the chiefs of the World Health Organization's (WHO's) smallpox-eradication effort and now executive director of the philanthropy Google.org, has dubbed the effort "two steps to the left"—meaning two steps backward on an epidemic curve, when an outbreak is much harder to detect but easier to control or contain.
"Is it possible, or even probable, that if we better understood the complexity and magnitude of the many factors that lead to the emergence of infectious disease, that we . . . might be able to get early warning signals from satellites or webcrawlers or phone banks?" Brilliant said in March in a keynote speech at the International Conference on Emerging Infectious Diseases. "Or even better, that we could identify hotspots where newly emerging communicable diseases would arise . . . ?"
HealthMap is latest entry
The emerging surveillance systems address the first of Brilliant's "two steps": They seek very early warnings of outbreaks by analyzing data that originates outside the public health hierarchy. The most recent entry in the field is HealthMap, created by epidemiologist John Brownstein and software developer Clark Freifeld at the Children's Hospital Boston Informatics Program. It began as a pilot project in September 2006, is described in the July issue of Public Library of Science Medicine, and is partially supported by a grant from Google.org.
It joins older nonprofit tools including ProMED, a free Web and email service of the International Society for Infectious Diseases (ISID); official surveillance efforts by public health agencies, such as the European Union's MedISys and the Global Public Health Intelligence Network (GPHIN), operated by the Public Health Agency of Canada for the WHO; and new grassroots efforts by epidemiologists and computer scientists such as the volunteer effort WhoIsSick.org.
Collectively, the new surveillance efforts give teeth to the revised International Health Regulations, which took effect a year ago. The revision formally recognized "informal sources" of disease news as worthy of attention and capable of triggering an international outbreak alert.
The new efforts differ widely. Some rely entirely on human input, while others employ data-mining algorithms. Some are open to the public, others restricted to health professionals or government officials. And some are entirely text-based, while others take advantage of new technologies such as geographic information systems, or GIS (the technology behind Google Maps), to display the location of outbreaks as precisely as possible.
"No one system can do it all or will be able to do it all," said Dr. Larry Madoff, editor of ProMED Mail and a professor of medicine at the University of Massachusetts. "It is better to have multiple systems, because they provide verification for each other, and conversely show us what we are missing."
Unofficial sources offer speed
What the new surveillance systems share is a refusal to depend on data from public health's established reporting systems, which rely on electronic or paper reports filed by physicians or local health departments and passed layer by layer through the public health hierarchy. Those reports may be exquisitely accurate, because they originate with medical professionals, but they are slow.
The new systems balance the risk of sacrificing accuracy against the need for speed, which they get by harvesting and evaluating news stories, blog posts, listserv discussions, and whatever else can be spotted by eye or scraped by a Web-crawling program.
ProMED, which began in 1994 and has operated with ISID's support since 1999, is the most labor-intensive: It relies on 37 volunteer editors who examine submissions from a global network of official "rapporteurs" and casual correspondents. It is also the lowest-bandwidth: Though it maintains a website, many of its global subscribers rely on its text-only emails, which move easily even through computers connecting by dial-up.
GPHIN, running on the WHO's behalf since 1997, automatically samples two major Internet news aggregators and machine-translates stories in eight languages; the harvested stories are reviewed by humans before they are sent out to a subscription-only network. It is credited with turning up the earliest hint of SARS in November 2002, in a Chinese-language account of a rise in respiratory disease complaints in local emergency rooms.
HealthMap, the newest entry, expands on the sources the other systems draw from: It performs fully automated Web-scraping from 14 aggregate sources that collect data from approximately 20,000 sites. It currently collects in English and machine-translates from four other languages, with three more under development. The reports it collects are automatically sifted for duplicates and mistakes, ranked by urgency, and sorted and posted by source, date, location, and disease.
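The sifting steps described above (dropping duplicates, ranking by urgency, sorting) can be pictured with a short sketch. It is an illustration only, not HealthMap's actual code: the `Report` fields, the `URGENCY` scores, the headline-matching rule for duplicates, and the sample reports are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical urgency scores; the article does not describe the real ranking criteria.
URGENCY = {"avian influenza": 3, "measles": 2, "norovirus": 1}


@dataclass(frozen=True)
class Report:
    source: str
    day: date
    location: str
    disease: str
    title: str


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical headlines match."""
    return " ".join(title.lower().split())


def drop_duplicates(reports):
    """Keep the first report seen for each (disease, location, normalized title)."""
    seen = set()
    unique = []
    for r in reports:
        key = (r.disease, r.location, normalize(r.title))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def rank(reports):
    """Most urgent first; ties broken by newest date, then by source name."""
    return sorted(
        reports,
        key=lambda r: (-URGENCY.get(r.disease, 0), -r.day.toordinal(), r.source),
    )


# Made-up sample data for illustration.
reports = [
    Report("wire-a", date(2008, 7, 1), "Toronto", "measles", "Measles cases rise"),
    Report("wire-b", date(2008, 7, 2), "Toronto", "measles", "measles  cases rise"),
    Report("blog-c", date(2008, 7, 3), "Kolkata", "avian influenza", "Suspected H5N1 in poultry"),
]
ranked = rank(drop_duplicates(reports))
```

A real system would need a far more forgiving duplicate test than matching headlines, since the same event is usually reported in different words by different outlets.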
Its striking innovation is real-time mapping of the news it gathers. Reports are coded with latitude and longitude and "pinned" to a world map; clicking on the pins produces links to the reports that the system has gathered. The collective result—map plus links plus reports—is gathered into a single open-access Web page.
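The pinning step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how HealthMap is built: the `GAZETTEER` lookup table, its approximate coordinates, the `build_pins` helper, and the example links are all hypothetical.

```python
from collections import defaultdict

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Toronto": (43.65, -79.38),
    "Boston": (42.36, -71.06),
}


def build_pins(reports):
    """Group report links by coordinate so one pin lists every report for that place.

    `reports` is an iterable of (place_name, link) pairs. Reports whose place
    is not in the gazetteer are returned separately rather than guessed at.
    """
    pins = defaultdict(list)
    unplaced = []
    for place, link in reports:
        coords = GAZETTEER.get(place)
        if coords is None:
            unplaced.append(link)
        else:
            pins[coords].append(link)
    return dict(pins), unplaced


# Made-up sample input for illustration.
pins, unplaced = build_pins([
    ("Toronto", "https://example.org/report/1"),
    ("Toronto", "https://example.org/report/2"),
    ("Atlantis", "https://example.org/report/3"),
])
```

Clicking a pin then amounts to looking up its coordinate in `pins` and showing the links stored there.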
"We realized there is so much content out there on the Web, and that information is scattered in an unorganized, unstructured way," said Brownstein, who is an assistant professor of pediatrics at Harvard Medical School in addition to his Children's Hospital informatics appointment. "So for any particular person—a public health official, an international traveler, a travel clinic, or whatever perspective you come from—to know what is going on at any given time, in any given country, around any given infectious disease is essentially impossible without massive amounts of manual queries and searching that is overly burdensome."
Maps provide extra dimension
The latest movement in novel reporting is the deployment of sophisticated but easy-to-use tools such as GIS mapping for very local surveillance. Mapping of this kind was the inspiration for WhoIsSick.org, a private project by California software engineer PT Lee that aggregates personal reports of illness into "crowdsourced" snapshots of local disease trends.
GIS mapping was also used recently by the Toronto Star, whose "Map of the Week" project plotted the vaccination-exemption rates of local schools to suggest where an ongoing measles outbreak might strike next. And in the July/August issue of Public Health Reports, researchers from Montefiore Medical Center and Albert Einstein College of Medicine in the Bronx map the location and quality of food sources and exercise areas to illuminate local rates of diabetes and obesity.
Developers of the new surveillance systems agree that incorporating local reports is the necessary next step in the systems' evolution. It may be the most challenging: Data gathered by amateurs is likely to include a higher percentage of inaccurate or irrelevant reports. But it may also be the only route by which areas with no official disease surveillance—or with tight political controls on disease reports—can share information with the rest of the world. In fact, representatives of the public health systems from 23 countries called for enhanced disease surveillance in a December 2007 "call for action," asking industrialized countries to help improve disease reporting especially in Africa and South Asia.
Cell-phone text-messaging has already been used in India to report suspected cases of avian flu to provincial animal-health authorities. A new nonprofit named InSTEDD (Innovative Support to Emergencies, Diseases and Disasters) has received grants from the Rockefeller Foundation and Google.org's Predict and Prevent Initiative to bring rapid disease-reporting tools to Mekong Basin villages in Southeast Asia.
HealthMap's founders are working on a pilot project, using ProMED's volunteer moderators, that will test combining machine-harvested reports with human-evaluated ones. "The vision down the road is it would be a two-way line of communication, not just receiving or curating information [but] also inputting new data," Brownstein said. "That would be the concept that would open this up to the global community."
MedISys home page
GPHIN home page
WhoIsSick home page
May 2007 CIDRAP News story discussing WhoIsSick.org
Lefer TB, Anderson MR, Fornari A, et al. Using Google Earth as an innovative tool for community mapping. Public Health Reports 2008 Jul/Aug, 123(4):474-80
InSTEDD Mekong Basin Collaboration