Pandemic reveals strengths of new flu database

Jun 25, 2009 (CIDRAP News) – Against the backdrop of a global struggle to solve a dispute related to H5N1 avian influenza virus sharing and an anxious watch over the novel H1N1 virus sweeping the globe, a new public database for sharing influenza genetic sequences is easing the flow of data and winning the support of a growing community of researchers and health officials, even some from countries that have sparred in the past over intellectual property rights.

The database, which contains both human and animal influenza sequences as well as epidemiologic and clinical data, is part of the Global Initiative on Sharing All Influenza Data (GISAID), a nonprofit foundation based in Washington, DC, that was formed in 2006 with the support of international researchers who sought a more open way to share genetic data from H5N1 and other influenza viruses. Seventy-seven of the world's leading flu researchers, including six Nobel laureates, signed a letter in Nature announcing the group's formation.

Though databases such as GenBank still play a vital role in sharing and archiving influenza viruses, GISAID's EpiFlu database provides a more complete picture of the flu data. It includes some flu sequences that have not been made available to the public and permits scientists to submit extra information, such as clinical features, when they upload sequences. Once sequences are submitted, they are immediately accessible to other researchers.

Database exposes new virus structure and spread
In late April, the first news of the novel H1N1 virus threatened to overwhelm the database, GISAID officials said. However, partners from four time zones kept registration verifications going around the clock, ensured everyone's access to the data, and shared early detection findings with health authorities.

By April 25, the CDC had uploaded the first full genome sequence of the new virus from the initial US cases onto the GISAID database, instantly giving the world's research community its first detailed look at novel H1N1.

A spokeswoman for GISAID said its database administrators were among the first to see where the new virus was circulating, as spikes in registrations from a country typically preceded official confirmation of the first case.

"It was an incredible way to see where the next cluster was about to be confirmed," she said. "These were indeed some very dramatic days for all of us, days I believe some will never forget, since no one really knew what was about to happen next."

Virus-sharing dispute sparked project
Peter Bogner, a broadcast-executive-turned-international-crisis-manager who is GISAID's principal facilitator, devised a strategy for this initiative after first hearing about the virus-sharing debate at the World Economic Forum in Switzerland in January 2006, where he attended a meeting with former US Homeland Security Secretary Michael Chertoff on America's preparedness for an influenza pandemic. He used his contacts in governments around the globe to forge a consensus between scientists and policy makers on how a responsible sharing mechanism for influenza data should work, then pushed forward with its development.

He enlisted support for the new database concept from Nancy Cox, MD, director of the influenza division and the WHO collaborating center at the US Centers for Disease Control and Prevention (CDC) in Atlanta, and from veterinary virologist Ilaria Capua, of Italy's Istituto Zooprofilattico Sperimentale delle Venezie (IZSV) in Padua.

Capua had previously said in scientific forums that sequence data for H5N1 avian influenza virus strains should be shared immediately in publicly accessible databases so that researchers around the world can more quickly track the virus's movement and evolution. Her proposals represented a departure from the policies that have kept some H5N1 viruses in more protected databases because of governmental secrecy, research publishing constraints, and intellectual property considerations.

Cox’s team and those of the other three World Health Organization (WHO) collaborating centers that were caught in the middle of the data-sharing debate decided to support the GISAID concept by providing the scientific expertise and some initial financial backing from the CDC to get GISAID’s EpiFlu database off the ground. However, designing the structure of the new database also required the involvement of researchers from other national influenza centers and the leading H5 influenza veterinary laboratories, GISAID officials said.

"Together they are effectively the architects of our database, given that no bioinformatics group by itself could even begin to design such a system without the experience of these influenza experts," the GISAID spokeswoman said.

Following the request by governments of countries hit by avian influenza, GISAID also developed on its platform a system for tracking actual samples of H5N1 and other flu strains with pandemic potential. The tracking system is designed to mitigate transparency concerns that have spurred intellectual property rights controversy and moved some developing countries to demand greater access to pandemic vaccines in return for the viral isolates they share.

GISAID’s platform was created and is maintained by the Max-Planck-Institute for Informatics in Saarbruecken, Germany. The Swiss Institute of Bioinformatics developed programming for the influenza database.

System yields practical benefits
GISAID’s EpiFlu went live on May 15, 2008, with Indonesia supporting the project and promising to share its H5N1 sequences, and quickly achieved important milestones. Alexander Klimov, PhD, ScD, chief of the CDC's influenza surveillance and diagnosis branch, said several of the WHO collaborating centers used the database in September 2008 to make their recommendation for the southern hemisphere's 2009 seasonal flu vaccine, and all of the centers used it in February to make the recommendation for the northern hemisphere's 2009-10 vaccine.

He said GISAID’s EpiFlu database is more comprehensive than others the group has used and has search tools and filters that allow for more precise data analysis.

Catherine Smith, sequence activity officer in the CDC's flu division, told CIDRAP News that GISAID's database combines several features that scientists have long hoped for. She said researchers have sought more flexibility, such as ways to include information about antiviral resistance with the sequence information.

Although the EpiFlu database is open to the public and free of charge, its quest for transparency requires users to identify themselves and to cite the original source of the specimen and submitting laboratory of the data in their manuscripts. Users are also encouraged to collaborate with representatives of the originating laboratories and, most important, to refrain from imposing any restrictions that might preclude others from freely accessing and using all the data. By comparison, other public databases do not offer the same measures to ensure openness and crediting of sources.

Submitting sequences is quick and easy
Loading sequences is quick and easy, and several can be submitted at once. "It was mind-boggling before, but now it's nothing to dread," Smith said. When a user adds a sequence, the database program performs a curation step by verifying that the sequence is a functional protein and comparing it to subtypes in GenBank. The program generates a unique accession number for each sequence submitted, which is critical when submitting a manuscript for publication. If necessary, with a single click, users can automatically upload their sequences to GenBank at the same time they submit them to EpiFlu, which avoids duplicating the entry.
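GISAID has not published the details of that curation step, so the following is only a rough sketch of what a check for a "functional protein" might involve: confirming that a submitted nucleotide sequence reads as one uninterrupted coding region. The function name and the specific rules (a start codon, whole codons only, no internal stop codon) are assumptions made for illustration and are not drawn from EpiFlu itself.

```python
# Hypothetical sketch; not EpiFlu's actual curation logic.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def looks_like_coding_sequence(seq: str) -> bool:
    """Return True if seq reads as one uninterrupted open reading frame."""
    # Accept RNA-style input by mapping U to T.
    seq = seq.upper().replace("U", "T")

    # Need at least a start codon plus one more codon, in whole codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False

    # Reject ambiguous or invalid bases (assumption: only A, C, G, T allowed).
    if set(seq) - set("ACGT"):
        return False

    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]

    # The reading frame must open with the standard start codon.
    if codons[0] != "ATG":
        return False

    # A stop codon may appear only as the final codon.
    return not any(codon in STOP_CODONS for codon in codons[:-1])
```

A real pipeline would likely also tolerate ambiguous bases and compare the sequence against known subtypes, as the article notes the database does with GenBank.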

Isabella Monne, a researcher from IZSV in Padua who has used the EpiFlu database, said other useful features include the ability to add new information, such as clinical and epidemiologic data, about sequences that have already been submitted. She said another major benefit is that other users have already agreed on the principles of trust and respect for intellectual property.

Collecting sequences in one place from humans, animals, and the environment "will allow us to 'join the dots' from an epidemiological point of view," Monne said. "In addition, we will have a real-time grasp on the occurrence of mutations which are of relevance to public health, such as virulence markers and antiviral resistance, and this will allow scientists to study these mutations and policymakers to decide accordingly on pandemic preparedness."

Monne said the 2006 rallying call for greater virus sharing has dramatically boosted the number of sequences in all public avian influenza databases. "I believe that now there is a new awareness about the importance of sharing for improved global public health," she added. "Certainly, the major challenge we have is to continue to promote sharing and find incentives for scientists that share."

Also, the database has searchable fields. For example, when working on the flu vaccine recommendation, scientists were able to search the viral sequences by country over a certain time frame. "There's no other database that lets you do that," she said. The volume of sequences that researchers can compare has provided powerful benefits to researchers, she added. "A conclusion [previously] based on 5 samples can now be a conclusion based on 70 samples."

Tracking virus samples' travels
The virus tracking application, which has been offered to the WHO free of charge, is ready to use, and scientists are currently exploring its features, GISAID confirmed to CIDRAP News.

The tracking system was developed by the German-based Kisters AG with the support of members of the GISAID community after an appeal from WHO member states in 2007. Countries supplying H5N1 isolates wanted a secure and transparent mechanism for continuously monitoring a specimen's chain of custody after it is submitted to a WHO collaborating center or H5 reference laboratory for confirmation and risk analysis.

After countries affected by H5N1 viruses send specimens from a living or deceased patient or animal to the laboratories, the WHO collaborating centers ship samples to research institutes and vaccine manufacturers that request them.

The tracking system creates a record of where the sample goes by sending automatic e-mail notifications. "Many countries want to have some ownership and a way to give credit," Smith said. Another benefit for researchers is that the tracking system contains ample contact information in case questions arise about the sample. "There's a lot of potential for transparency. For example, there's a mapping tool to see where the sample or copies have traveled contained within the application," she said.

Though progress has been slow in solving virus-sharing issues, members of the WHO's intergovernmental virus-sharing group in a report following their May 15 and 16 meeting acknowledged the importance of transparency and the need for mechanisms such as those on GISAID's platform. A week earlier at an Association of South East Asian Nations (ASEAN) meeting in Bangkok, the group recognized GISAID for encouraging the sharing of influenza genetic data, as well as the CDC for its contributions during the novel H1N1 outbreak.

See also:

GISAID Platform

Aug 31, 2006, GISAID letter in Nature

Aug 25, 2006, CIDRAP News story "Scientists launch effort to share avian flu data"

May 19, 2008, CIDRAP News story "Experts welcome Indonesia's vow to share H5N1 data"
