NIH: 2,000 flu viruses sequenced, data published

Feb 22, 2007 (CIDRAP News) – Scientists with the National Institutes of Health (NIH) have finished mapping the genomes of more than 2,000 human and avian influenza viruses, an achievement that will help efforts to develop new flu vaccines and drugs, the NIH said yesterday.

The genetic data harvested in the Influenza Genome Sequencing Project have been deposited in GenBank, an Internet-accessible public database, the NIH said. The project is an effort of the National Institute of Allergy and Infectious Diseases (NIAID).

"Scientists around the world can use the sequence data to compare different strains of the virus, identify the genetic factors that determine their virulence, and look for new therapeutic, vaccine and diagnostic targets," NIH Director Elias Zerhouni, MD, said in a news release.

The sequencing project was launched in 2004, when limited genetic information on flu viruses was publicly available, Maria Y. Giovanni, PhD, who oversees the NIAID Microbial Sequencing Centers, commented in the news release. She said the project has vastly increased the sequence data available.

"Subsequently, there has been a marked increase in the number of scientists worldwide depositing influenza genome sequence data into the public domain including scientists at St. Jude Children's Research Hospital [in Memphis] and the Centers for Disease Control and Prevention," Giovanni said.

The project is being conducted at the NIAID-funded Microbial Sequencing Center managed by The Institute for Genomic Research, of Rockville, Md. The center has recently increased its sequencing capacity to more than 200 viral genomes per month, the NIH said.

Most of the viruses sequenced so far have been human ones, but about 100 avian viruses that are in the public domain are being mapped as well, Giovanni told CIDRAP News in an interview.

She said the project has sequenced about three dozen H5N1 avian flu viruses, but all are low-pathogenic strains. "We don't have access right now to the [highly pathogenic] ones that are circulating" in Asia, she added.

Giovanni said the goal of the project from the start was to publish complete viral sequence data, not just sequences for particular proteins such as hemagglutinin and neuraminidase.

"Within 45 days of completion everything gets put in GenBank for everyone to have access to," she said. "We're hoping that we're actually starting a trend. If we put it in GenBank and somebody else puts data in, then you have this incredibly rich source of data that's very diverse."

To obtain viral isolates for sequencing, the NIAID has advertised the project at conferences and through word of mouth, according to Giovanni and Karen Lacourciere, PhD, an influenza program officer in the NIAID's respiratory disease branch.

Lacourciere said putting flu virus genetic data in the public domain has become "a very hot issue" in the flu community in the past year. "As a result, more and more people have become aware of this project and have come to us," she said. "With the threat of a pandemic, I think people are recognizing the importance of sharing data rather than saving it for their own publication."

Giovanni said it's difficult to estimate how many research studies have used sequence data harvested by the project, since GenBank contains data from the NIAID and other sources. "We know right now of two publications that have been based on this data, but I'm sure more will come," she commented.

The project has received viruses from industry, academic institutions, international organizations, and other government agencies, she said. There is no cost to the submitting scientist or agency.

"We've pretty much sequenced everything submitted," she said. The process for a given isolate takes a few months.

To help interpret the data generated by the project, the NIAID has funded the BioHealthBase Bioinformatics Resource Center, according to the news release. The center provides scientists with "software tools and a robust point-of-entry for accessing influenza genomic and related data in a user-friendly format."

The bioinformatics center is being developed by researchers at the University of Texas Southwestern Medical Center in Dallas and specialists at Northrop Grumman Information Technology's Life Sciences Division in Rockville, Md., the NIAID said.

See also:

Feb 21 NIAID news release, with links to GenBank and other resources

Feb 16 CIDRAP News story "Indonesia to resume sharing H5N1 samples with WHO"
