Journal of Virology, June 2009, p. 5485-5494, Vol. 83, No. 11
0022-538X/09/$08.00+0 doi:10.1128/JVI.02565-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland,1 Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, Pennsylvania,2 Virology Department, Veterinary Laboratories Agency—Weybridge, Addlestone, Surrey, United Kingdom,3 Fogarty International Center, National Institutes of Health, Bethesda, Maryland4
Received 12 December 2008/ Accepted 10 March 2009
The processes by which avian IAVs stably switch hosts and acquire mutations that facilitate replication and efficient transmission in a new host species are fundamental to understanding the ecology of these viruses but are also of critical importance to public health and veterinary preparedness. IAVs from the genetically and antigenically divergent avian reservoir pool have been associated with stable host switch events to novel host species, including humans, swine, domestic poultry, and horses (1, 55, 61). The last three human influenza pandemic viruses all contained two or more novel genes that were very similar to those found in IAVs of wild birds, derived either by reassortment with circulating human strains in formation of the 1957 and 1968 pandemic viruses (23, 47) or possibly by whole-genome adaptation in the case of the 1918 pandemic virus (18, 36, 43, 60). Other novel influenza viruses derived by stable host switching from avian influenza viruses have also been isolated recently from pigs, including other independent introductions of A/H1N1 influenza viruses in China (19), A/H4N6 influenza viruses in Canada (22), and most recently, A/H2N3 influenza viruses in the United States (27). Similarly, a stable lineage of A/H3N8 influenza virus emerged in dogs in the United States following a host switch event without reassortment from the equine A/H3N8 lineage (9). The present concern that an avian influenza virus, especially the currently circulating lineages of highly pathogenic avian H5N1 influenza virus, could initiate a new pandemic if the virus stably adapts to humans is also a question of considerable biomedical importance (62). Together, these examples demonstrate that reassortment is not a prerequisite for IAV emergence in novel hosts.
Swine have been hypothesized to be the mixing vessel in which avian and human IAVs reassort, resulting in the emergence of novel human pandemic influenza virus strains (2, 46). However, direct or experimental data linking swine as intermediaries in the emergence of past pandemics are lacking. Swine are the only animals documented to be susceptible to infection with avian, swine, and human IAVs (2), and coinfections with both avian and human IAVs have been reported (3, 5, 20, 49, 64). This has been attributed to the fact that swine tracheal epithelium expresses both α2,3 (avian IAV preferred)- and α2,6 (mammalian IAV preferred)-N-acetylneuraminic acid-galactose-linked receptors (17), and it is believed that avian IAVs adapted to swine undergo a shift from α2-3 to α2-6 binding, a critical step required in the adaptation of an avian virus to a human host (52). A subset of amino acids that are invariant in all avian hemagglutinin (HA) subtypes but vary in mammalian-adapted HAs has been identified (29). It is possible that this set of mutations (or a subset thereof) plays an important role in the adaptation of avian IAVs to swine.
Whether common genetic changes are associated with the adaptation to specific host species, such that they are predictive of future events, or if genetic changes are made up of unique constellations of mutations that occur independently in each host switch event is an important question. The process by which the 1918 pandemic A/H1N1 influenza virus emerged and adapted to both humans and swine is not yet fully elucidated, although the virus is avian-like in both its coding sequences (58, 60) and nucleotide composition (36). The European avian-like swine A/H1N1 viruses emerged independently of the 1918 pandemic virus from an avian-like source (34, 45, 49). We therefore sought to compare changes that might be associated with mammalian adaptation between these two swine H1N1 lineages.
To address whether the two swine H1N1 lineages were evolving in parallel, as might be expected given their common host species, we examined patterns of base composition variation to determine relative nucleotide usages. We examined in detail the amino acid sites that had previously been reported as important for mammalian adaptation to determine whether these mutations appeared as parallel genetic changes, and therefore were always required for avian H1N1 IAV to adapt to pigs, or whether there is more flexibility in the adaptive process.
Sequence analysis. In addition to the 17 European swine IAV genomes sequenced for this study, 3 European swine avian-like H1N1 genomes, 38 classical swine H1N1 IAV genomes, other available swine H1N1 full-length gene sequences, and 81 human A/H1N1 virus genomes were downloaded from the Influenza Virus Resource (http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html) and/or GenBank. The genome sequences of the 1918 pandemic IAV and representative Eurasian avian influenza virus sequences were used to infer the ancestry of the two swine lineages. Whole genome sequences were not available for the Eurasian avian viral sequences analyzed. Therefore, the NCBI BLAST database was used to find highly similar avian sequences for each European swine influenza virus gene segment. Consensus sequences were derived from the Eurasian avian IAV sequences for each segment. Within these data, we compiled separate manually aligned gene segments (coding regions) using Se-Al (36a). Sequence alignments consisted of the following coding regions for each segment: 197 PB2 (2,277 nucleotides [nt]) sequences, 203 PB1 (2,271 nt) sequences, 198 PA (2,148 nt) sequences, 214 HA (1,218 nt) sequences, 234 NP (1,494 nt) sequences, 231 NA (1,410 nt) sequences, 193 M1/2 (1,002 nt) sequences, and 224 NS1/2 (831 nt) sequences. Sequence alignments are available upon request.
Phylogenetic analysis. The best-fit GTR + I + Γ4 model of nucleotide substitution was determined using ModelTest 3.7 (35a), and resulting parameter estimates were imported into PAUP* (56) to create maximum likelihood trees through tree bisection-reconnection branch swapping (parameter values available upon request). Whole genome sequences and a consensus Eurasian avian sequence were concatenated (in the absence of reassortment [see below]) to infer the evolutionary relationship of swine H1N1 IAVs. Individual gene segment phylogenetic trees are available in the supplemental material.
To estimate the rates of evolutionary change and the time to the most recent common ancestor (TMRCA), we applied a Bayesian Markov chain Monte Carlo approach available in the BEAST package (13), employing a relaxed (uncorrelated log normal) molecular clock in all cases (12). For each data set, we utilized the Bayesian skyline coalescent prior (as demographic history was a nuisance parameter in our analysis) with a 10% burn-in, assuming a GTR + I + Γ4 model of nucleotide substitution. Uncertainty in parameter estimates is reflected in the 95% highest probability density (HPD) values, and all chains were run for sufficient length to ensure convergence, as assessed using the TRACER program (http://tree.bio.ed.ac.uk/software/tracer/). For the estimates of TMRCA, the most recent sequence used for a classical swine H1N1 virus was from 1991 (A/swine/Maryland/23239), and that for a European swine H1N1 virus was from 2004 (A/swine/Spain/53207/2004). Selective pressures on codon sites were estimated along the branches of the swine H1N1 phylogenetic trees for all eight genes, using Datamonkey (35; http://www.datamonkey.org/). The best-fit codon model was fitted to the data by using parameters obtained from the best-fit nucleotide substitution model. A 1-df likelihood ratio test was applied to the data to determine whether the instantaneous rates of synonymous (α) and nonsynonymous (β) substitutions differ and whether this difference is based on α > β (negative selection) or α < β (positive selection) and is significant.
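The 1-df likelihood ratio test described above can be illustrated with a minimal sketch. It is not the Datamonkey implementation: the function names, the log-likelihood inputs, and the 0.05 threshold are assumptions made here for illustration. It assumes the null model constrains the synonymous rate α to equal the nonsynonymous rate β at a site, and that the test statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Return (statistic, p_value) for a 1-df likelihood ratio test.

    lnl_null: log-likelihood with alpha constrained equal to beta.
    lnl_alt:  log-likelihood with alpha and beta estimated freely.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def classify_site(alpha, beta, p_value, threshold=0.05):
    """Label a codon site from its rate estimates (threshold is assumed)."""
    if p_value >= threshold:
        return "neutral"
    return "negative" if alpha > beta else "positive"


# Hypothetical values, for illustration only.
stat, p = lrt_one_df(lnl_null=-100.0, lnl_alt=-98.0)
print(stat, round(p, 4), classify_site(alpha=1.0, beta=0.2, p_value=p))
```

A site is reported as negatively or positively selected only when the test is significant; otherwise the direction of the difference between α and β is not interpreted.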
Analysis of base composition. With the exception of the Eurasian avian sequences, for which insufficient whole H1N1 genome sequences were available, genome sequences were used to calculate the base compositions of all lineages. A method similar to that of Shultes et al. (48) was used to calculate the frequencies of GU (G+U), GA (G+A), and GC (G+C) across all eight gene segments for the European swine, classical swine, human H1N1, Eurasian avian, and 1918 H1N1 IAV lineages. The base compositional space of these IAV genes was thus defined unambiguously by three parameters for each gene segment: GU (frequency of G plus frequency of U), GA (frequency of G plus frequency of A), and GC (frequency of G plus frequency of C). Base composition frequencies of complete sequences and of first and second codon positions were calculated using the PAUP* package (56). Base compositional data were then plotted using the R statistical program, version 2.7.0 (2008). The third-position GC content for each gene segment of swine IAV was measured using the GCUA (General Codon Usage Analysis) package (30).
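The three compositional parameters can be computed directly from a coding sequence. The following is a minimal sketch, not the PAUP* procedure used in the study. The function names are hypothetical, T in cDNA sequences is treated as U, and ambiguous bases are assumed to be excluded from the totals.

```python
from collections import Counter


def base_composition(seq):
    """Return the G+U, G+A, and G+C frequencies of a nucleotide sequence."""
    s = seq.upper().replace("T", "U")  # treat cDNA T as RNA U
    counts = Counter(b for b in s if b in "ACGU")  # skip ambiguous bases
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    g = counts["G"]
    return {
        "GU": (g + counts["U"]) / total,
        "GA": (g + counts["A"]) / total,
        "GC": (g + counts["C"]) / total,
    }


def codon_positions(seq, positions=(1, 2)):
    """Extract the bases at the given codon positions (1, 2, or 3).

    Assumes seq is an in-frame coding region starting at codon position 1.
    """
    wanted = {p - 1 for p in positions}
    return "".join(b for i, b in enumerate(seq) if i % 3 in wanted)


# Composition of a whole (toy) coding region and of its first and second
# codon positions only.
cds = "ATGGCA"
print(base_composition(cds))
print(base_composition(codon_positions(cds)))
```

Each gene segment then maps to one point in (GU, GA, GC) space, which is the representation plotted by lineage.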
Amino acid analysis. Amino acid differences among the lineages were recorded as changes at amino acid sites compared to the putative ancestral sequence. Given its location at the root of the human and classical swine H1N1 clades, we used the 1918 Brevig Mission sequence as the ancestral sequence to infer changes for classical swine and human H1N1 viruses. In the case of the European swine H1N1 viruses, we collected highly similar Eurasian avian sequences from 1977 to 1998 for each gene segment to infer amino acid changes in European swine H1N1 IAVs.
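Recording changes against a putative ancestral sequence amounts to a site-by-site comparison of aligned proteins. The sketch below is illustrative only. The function name and the `offset` parameter are hypothetical, and the simple offset does not reproduce H3 numbering, which requires a mapping between subtype alignments.

```python
def amino_acid_changes(ancestral, derived, offset=1):
    """List substitutions such as 'E190D' between two aligned proteins.

    Both sequences must come from the same alignment (equal length).
    Positions with a gap ('-') or unknown residue ('X') are skipped.
    offset is the residue number assigned to the first column.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to equal length")
    changes = []
    for i, (a, d) in enumerate(zip(ancestral, derived)):
        if a != d and a not in "-X" and d not in "-X":
            changes.append(f"{a}{i + offset}{d}")
    return changes


# Toy aligned fragments, for illustration only.
print(amino_acid_changes("MKAE", "MKTD"))  # ['A3T', 'E4D']
```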
Nucleotide sequence accession numbers. Sequences generated for this analysis have been deposited in GenBank (accession numbers CY037895 to CY038027).
FIG. 1. Maximum likelihood tree of concatenated genome sequences (54 whole genomes) of European and classical swine H1N1 IAVs. Horizontal branch lengths are drawn to scale (nucleotide substitutions per site). Bootstrap values (>75%) are shown next to the relevant nodes. The tree is midpoint rooted for clarity only. Classical swine H1N1 viruses are in blue, and European swine H1N1 viruses are in red.
TABLE 1. Parameter estimates under the uncorrelated logistic Bayesian demographic model
TABLE 2. Key unique conserved changes in European avian-like swine influenza viruses
Changes in HA and NA. Classical swine strains maintain the critical HA receptor binding domain mutation E190D (H3 numbering), as do the majority of European avian-like swine strains (A/swine/Netherlands/3/80 retains the avian consensus glutamic acid). Classical swine strains possess the avian glycine at 225, whereas this receptor binding domain residue is variable in European avian-like swine strains and includes the avian 225G, but also G225E and G225K (see Table S1 in the supplemental material). Interestingly, European avian-like swine strains also show variability in receptor binding residues 135, 137, and 138, unlike classical swine strains, which maintain the avian consensus at these sites. Thus, some European avian-like swine strains show V135I/A/S/T, A137I/V, and A138S changes. We also found evidence of positive selection at residue 145 (see Table S1 in the supplemental material).
European avian-like swine strains also show changes in or near mapped antigenic site regions in human H1, as previously reported (4). Most of these changes are in or near the mapped Ca and Sb antigenic regions (6, 37). These viruses also lose two potential N-linked glycosylation sites which are conserved in avian H1 sequences and the 1918 virus (38). In more recent strains, residues 104 to 106 (NGT) become NGA, and in most European avian-like swine strains, residues 304 to 306 (NSS) become NSN. However, these strains gain two potential glycosylation sites at residues 212 to 214, where ADA becomes NHT (in the antigenic Sb region), and at residues 291 to 293, where NCD becomes NCT in most strains.
The neuraminidase (NA) of the European avian-like swine strains maintains the 15 conserved amino acids making up the active site of the enzyme (8), and no mutations associated with NA inhibitor resistance are observed. The NA also maintains the full-length stalk and the seven potential N-linked glycosylation sites predicted for the 1918 influenza virus (41). Some European avian-like swine strains gain an additional potential glycosylation site at residues 386 to 388, where SFS becomes NFS or NYS.
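The gains and losses of potential N-linked glycosylation sites described above follow from the standard sequon motif N-X-S/T, where X is any residue except proline. A minimal sketch of that check is given below. The function names are hypothetical, and the motif definition is the conventional one rather than a method stated in this study.

```python
import re

# Lookahead so that overlapping sequons (e.g., in "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def is_sequon(tripeptide):
    """True if a three-residue string matches N-X-S/T with X not proline."""
    return bool(re.fullmatch(r"N[^P][ST]", tripeptide))


def sequon_positions(protein, offset=1):
    """Return the residue numbers of the Asn in each potential sequon."""
    return [m.start() + offset for m in SEQUON.finditer(protein)]


# Tripeptide changes reported in the text for HA and NA.
for before, after in [("NGT", "NGA"), ("NSS", "NSN"), ("ADA", "NHT"),
                      ("NCD", "NCT"), ("SFS", "NFS"), ("SFS", "NYS")]:
    print(before, is_sequon(before), "->", after, is_sequon(after))
```

Applied to the changes listed in the text, NGT to NGA and NSS to NSN each remove a sequon, whereas ADA to NHT, NCD to NCT, and SFS to NFS or NYS each create one.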
Analysis of nucleotide compositional space of individual gene segments. To investigate changes in base composition through time, an indicator of the evolutionary processes that shape genetic diversity in influenza virus, the percent GC content at the third position of each codon of each gene segment was plotted over time (Fig. 2). The directionality of third-position GC content change is measured as the percent change over time from the ancestral sequence.
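The third-position measure can be sketched as follows. This is not the GCUA implementation. The function names are hypothetical, and "percent change from the ancestral sequence" is interpreted here as a relative change, which is an assumption; an absolute difference in percentage points would be an equally plausible reading.

```python
def gc3(seq):
    """Percent G+C at third codon positions of an in-frame coding sequence."""
    third = [b for b in seq.upper()[2::3] if b in "ACGTU"]
    if not third:
        raise ValueError("no unambiguous third-position bases")
    return 100.0 * sum(b in "GC" for b in third) / len(third)


def gc3_percent_change(ancestral, derived):
    """Relative change (%) in third-position GC content versus the ancestor."""
    base = gc3(ancestral)
    if base == 0:
        raise ValueError("ancestral GC3 is zero; relative change undefined")
    return 100.0 * (gc3(derived) - base) / base


# Toy sequences: the derived sequence has lost one third-position C.
print(gc3("ATGGCC"), gc3("ATGGCA"), gc3_percent_change("ATGGCC", "ATGGCA"))
```

A negative value indicates a decline in third-position GC content relative to the ancestral sequence.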
FIG. 2. Synonymous third-codon-position G+C contents over time for all eight genes across European and classical swine, Eurasian avian, 1918, and human H1N1 IAVs. Classical swine H1N1 viruses are in red, European swine H1N1 viruses are in blue, Eurasian avian virus sequences are in green, human H1N1 viruses are in light blue, and 1918 H1N1 virus is in black.
Some variation in nucleotide base compositional bias is expected for the eight gene segments, based on the different molecular functions of the gene products (which in turn affects amino acid usage). If base compositional bias for the eight gene segments is universal, such that all segments evolve in the same way irrespective of host, it is expected that no difference in the clustering of these genes in compositional space across the different A/H1N1 lineages over time would be observed. Nucleotide compositional analysis revealed that each gene segment has a unique clustering profile, revealing a powerful segment-specific bias (Fig. 3a). However, each A/H1N1 lineage within this overall bias subdivides the clustering profile in space, indicating that there is also a lineage-specific effect on gene composition. Furthermore, the 1918 sequence in general occupies contiguous compositional space with classical swine and human A/H1N1 lineages, as would be expected because classical swine and human A/H1N1 viruses are direct descendants of the 1918 virus. Similarly, the Eurasian avian and European swine A/H1N1 lineages occupy contiguous compositional space in this analysis. There is an overall lack of overlap in compositional space clustering between the two swine A/H1N1 lineages across most of the gene segments. The HA and MP genes show a greater spread in compositional space than do the other gene segments. Analysis of the first and second codon positions also revealed a considerable difference in nucleotide composition between the two swine H1N1 lineages (Fig. 3b). The compositional space is partitioned by a unique compositional profile with very little overlap. The Eurasian avian and European swine viruses showed more overlap for the polymerase genes. The NP and NS genes showed the most specific pattern for all lineages, with little or no overlap in compositional space. The HA gene segment profile showed the greatest spread, most likely indicative of antigenic drift. 
The base compositional results for M2 and NS2 (NEP) are shown in the supplemental material.
FIG. 3. (a) Overall nucleotide compositions of eight gene segments by H1N1 IAV lineages. Axes correspond to the frequencies of G+U, G+A, and G+C for each gene segment. Classical swine H1N1 viruses are in red, European swine H1N1 viruses are in blue, Eurasian avian virus sequences are in green, human H1N1 viruses are in light blue, and 1918 H1N1 virus is in black. (b) Nucleotide compositions of eight gene segments at the first and second codon positions. Axes correspond to the frequencies of G+U, G+A, and G+C for each gene segment. Classical swine H1N1 viruses are in red, European swine H1N1 viruses are in blue, Eurasian avian virus sequences are in green, human H1N1 viruses are in light blue, and 1918 H1N1 virus is in black.
The third-position base composition analysis revealed that each swine lineage is diverging from its putative ancestor by generally decreasing in GC content at the third codon position over time. The movement of human H1N1 viruses to a higher GC content for the HA gene implies that selection for antigenic differences may affect the trajectory of third-position GC content over time, although this will need to be explored in more detail. This trend is also reflected in the NP gene, which shows an overlap in both of the swine lineages, suggesting that both the HA and NP genes are highly host specific. Although synonymous changes at the third codon position can be attributed partially to neutral evolution (24, 31, 63), the similar patterns of decreasing GC content over time for all of the gene segments again argue for host specificity.
The nucleotide compositional analysis revealed several evolutionary patterns (Fig. 3a and b). First, each gene segment has a distinctive signature nucleotide compositional space profile. This trend strongly suggests that there are functional and structural constraints acting on each segment individually, which will clearly need to be explored further. However, within these segment-specific profiles, the compositional space is also partitioned by a lineage effect (Fig. 3). This strongly signifies that natural selection has played a major role in shaping nucleotide composition in IAV, reflective of the past history of each virus. Second, the wide distribution of points in the HA gene shows a strong host effect in compositional space, again likely driven by antigenic drift (such that selection for amino acid changes has a secondary effect on nucleotide composition). Third, the swine lineages share the same compositional space away from the human A/H1N1 and 1918 sequences, again supporting the idea that there is strong selection for host-specific antigenic change. The first- and second-codon-position compositional analysis revealed a more distinct pattern by clade. The classical swine and human H1N1 viruses show very little overlap, indicating host-specific compositional bias (Fig. 3b). The Eurasian avian and European swine clades overlap for the polymerase genes. This suggests that although European swine H1N1 viruses emerged almost 30 years ago, the polymerase genes are still very avian-like. Interestingly, the NP gene shows the most host-specific pattern, with no overlap among the groups. However, the lack of overlap between the two swine H1N1 lineages suggests that host specificity is contingent upon the history of the IAV. Similar to Rabadan et al. (36), we found that the 1918 virus is avian-like in the nucleotide composition of its gene segments.
The biased nucleotide composition at the first and second codon positions is reflective of nonsynonymous changes in amino acid usage. In the context of a single host switch event, the mutations identified in the 1918 influenza virus and subsequently maintained in human influenza viruses and in classical swine strains may represent a set of crucial functional changes from an ancestral avian IAV (15). However, the lack of parallel evolution in the independent emergence of the European avian-like swine strains suggests that the acquisition of a polygenic set of functional changes may be different between independent host switch events. The utility of identifying these mutations as proxies to define whether a future IAV is acquiring changes important in mammalian adaptation might be limited. For example, of the 10 amino acid changes identified in PB2 (15, 60), only more recent European avian-like swine strains share a single change from the avian consensus at one of these sites, at residue 271, with a T271I change. Crucially, they lack the PB2 E627K change (54), even in those strains isolated after 20 years of circulation in swine. Thus, this particular mutation may not be necessary for mammalian adaptation in general, or at least swine adaptation in particular. However, European avian-like swine strains do possess a D701N change from the avian consensus that may also play a role in mammalian adaptation (16), but they lack the K702R change associated with human PB2 genes and early classical swine H1N1 strains (60). Classical swine viruses after the mid-1970s reverted to the avian lysine at 702 but continued to possess the E627K change. The D701N change was observed as one of six changes after mouse adaptation of an avian H7N7 virus (16). None of the other five changes is observed in human, classical swine, or European avian-like swine lineages, and they may be specific to this mouse adaptation experiment. 
The D701N mutation has also been observed in a small minority of human H5N1 isolates but has been linked to increased pathogenicity in an experimental mouse H5N1 infection model (26). Recently, the structure of the C-terminal end of PB2 was resolved, and structural analysis suggests that this region of PB2 contains a nuclear localization signal and complexes with importin α5 (57). The structure suggests that the changes observed at residues 701 and 702 may be important in the interaction with importin α5, suggesting a biological explanation for changes at these sites associated with host switch events.
In the H1 subtype, only a single amino acid change, E190D (using the H3 numbering), is required to alter receptor specificity from α2-3 to gain the ability to bind α2-6 receptors (53). The 1918 pandemic viruses possessed HAs with two receptor-binding variants, either with a single E190D change from the avian consensus or with two changes, E190D plus G225D (42). The form with two changes is highly specific for α2-6 binding (51, 53). Both the 1918 pandemic virus and its derivatives and the European avian-like swine virus HAs have the E190D mutation crucial for α2-6 binding (53). This indicates that H1 subtype HAs involved in switching from an avian host to a mammalian host may need to acquire this particular mutation for stable host adaptation. Other changes observed in the receptor binding domains of the European avian-like swine viruses (at residues 135, 137, 138, and 145) may also play a role in altering receptor specificity, but this has not been evaluated experimentally.
In summary, our study demonstrates that we should consider the role of historical contingency, reflected in a strong lineage-specific effect, in the emergence of IAVs from an avian reservoir into a new mammalian host and that mutations identified as important in prior host switch events may or may not be observed in future such events. The host switch events leading to the emergence of the European avian-like swine lineage from birds and the recent emergence of a canine H3N8 IAV lineage derived from equine H3N8 viruses (9) demonstrate that even in the absence of reassortment, stable host adaptation can occur in IAVs by acquisition of crucial mutations.
Published ahead of print on 18 March 2009.
Supplemental material for this article may be found at http://jvi.asm.org/.