The substitution rate of HIV-1 subtypes: a genomic approach

Quantifying Differences in the Tempo of Human Immunodeficiency Virus Type 1 Subtype Evolution

Journal of Virology, 2009

Human immunodeficiency virus type 1 (HIV-1) genetic diversity, due to its high evolutionary rate, has long been identified as a main cause of problems in the development of an efficient HIV-1 vaccine. However, little is known about differences in evolutionary rate between different subtypes. In this study, we collected representative samples of the main epidemic subtypes and circulating recombinant forms (CRFs), namely, sub-subtype A1, subtypes B, C, D, and G, and CRFs 01_AE and 02_AG. We analyzed separate data sets for pol and env. We performed a Bayesian Markov chain Monte Carlo relaxed-clock phylogenetic analysis and applied a codon model to the resulting phylogenetic trees to estimate nonsynonymous (dN) and synonymous (dS) rates along each and every branch. We found important differences in the evolutionary rates of the different subtypes. These are due to differences not only in the dN rate but also in the dS rate, varying in roughly similar ways, indicating that these differen...

Selecting models of nucleotide substitution: an application to human immunodeficiency virus 1 (HIV-1)

2001

The blind use of models of nucleotide substitution in evolutionary analyses is a common practice in the viral community. Typically, a simple model of evolution like the Kimura two-parameter model is used for estimating genetic distances and phylogenies, either because other authors have used it or because it is the default in various phylogenetic packages. Using two statistical approaches to model fitting, hierarchical likelihood ratio tests and the Akaike information criterion, we show that different viral data sets are better explained by different models of evolution. We demonstrate our results with the analysis of HIV-1 sequences from a hierarchy of samples: sequences within individuals, individuals within subtypes, and subtypes within groups. We also examine results for three different gene regions: gag, pol, and env. The Kimura two-parameter model was not selected as the best-fit model for any of these data sets, despite its widespread use in phylogenetic analyses of HIV-1 sequences. Furthermore, the model complexity increased with increasing sequence divergence. Finally, the molecular-clock hypothesis was rejected in most of the data sets analyzed, throwing into question clock-based estimates of divergence times for HIV-1. The importance of models in evolutionary analyses and their repercussions on the derived conclusions are discussed.
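As an illustration of the two model-fitting approaches named in this abstract, the following minimal Python sketch computes AIC scores for several substitution models and a likelihood ratio test between two nested ones. The log-likelihood values are hypothetical placeholders, not results from the paper, and the parameter counts cover only the free substitution-model parameters (branch lengths are assumed common to all models). The chi-square tail probability is implemented for one degree of freedom only, which matches the JC69 versus K2P comparison.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic for a null model nested in the alternative."""
    return 2.0 * (lnl_alt - lnl_null)

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical (log-likelihood, free substitution parameters) per model.
models = {
    "JC69": (-5230.0, 0),
    "K2P": (-5105.0, 1),
    "HKY85": (-5060.0, 4),
    "GTR": (-5041.0, 8),
}

# Rank all models by AIC.
scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
best = min(scores, key=scores.get)

# JC69 is nested in K2P and differs by one parameter, so df = 1.
stat = lrt_statistic(models["JC69"][0], models["K2P"][0])
p_value = chi2_sf_df1(stat)

print(best, stat, p_value)
```

With real data, the log-likelihoods would come from fitting each model to the same alignment and tree; comparisons with more than one degree of freedom need a general chi-square routine.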

Molecular evolution methods to study HIV-1 epidemics

Future Virology

Nucleotide sequences of HIV isolates are obtained routinely to evaluate the presence of resistance mutations to antiretroviral drugs. But, beyond their clinical use, these and other viral sequences include a wealth of information that can be used to better understand and characterize the epidemiology of HIV in relevant populations. In this review, we provide a brief overview of the main methods used to analyze HIV sequences, the databases where reference sequences can be obtained, and some caveats about the possible applications of these analyses for public health, along with some considerations about their limitations and correct usage to derive robust and reliable conclusions.

Executive summary

• The HIV pandemic affects millions of people worldwide, and there is an increasing trend of infections among vulnerable groups, such as MSM.
• The combination of epidemiological, molecular and evolutionary tools is relevant to depict HIV epidemics. In this way, molecular epidemiology can help to develop detection and prevention campaigns focusing on the populations most vulnerable to HIV infection.
• This review aims to highlight the most relevant uses of molecular epidemiology, giving an overview of the most common approaches used for understanding HIV epidemics. Consequently, it may be useful for designing correct protocols in HIV epidemiological studies.

The Molecular Population Genetics of HIV-1 Group O

Genetics, 2004

HIV-1 group O originated through cross-species transmission of SIV from chimpanzees to humans and has established a relatively low prevalence in Central Africa. Here, we infer the population genetics and epidemic history of HIV-1 group O from viral gene sequence data and evaluate the effect of variable evolutionary rates and recombination on our estimates. First, model selection tools were used to specify suitable evolutionary and coalescent models for HIV group O. Second, divergence times and population genetic parameters were estimated in a Bayesian framework using Markov chain Monte Carlo sampling, under both strict and relaxed molecular clock methods. Our results date the origin of the group O radiation to around 1920 (1890-1940), a time frame similar to that estimated for HIV-1 group M. However, group O infections, which remain almost wholly restricted to Cameroon, show a slower rate of exponential growth during the twentieth century, explaining their lower current prevalence. To explore the effect of recombination, the Bayesian framework is extended to incorporate multiple unlinked loci. Although recombination can bias estimates of the time to the most recent common ancestor, this effect does not appear to be important for HIV-1 group O. In addition, we show that evolutionary rate estimates for different HIV genes accurately reflect differential selective constraints along the HIV genome.

Using physical-chemistry-based substitution models in phylogenetic analyses of HIV-1 subtypes

Molecular Biology and Evolution, 1999

HIV-1 subtype phylogeny is investigated using a previously developed computational model of natural amino acid site substitutions. This model, based on Boltzmann statistics and Metropolis kinetics, involves an order of magnitude fewer adjustable parameters than traditional substitution matrices and deals more effectively with the issue of protein site heterogeneity. When optimized for sequences of HIV-1 envelope (env) proteins from a few specific subtypes, our model is more likely to describe the evolutionary record for other subtypes than are methods using a single substitution matrix, even a matrix optimized over the same data. Pairwise distances are calculated between various probabilistic ancestral subtype sequences, and a distance matrix approach is used to find the optimal phylogenetic tree. Our results indicate that the relationships between subtypes B, C, and D and those between subtypes A and H may be closer than previously thought.

Evidence for a Recombinant Origin of HIV-1 group M from Genomic Variation

Reconstructing the early dynamics of the HIV-1 pandemic can provide crucial insights into the socioeconomic drivers of emerging infectious diseases in human populations, including the roles of urbanization and transportation networks. Current evidence indicates that the global pandemic, consisting almost entirely of HIV-1/M, originated around the 1920s in central Africa. However, these estimates are based on molecular clock estimates that are assumed to apply uniformly across the virus genome. There is growing evidence that recombination has played a significant role in the early history of the HIV-1 pandemic, such that different regions of the HIV-1 genome have different evolutionary histories. In this study, we have conducted a dated-tip analysis of all near full-length HIV-1/M genome sequences that were published in the GenBank database. We used a sliding window approach similar to the bootscanning method for detecting breakpoints in intersubtype recombinant sequences. We found evi...
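The sliding window idea mentioned in this abstract can be sketched as follows. This is only a generic illustration: the window size, step, function names, and toy sequences are assumptions, not values or code from the study, and the per-window phylogenetic analysis itself is left out.

```python
def sliding_windows(alignment_length, window, step):
    """Yield (start, end) half-open column ranges that fit inside the alignment."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start + window <= alignment_length:
        yield (start, start + window)
        start += step

def slice_alignment(seqs, start, end):
    """Cut the same column range out of every aligned sequence."""
    return {name: seq[start:end] for name, seq in seqs.items()}

# Toy aligned sequences (hypothetical), all of equal length.
alignment = {
    "seqA": "ACGTACGTACGT",
    "seqB": "ACGTTCGTACGA",
}

# Each sub-alignment would be passed to a separate tree or dating analysis.
sub_alignments = [
    slice_alignment(alignment, s, e)
    for s, e in sliding_windows(len(alignment["seqA"]), window=6, step=3)
]
```

Windows that would run past the end of the alignment are dropped here; a real analysis might instead pad or shorten the final window.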

Unravelling the complicated evolutionary and dissemination history of HIV-1M subtype A lineages

Subtype A is one of the rare HIV-1 group M (HIV-1M) lineages that is both widely distributed throughout the world and persists at high frequencies in the Congo Basin (CB), the site where HIV-1M likely originated. This, together with its high degree of diversity, suggests that subtype A is amongst the fittest HIV-1M lineages. Here we use a comprehensive set of published near full-length subtype A sequences and A-derived genome fragments from both circulating and unique recombinant forms (CRFs/URFs) to obtain some insights into how frequently these lineages have independently seeded HIV-1M sub-epidemics in different parts of the world. We do this by inferring when and where the major subtype A lineages and subtype A-derived CRFs originated. Following its origin in the CB during the 1940s, we track the diversification and recombination history of subtype A sequences before and during its dissemination throughout much of the world between the 1950s and 1970s. Collectively, the timings and numbers of detectable subtype A recombination and dissemination events, the present broad global distribution of the sub-epidemics that were seeded by these events, and the high prevalence of subtype A sequences within the regions where these sub-epidemics occurred, suggest that ancestral subtype A viruses (and particularly sub-subtype A1 ancestral viruses) may have been genetically predisposed to become major components of the present epidemic.

Rates and dates of divergence between AIDS virus nucleotide sequences

Molecular Biology and Evolution, 1988

The acquired immune deficiency syndrome (AIDS), caused by a retrovirus called human immunodeficiency virus (HIV), has become a pandemic. A knowledge of the rate of nucleotide substitution in HIV and of the history and pattern of spread of the virus is important for understanding the epidemiology and pathogenesis of AIDS and for developing therapies and vaccine strategies. A new model has been developed and used to estimate the substitution rates in various regions in the HIV genome. The rate of nonsynonymous (amino acid-changing) substitution is lowest in the regions coding for the capsid proteins and the reverse transcriptase, being approximately 1.7 × 10⁻³ nucleotide substitutions/site/year. The nonsynonymous rate is extremely high (14 × 10⁻³) in the hypervariable regions of the envelope gene, suggesting extremely rapid change in viral antigenicity. The nonsynonymous rates in the other coding regions are between 3 × 10⁻³ and 7 × 10⁻³. The average synonymous rate for the HIV...
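To give a sense of scale for the rates quoted in this abstract, the short Python sketch below compares the two nonsynonymous rates and converts a rate into an expected pairwise divergence. The ten-year time span is a hypothetical input, and the relation d = 2rt is the simple constant-rate approximation for two lineages descending from a common ancestor, ignoring multiple substitutions at the same site; it is not the model developed in the paper.

```python
# Nonsynonymous rates quoted in the abstract (substitutions/site/year).
RATE_CAPSID_RT = 1.7e-3   # capsid proteins and reverse transcriptase
RATE_ENV_HYPERVAR = 14e-3  # hypervariable regions of the envelope gene

def expected_pairwise_divergence(rate, years):
    """Expected substitutions per site between two lineages that split
    `years` ago, assuming a constant rate and no multiple hits: d = 2 * r * t."""
    return 2.0 * rate * years

# How much faster the hypervariable env regions change than capsid/RT.
fold_difference = RATE_ENV_HYPERVAR / RATE_CAPSID_RT

# Hypothetical example: two lineages separated for 10 years.
divergence_10y = expected_pairwise_divergence(RATE_CAPSID_RT, 10)

print(round(fold_difference, 1), divergence_10y)
```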

Inconsistencies in estimating the age of HIV-1 subtypes due to heterotachy

Molecular Biology and Evolution, 2012

Rate heterogeneity among lineages is a common feature of molecular evolution, and it has long impeded our ability to accurately estimate the age of evolutionary divergence events. The development of relaxed molecular clocks, which model variable substitution rates among lineages, was intended to rectify this problem. Major subtypes of pandemic HIV-1 group M are thought to exemplify closely related lineages with different substitution rates. Here, we report that inferring the time of most recent common ancestor of all these subtypes in a single phylogeny under a single (relaxed) molecular clock produces significantly different dates for many of the subtypes than does analysis of each subtype on its own. We explore various methods to ameliorate this problem. We conclude that current molecular dating methods are inadequate for dealing with this type of substitution rate variation in HIV-1. Through simulation, we show that heterotachy causes root ages to be overestimated.

Novel Evolutionary Analyses of Full-Length HIV Type 1 Subtype C Molecular Clones from Cape Town, South Africa

AIDS Research and Human Retroviruses, 2002

Understanding the origin, distribution, and evolving dominance of HIV-1 subtype C strains is an important component in the design and evaluation of a globally effective AIDS vaccine. To better understand subtype C viruses, we constructed complete molecular clones of primary, CCR-5-using isolates from South Africa and analyzed the molecular phylogenies of these clones using best fitting evolutionary substitution models. Analyses were performed on three full-length sequences, and on the individual genes. All clones were nonrecombinant, and although two of three had open reading frames and intact splice sites, they were not infectious. At the genomic level, the models demonstrated the increasing variability of subtype C in South Africa. At the subgenomic level, they revealed marked differences in the evolutionary patterns of individual genes, a finding that suggests that the genes are under different selective pressures and constraints. These data underscore the dynamic nature of the subtype C epidemic and emphasize the need for continuous monitoring of local strains.