A computer rendered HIV-1 particle.[[dubrow-2014]]      

by John Berea
Published:  October 2017
Updated: May 3, 2018

HIV Evolution

"The human immunodeficiency virus... is one of the fastest evolving entities known."[[berkeley-2017]] "HIV shows stronger positive selection [having more beneficial mutations] than any other organism studied so far."[[rambaut-2004]]


Despite a huge number of mutations and strong natural selection, HIV has evolved very little since it first entered humans about 100 years ago. We've seen:

  1. A total of about 6x10^22 HIV viruses have existed within humans in the last 100 years.
  2. HIV has been through about 14,000 generations during this time.
  3. These HIV viruses underwent about 10^22 total mutations.
  4. During this time, about 5,000 or fewer constructive mutations have become fixed within the various HIV subtypes.

This, along with the lackluster evolution seen in other large-population microbes, suggests that evolution could not have created complex animals, especially considering that natural selection is much weaker in complex animals[[rambaut-2004]] [[lynch-2006]] [[sanford-2007]] and that their population sizes and mutation counts are much smaller, giving them less of a chance to evolve.

History and Groups

HIV came from SIV (simian immunodeficiency virus) in chimpanzees, which in turn came from SIV in monkeys.

SIV is a retrovirus that infects monkeys and apes. "More than 40 species of African monkeys are infected with their own, species-specific, SIV and in at least some host species, the infection seems non-pathogenic."[[sharp-2010]]

Chimpanzees "acquired from monkeys"[[sharp-2010]] two different forms of SIVs that combined to create a new strain. Unlike in some monkeys, SIV "increases mortality"[[sharp-2010]] in chimpanzees.

HIV is a modified form of SIV that infects humans and is grouped into two main types:

HIV-1 "is most closely related to SIVcpz"[[rambaut-2004]] and is found in some chimpanzees. HIV-1 is categorized into "groups M, N and O [which] represent separate transfers from chimpanzees,"[[rambaut-2004]] although "one or two of those transmissions may have been via gorillas."[[sharp-2010]]

HIV-2 is "most closely related to SIVsm"[[rambaut-2004]] found in sooty mangabey monkeys and is believed to have been transferred from these monkeys to humans "at least four"[[rambaut-2004]] times.

HIV-1 group M (for major) accounts for the "vast majority (perhaps 98%) of HIV infections worldwide"[[sharp-2010]] while all other HIV types are mostly or entirely restricted to West Africa.[[rambaut-2004]] Molecular clocks suggest that HIV-1 group M first originated "around the 1920s"[[sharp-2010]] and the other groups don't "appear older than HIV-1 group M."[[sharp-2010]]

Population Size

Population Per Infected Person

About 10^10 to 10^12 HIV viruses exist in an infected person:

Study | Estimate
--- | ---
Haase et al, 1996[[haase-1996]] | "the FDC-associated [follicular dendritic cell] pool of HIV RNA would be about 10^11 copies in a 70-kg HIV infected individual." Follicular dendritic cells are major reservoirs for HIV-1 within the lymph nodes.
Perelson et al, 1996[[perelson-1996]] | "The estimated average total HIV-1 production was 10.3x10^9 virions per day." This is 1.03x10^10. "the average HIV-1 generation time--defined as the time from release of a virion until it infects another cell and causes the release of a new generation of viral particles--is 2.6 days."
Brown, 1997[[brown-1997]] | "HIV infections are initiated from a small inoculum and increase very rapidly to ≈10^10 in the first stages of infection, so a considerable reduction in N_e [effective population size] would be expected to be due to this expansion."
Rambaut et al, 2004[[rambaut-2004]] | "[HIV] has a viral generation time of ~2.5 days and produces ~10^10 to 10^12 new virions each day" Perelson, 1996 is cited for this estimate.
Coffin et al, 2013[[coffin-2013]] | "In the case of HIV-1 infection, perhaps 10^11 virions are produced daily; the number of cells infected in the same time span is... unlikely to exceed 10^9." "this result is consistent with the high natural turnover rate of activated effector memory helper T cells, the primary target for HIV-1 infection, on the order of 10^10 cells per day, of which only a small fraction are infected after the initial primary infection phase."

Estimated Observed and Effective Population Size Discrepancies

Rather than counting and extrapolating, some studies (not included above) measure HIV genetic diversity and use it to estimate effective population sizes in the range of 450 to 10^5 HIV virions per person.[[althaus-2005]] These estimates "are many orders of magnitude lower than the census size--a result that has surprised and perplexed many in the HIV-1 community."[[tan-2005]]

However, models estimating effective population size "are heavily influenced by variations in allele frequency" and should "be taken as a lower bound" with the true values "likely to be much higher."[[maidarelli-2013]] This is because low genetic diversity in HIV leads to lower effective population size estimates, even if the real population size is much larger.

Why would HIV have low genetic diversity? Because in each HIV infection, the HIV starts as only a small number of virus particles and then expands to nearly a trillion viruses. And rapidly expanding populations have lower genetic diversity. HIV is also subject to strong selection, and selection removes variants from a population.
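The expansion effect can be illustrated with a standard population-genetics approximation: when population size changes from generation to generation, the long-run effective size is roughly the harmonic mean of the per-generation sizes, which is dominated by the smallest values. The sketch below is only an illustration. The founding size of 10 virions, the tenfold growth per generation, the ceiling of 10^11, and the 100-generation window are all hypothetical assumptions, not measured values, and the function names are made up for this example.

```python
def harmonic_mean(sizes):
    """Harmonic mean of a list of positive population sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def infection_trajectory(founders=10, growth=10, ceiling=1e11, generations=100):
    """Hypothetical per-generation census sizes for one infection.

    The population starts at `founders`, multiplies by `growth` each
    generation, and is capped at `ceiling`. All defaults are assumptions.
    """
    sizes = []
    n = float(founders)
    for _ in range(generations):
        sizes.append(n)
        n = min(n * growth, ceiling)
    return sizes


sizes = infection_trajectory()
census = sizes[-1]                 # size once the infection has plateaued
effective = harmonic_mean(sizes)   # approximate long-run effective size

print(f"census size at plateau: {census:.1e}")
print(f"harmonic-mean effective size: {effective:.1e}")
```

Under these assumed inputs the harmonic mean comes out many orders of magnitude below the plateau census size, because the handful of small early generations dominates the sum of reciprocals.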

Therefore, since the effective population size is misleading and observation trumps such models, the observed population size estimates in the table above are more reliable.

Cumulative Population Size

HIV reproduces about once every 2.6 days,[[perelson-1996]] which over the last 100 years since HIV first entered humans is about 14,000 "generations." Counting only the last 40 years (1977-2017) when HIV population sizes were significant totals about 5,600 "generations."

Sadly there are "an estimated 42 million people carrying the [HIV] virus at present."[[rambaut-2004]] Multiplied by 10^11 HIV virions per person, this gives about 4x10^18 HIV virions existing at any given time.

If "perhaps 10^11 virions are produced daily,"[[coffin-2013]] then 14,600 days over 40 years times 10^11 virions per person per day times 4.2x10^7 people with HIV gives a total of 6.13x10^22 HIV virions that have ever existed in humans. That gives us:

  1. 14,000 reproduction cycle "generations" of HIV over the last 100 years.
  2. 4x10^18 HIV virions existing in humans at any given time.
  3. 6x10^22 HIV virions existing in humans over the last 100 years.
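The arithmetic behind these three figures can be checked with a short script. This is a minimal sketch: the inputs are the values quoted above, while the constant and function names are illustrative and a 365-day year is assumed.

```python
# Inputs quoted in the text
GENERATION_DAYS = 2.6               # HIV-1 generation time (Perelson et al, 1996)
VIRIONS_PER_PERSON = 1e11           # standing population per infected person
VIRIONS_PER_PERSON_PER_DAY = 1e11   # daily production (Coffin et al, 2013)
PEOPLE_INFECTED = 4.2e7             # people carrying HIV (Rambaut et al, 2004)
DAYS_PER_YEAR = 365                 # assumption: leap days ignored


def generations(years):
    """Number of replication cycles that fit into the given number of years."""
    return years * DAYS_PER_YEAR / GENERATION_DAYS


# Virions alive at any one time across all infected people
standing = PEOPLE_INFECTED * VIRIONS_PER_PERSON

# Virions produced over the 40 years in which HIV populations were significant
cumulative = 40 * DAYS_PER_YEAR * VIRIONS_PER_PERSON_PER_DAY * PEOPLE_INFECTED

print(f"generations in 100 years: {generations(100):,.0f}")
print(f"generations in 40 years:  {generations(40):,.0f}")
print(f"standing virions:         {standing:.1e}")
print(f"cumulative virions:       {cumulative:.2e}")
```

Note that the cumulative total uses daily production over 40 years, so it treats the current number of infected people as constant over that period, as the text does.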

However, only one in 100 or fewer virions goes on to infect other cells: "perhaps 10^11 virions are produced daily,"[[coffin-2013]] although "the number of cells infected in the same time span... is unlikely to exceed 10^9."[[coffin-2013]]

HIV Mutations

HIV has "about 2×10^-5 mutations per site per replication cycle."[[Zanini-2017]] An earlier study "reported a much higher mutation rate," but it "focused on integrated provirus and might not reflect the mutational frequency in the circulating HIV-1 virions."[[Zanini-2017]] The HIV-1 genome is 9181 nucleotides, so that works out to:

  1. HIV has about 0.18 mutations per replication, or one mutation every 5.6 replications.
  2. HIV genomes have had a total of about 10^22 mutations since first entering humans about 100 years ago.
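These two figures follow from multiplying the per-site rate by the genome length and by the cumulative virion count. The sketch below assumes, as the text implicitly does, that every virion counted in the cumulative total represents one replication; the variable names are illustrative.

```python
MUTATION_RATE = 2e-5      # mutations per site per replication (Zanini et al, 2017)
GENOME_LENGTH = 9181      # nucleotides in the HIV-1 genome
TOTAL_VIRIONS = 6.13e22   # cumulative virions estimated in the previous section

# Expected mutations each time a genome is copied
mutations_per_replication = MUTATION_RATE * GENOME_LENGTH

# Average number of replications between mutations
replications_per_mutation = 1 / mutations_per_replication

# Assumption: one replication per virion ever produced
total_mutations = TOTAL_VIRIONS * mutations_per_replication

print(f"mutations per replication: {mutations_per_replication:.2f}")
print(f"replications per mutation: {replications_per_mutation:.1f}")
print(f"total mutations:           {total_mutations:.1e}")
```

Dividing 1 by the rounded value 0.18 gives the 5.6 quoted above; the unrounded product gives a slightly smaller figure, which makes no difference at the order-of-magnitude level used here.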

Exploring Mutation Space

This means that during the last 100 years, a point mutation has occurred at every single letter of HIV's 9181-nucleotide genome about 1.1x10^18 times (10^22 / 9181). If we account for point mutations changing nucleotides to one of three other letters, every nucleotide of HIV's genome has been tried out 3.7x10^17 times (1.1x10^18 / 3), and every possible combination of two nucleotide mutations has been tried about 1.3x10^13 times (the previous number divided by 9181*3).

For comparison, the common bacterium E. coli has a genome of about 5.4 million nucleotides and averages one mutation every 1000 replications.[[lee-2012]] Therefore it would take a population of about 16.2 billion E. coli (5.4 million times 1000 times 3) to mutate every possible nucleotide, and a population of about 2.6x10^20 E. coli to try out every possible combination of two mutations.

For humans, the numbers are similarly large. We have a 3 billion nucleotide haploid genome and about 33 mutations per haploid genome per generation.[[roach-2012]] Thus it would take about 273 million humans to test every possible single nucleotide mutation, and 7.5x10^16 humans to try every possible combination of two mutations. The table below extrapolates further for combinations of 3, 4, and 5 mutations:

Every combination of | Times found by HIV | E. coli needed | Humans needed
--- | --- | --- | ---
1 mutation | 3.7x10^17 | 1.6x10^10 | 2.7x10^8
2 mutations | 1.3x10^13 | 2.6x10^20 | 7.5x10^16
3 mutations | 4.8x10^8 | 4.3x10^30 | 2.0x10^25
4 mutations | 17,301 | 6.9x10^40 | 5.6x10^33
5 mutations | 0.63 | 1.1x10^51 | 1.5x10^42
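The extrapolation can be sketched in a few lines. This follows the simplified counting described in the text, in which each additional specific mutation divides (for HIV) or multiplies (for E. coli and humans) by the number of possible single-nucleotide changes, three per site. The function names are illustrative, and the round input of 10^22 total HIV mutations is the approximation used above, so results can differ from the tabulated values in the last digits.

```python
def times_explored(total_mutations, genome_length, k):
    """How many times every specific combination of k point mutations
    would have occurred, given the total number of mutations observed."""
    single_changes = 3 * genome_length   # three alternative letters per site
    return total_mutations / single_changes ** k


def population_needed(genome_length, mutations_per_genome, k):
    """Population size needed to sample every specific combination of
    k point mutations once, under the same simplified counting."""
    single_changes = 3 * genome_length
    return (single_changes / mutations_per_genome) ** k


HIV_TOTAL_MUTATIONS = 1e22   # approximate total from the text
HIV_GENOME = 9181
ECOLI_GENOME = 5.4e6
ECOLI_MUTATIONS = 1e-3       # one mutation per 1000 replications
HUMAN_GENOME = 3e9
HUMAN_MUTATIONS = 33         # per haploid genome per generation

for k in range(1, 6):
    hiv = times_explored(HIV_TOTAL_MUTATIONS, HIV_GENOME, k)
    ecoli = population_needed(ECOLI_GENOME, ECOLI_MUTATIONS, k)
    human = population_needed(HUMAN_GENOME, HUMAN_MUTATIONS, k)
    print(f"{k} mutation(s): HIV {hiv:.2g}  E. coli {ecoli:.2g}  humans {human:.2g}")
```

The sketch treats the mutations in a combination as independent and equally likely, which is the same simplification the table relies on.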

In other words, it would take about 6.9x10^40 E. coli to try out every possible combination of 4 specific mutations in its genome. This is more than the total of 10^40 cellular organisms estimated to have existed on a 4-billion-year-old Earth.[[user813801-2014]] [[behe-2007]] Yet in the last 100 years, HIV has tried out every possible combination of 4 mutations in its own (much smaller) genome about 17,301 times.

What's the purpose of this comparison? When HIV uncovers evolutionary gains that require 3, 4 or even 5 simultaneous, specific mutations to all be present, we should not expect other organisms to evolve such specific mutations at all, even if there are thousands of possible ways to evolve through such paths of simultaneous mutations.

Total Evolution

The annotated[[rambaut-2004]] chart below shows all strains of HIV circulating within humans (red lines) and their inferred origins from monkey and ape SIV (gray lines). Longer vertical lines indicate more fixed mutations. Blue numbers indicate the total number of mutations fixed during the time represented by the red vertical lines. The sum of all blue numbers indicates that about 5,160 mutations have become fixed among the various HIV subtypes since first entering humans.

This estimate of 5,160 fixed mutations is rather arbitrary. We would get a smaller number if we randomly picked a single HIV-1M virus and counted the mutations in its lineage since it first entered humans. Or if we counted all mutations among every person with HIV we would get a much larger number, many times more mutations than the ~9181 nucleotides in an average HIV-1 genome. In that case we would be counting identical mutations many times over.

Likewise if we wanted to compare the number of mutations separating humans and mice, we would compare the average human genome to the average mouse genome, but we wouldn't include the billions of unique mutations present in small numbers within both human and mouse populations.

The chart above omits HIV-1 group M subtypes H through K, which would increase the number of fixed mutations beyond 5,160. However, even though HIV shows "stronger positive selection than any other organism studied so far,"[[rambaut-2004]] and most mutations within HIV's ENV gene "confer a selective advantage,"[[rambaut-2004]] (HIV has about 10 genes) it's unlikely that all 5,160 mutations are positively selected and constructive. So we will use the still-generous estimate of:

5,000 or fewer constructive mutations became fixed among the various HIV subtypes.

Specific Evolutionary Gains


Tetherin is a protein used inside mammal cells to build tethers "between virus envelopes and the cytoplasmic membrane of the cell, preventing the release of those viruses."[[sharp-2010]]

Some strains of SIV have a Vpu gene that produces a protein that counteracts tetherin. For example "Vpu protein of SIVgsn has been shown to counteract greater spot-nosed monkey tetherin."[[sharp-2010]] But the strain of SIV that infects chimpanzees uses Vpu only for "anti-CD4 activity"[[sharp-2010]] and does not use Vpu against tetherin. CD4 is a protein on cell surfaces that SIV and HIV use to enter T-cells.

During the process of entering humans, HIV-1 groups M and N each independently reactivated Vpu's anti-tetherin ability: "the vpu gene did not diverge to the extent that the activity could not be rescued."[[sharp-2010]] "When SIVcpz crossed the species barrier to infect humans... Vpu subsequently (re)gained its tetherin-antagonizing function."[[kuhl-2011]] Through this evolution, "HIV-1 group N Vpu has lost [its] anti-CD4 activity,"[[sharp-2010]] although this ability was retained in HIV-1 group M.

However, "it is likely that SIV jumped into humans many times"[[sharp-2010]] before it led to the modern AIDS epidemic. Since these species-crossing attempts may have occurred far in the remote past, it's not possible to estimate how many viral replications occurred before SIV was able to discover these mutations.

In order for HIV's Vpu gene to counteract tetherin, at least three amino acids were changed in Vpu's transmembrane region: "three amino acid positions, A14, W22, and, to a lesser extent, A18, are required for tetherin antagonism."[[vigan--2010]] This was confirmed by splicing those regions into a Vpu gene from chimpanzee SIV: "SIVcpz Vpu was able to completely rescue the Tetherin restriction phenotype when it encoded both regions 1-8 and 14-22 from HIV-1."[[lim-2010]]

Likewise, the diagram above (under Hydrophobic TM domain) shows that HIV-1's Vpu protein differs from SIV at amino acid positions 14, 18, and 22, among many other mutations that don't affect its anti-tetherin ability. The amino acids at positions 14 and 22 differ from those that were there before. They are not simply reversions to those found in monkey SIVs.

HIV's re-acquisition of the ability to counteract tetherin appears to be a stepwise evolutionary gain where mutations gradually improved the ability, since "chimeras within each region yielded intermediate phenotypes."[[lim-2010]]

Other notable evolutionary gains

This section is incomplete, although there are many other gains that could be documented here.


  1. [[dubrow-2014]] Dubrow, Aaron. "Computing a Cure for HIV: 9 Ways Supercomputers Help Scientists Understand and Treat the Virus." Huffington Post. 2014.
    The header image of this page is a modified version of the image in Dubrow's article.
  2. [[berkeley-2017]] Berkeley. "HIV: the ultimate evolver." Understanding Evolution. Retrieved 2017.
    Mirrors: Archive.org
  3. [[rambaut-2004]] Rambaut, Andrew et al. "The causes and consequences of HIV evolution." Nature. 2004.
    Mirrors: University of Wisconsin
  4. [[sharp-2010]] Sharp, Paul M et al. "The evolution of HIV-1 and the origin of AIDS." Philos Trans R Soc Lond B. 2010.
    Mirrors: Archive.org
  5. [[haase-1996]] Haase, A. T. et al. "Quantitative image analysis of HIV-1 infection in lymphoid tissue." Science. 1996. Middle of page 7.
    Mirrors: Amazon S3
  6. [[coffin-2013]] Coffin, John et al. "HIV Pathogenesis: Dynamics and Genetics of Viral Populations and Infected Cells." Cold Spring Harb Perspect Med. 2013.
  7. [[brown-1997]] Brown, Andrew J. Leigh. "Analysis of HIV-1 env gene sequences reveals evidence for a low effective number in the viral population." PNAS. 1997.
  8. [[maidarelli-2013]] Maldarelli, Frank et al. "HIV Populations Are Large and Accumulate High Genetic Diversity in a Nonlinear Fashion." J. Virology. 2013.
  9. [[perelson-1996]] Perelson, Alan S et al. "HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time." Science. 1996.
  10. [[tan-2005]] Tan, W. Y. "Deterministic and Stochastic Models of AIDS Epidemics and HIV Infections with Intervention." World Scientific. 2005.
    Mirrors: Google Books
  11. [[althaus-2005]] Althaus, Christian L et al. "Stochastic Interplay between Mutation and Recombination during the Acquisition of Drug Resistance Mutations in Human Immunodeficiency Virus Type 1." J. Virol. 2005.
    See table 1 for a list of seven previous estimates of HIV-1 per-human effective population size. Estimates range from 450 to 10^5.
  12. [[Zanini-2017]] Zanini, Fabio et al. "In vivo mutation rates and the landscape of fitness costs of HIV-1." Virus Evolution. 2017.
  13. [[snoeck-2011]] Snoeck, Joke et al. "Mapping of positive selection sites in the HIV-1 genome in the context of RNA and protein structural constraints." Retrovirology. 2011.
  14. [[kuhl-2011]] Kuhl, Björn D. et al. "Tetherin and its viral antagonists." J Neuroimmune Pharmacol. 2011.
  15. [[sauter-2010]] Sauter, Daniel et al. "The evolution of pandemic and non-pandemic HIV-1 strains has been driven by Tetherin antagonism." Cell Host Microbe. 2010.
  16. [[vigan--2010]] Vigan, Raphaël et al. "Determinants of Tetherin Antagonism in the Transmembrane Domain of the Human Immunodeficiency Virus Type 1 Vpu Protein." Journal of Virology. 2010.
  17. [[lim-2010]] Lim, Efrem S. et al. "Ancient Adaptive Evolution of Tetherin Shaped the Functions of Vpu and Nef in Human Immunodeficiency Virus and Primate Lentiviruses." Journal of Virology. 2010.
  18. [[lee-2012]] Heewook Lee et al. "Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing." PNAS. 2012. The authors estimate "2.2x10^-10 mutations per nucleotide per generation or 1.0x10^-3 mutations per genome per generation"
  19. [[roach-2012]] Jared C. Roach et al. "Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing." Science. 2012. The authors "estimated a human intergeneration mutation rate of ~1.1x10^-8 per position per haploid genome." This times 3 billion nucleotides in a human haploid genome is 33. The whole (diploid) genome mutation rate would be 66.
  20. [[lynch-2006]] Lynch, Michael. "The Origins of Eukaryotic Gene Structure." Mol Bio Evol. 2006. Lynch writes: "the efficiency of natural selection declines dramatically between prokaryotes, unicellular eukaryotes, and multicellular eukaryotes" and "all lines of evidence point to the fact that the efficiency of selection is greatly reduced in eukaryotes to a degree that depends on organism size" and explains that this is because more complex organisms typically have 1. smaller population sizes, 2. "decreases in the intensity of recombination" and 3. lower mutation rates per nucleotide.
  21. [[sanford-2007]] Sanford, John, et al. "Mendel's Accountant: A biologically realistic forward-time population genetics program." Scalable Computing. 2007. The authors explain: "each nucleotide in a smaller genome on average plays a greater relative role in the organism’s fitness"
  22. [[user813801-2014]] user813801. "How many organisms have ever lived on Earth?" Biology.StackExchange. 2014. The answer estimates: "5.95 * 10^39 Bacteria that ever lived," and most cellular organisms are bacteria.
  23. [[behe-2007]] Behe, Michael J. "The Edge of Evolution." 2007. Page 63. Behe estimates: "throughout the course of history there would have been slightly fewer than 10^40 cells."