A computer-rendered HIV-1 particle.[^dubrow-2014]

HIV Evolution

"The human immunodeficiency virus... is one of the fastest evolving entities known."[1]

"HIV shows stronger positive selection [having more beneficial mutations] than any other organism studied so far."[2]


Despite a huge number of mutations and strong natural selection, HIV has evolved very little since it first entered humans about 100 years ago.  During that time there have been:

  1. About 14,000 replication cycle "generations" of HIV.
  2. A total of about 6x10^22 new HIV virus particles being produced.
  3. About 10^22 total HIV mutations among all those new virus particles.
  4. About 5,000 or fewer constructive mutations becoming fixed within the various HIV subtypes.

Along with the lackluster evolution seen in other fast-replicating, large-population microbes, this suggests evolution could not have created the vast information in complex animal genomes in the time available, especially considering that natural selection is much weaker in complex animals[2a][3][4] and that their population sizes and mutation counts are much smaller, giving animals less of a chance to evolve.

History and Groups

HIV came from SIV (simian immunodeficiency virus) in chimpanzees, which in turn came from SIV in monkeys:

  1. SIV is a retrovirus that infects monkeys and apes, with different SIV variants infecting each species. In some African monkeys SIV is not known to cause any harmful effects.[5a]
  2. At an unknown time in the past, two different forms of SIV entered chimpanzees and combined into a new strain[5b] that was sometimes deadly.[5c]
  3. Sometime "around the 1920s,"[5] SIV first entered humans, becoming HIV.

HIV is a modified form of SIV that infects humans.  It's grouped into two main types:

HIV-1 "is most closely related to SIVcpz,"[5] the form of SIV found in some chimpanzees.  HIV-1 is categorized into "groups M, N and O [which] represent separate transfers from chimpanzees,"[2] although "one or two of those transmissions may have been via gorillas."[5]

HIV-2 is "most closely related to SIVsm"[2] found in sooty mangabey monkeys.  HIV-2 is believed to have been transferred from these monkeys to humans "at least four"[2] times.

HIV-1 group M (for major) accounts for the "vast majority (perhaps 98%) of HIV infections worldwide"[5] while all other HIV types are mostly or entirely restricted to West Africa.[2] Molecular clocks suggest that HIV-1 group M originated "around the 1920s"[5] and the other groups don't "appear older than HIV-1 group M."[5]

Population Size

Population Per Infected Person

About 10^10 to 10^12 HIV virions exist in an infected person:

| Study | Estimate |
|---|---|
| Haase et al., 1996 [6] | "the FDC-associated [follicular dendritic cell] pool of HIV RNA would be about 10^11 copies in a 70-kg HIV infected individual." Follicular dendritic cells are major reservoirs for HIV-1 within the lymph nodes. |
| Perelson et al., 1996 [7] | "The estimated average total HIV-1 production was 10.3x10^9 virions per day." This is 1.03x10^10. "the average HIV-1 generation time--defined as the time from release of a virion until it infects another cell and causes the release of a new generation of viral particles--is 2.6 days." |
| Brown, 1997 [8] | "HIV infections are initiated from a small inoculum and increase very rapidly to ≈10^10 in the first stages of infection, so a considerable reduction in N_e [effective population size] would be expected to be due to this expansion." |
| Rambaut et al., 2004 [2] | "[HIV] has a viral generation time of ~2.5 days and produces ~10^10 to 10^12 new virions each day" Perelson, 1996 is cited for this estimate. |
| Coffin et al., 2013 [9] | "In the case of HIV-1 infection, perhaps 10^11 virions are produced daily; the number of cells infected in the same time span is... unlikely to exceed 10^9." "this result is consistent with the high natural turnover rate of activated effector memory helper T cells, the primary target for HIV-1 infection, on the order of 10^10 cells per day, of which only a small fraction are infected after the initial primary infection phase." |

Estimated Observed and Effective Population Size Discrepancies

Rather than counting and extrapolating, some studies (not included above) measure HIV genetic diversity and use it to estimate effective population sizes in the range of 450 to 10^5 HIV virions per person.[10] These estimates "are many orders of magnitude lower than the census size--a result that has surprised and perplexed many in the HIV-1 community."[11]

However, models estimating effective population size "are heavily influenced by variations in allele frequency" and should "be taken as a lower bound" with the true values "likely to be much higher."[12] This is because low genetic diversity in HIV leads to lower effective population size estimates, even if the real population size is much larger.

Why would HIV have low genetic diversity? Because each HIV infection starts as only a small number of virus particles and then expands to nearly a trillion virions, and rapidly expanding populations have lower genetic diversity.  HIV is also subject to strong selection, and selection removes variants from a population.

Therefore, since the effective population size is misleading and observation trumps such models, the observed population size estimates in the table above are more reliable.
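
The effect of a small founding population can be illustrated with a standard population-genetics approximation: when population size changes from generation to generation, the long-term effective size is roughly the harmonic mean of the per-generation sizes, which is dominated by the smallest values. The sketch below is only an illustration. The growth schedule (10 founding virions, tenfold growth per generation, a plateau at 10^11) and the function name are hypothetical and are not taken from the cited studies.

```python
def harmonic_mean_ne(census_sizes):
    """Approximate long-term effective size as the harmonic mean of per-generation sizes."""
    return len(census_sizes) / sum(1.0 / n for n in census_sizes)


# Hypothetical infection history (assumed numbers, for illustration only):
# 10 founding virions, tenfold growth each generation up to 1e11,
# then 20 more generations at that plateau.
growth = [10.0 ** exponent for exponent in range(1, 12)]  # 10, 100, ..., 1e11
plateau = [1e11] * 20
history = growth + plateau

if __name__ == "__main__":
    print(f"census size at plateau: {history[-1]:.0e}")
    print(f"harmonic-mean effective size: {harmonic_mean_ne(history):.0f}")
```

With these assumed numbers the effective size comes out in the hundreds, even though the census size sits at 10^11 for most of the history.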

Cumulative Population Size

HIV reproduces about once every 2.6 days,[7] which over the 100 years since HIV first entered humans is about 14,000 "generations."  Counting only the last 40 years (1977-2017), when HIV population sizes were significant, gives about 5,600 total "generations."

Sadly there are "an estimated 42 million people carrying the [HIV] virus at present."[2] Multiplying by 10^11 HIV virions per person gives about 4x10^18 HIV virions existing at any given time.

If "perhaps 10^11 virions are produced daily,"[9] then 14,600 days (40 years) times 10^11 virions per person per day times 4.2x10^7 people with HIV gives a total of 6.13x10^22 HIV virions that have ever existed in humans. That gives us:

  1. 14,000 reproduction cycle "generations" of HIV over the last 100 years.
  2. 4x10^18 total HIV virions existing in all humans at any given time.
  3. 6x10^22 total HIV virions existing in all humans over the last 100 years.
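
The arithmetic behind these three figures can be checked with a short sketch. It uses only the numbers quoted above and, like the calculation in the text, holds the number of infected people constant at 42 million for the whole 40-year window, which is a simplifying assumption. The constant and function names are illustrative.

```python
# Back-of-the-envelope totals using the figures quoted in the text.
DAYS_PER_YEAR = 365
GENERATION_DAYS = 2.6       # HIV-1 generation time (Perelson et al., 1996)
VIRIONS_PER_PERSON = 1e11   # used as both standing count and daily production
INFECTED_PEOPLE = 4.2e7     # "42 million people" (Rambaut et al., 2004)


def generations(years):
    """Replication cycles that fit into the given number of years."""
    return years * DAYS_PER_YEAR / GENERATION_DAYS


def standing_virions():
    """Virions existing across all infected people at one moment."""
    return VIRIONS_PER_PERSON * INFECTED_PEOPLE


def cumulative_virions(years):
    """Virions produced across all infected people over the given period."""
    days = years * DAYS_PER_YEAR
    return days * VIRIONS_PER_PERSON * INFECTED_PEOPLE


if __name__ == "__main__":
    print(f"generations in 100 years: {generations(100):,.0f}")
    print(f"generations in 40 years:  {generations(40):,.0f}")
    print(f"standing virions:         {standing_virions():.1e}")
    print(f"cumulative virions (40y): {cumulative_virions(40):.2e}")
```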

However, only one in 100 or fewer virions goes on to infect another cell: "perhaps 10^11 virions are produced daily,"[9] although "the number of cells infected in the same time span... is unlikely to exceed 10^9."[9]

HIV Mutations

HIV has "about 2×10^-5 mutations per site per replication cycle."[13] An earlier study "reported a much higher mutation rate," but it "focused on integrated provirus and might not reflect the mutational frequency in the circulating HIV-1 virions."[13]  The HIV-1 genome is 9181 nucleotides, so that works out to:

  1. HIV has about 0.18 mutations per replication, or one mutation every 5.4 replications.
  2. HIV genomes have had a total of about 10^22 mutations since first entering humans about 100 years ago.
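
These two figures follow from multiplying the per-site rate by the genome length, and then by the cumulative virion count from the previous section. The sketch below assumes, as the text implicitly does, that every virion counted corresponds to one replication cycle. The variable names are illustrative.

```python
MUTATION_RATE = 2e-5      # mutations per site per replication cycle (Zanini et al., 2017)
GENOME_LENGTH = 9181      # HIV-1 genome length in nucleotides, as given in the text
TOTAL_VIRIONS = 6.13e22   # cumulative virion estimate from the previous section

# Expected mutations in one copy of the genome per replication cycle.
mutations_per_replication = MUTATION_RATE * GENOME_LENGTH

# Average number of replications between mutations.
replications_per_mutation = 1 / mutations_per_replication

# Assumption: one replication cycle per virion ever produced.
total_mutations = TOTAL_VIRIONS * mutations_per_replication

if __name__ == "__main__":
    print(f"mutations per replication: {mutations_per_replication:.2f}")
    print(f"replications per mutation: {replications_per_mutation:.1f}")
    print(f"total mutations:           {total_mutations:.1e}")
```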

Exploring Mutation Space

This means that during the last 100 years, a point mutation has occurred at every single letter of HIV's 9181-nucleotide genome about 1.1x10^18 times (10^22 / 9181).  If we account for point mutations changing a nucleotide to one of three other letters, every possible single-nucleotide change in HIV's genome has been tried about 3.7x10^17 times (1.1x10^18 / 3), and every possible combination of two nucleotide mutations has been tried about 1.3x10^13 times (the previous number divided by 9181 x 3).
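
The division described above can be sketched as follows. The function follows the text's own method: divide the total mutation count by (genome length x 3) once for each mutation in the combination. That method treats mutations as evenly spread over the genome and independent of one another, which is a simplifying assumption. The function name is illustrative, and small differences from the rounded figures in the text are expected.

```python
GENOME_LENGTH = 9181     # HIV-1 genome length in nucleotides
TOTAL_MUTATIONS = 1e22   # total HIV mutations estimated in the text
ALTERNATIVES = 3         # each nucleotide can change to one of three other letters


def times_tried(k):
    """Times each specific combination of k point mutations has been tried.

    Follows the text's method: divide the total mutation count by
    (genome length x 3) once per mutation in the combination.
    """
    return TOTAL_MUTATIONS / (GENOME_LENGTH * ALTERNATIVES) ** k


if __name__ == "__main__":
    for k in range(1, 6):
        print(f"{k} mutation(s): tried about {times_tried(k):.2g} times")
```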

For comparison, the common bacterium E. coli has a genome of about 5.4 million nucleotides and has one mutation every 1000 replications.[14]  Therefore it would take a population of about 16.2 billion E. coli (5.4 million times 1000 times 3) to mutate every possible nucleotide, and a population of about 2.6x10^20 E. coli to try out every possible combination of two mutations.

For humans, the numbers are similarly large.  We have a 3 billion nucleotide haploid genome and about 33 mutations per haploid genome per generation.  Thus it would take about 273 million human reproductions to test every possible single nucleotide mutation, and about 7.5x10^16 human reproductions to try every possible combination of two mutations.  The table below extrapolates further for combinations of 3, 4, and 5 mutations:

| Every combination of | Times found by HIV | E. coli needed | Humans needed |
|---|---|---|---|
| 1 mutation | 3.7x10^17 | 1.6x10^10 | 2.7x10^8 |
| 2 mutations | 1.3x10^13 | 2.6x10^20 | 7.5x10^16 |
| 3 mutations | 4.8x10^8 | 4.3x10^30 | 2.0x10^25 |
| 4 mutations | 17,301 | 6.9x10^40 | 5.6x10^33 |
| 5 mutations | 0.63 | 1.1x10^51 | 1.5x10^42 |
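
The E. coli and human columns can be reproduced approximately with the sketch below. It assumes, as the figures in the text suggest, that the replications needed for one specific mutation are (genome length x 3) / (mutations per genome per replication), and that a combination of k specific mutations requires that number raised to the k-th power. The function name and parameter names are illustrative.

```python
def replications_needed(genome_length, mutations_per_replication, k):
    """Replications needed to try every specific combination of k point mutations.

    Assumes mutations are spread evenly over the genome and arise
    independently, so the single-mutation figure is raised to the k-th power.
    """
    single = genome_length * 3 / mutations_per_replication
    return single ** k


# Figures quoted in the text.
E_COLI = {"genome_length": 5.4e6, "mutations_per_replication": 1e-3}
HUMAN = {"genome_length": 3e9, "mutations_per_replication": 33}

if __name__ == "__main__":
    for k in range(1, 6):
        e_coli = replications_needed(k=k, **E_COLI)
        human = replications_needed(k=k, **HUMAN)
        print(f"{k} mutation(s): E. coli {e_coli:.1e}, humans {human:.1e}")
```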

In other words, it would take about 6.9x10^40 E. coli to try out every possible combination of 4 specific mutations in its genome.  This is more than the total of 10^40 cellular organisms estimated to have existed on a 4-billion-year-old Earth.[15][16]  Yet in the last 100 years, HIV has tried out every possible combination of 4 mutations in its own (much smaller) genome, and has done so about 17,301 times.

What's the purpose of this comparison?  When HIV uncovers evolutionary gains that require 3, 4 or even 5 simultaneous, specific mutations to all be present, we should not expect other organisms to evolve such specific mutations at all, even if there are thousands of possible ways to evolve through such paths of simultaneous mutations.

Total Evolution

The annotated[2] chart below shows all strains of HIV circulating within humans (red lines) and their inferred origins from monkey and ape SIV (gray lines). Longer vertical lines indicate more fixed mutations.  Blue numbers indicate the total number of mutations fixed during the time represented by the red vertical lines.  The sum of all blue numbers indicates that about 5,160 mutations have become fixed among the various HIV subtypes since first entering humans.

This estimate of 5,160 fixed mutations is rather arbitrary.  We would get a smaller number if we randomly picked a single HIV-1M virus and counted the mutations in its lineage since it first entered humans.  Or if we counted all mutations among every person with HIV we would get a much larger number, many times more mutations than the ~9181 nucleotides in an average HIV-1 genome.  In that case we would be counting identical mutations many times over.

Likewise, if we wanted to compare the number of mutations separating humans and mice, we would compare the average human genome to the average mouse genome, but we wouldn't include the billions of unique mutations present in small numbers within both human and mouse populations.

The chart above omits HIV-1 group M subtypes H through K, which would increase the number of fixed mutations beyond 5,160.  However, even though HIV shows "stronger positive selection than any other organism studied so far,"[2] and most mutations in HIV's ENV gene "confer a selective advantage"[2] (HIV has about 10 genes), it's unlikely that all 5,160 mutations are positively selected and constructive.  So we will use the still-generous estimate of:

Since first entering humans in the 1920s, about 5,000 or fewer constructive mutations became fixed among the various HIV subtypes.

Specific Evolutionary Gains


Tetherin is a protein used inside mammal cells to build tethers "between virus envelopes and the cytoplasmic membrane of the cell, preventing the release of those viruses."[5]

Some strains of SIV have a Vpu gene that produces a protein that counteracts tetherin.  For example "Vpu protein of SIVgsn has been shown to counteract greater spot-nosed monkey tetherin."[5] But the strain of SIV that infects chimpanzees uses Vpu only for "anti-CD4 activity"[5] and does not use Vpu against tetherin.  CD4 is a protein on cell surfaces that SIV and HIV use to enter T-cells.

During the process of entering humans, HIV-1 groups M and N each independently reactivated Vpu's anti-tetherin ability: "the vpu gene did not diverge to the extent that the activity could not be rescued."[5] "When SIVcpz crossed the species barrier to infect humans... Vpu subsequently (re)gained its tetherin-antagonizing function."[17]  Through this evolution, "HIV-1 group N Vpu has lost [its] anti-CD4 activity,"[5] although this ability was retained in HIV-1 group M.

However "it is likely that SIV jumped into humans many times"[5] before it led to the modern AIDS epidemic.  Since these species-crossing attempts may have occurred far in the past, it's not possible to estimate how many viral replications occurred before SIV was able to discover these mutations.

The differences in the Vpu protein in HIV/SIV in various species.[18] Each letter represents an amino acid. "Hydrophobic TM [transmembrane] domain", "α-helix", and "β-turn" are different regions of the protein.

  • HIV-1 = humans
  • SIVcpz = chimpanzee
  • SIVgor = gorilla
  • SIVmon = mona monkeys
  • SIVgsn = greater spot-nosed monkeys
  • SIVmus = mustached monkeys

In order for HIV's Vpu gene to counteract tetherin, at least three amino acids were changed in Vpu's transmembrane region:  "three amino acid positions, A14, W22, and, to a lesser extent, A18, are required for tetherin antagonism."[19]  This was confirmed by splicing those regions into a Vpu gene from chimpanzee SIV:  "SIVcpz Vpu was able to completely rescue the Tetherin restriction phenotype when it encoded both regions 1-8 and 14-22 from HIV-1."[20]

Likewise, the diagram above (under Hydrophobic TM domain) shows that HIV-1's Vpu protein differs from SIV at amino acid positions 14, 18, and 22, among many other mutations that don't affect its anti-tetherin ability.  The amino acids at positions 14 and 22 differ from those that were there before; they are not simply reversions to the amino acids found in monkey SIVs.

HIV's re-acquisition of its ability to counteract tetherin appears to be a stepwise evolutionary gain where mutations gradually improved the ability, since "chimeras within each region yielded intermediate phenotypes."[20]  In other words, each mutation made HIV increasingly better at counteracting tetherin.

Other notable evolutionary gains

This section is incomplete, although there are many other gains that could be documented here.


  1. University of California, Berkeley.  "HIV: the ultimate evolver."  Understanding Evolution.  Retrieved 2017.  Mirrors:  Archive.org

  2. Rambaut, Andrew et al.  "The causes and consequences of HIV evolution."  Nature.  2004.  Mirrors: University of Wisconsin

    1. The authors write:  "HIV shows stronger positive selection than any other organism studied so far."  HIV is an example of how natural selection is strongest on smaller, simpler genomes.

  3. Lynch, Michael.  "The Origins of Eukaryotic Gene Structure."  Mol Bio Evol.  2006.  Lynch writes: "the efficiency of natural selection declines dramatically between prokaryotes, unicellular eukaryotes, and multicellular eukaryotes" and "all lines of evidence point to the fact that the efficiency of selection is greatly reduced in eukaryotes to a degree that depends on organism size." Lynch explains this because more complex organisms typically have 1. smaller population sizes, 2. "decreases in the intensity of recombination" and 3. lower mutation rates per nucleotide.

  4. Sanford, John, et al.  "Mendel's Accountant: A biologically realistic forward-time population genetics program."  Scalable Computing.  2007.  The authors explain: "each nucleotide in a smaller genome on average plays a greater relative role in the organism’s fitness"

  5. Sharp, Paul M et al.  "The evolution of HIV-1 and the origin of AIDS."  Philos Trans R Soc Lond B.  2010.  Mirrors:  Archive.org

    1. "More than 40 species of African monkeys are infected with their own, species-specific, SIV and in at least some host species, the infection seems non-pathogenic."

    2. "Chimpanzees acquired from monkeys two distinct forms of SIVs that recombined to produce a virus with a unique genome structure."

    3. "We have found that SIV infection causes CD4+ T-cell depletion and increases mortality in wild chimpanzees, and so the origin of AIDS is more ancient than the origin of HIV-1."

  6. Haase, A. T. et al.  "Quantitative image analysis of HIV-1 infection in lymphoid tissue."  Science.  1996.  Middle of page 7. Mirrors:   Amazon S3

  7. Perelson, Alan S et al.  "HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time."  Science.  1996.

  8. Brown, Andrew J. Leigh.  "Analysis of HIV-1 env gene sequences reveals evidence for a low effective number in the viral population."  PNAS.  1997.

  9. Coffin, John et al.  "HIV Pathogenesis: Dynamics and Genetics of Viral Populations and Infected Cells."  Cold Spring Harb Perspect Med.  2013.

  10. Althaus, Christian L et al.  "Stochastic Interplay between Mutation and Recombination during the Acquisition of Drug Resistance Mutations in Human Immunodeficiency Virus Type 1."  J. Virol.  2005.  See table 1 for a list of seven previous estimates of HIV-1 per-human effective population size. Estimates range from 450 to 10^5.

  11. Tan, W. Y.  "Deterministic and Stochastic Models of AIDS Epidemics and HIV Infections with Intervention."  World Scientific.  2005.  Mirrors:  Google books.

  12. Maldarelli, Frank et al.  "HIV Populations Are Large and Accumulate High Genetic Diversity in a Nonlinear Fashion."  J. Virology.  2013.

  13. Zanini, Fabio et al. "In vivo mutation rates and the landscape of fitness costs of HIV-1" Virus Evolution.  2017.

  14. Lee, Heewook et al.  "Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing."  PNAS.  2012.  The authors estimate "2.2x10^-10 mutations per nucleotide per generation or 1.0x10^-3 mutations per genome per generation"

  15. "How many organisms have ever lived on Earth?"  Biology.StackExchange.  2014.  The answer estimates: "5.95 * 10^39 Bacteria that ever lived," and most cellular organisms are bacteria.

  16. Behe, Michael J.  "The Edge of Evolution."  2007.  Page 63.  Behe estimates: "throughout the course of history there would have been slightly fewer than 10^40 cells."

  17. Kuhl, Björn D. et al.  "Tetherin and its viral antagonists."  J Neuroimmune Pharmacol.  2011.

  18. Sauter, Daniel et al.  "The evolution of pandemic and non-pandemic HIV-1 strains has been driven by Tetherin antagonism."  Cell Host Microbe.  2010.

  19. Vigan, Raphaël et al.  "Determinants of Tetherin Antagonism in the Transmembrane Domain of the Human Immunodeficiency Virus Type 1 Vpu Protein."  Journal of Virology.  2010.

  20. Lim, Efrem S. et al.  "Ancient Adaptive Evolution of Tetherin Shaped the Functions of Vpu and Nef in Human Immunodeficiency Virus and Primate Lentiviruses."  Journal of Virology.  2010.