What is the probability of virus undergoing a specific dangerous mutation?

Non-biologist here so apologies if the question is violating too many of the community standards for asking a question in the forum.

What got me thinking was imagining how much more terrifying the current situation would have been if the COVID virus had a very long incubation period of something like 6 months and a higher mortality rate with the same rate of spread. By the time someone realized that a virus was spreading, it would have already infected the entire planet. If it had a very high mortality rate as well, wouldn't that pretty much mean the end of the world?

My question is, what is the probability that this can happen? For example, what is the probability that a virus starts spreading with an incubation period and mortality rate of HIV but has the spread rate of COVID?

HIV and SARS-CoV-2 are two very different viruses in terms of how they are transmitted, how they replicate within cells, how they interact with the immune system, etc. All of these aspects of a virus's existence are highly adapted to ensure its survival (otherwise the virus quickly goes extinct). Ongoing mutations do help a virus adapt to a somewhat changing environment, but they do not radically change it.

Thus, a virus that combines the long incubation period of HIV with the high contagiousness of SARS-CoV-2 would be a completely new species, different from either of these two viruses in its genetic makeup, lifestyle, etc. It could take millions of years of evolution for such a species to develop. Moreover, such a development is unlikely, since such a virus would exterminate its host, thus becoming extinct itself. A more plausible scenario is that of Ebola: a virus that is relatively harmless in some animals and can evolve within them, but is highly lethal when it spills over to humans, resulting in deadly outbreaks that quickly end. The probability of such a nightmare scenario is as small as the probability of the world ending due to a black hole generated in the Large Hadron Collider or a huge meteorite striking the Earth: while it cannot be mathematically excluded, it is too small to account for in our daily lives (as compared to other dangers).

Examples of Beneficial Mutation

Mutation, a change in the sequence of genes, is divided into various types such as beneficial, harmful, and neutral, based on their effects. We are here to discuss beneficial mutation in detail.

Mutation is a permanent alteration in the nucleotide sequence of DNA (deoxyribonucleic acid). As a result of a mutation, the amino acid sequence of the protein encoded by that stretch of DNA (the gene) may change, which in turn may alter the composition and/or function of body cells and tissues.

Mutation is a major source of variation in the genetic composition of a population or gene pool. In organisms, mutations can arise from errors during cell division (mitosis and meiosis), exposure to mutagens (for example, many carcinogens), strong radiation, and viruses.

Mutation in higher organisms is either somatic or germ-line. The former refers to mutation in the body cells, which is not usually passed on to the offspring. Germ-line mutation occurs in the germ cells and is inherited by the offspring via the reproductive cells. Based on its long-term effects in a particular population, a mutation can be categorized as beneficial (more favorable), deleterious (less favorable), or neutral.


Although the emergence of SARS-CoV-2 may have resulted from recombination events between a bat SARS-like coronavirus and a pangolin coronavirus (through cross-species transmission), [4] mutations have been shown to play an important role in the ongoing evolution and emergence of novel SARS-CoV-2 variants. [1]

The variant first sampled and identified in China is considered by researchers to differ from the progenitor genome "by three variants". [5] [6] Subsequently, many distinct lineages of SARS-CoV-2 have evolved. [7]

Modern DNA sequencing, where available, may permit rapid detection (sometimes known as 'real-time detection') of genetic variants that appear in pathogens during disease outbreaks. [8] Through use of phylogenetic tree visualization software, records of genome sequences can be clustered into groups of identical genomes all containing the same set of mutations. Each group represents a 'variant', 'clade', or 'lineage', and comparison of the sequences allows the evolutionary path of a virus to be deduced. For SARS-CoV-2, over 330,000 viral genomic sequences have been generated by molecular epidemiology studies across the world. [9]
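The grouping step described above can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a list of mutation labels. The sample IDs and the `group_by_mutations` helper are hypothetical and do not correspond to any particular phylogenetics tool, which would also infer the tree relating the groups.

```python
from collections import defaultdict


def group_by_mutations(records):
    """Group sample IDs whose genomes carry exactly the same set of mutations."""
    groups = defaultdict(list)
    for sample_id, mutations in records.items():
        # A frozenset is hashable and ignores the order in which mutations are listed.
        groups[frozenset(mutations)].append(sample_id)
    return dict(groups)


# Hypothetical sample IDs; the mutation labels are ones mentioned in this article.
records = {
    "sample-1": ["N501Y", "P681H"],
    "sample-2": ["P681H", "N501Y"],
    "sample-3": ["L452R"],
}

groups = group_by_mutations(records)
for mutations, samples in groups.items():
    print(sorted(mutations), samples)
```

Here `sample-1` and `sample-2` fall into one group because they share an identical mutation set, while `sample-3` forms a group of its own.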

SARS-CoV-2 is evolving to become more transmissible. Notably, the Alpha variant and the Delta variant are both more transmissible than the original virus identified around Wuhan in China. [10]

The following table presents information and level of risk for variants with elevated or possibly elevated risk at present. [11] [12] [13] [14] [15] [16] [17] [18] The intervals assume a 95% confidence or credibility level, unless otherwise stated. Currently, all estimates are approximations due to the limited availability of data for studies.

Column groups: Identification [20] (WHO label to Nextstrain clade); Emergence (First outbreak to Designated variant of concern); Changes relative to previously circulating variants at the time and place of emergence (Notable mutations to Mortality); Test accuracy; Neutralizing antibody activity, or efficacy when available (After natural infection to Monoclonal antibodies).

| WHO label | PANGO lineage | PHE variant [A] | Nextstrain clade | First outbreak | Earliest sample [20] | Designated variant of concern | Notable mutations | Transmissibility | Hospitalization | Mortality | Test accuracy | After natural infection (risk of reinfection) | From vaccination | Monoclonal antibodies [12] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alpha | B.1.1.7 | VOC‑20DEC‑01 | 20I (V1) | United Kingdom | 20 Sep 2020 [21] | 18 Dec 2020 [22] | 69–70del, N501Y, P681H [23] [24] | +82% (43–130%) [B] | +52% (47–57%) [C] | +59% (44–74%) [C] | No change [14] | Minimal reduction [12] | Minimal reduction [12] | No change |
| Beta | B.1.351 | VOC‑20DEC‑02 | 20H (V2) | South Africa | May 2020 | 14 Jan 2021 [27] | K417N, E484K, N501Y [23] | +52% (46–58%) [D] | Under investigation | Possibly increased [14] [30] | No change [14] | Reduced, T cell response elicited by D614G virus remains effective [12] [30] | Reduced efficacy for many [E] | Retained by some |
| Gamma | P.1 | VOC‑21JAN‑02 | 20J (V3) | Brazil | Nov 2020 | 15 Jan 2021 [31] [32] | K417T, E484K, N501Y [23] | +161% (145–176%) [G] | Possibly increased [30] | +50% (50% CrI, 20–90%) [F] [I] | No change [14] | Reduced [12] | Retained by many [J] | Retained by some |
| Alpha [K] | B.1.1.7 with E484K [36] | VOC‑21FEB‑02 | 20I (V1) | United Kingdom | 26 Jan 2021 [37] | 5 Feb 2021 [38] | 69–70del, E484K, N501Y, P681H [23] [24] | +82% (43–130%) [L] [B] | +52% (47–57%) [L] [C] | +59% (44–74%) [L] [C] | No change [L] | Considerably reduced [39] | Considerably reduced [39] | Retained by some [39] |
| Epsilon | B.1.429, B.1.427 | | 21C | United States | Mar 2020 | 17 Mar 2021 [40] | L452R [23] | +20% (19–24%) [M] | Under investigation | Under investigation | No change [43] | Moderately reduced [N] | Moderately reduced [N] | Reduced for some, unknown implications |
| Delta | B.1.617.2 | VOC‑21APR‑02 | 21A | India | Oct 2020 | 6 May 2021 [45] | L452R, T478K, P681R [46] | +64% (26–113%) relative to Alpha [O] | +85% (39–147%) relative to Alpha [P] | Under investigation | No evidence of change [30] | Reduced [50] [51] | Minimal efficacy reduction [30] [52] [Q] | Retained by some [51] |
| Kappa | B.1.617.1 | VUI‑21APR‑01 | 21B | India | Oct 2020 | | L452R, E484Q, P681R [53] | Under investigation | Under investigation | Under investigation | Under investigation | Slightly reduced [50] | Slightly reduced [50] | Possibly reduced |
| Eta | B.1.525 | VUI‑21FEB‑03 | 21D | Nigeria | 11 Dec 2020 [54] [55] | | E484K, F888L [57] | Under investigation | Under investigation | Under investigation | Under investigation | Possibly reduced [12] | Possibly reduced [12] | Possibly reduced |

  A. Name format updated March 2021, changing year from 4 to 2 digits and month from 2 digits to 3 letters, for example, VOC-202101-02 to VOC-21JAN-02. [13]
  B. 1 March – 24 December 2020, England. [25]
  C. 23 November 2020 – 31 January 2021, England. [26]
  D. 1 August – 31 December 2020, United Kingdom. [28] Another study found +50% (20–113%) in May – November 2020 in South Africa. [29]
  E. Except Moderna and Johnson & Johnson. [14] [30]
  F. The reported confidence or credible interval has a low probability, so the estimated value can only be understood as possible, not certain nor likely.
  G. 1 November 2020 – 31 January 2021, Manaus. [33] Another study in Manaus has estimated that lineage P.1 may be 100% (50% CrI, 70–140%) [F] more transmissible. [34]
  H. Differences may be due to different policies and interventions adopted in each area studied at different times, to the capacity of the local health system, or to different variants circulating at the time and place of the study.
  I. March 2020 – February 2021, Manaus. [34] Preliminary results from a study in the Southern Region of Brazil found that lineage P.1 increases mortality much more for healthy young people. In groups without pre-existing conditions, the variant was found to increase mortality by 490% (220–985%) for men in the 20–39 age group, 465% (190–1003%) for women in the 20–39 age group and 670% (401–1083%) for women in the 40–59 age group. [35] [H]
  J. Except Pfizer–BioNTech. [14]
  K. B.1.1.7 with E484K has not received a WHO label; it is listed here with the same label as its parent lineage, B.1.1.7.
  L. Assumed to be the same as Alpha. [15]
  M. September 2020 – January 2021, California. [12] [41] Overtaken by Alpha. [42]
  N. 1 September 2020 – 29 January 2021, California. [44]
  O. 18 March – 17 May 2021, England. [47] Another preliminary study in Japan found that the Delta variant is only 23% more transmissible than the Alpha variant. [48] [H]
  P. 1 April – 6 June 2021, Scotland. [49]
  Q. Moderately reduced neutralization with Covaxin. [50]

SARS-CoV-2 corresponding nomenclatures [58]

| PANGO lineages [59] | Notes to PANGO lineages [60] | Nextstrain clades, [61] 2021 [62] | GISAID clades | Notable variants |
|---|---|---|---|---|
| A.1–A.6 | | 19B | S | Contains "reference sequence" WIV04/2019 [63] |
| B.3–B.7, B.9, B.10, B.13–B.16 | | 19A | L | |
| | | | O [a] | |
| B.2 | | | V | |
| B.1 | B.1.5–B.1.72 | 20A | G | Lineage B.1 in the PANGO Lineages nomenclature system includes Delta/B.1.617 [46] [64] |
| | B.1.9, B.1.13, B.1.22, B.1.26, B.1.37 | | GH | |
| | B.1.3–B.1.66 | 20C | | Includes Epsilon/B.1.427/B.1.429/CAL.20C [65] and Eta/B.1.525 [12] |
| | | 20G | | Predominant in US generally, Jan '21 [65] |
| | | 20H | | Includes Beta/B.1.351 aka 20H/501Y.V2 or 501.V2 lineage |
| B.1.1 | | 20B | GR | Includes B.1.1.207 [citation needed] |
| | | 20J | | Includes Gamma/P.1 and Zeta/P.2 [66] [67] |
| | | 20I | | Includes Alpha/B.1.1.7 aka VOC-202012/01, VOC-20DEC-01 or 20I/501Y.V1 |
| B.1.177 | | 20E (EU1) [62] | GV [a] | Derived from 20A [62] |

No consistent nomenclature has been established for SARS-CoV-2. [69] Many organizations, including governments and news outlets, refer colloquially to concerning variants by the country in which they were first identified. [70] [71] [72] After months of discussions, the World Health Organization announced Greek-letter names for important strains on 31 May 2021, so they could be easily referred to in a geographically and politically neutral fashion. [73] [74]

While there are many thousands of variants of SARS-CoV-2, [75] subtypes of the virus can be put into larger groupings such as lineages or clades. [b] Three main, generally used nomenclatures [69] have been proposed:

  • As of January 2021, GISAID (which refers to SARS-CoV-2 as hCoV-19 [60]) had identified eight global clades (S, O, L, V, G, GH, GR, and GV). [76]
  • In 2017, Hadfield et al. announced Nextstrain, intended "for real-time tracking of pathogen evolution". [77] Nextstrain has since been used for tracking SARS-CoV-2, identifying 13 major clades [c] (19A–B, 20A–20J and 21A) as of June 2021. [78]
  • In 2020, Rambaut et al. of the Phylogenetic Assignment of Named Global Outbreak Lineages (PANGOLIN) [79] software team proposed in an article [59] "a dynamic nomenclature for SARS-CoV-2 lineages that focuses on actively circulating virus lineages and those that spread to new locations". [69] As of February 2021, six major lineages (A, B, B.1, B.1.1, B.1.177, B.1.1.7) had been identified. [7] [80]

Each national public health institute may also institute its own nomenclature system for the purposes of tracking specific variants. For example, Public Health England designated each tracked variant by year, month and number in the format [YYYY][MM]/[NN], prefixing 'VUI' or 'VOC' for a variant under investigation or a variant of concern respectively. [13] This system has since been modified and now uses the format [YY][MMM]-[NN], where the month is written out using a three-letter code. [13]
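The change of format can be made concrete with a small Python sketch. The function name `to_new_phe_name` is hypothetical, and the accepted input patterns are an assumption based only on the examples quoted in this article (old names written with either "/" or "-" before the number). It is not an official Public Health England tool.

```python
import re

MONTH_CODES = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
               "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Old style: prefix, 4-digit year, 2-digit month, separator, 2-digit number.
OLD_NAME = re.compile(r"(VUI|VOC)-(\d{4})(\d{2})[/-](\d{2})")


def to_new_phe_name(old_name):
    """Convert an old-style name such as 'VOC-202101/02' to 'VOC-21JAN-02'."""
    match = OLD_NAME.fullmatch(old_name)
    if match is None:
        raise ValueError(f"not an old-style PHE variant name: {old_name!r}")
    prefix, year, month, number = match.groups()
    month_index = int(month)
    if not 1 <= month_index <= 12:
        raise ValueError(f"invalid month in {old_name!r}")
    # Keep the last two digits of the year and spell the month as three letters.
    return f"{prefix}-{year[2:]}{MONTH_CODES[month_index - 1]}-{number}"


print(to_new_phe_name("VOC-202101-02"))
print(to_new_phe_name("VUI-202102/03"))
```

The two calls reproduce renamings mentioned in this article: VOC-202101-02 becomes VOC-21JAN-02, and VUI-202102/03 becomes VUI-21FEB-03.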

As it is currently not known when the index case or 'patient zero' occurred, the choice of reference sequence for a given study is relatively arbitrary, with different notable research studies' choices varying as follows:

  • The earliest sequence, Wuhan-1, was collected on 24 December 2019. [5]
  • One group (Sudhir Kumar et al.) [5] refers extensively to an NCBI reference genome (GenBank ID: NC_045512, GISAID ID: EPI_ISL_402125); [81] this sample was collected on 26 December 2019, [82] although they also used the WIV04 GISAID reference genome (ID: EPI_ISL_402124) [83] in their analyses. [84]
  • According to another source (Zhukova et al.), the sequence WIV04/2019, belonging to the GISAID S clade / PANGO A lineage / Nextstrain 19B clade, is thought to most closely reflect the sequence of the original virus infecting humans, known as "sequence zero". [63] WIV04/2019 was sampled from a symptomatic patient on 30 December 2019 and is widely used (especially by those collaborating with GISAID) [85] as a reference sequence. [63]

Viruses generally acquire mutations over time, giving rise to new variants. When a new variant appears to be growing in a population, it can be labeled as an "emerging variant".

Some of the potential consequences of emerging variants are the following: [23] [86]

  • Increased transmissibility
  • Increased morbidity
  • Increased mortality
  • Ability to evade detection by diagnostic tests
  • Decreased susceptibility to antiviral drugs (if and when such drugs are available)
  • Decreased susceptibility to neutralizing antibodies, either therapeutic (e.g., convalescent plasma or monoclonal antibodies) or in laboratory experiments
  • Ability to evade natural immunity (e.g., causing reinfections)
  • Ability to infect vaccinated individuals
  • Increased risk of particular conditions such as multisystem inflammatory syndrome or long-haul COVID.
  • Increased affinity for particular demographic or clinical groups, such as children or immunocompromised individuals.

Variants that appear to meet one or more of these criteria may be labeled "variants under investigation" or "variants of interest" pending verification and validation of these properties. The primary characteristic of a variant of interest is that there is evidence it is the cause of an increased proportion of cases or of unique outbreak clusters; however, it must also have limited prevalence or expansion at national levels, or the classification would be elevated to a "variant of concern". [13] [87] If there is clear evidence that the effectiveness of prevention or intervention measures for a particular variant is substantially reduced, that variant is termed a "variant of high consequence". [12]

Listed below are the Variants of Concern (VOC) currently recognised by the World Health Organization. [11] Note that other organizations such as the CDC in the United States may use a slightly different list. [12]

Alpha (lineage B.1.1.7)

First detected in October 2020 during the COVID-19 pandemic in the United Kingdom from a sample taken the previous month in Kent, [88] lineage B.1.1.7, [89] labelled Alpha variant by the WHO, was previously known as the first Variant Under Investigation in December 2020 (VUI – 202012/01) [90] and later notated as VOC-202012/01. [13] It is also known as 20I (V1), [20] 20I/501Y.V1 [30] (formerly 20B/501Y.V1), [91] [92] [23] or 501Y.V1. [39] Since then, its prevalence odds have doubled every 6.5 days, the presumed generational interval. [93] [94] It is correlated with a significant increase in the rate of COVID-19 infection in the United Kingdom, associated partly with the N501Y mutation. [95] There is some evidence that this variant has 40–80% increased transmissibility (with most estimates lying around the middle to higher end of this range), [96] and early analyses suggested an increase in lethality. [97] [98] More recent work has found no evidence of increased virulence. [99] As of May 2021, the Alpha variant has been detected in some 120 countries. [100]
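To show what a 6.5-day doubling of the prevalence odds implies, the following Python sketch converts the doubling time into a growth factor and a daily exponential rate. It assumes the doubling time stays constant, which is a simplification made here for illustration and not something the cited sources guarantee. The function names are hypothetical.

```python
import math

DOUBLING_TIME_DAYS = 6.5  # figure quoted in the text


def odds_multiplier(days, doubling_time=DOUBLING_TIME_DAYS):
    """Factor by which the prevalence odds grow after `days` days."""
    return 2 ** (days / doubling_time)


def daily_growth_rate(doubling_time=DOUBLING_TIME_DAYS):
    """Exponential growth rate r per day, solving 2 = exp(r * doubling_time)."""
    return math.log(2) / doubling_time


print(odds_multiplier(13))            # two doubling periods
print(round(daily_growth_rate(), 4))  # rate per day
```

Under this assumption the odds quadruple after 13 days, and the daily growth rate is ln(2)/6.5, roughly 0.107 per day.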

B.1.1.7 with E484K

Variant of Concern 21FEB-02 (previously written as VOC-202102/02), described by Public Health England (PHE) as "B.1.1.7 with E484K", [13] is of the same lineage in the Pango nomenclature system but has an additional E484K mutation. As of 17 March 2021, there were 39 confirmed cases of VOC-21FEB-02 in the UK. [13] On 4 March 2021, scientists reported B.1.1.7 with E484K mutations in the state of Oregon. In 13 test samples analysed, one had this combination, which appeared to have arisen spontaneously and locally, rather than being imported. [101] [102] [103] Other names for this variant include B.1.1.7+E484K [104] and B.1.1.7 Lineage with S:E484K. [105]

Beta (lineage B.1.351)

On 18 December 2020, the 501.V2 variant, also known as 20H (V2), [20] 20H/501Y.V2 [30] (formerly 20C/501Y.V2), 501Y.V2, [29] VOC-20DEC-02 (formerly VOC-202012/02), or lineage B.1.351, [23] was first detected in South Africa and reported by the country's health department. [106] It has been labelled as Beta variant by WHO. Researchers and officials reported that the prevalence of the variant was higher among young people with no underlying health conditions, and that, compared with other variants, it more frequently results in serious illness in those cases. [107] [108] The South African health department also indicated that the variant may be driving the second wave of the COVID-19 epidemic in the country, because it spreads at a more rapid pace than earlier variants of the virus. [106] [107]

Scientists noted that the variant can attach more easily to human cells because of the following three mutations in the receptor-binding domain (RBD) of the spike glycoprotein of the virus: N501Y, [106] [109] K417N, and E484K. [110] [111] The N501Y mutation has also been detected in the United Kingdom. [106] [112]

Gamma (lineage P.1)

The Gamma variant or lineage P.1, termed Variant of Concern 21JAN-02 [13] (formerly VOC-202101/02) by Public Health England, [13] 20J (V3) [20] or 20J/501Y.V3 [30] by Nextstrain, or just 501Y.V3, [39] was detected in Tokyo on 6 January 2021 by the National Institute of Infectious Diseases (NIID). It has been labelled as Gamma variant by WHO. The new variant was first identified in four people who arrived in Tokyo having travelled from the Brazilian Amazonas state on 2 January 2021. [113] On 12 January 2021, the Brazil-UK CADDE Centre confirmed 13 local cases of the new Gamma variant in the Amazon rain forest. [114] This variant of SARS-CoV-2 has been named lineage P.1 (although it is a descendant of B.1.1.28, the name B.1.1.28.1 [115] [14] is not permitted and thus the resultant name is P.1), and has 17 unique amino acid changes, 10 of which are in its spike protein, including the three concerning mutations: N501Y, E484K and K417T. [114] [116] [117] [118] (Figure 5)

The N501Y and E484K mutations favour the formation of a stable RBD-hACE2 complex, thus, enhancing the binding affinity of RBD to hACE2. However, the K417T mutation disfavours complex formation between RBD and hACE2, which has been demonstrated to reduce the binding affinity. [1]

The new variant was absent in samples collected from March to November 2020 in Manaus, Amazonas state, but it was detected in the same city in 42% of the samples from 15–23 December 2020, followed by 52.2% during 15–31 December and 85.4% during 1–9 January 2021. [114] A study found that infections by Gamma can produce nearly ten times the viral load of infections by one of the other lineages identified in Brazil (B.1.1.28 or B.1.195). Gamma also showed 2.2 times higher transmissibility with the same ability to infect both adults and older persons, suggesting P.1 and P.1-like lineages are more successful at infecting younger humans irrespective of sex. [119]

A study of samples collected in Manaus between November 2020 and January 2021 indicated that the Gamma variant is 1.4–2.2 times more transmissible and was shown to be capable of evading 25–61% of inherited immunity from previous coronavirus diseases, leading to the possibility of reinfection after recovery from an earlier COVID-19 infection. As for the fatality ratio, infections by Gamma were also found to be 10–80% more lethal. [120] [121] [34]

A study found that people fully vaccinated with Pfizer or Moderna have a significantly decreased neutralization effect against Gamma, although the actual impact on the course of the disease is uncertain. [122] A pre-print study by the Oswaldo Cruz Foundation published in early April found that, in real-world use, the initial dose of Sinovac's CoronaVac vaccine had an efficacy of approximately 50%. The authors expected the efficacy to be higher after the second dose. The study is ongoing. [123]

Preliminary data from two studies indicate that the Oxford–AstraZeneca vaccine is effective against the Gamma variant, although the exact level of efficacy has not yet been released. [124] [125] Preliminary data from a study conducted by Instituto Butantan suggest that CoronaVac is effective against the Gamma variant as well, and the study will be expanded to obtain definitive data. [126]

Delta (lineage B.1.617.2)

The Delta variant, also known as B.1.617.2, G/452R.V3, 21A [20] or 21A/S:478K, [30] was first discovered in India. A descendant of lineage B.1.617, which also includes the Kappa variant under investigation, it was first discovered in October 2020 and has since spread internationally. [127] [128] [129] [130] [131] On 6 May 2021, British scientists declared B.1.617.2 (which notably lacks the E484Q mutation) a "variant of concern", labelling it VOC-21APR-02, after they flagged evidence that it spreads more quickly than the original version of the virus and could spread as quickly as Alpha. [132] [133] [134] It carries L452R, T478K and P681R mutations, [46] but unlike Kappa it does not carry E484Q.

On 3 June 2021, Public Health England reported that twelve of the 42 deaths from the Delta variant in England were among the fully vaccinated, and that it was spreading almost twice as fast as the Alpha variant. [135] On 11 June, Foothills Medical Centre in Calgary, Canada reported that half of their 22 cases of the Delta variant occurred among the fully vaccinated. [136]

In June 2021, reports began to appear of a variant of Delta with the K417N mutation, dubbed the "Nepal variant". [137] The mutation, also present in the Beta variant, has raised concerns about the possibility of reduced effectiveness of vaccines and antibody treatments and increased risk of reinfection. [138] The variant, called "Delta with K417N" by Public Health England, includes two clades corresponding to the Pango lineages AY.1 and AY.2. [139] It has been nicknamed "Delta plus" [140] from "Delta plus K417N". [141] On 22 June, India's Ministry of Health and Family Welfare declared the "Delta plus" variant of COVID-19 a Variant of Concern after 22 cases of the variant were reported in India. [142] After the announcement, leading virologists said there was insufficient data to support labeling the strain as a distinct variant of concern, pointing to the small number of patients studied. [143]

Listed below are the Variants of Interest (VOI) currently recognised by the World Health Organization. [11] Note that other organizations such as the CDC in the United States may use a slightly different list. For example, they designate Epsilon as a VOC rather than a VOI. [12]

Epsilon (lineages B.1.429, B.1.427, CAL.20C)

The Epsilon variant or lineage B.1.429, also known as CAL.20C [144] or CA VUI1, [145] 21C [20] or 20C/S:452R, [30] is defined by five distinct mutations (I4205V and D1183Y in the ORF1ab-gene, and S13I, W152C, L452R in the spike protein's S-gene), of which the L452R (previously also detected in other unrelated lineages) was of particular concern. [65] [146] B.1.429 is possibly more transmissible, but further study is necessary to confirm this. [146] CDC has listed B.1.429 and the related B.1.427 as "variants of concern," and cites a preprint for saying that they exhibit a 20% increase in viral transmissibility, have a "Significant impact on neutralization by some, but not all," therapeutics that have been given Emergency Use Authorization (EUA) by FDA for treatment or prevention of COVID-19, and moderately reduce neutralization by plasma collected from people who have previously been infected by the virus or who have received a vaccine against the virus. [147] [148] According to WHO, it has been labelled as Epsilon variant.

Epsilon (CAL.20C) was first observed in July 2020 by researchers at the Cedars-Sinai Medical Center, California, in one of 1,230 virus samples collected in Los Angeles County since the start of the COVID-19 epidemic. [149] It was not detected again until September, when it reappeared among samples in California, but numbers remained very low until November. [150] [151] In November 2020, the Epsilon variant accounted for 36 percent of samples collected at Cedars-Sinai Medical Center, and by January 2021, the Epsilon variant accounted for 50 percent of samples. [146] According to a joint press release by University of California, San Francisco, California Department of Public Health, and Santa Clara County Public Health Department, [152] the variant was also detected in multiple counties in Northern California. From November to December 2020, the frequency of the variant in sequenced cases from Northern California rose from 3% to 25%. [153] In a preprint, CAL.20C is described as belonging to clade 20C and contributing approximately 36% of samples, while an emerging variant from the 20G clade accounts for some 24% of the samples in a study focused on Southern California. Note, however, that in the US as a whole, the 20G clade predominated as of January 2021. [65]

Following the increasing numbers of Epsilon in California, the variant has been detected at varying frequencies in most US states. Small numbers have been detected in other countries in North America, and in Europe, Asia and Australia. [150] [151] After an initial increase, its frequency rapidly dropped from February 2021 as it was being outcompeted by the more transmissible Alpha. In April, Epsilon remained relatively frequent in parts of northern California, but it had virtually disappeared from the south of the state and had never been able to establish a foothold elsewhere: only 3.2% of all cases in the United States were Epsilon, whereas more than two-thirds were Alpha. [42]

Zeta (lineage P.2)

Zeta variant or lineage P.2, a sub-lineage of B.1.1.28 like P.1, was first detected in circulation in the state of Rio de Janeiro; it harbours the E484K mutation, but not the N501Y and K417T mutations. [118] It evolved independently in Rio de Janeiro without being directly related to the Gamma variant from Manaus. [114] [154]

Under the simplified naming scheme proposed by the World Health Organization, P.2 has been labeled Zeta variant, and is considered a variant of interest (VOI), but not yet a variant of concern. [11]

Eta (lineage B.1.525)

The Eta variant or lineage B.1.525, also called VUI-21FEB-03 [13] (previously VUI-202102/03) by Public Health England (PHE) and formerly known as UK1188, [13] 21D [20] or 20A/S:484K, [30] does not carry the same N501Y mutation found in Alpha, Beta and Gamma, but carries the same E484K mutation as found in the Gamma, Zeta, and Beta variants, and also carries the same ΔH69/ΔV70 deletion (a deletion of the amino acids histidine and valine in positions 69 and 70) as found in Alpha, N439K variant (B.1.141 and B.1.258) and Y453F variant (Cluster 5). [155] Eta differs from all other variants by having both the E484K mutation and a new F888L mutation (a substitution of phenylalanine (F) with leucine (L) in the S2 domain of the spike protein). As of March 5, it had been detected in 23 countries. [156] [157] [56] It has also been reported in Mayotte, the overseas department/region of France. [156] The first cases were detected in December 2020 in the UK and Nigeria, and as of 15 February, it had occurred in the highest frequency among samples in the latter country. [56] As of 24 February, 56 cases were found in the UK. [13] Denmark, which sequences all its COVID-19 cases, found 113 cases of this variant from January 14 to February 21, of which seven were directly related to foreign travel to Nigeria. [157]
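The substitution notation used throughout (for example F888L: original amino acid, position, replacement amino acid) can be unpacked with a short Python sketch. The helper names `parse_substitution` and `describe` are hypothetical, and the sketch deliberately handles only single amino acid substitutions, not deletions such as ΔH69/ΔV70 or 69–70del.

```python
import re

# Standard one-letter codes for the 20 amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label):
    """Split a label such as 'N501Y' into (original, position, replacement)."""
    match = SUBSTITUTION.fullmatch(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    if original not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return original, int(position), replacement


def describe(label):
    """Spell out a substitution label in words."""
    original, position, replacement = parse_substitution(label)
    return (f"{AMINO_ACIDS[original]} ({original}) at position {position} "
            f"replaced by {AMINO_ACIDS[replacement]} ({replacement})")


print(describe("F888L"))
print(parse_substitution("E484K"))
```

For F888L this yields "phenylalanine (F) at position 888 replaced by leucine (L)", matching the description given in the text.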

UK experts are studying it to understand how much of a risk it could be. It is currently regarded as a "variant under investigation", but pending further study, it may become a "variant of concern". Prof Ravi Gupta, from the University of Cambridge, told the BBC that lineage B.1.525 appeared to have "significant mutations" already seen in some of the other newer variants, which is partly reassuring as their likely effect is to some extent more predictable. [158]

Under the simplified naming scheme proposed by the World Health Organization, lineage B.1.525 has been labelled variant Eta. [11]

Theta (lineage P.3)

On 18 February 2021, the Department of Health of the Philippines confirmed the detection of two mutations of COVID-19 in Central Visayas after samples from patients were sent to undergo genome sequencing. The mutations were later identified as E484K and N501Y, which were detected in 37 out of 50 samples, with both mutations co-occurring in 29 of these. There were no official names for the variants and the full sequence was yet to be identified. [159] It is also labelled as Theta variant by WHO.

On 13 March, the Department of Health confirmed that the mutations constitute a variant, which was designated as lineage P.3. [160] On the same day, it also confirmed the first COVID-19 case caused by the Gamma variant in the country. Although the Gamma and Theta variants stem from lineage B.1.1.28, the department said that the impact of the Theta variant on vaccine efficacy and transmissibility is yet to be ascertained. The Philippines had 98 cases of the Theta variant on 13 March. [161] On 12 March it was announced that Theta had also been detected in Japan. [162] [163] On 17 March, the United Kingdom confirmed its first two cases, [164] where PHE termed it VUI-21MAR-02. [13] On 30 April 2021, Malaysia detected 8 cases of the Theta variant in Sarawak. [165]

Iota (lineage B.1.526)

In November 2020, a mutant variant was discovered in New York City, which was named lineage B.1.526. [166] As of April 11, 2021, the variant has been detected in at least 48 U.S. states and 18 countries. In a pattern mirroring Epsilon, Iota was initially able to reach relatively high levels in some states, but in the spring of 2021 it was outcompeted by the more transmissible Alpha. [42] It is labelled as Iota variant by the WHO.

Kappa (lineage B.1.617.1)

Kappa variant [167] is one of the three sublineages of lineage B.1.617. It is also known as lineage B.1.617.1, 21B [20] or 21A/S:154K, [30] and was first detected in India in December 2020. [168] By the end of March 2021, the Kappa variant accounted for more than half of the sequences being submitted from India. [169] On 1 April 2021, it was designated a variant under investigation (VUI-21APR-01) by Public Health England. [45]

Lambda (lineage C.37)

The Lambda variant, also known as lineage C.37, was first detected in Peru in August 2020 and was designated by the WHO as a variant of interest on 14 June 2021. [170]

Lineage B.1.1.207

This lineage was first sequenced in August 2020 in Nigeria. [171] Its implications for transmission and virulence are unclear, but it has been listed as an emerging variant by the US Centers for Disease Control. [23] Sequenced by the African Centre of Excellence for Genomics of Infectious Diseases in Nigeria, this variant has a P681H mutation in common with the Alpha variant. It shares no other mutations with the Alpha variant, and as of late December 2020 it accounted for around 1% of viral genomes sequenced in Nigeria, though this may rise. [171] As of May 2021, lineage B.1.1.207 has been detected in 10 countries. [172]

Lineage B.1.620

In March 2021, this variant was discovered in Lithuania. It was named lineage B.1.620, [ citation needed ] also known as the 'Lithuanian strain'. It is found in Central Africa as well as North America. [173] Apart from Lithuania, other European countries including Spain and Belgium have also detected this variant. [ citation needed ] This lineage has 23 mutations and deletions compared to the reference strain, some of which are unique. The lineage contains an E484K mutation. [173] [174] D614G, a mutation present in most circulating strains, is also found in this variant. [175] Other notable mutations include P681H and S477N. [173]

Additional variants

  • Lineage B.1.618 was first isolated in October 2020. It has the E484K mutation in common with several other variants, and showed significant spread in April 2021 in West Bengal, India. [176][177] As of 23 April 2021, the PANGOLIN database showed 135 sequences detected in India, with single-figure numbers in each of eight other countries worldwide. [178]
  • Lineage B.1.1.318 was designated by PHE as a VUI (VUI-21FEB-04, [13] previously VUI-202102/04) on 24 February 2021. 16 cases of it have been detected in the UK. [13][179]
  • Lineage B.1.1.317, while not considered a variant of concern, is noteworthy in that Queensland Health forced 2 people undertaking hotel quarantine in Brisbane, Australia to undergo an additional 5 days quarantine on top of the mandatory 14 days after it was confirmed they were infected with this variant. [180]
  • On 29 May 2021, medical scientists in Vietnam reported a new, more contagious, form of the COVID-19 virus, that may be a mixture of the variants first detected in India and Britain. [181] On 3 June, the WHO described this as a mutated form of the Delta variant. [182]

Cluster 5

In early November 2020, Cluster 5, also referred to as ΔFVI-spike by the Danish State Serum Institute (SSI), [183] was discovered in Northern Jutland, Denmark, and is believed to have been spread from minks to humans via mink farms. On 4 November 2020, it was announced that the mink population in Denmark would be culled to prevent the possible spread of this mutation and reduce the risk of new mutations happening. A lockdown and travel restrictions were introduced in seven municipalities of Northern Jutland to prevent the mutation from spreading, which could compromise national or international responses to the COVID-19 pandemic. By 5 November 2020, some 214 mink-related human cases had been detected. [184]

The World Health Organization (WHO) has stated that cluster 5 has a "moderately decreased sensitivity to neutralizing antibodies". [185] SSI warned that the mutation could reduce the effect of COVID-19 vaccines under development, although it was unlikely to render them useless. Following the lockdown and mass-testing, SSI announced on 19 November 2020 that cluster 5 in all probability had become extinct. [186] As of 1 February 2021, the authors of a peer-reviewed paper, all of whom were from the SSI, assessed that cluster 5 was not in circulation in the human population. [187]

There is a risk that COVID-19 could transfer from humans to other animal populations and could combine with other animal viruses to create yet more variants that are dangerous to humans. [188]

N440K

The name of the mutation, N440K, refers to an exchange whereby the asparagine (N) is replaced by lysine (K) at position 440. [189]
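The naming convention described above (original amino acid, position, replacement amino acid) is regular enough to parse mechanically. The following is a minimal sketch in Python, assuming single-residue substitutions written with the standard one-letter amino acid codes; the names `Substitution`, `parse_substitution` and `describe` are made up for this illustration and do not come from any particular library.

```python
import re
from typing import NamedTuple

# Standard one-letter codes for the 20 amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "E": "glutamic acid", "Q": "glutamine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


class Substitution(NamedTuple):
    """Hypothetical container for one parsed substitution name."""
    original: str     # one-letter code of the residue being replaced
    position: int     # 1-based position in the protein
    replacement: str  # one-letter code of the new residue


def parse_substitution(name: str) -> Substitution:
    """Split a name such as 'N440K' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    original, position, replacement = match.groups()
    if original not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {name!r}")
    return Substitution(original, int(position), replacement)


def describe(name: str) -> str:
    """Spell a substitution name out in words."""
    sub = parse_substitution(name)
    return (
        f"{AMINO_ACIDS[sub.original]} ({sub.original}) replaced by "
        f"{AMINO_ACIDS[sub.replacement]} ({sub.replacement}) "
        f"at position {sub.position}"
    )


print(describe("N440K"))
```

The sketch deliberately rejects letters that are not one of the 20 standard codes, so a mistyped name fails loudly instead of being silently accepted. Deletions and multi-residue changes use other notations and are not handled here.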

In cell cultures, this mutation has been observed to be 10 times more infective than the previously widespread A2a strain and 1000 times more infective than the less widespread A3i strain. [190] It is involved in current rapid surges of Covid cases in India. [191] India has the largest proportion of N440K mutated variants, followed by the US and Germany. [192]

L452R

The name of the mutation, L452R, refers to an exchange whereby the leucine (L) is replaced by arginine (R) at position 452. [189]

There has been a significant surge of COVID-19 starting 2021 all across India caused in part by lineage B.1.617, frequently, but misleadingly, referred to as a "double mutant". L452R is a relevant mutation in this strain that enhances ACE2 receptor binding ability and can reduce vaccine-stimulated antibodies from attaching to this altered spike protein.

L452R, some studies show, could even make the coronavirus resistant to T cells, a class of cells necessary to target and destroy virus-infected cells. They are different from antibodies, which are useful in blocking coronavirus particles and preventing them from proliferating. [128]

S477G/N

A highly flexible region in the receptor binding domain (RBD) of SARS-CoV-2, starting from residue 475 and continuing up to residue 485, was identified using bioinformatics and statistical methods in several studies. The University of Graz [193] and the Biotech Company Innophore [194] have shown in a recent publication that structurally, the position S477 shows the highest flexibility among them. [195]

At the same time, S477 is so far the most frequently exchanged amino acid residue in the RBDs of SARS-CoV-2 mutants. Molecular dynamics simulations of the RBD during the binding process to hACE2 have shown that both S477G and S477N strengthen the binding of the SARS-CoV-2 spike with the hACE2 receptor. The vaccine developer BioNTech [196] referenced this amino acid exchange as relevant regarding future vaccine design in a preprint published in February 2021. [197]

E484K

The name of the mutation, E484K, refers to an exchange whereby the glutamic acid (E) is replaced by lysine (K) at position 484. [189] It is nicknamed "Eeek". [198]

E484K has been reported to be an escape mutation (i.e., a mutation that improves a virus's ability to evade the host's immune system [199] [200] ) from at least one form of monoclonal antibody against SARS-CoV-2, indicating there may be a "possible change in antigenicity". [201] The Gamma variant (lineage P.1), [114] the Zeta variant (lineage P.2, also known as lineage B. [118] and the Beta variant (501.V2) exhibit this mutation. [201] A limited number of lineage B.1.1.7 genomes with E484K mutation have also been detected. [37] Monoclonal and serum-derived antibodies are reported to be from 10 to 60 times less effective in neutralizing virus bearing the E484K mutation. [202] [203] On 2 February 2021, medical scientists in the United Kingdom reported the detection of E484K in 11 samples (out of 214,000 samples), a mutation that may compromise current vaccine effectiveness. [204] [205]

E484Q

The name of the mutation, E484Q, refers to an exchange whereby the glutamic acid (E) is replaced by glutamine (Q) at position 484. [189]

India is seeing a significant surge of COVID-19 starting 2021 caused in part by lineage B.1.617. This has frequently (but misleadingly, as most variants contain multiple mutations) been referred to as a "double mutant". [206] E484Q may enhance ACE2 receptor binding ability and may reduce vaccine-stimulated antibodies from attaching to this altered spike protein. [128]

N501Y

N501Y denotes a change from asparagine (N) to tyrosine (Y) in amino-acid position 501. [207] N501Y has been nicknamed "Nelly". [198]

This change is believed by PHE to increase binding affinity because of its position inside the spike glycoprotein's receptor-binding domain, which binds ACE2 in human cells; data also support the hypothesis of increased binding affinity from this change. [24] Molecular interaction modeling and free energy of binding calculations have demonstrated that, among the RBDs of variants of concern, the N501Y mutation gives the highest binding affinity to hACE2. [1] Variants with N501Y include Gamma, [201] [114] Alpha (VOC 20DEC-01), Beta, and COH.20G/501Y (identified in Columbus, Ohio). [1] This last became the dominant form of the virus in Columbus in late December 2020 and January and appears to have evolved independently of other variants. [208] [209]

D614G

D614G is a missense mutation that affects the spike protein of SARS-CoV-2. From early appearances in Eastern China, the frequency of this mutation in the global viral population has increased during the pandemic. [211] G (glycine) has replaced D (aspartic acid) at position 614 in many countries, especially in Europe though more slowly in China and the rest of East Asia, supporting the hypothesis that G increases the transmission rate, which is consistent with higher viral titers and infectivity in vitro. [63] Researchers with the PANGOLIN tool nicknamed this mutation "Doug". [198]

In July 2020, it was reported that the more infectious D614G SARS-CoV-2 variant had become the dominant form in the pandemic. [212] [213] [214] [215] PHE confirmed that the D614G mutation had a "moderate effect on transmissibility" and was being tracked internationally. [207]

The global prevalence of D614G correlates with the prevalence of loss of smell (anosmia) as a symptom of COVID-19, possibly mediated by higher binding of the RBD to the ACE2 receptor or higher protein stability and hence higher infectivity of the olfactory epithelium. [216]

Variants containing the D614G mutation are found in the G clade by GISAID [63] and the B.1 clade by the PANGOLIN tool. [63]

P681H

In January 2021, scientists reported in a preprint that the mutation 'P681H', a characteristic feature of the Alpha variant and lineage B.1.1.207 (identified in Nigeria), is showing a significant exponential increase in worldwide frequency, similar to the now globally prevalent 'D614G'. [217] [210]

P681R

The name of the mutation, P681R, refers to an exchange whereby the proline (P) is replaced by arginine (R) at position 681. [189]

Indian SARS-CoV-2 Genomics Consortium (INSACOG) found that other than the two mutations E484Q and L452R, there is also a third significant mutation, P681R in lineage B.1.617. All three concerning mutations are on the spike protein, the operative part of the coronavirus that binds to receptor cells of the body. [128]

A701V

According to initial media reports, the Malaysian Ministry of Health announced on 23 December 2020 that it had discovered a mutation in the SARS-CoV-2 genome which they designated as A701B(sic), among 60 samples collected from the Benteng Lahad Datu cluster in Sabah. The mutation was characterized as being similar to the one found recently at that time in South Africa, Australia, and the Netherlands, although it was uncertain if this mutation was more infectious or aggressive [ clarification needed ] than before. [218] The provincial government of Sulu in neighboring Philippines temporarily suspended travel to Sabah in response to the discovery of 'A701B' due to uncertainty over the nature of the mutation. [219]

On 25 December 2020, the government health organisation 'Kementerian Kesihatan Malaysia/ covid-19 Malaysia' described a mutation A701V as circulating and present in 85% of cases (D614G was present in 100% of cases) in Malaysia. [220] [221] [222] These reports also referred to samples collected from the Benteng Lahad Datu cluster. [222] [221] The text of the announcement was mirrored verbatim on the Facebook page of Noor Hisham Abdullah, Malay Director-General of Health, who was quoted in some of the news articles. [221]

The A701V mutation has the amino acid alanine substituted by valine at position 701 in the spike protein. Globally, South Africa, Australia, the Netherlands and England also reported A701V at about the same time as Malaysia. [220] In GISAID, the prevalence of this mutation is found to be about 0.18% of cases. [220]

On 14 April 2021, 'Kementerian Kesihatan Malaysia' reported that the third wave, which had started in Sabah, has involved the introduction of variants with D614G and A701V mutations. [223]

On 26 January 2021, the British government said it would share its genomic sequencing capabilities with other countries in order to increase the genomic sequencing rate and trace new variants, and announced a "New Variant Assessment Platform". [224] As of January 2021, more than half of all genomic sequencing of COVID-19 was carried out in the UK. [225]

On 11 June 2021, Public Health England introduced a rules-based decision algorithm to distinguish between variants in RT-PCR results. The system is reviewed weekly. The rules require that specific mutations in the S gene [226] be present for each variant (P681R for Delta, K417N for Beta and K417T for Gamma) and impose other requirements on the presence or absence of other mutations depending on the confirmation status of the test. [47]
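The core of such a rules-based scheme can be sketched in a few lines of Python. This is only an illustration built from the three defining mutations named above; the function name `candidate_variants` is invented here, and the additional requirements on the presence or absence of other mutations, which depend on the confirmation status of the test, are not reproduced because the text does not spell them out.

```python
from typing import Iterable, List

# Defining S-gene mutations as listed in the text above.
# The real rule set imposes further conditions that are NOT modelled here.
DEFINING_SPIKE_MUTATIONS = {
    "Delta": "P681R",
    "Beta": "K417N",
    "Gamma": "K417T",
}


def candidate_variants(detected_mutations: Iterable[str]) -> List[str]:
    """Return the variants whose defining mutation appears in a test result.

    An empty list means none of the three rules matched. More than one
    entry means the result is ambiguous under these simplified rules.
    """
    detected = set(detected_mutations)
    return sorted(
        variant
        for variant, mutation in DEFINING_SPIKE_MUTATIONS.items()
        if mutation in detected
    )


print(candidate_variants({"P681R", "L452R"}))
print(candidate_variants({"D614G"}))
```

Returning a list instead of a single label keeps ambiguous or unmatched results visible, which is where the extra rules of a full scheme would have to decide.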

Researchers have suggested that multiple mutations can arise in the course of the persistent infection of an immunocompromised patient, particularly when the virus develops escape mutations under the selection pressure of antibody or convalescent plasma treatment, [227] [228] with the same deletions in surface antigens repeatedly recurring in different patients. [229]

The interplay between the SARS-CoV-2 virus and its human hosts was initially natural but is now being altered by the prompt availability of vaccines. [230] The potential emergence of a SARS-CoV-2 variant that is moderately or fully resistant to the antibody response elicited by the current generation of COVID-19 vaccines may necessitate modification of the vaccines. [231] Trials indicate that many vaccines developed for the initial strain have lower efficacy against symptomatic COVID-19 for some variants. [232] As of February 2021, the US Food and Drug Administration believed that all FDA authorized vaccines remained effective in protecting against circulating strains of SARS-CoV-2. [231]

Alpha (lineage B.1.1.7)

Limited evidence from various preliminary studies reviewed by the WHO has indicated retained efficacy/effectiveness against disease from Alpha with the Oxford–AstraZeneca vaccine, Pfizer–BioNTech and Novavax, with no data for other vaccines yet. Relevant to how vaccines can end the pandemic by preventing asymptomatic infection, these studies have also indicated retained antibody neutralization against Alpha with most of the widely distributed vaccines (Sputnik V, Pfizer–BioNTech, Moderna, CoronaVac, BBIBP-CorV, Covaxin), minimal to moderate reduction with the Oxford–AstraZeneca and no data for other vaccines yet. [233]

Early results suggest protection to the variant from the Pfizer-BioNTech and Moderna vaccines. [234] [235]

One study indicated that the Oxford–AstraZeneca COVID-19 vaccine had an efficacy of 42–89% against Alpha, versus 71–91% against other variants. [236]

Preliminary data from a clinical trial indicates that the Novavax vaccine is 96% effective for symptoms against the original variant and

Beta (lineage B.1.351)

Limited evidence from various preliminary studies reviewed by the WHO has indicated reduced efficacy/effectiveness against disease from Beta with the Oxford–AstraZeneca vaccine (possibly substantial), Novavax (moderate), Pfizer–BioNTech and Johnson & Johnson (minimal), with no data for other vaccines yet. Relevant to how vaccines can end the pandemic by preventing asymptomatic infection, these studies have also indicated possibly reduced antibody neutralization against Beta with most of the widely distributed vaccines (Oxford–AstraZeneca, Sputnik V, Johnson & Johnson, Pfizer–BioNTech, Moderna, Novavax minimal to substantial reduction) except CoronaVac and BBIBP-CorV (minimal to modest reduction), with no data for other vaccines yet. [233]

Moderna has launched a trial of a vaccine to tackle the Beta variant or lineage B.1.351. [238] On 17 February 2021, Pfizer announced neutralization activity was reduced by two-thirds for this variant, while stating that no claims about the efficacy of the vaccine in preventing illness for this variant could yet be made. [239] Decreased neutralizing activity of sera from patients vaccinated with the Moderna and Pfizer-BioNTech vaccines against Beta was later confirmed by several studies. [235] [240] On 1 April 2021, an update on a Pfizer/BioNTech South African vaccine trial stated that the vaccine was 100% effective so far (i.e., vaccinated participants saw no cases), with six of nine infections in the placebo control group being the Beta variant. [241]
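To see how a figure such as "100% effective" arises from case counts, the usual definition of efficacy as one minus the ratio of attack rates can be written out in Python. The sketch below is an assumption-laden illustration: the function name is invented, and the group size of 400 per arm is a hypothetical placeholder, since the text gives only the case counts (zero among the vaccinated, nine in the placebo group).

```python
def vaccine_efficacy(
    cases_vaccinated: int,
    n_vaccinated: int,
    cases_placebo: int,
    n_placebo: int,
) -> float:
    """Efficacy = 1 - (attack rate in vaccinated / attack rate in placebo)."""
    if n_vaccinated <= 0 or n_placebo <= 0:
        raise ValueError("group sizes must be positive")
    if cases_placebo == 0:
        raise ValueError("efficacy is undefined with no placebo cases")
    attack_rate_vaccinated = cases_vaccinated / n_vaccinated
    attack_rate_placebo = cases_placebo / n_placebo
    return 1.0 - attack_rate_vaccinated / attack_rate_placebo


# Hypothetical arm size of 400; only the case counts come from the text.
HYPOTHETICAL_ARM_SIZE = 400
efficacy = vaccine_efficacy(0, HYPOTHETICAL_ARM_SIZE, 9, HYPOTHETICAL_ARM_SIZE)
print(f"{efficacy:.0%}")
```

With zero cases among the vaccinated, the point estimate is 100% whatever the arm sizes are. With only nine placebo cases, however, the statistical uncertainty around that estimate is wide, which is why such interim figures are described as "so far".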

In January 2021, Johnson & Johnson, which held trials for its Ad26.COV2.S vaccine in South Africa, reported the level of protection against moderate to severe COVID-19 infection was 72% in the United States and 57% in South Africa. [242]

On 6 February 2021, the Financial Times reported that provisional trial data from a study undertaken by South Africa's University of the Witwatersrand in conjunction with Oxford University demonstrated reduced efficacy of the Oxford–AstraZeneca COVID-19 vaccine against the variant. [243] The study found that in a sample size of 2,000 the AZD1222 vaccine afforded only "minimal protection" in all but the most severe cases of COVID-19. [244] On 7 February 2021, the Minister for Health for South Africa suspended the planned deployment of about a million doses of the vaccine whilst they examine the data and await advice on how to proceed. [244] [245]

In March 2021, it was reported that the "preliminary efficacy" of the Novavax vaccine (NVX-CoV2373) against Beta for mild, moderate, or severe COVID-19 [246] for HIV-negative participants is 51%. [ medical citation needed ]

Gamma (lineage P.1)

Limited evidence from various preliminary studies reviewed by the WHO has indicated likely retained efficacy/effectiveness against disease from Gamma with CoronaVac and BBIBP-CorV, with no data for other vaccines yet. Relevant to how vaccines can end the pandemic by preventing asymptomatic infection, these studies have also indicated retained antibody neutralization against Gamma with Oxford–AstraZeneca and CoronaVac (no to minimal reduction) and slightly reduced neutralization with Pfizer–BioNTech and Moderna (minimal to moderate reduction), with no data for other vaccines yet. [233]

The Gamma variant or lineage P.1 variant (also known as 20J/501Y.V3), initially identified in Brazil, seems to partially escape vaccination with the Pfizer-BioNTech vaccine. [240]

Delta (lineage B.1.617)

Limited evidence from various preliminary studies reviewed by the WHO has indicated likely retained efficacy/effectiveness against disease from Delta with the Oxford–AstraZeneca vaccine and Pfizer–BioNTech, with no data for other vaccines yet. Relevant to how vaccines can end the pandemic by preventing asymptomatic infection, these studies have also indicated reduced antibody neutralization against Delta with Oxford–AstraZeneca (substantial reduction), Pfizer–BioNTech and Covaxin (modest to moderate reduction), with no data for other vaccines yet. [233]

Among some 15 defining mutations, it has spike mutations D111D (synonymous), G142D, [ medical citation needed ] P681R, E484Q [247] and L452R, [248] the latter two of which may cause it to easily avoid antibodies. [249]

A dangerous stage in the evolution of the novel coronavirus is upon us with the discovery of “escape mutations”. Artificial intelligence may be our best response

While it may take a while to see whether these escape mutations will evade the vaccines approved or in the pipeline, Tyler Starr from the Fred Hutchinson Cancer Research Center and colleagues report in a new study in Science an effect on two already available treatments: monoclonal antibodies. They’ve identified an escape mutation with a single glitch that enables the virus to evade Regeneron’s double-antibody REGN-COV2 “cocktail” (which Trump took) and a third antibody in Eli Lilly’s LY-CoV016. The researchers found the escapee using a new lab mapping technique that displays viruses contorted with mutation, and then they found it in a patient who was still testing positive 145 days after the first test.

What does this mean? The discovery of escape mutations derailing antibody treatments means that the companies’ initial tests hadn’t caught them all. And the escape mutations — the new mapping revealed three others — are already in circulation.

The researchers conclude, “Ultimately, it will be necessary to wait and see what mutations spread as SARS-CoV-2 circulates in the human population. Our work will help with the ‘seeing,’ by enabling immediate interpretation of the effects of the mutations cataloged by viral genomic surveillance.”

A scientific crystal ball to predict escape mutations to come is already in the works: artificial intelligence.

“Toto, I have a feeling we’re not in Oz anymore.”

Replacing just one word in a sentence can profoundly alter the meaning, even if the result is grammatically correct. Dorothy, of course, had been whisked out of Kansas and not Oz.

The same can be true for a virus. And this natural changeability may pose serious challenges to health officials everywhere. Small alterations in the viral genome can have no effect at all — or be so profound as to ultimately shift the course of the pandemic. The mutation that enhances transmissibility in the UK, South African, and Brazilian variants now spreading around the globe began as just a single RNA base change. Particularly distressing are the escape mutations that enable a virus to essentially hide in plain sight. A mutant virus then can more easily replicate and jump from person to person, spreading infection.
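How a single RNA base change can swap one amino acid for another can be shown with a tiny slice of the standard genetic code. The Python sketch below is illustrative only: the codons are taken from the standard genetic code because they encode the relevant amino acids, and they are not asserted here to be the exact codons at those positions in the viral genome. The helper names are made up for this example.

```python
# A small excerpt of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AAU": "N",  # asparagine
    "UAU": "Y",  # tyrosine
    "GAU": "D",  # aspartic acid
    "GGU": "G",  # glycine
    "GAA": "E",  # glutamic acid
    "AAA": "K",  # lysine
}


def point_mutate(codon: str, index: int, new_base: str) -> str:
    """Return the codon with the base at `index` (0, 1 or 2) replaced."""
    if len(codon) != 3 or index not in (0, 1, 2) or new_base not in "ACGU":
        raise ValueError("expected a 3-base codon, index 0-2, base in ACGU")
    return codon[:index] + new_base + codon[index + 1:]


def translate(codon: str) -> str:
    """Look up the amino acid for a codon in the excerpt above."""
    return CODON_TABLE[codon]


# One base changed in each case, and the encoded amino acid changes with it.
for codon, index, base in [("AAU", 0, "U"), ("GAU", 1, "G"), ("GAA", 0, "A")]:
    mutated = point_mutate(codon, index, base)
    print(f"{codon} ({translate(codon)}) -> {mutated} ({translate(mutated)})")
```

The three examples mirror the amino acid swaps N to Y, D to G and E to K. Many other single-base changes leave the amino acid unchanged, which is one reason most mutations have no effect at all.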

A tool from a team of computer scientists thinking about the nuances of language may be able to alert us to such future mutations. In “Learning the language of viral evolution and escape,” a recent article in Science, Brian Hie and colleagues at the Computer Science and Artificial Intelligence Laboratory at MIT described how they are harnessing machine learning to predict rogue viruses that hide from our immune responses. The researchers “have uncovered a parallel between the properties of a virus and its interpretation by the host immune system and the properties of a sentence in natural language and its interpretation by a human,” wrote Yoo-Ah Kim and Teresa M. Przytycka, from the National Library of Medicine, in an accompanying Perspective.

A mutation changes the sequence of the amino acid building blocks that build a protein. An escape mutation alters that sequence in a way that makes the protein, and the pathogen it’s part of, invisible to the immune system. And that enables the virus to go about the business of making more of itself unchecked.

The researchers equated a change in the sequence of amino acids of a protein to a change in the sequence of letters of a sentence. Then they adapted artificial intelligence methods developed for linguistics applications to recognize the nuances of known escape mutations and then use the information to identify new ones, testing the approach on familiar, well-studied viruses.

Virology meets linguistics

Machine learning is an AI technique in which “training” on one dataset is used to extract meaning from new information, enabling predictions. Applications are familiar and eclectic: voice recognition, predicting financial meltdowns, developing new programming for streaming services, finding hidden patterns in fine art, and filling Facebook feeds with those annoyingly spot-on ads.

For analyzing the virus behind COVID, machine learning “trains” on databases of thousands of known mutations, seeking single-RNA-base substitutions that enhance the ability of the virus to hide from the immune system. Presumably natural selection would have weeded out mutations that didn’t help the virus. Identifying health-relevant mutations would be faster than sequencing entire viral genomes. That’s TMI.

The researchers turned to an AI machine learning technique originally developed to train computers to understand human language using sequences of words that have distinct meanings. The “words” of a virus are the amino acids that build its proteins.

“Toto, I have a feeling we’re not in Kansas anymore” has a literal meaning between Dorothy and her dog, but the deeper meaning is the emotion of being suddenly flung far away from home. Similarly, certain amino acid changes in certain parts of a key protein – the spike – spell “escape,” and the virus instantaneously has an advantage.
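The sentence analogy suggests a simple scoring idea: a worrying mutation is one that still "reads" as a valid protein yet changes its "meaning" to the immune system. The toy Python sketch below ranks candidates by combining those two properties through their ranks. Everything in it is an assumption for illustration: the mutation names and scores are invented, and in the published work such scores come from a trained neural language model, not a hand-written table.

```python
from typing import Dict, List


def ranks(values: Dict[str, float]) -> Dict[str, int]:
    """Rank names by value: the lowest value gets rank 1."""
    ordered = sorted(values, key=lambda name: values[name])
    return {name: position + 1 for position, name in enumerate(ordered)}


def rank_escape_candidates(
    plausibility: Dict[str, float],
    meaning_change: Dict[str, float],
    beta: float = 1.0,
) -> List[str]:
    """Order candidates from most to least escape-like.

    A candidate scores highly only if it is both plausible (still a
    working protein, like a grammatical sentence) and changes the
    meaning (looks different to the immune system). `beta` weights
    plausibility against meaning change and is a free parameter here.
    """
    plausibility_rank = ranks(plausibility)
    meaning_rank = ranks(meaning_change)
    combined = {
        name: meaning_rank[name] + beta * plausibility_rank[name]
        for name in plausibility
    }
    return sorted(combined, key=lambda name: combined[name], reverse=True)


# Invented scores for four hypothetical mutations.
toy_plausibility = {"mut_a": 0.9, "mut_b": 0.2, "mut_c": 0.7, "mut_d": 0.1}
toy_meaning_change = {"mut_a": 0.1, "mut_b": 0.6, "mut_c": 0.8, "mut_d": 0.2}
print(rank_escape_candidates(toy_plausibility, toy_meaning_change))
```

In this toy data, `mut_c` comes out on top because it scores well on both axes, while `mut_a` (plausible but unchanged in meaning) and `mut_b` (changed but implausible) fall behind it. Combining ranks rather than raw scores avoids having to put the two quantities on a common scale.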

Validation on influenza and HIV

To test this linguistic approach to predicting scary new COVID mutations, the researchers looked at proteins from two viruses that have plagued humans for decades: influenza A and HIV.

For influenza, the algorithm compared the “head” and “stalk” regions of sugar-tipped proteins that jut from the viral surface like Tootsie roll pops. Escape mutants arise from the head region of the proteins because antibodies glom onto the stalk, preventing those viruses from replicating. In fact, the instability of the heads is what has stymied attempts to develop a universal flu vaccine.

(Flu viruses change too often and in too many ways for any one vaccine to fight off all variants. That’s why we need new flu shots every year. But flu virus surfaces are not at all like those of the spiky coronaviruses.)

Similarly, for HIV, the AI algorithm zeroed in on part of a sugar-tipped protein that forms a highly variable part of the virus’s outermost layer, the envelope. The protein was already known to spawn new mutations, which suggests the AI's prediction is accurate.

AI also painted what the investigators call “semantic landscapes” that illuminated viral evolution. For example, one part of the 2009 pandemic influenza A surface matched a telltale amino acid sequence from the virus that caused the 1918 flu.

The AI predictions for SARS-CoV-2 are consistent with those for the better known viruses: the immune system “sees” surfaces, such as the tootsie-roll pops and spikes.

It’s also intriguing to compare the amino acid sequences of spike proteins from the new coronavirus to the older ones that cause SARS and MERS. Matching spikes among the viruses and 22 types of mammals showed that bats and pangolins gave us SARS-CoV-2, camels transmitted MERS, and SARS came from civets and bats. (One theory of viral origin is that they came from the genomes of their hosts.)

The spike spawns escapees

Trios of spikes crown the coronaviruses, hence their name. Each spike has two parts. Subunit 1 (S1) binds the spike to the receptor (ACE2) on cells of the lungs and elsewhere, attaching like a fighter plane landing on an aircraft carrier. Subunit 2 (S2) then fuses with the cell membrane, creating an entryway for the virus into the cell.

The central role of the spike in infection is why it is the target of treatments for COVID-19 and the basis of vaccines (see “COVID-19 Vaccine Will Close in on the Spikes”).

Using the AI technique described in the new study revealed precisely where in the spike protein’s amino acid sequence escape mutations are most likely to arise, and where they’re not. And that could be important intel for predicting the next threatening viral variant.

The escapees come from two specific parts of the spike, and are much less likely to come from a third place.

The two parts of the spike most likely to spawn escapees are where subunit 1 grabs the receptor on our cells and at one end of the protein. That’s also where the new mapping technique that discovered the escape mutations that evade antibody treatments zeroed in – the receptor binding domain.

The part of the spike least likely to give rise to escape mutations, according to the AI algorithm, is the smaller subunit 2, the part that ushers the impinging virus across the cell membrane into the cell. The fact that the doorway into our cells doesn’t change much means that it’s helping the virus – that’s natural selection at work.

The bottom line is that we should focus on subunit 1 to catch future escapees, hopefully before they surreptitiously spread around the planet. We must get the spike where it binds – that’s the virus’s Achilles heel.

Even though viruses have points of vulnerability, they don’t mutate on purpose; they’re not ‘trying’ to make more of us sick or sabotage our vaccines. Instead, they are in a Darwinian battle for survival. Mutations are a consequence of errors when genetic material copies itself, like a biochemical typo. If a mutation brings a survival advantage, it persists. Mutation and natural selection fuel evolution.

Getting ahead of the virus

Instead of discovering the latest variation on the coronavirus theme weeks or months after people have already exhaled it on international flights or other forms of traveling, virus trackers can use the new tool to actively watch for specific escape mutations, as well as for combinations of mutations into novel variants. Perhaps a catalog of hotspots in the SARS-CoV-2 genome, derived from many iterations of machine learning surveillance, can be translated into a rapid test applied to COVID test swabs, instead of sequencing entire viral genomes.

Zoonotic Paramyxoviruses

Emergence of Henipaviruses

Phylogenetic analyses show that Nipah and Hendra viruses are old viruses, 33,50 which suggests that their emergence in the 1990s was due to ecologic factors rather than virus mutations. 51 Ecologic change that drew flying foxes closer to horses, pigs, and humans was probably the largest contributor to the emergence of Hendra and Nipah viruses. Deforestation has caused flying foxes to move into suburban and urban areas to use the trees in these regions for roosting. Climate change is likely to be causing an expansion of the geographic areas that are suitable for the bat host species of henipaviruses. 52

How do viruses mutate and jump species?

Viruses are little more than parasitic fragments of RNA or DNA. Despite this, they are astonishingly abundant in number and genetic diversity. We don't know how many virus species there are, but there could be trillions.

Past viral epidemics have influenced the evolution of all life. In fact, about 8% of the human genome consists of retrovirus fragments. These genetic "fossils" are leftover from viral epidemics our ancestors survived.

COVID-19 reminds us of the devastating impact viruses can have, not only on humans, but also animals and crops. Now for the first time, the disease has been confirmed in a tiger at New York's Bronx Zoo, believed to have been infected by an employee. Six other tigers and lions were also reported as "showing symptoms."

According to the BBC, conservation experts think COVID-19 could also threaten animals such as wild gorillas, chimps and orangutans.

While virologists are intensely interested in how viruses mutate and transmit between species—and understand this process to an extent—many gaps in knowledge remain.

Skilled in their craft

Most viruses are specialists. They establish long associations with preferred host species. In these relationships, the virus may not induce disease symptoms. In fact, the virus and host may benefit each other in symbiosis.

Occasionally, viruses will "emerge" or "spillover" from their original host to a new host. When this happens, the risk of disease increases. Most infectious diseases that affect humans and our food supply are the result of spillovers from wild organisms.

The new coronavirus (SARS-CoV-2) that emerged from Wuhan in November isn't actually "new." The virus evolved over a long period, probably millions of years, in other species where it still exists. We know the virus has close relatives in Chinese rufous horseshoe bats, intermediate horseshoe bats, and pangolins - which are considered a delicacy in China.

Past coronaviruses, including the severe acute respiratory syndrome coronavirus (SARS-CoV), have jumped from bats to humans via an intermediary mammal. Some experts propose Malayan pangolins provided SARS-CoV-2 this link.

Although the original host of the SARS-CoV-2 virus hasn't been identified, we needn't be surprised if the creature appears perfectly healthy. Many other coronaviruses exist naturally in wild mammal and bird populations around the world.

Where do they keep coming from?

Human activity drives the emergence of new pathogenic (disease-causing) viruses. As we push back the boundaries of the last wild places on Earth—felling the bush for farms and plantations—viruses from wildlife interact with crops, farm animals and people.

Species that evolved separately are now mixing. Global markets allow the free trade of live animals (including their eggs, semen and meat), vegetables, flowers, bulbs and seeds – and viruses come along for the ride.

Humans are also warming the climate. This allows certain species to expand their geographical range into zones that were previously too cold to inhabit. As a result, many viruses are meeting new hosts for the first time.

Smuggled pangolins are killed for their scales to be used in traditional Chinese medicine. They are suspected to be the world’s most-trafficked mammal, apart from humans. Credit: Shutterstock

How do they make the jump?

Virus spillover is a complex process and not fully understood. In nature, most viruses are confined to particular hosts because of specific protein "lock and key" interactions. These are needed for successful replication, movement within the host, and transmission between hosts.

For a virus to infect a new host, some or all protein "keys" may need to be modified. These modifications, called "mutations," can occur within the old host, the new one, or both.

For instance, a virus can jump from host A to host B, but it won't replicate well or transmit between individuals unless multiple protein keys mutate either simultaneously, or consecutively. The low probability of this happening makes spillovers uncommon.
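The arithmetic behind "low probability" can be sketched in a few lines. This is a toy calculation rather than a model of any real virus: it assumes each required change arises independently with the same chance per genome copy, and both the per-change probability and the number of copies below are hypothetical placeholders. It covers only the "simultaneous" case; changes that accumulate one after another across generations are not modelled.

```python
import math


def joint_probability(per_change_prob: float, changes_needed: int) -> float:
    """Chance that all required changes arise together in a single copy,
    assuming each arises independently with the same probability."""
    return per_change_prob ** changes_needed


def chance_at_least_once(p: float, copies: float) -> float:
    """Chance that an event of probability p happens in at least one of
    `copies` independent copying events: 1 - (1 - p) ** copies."""
    # expm1/log1p keep precision when p is extremely small.
    return -math.expm1(copies * math.log1p(-p))


if __name__ == "__main__":
    p = 1e-5      # hypothetical chance of one specific change per copy
    copies = 1e9  # hypothetical number of genome copies made in host B
    for k in (1, 2, 3):
        joint = joint_probability(p, k)
        print(k, joint, chance_at_least_once(joint, copies))
```

Under these assumptions each additional required change multiplies the per-copy chance by the same small factor, which is why needing several changes at once makes a successful jump so much rarer than needing one.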

To better understand how spillovers occur, imagine a virus is a short story printed on a piece of paper. The story describes:

how to live in a specific cell type, inside a specific host

how to move to the cell next door

how to transmit to a new individual of the same species.

The short story also has instructions on how to make a virus photocopying machine. This machine, an enzyme called a polymerase, is supposed to churn out endless identical copies of the story. However, the polymerase occasionally makes mistakes.

It may miss a word, or add a new word or phrase to the story, subtly changing it. These changed virus stories are called "mutants." Very occasionally, a mutant story will describe how the virus can live inside a totally new host species. If the mutant and this new host meet, a spillover can happen.

We can't predict virus spillovers to humans, so developing vaccines preemptively isn't an option. There have been ongoing discussions of a "universal flu vaccine," which would provide immunity against all influenza virus mutants. But so far this hasn't been possible.

Let wildlife be wildlife

Despite how many viruses exist, relatively few threaten us, and the plants and animals we rely on.

Nonetheless, some creatures are especially dangerous on this front. For instance, coronaviruses, Ebola and Marburg viruses, Hendra and Nipah viruses, rabies-like lyssaviruses, and mumps/measles-like paramyxoviruses all originate from bats.

Given the enormous number of viruses that exist, and our willingness to provide them global transport, future spillovers are inevitable. We can reduce the chances of this by practising better virus surveillance in hospitals and on farms.

We should also recognise wildlife, not only for its intrinsic value, but as a potential source of disease-causing viruses. So let's maintain a "social distance" and leave wildlife in the wild.

This article is republished from The Conversation under a Creative Commons license.


Control of emerging infectious diseases will be difficult because of the large number of disease-causing organisms that are emerging or could emerge and the great diversity of geographic areas in which emergence can occur. The modern view of the evolution of pathogen virulence—specifically its focus on the tradeoff between costs and benefits to the pathogen from increased host exploitation—allows control programs to identify and focus on the most dangerous pathogens (those that can be established with high virulence in human populations).

Studies of emerging diseases have focused chiefly on the spectrum of different emerging pathogens, epidemiologic reasons for emergence, and interventions to control emergence. The feasibility of disease control is hampered by the potentially vast number of emerging and reemerging pathogens, the diversity of geographic sources, the potential for rapid global dissemination from these sources, and numerous ecologic and social factors influencing emergence (1-4). Disease control could be made more manageable if the most dangerous pathogens could be singled out for the most intense study, surveillance, and control efforts. Experts who have addressed this problem from an epidemiologic but not an evolutionary perspective disagree about the feasibility of predicting and preventing the emergence of the most damaging new pathogens (5-8). In this perspective, I argue that improved understanding of the evolution of virulence (defined broadly as the harmfulness of an infection) can make this goal more feasible in two ways: 1) by facilitating identification and blocking of pathogens that represent the greatest threat should they become established in human populations (e.g., Yersinia pestis during the Middle Ages and human immunodeficiency virus [HIV] during recent decades) and 2) by providing methods for inhibiting the emergence of particularly virulent variants of pathogens that are already established in human populations (e.g., the pathogen that caused the 1918 influenza pandemic and virulent, antibiotic-resistant strains of Staphylococcus aureus).

Modern understanding of the evolution of virulence focuses on a tradeoff to which pathogens are subjected: the competitive benefits that pathogens accrue through increased exploitation of hosts and the costs that result from any effects of disease that reduce infectious contact between infected and susceptible hosts. The traditional view presumed that natural selection would favor evolution toward benign coexistence between host and parasite (9-12). The modern view, however, stresses that such benign coexistence will be unstable if pathogens that exploit hosts to a greater degree have more overall success across transmission cycles than those that achieve benign coexistence (13-17).

The primary assumption of this evolutionary argument is that increased virulence is correlated with increased pathogen propagation (manifested as increases in pathogen reproduction within hosts and/or pathogen shedding from infected hosts). This correlation need not be strong across host/pathogen associations for the arguments to be valid; differences in pathogenic mechanisms, for example, could make the correlation virtually undetectable when extremely different kinds of pathogens are compared. Rather, the tradeoff argument states that for a given pathogen (with its particular tropisms and pathogenic mechanisms), mutations that increase the level of host exploitation tend to increase harmfulness. The association between virulence, exploitation, and pathogen propagation is expected among "wild type" mutants, but not among novel laboratory-generated virulent ones. Because there are many routes to increased virulence and laboratory-generated variants are often not selected on the basis of competitive superiority in vivo, the increased virulence of variants generated in the laboratory may not be linked to propagative superiority. In contrast, natural selection should eliminate any variants for which increases in virulence are not linked to increases in pathogen fitness.

The connection between virulence, host exploitation, and pathogen propagation may be indirect or direct. If the pathogenic mechanism involves toxin production, a positive association is expected between toxin production and pathogen propagation. In Vibrio cholerae, for example, high toxin production is associated with increased densities of vibrios in the fecal material, apparently as a result of the toxin's flushing of competing organisms from the intestinal tract (17). In other organisms, the association between virulence, host exploitation, and pathogen propagation is more direct. The human plasmodia that reproduce more extensively often cause more severe illness and are more life-threatening (16). Similarly, more virulent strains of vector-borne dengue virus reproduce more extensively in cell culture (18). Growth rates of Salmonella typhimurium were reduced by eliminating one of its virulence plasmids and inhibiting the plasmid's expression; introduction of an 8-kb region encoding the spv genes restored increased growth rate (19). Comparison of Shigella species suggests a similar association between virulence and pathogen reproduction (20).

Sexually transmitted pathogens show analogous associations. For the best studied pathogen, HIV, more rapidly replicative HIVs are associated with greater cellular destruction in vitro, more rapid destruction of the immune system, and more rapid onset of AIDS (21-35). Similarly, the more oncogenic serotypes of human papillomaviruses (HPV) generate greater numbers of progeny by interfering with the cell's mechanisms for restricting cell division (36). For both viruses, increased viral loads are associated with increased probability of transmission to contacted persons (37-39), and HIV-1, which propagates to higher densities than HIV-2, is more transmissible per contact (40).

The association between virulence and viral propagation in pathogens circulating naturally in human populations therefore supports the modern emphasis on a tradeoff between the fitness benefits and the costs accrued by pathogens as a function of changes in host exploitation.

Transmission Associated with High Virulence

Transmission from Immobile Hosts

Like the traditional view of host/parasite coevolution, the modern view identifies host illness as a potential liability for the pathogen. When pathogens rely on the mobility of their current host to reach susceptible hosts, the illness caused by intense exploitation typically reduces the potential for transmission. The modern perspective on host/parasite coevolution differs from the traditional one, however, in its emphasis on weighing these setbacks against the benefits of exploitation: high virulence can contribute to evolutionary stability if the costs incurred by parasites from exploitation-induced damage are particularly small and/or the benefits obtained from exploitation are particularly big. Thus, if host immobilization has little negative effect on transmission, pathogen variants that exploit the host so intensely that it is immobilized will reap the benefits of exploitation. Put more generally, when the costs incurred from transmission associated with immobilization are small, the costs of exploitation should outweigh the benefits at a higher level of exploitation—and hence virulence—than would occur if immobilization severely impaired transmission (16).
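The tradeoff described above can be illustrated with a deliberately simple toy model. Everything in it is an assumption made for illustration and does not come from the article: exploitation is scaled to the range 0 to 1, the benefit of exploitation is taken to be linear, and the share of transmission opportunities a pathogen keeps is taken to fall linearly with exploitation at a rate set by a hypothetical `mobility_dependence` parameter.

```python
def transmission_success(exploitation: float, mobility_dependence: float) -> float:
    """Toy fitness: benefit grows with exploitation, while the share of
    transmission opportunities kept shrinks as the host is immobilized."""
    benefit = exploitation  # assumed linear benefit of exploiting the host
    opportunities_kept = max(0.0, 1.0 - mobility_dependence * exploitation)
    return benefit * opportunities_kept


def optimal_exploitation(mobility_dependence: float, steps: int = 100) -> float:
    """Grid-search the exploitation level in [0, 1] that maximizes the toy fitness."""
    levels = [i / steps for i in range(steps + 1)]
    return max(levels, key=lambda x: transmission_success(x, mobility_dependence))


if __name__ == "__main__":
    # 1.0: transmission relies fully on a mobile host (hypothetical value).
    # 0.2: transmission mostly continues from an immobilized host (hypothetical value).
    for dependence in (1.0, 0.6, 0.2):
        print(dependence, optimal_exploitation(dependence))
```

In this sketch the favored level of exploitation rises as dependence on host mobility falls, which is the qualitative pattern the argument predicts. The specific numbers carry no biological meaning.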

Recognizing this version of the general tradeoff led to several predictions: Because vector-borne parasites can be transmitted effectively from immobilized hosts, they should evolve to a higher level of virulence than directly transmitted parasites (16). Similarly, aspects of human behavior and culture can form "cultural vectors," which transmit pathogens from immobile to susceptible hosts (41). For example, diarrheal pathogens that are largely waterborne should evolve to relatively high levels of virulence because effective transmission can occur even when infected hosts are immobilized: persons carrying contaminated clothing and bedding, the water used for washing bed sheets, and the movement of contaminated water into drinking water together act like a swarm of mosquitoes, transmitting pathogens from the immobilized host. Attendant-borne pathogens should also become virulent. Attendant-borne transmission often occurs in hospitals, when nurses and physicians transmit pathogens from one immobilized patient to another. A reciprocal process occurs when parasites rely on the mobility of susceptible persons rather than the mobility of the infected hosts to reach the susceptible persons. Parasites that are durable in the external environment should thus evolve toward a higher level of virulence than nondurable pathogens because durable pathogens may remain viable in the environment until the movement of susceptible individuals brings them into contact with the pathogens.

Each of these hypotheses has been evaluated, and in each case the expected association occurred: virulence is positively associated with vector-borne transmission, waterborne transmission, attendant-borne transmission, and durability in the external environment (Table 1). This evolutionary framework, therefore, explains the diversity of human parasites in a way that contrasts starkly with the traditional view. Instead of being seen as a sign of maladaptation, the severity of diseases such as malaria, tuberculosis, smallpox, cholera, and typhoid fever is seen as a consequence of evolutionary adaptation because the causative parasites do not rely on host mobility for transmission. The tradeoffs between the benefits and costs of exploitation, therefore, favor evolution of relatively high levels of exploitation for such pathogens and hence high degrees of harm to the host.

Sexual Transmission

The evolutionary tradeoffs associated with virulence in sexually transmitted diseases involve the requirements for sexual transmission imposed on the pathogens by the sexual behavior of the host. Short durations of infections would be ineffective for most sexually transmitted pathogens. If people changed sex partners once per year, for example, a pathogen that was rendered noninfectious by immunologic defenses or the host's death within a few weeks would have little chance of being transmitted. To survive, the pathogen must be transmissible for a period that extends into the time of the next sexual partnership. To prosper, the pathogen must be transmissible for periods that span more than one change in sex partners; therefore, sexually transmitted pathogens may often need cell and tissue tropisms that keep them from being eliminated by the immune system for relatively long periods.
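The once-per-year example can be turned into a rough calculation. The sketch below assumes, purely for illustration, that partner changes arrive at random as a Poisson process with a constant yearly rate. The article does not specify any such model, and the function name and durations are hypothetical.

```python
import math


def chance_of_spanning_partner_change(changes_per_year: float,
                                      infectious_years: float) -> float:
    """Chance that at least one partner change falls inside the infectious
    period, assuming partner changes follow a Poisson process."""
    return 1.0 - math.exp(-changes_per_year * infectious_years)


if __name__ == "__main__":
    rate = 1.0  # one partner change per year, as in the example above
    for label, years in (("3 weeks", 3 / 52), ("2 years", 2.0)):
        print(label, round(chance_of_spanning_partner_change(rate, years), 3))
```

Under this assumption, an infection that stays transmissible for only a few weeks rarely overlaps a partner change, while one lasting years almost always does.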

The evolutionary effects of changes in sexual behavior on virulence may be strongly influenced by tropisms that were present before the behavior change. Increased potential for sexual transmission should favor pathogen variants that reproduce more extensively sooner after the onset of infection. If the preexisting tropisms target nonessential cell types, this selection for earlier reproduction will have relatively little effect on virulence. If, for example, people changed sex partners every few days, the sexually transmitted pathogen should evolve virulence levels much like those of respiratory tract pathogens, which rely on host mobility for transmission. Examples of such pathogens are sexually transmitted unicellular pathogens such as Neisseria gonorrhoeae and Chlamydia trachomatis, which tend to infect mucosal tissues and, therefore, have relatively minor negative effects on the survival of adult hosts. If, however, the tropisms involve critical cells, the damage associated with increased levels of host exploitation should be more severe to the host. HIV provides an example: HIV has a tropism for helper T cells, which are critical regulators of immunologic responses. Although a high level of replication in these cells can be tolerated over short periods, it eventually leads (by mechanisms that are still being clarified) to the decimation of this category of cells and the collapse of the immune system.

If these arguments about evolutionary forces and tissue tropisms are applicable to HIV, HIVs should be more virulent in areas where the potential for sexual transmission is greater. In accordance with this prediction, HIV-2 tends to be less virulent than HIV-1; moreover, evidence indicates that during the early years of HIV infection in Africa, HIV-2 tended to be transmitted in populations having a lower potential for sexual transmission (17,20). The overall validity of this approach to HIV virulence, however, will be better tested as different variants of HIV emerge in different geographic regions. Information about the potential for sexual transmission can help predict the evolution of HIV virulence in different geographic areas. On the basis of the evolutionary tradeoffs mentioned above, for example, the type E HIV-1s that are circulating in Thailand (where the potential for sexual transmission has been great) are predicted to be particularly virulent (17). Although this prediction needs to be evaluated rigorously, recently gathered data support the prediction: the decline in CD4+ cell counts of persons infected with HIV and the progression of illness in these patients appear to be particularly rapid in Thailand (43-44).

The most important application of this evolutionary approach to HIV, however, pertains to interventions that can be used to control the future evolution of HIV. If the inherent virulences of HIVs depend evolutionarily on the potential for sexual transmission, interventions that reduce this potential should have a long-term evolutionary effect, as well as widely recognized short-term epidemiologic effects: in addition to reducing the spread of HIV infection, such interventions should reduce the harmfulness per infection. Follow-up of persons infected with HIV-1 for more than a decade without deterioration of the immune system indicates that the mildness of the infections is sometimes attributable to inherently mild viruses (45-47). The raw material for this evolutionary change, therefore, appears to be already present in the HIV gene pool.

In Japan, which has a relatively low potential for sexual transmission (48), type E HIV-1s have recently been introduced from Southeast Asia. If a low potential for sexual transmission favors evolution toward mildness, the Japanese type E viruses should become milder over the next few decades.

Assessing the Threat Posed by Pathogens

Assessment Goals

Focusing investigative and intervention efforts on the most significant disease threats makes sense only if the threats can be reliably assessed. The long-term threat depends on the evolutionary stability of high pathogen virulence, and the most dangerous pathogens are those that threaten widespread persistence with severely damaging manifestations. One of the most important tasks in controlling emerging diseases is to identify and block such pathogens during the early stages of emergence, or better yet, before they emerge. If the most dangerous pathogens—the future analogs of the causes of AIDS, malaria, smallpox, tuberculosis, and cholera—could be effectively blocked, the effort against emerging diseases would be successful. If not, the effort may be looked on as a failure in spite of successes against pathogens that are less able to effectively penetrate human populations or relatively benign when they do establish themselves. The emergence, spread, and persistence of pathogens with the characteristics of rhinoviruses, for example, would not be looked on as a great failure. The establishment of such pathogens would hardly be noticed against the current backdrop of mild to moderately severe respiratory tract pathogens.

To identify pathogens that must be studied and controlled most intensively, each pathogen should be assessed for two characteristics that are associated with high virulence: 1) an ability to spread well from human to human (directly or indirectly through vectors) rather than infecting humans as dead-end hosts, and 2) transmission features that select for high levels of virulence.

The existing associations between virulence and transmission characteristics (Table 1) can be used to make such identifications. Table 2 offers a checklist that could be applied to each emerging pathogen to determine whether it makes the first cut in the process of identifying the most dangerous candidates. Subsequent analyses of the pathogens would then assess the nature of any barriers that limit the establishment of pathogens in human populations (e.g., the absence of suitable arthropod vectors for large proportions of the year).


Although durability in various external environments was quantified in detail by microbiologists during the first half of this century (49), modern studies have paid this attribute little attention. Evolutionary considerations, however, indicate that it should be one of the first variables quantified when a new pathogen is being studied. If a new, directly transmitted pathogen can remain viable in the external environment for many days to many weeks, it falls in the category of especially dangerous pathogens. If, for example, Ebola virus were viable upon natural desiccation for weeks instead of hours, its level of host exploitation and potential for transmission from exploited hosts would not be so mismatched, and it, like smallpox virus, would pose a much more serious threat. Durability in the external environment depends largely on environmental conditions (49), and thus assessments of viability should cover all feasible environmental conditions.

Vector-borne Transmission

The most serious threat involved in vector-borne transmission comes from pathogens that can be maintained by human/mosquito cycles but are absent from suitable areas because of historical accidents or past eradication campaigns. Dengue and malaria are members of this category; they have the potential to spiral out of control immediately upon release into areas with suitable vectors. Nonevolutionary analyses of emerging infections recognize the threat posed by these pathogens because their damaging effects on human populations are known.

Vector-borne pathogens that have not used humans as the primary vertebrate host but may be capable of doing so represent less easily recognized threats. Evolutionary considerations heighten concern because such vector-borne pathogens are expected to become increasingly harmful as they become adapted to human/vector cycles of transmission (16).

Rift Valley fever virus provides an example. For most of this century, this virus was believed to infect humans only as dead-end hosts. Although it was vector-borne in ungulates, humans were seen as acquiring the infection either when involved in the slaughtering process or when bitten by mosquitoes that had acquired infection from other vertebrates. Recent outbreaks have spread to an extent consistent with substantial human/mosquito cycling, but the existence of such cycling has not been conclusively documented. If human/mosquito cycling is occurring, the door is open for further adaptation to humans and for evolution of increased virulence in humans, increased efficiency of human/vector transmission, and increased spread through human populations. Rift Valley fever virus viremias seem sufficient for human/mosquito cycling, and the lethality of the largest outbreaks was particularly high, as one would expect if some evolution toward increased virulence accompanied a temporary establishment of human/mosquito cycles (50-51). To assess the long-term threat posed by Rift Valley fever virus and to block this virus should it prove to be particularly threatening, we need to emphasize the following research priorities: 1) study the transmission of Rift Valley fever virus in human/mosquito cycles, 2) assess the potential for such transmission over extended periods, and 3) evaluate the effects of such transmission on virus virulence.

Not all emerging vector-borne pathogens need be viewed as equally threatening. For example, Borrelia burgdorferi, the agent of Lyme disease (an emerging vector-borne pathogen in human populations in North America), does not need to be monitored to avoid its establishment as a human pathogen because once emerged, it does not threaten to spiral out of control; it is tick-borne, and ongoing human/tick cycles are not feasible because of the limited exposure of infected humans to susceptible tick populations of the appropriate instar. Tick- and mite-borne rickettsiae do not present a great threat for similar reasons.

Sexual Transmission

The tradeoff concerning sexually transmitted pathogens may prove particularly useful in identifying pathogens that are capable of sexual transmission and have cell tropisms that would cause severe damage if host exploitation increased but have not had high potential for sexual transmission. Human T-cell lymphotropic virus (HTLV) is in this category, even though by nonevolutionary criteria it could be dismissed because it has been geographically widespread in humans for a long time (1). HTLV type 1 (HTLV-I) is less damaging than HIV; it kills or severely handicaps 5% to 10% of the people it infects, generally decades after infection. Although HTLV-I and HIV infections share many characteristics, HTLV does not have HIV's high mutation rate and hence does not have the potential for staying ahead of immune responses and eventually decimating the immune system. Instead, HTLV relies on modes of transmission that do not expose it to the immune system: proviral replication through stimulation of host cell proliferation and transmission through cell-to-cell contact. A concern with HTLV is that a high potential for sexual transmission may favor increased rates of viral replication leading to increased exposure to the immune system and increased mutation rates (48).

A preliminary step toward evaluating the threat posed by the emergence of particularly virulent HTLVs is assessing whether HTLVs exposed to different levels of potential for sexual transmission vary in virulence. HTLV-I infections tend to lead to leukemias and lymphomas at younger ages in Jamaica, where the potential for sexual transmission is high, than in Japan, where potential for sexual transmission is low (48). This difference also occurs among North Americans of Japanese and Caribbean descent (52), who presumably are infected predominantly (if not exclusively) by Japanese and Caribbean HTLVs, respectively. The inherent virulence and mutation-proneness of the Japanese and Caribbean HTLVs need to be assessed. Similarly, HTLV virulence needs to be better studied in regions of Africa where it has been long endemic to determine whether variations in HTLV virulence are correlated with the potential for sexual transmission.

Although mutation-prone sexually transmitted viruses that infect critical cell types are particularly threatening, sexually transmitted viruses in general deserve special attention. Even if a sexually transmitted virus invades only epithelial cells and replicates with low mutation rates, a high potential for sexual transmission may lead to evolution of increased lethality. Death caused by HTLV-induced lymphomas and leukemias is one manifestation of the danger posed by an RNA virus that replicates substantially in its DNA form and hence is in a middle area within the spectrum of mutation-proneness. HPVs illustrate dangers posed by sexually transmitted viruses that, because they are DNA viruses, are even further away from HIV on the mutation-proneness continuum. The mechanism by which HPV nudges infected cells toward cancer is associated with increased viral replication; moreover, high potential for sexual transmission (as indicated by the number of lifetime sex partners) is a strong risk factor for infection with the more oncogenic HPV serotypes but not for the mild HPV serotypes (53). This association supports the idea that reductions in the potential for sexual transmission should cause evolution of reduced HPV virulence. Specifically, as the potential for sexual transmission decreases, the risk for acquiring the oncogenic serotypes (vs benign serotypes) should disproportionately decrease. Similarly, if interventions prevent the potential for sexual transmission from increasing, the emergence of oncogenic HPV serotypes should be disproportionately suppressed.

Waterborne Transmission

Although such pathogens as Vibrio cholerae O139 and Shigella dysenteriae type 1 threaten emergence in countries with inadequate water supplies, the threat is much lower in countries with safe water supplies. Although such pathogens continue to be brought into the countries with safe water supplies by travelers and commerce, the pathogens show little potential for emergence. For example, a major epidemic of S. dysenteriae type 1 spread from Guatemala through Central America during the early 1970s. It entered the United States in several places but dissipated without any great effort at containment. Its transmission was studied in a Los Angeles neighborhood, where each infection gave rise on average to about 0.4 new infections (54). Without amplification by waterborne transmission, this outbreak, like other introductions in the United States, was self-limited (54). The situation at the other end of Central America was similar. The S. dysenteriae epidemic dissipated as it moved into Costa Rica, where water supplies were relatively pure (L. J. Mata, pers. comm.).
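The figure of about 0.4 new infections per infection explains why such introductions burn out. As a rough sketch, assume a simple branching process in which every case produces the same average number of new cases (a simplification, not the method of the cited study). With an average r below 1, the expected number of cases in each generation shrinks geometrically, and the expected total from one introduced case is 1 / (1 - r).

```python
def expected_cases_by_generation(r: float, generations: int) -> list[float]:
    """Expected cases in each generation, starting from one introduced case."""
    return [r ** g for g in range(generations)]


def expected_outbreak_size(r: float) -> float:
    """Expected total cases from one introduction when 0 <= r < 1
    (sum of the geometric series 1 + r + r**2 + ...)."""
    if not 0.0 <= r < 1.0:
        raise ValueError("the series only converges for 0 <= r < 1")
    return 1.0 / (1.0 - r)


if __name__ == "__main__":
    r = 0.4  # average new infections per infection reported for Los Angeles (54)
    print(expected_cases_by_generation(r, 5))
    print(expected_outbreak_size(r))
```

Under this assumption each introduced case leads on average to fewer than two cases in total, consistent with the self-limited outbreaks described above.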

Attendant-borne Transmission

Emerging hospital-acquired pathogens may pose one of the greatest and most controllable threats to people in countries like the United States, where more than 5% of hospital admissions and about 14% of intensive care patients acquire infections during their stay (55-57). According to some estimates, nosocomial infections rank among the ten leading causes of death in the United States (56), with dangerous bloodstream infections approximately doubling during the 1980s (58).

Although high virulence has been documented in pathogens involved in nosocomial outbreaks (59-63), the damage caused by nosocomial pathogens has generally been attributed to the state of hospitalized patients, who may be compromised by underlying disease, immunosuppressive drugs, and invasive procedures. These factors, however, do not explain why nosocomial pathogens, such as Staphylococcus aureus, often cause symptomatic infections in hospital staff (60) but rarely in persons in the outside community. They also do not explain the association between the extent of nosocomial transmission and the virulence of infection, or the differences in symptomatic infections among otherwise healthy babies (17,20,41). In a New York City hospital, for example, where attendant-borne transmission rates were very low, only approximately one of 30 babies with S. aureus was symptomatic (64). Among nosocomial outbreaks of endemic disease, the analogous proportion may be 5- to 10-fold higher (65).

Without an evolutionary framework for understanding pathogen virulence, researchers would have no reason for expecting to find particularly virulent endemic pathogens in hospitals. The only serious attempts to explain the apparently high level of pathogen virulence in hospitals involved the linking of virulence to another characteristic associated with hospitals: antibiotic resistance. The emergence of antibiotic-resistant organisms in hospitals in concert with the use of the antibiotics (66) led researchers to conclude that high levels of antibiotic use caused the emergence of resistant organisms and to speculate that antibiotic-resistant organisms might be inherently more virulent than their antibiotic-sensitive counterparts (67). Yet when infections caused by resistant nosocomial organisms are compared with sensitive (generally nosocomial) infections, the former are only sometimes found to be associated with more severe infections. Even when they are associated with more severe disease (62,63), any differences in inherent virulence tend to be confounded with other factors, such as increased severity due to lowered effectiveness of antibiotics. The increased severity of disease, however, is sometimes associated with resistance to antibiotics other than the one being used (61), suggesting that the increased damage is not simply a result of ineffective antibiotics. The presence of virulence-enhancing bacterial characteristics in damaging, resistant nosocomial strains (63,68) also suggests a link between nosocomial transmission, antibiotic resistance, and virulence: antibiotic-resistant strains may have been particularly virulent because they were nosocomial, but this virulence was not apparent in many of the comparisons because the sensitive strains were also nosocomial.

Although the controversy regarding virulence and antibiotic resistance in hospital-acquired infections can be explained by the hypothesized connection between attendant-borne transmission and the evolution of both virulence and antibiotic resistance, none of the investigations of the topic made measurements that would allow assessment of the connection between attendant-borne transmission and the emergence of variants with increased virulence. The critical measure is the harmfulness per person housing the organisms in question, and the critical comparison is between nosocomial and community-acquired strains. Among persons that harbor nosocomial strains of S. aureus, for example, the proportion that show symptomatic infection could be compared with the analogous proportion of matched persons who are harboring community strains. After virulence-enhancing mechanisms are well understood, pathogens can be assayed for their virulence directly. Thus Clostridium difficile pathogens isolated from prolonged nosocomial outbreaks are predicted to be more toxigenic than C. difficile isolated from the outside community. Similarly, nosocomial Escherichia coli are predicted to have virulence-enhancing characteristics (e.g., invasiveness, adherence) (69) more often than community strains.
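The comparison proposed above (harmfulness per person harboring the organism, nosocomial versus community strains) can be expressed as a ratio of two proportions. A minimal sketch, assuming symptomatic and total carrier counts are available for each matched group; the counts in the usage lines are hypothetical placeholders, not data from any cited study:

```python
def symptomatic_proportion(symptomatic, carriers):
    """Fraction of people harboring the organism who show symptomatic infection."""
    if carriers <= 0 or not 0 <= symptomatic <= carriers:
        raise ValueError("need 0 <= symptomatic <= carriers and carriers > 0")
    return symptomatic / carriers


def risk_ratio(nos_symptomatic, nos_carriers, com_symptomatic, com_carriers):
    """Ratio of symptomatic proportions: nosocomial carriers vs community carriers.

    A value above 1 would be consistent with greater harm per carrier
    for the nosocomial strains in this sample.
    """
    p_nosocomial = symptomatic_proportion(nos_symptomatic, nos_carriers)
    p_community = symptomatic_proportion(com_symptomatic, com_carriers)
    if p_community == 0:
        raise ValueError("community proportion is zero; ratio is undefined")
    return p_nosocomial / p_community


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    print(risk_ratio(10, 60, 2, 60))
```

A ratio alone does not establish a difference in inherent virulence; as the text notes, the groups would need to be matched so that host condition does not confound the comparison.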

Further knowledge about virulence-enhancing mechanisms and development of techniques for rapid detection (e.g., [72-75]) should offer opportunities for carefully controlled experiments to test whether reduction in attendant-borne transmission causes a greater decline in the inherent virulence of nosocomial pathogens in experimental hospitals than in control hospitals in which interventions are not imposed. Long-term follow-up should clarify the degree to which attendant-borne transmission may foster the emergence of virulent variants among both established human pathogens (e.g., S. aureus, E. coli) and new or newly recognized pathogens (e.g., Serratia spp. and Pseudomonas aeruginosa).

Harmful, often antibiotic-resistant, hospital-acquired pathogens can readily emerge beyond a hospital's boundary when patients are moved or attendants move between hospitals; the documentation is particularly strong for dangerous variants of E. coli and S. aureus (62,74-78). The degree to which emerging nosocomial pathogens spill over to generate outbreaks in the outside community is not well understood, but evidence suggests that this spillover represents a substantial threat when the organisms can infect healthy people. When large-scale communitywide epidemics of pathogenic E. coli have occurred, for example, transmission in hospitals often was strongly implicated. During 1953 and 1954, an E. coli epidemic advanced up the East Coast of the United States from the Carolinas through New England: "As it spread, explosive outbreaks were limited to institutions, hospital wards, and newborn nurseries" (59). A focal study of the U.S. Army Hospital at Fort Belvoir, Virginia, indicated that the epidemic strain was brought into the hospital by infected people in the community, with the proportion of inpatient to outpatient cases reversing dramatically during the hospital's 5-month outbreak (59). Similarly, during the winter of 1961, in an outbreak in Chicago and adjacent communities in Indiana, about 5% of the infants were affected, and nearly half of the affected infants had direct or indirect contact with one of the 29 involved hospitals just before their illnesses (75).

Studies of S. aureus have also shown that nosocomial and community outbreaks are sometimes synchronous with transmission occurring in both directions between the hospital and the outside community (79-80). The long-term consequences of emergence of nosocomial strains for the outside community, however, still need to be assessed. The possibility that nosocomial pathogens may tend to be not only more resistant to antibiotics, but also more inherently virulent lends some urgency to this need.

Almost no work has been done to determine the potential of pathogens thought to be almost exclusively associated with nosocomial infection (e.g., Enterococcus, C. difficile) to take hold in the outside community. The high durability in the external environment of many nosocomial pathogens heightens the need for additional information. Durable pathogens that can infect uncompromised hosts (e.g., antibiotic-resistant S. aureus and to a lesser extent C. difficile) possess the basic characteristics that damaging organisms need to spread in the outside community. Durable organisms unable to infect healthy people pose a relatively low threat, but this inability is often presumed. Any transmission of durable nosocomial organisms like P. aeruginosa from patients after discharge heightens the threat to the outside community by providing an avenue for further adaptation to humans. Molecular analyses that allow reconstruction of epidemiologic patterns (e.g., molecular phylogenetics) could be used to improve assessments of the degree to which nosocomial pathogens can emerge in the outside community. Such studies need to provide quantitative assessments not only of the threats posed by nosocomial pathogens in their current state, but also of their potential to breach by evolution the barriers that have inhibited their broader spread in the past.

Conceptual Innovation, Explanatory Power, and Precision

Dangerous Emergences of the Past

Each of the organisms that caused devastating epidemics over the past 5 centuries would have been identified as an extremely dangerous pathogen by the criteria proposed here. Y. pestis, for example, is durable in the external environment (49) and is vector-borne. Its threat is lower now than centuries ago when fleas and rats were abundant domiciliary inhabitants, but it still represents a threat where these hosts are present.

The periodic emergence of yellow fever in European and American cities during the 18th and 19th centuries took a heavy toll; the 1878 epidemic, for example, killed about a quarter of the population of Memphis, Tennessee (81). If yellow fever virus were first encountered today, it would be recognized as an important threat because it is vector-borne and can be transmitted indefinitely through human/mosquito cycles.

With regard to the emergence of virulent variants from established pathogens, the influenza viruses circulating at the Western Front during World War I would be considered dangerous because barriers to transmission from immobile hosts were removed by cultural practices and because influenza virus is mutation prone (17,20). It is, therefore, not surprising that the Western Front has been identified as the source of the highly lethal variants of the 1918 influenza pandemic and that a pandemic of this severity has never recurred (17). More importantly, evolutionary considerations suggest that such a lethal pandemic will not recur unless influenza viruses are again exposed to opportunities that allow transmission from immobile hosts, as they are on poultry farms where highly lethal influenza outbreaks periodically emerge (17).

Uncertainty about the Dangerous Epidemics of the Future

These arguments about the evolution of virulence provide only coarse approximations of the selective processes in pathogen populations. To determine whether the implications of these arguments need to be substantially modified, we need empirical studies that evaluate these arguments against alternative explanations. Considering the current state of uncertainty, some might argue that it is dangerous to incorporate the current coarse understanding of the evolution of virulence into policy making. But failing to incorporate this understanding is dangerous.

If we do not adjust investments to take into account the evolutionary arguments, and the arguments prove correct, the reduction in death and illness per unit investment will be lower than it could have been. If we do adjust investments on the basis of these evolutionary arguments, and the arguments prove wrong, the nonevolutionary benefits of the investments would still be obtained.

Although the precise mechanisms that increase virulence in pathogens in the high-risk categories still need to be clarified, the associations (Table 1) are strong. One could argue, for example, that durable or waterborne pathogens are more harmful because hosts tend to pick up a greater diversity of genotypes from the environment when pathogens are more durable or are mixed in water. If the within-host genetic variability of such pathogens is greater, they would have more potential for within-host competition, which could favor the evolution of increased virulence. By this argument, factors such as durability, vector-borne transmission, and waterborne transmission would increase virulence indirectly by increasing within-host genetic variation. With regard to the prevention of the emergence of highly virulent disease, uncertainties about mechanisms are not critical. Whether the effects of these factors are direct or indirect, elimination of the factors should discourage the emergence of severe disease and favor the decrease of highly virulent pathogens.

Decisions to invest in interventions without certainty about mechanisms are not new to the health sciences. The hygienic interventions to control hospital-acquired diseases and the purification of water supplies to control cholera were appropriately advocated on the basis of epidemiologic data (from Ignaz Semmelweis and John Snow) a half century before the causative agents of these or any other infectious diseases were first identified. Jenner's smallpox vaccine program was accepted globally more than a century before viruses were discovered or the mechanisms by which vaccines provide protection were understood. Even now the mechanisms by which the immune system provides protection encompass major areas of uncertainty. This uncertainty is evidenced, for example, by the controversies about the importance of the different legs of the immune system (such as cytotoxic T cells, neutralizing antibody, and subsets of helper T cells) in HIV pathogenesis.

If the evolutionary arguments are correct, the emergence of the most harmful diseases can be countered not only for pathogens that are recognized as threats but also for those posing threats that are not yet recognized. Providing pure water supplies, reducing attendant-borne transmission, and reducing vector-borne transmission preferentially from ill people (e.g., by providing mosquito-proof houses [17]) should guard against the emergence of virulent pathogens, whether the pathogens are unidentified or are highly virulent variants of identified human pathogens. An understanding of the evolutionary determinants of virulence may thus make surveillance and prompt intervention much more manageable.

The emphasis thus is on suppression of the emergence of particularly virulent variants rather than suppression of the emergence of new disease organisms. The expectation is that the frequency of disease will drop even though the frequency of individuals harboring organisms may decline little if at all. The data on decentralization of nursery/maternity wards, for example, indicate that the rates of nosocomial infection decline among mothers and babies, even though the rates at which babies harbor pathogens (colonization plus infection) do not decline (82). Indeed the disagreement about the value of rooming-in as a mode of infection control (82) can be attributed to a failure to distinguish the prevalence of disease organisms from the prevalence of disease. Controversies about the value of waterborne transmission can be traced to a similar failure (17).

The lead article of the first issue of this journal was entitled, "Emerging infections: getting ahead of the curve" (4). I propose that integrating evolutionary principles with epidemiology would enhance our ability to stay ahead of the curve. Evolutionary insights should increase our ability to distinguish emerging pathogens according to the long-term threat that they pose and thereby adjust investments in accordance with the threat. Knowledge of the evolution of virulence should also guide us to identify for each pathogen the critical data that will allow us to make this assessment. Finally, evolutionary considerations should allow identification of infrastructural investments that will guard against the most dangerous pathogens, even if they are not blocked by surveillance and containment efforts and even if they have not yet been identified or are never identified as emerging pathogens.

Dr. Ewald is a professor in the Department of Biology at Amherst College. Trained in ecology and evolutionary biology, he works at the interface of these areas with epidemiology, focusing on the evolution of virulence among infectious diseases of humans and insects.

How the Flu Virus Can Change: “Drift” and “Shift”

Influenza viruses are constantly changing. They can change in two different ways.

Antigenic Drift

One way influenza viruses change is called “antigenic drift.” These are small changes (or mutations) in the genes of influenza viruses that can lead to changes in the surface proteins of the virus: HA (hemagglutinin) and NA (neuraminidase). The HA and NA surface proteins of influenza viruses are “antigens,” which means they are recognized by the immune system and are capable of triggering an immune response, including production of antibodies that can block infection. The changes associated with antigenic drift happen continually over time as the virus replicates. Most flu shots are designed to target an influenza virus’s HA surface proteins/antigens. The nasal spray flu vaccine (LAIV) targets both the HA and NA of an influenza virus.

The small changes that occur from antigenic drift usually produce viruses that are closely related to one another, which can be illustrated by their location close together on a phylogenetic tree. Influenza viruses that are closely related to each other usually have similar antigenic properties. This means that antibodies your immune system creates against one influenza virus will likely recognize and respond to antigenically similar influenza viruses (this is called “cross-protection”).

However, the small changes associated with antigenic drift can accumulate over time and result in viruses that are antigenically different (further away on the phylogenetic tree). It is also possible for a single (or small) change in a particularly important location on the HA to result in antigenic drift. When antigenic drift occurs, the body’s immune system may not recognize and prevent sickness caused by the newer influenza viruses. As a result, a person becomes susceptible to flu infection again, as antigenic drift has changed the virus enough that a person’s existing antibodies won’t recognize and neutralize the newer influenza viruses.

Antigenic drift is the main reason why people can get the flu more than one time, and it’s also a primary reason why the flu vaccine composition must be reviewed and updated each year (as needed) to keep up with evolving influenza viruses.

Antigenic Shift

The other type of change is called “antigenic shift.” Antigenic shift is an abrupt, major change in an influenza A virus, resulting in new HA and/or new HA and NA proteins in influenza viruses that infect humans. Shift can result in a new influenza A subtype in humans. One way shift can happen is when an influenza virus from an animal population gains the ability to infect humans. Such animal-origin viruses can contain an HA or HA/NA combination that is so different from the same subtype in humans that most people do not have immunity to the new (i.e., novel) virus. Such a “shift” occurred in the spring of 2009, when an H1N1 virus with genes from North American swine, Eurasian swine, humans and birds emerged to infect people and quickly spread, causing a pandemic. When shift happens, most people have little or no immunity against the new virus.

While influenza viruses change all the time due to antigenic drift, antigenic shift happens less frequently. Influenza pandemics occur very rarely; there have been four pandemics in the past 100 years. For more information, see pandemic flu. Type A viruses undergo both antigenic drift and shift and are the only influenza viruses known to cause pandemics, while influenza type B viruses change only by the more gradual process of antigenic drift.


On July 2, researchers from Los Alamos Laboratory released a new study in Cell—a highly influential journal in the scientific community—that examines whether a particular mutation of the coronavirus increases the virus’ transmission rate. Of primary concern to the study’s authors is the G614 mutation on the spike protein of the coronavirus, the protein responsible for invading host cells. The authors contend that this mutation began circulating throughout Europe in early February and began displacing the D614 form of the virus that originated in Wuhan, China. According to the study, this G614 variant possesses a higher transmission rate, results in a higher viral load, and consistently becomes the dominant form of the virus wherever it spreads.

Understanding how a virus is mutating is important for several reasons. Do mutations make the virus more dangerous, as described above? If there are different strains, will they respond to treatments differently, or target different segments of the population? And what does it all mean for having an effective vaccine?

While some researchers immediately embraced this study as a clear indicator that this particular mutation is increasing the virus’ transmissibility rate, others are less convinced. Dr. Raul Andino-Pavlovsky, a professor of microbiology and immunology at the University of California at San Francisco, called the spike protein mutation findings “intriguing,” but told The Dispatch “it may be a little too early to say that [these mutations] are being selected.” Dr. William Schaffner, an infectious disease specialist at the Vanderbilt University Medical Center, told us “it is too early to draw those conclusions.”

Other scientists suggest that gene sequencing studies focus too much on singular mutations when there’s so much about the virus we still don’t know. “People who are writing papers about sequence variation want to highlight the variation that they find because that’s how they’re going to publish the paper,” said Dr. Colin Parrish, a professor of virology at Cornell University. Excessive gene sequencing for the coronavirus has already become the norm, meaning that moving forward, researchers will continue talking about mutations “as if they're more important than they may well be.”

According to Parrish, “There are a hundred people doing sequencing for every one person that’s doing biology or actually going back and testing mutations for their real effect on the virus’ replication.” But even if particular mutations are unlikely to change viral behavior, tracing a virus’ lineage—or phylogeographic variation—can be helpful from a public health point of view. “There are sufficient mutations so that virologists can track the lineage of the virus,” Schaffner said, meaning “they can distinguish the viruses that probably came to the U.S. from Europe from those that came to us from Asia.”

What makes this mutation worth highlighting? According to Bette Korber, the study’s leading author, the coronavirus is shifting toward the “G clade” form (which refers to the particular descendant of SARS-CoV-2 that contains the G614 mutation on the virus’ spike protein). This mutation, she claims, is not due to random chance, but rather a fitness advantage.

She explains that random things, like superspreader events or human hosts moving into new regions, “are by definition just that, random.” “So, if the two forms were equally likely to propagate,” she said, “you wouldn’t expect such a shift to almost always go in one direction, towards higher frequency G clade.” The repetition of this particular pattern in nearly all of the locations they studied, with few exceptions, is what Korber and her colleagues “found to be compelling evidence of positive selection,” meaning a mutation that improves the overall fitness of the virus.
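The logic of that argument resembles a sign test: if each region were equally likely to shift toward either form, the chance that nearly all regions shift the same way shrinks quickly as regions are added. A minimal sketch, assuming each region behaves as an independent coin flip (a simplification that real outbreaks, linked by travel, do not satisfy); the region counts are hypothetical and are not the figures from the Cell study:

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more 'shifts toward G' out of n regions,
    if each region independently shifts toward G with probability p."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical: 9 of 10 regions shift toward the G form.
    print(prob_at_least(9, 10))  # 11/1024, roughly 0.011
```

A small probability under this null model is what makes a one-directional pattern look non-random, although it cannot by itself separate a fitness advantage from other explanations such as repeated introductions from a region where the G form already dominated.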

Before jumping to conclusions, it’s worth mentioning that viral mutations are a normal part of the evolutionary process. Every time a virus infects a new host, it takes over the host’s replication machinery to reproduce billions of genetic copies of itself so that the virus can spread to other cells. When viruses reproduce, they will inevitably make mistakes—mutations—that become incorporated into the viral genome. Whereas DNA viruses—such as smallpox and HPV—have a low mutation rate, RNA viruses regularly mutate. According to Parrish, most of these RNA mutations “decrease the fitness of the virus and so they generally get purged from the virus over time.”

But some mutations accumulate over time, creating new lineages within the virus’ evolutionary tree. In a process called genetic drift, the frequency of an existing gene variant within a particular population changes over time by chance alone, and some variants become fixed this way. Of course, not all of these fixed mutations from genetic drift confer evolutionary benefits to the virus.
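Genetic drift can be illustrated with a toy simulation in which a variant confers no advantage at all, yet its frequency still wanders and can reach 0 or 1 purely by chance. A minimal sketch, assuming a Wright-Fisher style model with a fixed population size and non-overlapping generations; the parameter values are illustrative only and are not estimates for any real virus:

```python
import random


def wright_fisher(pop_size, initial_freq, generations, seed=None):
    """Simulate the frequency of a neutral variant under drift alone.

    Each generation, every one of pop_size offspring independently
    inherits the variant with probability equal to its current frequency.
    Returns the list of frequencies, starting with initial_freq.
    """
    rng = random.Random(seed)
    freq = initial_freq
    history = [freq]
    for _ in range(generations):
        carriers = sum(1 for _i in range(pop_size) if rng.random() < freq)
        freq = carriers / pop_size
        history.append(freq)
    return history


if __name__ == "__main__":
    # Illustrative values: small populations drift faster than large ones.
    for size in (20, 2000):
        final = wright_fisher(size, 0.5, 100, seed=1)[-1]
        print(f"population {size}: final frequency {final:.2f}")
```

Because no selection is built into the model, any variant that takes over in a run does so by chance, which is why a frequency change in a single location is weak evidence of a fitness advantage.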

If, however, mutations do provide an advantage to the virus, then they will be selected for, meaning they will be more likely to replicate and transmit over time. These adaptive mutations—which are essentially happy accidents from a virus’ point of view—often take the form of improving transmissibility and resistance to antiviral drugs like remdesivir. Selective mutations can also change a virus’ antigenic properties, allowing it to cause disease in formerly resistant hosts, or its “pathogenicity,” meaning its ability to infect and harm its host in the first place.

But how do we know when a new mutation constitutes a new viral strain? “In our original preprint we used the word ‘strain’ to refer to viruses that carried the G614D,” Korber told The Dispatch. She said this term was rejected by many other scientists who said the word “strain” should be used more judiciously. For example: “Only if the genetic variant is associated with unique phenotypic characteristics that are different from the compared reference virus.”

“If you call it a new strain then there’s zillions of new strains out there because almost every virus will have some mutation that may or may not be of any importance,” said Dr. Diane Griffin, a professor in the department of molecular microbiology and immunology at the Johns Hopkins Bloomberg School of Public Health. Typically, a strain refers to a mutation that is similar enough to be part of the same species of virus but deserves some degree of differentiation because it behaves differently. One way to determine whether a mutation constitutes a strain is to see if it confers functional differences in terms of its virulence, resistance, or transmissibility. Scientists can also look at phenotypic variation between mutations, including its reproductive capabilities, increased titers, or survival rate.

“We don’t have any evidence for greater or lesser virulence,” said Dr. Paul Offit—director of the Vaccine Education Center at Children’s Hospital of Philadelphia—regarding G614 mutation, “but I think there is reasonable evidence for increased transmissibility.” Still, this functional difference does not necessarily constitute a new strain. “It’s tricky because essentially you're always making strains,” he added.

Looking beyond the specific mutation the study’s authors cite, it’s important to consider how often a virus mutates. Whereas some viruses—like polio, HIV and influenza—are constantly mutating, other viruses remain stable over long stretches of time. Influenza is a single stranded RNA virus that serves as a perfect example of what scientists call a very “plastic” virus. The flu mutates so much from one year to the next that natural infection or immunization from the previous year does not typically protect individuals from the functional mutation of the new virus, hence the need for a new vaccine each year. Sometimes there is a carry-over effect, although this is quite rare.

But other viruses, like measles, hardly mutate at all. “The essential measles virus is the same virus that was around in 1934, just to pick a number out of the hat,” said Schaffner. “It’s pretty darn stable, and that’s why we have one measles vaccine. It works around the world, it’s worked for 50 years, and it’s going to keep working because this is a very stable virus.”

SARS-CoV-2 is somewhere in the middle. According to Diane Griffin, coronaviruses “have some editing function, but they still have an error prone polymerase.” This means that unlike most other RNA viruses, coronaviruses have some capacity to identify errors while copying nucleic material, thus reducing the mutation rate. Because of its low mutation rate, SARS-CoV-2 has remained generally stable, which is a good sign for vaccine research.

“This is very important,” Schaffner said, “particularly as regards the now notorious spike protein on its surface, because that means all this vaccine work that’s going on around the world is more likely to result in successful vaccines, because the virus, unlike flu, is not mutating in its essence.” Whereas many vaccines undergoing trial tests are targeting the coronavirus’ spike protein in particular, scientists are also developing RNA vaccines, DNA vaccines, viral vector vaccines, and recombinant vaccines, among others, which target other parts of the virus.

It is possible that after a long period of time—say, 10 to 20 years—the virus may vary sufficiently such that it can start to evade the vaccines that we make now. But there is some good news about the mutation that is the subject of the Cell study. As much as it might be increasing transmission rate, “the G614 form is actually more sensitive to neutralizing antibodies,” said Korber. This means that even if the G614 mutation continues to spread, a vaccine targeting it will likely be extremely effective.

Food, genetically modified

These questions and answers have been prepared by WHO in response to questions and concerns from WHO Member State Governments with regard to the nature and safety of genetically modified food.

Genetically modified organisms (GMOs) can be defined as organisms (i.e. plants, animals or microorganisms) in which the genetic material (DNA) has been altered in a way that does not occur naturally by mating and/or natural recombination. The technology is often called “modern biotechnology” or “gene technology”, sometimes also “recombinant DNA technology” or “genetic engineering”. It allows selected individual genes to be transferred from one organism into another, also between nonrelated species. Foods produced from or using GM organisms are often referred to as GM foods.

GM foods are developed, and marketed, because there is some perceived advantage either to the producer or consumer of these foods. This is meant to translate into a product with a lower price, greater benefit (in terms of durability or nutritional value) or both. Initially GM seed developers wanted their products to be accepted by producers and have concentrated on innovations that bring direct benefit to farmers (and the food industry generally).

One of the objectives for developing plants based on GM organisms is to improve crop protection. The GM crops currently on the market are mainly aimed at an increased level of crop protection through the introduction of resistance against plant diseases caused by insects or viruses or through increased tolerance towards herbicides.

Resistance against insects is achieved by incorporating into the food plant the gene for toxin production from the bacterium Bacillus thuringiensis (Bt). This toxin is currently used as a conventional insecticide in agriculture and is safe for human consumption. GM crops that inherently produce this toxin have been shown to require lower quantities of insecticides in specific situations, e.g. where pest pressure is high. Virus resistance is achieved through the introduction of a gene from certain viruses which cause disease in plants. Virus resistance makes plants less susceptible to diseases caused by such viruses, resulting in higher crop yields.

Herbicide tolerance is achieved through the introduction of a gene from a bacterium conveying resistance to some herbicides. In situations where weed pressure is high, the use of such crops has resulted in a reduction in the quantity of the herbicides used.

Generally consumers consider that conventional foods (that have an established record of safe consumption over the history) are safe. Whenever novel varieties of organisms for food use are developed using the traditional breeding methods that had existed before the introduction of gene technology, some of the characteristics of organisms may be altered, either in a positive or a negative way. National food authorities may be called upon to examine the safety of such conventional foods obtained from novel varieties of organisms, but this is not always the case.

In contrast, most national authorities consider that specific assessments are necessary for GM foods. Specific systems have been set up for the rigorous evaluation of GM organisms and GM foods relative to both human health and the environment. Similar evaluations are generally not performed for conventional foods. Hence there currently exists a significant difference in the evaluation process prior to marketing for these two groups of food.

The WHO Department of Food Safety and Zoonoses aims at assisting national authorities in the identification of foods that should be subject to risk assessment and to recommend appropriate approaches to safety assessment. Should national authorities decide to conduct safety assessment of GM organisms, WHO recommends the use of Codex Alimentarius guidelines (See the answer to Question 11 below).

The safety assessment of GM foods generally focuses on: (a) direct health effects (toxicity); (b) potential to provoke allergic reaction (allergenicity); (c) specific components thought to have nutritional or toxic properties; (d) the stability of the inserted gene; (e) nutritional effects associated with genetic modification; and (f) any unintended effects which could result from the gene insertion.

While theoretical discussions have covered a broad range of aspects, the three main issues debated are the potential to provoke allergic reaction (allergenicity), gene transfer and outcrossing.


Allergenicity

As a matter of principle, the transfer of genes from commonly allergenic organisms to non-allergenic organisms is discouraged unless it can be demonstrated that the protein product of the transferred gene is not allergenic. While foods developed using traditional breeding methods are not generally tested for allergenicity, protocols for the testing of GM foods have been evaluated by the Food and Agriculture Organization of the United Nations (FAO) and WHO. No allergic effects have been found relative to GM foods currently on the market.

Gene transfer

Gene transfer from GM foods to cells of the body or to bacteria in the gastrointestinal tract would cause concern if the transferred genetic material adversely affects human health. This would be particularly relevant if antibiotic resistance genes, used as markers when creating GMOs, were to be transferred. Although the probability of transfer is low, the use of gene transfer technology that does not involve antibiotic resistance genes is encouraged.


Outcrossing

The migration of genes from GM plants into conventional crops or related species in the wild (referred to as "outcrossing"), as well as the mixing of crops derived from conventional seeds with GM crops, may have an indirect effect on food safety and food security. Cases have been reported where GM crops approved for animal feed or industrial use were detected at low levels in the products intended for human consumption. Several countries have adopted strategies to reduce mixing, including a clear separation of the fields within which GM crops and conventional crops are grown.

Environmental risk assessments cover both the GMO concerned and the potential receiving environment. The assessment process includes evaluation of the characteristics of the GMO and its effect and stability in the environment, combined with ecological characteristics of the environment in which the introduction will take place. The assessment also includes unintended effects which could result from the insertion of the new gene.

Issues of concern include: the capability of the GMO to escape and potentially introduce the engineered genes into wild populations; the persistence of the gene after the GMO has been harvested; the susceptibility of non-target organisms (e.g. insects which are not pests) to the gene product; the stability of the gene; the reduction in the spectrum of other plants, including loss of biodiversity; and increased use of chemicals in agriculture. The environmental safety aspects of GM crops vary considerably according to local conditions.

Different GM organisms include different genes inserted in different ways. This means that individual GM foods and their safety should be assessed on a case-by-case basis and that it is not possible to make general statements on the safety of all GM foods.

GM foods currently available on the international market have passed safety assessments and are not likely to present risks for human health. In addition, no effects on human health have been shown as a result of the consumption of such foods by the general population in the countries where they have been approved. Continuous application of safety assessments based on the Codex Alimentarius principles and, where appropriate, adequate post market monitoring, should form the basis for ensuring the safety of GM foods.

The way governments have regulated GM foods varies. In some countries GM foods are not yet regulated. Countries which have legislation in place focus primarily on assessment of risks for consumer health. Countries which have regulatory provisions for GM foods usually also regulate GMOs in general, taking into account health and environmental risks, as well as control- and trade-related issues (such as potential testing and labelling regimes). In view of the dynamics of the debate on GM foods, legislation is likely to continue to evolve.

GM crops available on the international market today have been designed using one of three basic traits: resistance to insect damage, resistance to viral infections, and tolerance towards certain herbicides. GM crops with higher nutrient content (e.g. soybeans with increased oleic acid) have also been studied recently.

The Codex Alimentarius Commission (Codex) is the joint FAO/WHO intergovernmental body responsible for developing the standards, codes of practice, guidelines and recommendations that constitute the Codex Alimentarius, meaning the international food code. Codex developed principles for the human health risk analysis of GM foods in 2003.

The premise of these principles is a premarket assessment, performed on a case-by-case basis and including an evaluation of both direct effects (from the inserted gene) and unintended effects (that may arise as a consequence of insertion of the new gene). Codex also developed three Guidelines.

Codex principles do not have a binding effect on national legislation, but are referred to specifically in the Agreement on the Application of Sanitary and Phytosanitary Measures of the World Trade Organization (SPS Agreement), and WTO Members are encouraged to harmonize national standards with Codex standards. If trading partners have the same or similar mechanisms for the safety assessment of GM foods, the possibility that one product is approved in one country but rejected in another becomes smaller.

The Cartagena Protocol on Biosafety, an environmental treaty legally binding for its Parties which took effect in 2003, regulates transboundary movements of Living Modified Organisms (LMOs). GM foods are within the scope of the Protocol only if they contain LMOs that are capable of transferring or replicating genetic material. The cornerstone of the Protocol is a requirement that exporters seek consent from importers before the first shipment of LMOs intended for release into the environment.

The GM products that are currently on the international market have all passed safety assessments conducted by national authorities. These different assessments in general follow the same basic principles, including an assessment of environmental and human health risk. The food safety assessment is usually based on Codex documents.

Since the first introduction on the market in the mid-1990s of a major GM food (herbicide-resistant soybeans), there has been concern about such food among politicians, activists and consumers, especially in Europe. Several factors are involved. In the late 1980s to early 1990s, the results of decades of molecular research reached the public domain. Until that time, consumers were generally not very aware of the potential of this research. In the case of food, consumers started to wonder about safety because they perceived that modern biotechnology was leading to the creation of new species.

Consumers frequently ask, "what is in it for me?" Where medicines are concerned, many consumers more readily accept biotechnology as beneficial for their health (e.g. vaccines, medicines with improved treatment potential or increased safety). In the case of the first GM foods introduced onto the European market, the products were of no apparent direct benefit to consumers (not significantly cheaper, no increased shelf-life, no better taste). The potential for GM seeds to result in bigger yields per cultivated area should lead to lower prices. However, public attention has focused on the risk side of the risk-benefit equation, often without distinguishing between potential environmental impacts and public health effects of GMOs.

Consumer confidence in the safety of food supplies in Europe has decreased significantly as a result of a number of food scares that took place in the second half of the 1990s and that are unrelated to GM foods. This has also had an impact on discussions about the acceptability of GM foods. Consumers have questioned the validity of risk assessments, both with regard to consumer health and environmental risks, focusing in particular on long-term effects. Other topics debated by consumer organizations have included allergenicity and antimicrobial resistance. Consumer concerns have triggered a discussion on the desirability of labelling GM foods, allowing consumers to make an informed choice.

The release of GMOs into the environment and the marketing of GM foods have resulted in a public debate in many parts of the world. This debate is likely to continue, probably in the broader context of other uses of biotechnology (e.g. in human medicine) and their consequences for human societies. Even though the issues under debate are usually very similar (costs and benefits, safety issues), the outcome of the debate differs from country to country. On issues such as labelling and traceability of GM foods as a way to address consumer preferences, there is no worldwide consensus to date. Despite the lack of consensus on these topics, the Codex Alimentarius Commission has made significant progress and developed Codex texts relevant to labelling of foods derived from modern biotechnology in 2011 to ensure consistency on any approach on labelling implemented by Codex members with already adopted Codex provisions.

Depending on the region of the world, people often have different attitudes to food. In addition to nutritional value, food often has societal and historical connotations, and in some instances may have religious importance. Technological modification of food and food production may evoke a negative response among consumers, especially in the absence of sound risk communication on risk assessment efforts and cost/benefit evaluations.

Yes, intellectual property rights are likely to be an element in the debate on GM foods, with an impact on the rights of farmers. In the FAO/WHO expert consultation in 2003, WHO and FAO considered potential problems of the technological divide and the unbalanced distribution of benefits and risks between developed and developing countries. The problem often becomes even more acute through the existence of intellectual property rights and patenting, which give an advantage to the strongholds of scientific and technological expertise. Such considerations are likely to also affect the debate on GM foods.

Certain groups are concerned about what they consider to be an undesirable level of control of seed markets by a few chemical companies. Sustainable agriculture and biodiversity benefit most from the use of a rich variety of crops, both in terms of good crop protection practices as well as from the perspective of society at large and the values attached to food. These groups fear that as a result of the interest of the chemical industry in seed markets, the range of varieties used by farmers may be reduced mainly to GM crops. This would impact on the food basket of a society as well as in the long run on crop protection (for example, with the development of resistance against insect pests and tolerance of certain herbicides). The exclusive use of herbicide-tolerant GM crops would also make the farmer dependent on these chemicals. These groups fear a dominant position of the chemical industry in agricultural development, a trend which they do not consider to be sustainable.

Future GM organisms are likely to include plants with improved resistance against plant disease or drought, crops with increased nutrient levels, and fish species with enhanced growth characteristics. For non-food use, they may include plants or animals producing pharmaceutically important proteins such as new vaccines.

WHO has been taking an active role in relation to GM foods, primarily for two reasons:

on the grounds that public health could benefit from the potential of biotechnology, for example, from an increase in the nutrient content of foods, decreased allergenicity and more efficient and/or sustainable food production; and

based on the need to examine the potential negative effects on human health of the consumption of food produced through genetic modification in order to protect public health. Modern technologies should be thoroughly evaluated if they are to constitute a true improvement in the way food is produced.

WHO, together with FAO, has convened several expert consultations on the evaluation of GM foods and provided technical advice for the Codex Alimentarius Commission which was fed into the Codex Guidelines on safety assessment of GM foods. WHO will keep paying due attention to the safety of GM foods from the view of public health protection, in close collaboration with FAO and other international bodies.