What if a Point Mutation is seen in only half the coverage for its location?


I've been looking at some sequenced exomes and found an interesting point mutation that causes a Proline-to-Leucine amino acid change in the protein. This seems like it could have a big impact on the protein's functionality but before I go any further I want to explore whether or not the variant is a sequencing artifact.

I looked at the coverage for this particular region of the genome and found that in some samples, the point mutation is seen in every single read covering the base in question, while in others the point mutation is seen in approximately half of the reads. In all my samples, the base in question is covered by at least 15 separate reads, but usually it's more than 20.

My primary question is: how should I interpret the cases where the point mutation is seen in some but not all of the reads covering its location?

I'm also interested in any suggestions/advice on the more general topic of determining whether or not the mutation I've found is a sequencing artifact.

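I don't know whether the organism you are working with is diploid, but I suspect it's an animal (or even a mammal), so the most parsimonious explanation would be that you have both homozygotes and heterozygotes at this SNP position: homozygous samples show the variant in every read, while heterozygous samples show it in roughly half of the reads.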
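To make that interpretation concrete, here is a minimal sketch that compares how well three diploid genotypes explain a given count of variant reads, using a simple binomial model. The function names and the 1% per-base error rate are assumptions chosen for illustration; a real variant caller also uses base and mapping qualities, which this sketch ignores.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def genotype_likelihoods(alt_reads, depth, error_rate=0.01):
    """Likelihood of the observed variant read count under each genotype.

    error_rate is a hypothetical per-base sequencing error rate.
    """
    expected_alt_fraction = {
        "hom_ref": error_rate,        # variant reads arise only from errors
        "het": 0.5,                   # one of two chromosome copies carries it
        "hom_alt": 1.0 - error_rate,  # both copies carry it
    }
    return {
        genotype: binom_pmf(alt_reads, depth, p)
        for genotype, p in expected_alt_fraction.items()
    }


def call_genotype(alt_reads, depth, error_rate=0.01):
    """Return the genotype with the highest likelihood."""
    likelihoods = genotype_likelihoods(alt_reads, depth, error_rate)
    return max(likelihoods, key=likelihoods.get)


if __name__ == "__main__":
    for alt, depth in [(20, 20), (10, 20), (0, 20)]:
        print(alt, depth, call_genotype(alt, depth))
```

Under this model, a site with 10 variant reads out of 20 is far better explained by a heterozygote than by random sequencing error, which would have to hit the same base with the same substitution in half of the reads.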

Also, I don't know what kind of genetic material you were given, but if the samples vary in origin (e.g., some from saliva, some from cheek skin), there could be tissue-based differences in the genome.

Point accepted mutation

A point accepted mutation — also known as a PAM — is the replacement of a single amino acid in the primary structure of a protein with another single amino acid, which is accepted by the processes of natural selection. This definition does not include all point mutations in the DNA of an organism. In particular, silent mutations are not point accepted mutations, nor are mutations which are lethal or which are rejected by natural selection in other ways.

A PAM matrix is a matrix where each column and row represents one of the twenty standard amino acids. In bioinformatics, PAM matrices are regularly used as substitution matrices to score sequence alignments for proteins. Each entry in a PAM matrix indicates the likelihood of the amino acid of that row being replaced with the amino acid of that column through a series of one or more point accepted mutations during a specified evolutionary interval, rather than these two amino acids being aligned due to chance. Different PAM matrices correspond to different lengths of time in the evolution of the protein sequence.
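As a rough sketch of how matrices for different evolutionary intervals relate, the example below raises a one-step mutation probability matrix to the n-th power and converts the result to log-odds scores. The three-letter alphabet, the probabilities, and the uniform background frequencies are made up for illustration; real PAM matrices cover the twenty amino acids and are estimated from aligned protein families.

```python
from math import log10

# Hypothetical 3-letter alphabet and one-step mutation probabilities.
# MUTATION_1[i][j] is the probability that the residue of row i is
# replaced by the residue of column j in one evolutionary step.
ALPHABET = ["A", "B", "C"]
MUTATION_1 = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
]
BACKGROUND = [1 / 3, 1 / 3, 1 / 3]  # assumed background frequencies


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mat_pow(matrix, n):
    """Multiply the matrix by itself n times (n >= 0)."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


def log_odds(matrix, background):
    """Score each pair: replacement probability versus chance alignment."""
    return [
        [10 * log10(matrix[i][j] / background[j]) for j in range(len(matrix))]
        for i in range(len(matrix))
    ]


if __name__ == "__main__":
    scores = log_odds(mat_pow(MUTATION_1, 50), BACKGROUND)
    for letter, row in zip(ALPHABET, scores):
        print(letter, [round(s, 2) for s in row])
```

Positive scores mark pairs that are aligned more often than chance would predict, and negative scores mark pairs aligned less often. Raising the matrix to a higher power models a longer evolutionary interval, and the scores move toward zero as the sequences diverge.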


Since all cells in our body contain DNA, there are lots of places for mutations to occur; however, some mutations cannot be passed on to offspring and do not matter for evolution. Somatic mutations occur in non-reproductive cells and won't be passed on to offspring. For example, the golden color on half of this Red Delicious apple was caused by a somatic mutation. Its seeds will not carry the mutation.

The only mutations that matter to large-scale evolution are those that can be passed on to offspring. These occur in reproductive cells like eggs and sperm and are called germ line mutations.

Effects of germ line mutations
A single germ line mutation can have a range of effects:

    No change occurs in phenotype.
    Some mutations don't have any noticeable effect on the phenotype of an organism. This can happen in many situations: perhaps the mutation occurs in a stretch of DNA with no function, or perhaps the mutation occurs in a protein-coding region, but ends up not affecting the amino acid sequence of the protein.

Little mutations with big effects: Mutations to control genes
Mutations are often the victims of bad press — unfairly stereotyped as unimportant or as a cause of genetic disease. While many mutations do indeed have small or negative effects, another sort of mutation gets less airtime. Mutations to control genes can have major (and sometimes positive) effects.

Some regions of DNA control other genes, determining when and where other genes are turned "on". Mutations in these parts of the genome can substantially change the way the organism is built. The difference between a mutation to a control gene and a mutation to a less powerful gene is a bit like the difference between whispering an instruction to the trumpet player in an orchestra versus whispering it to the orchestra's conductor. The impact of changing the conductor's behavior is much bigger and more coordinated than changing the behavior of an individual orchestra member. Similarly, a mutation in a gene "conductor" can cause a cascade of effects in the behavior of genes under its control.

Many organisms have powerful control genes that determine how the body is laid out. For example, Hox genes are found in many animals (including flies and humans) and designate where the head goes and which regions of the body grow appendages. Such master control genes help direct the building of body "units," such as segments, limbs, and eyes. So evolving a major change in basic body layout may not be so unlikely; it may simply require a change in a Hox gene and the favor of natural selection.

Cancer Genetics and Biology

Theresa V. Strong PhD, in Pediatric Cancer Genetics, 2018


Advances in the molecular analysis of pediatric tumors are providing unprecedented insight into the genetic and cellular changes that drive tumor development. The genomic changes that underlie tumor development include direct alterations to DNA sequence (mutations), chromosomal changes (deletions, insertions, translocations, and aneuploidy), and epigenetic changes. Through these alterations, tumor cells acquire the capacity to thwart normal proliferative controls. Six key hallmarks of cancer cells have been described by Hanahan and Weinberg and provide a useful framework for understanding cancer biology. Cancer cells must be able to sustain abnormal proliferation and circumvent the function of growth-suppressing genes, resist normal cell death programs, achieve replicative immortality, induce angiogenesis, and acquire the ability to metastasize. In addition to these intrinsic changes, the tumor microenvironment plays a critical role in supporting the growth and progression of cancer. Finally, a complex interaction of tumor cells with the cells of the immune system has profound impact on how cancer progresses. A deeper understanding of each of these aspects of cancer biology offers new opportunities and targets for therapeutic development.


High coverage sequencing increases pedigree-based mutation rate

From the Karmin et al. 2015 and Maretty et al. 2017 datasets, we extracted the per-generation pedigree-based mutation rates using standard filtering criteria. For the Karmin et al. study, several filter options were supplied, and we chose filter “c” since it corresponded to previously published criteria. Consistent with what is already known from comparisons of low coverage and high coverage sequencing runs (Poznik et al. 2016), the two high coverage datasets revealed a per-generation mutation rate that was, on average, 10 to 17 times faster (table 1) than the previously published (i.e., Xue et al. 2009; Helgason et al. 2015) low coverage studies. This suggested that the real per-generation Y chromosome single nucleotide mutation rate was much higher than previously determined. In fact, it suggested that, in the future, sequence runs at even higher coverage values might further increase this value.

Table 1. Y chromosome mutation rate by study.

| | Xue et al. 2009 | Helgason et al. 2015 | Karmin et al. 2015 | Maretty et al. 2017 |
| --- | --- | --- | --- | --- |
| Sequence coverage (given in units of fold-coverage) | 15.5 | 12.4 | 35.5 | 40 |
| Average mutations per base pair per generation | 3.0E-08 | 3.00E-08 | 3.02E-07 | 5.0E-07 |
| 95% confidence interval | 8.9E-9 to 7.0E-8 | 2.85E-8 to 3.16E-8 | (not given) | 3.5E-07 to 6.4E-07 |
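The "10 to 17 times" comparison can be reproduced from the average rates in Table 1. The short sketch below divides each high coverage rate by the mean of the two low coverage rates; the variable names are illustrative only.

```python
# Average mutations per base pair per generation, copied from Table 1.
low_coverage_rates = {
    "Xue et al. 2009": 3.0e-8,
    "Helgason et al. 2015": 3.00e-8,
}
high_coverage_rates = {
    "Karmin et al. 2015": 3.02e-7,
    "Maretty et al. 2017": 5.0e-7,
}

# Baseline: mean of the two low coverage estimates.
low_mean = sum(low_coverage_rates.values()) / len(low_coverage_rates)

# Fold difference of each high coverage estimate relative to the baseline.
fold_difference = {
    study: rate / low_mean for study, rate in high_coverage_rates.items()
}

if __name__ == "__main__":
    for study, fold in fold_difference.items():
        print(f"{study}: {fold:.1f}-fold")
```

This gives roughly 10.1-fold for Karmin et al. 2015 and 16.7-fold for Maretty et al. 2017.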

Failure of counter-explanations

Counter-explanations for these results came from only one of these two studies. With respect to Maretty et al. (2017) [and the corresponding study in Skov et al. (2017)], the authors appeared to possess the raw data indicating a high per-generation Y chromosome mutation rate. However, no comment on these rates was made.

In contrast, Karmin et al. (2015) attempted to explain the unusually high mutation rate that they discovered by employing additional filtration steps to the Y chromosome sequence reads. However, rather than strengthen their counter-explanations, their attempts strengthened the original implications of their findings.

Two lines of evidence supported this contention. First, Karmin et al. (2015) employed a logically circular argument to explain away the high mutation rate. In the Supplemental Text, they explained that their rationale for explaining away the high rate was not a new discovery about the ambiguity of sequence read mapping. Rather, they stated that “we initially applied a combination of regional filters previously defined on the basis of analyses of Illumina HiSeq data (Poznik et al. 2013; Wei et al. 2013a), resulting in ten regions of Chr Y sequence, altogether capturing 10.8 Mb (filter c, Table S2). However, the application of the regional filters led only to a modest reduction of false positive calls judged by the number of father-son/brother-brother (FS) differences and the count of recurrent mutations (Table S2)” (page 4). (A higher mutation rate would also lead to more recurrent mutations; thus, attempts to reduce father-son/brother-brother mutation rates and the number of recurrent mutations are, essentially, two forms of achieving the same goal.) In other words, the Karmin et al. test for false positives was an evolutionarily defined low mutation rate.

This was made even more clear in how Karmin et al. (2015) defined the accuracy of their filtering strategy: “The number of FS [i.e., Father-Son] differences was approximately 10-fold higher than the expected number of de novo mutations considering the range of published Chr Y mutation rates (Xue et al. 2009; Francalacci et al. 2013; Mendez et al. 2013; Poznik et al. 2013). This finding prompted us to explore additional filters” (page 4 of the Supplementary Information). Of the four studies they cited (Xue et al. 2009; Francalacci et al. 2013; Mendez et al. 2013; Poznik et al. 2013), only the Xue et al. study represented a pedigree-based Y chromosome mutation rate. The other three studies derived a mutation rate via the historically circular evolutionary geology-based molecular clock method (see Introduction) or by extrapolating the autosomal mutation rate onto the Y chromosome.

Second, independent tests of the specificity and sensitivity of the Karmin et al. (2015) extra filtration steps revealed a slight gain in specificity at the expense of a large loss in sensitivity. Unfortunately, in the Karmin et al. (2015) description of their uniquely defined 8.8 Mb filter “a + b + d”—the filter combination that achieved a lower Y chromosome mutation rate—the authors reported no results on the specificity and sensitivity of their filtration strategy. Aside from their circular attempts to reduce the father-son mutation rate to a value in line with what their evolutionary expectations defined, the authors provided no checks and balances on their methods.

However, shortly after the Karmin et al. (2015) paper appeared in print, Helgason et al. (2015) reported their pedigree-based mutation rate (based on low coverage sequencing runs) that happened to fall in line with evolutionary expectations, and also fell in line with the previously published low coverage results from Xue et al. (2009). In principle, we might expect the authors of Karmin et al. (2015) to endorse the Helgason et al. (2015) results and conclusions. Thus, we can use the Helgason et al. (2015) results to test the Karmin et al. (2015) filters for sensitivity and specificity.

Since Helgason et al. weighted their mutation results based on mapping, rather than on some evolutionarily defined endpoint, we can evaluate the Karmin et al. filters based on their ability to reproduce the Helgason et al. results. To test for specificity, we can examine which Karmin et al. filters call the fewest mutations to which Helgason et al. assigned a low weight. To test for sensitivity, we can examine which Karmin et al. filters capture the most mutations to which Helgason et al. assigned a full weight.

We found that both filter “c” and filter “a + b + d” from Karmin et al. (2015) rejected—filtered out—the vast majority of the low weight Helgason et al. (2015) mutations (table 2). As expected, the more stringent filter “a + b + d” rejected more mutations than filter “c.” However, neither filter captured all of the high weight Helgason et al. (2015) mutations (table 2). Nevertheless, filter “c” retained many more high weight mutations than filter “a + b + d.”

Table 2. Mutation retention and rejection by Helgason filters.

| Filter | Unweighted (reliable) mutations | Weighted (unreliable) mutations |
| --- | --- | --- |
| (None; original Helgason et al. 2015 data) | 1015 | 1035 |
| Karmin et al. 2015 filter c | 718 | 15 |
| Karmin et al. 2015 filter abd | 593 | 3 |

Quantifying these results in terms of percentages, we found that both filters rejected >98% of the low weight mutations, with the difference between filter “a + b + d” and filter “c” being just 1.2% of the low weight mutations (table 3). Conversely, filter “c” retained 70.7% of the high weight Helgason et al. mutations, whereas filter “a + b + d” retained only 58.4%, a loss of 12.3% of the high weight mutations. Thus, by the independent test of the filtering strategy of Helgason et al. (2015), the Karmin et al. filter “a + b + d” is more stringent than the more commonly used filter “c”, yet the gain in specificity is offset by the large loss in sensitivity. This fact demonstrated that the Karmin et al. filter “a + b + d” was excessively stringent and that it artificially reduced the mutation rate to a value less than it is.

Table 3. Sensitivity and specificity of Helgason filters.

| Filter | Sensitivity (% of reliable Iceland data retained) | Specificity (% of unreliable Iceland data retained) |
| --- | --- | --- |
| Karmin et al. 2015 filter c | 70.7 | 1.4 |
| Karmin et al. 2015 filter abd | 58.4 | 0.3 |
| Loss/Gain | -12.3 | 1.2 |
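The percentages in Table 3 follow directly from the counts in Table 2. The sketch below recomputes them; the dictionary layout and function name are illustrative. Note that the column labeled "specificity" in Table 3 reports the percentage of unreliable mutations that a filter retained, so a lower value means a stricter filter.

```python
# Mutation counts copied from Table 2.
unfiltered = {"reliable": 1015, "unreliable": 1035}
after_filter = {
    "c": {"reliable": 718, "unreliable": 15},
    "abd": {"reliable": 593, "unreliable": 3},
}


def percent_retained(filter_name, kind):
    """Percentage of the unfiltered mutations of one kind kept by a filter."""
    return 100.0 * after_filter[filter_name][kind] / unfiltered[kind]


if __name__ == "__main__":
    for name in after_filter:
        reliable = percent_retained(name, "reliable")
        unreliable = percent_retained(name, "unreliable")
        print(f"filter {name}: {reliable:.1f}% reliable retained, "
              f"{unreliable:.1f}% unreliable retained")

    # Change when moving from filter c to filter abd, in percentage points.
    loss = percent_retained("abd", "reliable") - percent_retained("c", "reliable")
    gain = percent_retained("c", "unreliable") - percent_retained("abd", "unreliable")
    print(f"reliable: {loss:.1f} points, unreliable: {gain:.1f} points")
```

The 1.2-point figure comes from the unrounded values (12 of 1035 mutations), which is why it differs slightly from subtracting the rounded 1.4 and 0.3.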

Subsequent Y chromosome sequencing efforts from other investigators did not employ the Karmin et al. (2015) filter “a + b + d.” For example, to date, one of the largest global analyses of human paternal genetic history is the 1000 Genomes Project. From the 1,244 Y chromosome sequences in this project, a tree was constructed using the mapping-based 10.3 Mb filter (Poznik et al. 2016), not the 8.8 Mb filter “a + b + d.”

Together, these results strengthened the original implications (i.e., a higher mutation rate than previous studies had found) of Karmin et al.’s high coverage sequencing results.

High coverage mutation rate explains Y chromosome differences in 4,500 years

We found that the mutation rates from the high coverage studies explained the branch lengths of the Y chromosome tree within just a few thousand years (fig. 1). We also found that these rates rejected the evolutionary time of origin for the first modern Homo sapiens (fig. 2). For simplicity, when measuring total branch lengths, we began by simply adopting the typical evolutionary root position. Conversely, based on the results of the accompanying paper (Jeanson 2019), we also explored an alternative, better-supported (see Jeanson 2019) root position, and we found that the high coverage Y chromosome mutation rate explained all but the most divergent haplogroup A branch lengths in about 4,500 years (fig. 3).

Fig. 1. Mutation accumulation in the Y chromosome over the young-earth creation (YEC) timescale, evolutionary root. The derived Y chromosome pedigree-based mutation rates from high coverage sequencing runs were converted to units of mutations per year and multiplied over the YEC timescale. These predictions were compared to the branch lengths derived from Karmin et al. (2015) data, based on the typical evolutionary root position, and these predictions captured the branch length values.

Fig. 2. Mutation accumulation in the Y chromosome over the evolutionary timescale, evolutionary root. The derived Y chromosome pedigree-based mutation rates from high coverage sequencing runs were converted to units of mutations per year and multiplied over the evolutionary timescale. These predictions were compared to the branch lengths derived from Karmin et al. (2015) data, based on the typical evolutionary root position. The evolutionary predictions over-predicted the branch length values by 8- to 59-fold.

Fig. 3. Mutation accumulation in the Y chromosome over the young-earth creation (YEC) timescale, alternate root. The derived Y chromosome pedigree-based mutation rates from high coverage sequencing runs were converted to units of mutations per year and multiplied over the YEC timescale. These predictions were compared to the branch lengths derived from Karmin et al. (2015) data, based on the Alpha (Jeanson 2019) root position, and these predictions captured all but the A00 branch length values.

These latter results predicted a higher per-generation mutation rate for the most divergent A00 individuals. In addition, because this same root position shows a gradient of branch lengths (fig. 3), fig. 3 implied a gradient of per-generation mutation rates, depending on exactly which root position (i.e., Gamma, Epsilon, or Alpha, or somewhere in between; see Jeanson 2019) turned out to be correct.

Finally, these results made indirect predictions about the relationship between the history of civilization and the structure of the Y chromosome tree. Since phylogenetic trees record changes in population size (e.g., see Karmin et al. 2015), the current Y chromosome tree must have also recorded changes in past human population size. However, since our results implied that the entire tree was only a few thousand years old, our results predicted that known recent (i.e., within the last few thousand years) changes in population size would be stamped throughout the tree in a manner consistent with the recent origin of the tree. In other words, the deeper roots of the Y chromosome tree should record not changes in population size from 200,000 years ago but changes in population size from the recent past (see accompanying Jeanson 2019 paper).
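The conversion described in the captions of Figs. 1 to 3 (a per-generation rate turned into mutations per year and multiplied over a timescale) can be sketched as plain arithmetic. The rate and the 10.8 Mb region size are taken from Table 1 and the quoted filter “c” description; the 30-year generation time is an assumption made only for this illustration and is not stated in the text, and the output scales directly with it.

```python
def expected_mutations(rate_per_bp_per_generation, callable_bp, years,
                       generation_time_years):
    """Expected mutations accumulated along one lineage.

    rate per bp per generation * callable bp * number of generations
    """
    generations = years / generation_time_years
    return rate_per_bp_per_generation * callable_bp * generations


if __name__ == "__main__":
    karmin_rate = 3.02e-7         # Table 1, Karmin et al. 2015
    filter_c_bp = 10.8e6          # 10.8 Mb captured by filter c
    assumed_generation_time = 30  # hypothetical value, in years

    for years in (4_500, 200_000):
        total = expected_mutations(karmin_rate, filter_c_bp, years,
                                   assumed_generation_time)
        print(f"{years} years: about {total:.0f} expected mutations")
```

The sketch only shows how the inputs combine; it says nothing about whether the input rate itself is accurate, which is the point in dispute in the surrounding text.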

Mutation examples and how they happen

We are quick to notice and utilize some plant mutations while others go undetected.

Photo 1. Naturally occurring plant color mutations. Photo credits: orange – Forest Starr and Kim Starr, CC BY 2.0; ficus – public domain; iris – Bob Gutowski, CC BY-NC-SA 2.0; hibiscus – Dariusz Malinowski, CC BY-NC-ND 2.0.

Health and survival of an organism depends on reliable and accurate DNA (Deoxyribonucleic Acid) replication and orderly cell division. Without these processes being highly dependable, survival is questionable. However, occasional mistakes occur. What kind of mistakes happen, what causes them to happen and what are some of the outcomes?

First, it is important to know most DNA does nothing. DNA is classified as “coding” or “non-coding.” Coding DNA codes for the production of enzymes and proteins required to run the processes necessary for life. Non-coding DNA is similar to random letters placed together that do not make sense. The purpose of such an abundance of non-coding DNA is poorly understood, but of the 6.5 feet of DNA in each human cell, only about 1 inch is coding DNA. Mistakes within the non-coding sections have no apparent consequences, and that is one theory as to why there is so much: it may act as a buffer to protect coding DNA. A previous Michigan State University Extension article, “Mutants have value too,” mentioned some DNA changes are useful. This article will discuss how they occur and give examples of commonly seen plant mutations.

Mutations are due to changes occurring within DNA itself or in the replication/cell division process. Changes within the DNA molecule are referred to as “point mutations” since they occur in a small portion of the DNA but may still have significant effect because they change the “meaning of the code.” Point mutations can be due to damage from cosmic rays, chemicals and viruses. They can also be due to stress from heat, cold, severe pruning or replication error causing a shift in DNA sequences so it no longer makes sense. Many biological systems are pathway-type systems requiring intermediate products to form before producing the final product. Enzymes control these intermediate steps, and interruption in any step prevents the end product from being produced. Therefore, the more steps in the pathway, the more vulnerable the system is to possible change.

Photo 2. Dwarf spruce with a branch reverting to the original non-dwarf state. Photo by Ragesoss CC BY-SA 3.0.

Point mutations affect many systems within plants. The most visually dramatic are color or shape. Photo 1 shows various naturally occurring color mutations. The change could affect a portion of a flower, fruit or leaf, or an entire branch. Depending on which tissue is involved, the change can be passed onto the next generation through seeds. They can also be propagated through grafting or cuttings. Some mutations can be unstable and result in producing sections of the plant that revert to their original state (Photo 2).

Plant point mutations are often found after stressful environmental conditions, especially cold. All cells in an organism contain the same genetic information no matter their location. Some cells form roots while others form flowers even though both contain the same genetic information. We do not fully understand what regulates this process. However, we do know that cells forced to reprogram to a different function appear prone to making mistakes in the process. This happens when plants experience bud-killing temperatures. When normal vegetative buds suffer damage, the plant forms adventitious buds that grow into new shoots. Most cells will reprogram successfully, but some may express change. Most changes go unnoticed and are not beneficial, but there could be a change in color or growth habit, which we easily spot and find attractive or beneficial.

A little explanation on plant anatomy and development may clarify mutation appearance. Plant structures begin with a single cell. That one cell divides to make two, those two divide to make four, then four divide to make eight and on and on until the structure is complete. That is why some visual mutations appear quite geometric. The hibiscus flower in Photo 1 is mostly half-white and half-pink, indicating the color change occurred at the two-cell stage. That is also what happens in a half red, half yellow apple fruit.
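The geometric pattern follows from simple doubling. As an idealized sketch, if every cell present at a given stage contributes equally to the finished structure, a mutation arising in one cell at the n-cell stage ends up in 1/n of it. The function name is illustrative, and real tissues rarely divide this evenly.

```python
from fractions import Fraction


def affected_fraction(cell_stage):
    """Fraction of the finished structure descended from one mutated cell.

    cell_stage is the number of cells present when the mutation arose
    (1, 2, 4, 8, ...). Assumes every cell contributes equally.
    """
    if cell_stage < 1 or cell_stage & (cell_stage - 1) != 0:
        raise ValueError("cell_stage must be a power of two")
    return Fraction(1, cell_stage)


if __name__ == "__main__":
    for stage in (2, 4, 16):
        print(f"{stage}-cell stage: {affected_fraction(stage)} of the structure")
```

A half-and-half flower corresponds to the two-cell stage, a quarter sector to the four-cell stage, and a narrow stripe to a later stage such as 16 cells.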

Photo 3. Fruit mutations found in a supermarket produce section. Striping on Gala apple (A, left) and a red pear (A, right). Rind thickness change on orange (B and C). Arrows indicate the area of rind thickening on the oranges (B and C). Photos by Ron Goldy, MSU Extension.

To prepare for this article, I took a field trip to the local supermarket. As expected, I found mutations. They are easy to spot once you know what to look for. Photo 3 shows what I found. Based on the size of the change, the orange fruit on the left in Photo 3B and C apparently had a change occur at the four-cell stage and the one on the right at the 16-cell stage. These visual changes can be surprising when observed since they do not happen often, but they are not unusual once the process is understood.

Fruit color mutations are most obvious. Color development is a pathway process with several intermediate steps between initial and final product. Color changes, therefore, happen quite often, especially changing to less color. However, many red apples have improved color from the original because apple growers find single limbs with highly colored fruit. Buds from those limbs are then propagated into entire trees.

Another common type of mutation involves adding or deleting chromosomes or adding an entire set of chromosomes. These result from mistakes during the cell division process. During normal cell division, chromosomes line up, duplicate and then are pulled apart and equally distributed into the two resulting cells. Sometimes chromosomes “lag” and get left behind, resulting in an unequal distribution: one cell has more and the other has fewer. These cells often do not do well since half of them are missing necessary information and unequal numbers lead to further replication difficulties.

However, occasionally chromosomes duplicate and a new cell does not form. This results in the original cell having an entire extra chromosome set. These changes are quite stable since they have the necessary information, just twice as much, and they have an equal number of chromosomes, making further cell division regular. The resulting cells of this change are said to be polyploid (poly = many; ploidy = chromosomes). This change can happen in all cells, but if it occurs in cells responsible for sexual reproduction, they form egg cells and pollen grains that have twice the normal number of chromosomes and the resulting eggs and pollen are referred to as “unreduced gametes.”

If an unreduced pollen grain combines with an unreduced egg cell from the same species, it has the potential of developing into an entire new plant species. This process has given rise to some well-known food plants. Blueberries and strawberries are part of a polyploid series with some being diploids (the normal situation of two sets of chromosomes), tetraploids (four sets), hexaploids (six sets) and octoploids (eight sets). Commercial strawberries are octoploids and commercial blueberries are either tetraploids or hexaploids. The tetra-, hexa- and octoploids all are thought to trace their origin back to a diploid ancestor that went through the unreduced gamete production steps and combinations. Other polyploid plants include wheat (tetraploid or hexaploid), oats (hexaploid), kiwifruit (hexaploid) and others. In fact, 30 to 80 percent of all plants are polyploids.

You will note all levels mentioned are even numbers: two, four, six, eight, etc. None were odd: one, three, five, etc. That is because odd numbers return us back to the problem of unequal chromosome distribution during cell division. However, for every rule there is an exception, and the potato has members with two, three, four and five sets of chromosomes, but then potato does not rely solely on sexual reproduction but can be propagated through asexual seed pieces. The odd sets do exist or can be made in other plant species, and we have taken advantage of them as food crops since in many cases the unequal chromosome distribution leads to seedlessness such as seedless watermelon and bananas. The plants will grow, but they will not produce offspring; they are sterile and only have traces of seeds.
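The bookkeeping behind these ploidy levels can be sketched with a few lines of arithmetic. In this simplified model, a normal gamete carries half of the parent's chromosome sets, an unreduced gamete carries all of them, and a parent with an odd number of sets cannot split them evenly. The function names are illustrative and the model ignores everything except set counts.

```python
def gamete_sets(parent_sets, unreduced=False):
    """Number of chromosome sets carried by a gamete."""
    if unreduced:
        return parent_sets  # gamete keeps every set of the parent
    if parent_sets % 2 != 0:
        raise ValueError("odd ploidy: sets cannot be divided evenly")
    return parent_sets // 2


def offspring_sets(egg_sets, pollen_sets):
    """Chromosome sets in the offspring: egg plus pollen."""
    return egg_sets + pollen_sets


if __name__ == "__main__":
    # Two unreduced gametes from diploid parents give a tetraploid.
    egg = gamete_sets(2, unreduced=True)
    pollen = gamete_sets(2, unreduced=True)
    print("diploid x diploid (unreduced):", offspring_sets(egg, pollen), "sets")

    # Normal gametes from a tetraploid and a diploid give a triploid,
    # which then fails the even-split check when forming its own gametes.
    triploid = offspring_sets(gamete_sets(4), gamete_sets(2))
    print("tetraploid x diploid:", triploid, "sets")
```

Calling `gamete_sets(3)` raises an error in this model, which mirrors the uneven chromosome distribution that leaves odd-ploidy plants largely seedless.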

Mutations also occur within animal systems. However, since animal systems are more complex, their survival is not as dependable and changes not as dramatic. There are some polyploid fish and amphibians, but polyploid mammals are rare and it is even rarer for them to survive until birth.

Genetic Instability

Meiyun Guo, … Christopher A. Maxwell, in Encyclopedia of Cancer (Third Edition), 2019

Chromosomal Instability and Chromothripsis

Chromosomal instability is highly prevalent in human cancers with about 90% of tumors displaying chromosomal abnormalities. Chromosomal instability is defined as an increased rate of change in the structure or number of chromosomal segments or whole chromosomes, including amplification, deletion, loss of heterozygosity, translocation, insertion, inversion, and homozygous deletion. The consequence of chromosomal instability can be severe owing to the large-scale alterations that can affect the expression of thousands of gene products and, as a result, can dramatically alter cancer cell phenotypes that enable progression or endow intrinsic multidrug resistance. Therefore, many cellular mechanisms are dedicated to the preservation of chromosome stability, including DNA repair pathways, telomere regulation, and checkpoints to ensure mitotic spindle assembly and chromosome segregation.

Chromothripsis is a chromosomal instability phenomenon where hundreds of chromosomal rearrangements occur during one event in a localized region of one or a few chromosomes (Fig. 2). An analysis of 746 cancer cell lines revealed that more than 2%–3% of cancers display massive genomic rearrangements on one chromosome. It is proposed that the massive rearrangements occur from one single chromosome fragmentation event followed by faulty DNA repair that joins the fragments together. DNA sequencing analysis of a patient with chronic lymphocytic leukemia revealed some characteristics of chromothripsis. First, chromothripsis occurs at specific locations of the genome while rearrangement events that characterize conventional genetic instability are assumed to occur randomly across the genome. Second, the copy number at many regions in an individual chromosome changes between one or two copies, and regions with only one copy are not generated simply by deletion but through rearrangements. Third, the breakpoints at the chromosome arm are clustered so that multiple rearrangements take place within a narrow region, and the fragments of the chromosome connected at breakpoints are originally located distal from each other. Since the locations of breakpoints are clustered, it is plausible that chromothripsis occurs during chromosome condensation related to mitosis. The occurrence of chromothripsis is especially high in bone cancers, but the process is not restricted to a specific tumor subtype.

Fig. 2. Chromothripsis leads to chromosomal instability and is often found in aggressive tumor cells. There are several proposed insults that can induce chromothripsis, including ionizing radiation acting on condensed chromosomes, chromosome end-to-end fusion caused by telomere shortening that leads to a double strand break, abortive apoptosis, pulverization of chromosomes in micronuclei, and chromosome condensation that occurs in S phase. These insults may cause a single catastrophic event which fragments specific regions of a chromosome. During the subsequent DNA repair, some of the chromosomal fragments are lost and some are rearranged.

Something new

While the WHO emphasized the similarity between B.1.1.7 and other strains of SARS-CoV-2, the UK government has been focusing on what's new. It's here where the strain was first identified, and most people infected with it are in the UK.

Details on the new strain became apparent because of ongoing surveillance work within the UK, where researchers randomly sequence the genomes of thousands of virus samples every month. Over the course of the pandemic, circulating strains of SARS-CoV-2 have typically picked up an average of one or two mutations a month, so this level of surveillance has been sufficient to follow the origin and spread of new strains. But B.1.1.7, first spotted in samples obtained in late September, had nothing like the sort of gradual accumulation of changes we've seen before. There were 17 differences between it and the most closely related known strain, giving B.1.1.7 a branch way off on its own on the coronavirus family tree.
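To see why 17 differences stood out, a back-of-the-envelope sketch can estimate how long a lineage would need to accumulate them at the typical pace quoted above. This treats accumulation as steady and ignores chance variation, so it is only a rough guide; the function name is illustrative.

```python
def months_to_accumulate(differences, mutations_per_month):
    """Months needed to reach a number of differences at a steady rate."""
    return differences / mutations_per_month


if __name__ == "__main__":
    differences = 17  # between B.1.1.7 and its closest known relative
    for rate in (1, 2):  # typical mutations per month from the text
        months = months_to_accumulate(differences, rate)
        print(f"at {rate} per month: about {months} months")
```

At one to two mutations a month, 17 differences would correspond to somewhere between eight and a half and seventeen months of gradual change, yet no intermediate strains along that path had been sampled.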

That "on its own" is a curiosity rather than a concern. What grabbed people's attention was a correlation. In response to the winter wave of infections throughout Europe, the UK had restarted a set of social restrictions intended to bring infection levels back down. And in most of the country, those restrictions were working as intended. But not in the southeast and east of the UK. And it was precisely that region where levels of the B.1.1.7 strain were highest. In one county, B.1.1.7 accounted for over 20 percent of all new infections by mid-December, and that number has gone up since.

That's not decisive evidence that the B.1.1.7 strain has any sort of advantage. The COVID-19 pandemic has been marked by many "superspreader" events and social groups that flout public health measures. This combination can cause the rapid expansion of strains that happen to be circulating within these groups at opportune moments. But by this weekend, B.1.1.7 was accounting for about 60 percent of the new cases in London, prompting government officials there to claim that the strain can spread more rapidly.

To know this for sure, however, we'll have to engineer the mutations found in B.1.1.7 into a lab strain and then test its infectivity. In the meantime, scientists have looked over the mutations present in the new viral strain and speculated on which ones could potentially be providing it with enhanced infectivity or altering the course of infections.


Genetic Testing for ALS

If there is more than one person with ALS and/or frontotemporal dementia in your family or someone was diagnosed at a younger age (such as age 45), you may want to meet with a genetic counselor. Meeting with a genetic counselor involves taking a detailed medical and family history, evaluating risks, and discussing the impact of genetic testing.

A genetic counselor can help you work through the pros and cons of genetic testing based on your concerns and values. Genetic counseling does not always lead to genetic testing.

For more information about genetic counseling or how to find a counselor in your area, please visit

Genetic testing can help determine the cause of Familial ALS in a family. Testing is most useful in a person who has been diagnosed with ALS. About 60-70 percent of individuals with Familial ALS will have a positive genetic test result (meaning a mutation has been identified).

Those families with Familial ALS where a mutation is not identified may have ALS caused by a gene or genes that have not yet been discovered. Not having an identified genetic mutation does not eliminate a Familial ALS diagnosis and other family members may still be at risk for developing ALS.

If a mutation has been identified, biological family members who don’t have symptoms can be tested to see if they inherited the genetic mutation; this is called predictive testing. Some medical centers may require a neurological exam, psychological assessment and counseling before predictive testing.

If a person in the family with ALS has a negative genetic test result (no identified genetic mutation), testing family members without a diagnosis of ALS will not provide more information. If no one in the family with ALS is available for genetic testing, a negative test result in an unaffected person cannot be interpreted.

Genetic testing usually involves taking blood or saliva samples. Because this testing needs to be ordered by a health care professional, the sample is usually taken in the doctor’s office or in a lab associated with the doctor’s office. Results can take anywhere from a few weeks to a few months depending on the type of testing ordered. Results should be communicated by the genetic counselor or doctor who ordered the test. This is often done in person at a follow-up appointment or sometimes by telephone.

Because Familial ALS is usually an adult-onset condition, genetic testing of children under age 18 is not recommended.

Genetic testing protocols may differ among clinics. Some clinics may offer testing for different genes, and focus on testing specific patients or people in a family. Other testing may be offered on a research basis only. Test results are not always straightforward. There are some genetic changes that scientists do not understand yet so results can be difficult to interpret.

Genetic testing is a personal choice. Some people with ALS want genetic testing to better understand why they got the disease and help other family members. Some unaffected people want to know if they are at risk for ALS, while others would prefer not to know. Consultation with a genetic counselor can help you decide if testing is the right decision for you.

Some reasons people at risk for Familial ALS decide to have testing include: getting more information to help make life decisions, allowing time to adjust to the fact they will likely get ALS, reducing anxiety, guiding reproductive decisions and providing information for the next generation.

Some reasons people at risk for Familial ALS decline genetic testing include: the desire to avoid worry about getting ALS, knowing there is currently no cure, and avoiding guilt about passing it on to children or about testing negative when others in the family test positive.

Genetic testing may:

  • Explain if there is a genetic cause of ALS in the family.
  • Allow other family members to have testing to see if they carry the genetic mutation.
  • Allow couples planning on having children to pursue prenatal testing.

Genetic testing does not:

  • Currently change medical treatment.
  • Diagnose ALS in people without symptoms.
  • Tell a person without symptoms when they may start showing symptoms or what their progression will be.

DNA banking is a valuable option for people with ALS who do not currently have an identifiable genetic mutation. DNA banking means storing a person’s blood so that it is available for future testing. For more information, you can consult a genetic counselor.

Concerns about Genetic Testing

Genetic testing for all of the currently known Familial ALS genes can cost from about $1600 to $5000. Genetic testing for one gene usually costs $500 to $1500. When the genetic mutation in a family is already known, the cost to test for the familial mutation is usually around $400. Genetic testing is not always covered by insurance. Check with your insurance company about any out-of-pocket expenses prior to testing.