How many possible codons?

Consider a codon of the form NNK (where N = Adenine, Cytosine, Guanine or Uracil & K = Uracil or Guanine). How many codons are now available? I know if all were available there would be 4^3 = 64 codons. How many are possible now? When I have tried combinations manually I got it to be 32, is this correct?

Yes, 32 is correct.

Technically, I have nothing to add to what Gerardo Furtado and a tiger have already mentioned, but a graphical representation of all the combinations might help to understand this a bit better.

For the first 2 positions in the codon we have 4 bases to choose from (adenine, guanine, uracil and cytosine). So this can mathematically be represented as 4 x 4 = 16 or visually as:

Now, for the next position (the position of interest) we only have two bases to choose from (uracil and guanine). Leaving us with 4 x 4 x 2 = 32 different permutations, or visually:

Yes, it is.

Gerardo Furtado has already provided a short answer in the comments, but allow me to explain why.

The 4^3 = 64 if all bases can be chosen freely comes down to 4*4*4=64, because you can choose from 4 options for each of the 3 bases.

If you restrict the possibilities of one of the bases, e.g. there are only 2 options for the third base (as in your example), the formula changes to 4*4*2=32, as you can choose from 4 options each for the first 2 bases and from 2 options for the 3rd base.

Edit: If the 2 'N's from 'NNK' have to be the same, you can choose the first base freely (4 options), the second base is already defined (it has to be the same as the first base, 1 option) and there are 2 options for the 3rd base as described above. Thus, we get 4*1*2=8 possible codons.
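The counts above can also be checked by brute force. The following is a minimal sketch, assuming the degenerate symbols from the question (N = any base, K = G or U); the helper name `expand` is made up for this illustration.

```python
from itertools import product

# Degenerate symbols as defined in the question (RNA alphabet).
DEGENERATE = {
    "N": "ACGU",  # any of the four bases
    "K": "GU",    # guanine or uracil
}

def expand(pattern):
    """Return every concrete codon that matches a pattern such as 'NNK'."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(bases) for bases in product(*choices)]

nnn = expand("NNN")  # unrestricted codons
nnk = expand("NNK")  # third base restricted to G or U
# Variant from the edit: both N positions must hold the same base.
same_first_two = [codon for codon in nnk if codon[0] == codon[1]]

print(len(nnn), len(nnk), len(same_first_two))  # 64 32 8
```

Enumerating the codons rather than only multiplying the options makes it easy to inspect which 32 triplets an NNK pattern actually covers.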

I hope I could clear this up.

What Is the Genetic Code?

The genetic code consists of the sequence of nitrogen bases in a polynucleotide chain of DNA or RNA. The bases are adenine (A), cytosine (C), guanine (G), and thymine (T) (or uracil, U, in RNA). The four bases make up the "letters" of the genetic code. The letters are combined in groups of three to form code "words," called codons. Each codon stands for (encodes) one amino acid unless it codes for a start or stop signal. There are 20 common amino acids in proteins. With four bases forming three-base codons, there are 64 possible codons. Because 61 of these codons specify amino acids, far more than the 20 that are needed, most amino acids are encoded by more than one codon. The genetic code is shown in Table 1.

Table 1: Codon Chart. To find the amino acid for a particular codon, find the row for its first base, the column for its second base, and the line within that row for its third base. The amino acid is listed next to the codon. For example, CUG codes for leucine (Leu), AAG codes for lysine (Lys), and GGG codes for glycine (Gly).

| First base | Second base U | Second base C | Second base A | Second base G | Third base |
|---|---|---|---|---|---|
| U | UUU Phe | UCU Ser | UAU Tyr | UGU Cys | U |
| U | UUC Phe | UCC Ser | UAC Tyr | UGC Cys | C |
| U | UUA Leu | UCA Ser | UAA (stop) | UGA (stop) | A |
| U | UUG Leu | UCG Ser | UAG (stop) | UGG Trp | G |
| C | CUU Leu | CCU Pro | CAU His | CGU Arg | U |
| C | CUC Leu | CCC Pro | CAC His | CGC Arg | C |
| C | CUA Leu | CCA Pro | CAA Gln | CGA Arg | A |
| C | CUG Leu | CCG Pro | CAG Gln | CGG Arg | G |
| A | AUU Ile | ACU Thr | AAU Asn | AGU Ser | U |
| A | AUC Ile | ACC Thr | AAC Asn | AGC Ser | C |
| A | AUA Ile | ACA Thr | AAA Lys | AGA Arg | A |
| A | AUG Met (start) | ACG Thr | AAG Lys | AGG Arg | G |
| G | GUU Val | GCU Ala | GAU Asp | GGU Gly | U |
| G | GUC Val | GCC Ala | GAC Asp | GGC Gly | C |
| G | GUA Val | GCA Ala | GAA Glu | GGA Gly | A |
| G | GUG Val | GCG Ala | GAG Glu | GGG Gly | G |
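A codon chart like this one is, in effect, a lookup table. The sketch below builds the standard table from a compact 64-letter string of one-letter amino acid codes written in U, C, A, G order (with `*` for stop) and uses it to translate an mRNA. The names `CODON_TABLE` and `translate` and the example mRNA are illustrative assumptions, not taken from the source.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the third base fastest, matching the row order of the chart.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate an mRNA read from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# The three examples from the caption: Leu (L), Lys (K), Gly (G).
print(CODON_TABLE["CUG"], CODON_TABLE["AAG"], CODON_TABLE["GGG"])
# A made-up message: AUG CUG AAG GGG UAA
print(translate("AUGCUGAAGGGGUAA"))  # MLKG
```

The sketch ignores start-codon searching and simply reads from the first base, which is enough to show how the chart is used.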

Genetic Code: 8 Important Properties of Genetic Code

(1) Code is a Triplet (2) The Code is Degenerate (3) The Code is Non-overlapping (4) The Code is Comma Less (5) The Code is Unambiguous (6) The Code is Universal (7) Co-linearity and (8) Gene-polypeptide Parity.

Genetic Code refers to the relationship between the sequence of nitrogenous bases (UCAG) in mRNA and the sequence of amino acids in a polypeptide chain. In other words, the relationship between the 4 letters language of nucleotides and twenty letters language of amino acids is known as genetic code.

DNA (or RNA) carries all the genetic information and it is expressed in the form of proteins. Proteins are made of 20 different amino acids. The information about the number and sequence of these amino acids forming protein is present in DNA, and during transcription is passed over to mRNA. The form in which it is transferred was not understood for long.

Sugar (pentose) and phosphate of DNA could not perform this job of passing on the genetic message to mRNA because sugar is only of one type and so also the phosphate. This leaves only four nucleotides to form the message for 20 amino acids, but 4 nucleotides are too few for twenty amino acids.

This difficult problem was solved with the discovery that a codon (hereditary unit of a gene) containing the coded information for one amino acid consists of three nucleotides (i.e., a triplet code). Thus, for twenty amino acids, 64 (4 x 4 x 4 or 4^3 = 64) possible triplets are available. This breakthrough resulted in the dictionary of 64 codons: the Genetic Code.

According to Bark (1970), the genetic code is a code for amino acids; specifically, it is concerned with which codons specify which amino acids. The genetic code is the outcome of experiments performed by M. Nirenberg, S. Ochoa, H. Khorana, F. Crick and Matthaei. Nirenberg shared the 1968 Nobel Prize in Physiology or Medicine with Khorana and R. Holley for this outstanding work.

The dictionary of genetic code employs the letters in RNA (U, C, A, G, i.e., A = Adenine, U = Uracil, C = Cytosine, G = Guanine)

The codons for the amino acids, which are the same in almost all known life forms, have been determined experimentally. They are given in Fig. 7.3.

In Fig. 7.3 note that more than one codon can signal a particular amino acid to be incorporated into a protein. In addition, some codons serve special functions.

For example, the codon AUG serves two functions:

(1) As an initiator codon signaling for the start of synthesis of a peptide, and

(2) For the incorporation of methionine into the growing chain of a peptide. Other special-purpose codons are UAA (Ochre), UAG (Amber), and UGA (Umber), all of which signal STOP.

When the ribosomal synthesis site encounters one of these stop codons, the peptide chain is released and assumes its secondary and tertiary structures. Since UAA (Ochre), UAG (Amber) and UGA (Umber) do not specify any amino acid they are also called nonsense codons.

“When preceded by an initiator region, the codon AUG signals: “Start a new peptide molecule beginning with N-formylmethionine, or fMet.” The codons UAA, UAG and UGA signal termination of the protein synthesis.”

Properties of Genetic Code:

The properties of genetic code determined by extensive experimental evidences may be summarized as follows:

1. Code is a Triplet:

As pointed out earlier, the coding units or codons for amino acids are three-letter words, giving 4 x 4 x 4 or 4^3 = 64 codons. 64 codons are quite adequate to specify the 20 amino acids found in proteins.

2. The Code is Degenerate:

The occurrence of more than one codon for a single amino acid is referred to as degeneracy. A review of the genetic code dictionary will reveal that most of the amino acids have more than one codon. Out of the 61 functional codons, AUG and UGG are the only codons for their amino acids (methionine and tryptophan, respectively). The remaining 18 amino acids are coded by the other 59 codons.
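The codon counts quoted here can be tallied directly from the standard table. This is a small sketch, assuming the same compact one-letter encoding of the standard code in U, C, A, G order (`*` = stop); the variable names are illustrative.

```python
from collections import Counter

# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons assigned to each amino acid (stop codons excluded).
codons_per_aa = Counter(aa for aa in AMINO_ACIDS if aa != "*")

sense_codons = sum(codons_per_aa.values())
single_codon = sorted(aa for aa, n in codons_per_aa.items() if n == 1)
multi_codon = {aa: n for aa, n in codons_per_aa.items() if n > 1}

print(sense_codons)                             # 61
print(single_codon)                             # ['M', 'W'] (Met and Trp)
print(len(multi_codon), sum(multi_codon.values()))  # 18 59
```

The most degenerate amino acids in this tally are leucine, serine and arginine, with six codons each.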

3. The Code is Non-overlapping:

4. The Code is Comma Less:

A comma less code means that no nucleotide or comma (or punctuation) is present in between two codons. Therefore, code is continuous and comma less and no letter is wasted between two words or codons.

5. The Code is Unambiguous:

There is no ambiguity in the genetic code. A given codon always codes for a particular amino acid, wherever it is present.

6. The Code is Universal:

The genetic code has been found to be universal in all kinds of living organisms — prokaryotes and eukaryotes.

7. Co-linearity:

DNA is a linear polynucleotide chain and a protein is a linear polypeptide chain. The sequence of amino acids in a polypeptide chain corresponds to the sequence of nucleotide bases in the gene (DNA) that codes for it. Change in a specific codon in DNA produces a change of amino acid in the corresponding position in the polypeptide. The gene and the polypeptide it codes for are said to be co-linear.

8. Gene-polypeptide Parity:

A specific gene transcribes a specific mRNA that produces a specific polypeptide. On this basis, a cell can have only as many types of polypeptides as it has types of genes. However, this does not apply to certain viruses which have overlapping genes.

Quick Notes on Genetic Code | Cell Biology

Living things depend on proteins for existence; among the proteins are the enzymes necessary for all chemical reactions. The structural information required to specify the synthesis of any given protein resides in the molecule of DNA, which has the spatial configuration of a double helix proposed by Watson and Crick (1953).

The linear sequence of bases in DNA constitutes an alphabet (hereditary lettering of 4 bases: A, T, G, C) which 'codes' for another linear structure, a protein, written in another alphabet of 20 amino acids.

The actual transfer of information is, however, indirect. DNA is a 'template' for the formation of messenger RNAs, which associate with ribosomes and in turn act as templates for protein synthesis.

All properties of a protein, including its secondary and tertiary structure, are ultimately determined by chromosomal DNA, and all biological properties are in turn determined by the amino acid sequences of the proteins within an organism, through protein structure and enzyme activity.

The term 'coding' implies the relationship between DNA and protein. By coding, the hereditary lettering carried in the four-letter alphabet of DNA is ultimately converted into the protein language composed of the twenty-letter alphabet of amino acids.

Co-linearity of Gene and Polypeptide:

In 1958, Crick proposed the hypothesis that DNA determines the sequence of amino acids in a polypeptide. Fundamental to this relationship is that they are both linear in structures, in one case a sequence of nucleotides, in the other case a sequence of amino acids.

By comparing the nucleotide sequence of a gene with the amino acid sequence of a protein, we can determine directly whether the gene and the protein are co-linear or not. A gene of 3N base pairs is required to code for a protein of N amino acids.

The co-linearity of gene and protein was originally investigated in the tryptophan synthetase gene of E. coli by Yanofsky and his co-workers, using polypeptide chain A of the tryptophan synthetase enzyme. They observed that different mutations occur in the DNA sequence in the same order as the corresponding amino acid alterations in polypeptide chain A.

The recombination distances are relatively similar to the actual distances in the protein, so in this case there is close agreement between the recombination map and the physical map.

Eukaryotic split genes, in which introns are not translated into amino acids, demonstrate that co-linearity between the base sequence of a gene and the amino acid sequence of its protein may be interrupted but is not violated.

Properties of Genetic Code:

Code is Triplet:

Research has been carried out by Ochoa, Kornberg, Nirenberg, Brenner, Crick and others to determine the coding ratio, i.e., the number of units in one system required to specify one unit in the other system. Certainly no one-to-one correspondence can exist between nucleotides and amino acids.

If each kind of nucleotide specified a single amino acid, only proteins built from four kinds of amino acids could be constructed. Similarly, the correspondence of an amino acid to two nucleotides would give a larger number of possibilities but still not enough, only 4^2 = 16.

If a three-letter code is employed, however, a total of 4^3 = 64 kinds of units or codons are established (Fig. 15.1), more than enough to encode twenty amino acids. The surplus forty-four triplets were initially thought to be nonsense codons and the remaining twenty sense codons.
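The coding-ratio argument above is a short piece of arithmetic: find the smallest word length n for which 4^n is at least 20. A minimal sketch follows; the function name `min_codon_length` is made up for this illustration.

```python
def min_codon_length(n_bases=4, n_amino_acids=20):
    """Smallest word length whose number of possible words covers all amino acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length

# Number of distinct words for codon lengths 1, 2 and 3.
words_by_length = {n: 4 ** n for n in (1, 2, 3)}

print(words_by_length)       # {1: 4, 2: 16, 3: 64}
print(min_codon_length())    # 3
print(words_by_length[3] - 20)  # 44 surplus triplets
```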

However, later studies have shown that several triplets can code for one amino acid. As such the number of nonsense triplets is very few. Some of the nonsense triplets might also be used as ‘punctuations’, designating the end of a chemical message.

Critical information on the nature of the coding units (i.e., that the code is in triplets) was gathered from studies of the effect of mutagens on the polynucleotide chain (DNA).

Application of a mutagen can lead to the deletion or duplication of one nucleotide pair or of several adjacent pairs. Addition or deletion of one or two bases often has a drastic effect, and the organism ultimately dies.

The addition or deletion of three bases together, on the other hand, though causing changes in the behaviour of the organism, may not necessarily have a lethal effect, and the organism may survive in an altered, mutant form.

(i) The direct evidence supporting the triplet code concept was provided by Crick et al. (1961), based on their experiments on a virus, T4 bacteriophage (Fig. 15.2). They found that treatment with a chemical called proflavine either added or removed a base in its DNA molecule, thus damaging the virus and resulting in an altered or mutant form of the virus.

An addition followed by a deletion of base close by resulted in the restoration of the original virus. This implied that the normal sequences of bases in the DNA molecule had been restored by the second change.

A deletion or insertion completely upsets the reading frame as may be seen from the example of the base sequence GTCCAGACC. Normally the sequence will be read as GTC, CAG, ACC, …, but with the insertion of a new base T between the first and second nucleotides, it yields the sequence GTTCCAGACC … and leads to reading in the groups GTT, CCA, GAC, C …, and specifies wrong amino acids.
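The reading-frame shift in this example can be reproduced in a few lines. The sketch below splits a sequence into consecutive triplets, applies the insertion described above, and then applies a nearby deletion to show the frame being restored. The position of the compensating deletion is an arbitrary choice made for illustration.

```python
def read_codons(seq):
    """Split a sequence into consecutive triplets from the first base (last group may be short)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

original = "GTCCAGACC"
# Insert T between the first and second nucleotides.
inserted = original[:1] + "T" + original[1:]
# Compensating deletion of one base a little further along (index 4, a C).
restored = inserted[:4] + inserted[5:]

print(read_codons(original))  # ['GTC', 'CAG', 'ACC']
print(read_codons(inserted))  # ['GTT', 'CCA', 'GAC', 'C']
print(read_codons(restored))  # ['GTT', 'CAG', 'ACC']
```

After the deletion only the triplet between the two changes differs from the original, which is the behaviour the suppressor experiments relied on.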

A similar consequence results from a deletion. Combining an addition and a deletion by recombination restores the correct reading frame of the sequence except in the region between them. It is easy to see that combining two mutations of the same kind, two insertions or two deletions, will still produce a shifted reading frame.

Crick et al. (1961) found that three additions or three deletions of nearby nucleotides resulted in the production of a normal virus, owing to the restoration of the normal reading frame in the DNA.

Thus the experiments demonstrating that a combination of three insertions or three deletions produced a bacteriophage of normal appearance, whereas recombinants containing insertions or deletions in numbers that are not multiples of three produce only nonfunctional or wrong protein, provided strong evidence that the genetic code operates as a triplet code, i.e., that one triplet of nucleotides constitutes a codon.

(ii) The triplet nature of the code was further confirmed by the work of Nirenberg and Leder (1965), who found that although little binding of tRNA was possible in the presence of dinucleotide messengers, it occurred preferentially with trinucleotides.

They were able to stimulate binding of different amino acids through different sequences of the same three bases, once again giving credence to the existence of a triplet code.

Code is Non-Overlapping:

In nature, there is always a tendency towards economy. Gamow, in his 'overlapping' coding hypothesis, suggested that the code is in the form of triplets that are not arranged end to end in a straight chain but overlap, so that a particular nucleotide serves in more than one coding unit.

Gamow suggested an overlapping code on the basis of two characteristics:

(a) The distance between two bases in a DNA molecule is 3.4 Å.

(b) In a protein molecule also, the distance between two adjacent amino acids is 3.4 Å.

This can be explained by single-base coding as well as by overlapping coding, but it is quite improbable in a straight-chain triplet code. In a non-overlapping code six nucleotides would code for two amino acids, while in an overlapping code they could code for up to four (Fig. 15.3).

In a non-overlapping code each letter is read only once, while in an overlapping code it would be read three times, each time as part of a different word. A mutational change in one letter would affect only one word in a non-overlapping code, while it would affect three words in an overlapping code.

Several lines of evidence point to the non-overlapping nature of the genetic code.

(i) The experimental evidence of Crick et al. (1961) argued compellingly against an overlapping code and substantiated the arguments of earlier scientists in favour of a non-overlapping code. They started with a messenger of known triplet sequence and used it to synthesize a particular protein.

On adding a nucleotide to it, the particular protein could no longer be synthesized. The result remained unaltered even with the addition of a second nucleotide. The proper function of the protein was restored, however, on introduction of a third nucleotide.

A given nucleotide sequence ACTACTACTACT bears the codons ACT, ACT, ACT, ACT under a non-overlapping coding system. An insertion of a nucleotide G between the first C and the first T will, under such a system, change the nucleotide sequence to ACGTACTACTACT and the codon sequence to ACG, TAC, TAC, TAC, T.

The original protein will not be synthesized after the addition of a nucleotide. Instead, the altered codon sequence will produce an altogether different protein. A second insertion of another nucleotide G between the first C and the first G of the previously altered chain results in a new nucleotide sequence ACGGTACTACTACT and the corresponding codon sequence ACG, GTA, CTA, CTA, CT.

The particular protein still cannot be synthesized. A third nucleotide addition, an insertion of G at the beginning of the chain obtained in the last step, causes it to read GACGGTACTACTACT, and the corresponding codon sequence is GAC, GGT, ACT, ACT, ACT.

The third addition has restored most of the original triplet sequence. The deletion of bases from DNA has the same effect as addition. A third deletion will likewise restore most of the reading frame and allow a sequence of amino acids differing only slightly from the original one. This suggests that the code is non-overlapping.
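The three successive insertions in this worked example can be traced step by step. The following sketch applies each insertion to the ACT repeat and prints the resulting triplets; the helper names `read_codons` and `insert` are illustrative.

```python
def read_codons(seq):
    """Split a sequence into consecutive triplets from the first base (last group may be short)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def insert(seq, index, base):
    """Return seq with base inserted before position index."""
    return seq[:index] + base + seq[index:]

original = "ACTACTACTACT"
one = insert(original, 2, "G")   # G between the first C and the first T
two = insert(one, 2, "G")        # G between the first C and the first G
three = insert(two, 0, "G")      # G at the very beginning

for label, seq in [("original", original), ("+1", one), ("+2", two), ("+3", three)]:
    print(label, seq, read_codons(seq))

# Count how many ACT triplets are read in frame after each step.
in_frame = [read_codons(s).count("ACT") for s in (original, one, two, three)]
print(in_frame)  # [4, 0, 0, 3]
```

Only after the third insertion do the downstream ACT triplets come back into frame, while the first two triplets remain altered.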

(ii) Further evidence supporting a non-overlapping code is provided by the effect of single-site mutations.

A single mutation in an overlapping coding system would invariably affect two or more adjacent amino acids in the polypeptide chain. A mutation of the first G to C in the nucleotide sequence ATGATGATG will change only one codon in the case of a non-overlapping code. The original codon sequence ATG, ATG, ATG becomes ATC, ATG, ATG after the single mutation.

However, if the code were an overlapping one, the original codon sequence ATG, TGA, GAT, ATG, TGA, GAT, ATG would change into ATC, TCA, CAT, ATG, TGA, GAT, ATG. As a result of the single mutation, three changes take place in the codon sequence when an overlapping code is in operation.

Only one change would be expected in the case of a non-overlapping code. Since only single amino acid changes have been observed in experimental studies of single-site mutations, this evidence supports a non-overlapping code.
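The difference between the two reading schemes can be counted directly. This sketch reads ATGATGATG and its point mutant either in steps of three (non-overlapping) or in steps of one (fully overlapping) and counts how many triplets differ; the function names are made up for the illustration.

```python
def codons(seq, step):
    """Read full triplets starting every `step` bases (3 = non-overlapping, 1 = fully overlapping)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]

def changed_codons(before, after, step):
    """Number of triplets that differ between two equal-length sequences."""
    return sum(a != b for a, b in zip(codons(before, step), codons(after, step)))

original = "ATGATGATG"
mutant = "ATCATGATG"  # first G replaced by C

print(codons(mutant, 3))  # ['ATC', 'ATG', 'ATG']
print(codons(mutant, 1))  # ['ATC', 'TCA', 'CAT', 'ATG', 'TGA', 'GAT', 'ATG']
print(changed_codons(original, mutant, 3))  # 1
print(changed_codons(original, mutant, 1))  # 3
```

A single base substitution touches one triplet in the non-overlapping reading and three in the overlapping one, which is why single amino acid replacements argue for the former.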

(iii) Brenner (1957), on the basis of all the published data on amino acid sequences in proteins, concluded that there were no forbidden zones in proteins and that neighbouring amino acids were invariably coded by unrelated groups of nucleotides.

It was further established that no amino acid always has the same nearest neighbours and that amino acid sequences appear to be almost completely random. Such findings would not have been possible had the code been overlapping.

(iv) Yanofsky (1963) provided perhaps the most convincing evidence excluding an overlapping code. In his studies of both mutation and recombination using the transduction technique, he found that in each mutant protein with a different amino acid at a given position, the amino acids on either side remained unchanged.

Code is Degenerate:

Sometimes three or four triplet codons code for a particular amino acid. A genetic code in which more than one triplet (codon) codes for a single amino acid is known as a degenerate code. Out of the 64 possible codons, 61 code for amino acids.

As there are only 20 amino acids, it is obvious that more than one codon or triplet codes for one amino acid. If each amino acid were coded by a single codon, 44 codons out of 64 would be useless or nonsense codons.

Several lines of evidence indicate that the genetic code is degenerate.

(i) If only twenty triplets made sense and the remaining forty-four were nonsense, then mutations along a chromosome could occur only at a limited number of sites, representing about one-third of its length, and not throughout its entire length.

But the rate of spontaneous mutation, as well as the results of mutation induced by X-rays, has shown that nearly the entire chromosome is capable of undergoing mutation. This is possible only if the code is degenerate. However, although the degenerate nature of the code has been established, the presence of a high number of repeated sequences may make major segments of chromosomes non-mutable.

(ii) When the two bases U and C are incorporated into a synthetic RNA in a 3:1 proportion, the possible triplets and their frequencies can be determined mathematically:

UUU = 3/4 x 3/4 x 3/4 = 27/64
UUC = 3/4 x 3/4 x 1/4 = 9/64
UCU = 3/4 x 1/4 x 3/4 = 9/64
CUU = 1/4 x 3/4 x 3/4 = 9/64
UCC = 3/4 x 1/4 x 1/4 = 3/64
CUC = 1/4 x 3/4 x 1/4 = 3/64
CCU = 1/4 x 1/4 x 3/4 = 3/64
CCC = 1/4 x 1/4 x 1/4 = 1/64

If each codon specified a different amino acid, an mRNA of this composition should guide the incorporation of eight amino acids, but in fact only four amino acids were detected in the protein chain. This indicates the degenerate nature of the code, i.e., some of the codons directed the incorporation of the same amino acid.
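The triplet frequencies for a random U:C copolymer follow from multiplying the per-base probabilities. The sketch below recomputes them with exact fractions and then groups the eight triplets by the amino acid each specifies in the standard code (Phe, Ser, Leu, Pro), an assignment taken from the standard codon table rather than from this passage.

```python
from fractions import Fraction
from itertools import product

# Base composition of the synthetic RNA: U and C in a 3:1 ratio.
BASE_FREQ = {"U": Fraction(3, 4), "C": Fraction(1, 4)}

# Amino acid specified by each U/C triplet in the standard genetic code.
AMINO_ACID = {
    "UUU": "Phe", "UUC": "Phe", "UCU": "Ser", "UCC": "Ser",
    "CUU": "Leu", "CUC": "Leu", "CCU": "Pro", "CCC": "Pro",
}

# Probability of each triplet, assuming bases are incorporated independently.
triplet_freq = {}
for bases in product("UC", repeat=3):
    p = Fraction(1)
    for base in bases:
        p *= BASE_FREQ[base]
    triplet_freq["".join(bases)] = p

# Pool the triplet frequencies by amino acid.
aa_freq = {}
for codon, p in triplet_freq.items():
    aa = AMINO_ACID[codon]
    aa_freq[aa] = aa_freq.get(aa, Fraction(0)) + p

print(triplet_freq["UUU"], triplet_freq["CCC"])  # 27/64 1/64
print(aa_freq)  # Phe 9/16, Ser 3/16, Leu 3/16, Pro 1/16
```

Eight triplets collapse onto four amino acids, which matches the observation that only four amino acids were found in the product.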

(iii) According to the wobble hypothesis of Crick (1966), the first two bases of the triplet codon pair according to the set rules, i.e., A with U and G with C, but the third base, having much more freedom of movement than the other two, wobbles and permits more than one type of pairing at that position. Thus the wobble hypothesis explains the degeneracy of the code to some extent.

It is sometimes argued that the third base of a codon is not very important and that the specificity of a codon is determined mainly by the first two bases. It has been shown that the same tRNA can recognise more than one codon differing only at the third position. This pairing is not very stable and is allowed because of wobbling in base pairing at this third position.

Crick proposed the wobble hypothesis to explain this phenomenon. He proposed that if U is present at the first position of the anticodon, it can pair with either A or G at the third position of the codon. Similarly, G at the first position of the anticodon can pair with either C or U of the codon (Table 15.1 A).

The wobble hypothesis visualizes that many codons are able to tolerate mutations at the third base site because of the less restrictive spatial constraints on the corresponding base in the anticodon. A substitution of the third nucleotide in many codons is better tolerated and can occur without damage.

The corresponding base in the anticodon would wobble and accommodate. This kind of wobbling allows economy of the number of tRNA molecules since several codons meant for same amino acid are recognized by same tRNA.
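The wobble rules can be written as a small lookup that lists which codons a given anticodon reads. This is a sketch under stated assumptions: the U and G rules are the ones described above, while the C, A and I (inosine) entries are added from Crick's standard wobble rules and do not appear in this text. Anticodons and codons are both written 5' to 3', and the example anticodons are chosen only for illustration.

```python
# Watson-Crick partners used at codon positions 1 and 2 (RNA).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# First (5') anticodon base -> bases it may pair with at codon position 3.
# U and G are from the text above; C, A and I are assumed from the standard wobble rules.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read(anticodon):
    """Codons (5'->3') recognised by an anticodon written 5'->3'."""
    wobble_base, middle, last = anticodon
    first = COMPLEMENT[last]     # pairs with the 3' anticodon base
    second = COMPLEMENT[middle]  # pairs with the middle anticodon base
    return sorted(first + second + third for third in WOBBLE[wobble_base])

print(codons_read("GAA"))  # ['UUC', 'UUU']
print(codons_read("UCA"))  # ['UGA', 'UGG']
print(codons_read("CCA"))  # ['UGG']
```

Because pairing is antiparallel, the wobble position is the first base of the anticodon and the third base of the codon, which is why one tRNA can serve several codons of the same family.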

Code is Comma-less:

A comma-less code means that no punctuation marks are needed between two words. In other words, after one amino acid is coded, the next amino acid is automatically coded by the next three letters, and no letters are wasted (Fig. 15.4).

However, the code for an entire polypeptide having several amino acids is always terminated by a nonsense codon, which serves as a full stop in the coding terminology.

If the genetic code functioned with commas, a specific nucleotide would serve as a punctuation mark. Experiments have established that poly-A (AAA) codes for lysine, poly-C (CCC) for proline, and poly-U (UUU) for phenylalanine, which implies that commas cannot be made up of A, C or U.

Code is Non-Ambiguous:

Ambiguity would mean that a single codon may code for more than one amino acid. Non-ambiguous means that there is no ambiguity about a particular codon: a particular codon will always code for the same amino acid.

That the genetic code is generally non-ambiguous can be confirmed experimentally using a specific single triplet-ribosome complex, which directs the binding of a specific tRNA. For example, the UUU triplet-ribosome complex directs the binding of phenylalanine-tRNA, and the AAA triplet-ribosome complex directs the binding of lysine-tRNA.

In the similar manner, by using the triplets of known sequence, the codons for valine, cysteine, leucine and some other amino acids were determined, thus clearly establishing the non-ambiguous nature of the genetic code under natural physiological conditions.

Code is Universal:

The genetic code is universal. This means that the same codon codes for the same amino acid in all organisms, from human beings to viruses.

The universal nature of the genetic code is supported by experimental evidence.

(i) The crucial point in the genetic code is the fitting of tRNA with specific anticodon into the codon of the mRNA.

Thus, if mRNA is taken from a eukaryote and tRNA from a prokaryote, and protein synthesis can still be carried out as coded in the mRNA, the code is shown to be universal. If mRNA and ribosomes are taken from E. coli, and amino acids and tRNA from rat, protein synthesis can be carried out as coded in the mRNA of E. coli. This is also true the other way round.

Von Ehrenstein and Lipmann found that E. coli tRNA charged with labelled amino acids would form haemoglobin when incubated with the mRNA and ribosomes of rabbit reticulocytes.

The precision with which this interspecific attachment occurs was shown by converting cysteine into alanine on amino acid-charged tRNACys and then observing that this alanine was inserted into peptide positions ordinarily occupied by cysteine. In other words, the anticodon of the cysteine-tRNA of a bacterial species recognized the cysteine codon of mammalian mRNA in spite of the fact that the tRNA was carrying alanine.

(ii) The tRNAs from E. coli, Xenopus laevis and guinea pig bind to the same trinucleotides, as shown by Nirenberg et al., which indicates the universality of the code.

(iii) Studies by Merril and co-workers (1971) revealed that the bacterial enzyme α-D-galactose-1-phosphate uridyl transferase, which catalyses a step in the metabolism of galactose, is produced in human tissue culture cells, previously unable to make it, after infection by a virus carrying the E. coli gal+ gene. This provides strong evidence in favour of the universality of the code.

(iv) The correlated nucleotide and amino acid sequences in the overlapping genes of the DNA bacteriophage φX174 and in the capsid protein gene of the RNA bacteriophage MS2 indicate that the genetic code is universal.

(v) The uniformity in amino acid sequence of homologous proteins, e.g., cytochrome c from widely divergent species such as human, horse, chicken, yeast and bacteria, points to the universality of the genetic code.

(vi) Finally, genes from human and other organisms have been expressed in E. coli, and those from bacteria and other organisms in plants. In each such case, the polypeptide produced by a gene in the new organism was identical with the one it produced in the organism of its origin.

Exceptions of Genetic Code:

A triplet codon demands its own tRNA with a complementary anticodon or a single tRNA responds to both members of a codon pair or to all (or at least some) of the four members of a codon family. Often one tRNA can recognise more than one codon, i.e., codon is degenerate.

This means that the base in the first position of the anticodon must be able to partner alternative bases in the corresponding third position of the codon. In such cases there may be differences in the efficiencies of the alternative recognition reactions (as a general rule, codons that are com­monly used tend to be more efficiently read).

In addition to the constructions of a set of tRNAs able to recognise all the codons, there may be multiple tRNAs that respond to the same codon. The predictions of wobble pairing accord very well with the observed abilities of almost all tRNAs. But there are exceptions in which the codons recognized by a tRNA differ from those predicted by the wobble rules.

Such effects pro­bably result from the influence of neighbouring bases and/or the conformation of the anticodon loop in the overall tertiary structure of the tRNA. Indeed, the importance of the structure of anti­codon loop is inherent in the idea of the wobble hypothesis itself.

Further support for the influ­ence of the surrounding structure is provided by the isolation of occasional mutants in which a change in a base in some other region of the molecule alters the ability of the anticodon to recognize codons.

Another unexpected pairing reaction is pre­sented by the ability of the bacterial initiator, fMet-tRNA ƒmet to recognize both AUG and GUG. This misbehavior involves the third base of the anticodon. Though the genetic code is non-ambiguous, but GUG codes for methionine when used as initiator codon, but it codes for valine if present at the intercalary position, indi­cating its ambiguous nature.

The universality of the genetic code is stri­king, but some exceptions exist. They tend to affect the codons involved in initiation or termi­nation and result from the production (or absence) of tRNAs representing certain codons. Almost all of the changes found in principal genomes affect termination codons.

In the prokaryote Mycoplasma capricolum, UGA is not used for termination but instead codes for tryptophan. In fact, it is the predominant Trp codon, and UGG is used only rarely. Two Trp-tRNA species exist, with the anticodons UCA (reads UGA and UGG) and CCA (reads only UGG).

Some ciliates (unicellular protozoa) read UAA and UAG as glutamine instead of termination signals. Tetrahymena thermophila, one of the ciliates, contains three tRNA(Gln) species. One recognises the usual codons CAA and CAG for glutamine, one recognises both UAA and UAG (according to the wobble hypothesis), and the last recognizes only UAG.

We assume that the release factor eRF has a restricted specificity, compared with that of other eukaryotes.

In another ciliate (Euplotes octacarinatus), UGA codes for cysteine. Only UAA is used as a termination codon, and UAG is not found. The change in meaning of UGA might be accomplished by a modification in the anticodon of tRNA(Cys) that allows it to read UGA along with the usual codons UGU and UGC.

The only substitution in coding for amino acids occurs in a yeast (Candida), where CUG means serine instead of leucine (and UAG is used as a sense codon).

All of these changes are sporadic, which is to say that they appear to have occurred independently in specific lines of evolution. They may be concentrated on termination codons, because these changes do not involve substitution of one amino acid for another. Thus the divergent uses of the termination codons could represent their ‘capture’ for normal coding purposes.

Exceptions to the universal genetic code also occur in the mitochondria from several species.

The earliest change was the employment of the universal stop codon UGA to code for tryptophan, which is common to all non-plant mitochondria. It is unlikely that UGA originally coded for tryptophan in the universal code and was later changed to termination in cytoplasmic translation, because it is a stop codon in bacteria, plant mitochondria and nuclear genomes.

Departures from the universal code, all in non-plant mitochondria, are CUN (leucine) for threonine (in yeasts), AAA (lysine) for asparagine (in Platyhelminthes and echinoderms), UAA (stop) for tyrosine (in Planaria), and AGR (arginine) for serine (in several animal orders) and for stop (in vertebrates), where N = A, U, G or C and R = A or G (Table 15.1B).

The mitochondria of plants and protozoans differ in importing and utilizing tRNAs encoded by the nuclear as well as the mitochondrial genome, whereas in animal mitochondria, all the tRNAs are encoded by the organelle.

The small number of tRNAs encoded by the mitochondrial genome highlights an important feature of the mitochondrial genetic system — the use of a slightly different genetic code, which is distinct from the universal code used by both prokaryotic and eukaryotic cells.

Some of these changes make the code simpler by replacing two codons that had different meanings with a pair that has a single meaning. Pairs treated like this include UGG and UGA (both Trp instead of one Trp and one termination) and AUG and AUA (both Met instead of one Met and the other Ile).
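The reassignments described above can be pictured as a small set of overrides applied to the standard table. The following Python sketch is only an illustration: the names `STANDARD`, `VERT_MITO_OVERRIDES`, `variant_code` and `translate`, and the test message, are hypothetical, and the override set contains only the vertebrate mitochondrial changes mentioned in the text (UGA to Trp, AUA to Met, AGR to stop).

```python
from itertools import product

# Standard code in compact form: codons enumerated in U, C, A, G order
# (first base, then second, then third); "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: only the vertebrate mitochondrial changes named in the text.
VERT_MITO_OVERRIDES = {"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def variant_code(overrides):
    """Return a copy of the standard table with some codons reassigned."""
    table = dict(STANDARD)
    table.update(overrides)
    return table


def translate(mrna, table):
    """Translate from position 0 until a stop codon or the end of the message."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = table[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mito = variant_code(VERT_MITO_OVERRIDES)
message = "AUGAUAUGAUGGAGAUUU"  # hypothetical message: AUG AUA UGA UGG AGA UUU
print(translate(message, STANDARD))
print(translate(message, mito))
```

Under the standard table this message stops at UGA, whereas under the variant table UGA and AUA are read through as Trp and Met and translation stops at AGA instead.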

The changes are typically preceded by loss of a codon from all coding sequences in an organism or organelle, often as a result of directional mutation pressure, accompanied by loss of the tRNA that translates the codon.

The codon reappears later by conversion of another codon and emergence of a tRNA that translates the reappeared codon with a different assignment. Changes in release factors also contribute to this revised assignment. Thus the genetic code, formerly thought to be frozen, is now known to be in a state of evolution.

Deciphering the Genetic Code:

It was not possible to say which of the 64 possible codons codes for which of the 20 amino acids until M.W. Nirenberg provided the first clue by using an in vitro system to synthesize a polypeptide from an artificially synthesized mRNA molecule.

In 1961 Nirenberg and Matthaei characterized the first specific coding sequences, which helped in the analysis of the genetic code.

Their success in deciphering the code depended on two experimental systems:

(i) In vitro (cell free) protein synthesizing system,

(ii) An enzyme, polynucleotide phosphorylase which allowed the synthesis of synthetic mRNAs. These mRNAs served as templates for polypeptide synthesis in the cell free system.

The enzyme polynucleotide phosphorylase functions metabolically in bacteria to degrade RNA, but with high concentrations of ribonucleoside diphosphates, the reaction can be ‘forced’ in the opposite direction to synthesize RNA.

Unlike RNA polymerase, it does not require a DNA template; each addition of a ribonucleotide is random, based on the relative concentrations of the four ribonucleoside diphosphates added to the reaction mixture. The probability of insertion of a specific ribonucleotide is proportional to the availability of that molecule relative to the other available ribonucleotides.

The cell-free system for protein synthesis and the availability of synthetic mRNAs provided a means of deciphering the ribonucleotide composition of the various triplets encoding specific amino acids.

Homopolymers Technique (Poly U Experiment):

In their initial experiments, Nirenberg and Matthaei synthesized RNA homopolymers, each consisting of only one type of ribonucleotide, i.e., the mRNA produced in the in vitro system was either UUUUU…, AAAAA…, CCCCC… or GGGGG…. With each such mRNA, it was easy to determine which amino acid was incorporated into the polypeptide chain.

Different amino acids were labelled with ¹⁴C and tested separately by radioactive counting. In the RNA synthesized using only uracil, there was no other base along the length of the mRNA, so the only possible triplet was UUU.

When such a poly-U RNA was used in the synthesis of a polypeptide (using cell extracts from E. coli and supplying all the required components of the protein-synthesizing machinery), only polyphenylalanine was synthesized, meaning that the only amino acid coded was phenylalanine.

It was, therefore, immediately concluded that UUU codes for the amino acid phenylalanine. Subsequently, poly-A gave polylysine and poly-C gave polyproline. Therefore, UUU was assigned to phenylalanine, AAA to lysine and CCC to proline. Poly-G, however, did not serve as a template because it folds back on itself, so another method had to be used for this assignment.

Heteropolymers (Random): Mixed Copolymers Technique:

The study of polynucleotides was further extended to copolymers, synthetic messengers containing two or more bases in definite proportions, in the cell-free system. These randomly synthesized polynucleotides directed the incorporation of amino acids into protein in a manner which indicated that a number of different code words are involved in specifying different amino acids.

In the cell-free system, the amino acids incorporated in response to these synthetic polyribonucleotides could be clearly correlated with the expected frequencies of the different triplets in the synthetic copolymers. Thus this experiment showed a way of deriving the nucleotide composition of the triplets for each of the amino acids.

Nirenberg, Matthaei and Ochoa did their experiments using RNA heteropolymers. In this technique, two or more different ribonucleoside diphosphates were added in combination to form the artificial message. The frequency of a particular triplet codon in the synthetic mRNA depended on the relative proportions of the ribonucleotides added to the cell-free system.

The percentage of incorporation of a particular amino acid into the polypeptide chain could then be used to predict the triplet codons that specify it.

For example, suppose A and C are added to a system in a ratio of 1A:5C. The insertion of a ribonucleotide at any position along the RNA molecule during its synthesis is determined by the ratio of A to C. Therefore, there is a 1/6 chance for an A and a 5/6 chance for a C to occupy each position.

On this basis, we can calculate the frequency of any given triplet appearing in the message. For AAA, the frequency is (1/6)³, or about 0.5%. For AAC, ACA and CAA (the 2A:1C triplets), the frequencies are identical, (1/6)² × (5/6) or 2.3% each, and 6.9% for all three together. In the same way, each 1A:2C triplet (ACC, CAC and CCA) has a frequency of (1/6) × (5/6)² or 11.6%, and 34.7% for all three together, whereas CCC accounts for (5/6)³ or 57.9% of the triplets.
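The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes each position of the copolymer is filled independently in proportion to the input ratio, as the text describes; the function name `triplet_frequencies` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def triplet_frequencies(ratio):
    """Expected frequency of every triplet in a random copolymer.

    `ratio` maps each base to its relative amount, e.g. {"A": 1, "C": 5}.
    Each position is assumed to be filled independently of its neighbours.
    """
    total = sum(ratio.values())
    p = {base: Fraction(amount, total) for base, amount in ratio.items()}
    return {
        "".join(t): p[t[0]] * p[t[1]] * p[t[2]]
        for t in product(sorted(p), repeat=3)
    }


freqs = triplet_frequencies({"A": 1, "C": 5})

# Group the eight triplets by how many A's they contain (3, 2, 1 or 0).
by_composition = {}
for triplet, f in freqs.items():
    n_a = triplet.count("A")
    by_composition[n_a] = by_composition.get(n_a, 0) + f

for n_a in sorted(by_composition, reverse=True):
    share = float(by_composition[n_a]) * 100
    print(f"{n_a}A:{3 - n_a}C triplets together: {share:.1f}%")
```

Exact fractions are used so that the class totals (1/216, 15/216, 75/216 and 125/216) add up to exactly 1 before being rounded to percentages for display.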

By examining the percentage of any given amino acid incorporated into the protein synthesized under the direction of this message, it is possible to propose a probable base composition. Because proline makes up 69% of the product, it can be deduced that proline is likely to be coded by CCC (57.9%) and also by one triplet of the 1A:2C variety (11.6%), i.e., 57.9 + 11.6 = 69.5%.

Histidine incorporation is 14%, so histidine is probably coded by one triplet of the 1A:2C category and another of the 1C:2A category (11.6 + 2.3 = 13.9%). Threonine shows 12% incorporation, i.e., it is likely to be coded by one triplet of the 1A:2C category. Asparagine and glutamine appear to be coded by one of the 1C:2A triplets each, and lysine appears to be coded by AAA.

Using as many as all four ribonucleotides to construct random heteropolymers of this kind, the composition of the triplet code words corresponding to all 20 amino acids could be determined (Table 15.2).

Heteropolymers (Ordered): Repeating Copolymers Technique:

In the early 1960s H.G. Khorana was able to chemically synthesize long RNA molecules consisting of short sequences repeated many times. The short sequences were di-, tri- or tetranucleotides, which were replicated many times and finally joined enzymatically to form long polynucleotides.

A dinucleotide repeat is translated as two alternating triplets and so incorporates two different amino acids; a trinucleotide repeat is converted into three potential triplets, depending on the point at which initiation occurs; and a tetranucleotide repeat creates four repeating triplets.

When these synthetic mRNAs were added to a cell-free system and the incorporated amino acids were matched against the conclusions drawn from the composition assignments and triplet binding, specific assignments became possible.

When the repeating dinucleotide sequence is UCUCUCUC…, it produces the triplets UCU and CUC — they can incorporate leucine and serine into the polypeptide. When the repeating trinucleotide sequence is UUCUUCUUC…, the possible triplets are of three kinds: UUC, UCU and CUU depending on the initiation point and they can incorporate phenylalanine, serine and leucine.

From the above two results it can be concluded that UCU and CUC encode serine and leucine, and that either UUC or CUU encodes serine or leucine while the other encodes phenylalanine. Further, when the tetranucleotide sequence UUAC is repeated, it produces the triplets UUA, UAC, ACU and CUU.

Here the incorporated amino acids are leucine, threonine and tyrosine. The UUC and UUAC repeats share the triplet CUU, and leucine is the amino acid incorporated in both cases, so it can be concluded that CUU encodes leucine.

From these experiments it can then be determined logically that UCU encodes serine, that the remaining triplet UUC encodes phenylalanine, and that CUC encodes leucine (Table 15.3).
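The triplets produced by each repeating copolymer can be enumerated mechanically. The sketch below is illustrative only (the helper names `frames` and `distinct_triplets` are hypothetical); it reads a repeated unit in all three reading frames and collects the triplets that appear.

```python
def frames(unit, repeats=12):
    """Split a repeating copolymer into triplets in each of the 3 reading frames."""
    message = unit * repeats
    return [
        [message[i:i + 3] for i in range(start, len(message) - 2, 3)]
        for start in range(3)
    ]


def distinct_triplets(unit):
    """All triplets that can be read from the repeating copolymer."""
    return {codon for frame in frames(unit) for codon in frame}


for unit in ("UC", "UUC", "UUAC"):
    print(unit, sorted(distinct_triplets(unit)))

# The triplet shared by the UUC and UUAC repeats is the one assigned to leucine.
print(distinct_triplets("UUC") & distinct_triplets("UUAC"))
```

Note that a trinucleotide repeat gives a single triplet per frame (so each frame yields a homopolypeptide), while di- and tetranucleotide repeats give the same alternating set of triplets whichever frame is used.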

In this way, by logical interpretation, Khorana reaffirmed triplets that had already been deciphered and filled in gaps left by other approaches (Table 15.4).

Triplet Binding Technique:

Nirenberg and Leder found in 1964 that if a synthetic trinucleotide of known sequence is mixed with ribosomes and a particular aminoacyl-tRNA, these form a complex, provided that the codon used codes for the amino acid attached to the given aminoacyl-tRNA.

In order to work out the code for all 20 amino acids, all the possible 64 triplets had to be tried in cell free culture.

In the experiment, 20 samples of the mixture of all 20 amino acids were taken and in each sample, one amino acid was made radioactive in such a manner that each and every amino acid is radioactive in one sample or the other, and no two samples have same radioactive amino acid. For instance, in one set valine has been labelled and the rest 19 remained unlabelled.

Similarly, in another set lysine was labelled and the remaining 19 were unlabelled. The tRNAs and ribosomes were then mixed with each of these samples, and the same codon was used for all sets. When the mixture is poured onto a nitrocellulose membrane, radioactivity is retained on the membrane only when the radioactive amino acid takes part in the formation of the complex.

Since in each sample the radioactive amino acid is known, it would be possible to detect the amino acid coded by a given codon by the presence of radioactivity on the membrane. Such a treatment was given to all 64 synthetic codons, and their respective amino acids were identified.

Codon Dictionary:

The base sequence of an mRNA and the resulting amino acid sequence of the protein reveal the code for each amino acid. All 64 codons, along with their amino acids, are represented in Table 15.5.

An examination of the code table reveals the following characteristics:

i. Each codon consists of three nucleotides, i.e., the code is triplet. 61 codons represent the 20 amino acids. Three (UAA, UAG, UGA) represent punctuation marks for termination of protein synthesis.

ii. Almost all amino acids are coded by more than one codon, except methionine and tryptophan, which have only one codon each. Phenylalanine, tyrosine, histidine, glutamine, asparagine, lysine, aspartic acid, glutamic acid and cysteine are the nine amino acids represented by two codons each. Three amino acids, i.e., arginine, serine and leucine, have six codons each. The table thus shows the degeneracy of the genetic code.

iii. If an amino acid has more than one codon, the first two nucleotides are usually identical and the third nucleotide can be either cytosine or uracil. Adenine and guanine are similarly interchangeable at the third position. For example, UUU and UUC both code for phenylalanine, and UCU, UCC, UCA and UCG code for serine.

However, there are some exceptions to the equivalence rule of the first two nucleotides, as AGU and AGC also code for serine apart from UCU, UCC, UCA and UCG.

Similarly, the amino acid leucine is also coded by six codons, i.e., UUA, UUG, CUU, CUC, CUA and CUG.

The frequent interchange of cytosine and uracil or guanine and adenine suggests that great variations can occur in AT/GC ratio in certain organisms without affecting large changes in the relative proportions of amino acids present in them, as for almost every amino acid there is one codon that carries G or C and another that carries A or U as its third nucleotide.

The two organisms carrying the same protein sequence information in their DNA, by selecting one or the other kind of synonym codon, can show different AT/GC ratios.

iv. The genetic code has a definite structure in the sense that the synonyms for the same amino acid are not randomly dispersed over the table but are usually found together. The only exceptions are the codons, six each for arginine, serine, and leucine, which are spread over the table.

v. Multiple codons for an amino acid show in general the similarity in first two nucleotides and it is the third nucleotide which varies.

AUG is the initiation codon, i.e., the polypeptide chain starts with methionine. This starting amino acid is the formylated form of methionine. The initiation codon binds fMet-tRNA, whose anticodon 3′ UAC 5′ is identical to that of Met-tRNA, i.e., both Met-tRNA and fMet-tRNA respond to AUG, but the signal for the starting amino acid is much more complex than the signal for all other amino acids.

According to Stent, there exist two separable species of tRNA capable of accepting methionine. The methionine carried by only one of these is converted into formylmethionine by the action of a special formylation enzyme. The other, ordinary Met-tRNA incorporates methionine into the interior of the growing polypeptide chain and responds to the codon AUG only.

Formyl-Met-tRNA initiates the polypeptide chain and also responds to GUG (a valine codon). At the initiation point GUG codes for methionine, whereas at an intercalary position it codes for valine. The anticodon of this species of tRNA seems to be permissive with respect to the first nucleotide base of the codon and selective with respect to the second and third nucleotide bases.

UAA, UAG and UGA are the chain termination codons. They do not code for any amino acid but serve as stop codons. These codons have no corresponding tRNA but are read by specific proteins called release factors. They are also called nonsense codons.

A mutation from a sense to nonsense codon in the middle of a genetic message results in the release of immature or incomplete polypeptides which do not have any biological activity. Nonsense mutations can be induced by mutagens. UAG was formerly known as amber, UAA as ochre and UGA as opal.

How many possible codons? - Biology

Given the different numbers of “letters” in the mRNA and protein “alphabets,” scientists theorized that combinations of nucleotides corresponded to single amino acids. Nucleotide doublets would not be sufficient to specify every amino acid because there are only 16 possible two-nucleotide combinations (4²). In contrast, there are 64 possible nucleotide triplets (4³), which is far more than the number of amino acids. Scientists theorized that amino acids were encoded by nucleotide triplets and that the genetic code was degenerate. In other words, a given amino acid could be encoded by more than one nucleotide triplet. This was later confirmed experimentally: Francis Crick and Sydney Brenner used the chemical mutagen proflavin to insert one, two, or three nucleotides into the gene of a virus. When one or two nucleotides were inserted, protein synthesis was completely abolished. When three nucleotides were inserted, the protein was synthesized and functional. This demonstrated that three nucleotides specify each amino acid. These nucleotide triplets are called codons. The insertion of one or two nucleotides completely changed the triplet reading frame, thereby altering the message for every subsequent amino acid (Figure 1). Though insertion of three nucleotides caused an extra amino acid to be inserted during translation, the integrity of the rest of the protein was maintained.
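The counting argument and the effect of insertions on the reading frame can be illustrated with a short Python sketch. The message `original`, the insertion position and the helper names `codons` and `insert` are hypothetical choices made for illustration; the example works on an mRNA-style string rather than on the viral gene used in the actual experiment.

```python
from itertools import product

BASES = "ACGU"
print(len(list(product(BASES, repeat=2))))  # doublets: 4 * 4
print(len(list(product(BASES, repeat=3))))  # triplets: 4 * 4 * 4


def codons(seq):
    """Read a sequence as consecutive triplets, dropping any incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def insert(seq, pos, bases):
    """Return seq with `bases` inserted before index `pos`."""
    return seq[:pos] + bases + seq[pos:]


original = "AUGGCUGAAUUCCGAUAA"  # hypothetical message
print(codons(original))
for extra in ("C", "CC", "CCC"):
    print(len(extra), codons(insert(original, 6, extra)))
```

With one or two inserted bases every triplet downstream of the insertion changes, whereas three inserted bases add one extra triplet and leave the downstream triplets as they were.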

Figure 1. The deletion of two nucleotides shifts the reading frame of an mRNA and changes the entire protein message, creating a nonfunctional protein or terminating protein synthesis altogether.

Scientists painstakingly solved the genetic code by translating synthetic mRNAs in vitro and sequencing the proteins they specified (Figure 2).

Figure 2. This figure shows the genetic code for translating each nucleotide triplet in mRNA into an amino acid or a termination signal in a nascent protein. (credit: modification of work by NIH)

In addition to instructing the addition of a specific amino acid to a polypeptide chain, three of the 64 codons terminate protein synthesis and release the polypeptide from the translation machinery. These triplets are called nonsense codons, or stop codons. Another codon, AUG, also has a special function. In addition to specifying the amino acid methionine, it also serves as the start codon to initiate translation. The reading frame for translation is set by the AUG start codon near the 5′ end of the mRNA.

The genetic code is universal. With a few exceptions, virtually all species use the same genetic code for protein synthesis. Conservation of codons means that a purified mRNA encoding the globin protein in horses could be transferred to a tulip cell, and the tulip would synthesize horse globin. That there is only one genetic code is powerful evidence that all of life on Earth shares a common origin, especially considering that there are about 10⁸⁴ possible combinations of 20 amino acids and 64 triplet codons.


Degeneracy is believed to be a cellular mechanism to reduce the negative impact of random mutations. Codons that specify the same amino acid typically only differ by one nucleotide. In addition, amino acids with chemically similar side chains are encoded by similar codons. This nuance of the genetic code ensures that a single-nucleotide substitution mutation might either specify the same amino acid but have no effect or specify a similar amino acid, preventing the protein from being rendered completely nonfunctional.

tRNA Structure and Function

Transfer RNAs are coded by a number of genes, and are usually short molecules, between 70-90 nucleotides (5 nm) in length. The two most important parts of a tRNA are its anticodon and the terminal 3’ hydroxyl group, which can form an ester linkage with an amino acid. However, there are other aspects to a tRNA’s structure such as the D-arm and T-arm, which contribute to its high level of specificity and efficiency. Only 1 in 10,000 amino acids are incorrectly attached to a tRNA, which is a remarkable number given the chemical similarities between many amino acids.

Transfer RNAs have a sugar-phosphate backbone like all other cellular nucleic acids and the orientation of the ribose sugar gives rise to directionality in the molecule. One end of the RNA has a reactive phosphate group attached to the fifth carbon atom of ribose while the other end has a free hydroxyl group on the third carbon atom. This gives rise to the 5’ and 3’ ends of the RNA since all the other phosphate and hydroxyl groups are involved in phosphodiester bonds within the nucleic acid.

The last three bases on the 3’ end of tRNA are always CCA – two cytosines followed by one adenine base. This stretch is part of the acceptor arm of the molecule, where an amino acid is covalently attached to the hydroxyl group on the ribose sugar of the terminal adenine nucleotide. The acceptor arm also contains parts of the 5’ end of the tRNA, with a stretch of 7-9 nucleotides from opposite ends of the molecule base pairing with each other.

The other structure that influences the role of tRNA in translation is the T-arm. Similar to the D-arm, it contains a stretch of nucleotides that base pair with each other and a loop that is single stranded. The paired region is called the ‘stem’ and mostly contains 5 base pairs. The loop contains modified bases and is also called the TΨC arm, to specify the presence of thymidine, pseudouridine and cytidine residues (modified bases). tRNA molecules are unusual in containing a high number of modified bases as well as containing thymidine, usually seen only in DNA. The T-arm is involved in the interaction of tRNA with the ribosome.

Finally, a variable arm containing fewer than 20 nucleotides is situated between the anticodon loop and the T-arm. It plays a role in recognition of the tRNA by aminoacyl-tRNA synthetases (AATS), but can be absent in some species.

The secondary structure of tRNA containing the acceptor region, D- and T-arms and the anticodon loop is said to resemble a cloverleaf. After the RNA folds into its tertiary structure, it is L-shaped, with the acceptor stem and T-arm forming an extended helix and the anticodon loop and D-arm similarly making another extended helix. These two helices align perpendicularly to each other in a way that brings the D-arm and T-arm into close proximity while the anticodon loop and the acceptor arm are positioned on opposite ends of the molecule.

In this image, the 3’ CCA region is in yellow, the acceptor arm is in purple, the variable loop in orange, the D-arm is in red, the T-arm in green and the anticodon loop is in blue.

Genetic Code and Amino Acid Translation

Table 1 shows the genetic code of the messenger ribonucleic acid (mRNA), i.e. it shows all 64 possible combinations of codons composed of three nucleotide bases (tri-nucleotide units) that specify amino acids during protein assembling.

Each codon of the deoxyribonucleic acid (DNA) codes for or specifies a single amino acid and each nucleotide unit consists of a phosphate, deoxyribose sugar and one of the 4 nitrogenous nucleotide bases, adenine (A), guanine (G), cytosine (C) and thymine (T). The bases are paired and joined together by hydrogen bonds in the double helix of the DNA. mRNA corresponds to DNA (i.e. the sequence of nucleotides is the same in both chains) except that in RNA, thymine (T) is replaced by uracil (U), and the deoxyribose is substituted by ribose.

The process of translation of genetic information into the assembling of a protein requires first mRNA, which is read 5' to 3' (exactly as DNA), and then transfer ribonucleic acid (tRNA), which is read 3' to 5'. tRNA is the taxi that translates the information on the ribosome into an amino acid chain or polypeptide.

For mRNA there are 4³ = 64 different nucleotide combinations possible with a triplet codon of three nucleotides. All 64 possible combinations are shown in Table 1. However, not all 64 codons of the genetic code specify a single amino acid during translation. The reason is that in humans only 20 amino acids (not counting selenocysteine) are involved in translation. Therefore, one amino acid can be encoded by more than one mRNA codon triplet. Arginine, leucine and serine are encoded by 6 triplets each, isoleucine by 3, methionine and tryptophan by 1 each, and all other amino acids by 4 or 2 codons. The redundant codons typically differ at the 3rd base. Table 2 shows the inverse codon assignment, i.e. which codons specify each of the 20 standard amino acids involved in translation.
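The degeneracy figures quoted above can be recomputed from the standard code itself. The following Python sketch is an illustration under one stated convention: the table is written as a 64-letter string of one-letter amino acid codes with codons enumerated in U, C, A, G order, and "*" marks a stop codon. The names `CODE`, `codons_per_residue` and `families` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons assigned to each amino acid; stops are counted separately.
codons_per_residue = Counter(CODE.values())
n_stop = codons_per_residue.pop("*")

# Group amino acids by how many codons they have.
families = defaultdict(set)
for residue, n in codons_per_residue.items():
    families[n].add(residue)

for n in sorted(families, reverse=True):
    print(f"{n} codon(s): {' '.join(sorted(families[n]))}")
print("stop codons:", n_stop)
```

The one-letter codes L, R and S stand for leucine, arginine and serine, I for isoleucine, and M and W for methionine and tryptophan.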

Table 1. Genetic code: mRNA codon -> amino acid

1st base | 2nd base U | 2nd base C | 2nd base A | 2nd base G | 3rd base
U | Phenylalanine | Serine | Tyrosine | Cysteine | U
U | Phenylalanine | Serine | Tyrosine | Cysteine | C
U | Leucine | Serine | Stop | Stop | A
U | Leucine | Serine | Stop | Tryptophan | G
C | Leucine | Proline | Histidine | Arginine | U
C | Leucine | Proline | Histidine | Arginine | C
C | Leucine | Proline | Glutamine | Arginine | A
C | Leucine | Proline | Glutamine | Arginine | G
A | Isoleucine | Threonine | Asparagine | Serine | U
A | Isoleucine | Threonine | Asparagine | Serine | C
A | Isoleucine | Threonine | Lysine | Arginine | A
A | Methionine (Start)¹ | Threonine | Lysine | Arginine | G
G | Valine | Alanine | Aspartate | Glycine | U
G | Valine | Alanine | Aspartate | Glycine | C
G | Valine | Alanine | Glutamate | Glycine | A
G | Valine | Alanine | Glutamate | Glycine | G

Table 2. Reverse codon table: amino acid -> mRNA codon

Amino acid mRNA codons Amino acid mRNA codons

The direction of reading mRNA is 5' to 3'. tRNA (reading 3' to 5') has anticodons complementary to the codons in mRNA and can be "charged" covalently with amino acids at their 3' terminal. According to Crick, strict base-pairing between the mRNA codon and the tRNA anticodon takes place only at the 1st and 2nd base. The binding at the 3rd base (i.e. at the 5' end of the tRNA anticodon) is weaker and can result in different pairs. For codon and anticodon to bind, the bases must wobble out of their positions at the ribosome. Therefore, these base-pairs are sometimes called wobble-pairs.

Table 3 shows the possible wobble-pairs at the 1st, 2nd and 3rd base. The possible pair combinations at the 1st and 2nd base are identical. At the 3rd base (i.e. at the 3' end of mRNA and 5' end of tRNA) the possible pair combinations are ambiguous, which leads to the redundancy in mRNA. The deamination (removal of the amino group NH2) of adenosine (not to be confused with adenine) produces the nucleoside inosine (I) on tRNA, which generates non-standard wobble-pairs with U, C or A (but not with G) on mRNA. Inosine may occur at the 3rd base of tRNA.

Table 3. Base-pairs: mRNA codon -> tRNA anticodon

Table 3 is read in the following way: for the 1st and 2nd base-pairs the wobble-pairs provide uniqueness in the way that U on tRNA always emerges from A on mRNA, A on tRNA always emerges from U on mRNA, etc. For the 3rd base-pair the genetic code is redundant in the way that U on tRNA can emerge from A or G on mRNA, G on tRNA can emerge from U or C on mRNA and I on tRNA can emerge from U, C or A on mRNA. Only A and C at the 3rd place on tRNA are unambiguously assigned to U and G at the 3rd place on mRNA, respectively.

Due to this combination structure a tRNA can bind to different mRNA codons where synonymous or redundant mRNA codons differ at the 3rd base (i.e. at the 5' end of tRNA and the 3' end of mRNA). By this logic the minimum number of tRNA anticodons necessary to encode all amino acids reduces to 31 (excluding the 2 STOP codons AUU and ACU, see Table 5). This means that any tRNA anticodon can be encoded by one or more different mRNA codons (Table 4). However, there are more than 31 tRNA anticodons possible for the translation of all 64 mRNA codons. For example, serine has a fourfold degenerate site at the 3rd position (UCU, UCC, UCA, UCG), which can be translated by AGI (for UCU, UCC and UCA) and AGC on tRNA (for UCG) but also by AGG and AGU. This means, in turn, that any mRNA codon can also be translated by one or more tRNA anticodons (see Table 5).
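The pairing rules described above can be written down as two small lookup tables. The sketch below follows this article's convention of writing tRNA anticodons 3' to 5', so that each anticodon base lines up with the codon base it pairs with; the names `WATSON_CRICK`, `WOBBLE` and `codons_read` are hypothetical, and the rules are only those listed in the text (U pairs with A or G, G with U or C, I with U, C or A, A with U, C with G).

```python
# Strict pairing at the 1st and 2nd position: tRNA base -> mRNA base.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Pairing at the 3rd position: tRNA base -> mRNA bases it can read.
WOBBLE = {"U": "AG", "G": "UC", "I": "UCA", "A": "U", "C": "G"}


def codons_read(anticodon_3to5):
    """List the mRNA codons (5' to 3') readable by an anticodon written 3' to 5'."""
    first, second, third = anticodon_3to5
    stem = WATSON_CRICK[first] + WATSON_CRICK[second]
    return [stem + base for base in WOBBLE[third]]


# The serine example from the text: anticodons AGI, AGC, AGG and AGU.
for anticodon in ("AGI", "AGC", "AGG", "AGU"):
    print(anticodon, codons_read(anticodon))
```

Between them, AGI and AGC cover all four UCN codons, as do AGG and AGU, which is why more than one minimal set of anticodons is possible.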

The reason for the occurrence of different wobble-pairs encoding the same amino acid may be a compromise between velocity and safety in protein synthesis. The redundancy of mRNA codons exists to prevent mistakes in transcription caused by mutations or variations at the 3rd position but also at other positions. For example, the first position of the leucine codons (UUA, UUG, CUU, CUC, CUA, CUG) is a twofold degenerate site, while the second position is unambiguous (not redundant). Another example is serine with mRNA codons UCA, UCG, UCC, UCU, AGU, AGC. Of course, serine is also twofold degenerate at the first position and fourfold degenerate at the third position, but it is twofold degenerate at the second position in addition. Table 4 shows the assignment of mRNA codons to any possible tRNA anticodon in eukaryotes for the 20 standard amino acids involved in translation. It is the reverse codon assignment.

Table 4. Reverse amino acid encoding: amino acid -> tRNA anticodon -> mRNA codon

While it is not possible to predict a specific DNA codon from an amino acid, DNA codons can be decoded unambiguously into amino acids. The reason is that there are 61 different DNA (and mRNA) codons specifying only 20 amino acids. Note that there are 3 additional codons for chain termination, i.e. there are 64 DNA (and thus 64 different mRNA) codons, but only 61 of them specify amino acids.

Table 5 shows the genetic code for the translation of all 64 DNA codons, starting from DNA over mRNA and tRNA to amino acid. In the last column, the table shows the different tRNA anticodons minimally necessary to translate all DNA codons into amino acids and sums up the number in the final row. It reveals that the minimum number of tRNA anticodons to translate all DNA codons is 31 (plus 2 STOP codons). The maximum number of tRNA anticodons that can emerge in amino acid transcription is 70 (plus 3 STOP codons).

Table 5. Genetic code: DNA -> mRNA codon -> tRNA anticodon -> amino acid

1 The codon AUG both codes for methionine and serves as an initiation site: the first AUG in an mRNA's coding region is where translation into protein begins.

Stop Codon Mutations

Stop codon mutations can easily occur, especially when we consider the length of the genome and the thousands of different nucleotide triplets. Both transcription and translation processes are susceptible to a broad range of potential errors that may or may not lead to anatomical and physiological changes. The insertion of the wrong nucleotide in the KRT-9 gene in family members already predisposed to the disease has been found to contribute to the development of a skin disease known as epidermolytic palmoplantar keratoderma.

Which type of mutation creates a stop codon? Radiation, chemicals, pollution, infection, and the aging process are just some of the ways in which DNA can become damaged, and attempts to repair this damage can accidentally insert the wrong nucleotide. This might change a triplet that would normally have coded for an amino acid into a stop codon. When this happens, the result is a nonsense mutation. A nonsense mutation specifically changes an amino acid-producing triplet into a stop codon and leads to premature termination of protein synthesis in the ribosome.

While all kinds of mutations occur during DNA to mRNA transcription, mRNA only copies what is written without ever needing to understand it. While mRNA is not in contact with a ribosome, even multiple mutations will not cause an effect. Effects are only seen when the changed code is translated into a faulty protein. This is why most mutations are labeled as being part of the translation process, where the edited code may or may not produce a different amino acid. Because most amino acids are encoded by more than one nucleotide triplet (up to six), there is a chance that, even in the presence of a mutation, the same protein will be produced. We usually associate genetic mutations with illness; however, they are also responsible for successful evolution. Genetic mutations help organisms to adapt to their environment.

There are various forms of genetic mutation. Deletion mutations leave out certain parts of the genome and so change the order of the nucleotides. A single base or multiple bases may be missed out completely. Insertion mutations add one or more nucleotides and also change the order of the genetic code. Substitution mutations (silent, missense, and nonsense) swap a single nucleotide (not multiple nucleotides) for a different base, and this may or may not substitute a different amino acid in a polypeptide chain. If the same protein is produced, even in the presence of a mutation, it is called a silent mutation. In some cases, an entire section of DNA can move from one chromosome to another; this is called translocation.
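A minimal Python sketch can show why deleting or inserting a single base changes the order of the code: every triplet downstream of the change is read differently. The sequence and the helper names (`codons`, `delete_base`, `insert_base`) are invented for this illustration.

```python
def codons(mrna):
    """Split an mRNA string into complete triplets, dropping any leftover bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

def delete_base(mrna, index):
    """Return the sequence with the base at a 0-based index removed."""
    return mrna[:index] + mrna[index + 1:]

def insert_base(mrna, index, base):
    """Return the sequence with a base inserted before a 0-based index."""
    return mrna[:index] + base + mrna[index:]

original = "AUGGCUUACGGA"  # hypothetical example sequence

print(codons(original))                         # ['AUG', 'GCU', 'UAC', 'GGA']
print(codons(delete_base(original, 4)))         # ['AUG', 'GUU', 'ACG']
print(codons(insert_base(original, 3, "C")))    # ['AUG', 'CGC', 'UUA', 'CGG']
```

In both mutated versions the first triplet is unchanged, but every triplet after the edit differs from the original.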

If a different amino acid is incorporated into the polypeptide chain, which may or may not change the protein's function, the cause is a missense mutation. Where a substitution creates a stop codon by changing a nucleotide triplet that previously matched an amino acid, it is called a nonsense mutation. The image below shows three types of mutation: A is the nonsense mutation, B the insertion mutation, and C and D show deletion mutations.
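The three substitution types can be told apart by comparing what a codon encodes before and after the change. The Python sketch below assumes the standard genetic code and mRNA codons. The function `classify_substitution` and the "stop-loss" label for a stop codon that becomes an amino acid codon are choices made for this example, not terms used in the text.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon, position, new_base):
    """Classify a single-base substitution at a 0-based position in an mRNA codon."""
    if new_base == codon[position]:
        raise ValueError("new_base must differ from the original base")
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"      # same amino acid (or stop) is still encoded
    if after == "*":
        return "nonsense"    # an amino acid codon became a stop codon
    if before == "*":
        return "stop-loss"   # label chosen for this sketch only
    return "missense"        # a different amino acid is encoded

print(classify_substitution("CUA", 2, "G"))  # CUA -> CUG, both leucine: silent
print(classify_substitution("GAG", 1, "U"))  # GAG (Glu) -> GUG (Val): missense
print(classify_substitution("UAC", 2, "A"))  # UAC (Tyr) -> UAA (stop): nonsense
```

Insertions and deletions are not covered here, since they change the length of the sequence and are not single-codon substitutions.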

