What is the last heavy atom of an amino acid?


I'm an electrical engineer learning about bioinformatics. I was reading a paper that uses the last heavy atom to search for the active sites of a protein, but what counts as a heavy atom? Is it every atom except hydrogen? Atoms heavier than carbon? And how do I know which one is the last heavy atom?

The article:

It talks about the last heavy atom under Materials and Methods, 2.1.1 Individual Representation and Population Initialization.

As Roland mentioned in the comments, this term is not common. It was first used by the authors of the paper in question (and in the STING package that Roland also mentions in the comments).

From this link you can find the definition of the Last Heavy Atom which is possibly the most distal non-hydrogen (N, C, O, S) atom in the amino-acid side chain.

$$\begin{array}{|c|c|c|c|} \hline
Ala : C_\beta & Asp : O_{\delta^2} & Trp : C_{\eta^2} & Asn : N_{\delta^2} \\ \hline
Lys : N_\zeta & Glu : O_{\epsilon^2} & Ser : O_\gamma & Gln : N_{\epsilon^2} \\ \hline
Cys : S_\gamma & His : N_{\epsilon^2} & Tyr : O_\eta & Val : C_{\gamma^2} \\ \hline
Gly : C_\alpha & Leu : C_{\delta^2} & Met : C_\epsilon & Ile : C_{\delta^1} \\ \hline
Arg : N_{\eta^2} & Phe : C_\zeta & Pro : C_\delta & Thr : C_{\gamma^2} \\ \hline
\end{array}$$

They have mentioned in the same link that:

If in any of the PDB files (to be analyzed by the BLUE STAR STING components) that specific atom (LHA) is missing in the record, then our algorithms will search for the next closer atom in the side chain that would be considered as the LHA.
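To make the lookup and its fallback concrete, here is a minimal Python sketch. It assumes that "heavy" means any non-hydrogen atom, that atoms are named as in PDB files (Greek letters written as A, B, G, D, E, Z, H), and that the "next closer atom" is the most remote side-chain heavy atom still present. That fallback ordering is my assumption, not the documented STING algorithm, and the function names are hypothetical.

```python
# LHA per residue in PDB atom names, transcribed from the table above.
LHA = {
    "ALA": "CB",  "ASP": "OD2", "TRP": "CH2", "ASN": "ND2",
    "LYS": "NZ",  "GLU": "OE2", "SER": "OG",  "GLN": "NE2",
    "CYS": "SG",  "HIS": "NE2", "TYR": "OH",  "VAL": "CG2",
    "GLY": "CA",  "LEU": "CD2", "MET": "CE",  "ILE": "CD1",
    "ARG": "NH2", "PHE": "CZ",  "PRO": "CD",  "THR": "CG2",
}

BACKBONE = {"N", "C", "O", "OXT"}  # CA is kept because it is glycine's LHA
REMOTENESS = "ABGDEZH"             # alpha, beta, gamma, delta, epsilon, zeta, eta


def remoteness_rank(atom_name):
    """Assumed ordering: remoteness letter first, then branch number."""
    letter = atom_name[1:2]
    digits = atom_name[2:]
    branch = int(digits) if digits.isdigit() else 0
    return (REMOTENESS.find(letter) if letter else -1, branch)


def last_heavy_atom(res_name, atoms):
    """Return the LHA name for one residue.

    atoms maps PDB atom name -> element symbol for the atoms actually
    present in the record. Hydrogens (and deuterium) are ignored.
    """
    heavy = [name for name, element in atoms.items()
             if element.upper() not in ("H", "D")]
    preferred = LHA.get(res_name.upper())
    if preferred in heavy:
        return preferred
    # Fallback (assumption): most remote side-chain heavy atom present.
    candidates = [name for name in heavy if name not in BACKBONE]
    if not candidates:
        return None
    return max(candidates, key=remoteness_rank)
```

For a lysine with a complete side chain this returns NZ. If NZ is missing from the record, the sketch falls back to CE.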

The exact location of some of the abovementioned atoms.

Some of the locations might be confusing, so I am just indicating what they are for a few of the amino acids (especially those with branched and cyclic side chains).
[Images reproduced from here.]


In arginine, the terminal nitrogens $\eta^1$ and $\eta^2$ are symmetric, and the atom that is actually (stereochemically) the most distal, as inferred from the protein structure, is considered the LHA. The second choice is, obviously, the $\eta^1$ nitrogen.

Aspartic acid

The two terminal oxygens ($\delta^1$ and $\delta^2$) are equivalent in the ionized form (and otherwise too, because of resonance). The actual distal atom would be labelled $\delta^2$.

You can extrapolate the same to glutamic acid.


In histidine, $\epsilon^2$ refers to the pyrrolic nitrogen…


In leucine, the last carbons ($\delta^1$ and $\delta^2$) are equivalent, and the choice is based on the actual distance. The same reasoning applies to valine.


In phenylalanine, $\zeta$ refers to the carbon in the para position of the benzene ring. For tyrosine, the LHA is the oxygen of the para-OH on the benzene ring.



For asparagine and glutamine, the LHA is the amide nitrogen of the side chain.

Metalla derivatives of amino acids and peptides. 1. Rhena derivatives of glycine, L-alanine, and glycylglycine. A new N-terminal and protecting group and heavy-atom label


What amino acids make hemoglobin?

The distal histidine residue of the hemoglobin molecule further stabilizes the bound O2 molecule through hydrogen-bonding interactions. Myoglobin is a protein with a structure and function similar to those of hemoglobin. It binds and stores oxygen, but without cooperativity.

Similarly, what type of molecule is hemoglobin? Hemoglobin consists of protein subunits (the "globin" molecules), and these proteins, in turn, are folded chains of a large number of different amino acids called polypeptides. The amino acid sequence of any polypeptide created by a cell is in turn determined by the stretches of DNA called genes.

Likewise, people ask, which amino acid in hemoglobin is involved in coordinating iron in addition to the heme prosthetic group?

The oxygen carried by hemeproteins is bound directly to the ferrous iron atom of the heme prosthetic group. Hydrophobic interactions between the tetrapyrrole ring and hydrophobic amino acid R groups on the interior of the cleft in the protein strongly stabilize the heme protein conjugate.

Types of Amino Acids

There are 21 amino acids that are used to make proteins in horses. These all have a similar chemical structure, but differ in the arrangement of atoms in a part of the molecule referred to as the amino acid side chain.

Amino acids can be broadly divided into three categories:

    Essential: 10 amino acids that must be provided in the diet because they cannot be made in the body (endogenously).

Below we will review the roles, sources, symptoms of deficiency and excess, and requirements for each amino acid. We also evaluate the amino acid profile of various protein sources.



We briefly describe the dataset and method used for training BepiPred-2.0, and the validations we have performed. More details on the materials and methods can be found in the Supplementary Materials.

Structural dataset

A dataset consisting of 649 antigen-antibody crystal structures was obtained from the Protein Data Bank (PDB) (14). In each complex, we identified the antibody molecules using HMM models developed elsewhere, and for each antibody we defined its antigens as all the non-antibody protein chains that have at least one atom within a 4 Å radius of any of its Complementarity Determining Region (CDR) atoms (15). We removed complexes in which the antigen sequence was >70% identical to any other sequence in our dataset, thus obtaining 160 structures. We randomly selected five structures published after 2014 as a final evaluation dataset and used the remaining 155, split into five equally sized partitions for cross-validation, to create our training dataset. The epitope residues were defined as those within a 4 Å radius of any heavy atom of an antibody residue. Also, if multiple identical antigen chains bind to the same antibody, the epitope was defined as the union of the epitope residues on all the chains, thus resulting in a positive dataset of 3542 residues. All 36 785 non-epitope residues were defined as negatives. All the positive and negative residues were used when evaluating the methods’ performance, but for training the negative dataset was downsized by random sampling to the same size as the positive one (see Supplementary Materials for more details).
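The 4 Å epitope definition can be illustrated with a short Python sketch. The input layout (tuples of residue identifier, element and coordinates) and the function name are assumptions made for this example. Whether the cutoff is inclusive, and whether antigen hydrogens are considered, is not stated in the text, so the choices below are guesses.

```python
from math import dist  # Python 3.8+


def epitope_residues(antigen_atoms, antibody_atoms, cutoff=4.0):
    """Return the set of antigen residue ids near the antibody.

    antigen_atoms:  iterable of (residue_id, element, (x, y, z))
    antibody_atoms: iterable of (element, (x, y, z))
    A residue counts as epitope if any of its atoms lies within `cutoff`
    angstroms of any antibody heavy (non-hydrogen) atom.
    """
    antibody_heavy = [xyz for element, xyz in antibody_atoms
                      if element.upper() != "H"]
    epitope = set()
    for residue_id, _element, xyz in antigen_atoms:
        if residue_id in epitope:
            continue  # already classified, skip the distance checks
        if any(dist(xyz, other) <= cutoff for other in antibody_heavy):
            epitope.add(residue_id)
    return epitope
```

This brute-force version compares every atom pair. For whole complexes a spatial index (for example a k-d tree) would be the usual choice, but the definition is the same.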

Training a random forest prediction model

To predict the probability that a given antigen residue is part of an epitope, a Random Forest Regression (RF) algorithm was trained using a 5-fold cross-validation approach. Each residue was encoded using its computed volume (16), hydrophobicity (17), polarity (18), together with the relative surface accessibility (RSA) and secondary structure (SS) as predicted by NetSurfP (19) of all the residues in a window of size 9 centered on the residue itself. Also, the overall volume of the antigen obtained by summing the individual volumes of all the antigen's residues was used, for a total of 46 variables. A rolling average of window 9 was then performed on the RF output to obtain the final BepiPred-2.0 predictions. More details on the parameter optimization can be found in the Supplementary text and Supplementary Figures S1 and S2.
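The final smoothing step can be sketched in a few lines of Python. This assumes a centered window that is simply truncated at the sequence ends. The text does not say how the edges are handled, so that part is a guess, and the function name is hypothetical.

```python
def rolling_average(scores, window=9):
    """Centered rolling mean over per-residue scores.

    Near the ends of the sequence the window is truncated, so the first
    and last residues are averaged over fewer neighbours (an assumption).
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed
```

With `window=9`, each interior residue's score becomes the mean of the raw RF outputs for itself and the four residues on either side.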

Evaluation measurements

We evaluated the performance for each antigen in terms of the area under the receiver operating characteristic (ROC) curve (AUC), the area under the first 10% of the ROC curve, normalized by multiplying by 10 (AUC10%), and the positive predictive rate (PPR) and true positive rate (TPR) of the top 60 predictions (20).
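As an illustration of the two AUC measures, the following Python sketch builds the ROC curve from labels and scores and integrates it with the trapezoidal rule, optionally stopping at a false positive rate of 0.1. It assumes AUC10% means the partial area up to FPR = 0.1 multiplied by 10, as the sentence above suggests. The function names are hypothetical, and both classes must be present in the labels.

```python
def roc_points(labels, scores):
    """ROC curve as (FPR, TPR) points; labels are 1 (epitope) or 0."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Residues with tied scores move the curve in a single step.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Interpolate the segment that crosses the cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def auc10(points):
    """Partial AUC over the first 10% of the curve, scaled to [0, 1]."""
    return 10 * auc(points, max_fpr=0.1)
```

The scaling by 10 means a perfect predictor scores 1.0 on both measures, while a random one scores about 0.5 on AUC and about 0.05 on AUC10%.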

When comparing the performance of two models, a paired t-test was calculated on their performances on individual antigens. A confidence interval of 95% was used to define a significant difference between two compared models.
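For reference, the paired t statistic on per-antigen scores can be computed as below. This is a minimal sketch using only the standard library, and the function name is hypothetical. Turning the statistic into a p-value needs the Student t distribution with n - 1 degrees of freedom, which a library routine such as `scipy.stats.ttest_rel` provides.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired per-antigen scores.

    scores_a[i] and scores_b[i] must be the two models' scores
    (for example AUC values) on the same antigen.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

Pairing matters here because antigens differ widely in difficulty. The test looks only at the per-antigen differences, so that variation cancels out.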

Evaluation on a linear epitopes dataset

A set of known linear peptides that were tested for immune recognition and were found to be epitopes (positive assay results) or non-epitopes (negative assay results) was downloaded from the Immune Epitope Database (IEDB) (21). Peptides shorter than five or longer than 25 amino acids were removed, as B cell epitopes rarely fall outside these boundaries (1). Only peptides confirmed as positives in two or more separate experiments were included in the positive dataset, and only peptides seen as negative in two or more separate experiments and never observed as positives in any experiment were included in the negative dataset. This resulted in 11 834 positive and 18 722 negative peptides. Each peptide was mapped back onto its original protein sequence, and this was used to calculate the output prediction. This dataset is available for download on the BepiPred web page (
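The filtering rules for the linear peptide set can be written down as a small Python sketch. The input format (one `(peptide, is_positive)` record per experiment) and the function name are assumptions for illustration. The real IEDB export has a different layout.

```python
from collections import Counter


def build_peptide_sets(assays, min_len=5, max_len=25, min_support=2):
    """Split assay records into positive and negative peptide sets.

    assays: iterable of (peptide_sequence, is_positive), one record per
    separate experiment.
    """
    positive_counts = Counter()
    negative_counts = Counter()
    for peptide, is_positive in assays:
        if not (min_len <= len(peptide) <= max_len):
            continue  # outside the 5 to 25 residue length range
        if is_positive:
            positive_counts[peptide] += 1
        else:
            negative_counts[peptide] += 1
    # Positives: confirmed in at least `min_support` experiments.
    positives = {p for p, n in positive_counts.items() if n >= min_support}
    # Negatives: enough negative results and never observed as positive.
    negatives = {p for p, n in negative_counts.items()
                 if n >= min_support and p not in positive_counts}
    return positives, negatives
```

Note the asymmetry: a single positive result is enough to disqualify a peptide from the negative set, while the text gives no matching exclusion for positives.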

The evaluation was only performed on the residues within the positive and negative peptides. In this case, an AUC was calculated only on the pooled positive and negative residues and not per antigen sequence.

Polypeptide Chains

The resulting chain of amino acids is called a polypeptide chain. Each polypeptide has a free amino group at one end. This end is called the N terminal, or the amino terminal, and the other end has a free carboxyl group, also known as the C or carboxyl terminal. When reading or reporting the amino acid sequence of a protein or polypeptide, the convention is to use the N-to-C direction. That is, the first amino acid in the sequence is assumed to be the one at the N terminal and the last amino acid is assumed to be the one at the C terminal.

Although the terms polypeptide and protein are sometimes used interchangeably, a polypeptide is technically any polymer of amino acids, whereas the term protein is used for a polypeptide or polypeptides that have folded properly, combined with any additional components needed for proper functioning, and is now functional.





Structure of the atom

An atom consists of

  • a small, dense, positively-charged nucleus surrounded by
  • much lighter, negatively-charged electrons.

The simplest atom, hydrogen, has

  • a single positively-charged proton as its nucleus. Because of its single proton, the atom of hydrogen is assigned an atomic number of 1.
  • a single electron.

The charge of the electron is the same magnitude as that of the proton, so the atom as a whole is electrically neutral. Its proton accounts for almost all the weight of the atom.

The nucleus of the helium atom contains

  • two protons (hence helium has an atomic number of 2) and
  • two neutrons. Neutrons have the same weight as protons but no electrical charge.

The helium atom has two electrons so that, once again, the atom as a whole is neutral.

The structure of each of the other kinds of atoms follows the same plan. From lithium (At. No. = 3) to uranium (At. No. = 92), the atoms of each element can be listed in order of increasing atomic number. There are no gaps in the list. Each element has a unique atomic number, and its atoms have one more proton and one more electron than the atoms of the element that precedes it in the list.


Atomic Number Element Energy Levels or "shells"
1 Hydrogen (H) 1
2 Helium (He) 2
3 Lithium (Li) 2 1
4 Beryllium (Be) 2 2
5 Boron (B) 2 3
6 Carbon (C) 2 4
7 Nitrogen (N) 2 5
8 Oxygen (O) 2 6
9 Fluorine (F) 2 7
10 Neon (Ne) 2 8
11 Sodium (Na) 2 8 1
12 Magnesium (Mg) 2 8 2
13 Aluminum (Al) 2 8 3
14 Silicon (Si) 2 8 4
15 Phosphorus (P) 2 8 5
16 Sulfur (S) 2 8 6
17 Chlorine (Cl) 2 8 7
18 Argon (Ar) 2 8 8
19 Potassium (K) 2 8 8 1
20 Calcium (Ca) 2 8 8 2
21 Scandium (Sc) 2 8 9 2
22 Titanium (Ti) 2 8 10 2
23 Vanadium (V) 2 8 11 2
24 Chromium (Cr) 2 8 13 1
25 Manganese (Mn) 2 8 13 2
26 Iron (Fe) 2 8 14 2
27 Cobalt (Co) 2 8 15 2
28 Nickel (Ni) 2 8 16 2
29 Copper (Cu) 2 8 18 1
30 Zinc (Zn) 2 8 18 2
31 Gallium (Ga) 2 8 18 3
32 Germanium (Ge) 2 8 18 4
33 Arsenic (As) 2 8 18 5
34 Selenium (Se) 2 8 18 6
35 Bromine (Br) 2 8 18 7
36 Krypton (Kr) 2 8 18 8
42 Molybdenum (Mo) 2 8 18 13 1
48 Cadmium (Cd) 2 8 18 18 2
50 Tin (Sn) 2 8 18 18 4
53 Iodine (I) 2 8 18 18 7

Electrons are confined to relatively discrete regions around the nucleus. The two electrons of helium, for example, are confined to a spherical zone surrounding the nucleus called the K shell or K energy level.

Lithium (At. No. = 3) has three electrons, two in the K shell and one located farther from the nucleus in the L shell. Being farther away from the opposite (+) charges of the nucleus, this third electron is held less tightly.

Each of the following elements, in order of increasing atomic number, adds one more electron to the L shell until we reach neon (At. No. = 10) which has eight electrons in the L shell.

Sodium places its eleventh electron in a still higher energy level, the M shell.

From sodium to argon, this shell is gradually filled with electrons until, once again, a maximum of eight is reached.

Note that after the K shell with its maximum of two electrons, the maximum number of electrons in any other outermost shell is eight.
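The filling rule described so far (two electrons in the K shell, then up to eight each in the L and M shells) can be expressed as a short Python sketch. It is deliberately limited to hydrogen through argon, the range covered by the paragraphs above. Heavier elements in the table fill their shells in a less regular way. The function name is hypothetical.

```python
def shell_configuration(atomic_number):
    """Electrons per shell (K, L, M) for elements 1 to 18.

    Follows the simple rule from the text: fill K up to 2, then L up
    to 8, then M up to 8. Not valid beyond argon.
    """
    if not 1 <= atomic_number <= 18:
        raise ValueError("this simple rule only covers H (1) to Ar (18)")
    capacities = [2, 8, 8]  # K, L, M
    shells = []
    remaining = atomic_number
    for capacity in capacities:
        if remaining == 0:
            break
        filled = min(capacity, remaining)
        shells.append(filled)
        remaining -= filled
    return shells
```

For sodium (atomic number 11) this gives `[2, 8, 1]`, matching the corresponding row of the table.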

As we shall see, the chemical properties of each element are strongly influenced by the number of electrons in its outermost energy level (shell).

This table shows the electronic structure of the atoms of elements 1–36, with those that have been demonstrated to be used by living things shown in red. Four elements of still higher atomic numbers that have been shown to be used by living things are also included.

Making Chains

Even though scientists have discovered over 50 amino acids, only 20 are used to make the proteins in your body. Of those twenty, nine are defined as essential. The other eleven can be synthesized by an adult body. Thousands of combinations of those twenty are used to make all of the proteins in your body. Amino acids bond together to make long chains. Those long chains of amino acids are also called proteins.

Essential Amino Acids: Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine.
Nonessential Amino Acids: Alanine, Asparagine, Aspartic Acid, Glutamic Acid.
Conditional Amino Acids: Arginine (essential in children, not in adults), Cysteine, Glutamine, Glycine, Proline, Serine, and Tyrosine.

Nanopores can identify the amino acids in proteins, the first step to sequencing

In this artist's rendering, a portion of a protein moves through an aerolysin nanopore. Credit: Aleksei Aksimentiev

While DNA sequencing is a useful tool for determining what's going on in a cell or a person's body, it only tells part of the story. Protein sequencing could soon give researchers a wider window into a cell's workings. A new study demonstrates that nanopores can be used to identify all 20 amino acids in proteins, a major step toward protein sequencing.

Researchers at the University of Illinois at Urbana-Champaign, Cergy-Pontoise University in France and the University of Freiburg in Germany published the findings in the journal Nature Biotechnology.

"DNA codes for many things that can happen; it tells us what is potentially possible. The actual product that comes out, the proteins that do the work in the cell, you can't tell from the DNA alone," said Illinois physics professor Aleksei Aksimentiev, a co-leader of the study. "Many modifications happen along the way during the process of making protein from DNA. The proteins are spliced, chemically modified, folded, and more."

A DNA molecule is itself a template designed for replication, so making copies for sequencing is relatively easy. For proteins, there is no such natural machinery by which to make copies or to read them. Adding to the difficulty, 20 amino acids make up proteins, as compared with the four bases in DNA, and numerous small modifications can be made to each amino acid during protein production and folding.

"Many amino acids are very similar," Aksimentiev said. "For example, if you look at leucine and isoleucine, they have the same atoms, the same molecular weight, and the only difference is that the atoms are connected in a slightly different order."

Nanopores, small protein channels embedded in a membrane, are a popular tool for DNA sequencing. Previously, scientists thought that the differences in amino acids were too small to register with nanopore technology. The new study shows otherwise.

The researchers used a membrane channel naturally made by bacteria, called aerolysin, as their nanopore. In both computer modeling and experimental work, they chopped up proteins and used a chemical carrier to drive the amino acids into the nanopore. The carrier molecule also kept the amino acids inside the pore long enough for it to register a measurable difference in the electrical signature of each amino acid—even leucine and isoleucine, the near-identical twins.

"This work builds confidence and reassures the nanopore community that protein sequencing is indeed possible," said Abdelghani Oukhaled, a professor of biophysics at Cergy-Pontoise whose team carried out much of the experimental work.

The researchers found they could further differentiate modified forms of amino acids by using a more sensitive measurement apparatus or by treating the protein with a chemical to improve differentiation. The measurements are precise enough to potentially identify hundreds of modifications, Aksimentiev said, and even more may be recognized by tweaking the pore.

"This is a proof-of-concept study showing that we can identify the different amino acids," he said. "The current method for protein characterization is mass spectrometry, but that does not determine the sequence; it compares a sample to what's already in the database. Its ability to characterize new variations or mutations is limited. With nanopores, we finally could look at those modifications which have not yet been studied."

The aerolysin nanopore could be integrated into standard nanopore setups, Aksimentiev said, making it accessible to other scientists. The researchers are now exploring approaches to read the amino acids in sequential order as they are cut from the protein. They also are considering other applications for the system.

"One potential application would be to combine this with immunoassays to fish out proteins of interest and then sequence them. Sequencing them will tell us whether they're modified or not, and that could lead to a clinical diagnostic tool," Aksimentiev said.

"This work shows that there's really no limit to how precisely we can characterize biological molecules," he said. "Very likely, one day we will be able to tell the molecular makeup of the cell—what we are made of, down to the level of individual atoms."

