How is protein cavity centre related to binding?

I am confused and I have the following questions:
1. What are (in the context of article below) protein cavity centres?
2. How are they related to binding?

(Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study, Santos et al. 2012)

As you can find under the table of cavity centers in the PDF version of this article:

The table lists the protein's PDB ID, the ligand considered and the specified cavity center. 22 ligands are similar to hexoses in shape and/or size. The cavity center is the centroid of the reported PDB atom numbers.

And a little later:

The binding-site center is computed as the hexose pyranose ring centroid for the positive examples, and as the ligand or empty pocket centroid for the negative ones. The hexose pyranose-ring atoms are located up to 2.9 Å away from the ring's centroid. Since some atomic interactions can be important up to 7 Å [22], we consider the binding-site as all protein atoms present within a 10 Å radius sphere around the binding center. All other atoms are discarded.
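To make the quoted definition concrete, here is a minimal sketch of the two steps it describes: taking the centroid of a set of atoms, then keeping only the protein atoms within 10 Å of it. The coordinates and the function names `centroid` and `atoms_within` are made up for illustration and are not taken from the article.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def atoms_within(center, atoms, radius=10.0):
    """Keep the atoms whose distance to `center` is at most `radius` (in Å)."""
    return [a for a in atoms if math.dist(center, a) <= radius]

# Hypothetical ring: six atoms on a flat regular hexagon, 1.4 Å from the middle.
ring = [
    (1.4 * math.cos(math.radians(60 * k)), 1.4 * math.sin(math.radians(60 * k)), 0.0)
    for k in range(6)
]
center = centroid(ring)  # the "cavity centre" / binding-site centre

# Hypothetical protein atoms; only those inside the 10 Å sphere are kept.
protein_atoms = [(3.0, 0.0, 0.0), (0.0, 9.5, 0.0), (8.0, 8.0, 0.0), (0.0, 0.0, 12.0)]
binding_site = atoms_within(center, protein_atoms)
print(len(binding_site))
```

So the cavity centre is just a reference point; the "binding site" used in the paper is whatever protein atoms fall inside the sphere drawn around it.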

So basically @aandreev is right. You can read about centroids here

The cryptochromes

Cryptochromes are photoreceptors that regulate entrainment by light of the circadian clock in plants and animals. They also act as integral parts of the central circadian oscillator in animal brains and as receptors controlling photomorphogenesis in response to blue or ultraviolet (UV-A) light in plants. Cryptochromes are probably the evolutionary descendants of DNA photolyases, which are light-activated DNA-repair enzymes, and are classified into three groups: plant cryptochromes, animal cryptochromes, and CRY-DASH proteins. Cryptochromes and photolyases have similar three-dimensional structures, characterized by an α/β domain and a helical domain. The structure also includes a chromophore, flavin adenine dinucleotide (FAD). The FAD-access cavity of the helical domain is the catalytic site of photolyases, and it is predicted also to be important in the mechanism of cryptochromes.


TetR functions as a homodimer. [1] Each monomer consists of ten alpha helices connected by loops and turns. The overall structure of TetR can be broken down into two DNA-binding domains (one per monomer) and a regulatory core, which is responsible for tetracycline recognition and dimerization. TetR dimerizes by making hydrophobic contacts within the regulatory core. There is a binding cavity for tetracycline in the outer helices of the regulatory domain. When tetracycline binds this cavity, it causes a conformational change that affects the DNA-binding domain so that TetR is no longer able to bind DNA. As a result, TetA and TetR are expressed. There is still some debate in the field whether tetracycline derivatives alone can cause this conformational change or whether tetracycline must be in complex with magnesium to bind TetR. [4] (TetR typically binds tetracycline-Mg²⁺ complexes inside bacteria, but TetR binding to tetracycline alone has been observed in vitro.)

The DNA-binding domains of TetR recognize a 15-base-pair palindromic sequence of the TetA operator. [1] [5] These domains mainly consist of a helix-turn-helix (HTH) motif that is common in TetR protein family members (see below). However, the N-terminal residues preceding this motif have also been shown to be important for DNA binding. [6] Although these residues do not directly contact the DNA, they pack against the HTH and this packing is essential for binding. The HTH motifs have mostly hydrophobic interactions with major grooves of the target DNA. [1] Binding of TetR to its target DNA sequence causes changes in both the DNA and TetR. [7] TetR causes widening of the major grooves as well as kinking of the DNA, and one helix of the HTH motif of TetR adopts a 3₁₀ helical turn as the result of complex DNA interactions.

As of June 2005, this family of proteins had about 2,353 members that are transcriptional regulators. [1] (Transcriptional regulators control gene expression.) These proteins contain a helix-turn-helix (HTH) motif that is the DNA-binding domain. The second helix is considered to be most important for DNA sequence specificity and often recognizes nucleic acids within the major groove of the double helix. [7] In the majority of the family members, this motif is on the N-terminal end of the protein and is highly conserved. [1] The high conservation of the HTH motif is not observed for the other domains of the protein. The differences observed in these other regulatory domains are likely due to differences in the molecules that each family member senses.

TetR protein family members are mostly transcriptional repressors, meaning that they prevent the expression of certain genes at the DNA level. These proteins can act on genes with various functions including antibiotic resistance, biosynthesis and metabolism, bacterial pathogenesis, and response to cell stress.

Scientists uncover how the Wntless protein carries Wnts in its signalling pathways

Researchers from Duke-NUS Medical School in Singapore and Columbia University in the U.S. have solved how Wnt proteins, which play a fundamental role in cell proliferation and differentiation, hitch a ride to travel from their cellular factory to the cell surface. Drugs that interfere with Wnt transport, like the made-in-Singapore anti-cancer drug ETC-159, can be used to treat diseases with excess Wnt signaling, such as cancer and fibrosis.

"Since excessive Wnt signaling can drive cancer, suppress immunity and trigger fibrosis, there is great interest in trying to block this transport link," said Professor David Virshup, the director of Duke-NUS' Cancer and Stem Cell Biology Programme and a corresponding author of the study.

Wnts are proteins that send signals from cells to tell tissues and organs what is going on around them. Animals from sponges and jellyfish to humans rely on Wnt signaling to build their body plans. In adult humans, Wnt continues to control functions, including maintaining hair, intestines and tastebuds. Unlike most other cell-to-cell signaling proteins, however, Wnts have a fatty acid attached to them. Because of this attached fatty acid, Wnts require a dedicated transporter protein, called Wntless (WLS).

But how the WLS protein actually carries the fatty acid-modified Wnt signal around cells and between cells has not been understood.

In this work, published last week in Cell, the researchers determined the molecular structure of Wnt as it is being carried by Wntless using cryo-electron microscopy. This method allowed the researchers to study the structures of the two proteins in a near native state without interference from the staining or fixing required by traditional electron microscopy.

"We determined the structure of the short-range signaling molecule Wnt in complex with WLS. The structure explains why these two proteins form such a tight complex, as we observe a very large binding surface between the two proteins," said Filippo Mancia, an associate professor of Physiology and Cellular Biophysics at Columbia University Medical Center and the co-corresponding author of the study.

From left: Study authors Dr Yu Jia, a senior research fellow, and Prof David Virshup, director of Duke-NUS’ Cancer and Stem Cell Biology Programme, analyse the 3D structure of Wnt and Wntless proteins to develop important insights. Credit: Duke-NUS Medical School

"The structure also reveals how the fatty acid attached to Wnt can be shielded in the membrane when bound to WLS and helps to explain why a receptor, such as WLS, is necessary to transport Wnt from inside the cells to the cell membrane," added Rie Nygaard, an associate research scientist in the Mancia Lab, who led the structural biology component of the project.

The structure revealed that WLS has two domains: a transmembrane domain and a second domain that resembles an ancient fatty acid regulator. The fatty acid tail of Wnt is inserted into a conserved cavity in the transmembrane domain of WLS.

The transmembrane domain where the fatty acid tail binds is a promising drug target as it is structurally related to the family of G-protein-coupled receptors (GPCRs), which have been found to be very druggable.

"What's even more encouraging for us is that we already have a candidate drug that blocks this particular interaction and that's ETC-159," said Yu Jia, who is one of the authors of the study and a Senior Research Fellow at Duke-NUS' Cancer and Stem Cell Biology Programme.

ETC-159 is a made-in-Singapore anti-cancer drug, which was jointly developed by Duke-NUS and the Experimental Drug Development Centre (EDDC), a national platform for drug discovery and development hosted by A*STAR, the Agency for Science, Technology and Research. The Wnt-inhibitor is a novel small-molecule drug candidate that targets a range of cancers. It is currently progressing through clinical trials as a treatment for a subset of colorectal and gynecological cancers.

Over the next year or so, the research team hopes to build on this structure to understand in detail how Wnts get loaded onto WLS and how WLS is delivered to its receptors.


Crystal structure of coelenterazine-binding protein from Renilla muelleri at 1.7 Å: Why it is not a calcium-regulated photoprotein

We thank Hanna Nilsson for protein purification, Torbjörn Drakenberg for help with ¹³C NMR experiments, Hans Lilja for NMR spectrometer maintenance, Ulf Ryde for help with simulations in Lund, Andrew McCammon for providing facilities for the initial computations, and Gerhard Hummer for valuable comments. This work was supported by the Swedish Research Council. Computational resources were financed by the National Science Foundation and Howard Hughes Medical Institute (at University of California, San Diego) and by The Lund Centre for Computational Science (at Lund University).


2.1 MDpocket input

The general input format of MDpocket is a text file listing the filenames of all PDB files to be considered for the analysis. This choice is motivated by the fact that MD trajectories are stored in different file formats depending on the specifications defined in programs such as Amber (Case et al., 2005), Charmm (Brooks et al., 2009; MacKerell et al., 1998), Gromacs (Hess et al., 2008) or NAMD (Phillips et al., 2005). Due to the lack of a common format, we have decided to transform the trajectory into a set of PDB files corresponding to snapshots taken along the simulation. Moreover, when those files are ordered by time, MDpocket permits the analysis of time-dependent events. In addition, the use of PDB files also facilitates the analysis of X-ray structures taken from the PDB. Finally, the PDB files do not need to be identical (no generic topology required), which makes MDpocket easily adaptable to analyse conformational ensembles from various sources (homologous proteins, for instance).
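As an illustration of preparing such an input file, the following sketch collects snapshot files from a directory, orders them by the number embedded in their names, and writes one path per line. The `snap_<n>.pdb` naming scheme, the one-path-per-line layout and the helper names are assumptions made for this example, not a documented MDpocket requirement.

```python
import re
import tempfile
from pathlib import Path

def snapshot_index(path):
    """Frame number taken from a name like 'snap_12.pdb' (hypothetical naming)."""
    match = re.search(r"(\d+)(?=\.pdb$)", path.name)
    return int(match.group(1)) if match else -1

def write_snapshot_list(snapshot_dir, list_file):
    """Write the snapshot paths, ordered by frame number, one per line."""
    paths = sorted(Path(snapshot_dir).glob("*.pdb"), key=snapshot_index)
    Path(list_file).write_text("".join(f"{p}\n" for p in paths))
    return paths

# Small demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("snap_10.pdb", "snap_2.pdb", "snap_1.pdb"):
        (Path(tmp) / name).write_text("END\n")
    ordered = write_snapshot_list(tmp, Path(tmp) / "snapshots.txt")
    ordered_names = [p.name for p in ordered]

print(ordered_names)
```

Sorting on the numeric frame index rather than on the raw filename avoids the usual pitfall where `snap_10.pdb` sorts before `snap_2.pdb`, which would scramble the time ordering.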

To carry out the cavity detection with MDpocket, it is important to superimpose the PDB structures onto each other. To this end, solvent molecules and counter ions were stripped off the system prior to PDB export, and then structural alignments were carried out using ptraj from AmberTools (Case et al., 2005).

2.2 fpocket parameters and output

MDpocket relies on the pocket detection program fpocket, which makes extensive use of Voronoi tessellation during cavity detection. This geometric approach allows the retrieval of α-spheres (i.e. spheres that are in contact with exactly four atoms without any other atom situated within the sphere). The centre of an α-sphere corresponds to a Voronoi vertex. A list of all Voronoi vertices (clustered into pockets) situated on the protein surface is provided in the output of fpocket.

The fpocket module is very flexible regarding the type of cavity to be detected. The flexibility is achieved through user-accessible command line parameters that influence filtering and clustering of α-spheres. The most important parameters are those that define the size of α-spheres built up in a binding site (-m: minimum α-sphere size; -M: maximum α-sphere size). Moreover, filtering and clustering of α-spheres can be modified using parameters -i (the minimum number of α-spheres in the final pocket) and -n (the minimum number of α-spheres close to each other for merging two binding sites into a single one).

Three different parameter sets for pocket detection have been assessed here in order to illustrate the scalability of the algorithm. Set 1 denotes the default fpocket parameter set (-m 3.0, -M 6.0, -i 30, -n 3), which is tailored for detection of small molecule (i.e. peptides, drug-like compounds) binding sites. Set 2 is intended to identify very small channels and pockets (-m 2.8, -M 6.0, -i 3, -n 2). Finally, Set 3 (-m 3.5, -M 5.5, -i 1, -n 2) is chosen to represent an α-sphere with a physically meaningful minimum size, while retaining all the pockets (even tiny ones built by a single α-sphere), it being thus better suited to identify very open cavities physically accessible to a water molecule and detection of continuous channels that can accommodate a water molecule.
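The three parameter sets can be written down as plain data, which makes it easy to switch between them. The sketch below only assembles an argument list; the executable name `fpocket` and the `-f` flag for the input structure are assumptions about the command line, while the `-m`, `-M`, `-i` and `-n` values are the ones quoted above.

```python
# Parameter values as listed in the text; the set names are ours.
PARAMETER_SETS = {
    "set1": {"-m": 3.0, "-M": 6.0, "-i": 30, "-n": 3},  # default, small-molecule sites
    "set2": {"-m": 2.8, "-M": 6.0, "-i": 3, "-n": 2},   # very small channels and pockets
    "set3": {"-m": 3.5, "-M": 5.5, "-i": 1, "-n": 2},   # water-accessible cavities/channels
}

def fpocket_args(pdb_file, set_name):
    """Build an argument list for one snapshot (assumed 'fpocket -f <file>' form)."""
    args = ["fpocket", "-f", pdb_file]
    for flag, value in PARAMETER_SETS[set_name].items():
        args += [flag, str(value)]
    return args

command = fpocket_args("snapshot_0001.pdb", "set2")
print(" ".join(command))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run` without shell quoting issues, should one wish to execute it.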

2.3 MDpocket workflow: pocket detection

Pockets are detected based on the workflow depicted in Supplementary Figure S1:

A 1 Å spaced grid is placed over the first snapshot of the set of superposed pdb files.

fpocket is run on every snapshot of the set of pdb files.

In contrast to pocket detection on a single static structure, MDpocket provides information about the plasticity of pockets from the normalized frequency/density maps generated from a given ensemble of structures. This is a major difference from other approaches that assign discrete pocket IDs to track pockets during MD trajectories (Eyrisch and Helms, 2007). In MDpocket, accurate pocket ID identification and tracking is not necessary, rendering the detection protocol more generic and less error prone.
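The idea of a normalized frequency map can be sketched as follows. The text does not spell out how α-sphere centres are assigned to grid points, so this example simply assumes that each centre is snapped to the nearest point of a 1 Å grid and that a grid point's frequency is the fraction of snapshots in which at least one centre lands on it. The coordinates are invented.

```python
from collections import Counter

def grid_point(coord, spacing=1.0):
    """Snap an (x, y, z) coordinate to the nearest grid index (assumed rule)."""
    return tuple(round(c / spacing) for c in coord)

def frequency_map(snapshots, spacing=1.0):
    """Fraction of snapshots in which each grid point hosts a pocket centre."""
    counts = Counter()
    for centres in snapshots:
        # A set ensures each grid point is counted at most once per snapshot.
        counts.update({grid_point(c, spacing) for c in centres})
    n = len(snapshots)
    return {point: hits / n for point, hits in counts.items()}

# Hypothetical alpha-sphere centres for four superposed snapshots.
snapshots = [
    [(0.1, 0.2, 0.0), (0.3, -0.2, 0.1), (5.0, 5.2, 4.9)],
    [(0.2, 0.1, -0.1)],
    [(-0.1, 0.0, 0.2), (5.1, 4.8, 5.0)],
    [(0.0, 0.3, 0.1)],
]
freq = frequency_map(snapshots)
print(freq)
```

In this toy ensemble the pocket around the origin is open in every snapshot, whereas the one near (5, 5, 5) is transient and appears in half of them. No pocket ID has to be tracked from frame to frame to see that.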

2.4 MDpocket workflow: pocket characterization

Frequency and density maps are valuable for exploring pocket opening/closure. MDpocket also allows those pockets or binding sites to be characterized by providing a variety of descriptors, which include the accessible surface area and volume of the pocket, the number of α-spheres and the mean local hydrophobic density, which is an index of binding site druggability (Schmidtke and Barril, 2010).

To carry out the pocket characterization, the user can extract all grid points having a grid value equal or higher than a certain threshold from the previously calculated pocket frequency map (a default value of 0.5 is defined in MDpocket). Thus, visualization of the frequency map permits the user to select an area of interest (i.e. a transient channel or a binding site) using a graphical display tool, and the user-defined zone (saved as pdb file) can then be used as input for MDpocket in order to determine all pocket descriptors corresponding to the selected area for the whole ensemble.
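The thresholding step amounts to a simple filter over the frequency map. In this sketch the map is represented as a dictionary from grid indices to frequencies, which is an assumption made for illustration and not MDpocket's actual file format; the default of 0.5 is the value quoted above.

```python
def select_grid_points(frequency_map, threshold=0.5):
    """Return grid points whose frequency is equal to or higher than the threshold."""
    return sorted(p for p, f in frequency_map.items() if f >= threshold)

# Hypothetical frequency values for three grid points.
freq = {(0, 0, 0): 1.0, (5, 5, 5): 0.5, (9, 1, 3): 0.25}
selected = select_grid_points(freq)
print(selected)
```

Raising the threshold restricts the selection to persistently open regions, while lowering it also captures transient channels.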

2.5 MDpocket validation

The usefulness and accuracy of MDpocket have been calibrated considering three molecular systems studied previously in our group.

HSP90: a 78.5 ns trajectory run of the N-terminal domain of the heat shock protein 90 (HSP90) with explicit solvent (TIP3P water model) was first considered. The simulation was run in the NPT ensemble (1 atm, 298 K) using periodic boundary conditions and Ewald sums (grid spacing of 1 Å) for long-range electrostatic interactions. The parm99 force field and the Amber (Case et al., 2005) package were also used. From this trajectory, 3925 equally spaced snapshots were extracted and analysed with MDpocket. Furthermore, an alternative ensemble of structures was built up by retrieving 88 X-ray crystallographic structures from the PDB (Supplementary Table S1), which were subsequently aligned using PyMOL.

Mb: the crystal structure of Mb (PDB entry 1VXD) (Yang and Phillips, 1996) was immersed in an octahedral box of TIP3P water molecules and the net charge of the system was neutralized with sodium ions. The final system contained 21 000 atoms. The simulations were run using the PMEMD module of Amber9 and the parm99 force field with special parameters for the haem residue (Bidon-Chanal et al., 2006; Marti et al., 2006). The SHAKE algorithm was used to keep bonds involving hydrogen atoms at their equilibrium length, in conjunction with a 1 fs time step for the integration of Newton's equations. Trajectories were collected in the NPT ensemble (1 atm, 298 K) using periodic boundary conditions and Ewald sums (grid spacing of 1 Å) for long-range electrostatic interactions. The systems were minimized using a multi-step protocol, involving first the adjustment of hydrogens, then the refinement of water molecules and finally the minimization of the whole system. The equilibration was performed by heating from 100 to 298 K in four 100 ps steps at 150, 200, 250 and 298 K. Finally, a 50 ns trajectory was obtained, collecting frames at 1 ps intervals. The MDpocket analysis was performed with 10 000 snapshots equally spaced in time.

P38 MAP kinase: the PDB structure 1P38 was used as the initial structure for a 50 ns MD trajectory. Leap was used to immerse the protein in an octahedral solvent box. The overall charge of the system was neutralized by the addition of counterions. The solvent box contained a mixture of water and 20% isopropanol molecules. For details of the equilibration protocol, see Seco et al. (2009). The production run was carried out at 1 atm and 300 K using periodic boundary conditions. In all, 5000 snapshots equally spaced in time were used for the MDpocket analysis.

To assess whether MDpocket can give useful hints during the selection of receptor conformations for molecular docking, 32 X-ray crystallographic structures of P38 with DFG-in conformations and bound ligands were extracted from the PDB (Supplementary List S2), and aligned to all snapshots of the MD trajectory using the Cα atoms of residues 35–39, 45–50 and 100–104, which correspond to the stable part of the β-sheet lining the binding site. In order to extract the interaction energies for each DFG-in ligand, the aligned ligand was extracted from the crystal structure and added to each snapshot of the MD trajectory to calculate the interaction energy. All energy calculations were performed using the Molecular Operating Environment (MOE) (Chemical Computing Group, 2009) and the default potential energy function with the Merck molecular force field (MMFF). No modifications or conformational changes were applied to the ligands near the residues in the binding site. Thus, this very crude interaction energy evaluation should mainly give insights into steric clashes that could occur in the ligand–protein complex if the ligand is docked in a given conformation of the protein sampled during the MD trajectory.

Principles of herbal pharmacology

Adverse reactions and toxicology

Tannins are found to some extent in many medicinal plants. The following comments about adverse reactions refer only to relatively high doses of herbs containing significant quantities of tannins. It is unlikely that incidental exposure to tannins at low levels has any significant negative impact on health. In fact, condensed tannins are found in several commonly consumed foods. In contrast, hydrolysable tannins are less common in foods (the pomegranate being one exception) and this suggests that the long-term therapeutic intake of this group of tannins should be avoided.

High doses of tannins lead to excessive astringency on mucous membranes, which produces an irritating effect. This probably led to the practice of adding milk to tea whereby the tannins preferentially bind to proteins in the milk rather than the gut wall. However, even adding milk does not prevent the constipation that can result from chronic intake of high levels of tannins. For these reasons, high doses of strongly astringent herbs should be used cautiously in highly inflamed or ulcerated conditions of the gastrointestinal tract and in patients complaining of constipation.

Chronic intake of tannins inhibits digestive enzymes, especially the membrane-bound enzymes on the small intestinal mucosa. 277 Tannins complex metal ions and inhibit their absorption. One study found that as long as tea and iron are consumed separately, iron absorption is not affected. 278 This iron-complexing property of tannins could be exploited in male patients with haemochromatosis, which is now recognised to be a relatively common disorder. (See Appendix C for more information on such potential interactions with tannin-containing herbs.) Tannins can also react with thiamine and decrease its absorption. 279

Addition of tannic acid, a hydrolysable tannin, to the barium sulphate mixture used in barium enemas increases the yield and accuracy of the examination. The colonic mucosa stands out clearly and tumour visualisation is improved. However, the practice was banned in 1964 by the US FDA (Food and Drug Administration). 280 Several deaths caused by acute hepatotoxicity, the majority in children, were attributed to this practice. 281 In these cases, quantities of tannic acid sufficient to cause massive liver damage were absorbed directly into the bloodstream from the colon. This effect is highly unlikely to follow from use of tannin-containing herbs. Nonetheless, some unexplained cases of herbal hepatotoxicity have been recorded. It is therefore prudent to avoid the use of high doses of highly astringent herbs in patients with very damaged gastrointestinal tracts, other than in the circumstances outlined above. Green tea extract consumption has been linked to rare cases of idiosyncratic hepatotoxicity. 282

Tannins are carcinogenic when injected subcutaneously 283 and herbal teas containing tannins have been implicated in the possible development of oesophageal cancers. 284,285 While these associations probably have little relevance to phytotherapy, they do suggest caution with the long-term oral and topical use (on damaged skin) of tannin-containing herbs.

Computational Protein Analysis

Proteins play key roles in almost all biological pathways in a living system, and their functions are determined by the three-dimensional shape of the folded polypeptide chain. Advances in DNA sequencing and structural biology over the years have revolutionized our understanding of the structure-function relationship of these macromolecules. In particular, the growing number of protein structures in the Protein Data Bank has provided invaluable information about the precise position of each atom, which allows protein structures and cellular machinery to be understood at the atomic level and facilitates the discovery of therapeutic drugs. Complementary to experimental methods, in silico approaches can reveal additional information related to many aspects of the protein structure-function relationship, which could be masked by the static picture of the protein configuration.

At Profacgen, we use state-of-the-art software tools that enable comprehensive analyses of a protein by integrating both sequence data and structural information. Our experienced bioinformatics team can help reveal various features of the protein of interest, including but not limited to:

  • Physico-chemical parameters
  • Annotation of protein functions
  • Homology detection and structural alignment
  • Motif discovery
  • Functional/binding sites analysis and calculation of cavity volume
  • Global topological analysis and local structure characterization
  • Identification of surface/interface contact map
  • Mapping of H-bonding networks
  • Electrostatics calculation
  • Hydrophobic cluster analysis
  • Evaluation of protein stability/folding energy
  • Prediction of protein intrinsic disorder
  • Estimation of structure quality and error correction
  • Sequence- and structure-based PPI networks

We attempt to accurately characterize proteins in silico and to provide insights into the functions based on information such as their sequence, structure, dynamics, evolutionary history, and their association with other molecules. The results will offer an explanation of observed experimental data from a computational biology perspective, or serve as a guide for further lab experiments.


Fast turnaround time
Detailed project report usually within a week
A variety of computational analyses available
Customized analysis as requested
Assistance in identifying appropriate software tools
Experts available for technical consultation and experimental design

We promise to offer customized service according to our customers' specific needs and to integrate our computational procedures into your workflow. Please do not hesitate to contact us for more details about our computational protein analysis service.


We have presented an algorithm that infers binding site patterns by utilizing local similarity among active site sub-cavities. The uniqueness of our approach lies not only in the consideration of sub-cavities, but also in the more complete structural representation of these sub-cavities, their parametrization and the method by which they are compared. We demonstrated the algorithm's ability to leverage previously unused structural information to perform binding inference for proteins that do not share significant structural similarity with known systems. Using HIV-1 Protease and Thrombin as test cases, we have taken the first step toward sub-cavity-based pharmacophore inference. We intend to extend our work toward fully automatic pharmacophore inference and protein function prediction. More specifically, we believe that an automatically generated pharmacophore map could be used for virtual docking, lead optimization and de novo drug design. An example of a lead optimization effort using this approach would be to apply predicted binding preferences to the replacement of chemical groups on a well-studied scaffold. Detailed knowledge of a pharmacophore map may also allow protein function prediction or provide support for human-generated binding hypotheses.