How does RNA transcription determine which half of the DNA to use?

I feel that I might have a complete misunderstanding here. If DNA has two strands, how does the machinery of RNA transcription determine which one to transcribe?

I'll keep this short and simple. The direction of transcription (which determines which strand is used as the template) is controlled by the promoter, which is a region of specific DNA motifs at the 5' end of a gene. RNA polymerase binds to the promoter, which orients it on the correct strand and in the correct direction, after which it can proceed to transcribe the gene.

To add to canadianer's answer, in fact genes can be found on both strands of the DNA in most eukaryotic cells, in the sense that the sense and anti-sense strands are not always the same strand. The direction is therefore completely determined by the promoter. Furthermore, there are bidirectional promoters.
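The point made in both answers can be sketched in a few lines of Python: once the promoter's orientation is known, the template strand is fixed, and the RNA matches the other (sense) strand. This is only an illustration; the function names, the `promoter_strand` argument and the toy sequence are assumptions, not part of the answers above.

```python
# Illustrative sketch: names and the toy sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def transcribe(top_strand: str, promoter_strand: str) -> str:
    """Return the RNA (5'->3') produced from a double-stranded region.

    top_strand      -- the 'top' strand of the region, written 5'->3'
    promoter_strand -- '+' if the promoter points along the top strand,
                       '-' if it points along the bottom strand
    """
    if promoter_strand == "+":
        sense = top_strand                      # bottom strand is the template
    elif promoter_strand == "-":
        sense = reverse_complement(top_strand)  # top strand is the template
    else:
        raise ValueError("promoter_strand must be '+' or '-'")
    # The RNA has the same sequence as the sense strand, with U instead of T.
    return sense.replace("T", "U")

region = "ATGGCCTAA"
print(transcribe(region, "+"))  # AUGGCCUAA
print(transcribe(region, "-"))  # UUAGGCCAU
```

The same stretch of DNA gives two different transcripts depending only on which way the promoter points, which is why genes can sit on either strand.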

Short RNA half-lives in the slow-growing marine cyanobacterium Prochlorococcus

RNA turnover plays an important role in the gene regulation of microorganisms and influences their speed of acclimation to environmental changes. We investigated whole-genome RNA stability of Prochlorococcus, a relatively slow-growing marine cyanobacterium doubling approximately once a day, which is extremely abundant in the oceans.


Using a combination of microarrays, quantitative RT-PCR and a new fitting method for determining RNA decay rates, we found a median half-life of 2.4 minutes and a median decay rate of 2.6 minutes for expressed genes - twofold faster than that reported for any organism. The shortest transcript half-life (33 seconds) was for a gene of unknown function, while some of the longest (approximately 18 minutes) were for genes with high transcript levels. Genes organized in operons displayed intriguing mRNA decay patterns, such as increased stability, and delayed onset of decay with greater distance from the transcriptional start site. The same phenomenon was observed on a single probe resolution for genes greater than 2 kb.


We hypothesize that the fast turnover relative to the slow generation time in Prochlorococcus may enable a swift response to environmental changes through rapid recycling of nucleotides, which could be advantageous in nutrient poor oceans. Our growing understanding of RNA half-lives will help us interpret the growing bank of metatranscriptomic studies of wild populations of Prochlorococcus. The surprisingly complex decay patterns of large transcripts reported here, and the method developed to describe them, will open new avenues for the investigation and understanding of RNA decay for all organisms.
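To give a feel for what a 2.4-minute half-life means, here is a minimal Python sketch that assumes simple first-order (exponential) decay. That is a simplification: the abstract describes its own fitting method and delayed onset of decay for some transcripts, neither of which is modelled here. The function names are hypothetical.

```python
import math

def decay_constant(half_life_min: float) -> float:
    """First-order decay constant k (per minute) for a given half-life."""
    return math.log(2) / half_life_min

def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of the initial transcript pool left after t_min minutes."""
    return math.exp(-decay_constant(half_life_min) * t_min)

median_half_life = 2.4  # minutes, the median half-life quoted in the abstract
print(round(decay_constant(median_half_life), 3))            # 0.289
print(round(fraction_remaining(2.4, median_half_life), 3))   # 0.5
print(round(fraction_remaining(10.0, median_half_life), 3))  # 0.056
```

Under this assumption, only about 6% of a transcript pool would remain ten minutes after transcription stops, in a cell that divides roughly once a day.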

Within Individual Variation

Katherine E. Willmore, Benedikt Hallgrímsson, in Variation, 2005

2 Causes at the Level of RNA and Protein Production

RNA transcription and translation have an error rate that is three orders of magnitude greater than that of DNA replication, and are therefore a significant source of developmental noise intrinsic to the cell (Alberts et al., 1998).

In addition to the errors that can occur during the process, transcription is an inherently stochastic process, leading to variable amounts of products at variable time intervals (McAdams and Arkin, 1999). Recent research indicates that transcription in eukaryotes is regulated by a binary “on” versus “off” switching mechanism, based on probability (McAdams and Arkin, 1999; Fiering et al., 2000; Blake et al., 2003; Klingenberg, 2004a). Fiering and colleagues (2000) found evidence in support of this binary regulation in a study of the effect of enhancers on transcription. They found that gene expression in myofibers was stochastic, that within a given nucleus a gene was either on or off, and that gene enhancers increase the probability that a gene will be activated. A result of this binary regulation is that transcription occurs in bursts of activity with intervals of inactivity of random duration (Klingenberg, 2004a). The products of transcription as well as the proteins from their subsequent translation will likewise be produced in sporadic bursts, enhancing the effects of the noise created during transcription (McAdams and Arkin, 1999; Swain et al., 2002; Blake et al., 2003; Klingenberg, 2004a). The production of proteins in this sporadic fashion at random time intervals leads to variable amounts of protein product in the cell at any given time (McAdams and Arkin, 1997, 1999). Stochastic protein production can lead to large differences in subsequent regulatory cascades across a population of cells and is another potential source of developmental noise (McAdams and Arkin, 1997).
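The on/off switching and bursting described above is often illustrated with a two-state ("telegraph") promoter model simulated by the Gillespie algorithm. The sketch below is one such toy model in Python; it is not taken from the cited studies, and the function name, rate parameters and their values are assumptions chosen only for illustration.

```python
import random

def simulate_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of a gene that switches randomly between ON and OFF.

    k_on, k_off -- switching rates OFF->ON and ON->OFF (must be > 0)
    k_tx        -- transcription rate while the gene is ON
    k_deg       -- degradation rate per mRNA molecule
    Returns a list of (time, gene_on, mrna_count), one entry per event.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    trajectory = [(t, gene_on, mrna)]
    while True:
        # Propensities of the three possible reactions in the current state.
        rates = [
            k_off if gene_on else k_on,  # promoter switches state
            k_tx if gene_on else 0.0,    # transcription, only while ON
            k_deg * mrna,                # degradation of one mRNA
        ]
        total = sum(rates)
        t += rng.expovariate(total)      # exponential waiting time to next event
        if t > t_end:
            break
        r = rng.random() * total         # pick a reaction in proportion to its rate
        if r < rates[0]:
            gene_on = not gene_on
        elif r < rates[0] + rates[1]:
            mrna += 1
        else:
            mrna -= 1
        trajectory.append((t, gene_on, mrna))
    return trajectory

# Rare activation, fast inactivation: transcripts appear in short bursts.
traj = simulate_telegraph(k_on=0.1, k_off=0.5, k_tx=10.0, k_deg=1.0, t_end=50.0, seed=1)
print(len(traj), "events; final mRNA count:", traj[-1][2])
```

With a low `k_on` relative to `k_off`, the gene spends most of its time off, and the mRNA count rises in bursts separated by silent intervals of random length, which is the qualitative behaviour the paragraph describes.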

Models based on stochastic transcription and translation predict that developmental noise increases as the amount of transcript decreases (Elowitz et al., 2002). Therefore, slower transcription rates (Alberts et al., 1998; Blake et al., 2003), longer durations between transcriptional activity, and shorter half-lives of products (Klingenberg, 2004a) will potentially lead to increased noise. In addition, reduced numbers of products extrinsic to transcription that are involved in its regulation may also increase developmental noise. These extrinsic factors include the number of RNA polymerases, ribosomes, and proteins (Swain et al., 2002).

The endoplasmic reticulum (ER) is the gateway for proteins to the secretory pathway (Fewell et al., 2001; Sitia and Braakman, 2003). The ER provides a particularly favorable environment for the folding and maturation of proteins, often acting as a final check for the fidelity of proteins before their entry to the secretory pathway (Fewell et al., 2001; Ellgaard and Helenius, 2003; Sitia and Braakman, 2003). Quality control in the ER is particularly important as the final locations of secreted proteins often lack chaperones or other folding factors that could fix their configuration if they were to become damaged (Ellgaard and Helenius, 2003). The proteins secreted from the ER are involved in a variety of important processes such as nutrient storage, gene regulation, and some extracellular carrier functions for ligands (Ellgaard and Helenius, 2003). The functional importance of proteins produced in the ER, coupled with their potentially far-reaching effects as a result of entering the secretory pathway, would enable a cascade of downstream perturbations if the stability of these proteins were disrupted. Several mechanisms exist to ensure the fidelity of proteins produced in the ER (see the following text under mechanisms). However, proteins that have somehow evaded the quality control mechanisms of the ER have a great potential to cause numerous and potentially severe downstream disruptions. One possible cause for the release of damaged or nonfunctional proteins from the ER arises from the primary method of quality control. The ER does not account for the functionality of proteins. Rather, quality control relies on the correct conformation of proteins, assuming that proteins that are properly folded are functional, potentially leading to the secretion of nonfunctional, but correctly folded proteins (Ellgaard and Helenius, 2003).

The presence of disulfide bonds is characteristic of secretory proteins, and the ER provides a unique environment that supports disulfide bond formation (Fewell et al., 2001). Under normal circumstances, disulfide bonds are beneficial, and there are several factors within the ER to ensure their formation. However, the processes involved in disulfide bond formation are quite slow in comparison with other folding processes, and they may therefore slow the production and the subsequent secretion of proteins (Fewell et al., 2001). If there were a disruption to disulfide bond formation within the ER, time-sensitive downstream processes could be halted or slowed because of a reduced number of secretory proteins.

Proteins act to regulate the further production of other proteins as well as to modulate the expression of genes. Therefore, any stochasticity at the level of RNA and protein production can have cascading consequences throughout the developmental system.

In DNA, regulation of gene expression normally happens at the level of RNA biosynthesis (transcription), and is accomplished through the sequence-specific binding of proteins (transcription factors) that activate or inhibit transcription. Transcription factors may act as activators, repressors, or both. Repressors often act by preventing RNA polymerase from forming a productive complex with the transcriptional initiation region (promoter), while activators facilitate formation of a productive complex. Furthermore, DNA motifs have been shown to be predictive of epigenomic modifications, suggesting that transcription factors play a role in regulating the epigenome. [2]

In RNA, regulation may occur at the level of protein biosynthesis (translation), RNA cleavage, RNA splicing, or transcriptional termination. Regulatory sequences are frequently associated with messenger RNA (mRNA) molecules, where they are used to control mRNA biogenesis or translation. A variety of biological molecules may bind to the RNA to accomplish this regulation, including proteins (e.g. translational repressors and splicing factors), other RNA molecules (e.g. miRNA) and small molecules, in the case of riboswitches.

A regulatory DNA sequence does not regulate unless it is activated. Different regulatory sequences are activated and then implement their regulation by different mechanisms.

Enhancer activation and implementation

Up-regulated expression of genes in mammals can be initiated when signals are transmitted to the promoters associated with the genes. Cis-regulatory DNA sequences that are located in DNA regions distant from the promoters of genes can have very large effects on gene expression, with some genes undergoing up to 100-fold increased expression due to such a cis-regulatory sequence. [3] These cis-regulatory sequences include enhancers, silencers, insulators and tethering elements. [4] Among this constellation of sequences, enhancers and their associated transcription factor proteins have a leading role in the regulation of gene expression. [5]

Enhancers are sequences of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene expression programs, most often by looping through long distances to come into physical proximity with the promoters of their target genes. [6] In a study of brain cortical neurons, 24,937 loops were found, bringing enhancers to promoters. [3] Multiple enhancers, each often tens or hundreds of thousands of nucleotides distant from their target genes, loop to their target gene promoters and coordinate with each other to control expression of their common target gene. [6]

The schematic illustration in this section shows an enhancer looping around to come into close physical proximity with the promoter of a target gene. The loop is stabilized by a dimer of a connector protein (e.g. dimer of CTCF or YY1), with one member of the dimer anchored to its binding motif on the enhancer and the other member anchored to its binding motif on the promoter (represented by the red zigzags in the illustration). [7] Several cell function specific transcription factor proteins (in 2018 Lambert et al. indicated there were about 1,600 transcription factors in a human cell [8] ) generally bind to specific motifs on an enhancer [9] and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, govern the level of transcription of the target gene. Mediator (coactivator) (a complex usually consisting of about 26 proteins in an interacting structure) communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II (RNAP II) enzyme bound to the promoter. [10]

Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two eRNAs as illustrated in the Figure. [11] An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it and that activated transcription factor may then activate the enhancer to which it is bound (see small red star representing phosphorylation of a transcription factor bound to an enhancer in the illustration). [12] An activated enhancer begins transcription of its RNA before activating a promoter to initiate transcription of messenger RNA from its target gene. [13]

CpG island methylation and demethylation

5-methylcytosine (5-mC) is a methylated form of the DNA base cytosine (see Figure). 5-mC is an epigenetic marker found predominantly on cytosines within CpG dinucleotides, where 5’ cytosine is followed by 3’ guanine (CpG sites). About 28 million CpG dinucleotides occur in the human genome. [14] In most tissues of mammals, on average, 70% to 80% of CpG cytosines are methylated (forming 5-methylCpG or 5-mCpG). [15] Methylated cytosines within 5’cytosine-guanine 3’ sequences often occur in groups, called CpG islands. About 59% of promoter sequences have a CpG island while only about 6% of enhancer sequences have a CpG island. [16] CpG islands constitute regulatory sequences, since if CpG islands are methylated in the promoter of a gene this can reduce or silence gene expression. [17]
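As a rough illustration of how CpG-rich regions are recognised from sequence alone, the Python sketch below counts CpG dinucleotides and computes the observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). A commonly used sequence-based definition of a CpG island requires roughly 200 bp or more, a GC fraction above 0.5 and an observed/expected ratio above 0.6. The function name and example sequence are hypothetical, and the code says nothing about methylation status, which cannot be read from the plain sequence.

```python
def cpg_stats(seq: str) -> dict:
    """Count CpG sites and compute GC fraction and observed/expected CpG ratio."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # a C immediately followed (5'->3') by a G
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: c * g / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "cpg_sites": cpg, "gc_fraction": gc_fraction, "obs_exp": obs_exp}

stats = cpg_stats("ACGCGTTCGA")
print(stats["cpg_sites"])          # 3
print(stats["gc_fraction"])        # 0.6
print(round(stats["obs_exp"], 2))  # 3.33
```

In practice the statistics would be computed in a sliding window along a chromosome; the ten-base toy sequence here is far too short to qualify as an island.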

DNA methylation regulates gene expression through interaction with methyl binding domain (MBD) proteins, such as MeCP2, MBD1 and MBD2. These MBD proteins bind most strongly to highly methylated CpG islands. [18] These MBD proteins have both a methyl-CpG-binding domain as well as a transcription repression domain. [18] They bind to methylated DNA and guide or direct protein complexes with chromatin remodeling and/or histone modifying activity to methylated CpG islands. MBD proteins generally repress local chromatin such as by catalyzing the introduction of repressive histone marks, or creating an overall repressive chromatin environment through nucleosome remodeling and chromatin reorganization. [18]

As noted in the previous section, transcription factors are proteins that bind to specific DNA sequences in order to regulate the expression of a given gene. The binding sequence for a transcription factor in DNA is usually about 10 or 11 nucleotides long. As summarized in 2009, Vaquerizas et al. indicated there are approximately 1,400 different transcription factors encoded in the human genome and they constitute about 6% of all human protein coding genes. [19] About 94% of transcription factor binding sites (TFBSs) that are associated with signal-responsive genes occur in enhancers while only about 6% of such TFBSs occur in promoters. [9]

EGR1 protein is a particular transcription factor that is important for regulation of methylation of CpG islands. An EGR1 transcription factor binding site is frequently located in enhancer or promoter sequences. [20] There are about 12,000 binding sites for EGR1 in the mammalian genome and about half of EGR1 binding sites are located in promoters and half in enhancers. [20] The binding of EGR1 to its target DNA binding site is insensitive to cytosine methylation in the DNA. [20]

While only small amounts of EGR1 transcription factor protein are detectable in cells that are un-stimulated, EGR1 translation into protein at one hour after stimulation is drastically elevated. [21] Expression of EGR1 transcription factor proteins, in various types of cells, can be stimulated by growth factors, neurotransmitters, hormones, stress and injury. [21] In the brain, when neurons are activated, EGR1 proteins are up-regulated and they bind to (recruit) the pre-existing TET1 enzymes which are highly expressed in neurons. TET enzymes can catalyse demethylation of 5-methylcytosine. When EGR1 transcription factors bring TET1 enzymes to EGR1 binding sites in promoters, the TET enzymes can demethylate the methylated CpG islands at those promoters. Upon demethylation, these promoters can then initiate transcription of their target genes. Hundreds of genes in neurons are differentially expressed after neuron activation through EGR1 recruitment of TET1 to methylated regulatory sequences in their promoters. [20]

Signal-responsive promoters and enhancers subject to limited, short-term, double-strand or single-strand breaks

About 600 regulatory sequences in promoters and about 800 regulatory sequences in enhancers appear to depend on double-strand breaks initiated by topoisomerase 2-beta (TOP2B) for activation. [22] The induction of particular double-strand breaks is specific with respect to the inducing signal. When neurons are activated, just 22 TOP2B-induced double-strand breaks occur in their genomes. [23]

Such TOP2B-induced double-strand breaks are accompanied by at least four enzymes of the non-homologous end joining (NHEJ) DNA repair pathway (DNA-PKcs, KU70, KU80 and DNA LIGASE IV) (see Figure). These enzymes repair the double-strand breaks within about 15 minutes to two hours. [23] [24] The double-strand breaks in the promoter are thus associated with TOP2B and at least these four repair enzymes. These proteins are present simultaneously on a single promoter nucleosome (there are about 147 nucleotides in the DNA sequence wrapped around a single nucleosome) located near the transcription start site of their target gene. [24]

The double-strand break introduced by TOP2B apparently frees the part of the promoter at an RNA polymerase-bound transcription start site to physically move to its associated enhancer. This allows the enhancer, with its bound transcription factors and mediator proteins, to directly interact with the RNA polymerase paused at the transcription start site to start transcription. [23] [10]

Topoisomerase I (TOP1) enzymes appear to be located at a large number of enhancers and those enhancers become activated when TOP1 introduces a single-strand break. [25] TOP1 causes single strand breaks in particular enhancer DNA regulatory sequences when signaled by a specific enhancer-binding transcription factor. [25] Topoisomerase I breaks are associated with different DNA repair factors than those surrounding TOP2B breaks. In the case of TOP1 the breaks are associated most immediately with DNA repair enzymes MRE11, RAD50 and ATR. [25]

Research to find all regulatory regions in the genomes of all sorts of organisms is under way. [26] Conserved non-coding sequences often contain regulatory regions, and so they are often the subject of these analyses.

What Is the TATA Box?

In living organisms, transcription of deoxyribonucleic acid (DNA) is the initial step necessary for the expression of a gene. The TATA box, also known as the Goldberg-Hogness box, is a region of DNA that helps initiate the process of transcription. It is part of the promoter region, which regulates gene expression by providing a binding site for enzymes involved in transcribing genes. The TATA box is found in eukaryotes — organisms that have complex membrane-bound structures within their cells — including humans.

DNA consists of nucleotides, repeating structural units that come in four varieties: the nucleobases adenine (A), thymine (T), guanine (G), and cytosine (C). As these bases repeat, they form patterns that encode genetic information. They also form pairs by chemically bonding in a complementary fashion, with adenine attaching to thymine and guanine attaching to cytosine. Base pairs connect the two strands of a DNA molecule into a double helix structure.

When DNA is transcribed, enzymes locally unwind the double helix, exposing the bases of the gene. For any given gene, only one of the two strands, the template strand, is read. An enzyme known as RNA polymerase constructs the ribonucleic acid (RNA) chain by pairing complementary nucleotides with the exposed template strand, so the resulting RNA has the same sequence as the other (coding) strand, with uracil in place of thymine.

In order for complete genes to be transcribed into messenger RNA (mRNA) for eventual expression, RNA polymerase must begin transcription at the correct point in the DNA sequence. This point, known as the initiation site, is indicated by a promoter region that occurs slightly upstream of the gene. The TATA box is a sequence of DNA, consisting of nucleobases TATAAA, located in the promoter region about 25 base pairs upstream of the transcription start site.
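The spacing described above can be turned into a very naive motif scan. The Python sketch below looks for the TATAAA sequence on one strand and guesses a start site 25 bases downstream; it is an illustration under those stated assumptions, not a real promoter predictor, and the function name, the fixed 25-base offset and the toy sequence are all hypothetical.

```python
import re

def find_tata_boxes(seq: str, motif: str = "TATAAA", offset: int = 25):
    """Return (tata_start, guessed_tss) pairs using 0-based positions.

    guessed_tss is tata_start + offset, or None if that falls past the
    end of the sequence. Only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    # The lookahead lets overlapping occurrences of the motif be reported.
    for match in re.finditer(f"(?={re.escape(motif)})", seq):
        start = match.start()
        tss = start + offset
        hits.append((start, tss if tss < len(seq) else None))
    return hits

toy_promoter = "GG" + "TATAAA" + "C" * 30
print(find_tata_boxes(toy_promoter))  # [(2, 27)]
```

To cover a gene on the opposite strand, the same scan would be run on the reverse complement of the sequence.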

Proteins known as transcription factors bind to the TATA box. One of these, the TATA-binding protein (TBP), is TATA-specific, while the others may be able to bind to non-TATA promoter regions. RNA polymerase is able to recognize the presence of transcription factors as a signal to bind to that location. After binding to the TATA box, RNA polymerase is at the initiation site and can now begin to transcribe the gene.

Most promoter regions of genes do not contain a TATA box. In TATA-less genes, transcription factors recognize other promoter sequences and RNA polymerase binds to these instead. Researchers have discovered differences in regulation between genes with the TATA box and those without the TATA box through the study of model organisms such as Saccharomyces yeast and the fruit fly Drosophila.

What is a regulatory mutation?

All cells in an organism contain the same genome - that is, the same genes arranged in the same order along the same chromosomes. This is a generalization - some genes are rearranged within certain cells, but this is the exception rather than the rule. For example, the genes coding for immunoglobulins (antibody molecules) are rearranged in B lymphocytes; this is how novel antibody structures are created in response to new invading organisms or viruses. Also, chromosomes are often broken and rejoined aberrantly in cancer cells. And, of course, germ cells (sperm and eggs) contain only half the number of chromosomes found in normal somatic (body) cells.

Nevertheless, most cells contain identical DNA. What makes one type of cell different from another is not the genes they contain but which of these genes they express - i.e. which genes they transcribe into RNA, then translate into protein. A muscle cell differs from a neuron in a great many of its constituent RNAs and proteins. For example, the muscle cell transcribes the genes coding for muscle actin and myosin but not the gene encoding neurofilament; the reverse is true of the neuronal cell. What determines whether a particular gene is transcribed or not? This depends to a large extent on the other proteins that are present in the cell - in particular, proteins that are collectively known as "transcription factors".

Every gene consists of a protein coding sequence, which might be contiguous or broken up into a series of exons and introns, and which begins with a START codon (ATG) and concludes with a STOP codon (TAA, TAG or TGA). Apart from this, a gene must have regulatory sequences associated with it. These are stretches of DNA which do not themselves code for protein but which act as binding sites for RNA polymerase and its accessory molecules as well as a variety of transcription factors. Together, the regulatory sequences with their bound proteins act as molecular switches that determine the activity state of the gene - e.g. OFF or FULL-ON or, more often, something in between. The regulatory sequences include the promoter region together with enhancer elements.
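The START/STOP structure described above can be sketched as a simple open-reading-frame search in Python. This toy version assumes a contiguous coding sequence with no introns and scans only the given strand; the function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_orf(seq: str):
    """Return the first ATG...STOP open reading frame in seq, or None.

    The frame is read in steps of three bases from each ATG in turn
    until an in-frame stop codon is found.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]  # include the stop codon
        start = seq.find("ATG", start + 1)  # no in-frame stop; try the next ATG
    return None

print(first_orf("CCATGGCCTGAAA"))  # ATGGCCTGA
print(first_orf("CCCGGG"))         # None
```

The regulatory sequences discussed next lie outside such a reading frame, which is why they cannot be found by this kind of search.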

Every gene has a promoter, which is the binding site for the basal transcriptional apparatus - RNA polymerase and its co-factors. This provides the minimum machinery necessary to allow transcription of the gene. The enhancer regions are found at a distance from the promoter, to either the 5' or 3' side of the gene or within introns. They are typically short stretches of DNA (200bp, say), each made up of a cluster of even shorter sequences (25bp, say) that are the binding sites for a variety of cell- or region-specific transcription factors. Once bound, these transcription factor complexes interact with the basal transcriptional machinery at the promoter to enhance (or sometimes diminish) the transcription rate of the gene. Such interactions are possible because of the flexible nature of DNA which allows the enhancers to come close to the promoter by looping out the DNA in between (see diagram below).

We can think of the activating function of enhancers as follows. Binding of RNA polymerase and the basal transcriptional machinery at the gene promoter is like switching on the engine and allowing it to idle in neutral. When the supplementary transcription factors bound to enhancer elements interact with the basal machinery, it is like putting the engine into gear and pulling away from the kerb. (Alternatively, for a repressor site it is like putting on the handbrake.)

Frequently, a given gene is subject to complex regulation. That is, it might have to be transcribed at different times and in different places during development, or in response to different extracellular stimuli. In the present context (Drosophila embryogenesis) we have seen that the segmentation genes are expressed according to their position in the embryo. An example is the even-skipped (eve) gene, a pair-rule gene that is transcribed in alternate embryonic parasegments to generate a zebra pattern of seven stripes. The transcriptional state of the eve gene - either ON or OFF according to which parasegment we are in - is under the control of a series of enhancer elements, one for each stripe. Each enhancer element contains binding sites for upstream segmentation gene products such as Bicoid and Kruppel (which, as you will recall, are themselves transcription factors). Thus the particular constellation of maternal effect genes, gap genes and other pair-rule genes that is expressed in a given parasegment determines whether or not one of the enhancer elements is fully occupied and consequently whether eve gene transcription is activated or not in that parasegment. The specificity of the enhancers can be demonstrated by removing just one of them (the element that specifies stripe 2, say) and inserting it upstream of a reporter gene like the bacterial beta-galactosidase (Bgal). When this construct is introduced into the embryo, Bgal is expressed in just one stripe - in the position of eve stripe 2. Alternatively, particular enhancer elements can be deleted from the normal eve gene, resulting in deletion of the corresponding stripes of eve expression. Such a mutation - loss of an enhancer element - is called a regulatory mutation. It affects the spatial or temporal regulation of the gene without causing universal loss of the gene product.

The upper part of this diagram shows part of the regulatory region of the eve gene - the part closest to the transcription start site (red arrow). In this region there are three enhancer elements that control eve expression in stripes 2, 3 and 7. Other elements, not shown in the diagram, lie further away from the transcription start site (to the left) and control eve expression in stripes 1, 4, 5 and 6. Deleting these upstream elements and leaving only those shown in the diagram gives rise to the expression pattern shown above right - stripes 2, 3 and 7. Taking just one of the elements, that for stripe 2, and linking it to a reporter gene gives the pattern shown above centre - stripe 2 only. The wild-type eve expression pattern is shown above left. (From Gilbert, Developmental Biology, 4th edition, Chapter 15, p549, and Alberts et al., Molecular Biology of the Cell, 3rd edition, Chapter 9, p428)

A similar thing goes for regulation of the Bithorax Complex. Transcription of Ubx, abd-A and Abd-B is controlled by a series of enhancer elements, each of which is specific for a particular parasegment. Loss of one of the Ubx enhancers will result in loss of Ubx expression from the corresponding parasegment and transformation of that parasegment. The BX-C mutations picked up by Lewis in his genetic screen (bx, pbx, bxd, iab2 etc.) turn out to be regulatory mutations of this sort. For example, bxd controls Ubx expression in A1; mutations of bxd result in loss of Ubx expression in A1 and homeotic transformation of A1 to T3. Thus, transformations of segment identity result from subtle, parasegment-specific alterations to the expression patterns of Ubx, abd-A or Abd-B, not complete loss of their encoded proteins.

New technology will show how RNA regulates gene activity

Figure 1. Gene expression. (a) The DNA sequence of a gene is copied to make an RNA molecule (transcription), then the RNA sequence is decoded (translated) to build a protein molecule. (b) Inside a cell nucleus, a DNA molecule is packaged by special proteins in chromatin, which makes up a chromosome. Credit: Genetic Code Table, and Diagram of Chromatin from the Wiring Diagram Database; Moscow Institute of Physics and Technology

The discovery of a huge number of long non-protein coding RNAs, aka lncRNAs, in the mammalian genome was a major surprise of the recent large-scale genomics projects. An international team including a bioinformatician from the Research Center of Biotechnology of the Russian Academy of Sciences, and the Moscow Institute of Physics and Technology has developed a reliable method for assessing the role of such RNAs. The new technique and the data obtained with it allow generating important hypotheses on how chromatin is composed and regulated, as well as identifying the specific functions of lncRNAs.

Presented in Nature Communications, the technology is called RADICL-seq and enables comprehensive mapping of each RNA, captured while interacting with all the genomic regions that it targets, where many RNAs are likely to be important for genome regulation and structure maintenance.

RNA and gene regulation

It was previously believed that RNA functions mostly as an intermediary in building proteins based on a DNA template (fig. 1a), with very rare exceptions such as ribosomal RNAs. However, with the development of genomic analysis, it turned out that not all DNA regions encode RNA, and not all transcribed RNA encodes proteins.

Although the number of noncoding RNAs and those that encode proteins is about the same, the function of most noncoding RNA is still not entirely clear.

Every type of cell has its own set of active genes, resulting in the production of specific proteins. This makes a brain cell different from a blood cell of the same organism—despite both sharing the same DNA. Scientists are now coming to the conclusion that RNA is one of the factors that determine which genes are expressed, or active.

Long noncoding RNAs are known to interact with chromatin—DNA tightly packaged with proteins (fig. 1b). Chromatin has the ability to change its conformation, or "shape," so that certain genes are either exposed for transcription or concealed. Long noncoding RNAs contribute to this conformation change and the resulting change in gene activity by interacting with certain chromatin regions. To understand the regulatory potential of RNA—in addition to it being a template for protein synthesis—it is important to know which chromatin region any given RNA interacts with.

RNAs interact with chromatin inside the cell nucleus by binding to chromatin-associated proteins that fold a DNA molecule. There are several technologies that can map such RNA-chromatin interactions. However, all of them have significant limitations. They tend to miss interactions, or require a lot of input material, or disrupt the nuclear structure.

To address these shortcomings, a RIKEN-led team has presented a new method: RNA and DNA Interacting Complexes Ligated and Sequenced, or RADICL-seq for short. The technique produces more accurate results and keeps the cells intact up until the RNA-chromatin contacts are ligated.

The main idea of the RADICL-seq method is as follows. First, formaldehyde is used to crosslink the RNA to the proteins located close to it in the cell nucleus. Then, the DNA is cut into pieces by digesting it with a special protein. After that, RNase H treatment reduces the ribosomal RNA content, which increases the accuracy of the result. Next, a bridge adapter (a molecule with single-stranded and double-stranded ends) is used to ligate the proximal DNA and RNA (fig. 2a). After the crosslinks are reversed, the RNA-adapter-DNA chimera is converted to double-stranded DNA for sequencing (fig. 2b), which reveals the sequence of the ligated RNA and DNA.

  • Figure 2. Reactions in the cell nucleus (a): The RNA is depicted in red, with the DNA shown as black double lines and a linking molecule in dark blue. The light blue circles stand for proteins. The black dot is a molecule that allows the DNA-RNA hybrid to be singled out in solution. Explanations are provided in the text. Reactions in solution (b): protein removal, second strand synthesis, clipping to predetermined size, addition of sequences used for recognition, and sequencing. Credit: Alessandro Bonetti et al./Nature Communications. Credit: Moscow Institute of Physics and Technology
  • Figure 3. Noncoding RNA-chromatin interaction patterns: Neat1 (a, b) and Fgfr2 (c, d) in mouse embryonic stem cells (mESC) and mouse oligodendrocyte progenitor cells (mOPC). Neat1 and Fgfr2 come from chromosomes 18 and 7, respectively. Credit: Alessandro Bonetti et al./Nature Communications

Decoding the noncoding

In comparison with other existing methods, RADICL-seq mapped RNA-chromatin interactions with a higher accuracy. Moreover, the superior resolution of the technology allowed the team to detect chromatin interactions not only with the noncoding but also with the coding RNAs, including those found far from their transcription locus. The research confirmed that long noncoding RNAs play an important role in the regulation of gene expression occurring at a considerable distance from the regulated gene.

This technology can also be used to study cell type-specific RNA-chromatin interactions. The scientists demonstrated this by looking at two noncoding RNAs in mouse cells, one of them possibly associated with schizophrenia. They found that the interaction patterns between chromatin and those RNAs in two different cell types (embryonic stem cells and oligodendrocyte progenitor cells) correlated with preferential gene expression in those cell types (fig. 3).

The new method's flexibility means scientists can gather additional biological information by modifying the experiment. In particular, this technology can make it possible to identify direct RNA-DNA interactions not mediated by chromatin proteins. The analysis performed by bioinformaticians from the Research Center of Biotechnology and MIPT showed that not only the standard double helix interactions between DNA and RNA but also those involving RNA-DNA triplexes could participate in gene regulation. Also, such interactions highlight the significance of noncoding RNA in protein targeting to particular gene loci.

"We are planning to conduct further research on the role of RNA in the regulation of gene expression, chromatin remodeling, and ultimately, cell identity. Hopefully, we will be able to regulate genes by using these noncoding RNAs in the near future. This can be especially helpful for treating diseases," says Yulia Medvedeva, who leads the Regulatory Transcriptomics and Epigenomics group at the Research Center of Biotechnology, RAS, and heads the Lab of Bioinformatics for Cell Technologies at MIPT. She also manages the grant project supported by the Russian Science Foundation, which co-funded the study.

Where does transcription occur?

Transcription takes place in the cell’s nucleus and starts when an enzyme called RNA polymerase binds to the section of DNA it needs and opens the double helix. RNA polymerase binds at an area called the promoter, which is found a short distance “upstream” from the gene itself. The promoter sequence is found on one strand only – this indicates not only where to start transcribing but also which strand of DNA to use.

RNA polymerase starts to build a strand of mRNA using the DNA as a template. Thus, the DNA strand being used is called the template strand and the strand not used is called the coding strand (which contains the gene itself). The mRNA is made using complementary base pairing. As the DNA strand is unwound and its bases exposed, the corresponding RNA base is put in place by RNA polymerase. Adenine always pairs with thymine (or uracil, in the case of RNA), and cytosine always pairs with guanine. So, if the DNA template strand says A T C G A T C G, the RNA will read U A G C U A G C.
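
The pairing rule above can be sketched in a few lines of Python. This is only an illustration, not part of the original article: the function names `transcribe_template` and `coding_strand` are hypothetical, and both strands are written position by position in the same left-to-right order, with 5′/3′ direction ignored.

```python
# Pairing rules from the text: DNA template base -> RNA base.
DNA_TO_RNA = str.maketrans("ATCG", "UAGC")
# DNA template base -> base on the opposite (coding) DNA strand.
DNA_TO_DNA = str.maketrans("ATCG", "TAGC")


def transcribe_template(template: str) -> str:
    """Return the mRNA that pairs with the given template strand."""
    return template.translate(DNA_TO_RNA)


def coding_strand(template: str) -> str:
    """Return the DNA strand that pairs with the template strand."""
    return template.translate(DNA_TO_DNA)


template = "ATCGATCG"
mrna = transcribe_template(template)
print(mrna)  # UAGCUAGC, matching the example in the text

# The mRNA has the same sequence as the coding strand, with U in place of T.
print(mrna == coding_strand(template).replace("T", "U"))  # True
```

The second check shows why the unused strand is called the coding strand: apart from U replacing T, it reads the same as the mRNA.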

RNA polymerase continues to build the strand of mRNA until it finds a terminator sequence at the end of the gene. The enzyme then leaves the DNA strand, and is now free to transcribe another gene.

Some mRNA strands need modifications before they can leave the nucleus: 1) a “cap” is put on one end, 2) a string of about 200 adenines is added to the other end, and 3) noncoding sequences, called introns, are removed. Once these modifications are complete, the mRNA is ready to head out into the cell for translation.

There is little spell-checking or quality control during transcription, so errors do occur. However, mRNA and the proteins made from it are easily broken down and remade if they are incorrect the first time.

I. What is Sickle Cell Anemia?

A gene is a segment of DNA that codes for a protein or a trait. Genes can be any length and sometimes involve multiple sections of DNA. The HBB gene provides instructions for making a protein called beta-globin, which is part of a large protein called hemoglobin that is found in red blood cells. Each hemoglobin protein can carry four molecules of oxygen, which is delivered to the body's organs and tissues.

If a person doesn't have enough red blood cells or the cells don't work properly, organs can become deprived of oxygen. This condition is called anemia. A person with anemia may feel tired all the time, experience difficulty with breathing, leg cramps, and dizziness.

There is one type of anemia that is related to the shape of the hemoglobin protein. When a person has sickle cell anemia, the hemoglobin protein forms long chains that change the shape of the red blood cell. Instead of a disc-shaped structure that moves easily through blood vessels, sickled blood cells are shaped like bananas. They have a sickled shape because the underlying gene has the wrong instructions. These misshapen blood cells get clogged in vessels and don't have the life expectancy of normal blood cells. A person with sickle cell disease will experience fatigue (feeling tired) and have episodes of extreme pain, called a pain crisis. Sickled blood cells that block vessels in the brain can even cause stroke.

Sickle cell anemia is a life-threatening disease that affects about 100,000 Americans. It is an inherited disease that is passed from parents to their children, but parents can be carriers of the gene and not have any symptoms. If both parents are carriers, each of their children has a 25% chance of having sickle cell anemia.

1. What is a gene? _________________________________________

2. What is hemoglobin? _________________________________________

3. How is a sickled blood cell different from a normal one? __________________________

4. Why are the blood cells the wrong shape? ___________________________

5. What are the symptoms of sickle cell anemia? ___________________________

6. What is a carrier? ______________________________

II. How DNA Makes Protein

Recall that DNA contains four bases: Adenine, Guanine, Cytosine, and Thymine. The sequence of A's, T's, G's, and C's is what determines the protein that is built. Each set of three bases codes for a single amino acid. Proteins are simply chains of amino acids. To make proteins, DNA must send its code sequence to the ribosomes in the cell, but it needs a messenger to do that. Transcription is the process where the DNA sequence is copied into a molecule of messenger RNA (mRNA). The mRNA is then used to build a protein like hemoglobin.


To determine the amino acid sequence of the gene, you must transcribe the DNA to RNA. The base pair rule is used to create RNA, but RNA does not contain thymine; it contains URACIL instead. This is why codon charts have U's in them and no T's.

A codon chart tells you what bases in RNA code for what amino acids. The ribosome combines all the amino acids to create a single protein, like hemoglobin. It takes three bases to determine one amino acid. Amino acids are usually abbreviated. GUC makes the amino acid valine, abbreviated as "Val" on the chart.

Here is how the codon chart could be used to determine the amino acid sequence:

DNA: A A T C A G → (DNA sequence of gene)
RNA: U U A G U C (transcribed from DNA following the base pair rule, no T's)
Amino Acids: Leu Val (find on codon chart)
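
The same lookup can be sketched in Python. This is an illustration rather than part of the worksheet: the function name `dna_to_amino_acids` is hypothetical, and `CODON_CHART` is deliberately partial, holding only the two codons from the example plus the three stop codons (a full chart has 64 entries).

```python
# Partial codon chart: only the codons needed for this example, plus stop codons.
CODON_CHART = {
    "UUA": "Leu",
    "GUC": "Val",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}

# Base pair rule for transcription: DNA base -> RNA base (no T's in RNA).
PAIRING = str.maketrans("ATCG", "UAGC")


def dna_to_amino_acids(dna: str) -> list[str]:
    """Transcribe DNA by base pairing, then read the RNA three bases at a time."""
    rna = dna.replace(" ", "").upper().translate(PAIRING)
    usable = len(rna) - len(rna) % 3  # ignore any incomplete trailing codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    # Codons missing from the partial chart are reported as "?".
    return [CODON_CHART.get(codon, "?") for codon in codons]


print(dna_to_amino_acids("A A T C A G"))  # ['Leu', 'Val']
```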

7. In order to make a protein, the message on DNA must be converted to what? ___________

8. How many bases in DNA are needed to code for a single amino acid? _________

9. What is a protein? _________________________________________________

10. What base is found in RNA, but not DNA? ________

11. Consider the sequence shown, determine the complementary RNA and the amino acids

Amino Acids

III. A Change in DNA Can Change the Protein

Sometimes, one of the letters in DNA gets switched with another letter, causing a mutation in the DNA. Many mutations don't have any effects, but some will change the amino acid made by the ribosomes. In the case of sickle cell anemia, just a single letter change alters the shape of the hemoglobin protein.
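
A small Python sketch can show the effect of a single-letter change. It assumes the worksheet's convention that the DNA shown is transcribed by base pairing. The codons used are the commonly cited change behind sickle cell hemoglobin (glutamic acid replaced by valine), not values taken from the worksheet's own table. The function name `amino_acid` and the three-entry chart are hypothetical.

```python
# Base pair rule for transcription: DNA base -> RNA base.
PAIRING = str.maketrans("ATCG", "UAGC")

# Partial codon chart: just the three codons used below.
CODON_CHART = {"GAG": "Glu", "GAA": "Glu", "GUG": "Val"}


def amino_acid(dna_codon: str) -> str:
    """Transcribe one three-base DNA codon and look up its amino acid."""
    return CODON_CHART[dna_codon.translate(PAIRING)]


normal = "CTC"  # transcribed to GAG
sickle = "CAC"  # middle base changed: transcribed to GUG
silent = "CTT"  # third base changed: transcribed to GAA

print(amino_acid(normal))  # Glu
print(amino_acid(sickle))  # Val
print(amino_acid(silent))  # Glu
```

The middle-base change swaps the amino acid, while the third-base change leaves it the same because more than one codon can code for one amino acid.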

12. Use the codon chart to determine the amino acids created from each DNA.

13. What codon in the sickle cell DNA is altered? ___________________ (1st, 2nd, or 3rd)

14. What happens in people that have this difference in their DNA? ____________________

15. Explain how it would be possible to have a change in a single base of DNA, but have the protein NOT change and be functional. Hint: look at the codon chart.

TED-Ed Video on Sickle Cell (

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Before we dive straight into the code let’s do a brief overview of the inputs, outputs, and algorithm for converting a template DNA strand into an RNA strand.

Input (dna_strand)

Our input is a 5′ to 3′ DNA strand. In programming terms this is a String consisting of the characters ‘A’, ‘C’, ‘T’, and ‘G’.

Output (rna_strand)

Our output is a 3′ to 5′ RNA strand. That is, we will have a String consisting of the characters ‘A’, ‘C’, ‘U’, and ‘G’.


This algorithm simply goes through each nucleotide (character) in our DNA strand and replaces it with the appropriate base for the RNA strand. For those familiar with Python this code will look very straightforward, and will even work in Python with slight modifications. Below I will give a Python implementation and a Javascript implementation of this algorithm.
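
The original listings are not reproduced here, so the following is only a minimal Python sketch of the algorithm as described. The function name `transcribe` and the choice to raise `ValueError` on unexpected characters are assumptions; `dna_strand` and `rna_strand` follow the input and output names given above.

```python
def transcribe(dna_strand: str) -> str:
    """Replace each DNA nucleotide with its pairing RNA base."""
    pairing = {"A": "U", "T": "A", "C": "G", "G": "C"}
    rna_bases = []
    for nucleotide in dna_strand:
        # Assumption: reject anything other than A, C, T, or G.
        if nucleotide not in pairing:
            raise ValueError(f"Unexpected character in DNA strand: {nucleotide!r}")
        rna_bases.append(pairing[nucleotide])
    return "".join(rna_bases)


rna_strand = transcribe("ACTG")
print(rna_strand)  # UGAC
```

Because each character is replaced in place, the result lines up position by position with the input, which is why a 5′ to 3′ template gives an RNA string that reads 3′ to 5′. Reversing the string (for example with `rna_strand[::-1]`) would give the conventional 5′ to 3′ reading.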

Transcription of DNA

Transcription describes the process by which the genetic information contained within DNA is re-written into messenger RNA (mRNA) by RNA polymerase. This mRNA then exits the nucleus and provides the template for translation into protein. By controlling the production of mRNA in the nucleus, the cell regulates the rate of gene expression.

In this article we will look at the process of DNA transcription, including the post-transcriptional modification of mRNA and its importance.
