The increasing complexity of eukaryotic organisms was thought to arise from an increasing number of genes. There is about one transcription factor for every 20 genes in yeast, but one for every ten in humans.
In simple eukaryotes, cis-regulatory elements include the promoter (the TATA box region), plus upstream regulatory sequences (enhancers) and silencers about 100-200 base pairs from the promoter. In more complex eukaryotic species like humans, the promoter is more complex, containing the TATA box, initiator sequences (INR) and downstream promoter elements (DPE). Upstream cis-regulatory elements (as far as 10 kb from the promoter) include multiple enhancers, silencers, and insulators. Many promoters have TATA boxes, where TATA Binding Protein (TBP) binds. Upstream elements in turn regulate the binding of TBP.
- Eukaryotic promoters and regulatory regions
- Eukaryotic multisubunit general transcription apparatus
- Biological Regulation: BioBase - Gene Regulation (TRANS-FAC 6.0 public site free with registration)
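The promoter description above mentions the TATA box as the site where TBP binds. As a rough illustration of how such a cis element can be located in a sequence, here is a minimal sketch that scans a DNA string for the commonly quoted TATA consensus TATAWAW (W = A or T). The consensus choice, the function name `find_tata_boxes`, and the example sequence are assumptions for illustration; real promoter prediction uses position weight matrices and positional context rather than a single pattern.

```python
import re
from typing import List

# Assumed consensus: TATA, then A or T, then A, then A or T (TATAWAW).
# The lookahead lets overlapping matches be reported.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence: str) -> List[int]:
    """Return 0-based start positions of TATAWAW matches in a DNA sequence."""
    return [m.start() for m in TATA_CONSENSUS.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical promoter fragment, not taken from any real gene.
    fragment = "GGGTATAAAAGGG"
    print(find_tata_boxes(fragment))
```

A match only says that the sequence resembles the consensus; whether TBP actually binds there depends on the upstream elements and chromatin context described in the text.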
Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates
Genome size and complexity vary tremendously among eukaryotic species and their organelles. Comparisons across deeply divergent eukaryotic lineages have suggested that variation in mutation rates may explain this diversity, with increased mutational burdens favoring reduced genome size and complexity. The discovery that mitochondrial mutation rates can differ by orders of magnitude among closely related angiosperm species presents a unique opportunity to test this hypothesis. We sequenced the mitochondrial genomes from two species in the angiosperm genus Silene with recent and dramatic accelerations in their mitochondrial mutation rates. Contrary to theoretical predictions, these genomes have experienced a massive proliferation of noncoding content. At 6.7 and 11.3 Mb, they are by far the largest known mitochondrial genomes, larger than most bacterial genomes and even some nuclear genomes. In contrast, two slowly evolving Silene mitochondrial genomes are smaller than average for angiosperms. Consequently, this genus captures approximately 98% of known variation in organelle genome size. The expanded genomes reveal several architectural changes, including the evolution of complex multichromosomal structures (with 59 and 128 circular-mapping chromosomes, ranging in size from 44 to 192 kb). They also exhibit a substantial reduction in recombination and gene conversion activity as measured by the relative frequency of alternative genome conformations and the level of sequence divergence between repeat copies. The evolution of mutation rate, genome size, and chromosome structure can therefore be extremely rapid and interrelated in ways not predicted by current evolutionary theories. Our results raise the hypothesis that changes in recombinational processes, including gene conversion, may be a central force driving the evolution of both mutation rate and genome structure.
Conflict of interest statement
The authors have declared that no competing interests exist.
Figure 1. Sequence divergence, genome size, and gene content in seed plant mitochondria.
Figure 2. Levels of synonymous (dS) and nonsynonymous (dN)…
Figure 3. Number of indels in mitochondrial protein genes and introns that are unique to…
Figure 4. Protein and RNA gene content in sequenced seed plant mitochondrial genomes.
Figure 5. Size distribution of repetitive content by the number of repeat pairs (left column)…
Figure 6. Repeat-mediated recombinational activity in the low mutation rate S. latifolia and S. vulgaris…
Figure 7. Distribution of percent sequence identity between pairs of repeats detected by BLAST.
Figure 8. Silene mitochondrial genome sizes relative to all sequenced mitochondrial and eubacterial genomes from…
Dataset complexity impacts both MOTU delimitation and biodiversity estimates in eukaryotic 18S rRNA metabarcoding studies
How does the evolution of bioinformatics tools impact the biological interpretation of high-throughput sequencing datasets? For eukaryotic metabarcoding studies, in particular, researchers often rely on tools originally developed for the analysis of 16S ribosomal RNA (rRNA) datasets. Such tools do not adequately account for the complexity of eukaryotic genomes, the ubiquity of intragenomic variation in eukaryotic metabarcoding loci, or the differential evolutionary rates observed across eukaryotic genes and taxa. Recently, metabarcoding workflows have shifted away from the use of Operational Taxonomic Units (OTUs) towards delimitation of Amplicon Sequence Variants (ASVs). We assessed how the choice of bioinformatics algorithm impacts the downstream biological conclusions that are drawn from eukaryotic 18S rRNA metabarcoding studies. We focused on four workflows including UCLUST and VSearch algorithms for OTU clustering, and DADA2 and Deblur algorithms for ASV delimitation. We used two 18S rRNA datasets to further evaluate whether dataset complexity had a major impact on the statistical trends and ecological metrics: a “high complexity” (HC) environmental dataset generated from community DNA in Arctic marine sediments, and a “low complexity” (LC) dataset representing individually-barcoded nematodes. Our results indicate that ASV algorithms produce more biologically realistic metabarcoding outputs, with DADA2 being the most consistent and accurate pipeline regardless of dataset complexity. In contrast, OTU clustering algorithms inflate the metabarcoding-derived estimates of biodiversity, consistently returning a high proportion of “rare” Molecular Operational Taxonomic Units (MOTUs) that appear to represent computational artifacts and sequencing errors. However, species-specific MOTUs with high relative abundance are often recovered regardless of the bioinformatics approach. 
We also found high concordance across pipelines for downstream ecological analysis based on beta-diversity and alpha-diversity comparisons that utilize taxonomic assignment information. Analyses of LC datasets and rare MOTUs are especially sensitive to the choice of algorithms and better software tools may be needed to address these scenarios.
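The distinction between threshold-based OTU clustering and exact sequence variants can be illustrated with a toy sketch. The code below is not UCLUST, VSearch, DADA2, or Deblur. It assumes equal-length reads, a simple positional identity, and a 97% threshold (a common convention, assumed here and not stated in the abstract), and it uses plain dereplication in place of the error modelling that ASV tools perform. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity expects equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads: List[str], threshold: float = 0.97) -> List[List[str]]:
    """Greedy centroid clustering.

    Each read joins the first cluster whose centroid (its first member)
    it matches at >= threshold identity; otherwise it founds a new cluster.
    """
    clusters: List[List[str]] = []
    for read in reads:
        for cluster in clusters:
            if identity(read, cluster[0]) >= threshold:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def exact_variants(reads: List[str]) -> Dict[str, int]:
    """Count each distinct sequence (dereplication only, no error model)."""
    return dict(Counter(reads))


if __name__ == "__main__":
    base = "ACGT" * 25                 # hypothetical 100-nt read
    one_mismatch = "T" + base[1:]      # 99% identical to base
    unrelated = "T" * 100              # 25% identical to base
    reads = [base, base, one_mismatch, unrelated]
    print(len(greedy_otus(reads)), "clusters at 97% identity")
    print(len(exact_variants(reads)), "distinct sequences")
```

On this read set the greedy clustering merges the single-mismatch variant into the abundant sequence, while dereplication keeps it separate. Whether such a variant is real biology or sequencing error is exactly what this toy cannot decide, and is what the error models in ASV pipelines are meant to address.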
The Secret Role Histones Played in Complex Cell Evolution
By managing gene expression in complex cells, octets of histone proteins helped to enable the explosive diversity of eukaryotic life. Illustration: Jason Lyon/Quanta Magazine
Molecular biology has something in common with kite-flying competitions. At the latter, all eyes are on the colorful, elaborate, wildly kinetic constructions darting through the sky. Nobody looks at the humble reels or spools on which the kite strings are wound, even though the aerial performances depend on how skillfully those reels are handled. In the biology of complex cells, or eukaryotes, the ballet of molecules that transcribe and translate genomic DNA into proteins holds centerstage, but that dance would be impossible without the underappreciated work of histone proteins gathering up the DNA into neat bundles and unpacking just enough of it when needed.
Histones, as linchpins of the apparatus for gene regulation, play a role in almost every function of eukaryotic cells. “In order to get complex, you have to have genome complexity, and evolve new gene families, and you have to have a cell cycle,” explained William Martin, an evolutionary biologist and biochemist at Heinrich Heine University in Germany. “And what’s in the middle of all this? Managing your DNA.”
Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.
New work on the structure and function of histones in ancient, simple cells has now made the longstanding, central importance of these proteins to gene regulation even clearer. Billions of years ago, the cells called archaea were already using histones much like our own to manage their DNA—but they did so with looser rules and much more variety. From those similarities and differences, researchers are gleaning new insights, not only into how the histones helped to shape the origins of complex life, but also into how variants of histones affect our own health today. At the same time, though, new studies of histones in an unusual group of viruses are complicating the answers about where our histones really came from.
Eukaryotes arose about 2 billion years ago, when a bacterium that could metabolize oxygen for energy took up residence inside an archaeal cell. That symbiotic partnership was revolutionary because energy production from that proto-mitochondrion suddenly made expressing genes much more metabolically affordable, Martin argues. The new eukaryotes suddenly had free rein to expand the size and diversity of their genomes and to conduct myriad evolutionary experiments, laying the foundation for the countless eukaryotic innovations seen in life today. “Eukaryotes are an archaeal genetic apparatus that survives with the help of bacterial energy metabolism,” Martin said.
But the early eukaryotes went through serious growing pains as their genomes expanded: The larger genome brought new problems stemming from the need to manage an increasingly unwieldy string of DNA. That DNA had to be accessible to the cell’s machinery for transcribing and replicating it without getting tangled up in a hopeless spaghetti ball.
The DNA also sometimes needed to be compact, both to help regulate transcription and replication, and to separate the identical copies of DNA during cell division. And one danger of careless compaction is that DNA strands can irreversibly bind together if the backbone of one interacts with the groove of another, rendering the DNA useless.
Bacteria have a solution for this that involves a variety of proteins jointly “supercoiling” the cells’ relatively limited libraries of DNA. But eukaryotes’ DNA management solution is to use histone proteins, which have a unique ability to wrap DNA around themselves rather than just sticking to it. The four primary histones of eukaryotes—H2A, H2B, H3 and H4—assemble into octamers with two copies of each. These octamers, called nucleosomes, are the basic units of eukaryotic DNA packaging.
By curving the DNA around the nucleosome, the histones prevent it from clumping together and keep it functional. It’s an ingenious solution—but eukaryotes didn’t invent it entirely on their own.
Back in the 1980s, when the cellular and molecular biologist Kathleen Sandman was a postdoc at Ohio State University, she and her adviser, John Reeve, identified and sequenced the first known histones in archaea. They showed how the four principal eukaryotic histones were related to each other and to the archaeal histones. Their work provided the early evidence that in the original endosymbiotic event that led to eukaryotes, the host was likely to have been an archaeal cell.
But it would be a teleological mistake to think that archaeal histones were just waiting for the arrival of eukaryotes and the chance to enlarge their genomes. “A lot of these early hypotheses looked at histones in terms of their ability to allow the cell to expand its genome. But that doesn’t really tell you why they were there in the first place,” said Siavash Kurdistani, a biochemist at the University of California, Los Angeles.
As a first step toward those answers, Sandman joined forces several years ago with the structural biologist Karolin Luger, who solved the structure of the eukaryotic nucleosome in 1997. Together, they worked out the crystallized structure of the archaeal nucleosome, which they published with colleagues in 2017. They found that the archaeal nucleosomes are “uncannily similar” in structure to eukaryotic nucleosomes, Luger said—despite the marked differences in their peptide sequences.
Archaeal nucleosomes had already “figured out how to bind and bend DNA in this beautiful arc,” said Luger, now a Howard Hughes Medical Institute investigator at the University of Colorado, Boulder. But the difference between the eukaryotic and archaeal nucleosomes is that the crystal structure of the archaeal nucleosome seemed to form looser, Slinky-like assemblies of varying sizes.
In a paper in eLife published in March, Luger, her postdoc Samuel Bowerman, and Jeff Wereszczynski of the Illinois Institute of Technology followed up on the 2017 paper. They used cryo-electron microscopy to solve the structure of the archaeal nucleosome in a state more representative of a live cell. Their observations confirmed that the structures of archaeal nucleosomes are less fixed. Eukaryotic nucleosomes are always stably wrapped by about 147 base pairs of DNA, and always consist of just eight histones. (For eukaryotic nucleosomes, “the buck stops at eight,” Luger said.) Their equivalents in archaea wind up between 60 and 600 base pairs. These “archaeasomes” sometimes hold as few as three histone dimers, but the largest ones consist of as many as 15 dimers.
They also found that unlike the tight eukaryotic nucleosomes, the Slinky-like archaeasomes flop open stochastically, like clamshells. The researchers suggested that this arrangement simplifies gene expression for the archaea, because unlike eukaryotes, they don’t need any energetically expensive supplemental proteins to help unwind DNA from the histones to make them available for transcription.
That’s why Tobias Warnecke, who studies archaeal histones at Imperial College London, thinks that “there’s something special that must have happened at the dawn of eukaryotes, where we transition from just having simple histones … to having octameric nucleosomes. And they seem to be doing something qualitatively different.”
What that is, however, is still a mystery. In archaeal species, there are “quite a few that have histones, and there are other species that don’t have histones. And even those that do have histones vary quite a lot,” Warnecke said. Last December, he published a paper showing that there are diverse variants of histone proteins with different functions. The histone-DNA complexes vary in their stability and affinity for DNA. But they are not as stably or regularly organized as eukaryotic nucleosomes.
As puzzling as the diversity of archaeal histones is, it provides an opportunity to understand the different possible ways of building systems of gene expression. That’s something we cannot glean from the relative “boringness” of eukaryotes, Warnecke says: Through understanding the combinatorics of archaeal systems, “we can also figure out what’s special about eukaryotic systems.” The variety of different histone types and configurations in archaea may also help us deduce what they might have been doing before their role in gene regulation solidified.
Because archaea are relatively simple prokaryotes with small genomes, “I don’t think that the original role of histones was to control gene expression, or at least not in a manner that we are used to from eukaryotes,” Warnecke said. Instead, he hypothesizes that histones might have protected the genome from damage.
Archaea often live in extreme environments, like hot springs and volcanic vents on the seafloor, characterized by high temperatures, high pressures, high salinity, high acidity or other threats. Stabilizing their DNA with histones may make it harder for the DNA strands to melt in those extreme conditions. Histones also might protect archaea against invaders, such as phages or transposable elements, which would find it harder to integrate into the genome when it’s wrapped around the proteins.
Kurdistani agrees. “If you were studying archaea 2 billion years ago, genome compaction and gene regulation are not the first things that would come to mind when you are thinking about histones,” he said. In fact, he has tentatively speculated about a different kind of chemical protection that histones might have offered the archaea.
Last July, Kurdistani’s team reported that in yeast nucleosomes, there is a catalytic site at the interface of two histone H3 proteins that can bind and electrochemically reduce copper. To unpack the evolutionary significance of this, Kurdistani goes back to the massive increase in oxygen on Earth, the Great Oxidation Event, that occurred around the time that eukaryotes first evolved more than 2 billion years ago. Higher oxygen levels must have caused a global oxidation of metals like copper and iron, which are critical for biochemistry (although toxic in excess). Once oxidized, the metals would have become less available to cells, so any cells that kept the metals in reduced form would have had an advantage.
During the Great Oxidation Event, the ability to reduce copper would have been “an extremely valuable commodity,” Kurdistani said. It might have been particularly attractive to the bacteria that were forerunners of mitochondria, since cytochrome c oxidase, the last enzyme in the chain of reactions that mitochondria use to produce energy, requires copper to function.
Because archaea live in extreme environments, they might have found ways to generate and handle reduced copper without being killed by it long before the Great Oxidation Event. If so, proto-mitochondria might have invaded archaeal hosts to steal their reduced copper, Kurdistani suggests.
The hypothesis is intriguing because it could explain why the eukaryotes appeared when oxygen levels went up in the atmosphere. “There was 1.5 billion years of life before that, and no sign of eukaryotes,” Kurdistani said. “So the idea that oxygen drove the formation of the first eukaryotic cell, to me, should be central to any hypotheses that try to come up with why these features developed.”
Kurdistani’s conjecture also suggests an alternative hypothesis for why eukaryotic genomes got so big. The histones’ copper-reducing activity only occurs at the interface of the two H3 histones inside an assembled nucleosome wrapped with DNA. “I think there’s a distinct possibility that the cell wanted more histones. And the only way to do that was to expand this DNA repertoire,” Kurdistani said. With more DNA, cells could wrap more nucleosomes and enable the histones to reduce more copper, which would support more mitochondrial activity. “It wasn’t just that histones allowed for more DNA, but more DNA allowed for more histones,” he said.
Low-complexity sequences are extremely abundant in eukaryotic proteins for reasons that remain unclear. One hypothesis is that they contribute to the formation of novel coding sequences, facilitating the generation of novel protein functions. Here, we test this hypothesis by examining the content of low-complexity sequences in proteins of different age. We show that recently emerged proteins contain more low-complexity sequences than older proteins and that these sequences often form functional domains. These data are consistent with the idea that low-complexity sequences may play a key role in the emergence of novel genes.
Low-complexity regions (LCRs) are amino acid sequences that contain repeats of single amino acids or short amino acid motifs. They are extremely abundant in eukaryotic proteins (Green and Wang 1994; Golding 1999; Marcotte et al. 1999). In fact, the majority of proteins from a wide range of eukaryotic species show a significant tendency toward being more repetitive than expected given their amino acid composition (Alba, Tompa, et al. 2007). Many LCRs are highly unstable due to the action of replication slippage and recombination (Ellegren 2004), and the uncontrolled expansion of short sequence motifs causes several human diseases, including Huntington’s disease and other neurodegenerative disorders (Gatchel and Zoghbi 2005), as well as a number of developmental diseases (Brown and Brown 2004). The abundance of LCRs seems paradoxical given their high pathogenic potential. One hypothesis to explain their persistence is that they increase phenotypic variation within populations, facilitating adaptation (Kashi and King 2006). While many LCRs are of unknown function, there are examples of LCRs playing various functional roles, including the modulation of protein–protein interactions (Xiao and Jeang 1998), protein–nucleic acid interactions (Shen et al. 2004), and protein subcellular localization (Salichs et al. 2009). Expansion or contraction of LCRs can therefore potentially impact protein function.
An alternative hypothesis to explain the abundance of LCRs is that they facilitate the formation of novel coding sequences (Green and Wang 1994). Analysis of human family genotypes has shown that, when the repeats are short, they are more likely to expand than to contract (Xu et al. 2000), which favors the extension of initially short “seed” repeats into longer repeats. Accumulation of subsequent mutations may lead to the emergence of new useful protein functions. A more radical idea is that repetitive sequences are important for the generation of completely novel coding sequences. In the early 1980s, Ohno and Epplen proposed that the first protein-encoding sequences were probably highly repetitive, as expansion of repeated tracts was more likely to yield long polypeptide chains with no interrupting codons than when sequences had a random amino acid composition (Ohno and Epplen 1983; Ohno 1984). Inspired by this idea, we decided to test if recently emerged genes contain more LCRs than older genes. Although there have been some observations that point in this direction (Nishizawa et al. 1999; Alba and Castresana 2005), the question had not yet been examined in detail. To learn about the contribution of LCRs to protein function, we also quantified how many LCRs were located in already described protein domains and how many were located in regions not corresponding to domains.
To study the correlation of gene age and LCR content, we obtained three groups of human proteins that arose at different periods: “Mammalian” (∼300 to 100 Ma), “Vertebrate” (∼500 to 300 Ma), and “Old” (>500 Ma). For proteins containing hits to Pfam protein domains (Finn et al. 2008), the classification was based on the phylogenetic distribution of such domains, as determined by domain-specific hidden Markov model searches in different eukaryotic proteomes (see Materials and Methods). For proteins not containing hits to Pfam protein domains, the classification was based on database sequence similarity searches using BlastP (Altschul et al. 1997). We searched for LCRs in all classified proteins using the SEG algorithm, which identifies regions of biased composition enriched in one or a few amino acids (Wootton and Federhen 1996). The majority of proteins from each of the age classes contained at least one LCR (83.1% Old, 83.7% Vertebrate, and 87% Mammalian), confirming the strong pervasiveness of these sequences. However, in younger proteins, LCRs occupied a significantly larger fraction of the sequence than in older proteins (fig. 1). On average, the LCR content of Mammalian proteins was double the LCR content of proteins classified as Old (the values for all proteins can be found in supplementary file 1, Supplementary Material online). This relationship was maintained in proteins containing known domains and in proteins lacking them (supplementary table 1, Supplementary Material online), and consistent results were obtained using a different algorithm to measure sequence repetitiveness, SIMPLE (Alba, Laskowski, et al. 2002) (supplementary table 2, Supplementary Material online).
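The measurement described above, the fraction of a protein occupied by regions of biased composition, can be approximated with a sliding-window Shannon entropy. The sketch below is a simplified stand-in and not the SEG algorithm itself: it omits SEG's extension and refinement steps. The window length of 12 and the 2.2-bit cutoff are loosely modelled on commonly cited SEG trigger settings and should be read as assumptions, as should the function names.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits) of the residue composition of a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_fraction(protein: str, window: int = 12,
                            cutoff: float = 2.2) -> float:
    """Fraction of residues covered by at least one low-entropy window.

    A window is flagged when its composition entropy is <= cutoff bits.
    Proteins shorter than the window are reported as 0.0.
    """
    if len(protein) < window:
        return 0.0
    masked = [False] * len(protein)
    for start in range(len(protein) - window + 1):
        if shannon_entropy(protein[start:start + window]) <= cutoff:
            for i in range(start, start + window):
                masked[i] = True
    return sum(masked) / len(protein)


if __name__ == "__main__":
    # Hypothetical sequences: a polyglutamine tract and a maximally mixed one.
    print(low_complexity_fraction("Q" * 20))
    print(low_complexity_fraction("ACDEFGHIKLMNPQRSTVWY"))
```

Applied per protein and then summarized by age class, a measure of this kind yields the sort of distribution the box-plot in fig. 1 compares, although the published values come from SEG and SIMPLE, not from this approximation.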
Younger proteins are enriched in low-complexity sequences. Box-plot of the percentage of the protein composed of low-complexity sequences, for proteins of different age. The horizontal line represents the median. Number of proteins: 12,855 “Old”, 1,324 “Vertebrate”, and 420 “Mammalian”
Protists are one-celled eukaryotes. However, like every rule in biology, exceptions exist. Sometimes, various seaweeds are grouped with protists, even though they have many cells. The protists include a wide range of organisms, some of which are not particularly closely related. In fact, genetics reveals that protists consist of at least ten groups equivalent to kingdoms. To put this in perspective, all animals, from worms to humans, belong to a single kingdom. Examples of protists include amoebas, paramecia and kelp. All algae, except blue-green algae (now known as cyanobacteria), are eukaryotes.
Challenges in Classifying Biology
There are multiple ways to categorize nearly everything in biology. The value of an individual categorization scheme depends on the perspective of the user. As Shirley Malcom (Director, Education and Human Resources Programs, AAAS) once said, “We [Biologists] are splitters not lumpers.” This need for classification makes consensus difficult, can limit our way of thinking, and can make scientific findings sound more sensational than they really are. The increasing appreciation for the multifunctionality of biological entities at all scales reveals how difficult it is to put biology into exclusive and unambiguous categories. Furthermore, as multifunctionality becomes the norm, we should remember that it is the human in the role of researcher, clinician, scientific author, and reader of the scientific literature that needs classification to reveal the patterns and meaning in the complexity of living organisms. We should not let the current classification systems (ontologies) of genes, proteins, organelles, pathways, physiological systems, or even organisms limit how we think about and explore biology.
The protein beta catenin functions in cell adhesion complexes and in complexes that regulate gene expression. ubus12 – Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=27094098
A few examples at each scale of biology illustrate the challenges created by our human predilection for classification. Just given the name of a protein or the abbreviation of the gene that encodes it, the first questions are likely to be: What is it? What does this protein do? If the name includes a function, then there is a reasonable chance of assuming the function implied by the name is correct and relevant. Otherwise, the gene abbreviation or protein name is unlikely to be meaningful unless it is very common in medicine, like insulin, or unless you have studied that gene or protein. Many proteins are multifunctional. The protein β-catenin is a good example. When incorporated into the protein complexes that allow cells to form stable contacts with each other, it is part of a cellular adhesion complex and so has the function of mediating cell-cell adhesion. In response to certain external signals, β-catenin can move into the nucleus and regulate gene expression. Thus, it is a transcriptional regulator. Certainly, β-catenin should be categorized with both functions. What if each function is important in a different context? Both functions need to be captured, but somehow the context-specific details need to be included as well.
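One way to picture the requirement that both functions and their contexts be captured is a record that stores each function together with the context in which it applies, instead of a single category per protein. The sketch below is only an illustration of that idea: the class names, field names, and context labels are hypothetical and do not correspond to any existing ontology or database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FunctionAnnotation:
    """One function of a protein, tied to the context where it applies."""
    function: str
    context: str


@dataclass
class ProteinRecord:
    """A protein with any number of context-specific function annotations."""
    name: str
    annotations: List[FunctionAnnotation] = field(default_factory=list)

    def functions_in(self, context: str) -> List[str]:
        """Return the functions annotated for exactly this context label."""
        return [a.function for a in self.annotations if a.context == context]


# Hypothetical context labels paraphrasing the beta-catenin example.
beta_catenin = ProteinRecord(
    name="beta-catenin",
    annotations=[
        FunctionAnnotation("cell-cell adhesion", "adhesion complex"),
        FunctionAnnotation("transcriptional regulation", "nucleus"),
    ],
)

if __name__ == "__main__":
    print(beta_catenin.functions_in("nucleus"))
```

Keeping the context as a field of each annotation means a query for one setting does not silently return functions that only matter in another, which is the ambiguity the paragraph describes.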
The organization of proteins into distinct regulatory or biochemical pathways is also a human construction. Regulatory and signaling pathways are highly interconnected. Indeed, this must be true. Cells cannot move and divide at the same time, so the pathways controlling movement and division must be connected. Molecules previously considered as biochemical intermediates in metabolic pathways are becoming increasingly appreciated as regulators of signaling pathways and cellular behavior. How can this complexity in molecular function be captured in a useful way in a classification scheme?
Transmission electron micrograph of mitochondria. By Louisa Howard – http://remf.dartmouth.edu/imagesindex.html Public Domain via Wikipedia
Moving up in size, the organelles in a eukaryotic cell tend to be functionally defined according to the function first identified or most studied. For example, textbooks describe mitochondria as the cell’s powerhouse because they generate ATP, but mitochondria are also a source of reactive oxygen species and many kinds of intracellular signaling molecules, and they are a sink for calcium. So, what is the best way to classify mitochondrial function?
Moving even farther up in size, organs are classified into physiological systems—the cardiovascular system, the endocrine system, the musculoskeletal system and so on. An excellent example is bone. As the skeletal system, bones provide support, movement, and protection. However, bones are also part of the immune system: Bones are the site of blood cell production. Bones are part of the endocrine system: They release hormones that regulate appetite, fertility, and metabolism. Even the well-known and long-standing physiological categories fail to represent a true picture of the complex multifunctionality of the tissues and cells that comprise organ systems.
Hawaiian Bobtail squid. This squid has a symbiotic relationship with bioluminescent bacteria. Photo by Margaret McFall-Ngai – Divining the Essence of Symbiosis: Insights from the Squid-Vibrio Model, CC BY 4.0, via Wikipedia
Going all the way up to whole organisms, people, plants, and marine animals are each defined by a single species name. Yet people have microbiomes in their gut, mouth, skin, ears, eyes, and genitals; legumes have symbiotic fungi that are part of their root systems; and many bioluminescent marine animals have bacteria that provide the light. So, how should we classify these? They, indeed even we humans, are all metaorganisms: multiple species living in harmony.
Why does it matter if it is hard to classify biological information? Classification enables systems-level analysis of large data sets. Classification enables automation. Classification increases the ability to retrieve information from large data sets and enables the interpretation, discovery of new patterns, and acquisition of knowledge from large data sets. However, information acquired through use of classification schemes is only as good as the classification scheme, the consistency with which it is applied, and knowledge about its limitations.
Ideally, all functionally important information should be included whenever possible in the scientific literature. Furthermore, the relevant context-specific function(s) should be indicated when known. This need for context-specific information to ensure accuracy means that using text-mining and then applying an ontology that includes all functional classifications is not going to provide the necessary context-specific information. Automated classification is challenging and curation is necessary to ensure context-dependent accuracy. Thus, effective scientific communication relies on the author to provide the contextual details to ensure that the literature is accurate and precise, which makes biological findings as reproducible as possible.