I had a discussion with a colleague who told me internal transcribed spacers (ITSs) are not good for divergence time estimation:
- Because they are non-protein-coding genes;
- Because they belong to a gene family.
However, I was not able to find anything in the literature about that. Can someone please enlighten me?
Premise of the Study
The internal transcribed spacer (ITS) region is situated between 18S and 26S in a polycistronic rRNA precursor transcript. It has proved to be the most commonly sequenced region across plant species for resolving phylogenetic relationships ranging from shallow to deep taxonomic levels. Despite several taxonomical revisions in Cassiinae, a stable phylogeny remains elusive at the molecular level, particularly concerning the delineation of species in the genera Cassia, Senna and Chamaecrista. This study addresses the comparative potential of ITS datasets (ITS1, ITS2 and concatenated) in resolving the underlying morphological disparity in these highly complex genera, to assess their discriminatory power as potential barcode candidates in Cassiinae.
A combination of experimental data and an in-silico approach based on threshold genetic distances, sequence-similarity-based methods and hierarchical tree-based methods was used to assess the discriminating power of ITS datasets on 18 different species of the Cassiinae complex. Lab-generated sequences were compared against those available in GenBank using BLAST, aligned with MUSCLE 3.8.31, and analysed in PAUP 4.0 and BEAST 1.8 using parsimony ratchet, maximum likelihood and Bayesian inference (BI) methods of gene and species tree reconciliation with bootstrapping. The DNA barcoding gap was assessed based on the Kimura two-parameter (K2P) distance model in TaxonDNA and MEGA.
Based on the K2P distance, significant divergences between the inter- and intra-specific genetic distances were observed, and the presence of a DNA barcoding gap was obvious. The ITS1 region efficiently identified 81.63% and 90% of species using TaxonDNA and BI methods, respectively. The PWG-distance method, based on simple pairwise matching, indicated the significance of ITS1, which yielded the highest number of variable (210) and informative (206) sites. The BI tree-based methods outperformed the similarity-based methods, producing well-resolved phylogenetic trees with many nodes well supported by bootstrap analyses.
The reticulated phylogenetic hypothesis using the ITS1 region mainly supported the relationships between the species of Cassiinae established by traditional morphological methods. The ITS1 region showed higher discrimination power and more desirable characteristics than ITS2 and ITS1 + 2, and is therefore concluded to be the locus of choice. Considering the complexity of the group and the underlying biological ambiguities, the results presented here are encouraging for developing DNA barcoding as a useful tool for resolving taxonomical challenges in corroboration with the morphological framework.
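The K2P distance mentioned above can be illustrated with a minimal Python sketch. It assumes two pre-aligned sequences of equal length and skips any column containing a gap or ambiguous base; the function name `k2p_distance` is hypothetical and this is not the implementation used by TaxonDNA or MEGA.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns where either sequence has a gap or a non-ACGT symbol are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For very divergent sequences the logarithm argument can become non-positive, in which case the distance is undefined and `math.log` raises an error.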
Software design and operation
ITSx expects query sequences in the fasta format (Pearson & Lipman 1988), with or without gaps. There is no limit on the number of query sequences. The software first examines the sequences in the default orientation; the search is then repeated in the reverse complementary orientation to account for incorrectly cast sequences (cf. Nilsson et al. 2011). Reverse complementary sequences are logged, reoriented and treated in the correct orientation in all subsequent steps. Each sequence is examined for matches to the HMMs. If the multiple-processor option is activated, ITSx employs the number of processor cores (or physical/logical processors as applicable) specified by the user, such that the speed of the analysis will scale roughly linearly with the number of CPU cores. An index is built of all regions matched by the HMMs.
The extraction is based on the HMM index of each query. By default, the ITS1 and ITS2 will be extracted from the query sequences and saved as separate fasta files. The user can opt to also produce separate files for the SSU, 5.8S and LSU. In addition, fasta files containing only those entries with the entire ITS region, or with the entire ITS1 or ITS2 regions, can be generated. This feature supports, for example, predicting the ITS1 and ITS2 secondary structure, which should be performed on full-length sequences (Koetschan et al. 2010). The SSU is extracted as everything from the 5′ end of the query sequence to the 3′ end of SSU as indicated by the HMM match; the ITS1 is extracted from 1 bp downstream of the end of SSU to 1 bp upstream of the start of 5.8S, and so on. Partial extractions are supported. If, for example, only the 3′ end of 5.8S is detected, the ITS2 is extracted as everything downstream of that location. Various summary files are also written (see software documentation). A tab-separated file gives the start and stop positions for all markers in each query sequence. A log file records which query sequences, if any, were found to be reverse complementary. Additional, separate files record query sequences for which no HMMs were detected and query sequences for which the HMM matches occurred in an unexpected order. The open-source command line-based software is written in Perl, and it is freely available for Unix-type operating systems (including Mac OS X, Linux and BSD). Although distributed over the Internet (Item S1; http://microbiology.se/software/itsx/), the software does not require Internet access to run. Computer memory (RAM) of roughly 1.5 times the size of the input data set and free disc space corresponding to about 4–5 times the size of the query file are needed to run the software.
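The orientation check described above can be sketched in Python as follows. This is only an illustration of the idea, not ITSx code (ITSx itself is written in Perl): the `score` argument is a hypothetical stand-in for an HMM match score, and the function names are assumptions.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient(seq, score):
    """Try both orientations and keep the better-scoring one.

    `score` is any callable that rates how well a sequence matches the
    expected profile (hypothetical placeholder for an HMM search).
    Returns the chosen sequence and a flag telling whether it was reversed.
    """
    rc = reverse_complement(seq)
    if score(rc) > score(seq):
        return rc, True
    return seq, False
```

A toy usage example: with `score = lambda s: s.count("AACG")`, the call `orient("CGTT", score)` returns `("AACG", True)`, since the motif is only found after reverse complementing.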
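The coordinate-based extraction can be sketched as below. This is a minimal Python illustration assuming 1-based, inclusive positions for the flanking markers; the function name `extract_between` and its parameters are hypothetical and do not reflect the internals of ITSx.

```python
def extract_between(seq, upstream_end=None, downstream_start=None):
    """Extract the region lying strictly between two flanking markers.

    upstream_end     -- 1-based position of the last base of the upstream
                        marker (e.g. the 3' end of SSU), or None if that
                        marker was not detected.
    downstream_start -- 1-based position of the first base of the downstream
                        marker (e.g. the 5' end of 5.8S), or None if that
                        marker was not detected.

    When one flank is missing, the extraction is partial: everything up to,
    or everything after, the detected flank is returned.
    """
    # The region starts 1 bp downstream of the upstream marker,
    # which is 0-based index `upstream_end`.
    start = 0 if upstream_end is None else upstream_end
    # The region ends 1 bp upstream of the downstream marker,
    # so the exclusive 0-based slice end is `downstream_start - 1`.
    stop = len(seq) if downstream_start is None else downstream_start - 1
    return seq[start:stop]
```

For a toy sequence `"SSSSIIIFFF"` with the upstream marker ending at position 4 and the downstream marker starting at position 8, the call returns `"III"`.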
The sequenced region contained the 3′ end of the 5.8S gene, ITS2, and the 5′ end of the 28S gene. Direct sequencing of amplicons of 30 individuals (10 individuals per species) displayed intra-individual heterogeneities in all specimens analyzed. There are two kinds of heterogeneities: single nucleotide substitutions and mono-, bi- and multi-nucleotide insertions/deletions. The presence of heterogeneities was indicated by double peaks in substitution positions, and by a series of mixed peaks in case of indel events, both positioned after a sequence of good quality. Examples of heterogeneities revealed by direct sequencing are displayed in Figure 1.
Figure 1. Examples of results from direct sequencing of ITS2. (a) Example of polymorphism caused by an indel (the black arrow indicates the beginning position of the indel). (b) Example of single nucleotide substitutions (indicated by black arrows).
To elucidate the visible heterogeneity, the amplicons for 2 specimens of Polyommatus (Agrodiaetus) peilei, 2 specimens of Polyommatus (Agrodiaetus) karindus and one specimen of Polyommatus (Agrodiaetus) morgani were cloned and 10 clones per specimen were sequenced. The summary of the heterogeneities in the ITS region displayed by the clones is given in Table 1. Partial sequences of the 5.8S and 28S genes were cropped from further analysis. The total length of ITS2 varied from 477 bp to 512 bp depending on the presence of insertions/deletions. Uncorrected “p” pairwise distances for all clones are given in Table 2.
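As an illustration of the uncorrected “p” distance and the average intragenomic distance reported for the clones, here is a minimal Python sketch. It assumes aligned clone sequences of equal length and ignores columns where either sequence has a gap; how the original study treated gaps is not stated here, so that choice is an assumption, as are the function names.

```python
from itertools import combinations


def p_distance(seq1, seq2, gap="-"):
    """Uncorrected p-distance: proportion of differing sites.

    Columns with a gap in either sequence are skipped (assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_distance(seqs):
    """Average p-distance over all pairs, e.g. all clones of one specimen."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

Multiplying the result by 100 gives the percentage form used in the text.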
Variable positions among sequenced clones.
|W136 Polyommatus (Agrodiaetus) peilei||#01||T||G||-||C||G||C||A||A||T||T||T||T||T||T||T||T||-||-||C||G||T||T||T||T||T||-||C||G||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#02||T||G||-||C||G||C||A||A||T||T||T||T||T||T||T||T||-||-||C||G||T||T||T||T||T||-||C||G||G||G||T|
|W136 Polyommatus (Agrodiaetus) peilei||#03||T||A||A||C||A||C||G||G||C||T||T||T||T||T||T||T||T||T||T||G||T||T||T||T||T||-||C||A||G||A||C|
|W136 Polyommatus (Agrodiaetus) peilei||#04||T||G||-||C||G||C||A||A||T||T||T||T||T||T||T||T||-||-||T||G||T||T||T||T||T||-||C||G||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#05||T||A||A||C||A||C||G||G||C||T||T||T||T||T||T||T||T||T||T||G||T||T||T||T||T||T||-||A||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#06||T||G||-||C||G||C||A||A||T||T||T||T||T||T||-||-||-||-||T||G||T||T||T||T||T||-||C||G||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#07||T||G||-||C||G||C||A||A||C||T||T||T||T||T||T||T||T||T||T||G||T||T||T||T||T||T||-||A||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#08||A||A||-||-||-||-||A||A||T||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||A||A||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#09||T||A||A||C||A||C||G||G||C||T||T||T||T||T||T||T||T||T||T||G||T||T||T||T||T||T||-||A||G||G||C|
|W136 Polyommatus (Agrodiaetus) peilei||#10||T||G||-||C||G||C||A||A||T||T||T||T||T||T||T||T||-||-||T||G||T||T||T||T||T||-||C||G||G||G||C|
|W202 Polyommatus (Agrodiaetus) peilei||#01||T||C||T||-||A||C||C||T||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||A||-||T||T||T||-||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#02||T||T||T||A||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||C||T||T||T||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#03||T||T||G||-||-||-||-||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||T||T||T||C||-||C||G|
|W202 Polyommatus (Agrodiaetus) peilei||#04||T||T||T||-||A||C||C||C||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||A||-||T||T||-||C||-||C||G|
|W202 Polyommatus (Agrodiaetus) peilei||#05||C||C||T||A||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||C||T||T||-||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#06||T||T||T||A||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||C||T||T||T||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#07||T||T||T||A||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||C||T||T||T||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#08||T||C||T||-||A||C||C||T||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||A||-||T||-||-||C||-||C||G|
|W202 Polyommatus (Agrodiaetus) peilei||#09||T||T||T||A||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||G||G||T||T||T||T||T||-||A|
|W202 Polyommatus (Agrodiaetus) peilei||#10||T||C||T||-||A||C||C||T||T||C||G||C||G||T||C||G||G||C||G||A||C||G||T||G||C||A||-||T||T||T||C||-||C||G|
|V145 Polyommatus (Agrodiaetus) karindus||#01||A||C||-||-||T||T||T||T||-|
|V145 Polyommatus (Agrodiaetus) karindus||#02||A||C||-||-||T||T||T||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#03||A||T||C||G||T||T||-||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#04||A||T||C||G||T||T||-||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#05||A||T||C||G||T||T||T||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#06||A||C||-||-||T||C||-||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#07||A||T||C||G||C||T||T||T||T|
|V145 Polyommatus (Agrodiaetus) karindus||#08||A||C||-||-||T||T||T||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#09||A||C||-||-||T||T||-||-||-|
|V145 Polyommatus (Agrodiaetus) karindus||#10||G||C||-||-||T||T||-||-||-|
|Z04 Polyommatus (Agrodiaetus) karindus||#01||C||G||T||T||C||C||A||C||T||T||T||T||T||T||T||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#02||C||G||T||T||C||C||A||T||T||T||T||T||T||T||T||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#03||C||G||T||T||C||C||A||T||T||T||T||T||T||T||T||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#04||C||G||T||T||C||C||A||T||T||T||T||T||T||T||T||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#05||T||G||T||-||C||C||A||T||T||T||T||T||T||T||-||C||A||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#06||T||G||T||-||C||C||A||T||T||T||T||T||T||T||-||C||A||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#07||T||A||T||-||C||C||A||T||T||T||T||T||T||T||-||C||A||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#08||C||G||T||T||C||C||A||T||T||T||T||T||T||T||-||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#09||C||G||T||T||C||C||A||T||T||T||T||T||T||T||-||C||-||A||A||A|
|Z04 Polyommatus (Agrodiaetus) karindus||#10||T||G||G||-||-||-||-||C||-||-||-||-||-||-||-||-||-||-||-||-|
|W127 Polyommatus (Agrodiaetus) morgani||#01||A||A||C||-||C||G||T||-||-||-||T||C||A||C||A||C||G||T||T||T||T||T||T||T||T||-||-||-||-||A||A||C||G||-||-||-||A||A||A||A||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#02||A||A||C||-||C||G||T||-||-||-||C||C||A||C||A||C||G||T||T||T||T||T||T||T||T||T||T||T||T||A||A||C||G||-||-||-||A||A||T||A||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#03||A||A||C||-||A||T||C||C||G||C||T||G||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||A||G||A||G||G||T|
|W127 Polyommatus (Agrodiaetus) morgani||#04||A||G||C||G||A||T||T||C||G||C||T||G||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||A||G||A||G||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#05||G||G||C||G||C||G||T||-||-||-||T||C||A||C||A||C||G||T||T||C||T||T||T||T||T||T||T||T||-||A||A||C||G||-||-||-||A||A||A||A||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#06||A||A||C||-||C||G||T||-||-||-||T||C||A||C||A||C||G||T||T||T||T||T||T||T||T||T||T||-||-||A||A||C||G||-||-||-||A||A||A||A||A||G|
|W127 Polyommatus (Agrodiaetus) morgani||#07||A||A||C||-||C||G||T||C||G||C||T||G||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||A||G||A||G||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#08||A||A||C||G||A||T||T||C||G||C||T||C||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||-||A||A||A||G||T|
|W127 Polyommatus (Agrodiaetus) morgani||#09||A||A||T||G||A||T||T||C||G||C||T||G||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||A||G||A||G||G||G|
|W127 Polyommatus (Agrodiaetus) morgani||#10||A||A||C||-||C||G||T||-||-||-||T||G||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||G||C||A||A||G||A||G||G||G|
Uncorrected ‘‘p’’ distance matrix of clones.
|Polyommatus (Agrodiaetus) peilei||W136_#01||W136_#02||W136_#03||W136_#04||W136_#05||W136_#06||W136_#07||W136_#08||W136_#09||W136_#10|
|Polyommatus (Agrodiaetus) peilei||W202_#01||W202_#02||W202_#03||W202_#04||W202_#05||W202_#06||W202_#07||W202_#08||W202_#09||W202_#10|
|Polyommatus (Agrodiaetus) karindus||V145_#01||V145_#02||V145_#03||V145_#04||V145_#05||V145_#06||V145_#07||V145_#08||V145_#09||V145_#10|
|Polyommatus (Agrodiaetus) karindus||Z704_#01||Z704_#02||Z704_#03||Z704_#04||Z704_#05||Z704_#06||Z704_#07||Z704_#08||Z704_#09||Z704_#10|
|Polyommatus (Agrodiaetus) morgani||V127_#01||V127_#02||V127_#03||V127_#04||V127_#05||V127_#06||V127_#07||V127_#08||V127_#09||V127_#10|
There were 11 single-base substitutions, 3 mono- and 4 multi-nucleotide indels in clones of specimen W136 (Polyommatus (Agrodiaetus) peilei). Interestingly, clone “W136_#08” differed significantly from all others in having a 16-nucleotide polyT deletion at positions -344” and a 3-base indel at position -173”. Clones of the second specimen of Polyommatus (Agrodiaetus) peilei (W202) had 8 sites with single nucleotide substitutions and 8 positions where mono- or multi-nucleotide indels occurred. Three clones had a large polymorphic 17-nucleotide indel at positions -200”. Variation among clones was significant, with intragenomic differences ranging from 0.0% to 2.39%. The average intragenomic genetic distances for the two specimens of Polyommatus (Agrodiaetus) peilei (W136 and W202) were very similar: 1.34% and 1.35%, respectively.
Polyommatus (Agrodiaetus) karindus had a significantly lower rate of intragenomic variability. Specimens V145 and Z704 had 9 and 10 polymorphic positions, respectively. Furthermore, the majority of the indels and base substitutions in specimen Z704 are accounted for by one clone (Z704#10). It has one single-nucleotide substitution and 3 multi-nucleotide deletions that never occurred in other clones. The average intragenomic genetic distances for the two specimens of Polyommatus (Agrodiaetus) karindus (V145 and Z704) were 0.56% and 0.72%, respectively. The highest value was 1.82%.
Clones of Polyommatus (Agrodiaetus) morgani ITS2 showed greater diversity than those of the other 2 species. For instance, the genetic distance between clones V127#05 and V127#03 was 3.68%. The average intragenomic genetic distance was also significantly higher for this species (1.77%).
In the Bayesian analysis, 50 cloned amplicons from Polyommatus (Agrodiaetus) peilei, Polyommatus (Agrodiaetus) karindus and Polyommatus (Agrodiaetus) morgani, together with ITS2 sequences from all Agrodiaetus species available in GenBank, were included, giving a total of 127 sequences. Since Polyommatus icarus (Rottemburg, 1775) was earlier inferred as the sister clade to the subgenus Agrodiaetus (Talavera et al. 2013), we used one specimen (GenBank accession number AY556732) as the outgroup to root the phylogeny. A fragment of the consensus Bayesian tree, showing the clustering of cloned sequences, is given in Figure 2. The complete tree is given online in the Suppl. material 1.
Fragment of consensus Bayesian tree of the subgenus Agrodiaetus inferred from ITS2 sequences. Posterior probability values 㹐% are shown. The complete tree is given online in the Suppl. material 1. Cloned sequences of three studied species are highlighted: Polyommatus (Agrodiaetus) peilei – orange colour, Polyommatus (Agrodiaetus) karindus – blue colour, Polyommatus (Agrodiaetus) morgani – green colour.
Phylogeny and divergence time estimation of Chamaecrista ser. Rigidulae (Leguminosae: Caesalpinioideae)
Chamaecrista is a monophyletic genus, but most of its infrageneric categories have been shown to be paraphyletic, including series Rigidulae which currently comprises 30 species, all endemic to Brazil and most occurring in the cerrado vegetation of the central highlands of the country. This molecular phylogenetic study tests the monophyly of C. ser. Rigidulae based on a broad sampling, considers its relationship with members of C. sect. Absus subsect. Absus, and estimates its divergence time in relation to the age of the genus. For that, individual and combined analyses were performed by the parsimony and Bayesian methods using chloroplast (trnL-F) and nuclear (ITS1-5.8S-ITS2) markers and a matrix containing 75 taxa of Chamaecrista (29 belonging to series Rigidulae), 6 of Senna and 1 of Cassia. The analyses showed that series Rigidulae, as traditionally circumscribed, is polyphyletic. When C. brachyblepharis and C. ciliolata are excluded and C. botryoides and C. sincorana included, the series is resolved as monophyletic, comprising 30 species here designated as the Rigidulae clade. This clade is further subdivided into two geographically and genetically structured subclades, the first containing 23 species, mostly from the highlands of Goiás State, and the second with 6 species from the Espinhaço Range, running north to south through central Bahia and Minas Gerais States. Divergence time analyses suggest that the Rigidulae clade originated about 5 million years ago. The recent radiation of the series repeats that seen in other species-rich genera in the Cerrado Biome, thus corroborating previous hypotheses about the recent age of the biome. The Rigidulae clade, as here circumscribed, has the following morphological synapomorphies: asymmetric flowers with their posterior petals similar to a typical papilionoid standard petal, and leaflets divaricate along the leaf rachis.
Fig. S1. Majority-rule consensus tree of the parsimony analysis (MP) resulting from the trnL-F marker. Numbers above branches indicate bootstrap support for clades recovered in the MP analysis. Numbers below the branches indicate the posterior probability of the clades recovered in the Bayesian inference analysis. The taxa on the right of the species names follow the classification of Irwin & Barneby (1982), sect. = section, subsect. = subsection, the taxa without a prefix correspond to series.
Fig. S2. One of the most parsimonious trees from the combined dataset with reconstruction of the ancestral character states: A, Character 1: Habit and underground system; B, Character 2: Orientation of the leaflets on the rachis. Species of series Rigidulae are highlighted in bold.
Fig. S3. One of the most parsimonious trees from the combined dataset with reconstruction of the ancestral character states: A, Character 3: Number of leaflet pairs on mature leaves; B, Character 4: Differentiation of the surfaces of the leaflets. Species of series Rigidulae are highlighted in bold.
Fig. S4. One of the most parsimonious trees from the combined dataset with reconstruction of the ancestral character states: A, Character 5: Visibility of secondary veins on the adaxial surface of the leaflets; B, Character 6: First pair of leaflets “amplexicaul”. Species of series Rigidulae are highlighted in bold.
Fig. S5. One of the most parsimonious trees from the combined dataset with reconstruction of the ancestral character states: A, Character 7: Inflorescence type; B, Character 8: Floral symmetry. Species of series Rigidulae are highlighted in bold.
Appendix S1. Alignment of the combined dataset (ITS+trnL-F) used in this study. ITS marker from 1 to 901 bp, trnL-F from 902 to 2067 bp.
Type sequences of C. gloeosporioides sensu stricto, C. asianum, C. fructicola, C. siamense and C. tropicale (Additional file 1: Table S1) were mined from GenBank. These species within the C. gloeosporioides species complex have been commonly identified as the causal agents of anthracnose of tropical fruit (Phoulivong et al. 2010). In selecting the sequences from holotype and epitype specimens for analyses, there were three important considerations: (i) the type sequences were selected based on the study by Cannon et al. (2012), and were included to provide a sufficient range of sequence diversity; (ii) the species used are well characterized and recognized at the phylogenetic and morphological levels; and (iii) the selected species have been analyzed in previous independent studies.
To obtain test sequences, papaya fruit displaying symptoms of anthracnose were collected during the period 2011 to 2013 (Additional file 1: Table S2). DNA was extracted from pure single-spore cultures of Colletotrichum sp. using the E.Z.N.A. fungal DNA extraction kit® according to the manufacturer’s instructions (Omega bio-tek Ltd., USA). The entire ITS region (496 bp) was amplified using the universal primer pair ITS4/5 (White et al. 1990) and sequenced with independent base call verification (Amplicon Express, WA, USA). Representative sequences were submitted to GenBank (KM117226 to KM117228). A total of 56 sequences were used in the final data set for generating consensus secondary structures for the ITS1, 5.8S and ITS2 markers: 30 type sequences obtained from holotype and epitype specimens and 26 query sequences belonging to the C. gloeosporioides sensu lato complex. Other fungal sequences were also mined from GenBank (HQ238968, JF780523, EU480703, HQ238962, EF543854) and were used as out-groups to assist in defining the helical domains of the three markers based on mfold alignments in the UNAFold webserver (http://mfold.rna.albany.edu/) (Zuker 2003; Markham and Zuker 2008).
ITS sequence analysis and alignment
Errors can occur during PCR amplification when two different DNA templates are present. The resulting amplicon may be chimeric, that is, a mosaic of these original sequences (Jumpponen 2011). If undetected, such chimeric sequences may be misinterpreted as novel, which can artificially inflate estimates of diversity and interfere with phylogenetic inference and species discrimination (Hugenholtz and Huber 2003). ITS sequences were checked for possible chimeras using the UNITE PlutoF Chimera checker (Nilsson et al. 2010; Edgar et al. 2011) and the Chimera Test developed in the Fungal Metagenomics Project at the University of Alaska (https://biotech.inbre.alaska.edu/fungal_portal/?program=chimera_test).
Verifying ITS sequence validity
Alignments were carried out using the online version of the sequence alignment program MAFFT version 6 (http://mafft.cbrc.jp/alignment/server/) (Katoh et al. 2005; Katoh and Toh 2008). This sequence analysis also aids in determining whether the ITS sequences were composed of stochastic, artifactual nucleotide data. The start and end points of each marker were first defined for each species using the pipeline available on the ITS2 database website (http://its2.bioapps.biozentrum.uni-wuerzburg.de/). Ultimately, three sets of sequence alignments were generated: ITS1, 5.8S and ITS2 as separate data sets.
Comparison of GC content and nucleotide diversity of the ITS sequences
The sequence length of the ITS1 and ITS2 regions for a given species can be variable; however, the two markers should have similar GC content if they are authentic sequences under functional and selective constraints and not pseudogenes (Harpke and Peterson 2008; Mullineaux and Hausner 2009). The GC content of the ITS1, 5.8S and ITS2 sequences was determined using BioEdit version 7.2.0 software. DNASP version 5.10 (Rozas et al. 2003; Librado and Rozas 2009) was used to determine the nucleotide diversity (Pi), polymorphic and singleton sites, indel sites and indel haplotypes among the ITS1, 5.8S and ITS2 sequences.
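The GC content comparison can be illustrated with a short Python sketch. It is not the BioEdit calculation; it simply counts G and C among unambiguous bases and ignores gaps and ambiguity codes, which is an assumption of this example, as is the function name `gc_content`.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T/U).

    Gaps and ambiguity codes such as N are ignored (assumption).
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)
```

Under this sketch, the ITS1 and ITS2 of an authentic sequence would be expected to give similar values, whereas a large difference between `gc_content(its1)` and `gc_content(its2)` would merit a closer look.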
Secondary structure prediction
Ribosomal secondary structure and motif detection were determined for the ITS2 marker. Secondary structure predictions of rRNA sequences are sensitive to single base changes which, in turn, can affect hydrogen-bonded base pairing, especially along the stem of a stem-loop secondary structure (Matthews et al. 2005). Consequently, for this study, the ITS sequences and their electropherograms were manually reviewed and evaluated for signal quality and accurate nucleotide assignment in order to prevent user-induced errors in structural predictions. Because the core folding pattern of the ITS2 sequence is already known, it provides an external reference against which to check the correctness of the predicted structures (Schultz and Wolf 2009). For the ITS2 consensus secondary structure prediction, the ITS2 database pipeline (http://its2.bioapps.biozentrum.uni-wuerzburg.de/) (Koetschan et al. 2010; Merget et al. 2012; Koetschan et al. 2012) was used. Consensus secondary structures for the ITS1 and 5.8S markers were determined using the LocARNA-P simultaneous RNA alignment and folding option of the Freiburg RNA Tools pipeline (http://rna.informatik.uni-freiburg.de:8080/LocARNA/Input.jsp) (Will et al. 2007; Will et al. 2012; Smith et al. 2010), and the RNA folding form option of the mfold webserver using default conditions for temperature (37°C) and ionic conditions (http://mfold.rna.albany.edu/) (Zuker 2003; Markham and Zuker 2008). Consensus secondary structures for the ITS1, 5.8S and ITS2 markers were re-drawn as radial view structures and annotated for publication purposes using VARNA 3.9 (Darty et al. 2009).
Lock, J. & Simpson, K. Legumes of West Asia, a Check-List. (Royal Botanic Gardens, Kew, 1991).
Wojciechowski, M. F., Sanderson, M. J. & Hu, J. M. Evidence on the monophyly of Astragalus (Fabaceae) and its major subgroups based on nuclear ribosomal DNA ITS and chloroplast DNA trnL intron data. Syst. Bot. 24, 409–437 (1999).
Wojciechowski, M. F., Sanderson, M. J, Steele, K. P. & Liston, A. Molecular phylogeny of the “temperate herbaceous tribes” of papilionoid legumes: a supertree approach. Advances in Legume Systematics 9, 277–298 (Royal Botanic Gardens, Kew, 2000).
Maassoumi, A. A. Old World Check-List of Astragalus. (Research Institute of Forests and Rangelands, Tehran, 1998).
Lewis, G., Schrire, B., Mackinder, B. & Lock, M. Legumes of the World. (Royal Botanic Gardens, Kew, 2005).
Mabberley, D. J. The Plant-Book: A Portable Dictionary of Plants, their Classification and Uses, third ed. (Cambridge Univ. Press, Cambridge, 2008).
Podlech, D. & Zarre, S. A Taxonomic Revision of the Genus Astragalus L. (Leguminosae) in the Old World, vols. 1–3. (NaturhistorischesMuseum, Wien, 2013).
Lavin, M., Doyle, J. J. & Palmer, J. D. Evolutionary significance of the loss of the chloroplast-DNA inverted repeat in the Leguminosae subfamily Papilionoideae. Evolution 44, 390–402 (1990).
Liston, A. Use of the polymerase chain reaction to survey for the loss of the inverted repeat in the legume chloroplast genome, (eds Crisp, M. D., Doyle, J. J.), Advances in Legume Systematics, part 7. Phylogeny. (Royal Botanic Gardens, Kew, pp 31–40 1995).
Wojciechowski, M. F., Lavin, M. & Sanderson, M. J. A phylogeny of legumes (Leguminosae) based on analysis of the plastid matK gene resolves many well-supported subclades within the family. Am. J. Bot. 91, 1846–1862 (2004).
Sanderson, M. J. & Wojciechowski, M. F. Diversification rates in a temperate legume clade: Are there “so many species” of Astragalus (Fabaceae)? Am. J. Bot. 83, 1488–1502 (1996).
Schneider, H. et al. Key innovations versus key opportunities: Identifying causes of rapid radiation in derived ferns (ed. Glaubrecht, M.) Evolution in Action. Springer, Heidelberg, pp 61–75 (2010).
Hodges, S. A. & Arnold, M. L. Spurring plant diversification: Are floral nectar spurs a key innovation? Proc. Royal Soc. London B 262, 343–348 (1995).
Schluter, D. The Ecology of Rapid Radiations. (Oxford Univ. Press, Oxford, 2000).
Blattner, F. R., Pleines, T. & Jakob, S. S. Rapid radiation in the barley genus Hordeum (Poaceae) during thePleistocene in the Americas(ed. Glaubrecht, M.) Evolution in Action. Springer, Heidelberg, pp. 17–33 (2010).
Bunge, A. Generis Astragali species Gerontogeae. Pars prior. Claves diagnosticae. Mém. Acad. Imp. Sci. St. Petersbourg ser. VII 11, 1–140 (1868).
Bunge, A. Generis Astragali species Gerontogeae. Pars altera. Speciarum enumeratio. Mém. Acad. Imp. Sci. St. Petersbourg ser. VII 15, 1–245 (1869).
Podlech, D. Die Typifizierung der altweltlichen Sektionen der Gattung Astragalus L. (Leguminosae). Mitteil. Bot. Staatssammlung München 29, 461–494 (1990).
Rechinger, K. H., Dulfer, H. & Patzak, A. Širjaevii fragmenta Astragalogica V-VII. sect. Hymenostegis. Sitzungsberichte. Abteilung II. Österreichische Akademie der Wissenschaften, Mathematisch-Naturwissenschaftliche Klasse. Mathematische, Physikalische und Technische Wissenschaften. Verlag der Österreichischen Akademie der Wissenschaften 168(2), 7–115 (1959).
Maassoumi, A. A. The Genus Astragalus in Iran, Perennials, vol. 3. (Research Institute of Forests and Rangelands, Tehran, 1995).
Zarre, S. & Podlech, D. Taxonomic revision of Astragalus L. sect. Hymenostegis Bunge (Leguminosae). Sendtnera 3, 255–312 (1996).
Podlech, D. & Maassoumi, A. A. Astragalus sect. Hymenostegis Bunge, (ed. Rechinger, K. H.), Flora Iranica, Papilionaceae IV, Astragalus II, No. 175. Akademische Druck- und Verlagsanstalt, Graz, pp 127–183 (2001).
Bagheri, A., Maassoumi, A. A. & Ghahremaninejad, F. Additions to Astragalus sect. Hymenostegis (Fabaceae) in Iran. Iran. J. Bot. 17, 15–19 (2011).
Bagheri, A., Rahiminejad, M. R. & Maassoumi, A. A. A new species of the genus Astragalus (Leguminosae-Papilionoideae) from Iran. Phytotaxa 178, 38–42 (2014).
Bagheri, A., Rahiminejad, M. R. & Maassoumi, A. A. Astragalus hakkianus a new species of Astragalus (Fabaceae) from N.W. Iran. Feddes Repert. 124, 46–49 (2013).
Bagheri, A., Karaman Erkul, S., Maassoumi, A. A., Rahiminejad, M. R. & Blattner, F. R. Astragalus trifoliastrum (Fabaceae), a revived species for the flora of Turkey. Nord. J. Bot. 33, 532–539 (2015).
Bagheri, A., Maassoumi, A. A., Rahiminejad, M. R. & Blattner, F. R. Molecular phylogeny and morphological analysis support a new species and new synonymy in Iranian Astragalus (Leguminosae). PLoS ONE 11, e0149726 (2016).
Bagheri, A., Ghahremaninejad, F., Maassoumi, A. A., Rahiminejad, M. R. & Blattner, F. R. Nine new species of the species-rich genus Astragalus L. (Leguminosae). Novon in press (2017).
Kazempour Osaloo, S., Maassoumi, A. A. & Murakami, N. Molecular systematics of the genus Astragalus L. (Fabaceae): Phylogenetic analyses of nuclear ribosomal DNA internal transcribed spacers and chloroplast gene ndhF sequences. Plant Syst. Evol. 242, 1–32 (2003).
Kazempour Osaloo, S., Maassoumi, A. A. & Murakami, N. Molecular systematics of the Old World Astragalus (Fabaceae) as inferred from nrDNA ITS sequence data. Brittonia 57, 367–381 (2005).
Naderi Safar, K., Kazempour Osaloo, S., Maassoumi, A. A. & Zarre, S. Molecular phylogeny of Astragalus section Anthylloidei (Fabaceae) inferred from nrDNA ITS and plastid rpl32-trnL sequence data. Turk. J. Bot. 38, 637–652 (2014).
Wojciechowski, M. F. Astragalus (Fabaceae): A molecular phylogenetic perspective. Brittonia 57, 382–396 (2005).
Álvarez, I. & Wendel, J. F. Ribosomal ITS sequences and plant phylogenetic inference. Mol. Phylogenet. Evol. 29, 417–434 (2003).
Blattner, F. R. Phylogeny of Hordeum (Poaceae) as inferred by nuclear rDNA ITS sequences. Mol. Phylogenet. Evol. 33, 289–299 (2004).
Brassac, J. & Blattner, F. R. Species level phylogeny and polyploid relationships in Hordeum (Poaceae) inferred by next-generation sequencing and in silico cloning of multiple nuclear loci. Syst. Biol. 64, 792–808 (2015).
Blattner, F. R. TOPO6: A nuclear single-copy gene for plant phylogenetic inference. Plant Syst. Evol. 302, 239–244 (2016).
Maassoumi, A. A. Flora of Iran, Papilionaceae (Astragalus II), no. 77. (Research Institute of Forests and Rangeland, Tehran, 2014).
Bartha, L., Dragoş, N., Molnár, A. V. & Sramkó, G. Molecular evidence for reticulate speciation in Astragalus (Fabaceae) as revealed by a case study from section Dissitiflori. Botany 91, 702–714 (2013).
Pleines, T. & Blattner, F. R. Phylogeographic implications of an AFLP phylogeny of the American diploid Hordeum species (Poaceae: Triticeae). Taxon 57, 875–881 (2008).
Eaton, D. A. R. & Ree, R. H. Inferring phylogeny and introgression using RADseq data: An example from flowering plants (Pedicularis: Orobanchaceae). Syst. Biol. 62, 689–706 (2013).
Jones, R. C., Nicolle, D., Steane, D. A., Vaillancourt, R. E. & Potts, B. M. High density, genome-wide markers and intra-specific replication yield an unprecedented phylogenetic reconstruction of a globally significant, speciose lineage of Eucalyptus. Mol. Phylogenet. Evol. 105, 63–85 (2016).
Mitchell, A. D. & Heenan, P. B. Sophora sect. Edwardsia (Fabaceae): Further evidence from nrDNA sequence data of a recent and rapid radiation around the Southern Oceans. Bot. J. Linnean Soc. 140, 435–441 (2002).
Catalano, S. A., Vilardi, J. C., Tosto, D. & Saidman, B. O. Molecular phylogeny and diversification history of Prosopis (Fabaceae: Mimosoideae). Biol. J. Linnean Soc. 93, 621–640 (2008).
Egan, A. N. & Crandall, K. A. Divergence and diversification in North American Psoraleeae (Fabaceae) due to climate change. BMC Biol. 6, 55 (2008).
Drummond, C. S., Eastwood, R. J., Miotto, S. T. S. & Hughes, C. E. Multiple continental radiations and correlates of diversification in Lupinus (Leguminosae): Testing for key innovations with incomplete taxon sampling. Syst. Biol. 61, 443–460 (2012).
Nürk, N. M., Uribe-Convers, S., Gehrke, B., Tank, D. C. & Blattner, F. R. Oligocene niche shift, Miocene diversification – Cold tolerance and accelerated speciation rates in the St. John’s Worts (Hypericum, Hypericaceae). BMC Evol. Biol. 15, 80 (2015).
Lagomarsino, L. P., Condamine, F. L., Antonelli, A., Mulch, A. & Davis, C. C. The abiotic and biotic drivers of rapid diversification in Andean bellflowers (Campanulaceae). New Phytol. 210, 1430–1442 (2016).
Frenzel, B. The Pleistocene vegetation of northern Eurasia. Science 161, 637–649 (1968).
Tarasov, P. E., Volkova, V. S. & Webb, T. et al. Last glacial maximum biomes reconstructed from pollen and plant macrofossil data from northern Eurasia. J. Biogeography 27, 609–620 (2000).
Franzke, A. et al. Molecular signals for Late Tertiary/Early Quaternary range splits of an Eurasian steppe plant: Clausia aprica (Brassicaceae). Mol. Ecol. 13, 2789–2795 (2004).
Friesen, N. et al. Dated phylogenies and historical biogeography of Dontostemon and Clausia (Brassicaceae) mirror the palaeogeographical history of the Eurasian steppe. J. Biogeography 43, 738–749 (2016).
Hewitt, G. M. Some genetic consequences of ice ages, and their role in divergence and speciation. Biol. J. Linnean Soc. 58, 247–276 (1996).
Roy, K., Valentine, J. W., Jablonski, D. & Kidwell, M. Scales of climatic variability and time averaging in Pleistocene biotas: Implications for ecology and evolution. Trends Ecol. Evol. 11, 458–463 (1996).
Jakob, S. S., Ihlow, A. & Blattner, F. R. Combined ecological niche modeling and molecular phylogeography revealed the evolutionary history of Hordeum marinum (Poaceae) – niche differentiation, loss of genetic diversity, and speciation in Mediterranean Quaternary refugia. Mol. Ecol. 16, 1713–1727 (2007).
Ikeda, H., Carlsen, T., Fujii, N., Brochmann, C. & Setoguchi, H. Pleistocene climatic oscillations and speciation history of an alpine endemic and a widespread arctic-alpine plant. New Phytol. 194, 583–594 (2012).
Blattner, F. R. Direct amplification of the entire ITS region from poorly preserved plant material using recombinant PCR. Biotechniques 29, 1180–1186 (1999).
Swofford, D. L. PAUP*. Phylogenetic Analysis Using Parsimony (* and other methods). Version 4.0a150. (Sinauer Associates, Sunderland, 2002).
Farris, J. S., Kallersjo, M., Kluge, A. G. & Bult, C. Testing the significance of incongruence. Cladistics 10, 315–319 (1994).
Ronquist, F. et al. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539–542 (2012).
Stamatakis, A. RAxML Version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Bouckaert, R. R. et al. BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 10, e1003537 (2014).
Drummond, A. J. & Suchard, M. A. Bayesian random local clocks, or one rate to rule them all. BMC Biology 8, 114 (2010).
Heled, J. & Drummond, A. J. Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Syst. Biol. 61, 138–149 (2012).
Bartha, L., Sramkó, G. & Dragoş, N. New PCR primers for partial ycf1 amplification in Astragalus (Fabaceae): Promising source for genus-wide phylogenies. Studia UBB Biologia 51, 33–46 (2012).
Hijmans, R. J. et al. DIVA-GIS, Version 5: A geographic information system for the analysis of biodiversity data [Software]. International Plant Genetic Resources Institute (IPGRI) Available from: http://diva-gis.org. (2005).
The Short ITS2 Sequence Serves as an Efficient Taxonomic Sequence Tag in Comparison with the Full-Length ITS
An ideal DNA barcoding region should be short enough to be amplified from degraded DNA. In this paper, we discuss the possibility of using a short nuclear DNA sequence as a barcode to identify a wide range of medicinal plant species. First, the PCR and sequencing success rates of ITS and ITS2 were evaluated using only dry medicinal products and herbarium voucher specimens, including some samples collected as long as 90 years ago. With a single primer pair, PCR and sequencing succeeded for 91% of the samples with ITS2 but for only 23% with ITS. Second, 12861 ITS and ITS2 plant sequences were used to compare the identification efficiency of the two regions. Four identification criteria (BLAST, inter- and intradivergence Wilcoxon signed rank tests, and TaxonDNA) were evaluated. Our results supported the hypothesis that ITS2 can be used as a minibarcode to effectively identify species in a wide variety of specimens and medicinal materials.
1. Introduction
1.1. DNA Barcoding of Degraded DNA Materials
DNA barcoding takes advantage of short standard sequences to discover and identify species. An ideal DNA barcode should be short enough to be amplified from archival specimens using universal primers. The term “minimalist barcode” was first defined by Hebert as a tool to overcome the low PCR efficiency of cytochrome c-oxidase subunit 1 (CO1) in archival animal specimens in museums, and the possibility of identifying animal specimens using a region of approximately 200 bp was discussed. The results of that study showed that minibarcodes can be isolated from different types of specimens, including museum samples, trace tissue samples with degraded DNA, and other specimens from which the acquisition of a full-length barcode (CO1) is not feasible. The amplification of DNA from herbarium specimens is also important for barcoding studies because it is often necessary to confirm the species identification of fresh specimens by comparing their sequences with those of older museum specimens. Additionally, most of the medicinal materials available in the market are dry and have been stored for long periods; thus, it is very difficult to amplify long DNA regions from some of these materials, which prevents the use of DNA barcodes for herb identification.
1.2. The Trend of Core Plant DNA Barcodes
The Plant Working Group of the Consortium for the Barcode of Life (CBOL) recommended the use of a combination of matK and rbcL as a barcode for land plants, and internal transcribed spacer (ITS)/internal transcribed spacer 2 (ITS2) was proposed as a supplemental marker for further study. The ITS sequence contains enough variable sites for species identification in many samples [5–9], but ITS could not be amplified from approximately 12% of herbarium samples. In addition, ITS1 is too variable to guarantee reliable alignments and contains variable indels (insertions/deletions) at this taxonomic level, and multiple functional copies exist in many taxa. Thus, ITS was excluded as a universal land plant barcode in the earlier stages. In contrast, ITS2 is considered to evolve in concert, which homogenizes all copies of the region throughout the genome, so that in most organisms ITS2 is treated as a single locus. Thus, the ITS2 region might be a suitable marker for taxonomic classification [10–12]. Recently, ITS2 has been suggested as a useful barcode for medicinal plants [13–17], as a universal DNA barcode to identify plants, and as a complementary locus of CO1 to identify animals. The China Plant Barcode of Life Group considered ITS2 to be a useful alternative to ITS because it is more easily amplified and sequenced. In addition, the secondary structure of ITS2 was shown to be an efficient tool for biological species identification [20, 21].
Here, we demonstrated the effectiveness of ITS2 as a minibarcode in comparison with the full-length ITS for the identification of a wide range of archived plant species. An initial set of 100 medicinal samples from museum specimens and the herb market was tested to determine the PCR and sequencing efficiencies of ITS and ITS2. A second set of 12861 sequences, representing 8313 species collected from GenBank, was examined to compare the identification abilities of ITS and ITS2. This work aims to provide an evaluation of ITS2 as a minibarcode for large samples.
2. Materials and Methods
2.1. Plant Material
The initial set of 100 museum medicinal specimens and herbal products from 92 species representing 5 orders (see Table 1 of the Supplementary Material available online at http://dx.doi.org/10.1155/2013/741476) was collected from the Buozhou herbal market and from specimens at the Institute of Medicinal Plant Development, some of which were collected 90 years ago, to test the efficiency of PCR and sequencing. All the samples were authenticated at the species level by Professor Yulin Lin (Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences). A second set of sequences, used for the identification efficiency analysis, was obtained from the GenBank nucleotide sequence database. We carried out a bioinformatics analysis using all ITS sequences present in GenBank matching the search pattern “18S ribosomal RNA gene internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, and 28S ribosomal RNA gene.” Partial sequences, fungal sequences, and sequences shorter than 100 bp were removed. A flowchart is shown in Figure 1. The complete ITS2 and full-length ITS regions were annotated using the Hidden Markov Model (HMM) and the ITS plant model, respectively, which rely on highly similar and correctly annotated reference sequences present in the public database. Ultimately, 12861 sequences representing 8313 species from 1699 genera were obtained (GenBank accession numbers are listed in Table 2S) and used to analyze the identification efficiencies of ITS and ITS2.
2.2. DNA Extraction, PCR Amplification and Sequencing
Total genomic DNA was extracted from specimens using the Plant Genomic DNA Kit (Tiangen Biotech Beijing Co., Ltd., China) according to the manufacturer’s instructions. The primer sequences for ITS2 were described by Chen et al. ITS was amplified using the primers ITS5 and ITS4. The PCR conditions and sequences used to amplify the two regions (ITS and ITS2) were based on the methods described by Kress et al. and Chen et al. [1, 13, 24, 25].
2.3. Analysis Method
Six parameters were used to characterize the interspecific and intraspecific divergences, according to a previously described method. Three of the parameters were used to estimate the interspecific variability: average interspecific distance, average theta prime, and smallest interspecific distance. The other three parameters were used to evaluate the intraspecific divergence: average intraspecific difference, theta, and average coalescent depth. The Wilcoxon signed rank test was used as described previously [13, 26, 27]. A Basic Local Alignment Search Tool (BLAST1) search was performed to identify the species. The TaxonDNA software was used to calculate the identification efficiency [28, 29].
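As a rough illustration of the divergence parameters described above, the following Python sketch computes three of them (average intraspecific distance, average interspecific distance, and smallest interspecific distance) from aligned sequences. It is only a minimal sketch under stated assumptions: it uses the uncorrected p-distance rather than any particular substitution model, the function names `p_distance` and `divergence_summary` are hypothetical, and it is not the software used in this study. Theta, theta prime, and coalescent depth are not computed.

```python
from itertools import combinations
from statistics import mean


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') or 'N' in either sequence are skipped.
    This uncorrected distance is an assumption made for illustration.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a not in "-N" and b not in "-N"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def divergence_summary(records):
    """Summarize pairwise distances for (species, aligned_sequence) records."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        # Same species name: intraspecific pair; otherwise interspecific.
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return {
        "mean_intra": mean(intra) if intra else 0.0,
        "mean_inter": mean(inter) if inter else 0.0,
        "min_inter": min(inter) if inter else 0.0,
    }


# Toy alignment (invented data, two species with two sequences each).
toy = [
    ("A", "ACGTACGT"),
    ("A", "ACGTACGA"),
    ("B", "ACGAACTT"),
    ("B", "ACGAACTT"),
]
summary = divergence_summary(toy)
```

A barcoding gap in this simplified view would correspond to `min_inter` being larger than the intraspecific distances; the toy data are invented and carry no biological meaning.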
3. Results
3.1. PCR and Universal Primers
To evaluate the efficiency of PCR and sequencing, 100 medicinal samples from the herbal market and museum specimens, including 91 species from 5 orders, were tested; 16% of the samples were obtained from the herb market, and the remaining 84% were obtained from the Institute of Medicinal Plant Development. The ITS primer pair yielded a recovery rate of only 23%, compared with the 91% recovery rate for ITS2. All sequences were submitted to GenBank (the GenBank accession numbers are listed in Table 1S, Supplementary Material). The small size of ITS2 facilitates its amplification by universal primers, even in samples with partially degraded DNA.
3.2. Species Identification
3.2.1. Comparison of Inter- and Intraspecific Divergences
Comparison of the inter- and intraspecific sequence variation was an important aspect of the barcoding identification. For the 12861 ITS and ITS2 sequences, which represented 8313 species from 1699 genera, the average lengths of ITS and ITS2 were 634 bp and 233 bp, respectively. The comparison of the inter- and intraspecific genetic distances revealed that the ITS2 region exhibited a higher interspecific divergence according to the three interspecific parameters (Table 1). Another advantage of ITS2 is that its conserved secondary structure is associated with relatively low intraspecific variation. The combination of a conserved secondary structure with a variable sequence appears to be a major benefit of using ITS2.
The differences in the percent sequence divergence between loci were tested using the Wilcoxon signed rank test. The results showed that ITS2 was a more variable barcode (Table 2). ITS contained a conserved 5.8S region, which decreased the comparative divergence. Based on these results, ITS2 demonstrates sufficient variation to differentiate plants.
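To make the paired comparison concrete, the sketch below computes the rank sums of a Wilcoxon signed rank test for two lists of paired divergence values in plain Python. It is a minimal sketch, not the procedure or software used in this study: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are simply dropped, tied absolute differences receive their average rank, and no p-value is computed.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n) for paired samples x and y.

    Pairs with a zero difference are discarded; tied absolute
    differences share the average of the ranks they occupy.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Indices of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Invented paired values, e.g. divergences at two loci for the same pairs.
result = wilcoxon_signed_rank([5, 3, 8, 6, 7], [4, 5, 5, 6, 3])
```

A much larger rank sum on one side suggests that one locus tends to show the larger divergence; judging significance would additionally require the null distribution of the statistic, which is omitted here.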
3.2.2. BLAST-Based Identification
BLAST1 was used to evaluate the efficiencies of ITS2 and ITS. ITS and ITS2 successfully identified 89.2% and 79.2% of specimens, respectively, at the species level and 97.5% and 93.8%, respectively, at the genus level (Table 3). Additionally, the significantly smaller size of ITS2 (average length of approximately 233 bp) compared with that of ITS (average length of approximately 634 bp) makes ITS2 a better candidate for barcoding studies.
To estimate the respective identification efficiency per genus, genera that contain at least 20 species were selected independently (Table 4). In 85% (68/80) of the genera, the success rates of ITS and ITS2 are identical. ITS had an identification efficiency superior to that of ITS2 in the following 12 genera: Gunnera, Luzula, Strobilanthes, Nepeta, Dionysia, Adenia, Clidemia, Sedum, Indigofera, Kalanchoe, Pilea, and Melampodium. Of the 603 genera that contain at least 3 samples, ITS2 and ITS had the same identification efficiency in 394 genera (65.3%), and ITS and ITS2 shared a 100% identification efficiency at the species level in 345 genera (57.2%) (Table 3S).
3.2.3. TaxonDNA Identification
We also used TaxonDNA to assess the accuracy of species identification based on ITS and ITS2. TaxonDNA is an alignment-based parametric clustering program that determines the closest match of a sequence by comparing it with all other sequences in the aligned data set. If the compared sequences were from the same species, the identification was considered successful, whereas mismatched names were counted as failures. Cases with several equally good best matches from different species were considered ambiguous. In this study, the successful identification rates of the “best match” were 67.88% and 60% for ITS and ITS2, respectively. The ambiguous identification rates of ITS and ITS2 were 14.9% and 0%, respectively, and the misidentification rates were 17.2% and 40%, respectively. The dataset contained 8607 sequences with duplication.
We used TaxonDNA to set the threshold value. All sequences without a match below the 97% threshold value remained unidentified. If the compared sample names were identical, the identification was considered correct; if the sequence names were mismatched, the identification was considered a failure. When several equally good best matches that belonged to a minimum of two species were found, the identification was considered ambiguous [29, 31]. The successful identification rates under the “best close match” were 62.53% and 32% for ITS and ITS2, respectively. The ambiguous identification rates of ITS and ITS2 were 14.0% and 0%, respectively. The misidentification rates of ITS and ITS2 were 7.28% and 0%, respectively. The remaining samples were considered unidentified because they had no matches below the threshold value. The nonmatch ratios of ITS and ITS2 were 16.2% and 68%, respectively (Table 5). ITS provided slightly superior successful identification and misidentification rates compared with ITS2, but ITS2 provided a lower ambiguous identification rate (0% versus 14.9% and 14.0% under the “best match” and “best close match,” resp., for ITS).
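The “best match” and “best close match” rules described above can be sketched in a few lines of Python. This is an illustrative reimplementation of the decision rules as stated in the text, not TaxonDNA's own code: the function name `classify_query` and its arguments are hypothetical, and the threshold is expressed here as a maximum allowed distance (for example 0.03, on the assumption that the 97% value refers to sequence similarity).

```python
from collections import Counter


def classify_query(query_species, matches, threshold=None):
    """Classify one query under "best match" or "best close match".

    matches: list of (reference_species, distance) for the query against
    all other sequences in the data set.
    threshold: None gives "best match"; a number gives "best close match",
    where a query whose best distance exceeds it stays unidentified.
    """
    best = min(distance for _, distance in matches)
    if threshold is not None and best > threshold:
        return "no match"
    best_species = {sp for sp, distance in matches if distance == best}
    if len(best_species) > 1:
        # Equally good best matches from at least two species.
        return "ambiguous"
    return "correct" if query_species in best_species else "incorrect"


# Invented example: three queries with their distances to references.
queries = [
    ("A", [("A", 0.01), ("B", 0.05)]),
    ("A", [("A", 0.02), ("B", 0.02)]),
    ("A", [("B", 0.04), ("A", 0.06)]),
]
best_match = Counter(classify_query(sp, m) for sp, m in queries)
best_close_match = Counter(classify_query(sp, m, threshold=0.03)
                           for sp, m in queries)
```

In the invented example, the third query is counted as a misidentification under “best match” but as unidentified under “best close match”, which mirrors how a threshold can lower the misidentification rate while raising the nonmatch ratio.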
4. Discussion
4.1. PCR and Sequencing Success Rates
Many museum specimens are very useful for DNA barcoding studies. However, high-quality DNA can be difficult to obtain from these specimens, making PCR amplification and sequencing inefficient. In this study, we recovered short ITS2 sequences from more than 90% of the herbal specimens representing 5 orders, whereas the recovery rate for ITS with a single primer set was only 23%. This discrepancy between the two regions arises because ITS is very long relative to ITS2, and ITS requires a variety of PCR conditions and additives for successful amplification. Another potential explanation is that intact DNA was difficult to extract from these samples due to the degradation that occurred in the museum specimens during the long storage period and in the herbs from the market during harvesting, processing, and storage. In contrast, the ITS2 region can be easily amplified and sequenced with conserved primers. Due to its relatively short length, the ITS2 minibarcode could be amplified with greater success than the full-length ITS sequences in almost all groups.
4.2. Identification Efficiency of ITS and ITS2
To determine whether barcode gaps are present in this study, the relationships between the inter- and intraspecific divergences were compared for each species. For the 12861 samples, ITS and ITS2 could identify 97.5% and 93.8% of genera, respectively, by the BLAST method. The full-length ITS could identify approximately 89.2% of the species, and the mini-DNA barcode ITS2 successfully identified approximately 79.2% of the species, which is higher than the rate for the CBOL-proposed plant combination of matK and rbcL (70%) [4, 5].
TaxonDNA was also used to compare the identification efficiencies of ITS and ITS2, and the result appeared to be similar to that obtained by the BLAST method. ITS had slightly superior successful identification and misidentification rates compared with ITS2, but the ambiguous identification rate of ITS2 was 0%, whereas that of ITS was 14.9% and 14.0% under the “best match” and “best close match” algorithms, respectively. The zero ambiguous identification rate of ITS2 may be due to its conserved secondary structure. The secondary structure of ITS2 has proven useful for diagnostic purposes at the species level, which might reduce the ambiguous identification rates and increase the correctness of the barcoding analysis. Evidence has shown that a combination of nucleotide and secondary structure data can overcome some of the limitations of ITS2 and that the ITS2 sequence and secondary structure (sequence-structure) provided the most accurate results, which benefit from the secondary structure [30, 34]. Thus, the use of the ITS2 secondary structures would be extremely helpful to address the challenges of species identification and classification.
4.3. ITS2 versus ITS: Advantages and Limitations
ITS2 has many advantages that make it superior to ITS. First, it is important that species be defined correctly for DNA barcoding by systematic analysis. ITS2 regions with secondary structures are more conserved than the DNA sequences alone, which could provide information that is useful for the cladistic inference of relationships, and the ITS2 sequence-structure information provides a compensatory base changes (CBCs) analysis result that correlates with the biological species concept. Thus, ITS2 has been considered a double-edged tool for evolutionary comparisons in eukaryotes.
Second, millions of species will need to be sequenced for a global barcode project, and this would be extremely costly using standard sequencing methods. The read lengths provided by high-throughput sequencing would be sufficient to build a database of ITS2 mini-DNA barcode sequences. High-throughput sequencing technology uses an emulsion PCR approach to simultaneously amplify several thousand 100–200 bp DNA molecules in one reaction and yields a large number of short sequences at a lower cost than standard approaches. Mello proved that the ITS2 read length obtained by high-throughput 454 sequencing provided adequate information for taxon assignment. Song et al. used high-throughput 454 sequencing to successfully obtain a large number of ITS2 sequences in one reaction. The amenability to high-throughput approaches and high identification efficiency make the ITS2 minibarcode useful for projects involving a large number of environmental samples.
Third, although ITS2 was less powerful than ITS for resolving some closely related species, it showed many advantages, especially in identifying herbs and specimens containing degraded DNA. ITS2 sequences could be used to design taxon-specific probes for the rapid identification of plants, and an ITS2 microarray has been used to successfully separate species with sequence identities up to 97%. Considering the short length and high identification efficiency of the ITS2 sequence, we confirmed that this very short barcode sequence is valuable for the identification of old specimens and medicinal materials.
Finally, there are hundreds of copies of ITS within a genome. Nonetheless, ITS2 can be considered a single locus in the whole genome of most organisms [10, 12, 37], including Panax ginseng and Panax quinquefolius (unpublished), making ITS2 more suitable as a barcode than ITS.
5. Conclusions
This study demonstrated the potential of the ITS2 minibarcode for DNA barcoding analyses. ITS2 showed high sequence variability among 12861 samples from 8313 species. An ideal DNA barcoding marker for taxonomic classification should be fast-evolving to allow classification at the species level but must also contain highly conserved priming sites and be highly reliable for DNA amplification and sequencing. The ITS2 region meets the expected criteria of a global DNA barcode. Our analysis supports the use of the ITS2 minibarcode as a “universal DNA barcode” for the rapid identification of medicinal materials and specimens.
Conflict of Interests
The authors declare that no conflict of interests exists in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant nos. 81001608 and 81130069). The authors thank their colleagues who helped in the sample collection, identification, laboratory work, and paper preparation, including Professor Yulin Lin, Chang Liu, and many others.
Table S1: List of 100 museum medicinal specimens and herbal products from the Buozhou herbal market and from specimens at the Institute of Medicinal Plant Development
Table S2: List of GenBank accession numbers of 12861 ITS sequences
Table S3: List of 603 genera that contain at least 3 samples
- W. J. Kress, K. J. Wurdack, E. A. Zimmer, L. A. Weigt, and D. H. Janzen, “Use of DNA barcodes to identify flowering plants,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 23, pp. 8369–8374, 2005.
- M. Hajibabaei, M. A. Smith, D. H. Janzen, J. J. Rodriguez, J. B. Whitfield, and P. D. N. Hebert, “A minimalist barcode can identify a specimen whose DNA is degraded,” Molecular Ecology Notes, vol. 6, no. 4, pp. 959–964, 2006.
- D. Rubinoff, S. Cameron, and K. Will, “Are plant DNA barcodes a search for the Holy Grail?” Trends in Ecology and Evolution, vol. 21, no. 1, pp. 1–2, 2006.
- CBOL Plant Working Group, “A DNA barcode for land plants,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106, no. 31, pp. 12794–12797, 2009.
- H. Yousefzadeh, A. H. Colagar, M. Tabari, A. Sattarian, and M. Assadi, “Utility of ITS region sequence and structure for molecular identification of Tilia species from Hyrcanian forests, Iran,” Plant Systematics and Evolution, vol. 298, no. 5, pp. 947–961, 2012.
- I. Álvarez and J. F. Wendel, “Ribosomal ITS sequences and plant phylogenetic inference,” Molecular Phylogenetics and Evolution, vol. 29, no. 3, pp. 417–434, 2003.
- M. E. Mort, J. K. Archibald, C. P. Randle et al., “Inferring phylogeny at low taxonomic levels: utility of rapidly evolving cpDNA and nuclear ITS loci,” American Journal of Botany, vol. 94, no. 2, pp. 173–183, 2007.
- B. G. Baldwin, M. J. Sanderson, J. M. Porter, M. F. Wojciechowski, C. S. Campbell, and M. J. Donoghue, “The ITS region of nuclear ribosomal DNA: a valuable source of evidence on angiosperm phylogeny,” Annals of the Missouri Botanical Garden, vol. 82, no. 2, pp. 247–277, 1995.
- A. Quijada, A. Liston, P. Delgado, A. Vázquez-Lobo, and E. R. Alvarez-Buylla, “Variation in the nuclear ribosomal DNA internal transcribed spacer (ITS) region of Pinus rzedowskii revealed by PCR-RFLP,” Theoretical and Applied Genetics, vol. 96, no. 3-4, pp. 539–544, 1998.
- A. W. Coleman, “Pan-eukaryote ITS2 homologies revealed by RNA secondary structure,” Nucleic Acids Research, vol. 35, no. 10, pp. 3322–3329, 2007.
- J. Schultz, S. Maisel, D. Gerlach, T. Müller, and M. Wolf, “A common core of secondary structure of the internal transcribed spacer 2 (ITS2) throughout the Eukaryota,” RNA, vol. 11, no. 4, pp. 361–364, 2005.
- A. W. Coleman, “ITS2 is a double-edged tool for eukaryote evolutionary comparisons,” Trends in Genetics, vol. 19, no. 7, pp. 370–375, 2003.
- S. Chen, H. Yao, J. Han et al., “Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species,” PLoS ONE, vol. 5, no. 1, Article ID e8613, 2010.
- T. Gao, H. Yao, J. Song et al., “Identification of medicinal plants in the family Fabaceae using a potential DNA barcode ITS2,” Journal of Ethnopharmacology, vol. 130, no. 1, pp. 116–121, 2010.
- X. Pang, J. Song, Y. Zhu, C. Xie, and S. Chen, “Using DNA barcoding to identify species within Euphorbiaceae,” Planta Medica, vol. 76, no. 15, pp. 1784–1786, 2010.
- X. Pang, J. Song, Y. Zhu, H. Xu, L. Huang, and S. Chen, “Applying plant DNA barcodes for Rosaceae species identification,” Cladistics, vol. 27, no. 2, pp. 165–170, 2011.
- J. P. Han, L. C. Shi, X. C. Chen, and Y. L. Lin, “Comparison of four DNA barcodes in identifying certain medicinal plants of Lamiaceae,” Journal of Systematics and Evolution, vol. 50, no. 3, pp. 227–234, 2012.
- H. Yao, J. Song, C. Liu et al., “Use of ITS2 region as the universal DNA barcode for plants and animals,” PLoS ONE, vol. 5, no. 10, Article ID e13102, 2010.
- D. Z. Li, L. M. Gao, H. T. Li et al., “Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants,” Proceedings of the National Academy of Sciences of the United States of America, vol. 108, no. 49, pp. 19641–19646, 2011.
- B. Merget, C. Koetschan, T. Hackl et al., “The ITS2 database,” Journal of Visualized Experiments, vol. 61, article e3806, 2012.
- T. Müller, N. Philippi, T. Dandekar, J. Schultz, and M. Wolf, “Distinguishing species,” RNA, vol. 13, no. 9, pp. 1469–1472, 2007.
- A. Keller, T. Schleicher, J. Schultz, T. Müller, T. Dandekar, and M. Wolf, “5.8S-28S rRNA interaction and HMM-based ITS2 annotation,” Gene, vol. 430, no. 1-2, pp. 50–57, 2009.
- T. J. White, “Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics,” in PCR Protocols: A Guide to Methods and Applications, pp. 315–322, Academic Press, New York, NY, USA, 1990.
- C. Sass, D. P. Little, D. W. Stevenson, and C. D. Specht, “DNA barcoding in the Cycadales: testing the potential of proposed barcoding markers for species identification of Cycads,” PLoS ONE, vol. 2, no. 11, Article ID e1154, 2007.
- S. J. Chiou, J. H. Yen, C. L. Fang, H. L. Chen, and T. Y. Lin, “Authentication of medicinal herbs using PCR-amplified ITS2 with specific primers,” Planta Medica, vol. 73, no. 13, pp. 1421–1426, 2007.
- W. J. Kress and D. L. Erickson, “A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region,” PLoS ONE, vol. 2, no. 6, article e508, 2007.
- R. Lahaye, M. van der Bank, D. Bogarin et al., “DNA barcoding the floras of biodiversity hotspots,” Proceedings of the National Academy of Sciences of the United States of America, vol. 105, no. 8, pp. 2923–2928, 2008.
- C. P. Meyer and G. Paulay, “DNA barcoding: error rates based on comprehensive sampling,” PLoS Biology, vol. 3, no. 12, article e422, 2005.
- R. Meier, K. Shiyang, G. Vaidya, and P. K. Ng, “DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success,” Systematic Biology, vol. 55, no. 5, pp. 715–728, 2006.
- A. Keller, F. Förster, T. Müller, T. Dandekar, J. Schultz, and M. Wolf, “Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees,” Biology Direct, vol. 5, article 4, 2010.
- H. A. Ross, S. Murugan, and W. L. S. Li, “Testing the reliability of genetic methods of species identification via simulation,” Systematic Biology, vol. 57, no. 2, pp. 216–230, 2008.
- M. W. Chase, R. S. Cowan, P. M. Hollingsworth et al., “A proposal for a standardised protocol to barcode all land plants,” Taxon, vol. 56, no. 2, pp. 295–299, 2007.
- M. A. Buchheim, A. Keller, C. Koetschan, F. Förster, B. Merget, and M. Wolf, “Internal transcribed spacer 2 (nu ITS2 rRNA) sequence-structure phylogenetics: towards an automated reconstruction of the green algal tree of life,” PLoS ONE, vol. 6, no. 2, Article ID e16931, 2011.
- A. Grajales, C. Aguilar, and J. A. Sánchez, “Phylogenetic reconstruction using secondary structures of Internal Transcribed Spacer 2 (ITS2, rDNA): finding the molecular and morphological gap in Caribbean gorgonian corals,” BMC Evolutionary Biology, vol. 7, article 90, 2007.
- S. M. Markert, T. Müller, C. Koetschan, T. Friedl, and M. Wolf, “‘Y’ Scenedesmus (Chlorophyta, Chlorophyceae): the internal transcribed spacer 2 rRNA secondary structure re-revisited,” Plant Biology, vol. 14, no. 6, pp. 987–996, 2012.
- A. Mello, C. Napoli, C. Murat, E. Morin, G. Marceddu, and P. Bonfante, “ITS-1 versus ITS-2 pyrosequencing: a comparison of fungal populations in truffle grounds,” Mycologia, vol. 103, no. 6, pp. 1184–1193, 2011. View at: Google Scholar
- J. Song, L. Shi, D. Li et al., “Extensive pyrosequencing reveals frequent intra genomic variations of internal transcribed spacer regions of nuclear ribosomal DNA,” PLoS ONE, vol. 7, no. 8, Article ID e43971, 2012. View at: Google Scholar
- M. Hajibabaei, G. A. C. Singer, E. L. Clare, and P. D. N. Hebert, “Design and applicability of DNA arrays and DNA barcodes in biodiversity monitoring,” BMC Biology, vol. 5, article 24, 2007. View at: Publisher Site | Google Scholar
- J. C. Engelmann, S. Rahmann, M. Wolf et al., “Modelling cross-hybridization on phylogenetic DNA microarrays increases the detection power of closely related species,” Molecular Ecology Resources, vol. 9, no. 1, pp. 83–93, 2009. View at: Publisher Site | Google Scholar
- P. Taberlet, E. Coissac, F. Pompanon et al., “Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding,” Nucleic Acids Research, vol. 35, no. 3, article e14, 2007. View at: Publisher Site | Google Scholar
Copyright © 2013 Jianping Han et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Intercontinental lineage divergence
Our results strongly support a hypothesis of lineage divergence and a lack of contemporary gene flow between North American and Scandinavian populations of T. populinum. Migration corridors were present either via the Bering Land Bridge (BLB) from eastern Siberia to Alaska or via the North Atlantic Land Bridge (NALB) between Europe and North America through Greenland. The NALB was intact until c. 40 Ma, but functional as late as 25–15 Ma (Milne, 2006). However, recently, Denk et al. (2010) concluded that the NALB may have been an active route into the late Miocene (11.6–5.3 Ma). The BLB was severed approx. 5.4–5.5 Ma (Gladenkov et al., 2002) and may have been functional continuously between 65 and 5.4 Ma (Milne, 2006). Ingvarsson (2005), using an average (five-gene) silent-site divergence between P. tremula and P. trichocarpa of 6.1% and assuming a silent-site substitution rate of 5.0–8.0 × 10⁻⁹, estimated the divergence of P. tremula and P. trichocarpa to have been between 3.8 and 6.2 Ma. Our coalescent-based estimates also place the divergence of P. tremula and P. balsamifera/P. trichocarpa during the Late Miocene to Pliocene (5.3–2.6 Ma) epochs, roughly coincident with the opening of the Bering Strait (Milne, 2006). However, estimates of divergence in Tricholoma are much more recent and place the divergence of these lineages during the Pleistocene (2.6–0.01 Ma) epoch. Our divergence estimates strongly suggest that Tricholoma and its Populus hosts did not undergo intercontinental migration in tandem.
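The arithmetic behind a rate-based estimate of this kind can be sketched as follows. This is only an illustration, assuming the standard two-lineage strict-clock relation T = d / (2μ) and assuming the substitution rate is expressed per site per year (the units are not stated above). The function name is hypothetical and this is not necessarily the exact procedure used by Ingvarsson (2005).

```python
def divergence_time_ma(divergence, rate_per_site_per_year):
    """Time since the split (in Ma) under a strict clock: T = d / (2 * mu).

    The factor 2 reflects that substitutions accumulate along both lineages.
    """
    years = divergence / (2.0 * rate_per_site_per_year)
    return years / 1e6


d = 0.061  # 6.1% silent-site divergence, as quoted in the text
fast, slow = 8.0e-9, 5.0e-9  # ASSUMED units: substitutions per site per year

youngest = divergence_time_ma(d, fast)  # faster clock gives a younger split
oldest = divergence_time_ma(d, slow)    # slower clock gives an older split
print(f"{youngest:.2f} to {oldest:.2f} Ma")
```

Under these assumptions the sketch yields roughly 3.8 to 6.1 Ma, close to the 3.8–6.2 Ma range quoted above; the small difference at the upper bound presumably reflects rounding of the published inputs.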
Possible explanations for the strongly supported, but more recent, intercontinental lineage divergence within T. populinum include the following: allopatric isolation of T. populinum populations in Scandinavia and North America that was coincident with speciation of Populus hosts (P. tremula and P. balsamifera/P. trichocarpa); effects of population contraction and expansion during the glacial and interglacial cycles of the Pleistocene; and founder effect followed by genetic drift after a single, rare long-distance dispersal event. Since there were no shared haplotypes between continents, analyses of directional migration were not possible.
Determining the possible role that coevolution with different Populus species (P. tremula vs P. balsamifera/P. trichocarpa) had on intercontinental T. populinum divergence will require additional sampling of host populations within Europe and North America or reciprocal inoculation trials. Based on the taxonomic distribution of the host species of T. populinum, it is unlikely that the North American and European lineages of T. populinum diverged solely as a result of cospeciation with Populus. It is curious that there are few collections of T. populinum in North America with poplar species other than P. trichocarpa and P. balsamifera, because P. tremula and P. trichocarpa/P. balsamifera reside in two different and well-diverged sections of the genus. This enigma suggests that environmental factors associated with boreal forest-like environments may limit the distribution of T. populinum.
Overall, the extent to which T. populinum occurs in Europe, Asia and North America is not well documented. T. populinum is known to occur with P. alba and P. nigra, and these populations need to be sampled to assess population structure and host specificity across Europe. Populations of T. populinum from other Populus species, including quaking aspen, P. tremuloides, which is the North American sister species to P. tremula, are needed to provide a more complete North American phylogeographic history. Within North America the only potential population sampling of another Populus host came from samples from Colorado, where P. tremuloides is dominant, although pockets of P. balsamifera are present. These isolates clearly group within the North American clade in each of the gene trees; however, without a clear identification of host species, no further conclusions are possible at this time.
Estimates of average genetic diversity were low but similar in both Scandinavia and North America (Hd, π, θ; Table 2), reflecting recent demographic events. Low diversity is indicative of a recent founding event or postglacial recolonization bottlenecks (Hewitt, 2004), but also may reflect low mutation rate. Scandinavian populations of P. tremula survived in glacial refugia on the European continent (Birks et al., 2008; Fussi et al., 2010). In North America, recent studies of P. balsamifera found evidence for glacial refugia in southern central Canada (Keller et al., 2010; Levsen et al., 2012) and possibly in a glacial-free region of Beringia (Hultén, 1937; Breen et al., 2012) during the last glacial maximum c. 28 000–15 000 yr before present (Brubaker et al., 2005). T. populinum may have survived with these hosts in glacial refugia and would then have undergone population contraction and expansion during glacial and interglacial periods.
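For readers unfamiliar with the three diversity statistics named above, the following is a minimal sketch of their textbook definitions: Nei's haplotype diversity (Hd), nucleotide diversity (π) and Watterson's θ per site. It assumes equal-length, gap-free aligned sequences; the function names and the toy alignment are invented for illustration and are not the study's data or software.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """pi: mean proportion of differing sites over all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """Per-site theta_W = S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / (a_n * length)


# Toy alignment, invented for illustration only (NOT data from the study).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
print(hd, pi, theta)
```

Low values of all three statistics, as reported for both continents, are what one would expect after a recent bottleneck, although a low mutation rate would produce a similar signal.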
While the number of biogeographic studies of ectomycorrhizal fungi is increasing (Douhan et al., 2011), few Holarctic ectomycorrhizal fungi have well-documented phylogeographies. Studies are beginning to show a recurrent pattern of intercontinental divergence. The ectomycorrhizal fungi Leccinum scabrum and Leccinum holopus both have phylogeographic structure similar to T. populinum, and both Leccinum spp. have genetic discontinuities between the North American and European continents, and little intracontinental phylogeographic structure (den Bakker et al., 2007). These Leccinum spp. are also host specialists, associated only with Betula spp., and occur only in the Northern Hemisphere (den Bakker et al., 2007). In a recent study using microsatellites and nuclear loci, Vincenot et al. (2012) found that the ectomycorrhizal fungus Laccaria amethystina formed phylogenetic lineages in Europe and Japan with no shared haplotypes between regions. Genetic structure was not detected and only weak isolation by distance (IBD) was noted within Japan (FST = 0.04; 960 km between the two populations) and Europe (FST = 0.041; 2900 km between the most distant populations) (Vincenot et al., 2012). Our results are very similar to those of Vincenot et al. (2012) at both inter- and intracontinental scales. We also found low differentiation of T. populinum populations in North America that were separated by as much as 2500 km (the maximum distance between our Pacific Northwest and interior Alaskan populations). A major difference between these study systems, however, is that L. amethystina associates with a variety of hosts, while T. populinum only occurs with Populus species. An interesting suggestion by Vincenot et al. (2012) is that L. amethystina may constitute a ring species. Future studies of Holarctic ectomycorrhizal fungi should also consider this hypothesis.
There are, however, also examples of intracontinental population structure within ectomycorrhizal species. A series of studies of Amanita muscaria, traditionally thought of as a host-generalist mycorrhizal fungus with a broad geographic range, have revealed strong allopatric divergence between Eurasia/Alaska and North American lineages from similar habitats as a result of a lack of gene flow (Oda et al., 2004; Geml et al., 2006, 2008). However, in stark contrast to T. populinum, A. muscaria was demonstrated to be a phylogenetic species complex with strong divergence within North America attributed to ecoregional endemism (Geml et al., 2006, 2008, 2010). Furthermore, phylogenetic species were found in sympatry in several regions (Geml et al., 2006, 2008).
Tricholoma divergence time estimates
In this study, we estimated divergence time in T. populinum using two methods: one based on an estimated average nucleotide substitution rate for the ITS and the second based on a calibration point within the Basidiomycota but external to Tricholoma. Since the likelihood of discovering Tricholoma in the fossil record is very low, internal calibration of estimates of divergence time may never be possible. Hence we used calibration points between major groups within the Basidiomycota. It is clear that the lower substitution rate, 0.1 × 10⁻⁹, is not appropriate, as these estimates of divergence for Tricholoma spp. from T. populinum (c. 150 Ma) are approximately the same as the estimated diversification of the Agaricales (Geml et al., 2004; Matheny et al., 2009). By contrast, the congruence of the estimates based on upper (0.87 Ma; 95% HPD: 0.35, 1.46) and average (1.7 Ma; 95% HPD: 0.76, 2.95) ITS nucleotide substitution rates and from calibration points (1.0 Ma; 95% HPD: 0.17, 2.40) within the Basidiomycota is striking. As sequences from more fungal genomes become available, robust estimates of nucleotide substitution rates will be developed for increasing numbers of representative Basidiomycota, and numerous additional loci will become available; both developments will increase the accuracy of divergence time estimates. It should be possible to refine the hypotheses presented here on the divergence time estimates among T. populinum lineages in future studies.
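One simple way to see the congruence among the three estimates is to check whether their 95% HPD intervals share a common range. The sketch below does only that, using the interval bounds quoted in the text; the dictionary labels are informal names chosen for illustration. Interval overlap is a rough consistency check, not a formal statistical test.

```python
# 95% HPD intervals (Ma) as quoted in the text; labels are informal.
hpd = {
    "upper ITS rate": (0.35, 1.46),
    "average ITS rate": (0.76, 2.95),
    "external calibration": (0.17, 2.40),
}

# The intervals share a common range if the largest lower bound
# does not exceed the smallest upper bound.
shared_low = max(low for low, _ in hpd.values())
shared_high = min(high for _, high in hpd.values())
intervals_overlap = shared_low <= shared_high

print(intervals_overlap, (shared_low, shared_high))
```

All three intervals overlap in a common window that lies entirely within the Pleistocene as delimited above (2.6–0.01 Ma).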
This was the first phylogenetic and phylogeographic study of T. populinum. The only other study examined genet size and longevity at a very fine spatial scale associated with P. nigra in France (Gryta et al., 2006). Thus we know very little about the reproductive biology of T. populinum. The two T. populinum lineages identified here may be considered phylogenetic species, or cryptic species, according to genealogical concordance phylogenetic species recognition (GCPSR; Taylor et al., 2000). The number of different lineages in the T. populinum complex can only be ascertained after additional sampling of hosts and geographic locations as discussed earlier. The degree to which T. populinum lineages remain interfertile and undergo hybridization as a result of secondary contact between North American T. populinum lineage(s) and T. populinum lineage(s) associated with European or Asian poplars has not been examined. In this study, North American T. populinum populations from field collections were associated with host populations devoid of introduced Populus spp. Populus spp. frequently hybridize and it may be possible to examine hybridization of T. populinum from known Populus hybrid zones in future work.
Our results provide strong evidence for a lack of ongoing gene flow between North American and Scandinavian populations of T. populinum. The emerging trend of intercontinental genetic breaks in ectomycorrhizal fungi from mid-latitudes implies that fungal biogeography will be a rich area of inquiry, with relevance to past and present organismal responses to climate change.
The naturalness and content of subtribe Disinae
The ITS-rDNA sequence comparison of 30 Diseae, 20 Orchideae, and four Cranichideae and Diurideae outgroups indicates the naturalness of subtribe Disinae. Topologies constrained for the paraphyly of Disinae involve up to 17 extra steps and exhibit significantly lower likelihood when compared to the best tree (Kishino and Hasegawa tests; Table 2).
Disinae contain the genera Disa, Herschelia, Monadenia, and Schizodium. The type of the subtribe is the large genus Disa (128 species). The delimitation of Disa has been problematic due to: (1) the status of Monadenia, which had been considered a member of Disa by some authors, but with no clear affinities to any section (Kurzweil et al., 1995); (2) recognition of the genus Herschelia (often known under the name Herschelianthe), despite its close relationships to D. section Stenocarpa (Kurzweil et al., 1995); and (3) possible distinction of D. section Micranthae as a separate genus (Chesselet, 1989). The molecular phylogeny supports a circumscription of the genus Disa that includes Monadenia and Herschelia (as Disa bracteata and Disa spathulata, respectively; Figs. 1, 3), therefore confirming previous observations (Linder and Kurzweil, 1994) and supporting the circumscription of the genus advocated by Bolus (1884, 1885) and Schlechter (1901). Additionally, Disa (Herschelia) spathulata strongly clusters with Disa chrysostachya (section Micranthae), and this clade is weakly supported as sister to Disa (Monadenia) bracteata. Among the 14 Disa sections recognized by Linder (1981a, b), our study included only four sections: Disa (six representatives), Coryphaea (two), Micranthae (one), and Phlebidia (one; Table 1). According to Linder (1981a, b), plants possess either erect anthers (rarely, section Micranthae) or reflexed anthers. In the latter case, petals are reflexed next to the anther, either blue (section Phlebidia) or another color (section Disa), or plants have erect petals, free from the rostellum, and having lateral lobes squared (section Coryphaea). Sections Disa and Coryphaea are paraphyletic because Disa rosea (section Disa) clusters with D. sagittalis (section Coryphaea) in a clade distinct from other representatives of sections Disa and Coryphaea. This result agrees with Linder (1981a, p. 272), who noted that D. rosea is morphologically isolated within the Corymbosae series. By contrast, a well-defined clade including D. cardinalis, D. tripetaloides, D. racemosa, D. uniflora, and D. pillansii is always recovered. This section Disa s.s. is characterized by chromosome numbers ranging from 2n = 36 to 38, and is distinct from the 2n = 40 found in D. sagittalis (Pienaar et al., 1989). Within D. section Disa s.s., the maximum parsimony analysis of ITS data with indels coded as a fifth character (Fig. 1) supports the monophyly of series Racemosae (D. racemosa, D. cardinalis, D. tripetaloides, D. uniflora).
The results within Disa must be considered in light of the relatively thin sampling of sections of the genus for ITS sequences (12 of the 128 Disa species). In morphological analyses (Johnson, Linder, and Steiner, 1998; Linder and Kurzweil, unpublished data), D. sect. Stenocarpa was sister to the Disa (Herschelia) spathulata group and the Disa (Monadenia) bracteata group clustered near the base of Disa. Sampling of more species diversity within the large genera (Disa, Satyrium, Disperis, and Corycium) would be required to more accurately assess infrageneric boundaries.
The naturalness and content of subtribe Satyriinae
Satyriinae include the genera Satyrium and Pachites (not available for this study), defined by two morphological synapomorphies: an elongated column and lack of differentiation between sepals and petals (Linder and Kurzweil, 1994). Five of the seven sections recognized in the large genus Satyrium (100 species, Schlechter, 1901; 88 species in Kurzweil and Linder, in press) were included in our molecular study: Eusatyrium, Leptocentrum, Chlorocorys, Brachysaccium, and Satyridium. The ITS results do not correspond to some of these taxonomic sections, but sampling is too limited (ten of the 100 Satyrium species) to draw conclusions. These sections, with the exception of Brachysaccium and Chlorocorys (one species of each sampled here), are also not supported by morphological, anatomical, and pollen/seed ultrastructural characters (Kurzweil and Linder, in press). Representatives of sections Eusatyrium (Satyrium membranaceum, S. humile, S. acuminatum, and S. carneum) and Leptocentrum (S. stenopetalum and S. ligulatum) are reciprocally embedded, thus indicating paraphyly of both groups. The apomorphy of two basal leaves appressed to the ground for S. section Eusatyrium thus represents a homoplastic condition (also found in Kurzweil and Linder, in press). For the same reason, the labellum possessing two threadlike calli (perhaps spurs) in sections Leptocentrum (pinkish-reddish, whitish, or yellowish leaves; labellum oblong) and Chlorocorys (greenish leaves; suborbicular labellum) likely represents a convergent character state (identical to the findings of Kurzweil and Linder [in press], but they found that the latter also has other apomorphic characters and was nonetheless monophyletic).
The monotypic section Satyridium (Satyridium rostratum = Satyrium rhynchantum) has often been treated as a monotypic genus. Our molecular data do not support this view: Satyrium rhynchantum is always strongly associated with S. bicallosum, indicating that Satyridium should be synonymized with Satyrium. In the morphological/anatomical analysis (Kurzweil and Linder, in press), it was also well nested within Satyrium, but the association with S. bicallosum was not predicted. Both species are morphologically unusual and are highly reduced or modified in several characteristics, leaving the impression that they are each isolated within the genus. As noted by Schlechter (1901), the labellum has short “little sacs” (“Säckchen”) in section Brachysaccium (here represented by S. bicallosum). In S. rhynchantum, these are fleshy and elongate-conical, and so this character may represent a true synapomorphy defining the S. bicallosum/S. rhynchantum clade. To evaluate this hypothesis, ITS should be sequenced for representatives of sections Aviceps and Leucocomus because the “little sacs” of the labellum are also encountered in these two groups.
Satyriinae also include Pachites, but its status is unclear (no material was available for this study). This genus comprises two rare species (P. appressa and P. bodkinii) and may be paraphyletic. Pachites bodkinii may actually be misplaced within Satyriinae because of the presence of monostelic (i.e., an undissected siphonostele) tubers, a character potentially of great assistance in resolving phylogenetic questions among Diseae (Kurzweil et al., 1995).
Coryciinae: relationships between the Pterygodium/Corycium complex and Disperis
Coryciinae originally included the genera Corycium, Pterygodium, Evotella, Ceratandra, and Disperis, all with flowers possessing a lip with an appendage (Kurzweil et al., 1991, hypothesized that the absence of the appendage in three species of Ceratandra was a secondary loss). Kurzweil et al. (1995) suggested that the two clades Corycium/Pterygodium and Ceratandra/Evotella cluster together with Disperis as their sister group. Morphological support for Coryciinae s. s. is clear, with five apomorphic characters, whereas only two characters support a relationship of these with Disperis (Linder and Kurzweil, 1994).
The MP and ML trees indicated paraphyly of Coryciinae with two distinct clades (Figs. 1, 3): (1) Coryciinae s.s., including Corycium and Pterygodium, as sister group to Orchideae and Satyriinae; and (2) Disperis as sister to Orchideae plus all other Diseae. However, the test of significance in likelihood differences does not reject the monophyly of the subtribe (Table 2), and it should be noted that we have sampled only two members of the large and widespread genus Disperis. If monophyly of Coryciinae is constrained with a requirement of six extra steps, Disperis pairs with Pterygodium, conflicting with the conclusions of Kurzweil et al. (1995).
The amount of ITS divergence is high within the Corycium/Pterygodium complex (e.g., Fig. 3). Corycium is paraphyletic due to the internal position of Pterygodium with C. carnosum sister to the whole clade. More sampling is needed to evaluate better the position of Pterygodium, and it is possible that our result is an effect of undersampling (5 of the 32 Corycium/Pterygodium species were sampled) and the higher levels of divergence encountered. The isolated position of Corycium carnosum was emphasized on the basis of its morphology and anatomy (Kurzweil et al., 1991). The ITS data confirm its early diverging position among Coryciinae s.s. Corycium nigrescens and C. dracomontanum are close and sister to C. flanaganii, in full agreement with the morphological and anatomical study of Kurzweil et al. (1991). This clade is defined by a unique synapomorphy: the fusion of the lateral sepals.
Generic delimitation of Disperis has never been problematic, and the genus is isolated among Coryciinae (Kurzweil et al., 1991). The 5.8S rDNA sequences of Disperis capensis and D. lindleyana confirm their uniqueness because these two species are the only orchids possessing a single nucleotide insertion in this gene (i.e., cytosine in position 128), in a region known to be of variable length over a broad spectrum of plants (Hershkovitz and Lewis, 1996). A caveat to this statement must include as well an acknowledgment that <500 ITS sequences exist for a family with >20 000 species. The ITS1 and ITS2 sequences of the two Disperis are also highly divergent relative to other Diseae and Orchideae orthologues, as illustrated by the mean overall divergence of 51.0 ± 3.0% (ITS1) and 53.7 ± 3.0% (ITS2) for 96 pairwise comparisons. We cannot exclude the possibility that we have amplified an ITS pseudogene because divergent paralogues have been described in the nuclear genome of angiosperms (Buckler, Ippolito, and Holtsford, 1997). However, the amplification and sequencing of the ITS region were completed twice independently for Disperis lindleyana, without any differences detected between the two derived sequences. The two Disperis sequences also exhibited similarities (BS = +65; Fig. 1), including three consecutive 16-bp, 5-bp, and 9-bp deletions and a unique [AATTGC] insertion in the ITS1 region.
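A mean divergence of the kind reported above can be illustrated with a short sketch. It assumes uncorrected p-distances on pre-aligned sequences, with gapped sites skipped pairwise; the text does not state which distance measure or gap treatment was actually used. The function names and the toy sequences are invented for illustration. Note that comparing two focal sequences against 48 others would give 2 × 48 = 96 pairs, which matches the number of comparisons reported, although that pairing is an inference rather than something stated in the text.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance: fraction of differing sites, skipping gaps."""
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def mean_divergence(focal, others):
    """Mean p-distance over every focal-vs-other pair, plus the pair count."""
    distances = [p_distance(f, o) for f in focal for o in others]
    return sum(distances) / len(distances), len(distances)


# Toy aligned sequences, invented for illustration (NOT the study's data).
focal = ["ACGT-ACGTA", "ACGTTACGTT"]
others = ["ACGATACGAA", "TCGATACGAA", "ACG-TACGAA"]
mean_d, n_pairs = mean_divergence(focal, others)
print(f"mean p-distance over {n_pairs} pairs: {mean_d:.3f}")
```

Uncorrected distances above 50%, as reported for Disperis, are far into the range where multiple substitutions per site are expected, which is consistent with the concern about long branches raised in the next paragraph.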
The divergent position of Disperis (Figs. 1, 3) could be explained by its apparent accelerated substitution rate leading to a long-branch attraction (Felsenstein, 1978) by the cranichid and diurid outgroups. Additional ITS sequencing among Disperis may help to clarify the occurrence of the molecular rate acceleration and to break up the long branch leading to Disperis capensis and D. lindleyana. Moreover, it is hypothesized that all phylogenetic reconstruction methods provide estimates of tree shape that are more unbalanced than the true tree, especially when rates of evolution are high (Huelsenbeck and Kirkpatrick, 1996). Such a trend is clear with the ITS data judging by the asymmetry of our MP and ML trees and the long branches for Diuris, Elythranthera, and Disperis (Figs. 1, 3). Paraphyly of Diseae may simply reflect the general behavior of phylogenetic algorithms to produce unbalanced trees rather than the true evolutionary history of the main clades of Diseae and Orchideae.
In support of the patterns obtained here with ITS, studies of three plastid regions, trnL-F, rbcL, and matK, demonstrate the same pattern of relationships for Disperis (Kores, Molvray and Chase, unpublished data), and we suggest that the basal position of Disperis is neither the result of a long-branch problem, nor the consequence of a tree-shape artifact. Likewise, Diseae s.l. are not monophyletic in these plastid trees: Satyriinae are sister to genera in Orchidinae/Habenariinae, but Disinae and Coryciinae are together sister to Satyriinae/Orchidinae/Habenariinae (a relationship supported here by the ML tests). Overall patterns of relationships as estimated by both plastid and nuclear regions are highly congruent, and we anticipate eventually being able to combine these and the morphological/anatomical data in a single analysis once the sampling has been made more comparable and the remaining critical taxa are added to the molecular matrices.
The isolated position of Brownleeinae
Brownleeinae are monogeneric and have been hypothesized to have originated as a hybrid between species of Disinae and Coryciinae (Linder and Kurzweil, 1994). Linder and Kurzweil (1994) could not support the grouping of Brownleea with any confidence. Recognition of the subtribe Brownleeinae is rather recent (Linder and Kurzweil, 1994), and we here included a single representative, Brownleea coerulea. The ITS sequence of B. coerulea was without any indication of heterogeneity, but a long-established hybrid line may have its initial heterogeneity eliminated by concerted evolution. The constraint of Diseae monophyly (cladogram not shown) and the ML evaluation of alternative hypotheses (Table 2) both indicate that Brownleeinae may be the sister group of Disinae, despite the early divergence of Brownleea coerulea among Diseae/Orchideae in the MP (Fig. 1) and ML trees (Fig. 3). Linder and Kurzweil (1996) demonstrated the morphological heterogeneity within Brownleea, and ITS sequences from the six remaining Brownleeinae species, at least B. parviflora and B. mulanjiensis, are required to better assign the phylogenetic position of this isolated subtribe. Brownleea has not yet been sampled for any plastid regions.
Phylogenetic relationships within Orchideae
Monophyly of Orchidinae is strongly supported by the molecular data (Figs. 1, 3). The phylogeny of the subset of Orchidinae genera included in the present study generally confirms the detailed molecular ITS analyses by Pridgeon et al. (1997) and Bateman, Pridgeon, and Chase (1997). Barlia, Ophrys, Serapias, and Orchis morio cluster in a clade characterized by globose tubers and chromosome numbers 2n = 32 or 36. Gymnadenia, Dactylorhiza, Pseudorchis, Platanthera, and Orchis italica fall into a more weakly supported clade, whereas they form a paraphyletic assemblage in the study by Pridgeon et al. (1997). This discrepancy likely arises from the difference in the level of sampling. Pridgeon et al. (1997) included 88 species with 79 Orchidinae and nine outgroups, whereas the present study included 54 species with nine Orchidinae and 45 outgroups.
Monophyly of Habenariinae appears in our molecular trees but is not well supported (Figs. 1, 3). The position of Brachycorythis (Orchidinae sensu Dressler, 1993) is different in the MP and ML trees but has no internal support in either position. This result for Brachycorythis was not unanticipated: although the flowers look superficially like those of Orchidinae, it is becoming clear that Orchidinae is a north temperate subtribe with no representation likely in Africa (Pridgeon et al., 1997). The relationships between taxa of Habenariinae are actually poorly resolved by the ITS data, except for the clustering of Herminium lanceum with Habenaria sagittifera (cf. Pridgeon et al., 1997) and Habenaria procera with H. arenaria/Bonatea speciosa. The sequences of Holothrix scopularia and Habenaria repens are virtually identical, requiring additional sequencing of other Habenaria repens individuals and other Holothrix species to check for possible problems in sequence determination or species identification (they were not extracted at the same time, so switching of the samples is not likely). This pattern of relationships would be difficult to justify with morphological data.
Many of the other genera now assigned to Habenariinae have at one time or other been considered members of Habenaria. For example, Gennaria, Herminium, and Bonatea have combinations already made in Habenaria. We expected the circumscription of Habenaria to be problematic and so these results do not surprise us. Habenaria is a large cosmopolitan genus and at one time even included species of the distantly related genus Platanthera (Orchidinae). We expect that some species groups may have switched pollinator relationships and thus been segregated by authors relying solely upon floral characters for generic delimitation.
The position of Satyriinae as sister to Orchidinae/Habenariinae has its parallel in the studies of Linder and Kurzweil (in press), in which the placement of the former was ambiguous, with two of three possible arrangements present in their set of minimal length trees. The flowers of this subtribe are nonresupinate, and the column is highly modified, bent so that the base is uppermost to permit the entry of the pollinator from the opposite side of the flower. The presence of two spurs produced by the lip is not like other Diseae and is more similar to spurs in Orchideae, in which a single spur is produced, also by the lip. The column in the flowers of most Habenariinae (Orchideae) is so short that its exact orientation is not possible to determine. The main reason for placement of Satyriinae in Diseae is the presence of the bent anther, but an erect anther would be expected in flowers such as those of Orchideae in which the lip is the lowermost part of the perianth.
The congruence between morphological and ITS data
Numerous studies and cladistic analyses are available for Diseae (e.g., Kurzweil, Linder, and Chesselet, 1991; Kurzweil et al., 1995; Johnson, Linder, and Steiner, 1998). The superimposition of morphological, anatomical, and palynological character states on molecular cladograms should allow us to distinguish among characters incongruent with the rest of the data set, characters that support small subsets of taxa, and those that support a large set of taxa (Kurzweil et al., 1995). The vegetative and reproductive anatomical characters of Kurzweil et al. (1995) were mapped onto our MP cladograms to reveal potential synapomorphies defining the clades evidenced by the ITS data (Fig. 4).
Disinae excluding Schizodium (not represented in this study) are defined by three synapomorphies that are exclusive across Diseae: the sepals often apiculate, the lip patent or descending at the base, and the pollen surface rugose or hamulate. Satyrium (plus Satyridium) are marked by five exclusive synapomorphies: petals similar to the lateral sepals, lip galeate with two spurs, a well-developed column, and enlarged cells on the adaxial leaf epidermis. The Corycium/Pterygodium clade is defined by six exclusive synapomorphies: a single and fused lip-appendage, anther cells separated by a wide connective, elongated pollen tetrads, fasciculate massulae, and a striate and secondarily tectate pollen surface.
This review of diagnostic characters across Diseae (Fig. 4) indicates broad and general agreement between morphology/anatomy and the ITS data sets. However, the arrangement of vascular strands in tubers was thought to be of great assistance in resolving the phylogenetic questions in Diseae (Kurzweil et al., 1995). These molecular data validate the scenario of a monostelic (i.e., undissected tuber siphonostele) ancestral condition in Diseae and Orchideae. The “polystelic” condition (i.e., dissected siphonostele) is convergently derived in Disperis lindleyana, Brownleea, Herschelia, and in the Satyrium/Brachycorythis/Holothrix clade. Further studies to elucidate orchidoid phylogeny may focus on dissemination structures, either pollen or seed, following the example of Molvray and Kores (1995) in determining qualitative and quantitative characters of the diurid seed coat.
The phylogenetic affinities of Diseae relative to other core orchidoids
The tribe Diseae includes ∼400 species distributed in South Africa, Arabia, Madagascar, Mascarenes, India, China, and Indonesia. The greatest diversity is found in southern Africa, with several endemic genera (e.g., Schizodium). The tribe has been recognized as monophyletic, with all Diseae being characterized by the reflexed anther, thus placing pollinia on the ventral surfaces of pollinators (e.g., Dressler, 1993; this is actually a single character related to the orientation in which pollen vectors approach these flowers). The major discrepancy between morphology and the ITS tree thus centers on the question of whether or not Diseae are monophyletic. To investigate whether the weakly supported paraphyly of Diseae in the ITS tree could be taken as a rejection of monophyly, we constrained the ITS data to this topology (cladogram not shown). Monophyly of Diseae involves 19 extra steps compared to the MP trees (Fig. 1), which is a significantly less likely topology when compared to the highest-likelihood tree (Fig. 3, Table 2). In the context of Diseae monophyly, Satyriinae emerges first, and then Coryciinae (sister group of Disinae) plus Brownleea. The distinctness of Satyriinae relative to other Diseae is thus confirmed, and future studies should explore the evolutionary affinities between Satyriinae, Orchidinae, and Habenariinae. Dressler (1986) suggested that Disinae and Satyriinae are closer to each other than to Coryciinae because they share brightly colored sepals and a subterminal stigma. These characters may, however, be homoplastic because they are part of the pollinator-attracting/orienting syndrome. The ITS data reject this hypothesis (Table 2) but favor the association between Disinae and Coryciinae. This view agrees with Dressler (1993) and Kurzweil et al. (1995) who stressed the horizontally reflexed anther and the galeate dorsal sepal as synapomorphies for Disinae plus Coryciinae. This relationship is also found in our current plastid tree (Kores, Molvray and Chase, unpublished data).
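The idea of counting "extra steps" under a topological constraint can be illustrated with a minimal Python sketch. It is not the PAUP analysis used in this study: it applies Fitch parsimony to a hypothetical four-taxon toy alignment (taxa A to D and their sequences are invented for illustration), treats every symbol, including a gap, as an unordered state, and compares the length of an unconstrained topology with that of a constrained one.

```python
def fitch_site(tree, states):
    """Fitch pass for one character on a rooted binary tree.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed state at this site
    Returns (possible ancestral states, number of steps).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_site(tree[0], states)
    right_set, right_steps = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The two children can share a state: no change needed here.
        return common, left_steps + right_steps
    # No shared state: one change is required on this node's branches.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum the Fitch steps over all aligned sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical toy data, not taken from the ITS matrix of this study.
alignment = {"A": "AAGT", "B": "AAGC", "C": "CTGC", "D": "CTAT"}
unconstrained = (("A", "B"), ("C", "D"))   # shortest topology for these data
constrained = (("A", "C"), ("B", "D"))     # topology forced by a constraint

mp_length = tree_length(unconstrained, alignment)
constrained_length = tree_length(constrained, alignment)
extra_steps = constrained_length - mp_length
print(mp_length, constrained_length, extra_steps)  # 5 7 2
```

In a real analysis the constrained and unconstrained searches each explore many topologies; the sketch only shows how the step difference between two given trees is obtained.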
Constraining Diseae monophyly actually involves the rooting of the MP tree (Fig. 1) nearly on the Orchideae internode (cladogram not shown). This alternative rooting of the core orchidoids subtree would neither explain the position of Disperis nor the relationships between Habenariinae. Thus, rooting of the ITS tree with diurids and spiranthoids as outgroups may be inadequate because of the great divergence between these outgroups and the core orchidoids (Fig. 2). Diseae and Orchideae represent a derived and well-characterized clade defined by a long ancestral segment (Kores et al., 1997), precluding the use of closer outgroups. This problem will hold for phylogenetic studies of core orchidoids whatever the kind of characters used (morphological, anatomical, palynological, chromosomal, and molecular). In the present case, ITS characters appear suitable to reconstruct the phylogeny within the major core orchidoid lineages but reach their limits for distant comparisons in subfamily Orchidoideae.
Conversely, this molecular study supports paraphyly of Diseae (Figs. 1, 3; Table 2), therefore conflicting with all previous interpretations of floral characters. However, analysis of morphological/anatomical data revealed no apomorphic characters in the vegetative anatomy for Diseae (Kurzweil et al., 1995). Our ITS data could be interpreted to mean that a reflexed anther represents: (1) the ancestral state among core orchidoids and the erect anther a derived state or (2) a labile character state controlled by pollinator pressures that has arisen multiple times. It is more likely that this scenario is too simple. The column and lip structure in each of the seven clades (Disperis, Brownleeinae, Disinae, Corycium/Pterygodium, Satyriinae, Orchidinae, and Habenariinae) identified here as well as in the morphological cladistic studies as monophyletic are so different from each other that simply describing the column as either bent or erect is exceedingly unsatisfactory and almost certainly inaccurate. When taken in conjunction with the finding of Kurzweil et al. (1995) that there are no apomorphic vegetative characters for these plants, the best assessment is that there has never been an accurate evaluation of phylogeny in Orchideae/Diseae because there are as yet no synapomorphies identified that could apply to any pair of subtribes other than Disinae/Corycium/Pterygodium, which all share the saccate cavity/spur on their dorsal sepals. Perhaps Orchideae/Diseae are best considered a single tribe.
This overview of molecular phylogeny of Diseae emphasizes the need for additional taxonomic sampling. The genera Schizodium and Pachites will complete the generic representation of Disinae and Satyriinae. Inclusion of the genera Ceratandra and Evotella will help us to reconstruct the phylogeny of Coryciinae, which appears as the keystone subtribe in understanding the evolution of Diseae and Orchideae. Molecular studies should also include the rare Huttonaea, characterized by flowers with spathulate sepals and previously placed in Orchideae (Dressler, 1981). The monogeneric subtribe Huttonaeinae was recently recognized and included within Diseae (Linder and Kurzweil, 1994) and may represent an essential taxon to establish the exact relationship between Diseae and Orchideae (Dressler, 1993).
Further studies of core orchidoids using the ITS1 and ITS2 markers could explore their molecular evolution. Evaluation of ITS secondary structures as well may help to align divergent sequences and to characterize the dynamics of indel occurrence. Sequencing a phylogenetic marker distinct from noncoding nuclear DNA for the same Diseae, Orchideae, Diurideae, and spiranthoid taxa would also help to resolve remaining ambiguities. Promising alternative markers may be found in the plastid genome, for example the protein-coding matK, rps4, and rbcL genes, or the noncoding rps4/trnS, trnL-F and atpB/rbcL introns and spacers.
This study of ITS sequences confirms the findings of a large body of previous work on morphology of Diseae (cited in the text) and to a large degree identifies the same groups that were already clear from these studies. Support is high for most of the same groups supported by morphological apomorphies, and the points at which discrepancies exist could be the result of insufficient taxon sampling for ITS. Two major new and discordant patterns appear in the ITS trees: (1) the removal of Disperis from Coryciinae, and (2) the rooting of the basitonic orchids within Diseae rather than between Diseae and Orchideae, thus making Diseae paraphyletic. These two results are similar in that they are the weakest points in our understanding of these plants derived from previous works. The exact position of Disperis has been a problematic issue for morphology to address, and it clearly lacks the pattern of abundant synapomorphies that link Pterygodium and Corycium. The identification of an alternate position for Disperis is intriguing and will be followed up in future studies (some of which, at least the molecular ones, are already underway). The second pattern, the rooting, has never been subject to previous analysis with any form of data, and so it is perhaps not so surprising, especially if the bent anther of these groups, the sole potential synapomorphy of Diseae, is viewed against the diversity of column/lip structures observed. Although the ITS topology itself is not strongly supported along its spine, the alternative is much less parsimonious and also less likely from a ML perspective. The significance of this study then resides, first, in the confirmation by DNA sequence analysis of the results from meticulous and lengthy studies of morphology/anatomy and, second, in the discovery of new patterns that appear to be reasonable alternatives to those previously held and upon which future work can now be focused.
Table 1. List of species and genera for which the ITS region has been sequenced in this study. The taxonomic framework for suprageneric categories follows Schlechter (1901), Linder (1981a), Kurzweil, Linder, and Chesselet (1991), Dressler (1993), and Linder and Kurzweil (1994). Accession numbers in the EMBL/GenBank/DDBJ data banks are listed in the last column.
Fig. 1. Strict consensus of the two most-parsimonious trees reconstructed by PAUP 3.1.1. from the ITS1-5.8S rDNA-ITS2 matrix of 54 orchid taxa and 711 sites (566 were variable and 506 phylogenetically informative), with indels coded as a fifth character state. Tree statistics are: length = 3471, CI = 0.40, RI = 0.56, and CI excluding uninformative sites = 0.39. The spiranthoid and diurid taxa were specified as the outgroups. Numbers above branches are bootstrap percentages (BP) from 100 replicates (a dash indicates that there is no bootstrap support for the node). Numbers below branches are the corresponding Bremer support (BS) indices, showing the number of extra steps required to collapse the branch. Thick lines indicate high BP and BS support for the clades. Satyriinae denote the subtribe division of Linder and Kurzweil (1994) minus Pachites (note that Satyrium rhynchantum = Satyridium rostratum). Coryciinae s.s. (Kurzweil, Linder, and Chesselet, 1991) correspond to the subtribe division of Linder and Kurzweil (1994) without Disperis (tissue of Ceratandra and Evotella were not available). Habenariinae here include Brachycorythis and Holothrix. The Habenaria repens, H. sagittifera, and H. procera sequences were kindly provided by J. R. Hapeman.
Fig. 2. Plot of the number of observed differences (transitions, transversions, and deletions) against the number of inferred substitutions between each pair of taxa as derived by maximum parsimony for 1431 pairwise comparisons between the ITS1-5.8S rDNA-ITS2 sequences of 54 orchid taxa. The shortest tree on which the patristic distances were computed was 3471 steps long. Black triangles represent the comparisons between the six most distant taxa (Cranichis, Chloraea, Diuris, Elythranthera, and the two Disperis) and all the other Diseae and Orchideae taxa. Open circles represent all the comparisons among Disinae, Satyriinae, Coryciinae (except the two Disperis), Brownleeinae, Orchidinae, and Habenariinae taxa. The dashed line corresponds to an equal number of differences observed and substitutions inferred (i.e., no multiple changes). The high proportion of multiple changes is evident for comparisons involving diurid, cranichid, and Disperis relative to the others.
Fig. 3. Highest-likelihood tree found after evaluation of a total of 280 most parsimonious trees and trees one and two steps longer than MP trees as computed by the DNAML program (Ti/Tv ratio = 1.38). The spiranthoid and diurid taxa were used as outgroups. The log-likelihood of this tree is −14099.2, and its length is 2905 steps (two steps longer than the MP tree). The log-likelihood values of these MP trees, one-step, and two-step longer trees range from the best one to −14118.3. Internode segments that are not significantly different from 0 (at the P < 0.01 level) are marked by a star. The quartet puzzling algorithm (PUZZLE 2.5.1 program) yielded a less resolved tree, and the reliability percentages (RP) are indicated for the nodes in common with the DNAML tree. Thick black segments lead to various Diseae taxa and thin segments to other non-Diseae orchidoid taxa.
Fig. 4. Mapping of the morphological, anatomical, and palynological character states on the cladogram reconstructed from the molecular ITS-rDNA data. The topology shown is the strict consensus of the three most parsimonious cladograms derived from the analysis of the ITS-rDNA matrix with indels coded as missing data (length = 2903, CI = 0.37, RI = 0.53, CI excluding uninformative sites = 0.35). The relationships within each subtribe were not detailed, but the height of the triangles is proportional to the number of species sampled. Only exclusive synapomorphies are reported. The number of unambiguous molecular ITS-rDNA character-state changes are given within circles for each branch. Optimization of these changes on the three individual trees is the same as on the strict consensus tree for all illustrated branches.
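For readers who wish to reproduce the observed-differences axis of the plot of pairwise comparisons described above, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the procedure used in this study: the three toy sequences and taxon labels are hypothetical, only the symbols A, C, G, T and "-" are expected (any other symbol would be counted as a transversion), and the inferred substitutions (patristic distances on the MP tree) are not computed here. The sketch also checks that 54 taxa give 1431 pairwise comparisons.

```python
from itertools import combinations
from math import comb

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_differences(seq_a, seq_b):
    """Count transitions, transversions and deletions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transitions": 0, "transversions": 0, "deletions": 0}
    for x, y in zip(seq_a, seq_b):
        if x == y:
            continue  # identical symbols (including gap versus gap) are not differences
        if x == "-" or y == "-":
            counts["deletions"] += 1
        elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            counts["transitions"] += 1
        else:
            counts["transversions"] += 1
    return counts


def pairwise_observed(alignment):
    """Total observed differences for every unordered pair of taxa."""
    return {
        (a, b): sum(classify_differences(alignment[a], alignment[b]).values())
        for a, b in combinations(sorted(alignment), 2)
    }


# Number of unordered pairs among 54 taxa: n(n - 1)/2.
n_pairs = comb(54, 2)
print(n_pairs)  # 1431

# Hypothetical toy alignment, not taken from the ITS matrix.
toy = {"taxon1": "ACGT-A", "taxon2": "ATGTCA", "taxon3": "GCGA-A"}
observed = pairwise_observed(toy)
print(observed)
```

Plotting each observed count against the corresponding patristic distance would then show, as points falling below the diagonal, the pairs affected by multiple changes.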