Figure 1. Small subunit rRNA secondary structure model for the Toxarium undulatum GenBank accession number . Canonical base-pairs (G:C, A:U) are shown with tick marks, wobble (G:U) base-pairs are marked with small closed circles, A:G base-pairs are indicated with large open circles, and all other non-canonical base-pairs are shown with large closed circles.

Figure 2. Conservation secondary structure diagram for Bacillariophyta SSU rRNA, using the Toxarium undulatum SSU rRNA secondary structure model (Fig. 1) as the reference sequence. The conservation diagram summarizes the alignment of 181 diatom sequences. Symbols are present for positions that contain a nucleotide in at least 95% of the sequences in the alignment: red capital letters, the given nucleotide is conserved at 98-100% at the position; red lower-case letters, 90-98% conservation; black closed circles, 80-90%; black open circles, less than 80% conserved. Other positions (not containing a nucleotide in 95% of the sequences) are shown by arcs, which are labeled with the minimum and maximum numbers of nucleotides known to exist in the region. The blue tags indicate insertions relative to the reference sequence that are either 1-4 nt in length in at least 10% of the sequences or at least 5 nt in length in at least one sequence. The label format is (maximum length of insertion:percentage of sequences having any length insertion).
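
The symbol and tag rules in the Figure 2 legend can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the software used to draw the figure. The function names, the gap character, the plain-text symbol names, and the percentage formatting are hypothetical. Two further assumptions are made where the legend is silent: conservation is computed over the sequences that have a nucleotide at the position, and a value on a shared boundary (for example exactly 98% or 90%) is assigned to the higher class.

```python
from collections import Counter


def conservation_symbol(column, gap="-"):
    """Return the legend symbol for one alignment column.

    `column` holds one character per sequence; `gap` marks a missing
    nucleotide (assumed convention, not stated in the legend).
    """
    residues = [c.upper() for c in column if c != gap]
    occupancy = len(residues) / len(column)
    if occupancy < 0.95:
        # Position lacks a nucleotide in at least 95% of sequences:
        # drawn as part of an arc rather than as a symbol.
        return "arc"
    nucleotide, count = Counter(residues).most_common(1)[0]
    # Assumption: conservation is measured over sequences with a nucleotide.
    conservation = count / len(residues)
    if conservation >= 0.98:
        return nucleotide            # red capital letter
    if conservation >= 0.90:
        return nucleotide.lower()    # red lower-case letter
    if conservation >= 0.80:
        return "closed_circle"       # black closed circle
    return "open_circle"             # black open circle


def insertion_tag(lengths):
    """Return the blue-tag label for one insertion site, or None.

    `lengths` gives the insertion length in each sequence (0 = none).
    A tag is drawn if insertions of 1-4 nt occur in at least 10% of the
    sequences, or if any sequence has an insertion of at least 5 nt.
    """
    n = len(lengths)
    short_fraction = sum(1 for x in lengths if 1 <= x <= 4) / n
    has_long = any(x >= 5 for x in lengths)
    if short_fraction >= 0.10 or has_long:
        percent = 100 * sum(1 for x in lengths if x > 0) / n
        # Label format: (maximum length:percentage with any insertion).
        return f"({max(lengths)}:{percent:.0f}%)"
    return None
```

For example, a column in which every sequence carries an A maps to a capital `A`, while a column with a nucleotide in only 90% of the sequences falls into an arc regardless of how conserved the remaining nucleotides are.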

Figure 3. Phylogenetic tree from maximum likelihood analysis of the KEA alignment provided by W. Kooistra. Parameter values of the GTR+Γ+I model were fixed to those used by KEA. The tree search used 100 random addition sequence replicates and TBR branch swapping. Each diatom taxon name is followed by at least one generalized line drawing, based on figures and generic descriptions from Round et al. (1990). For taxa with multiple line drawings, the drawing immediately following the scientific name represents the most common outline for that genus, based on Round et al. (1990).

Figure 4. One of two phylogenetic trees from maximum likelihood analysis of the KEA alignment provided by W. Kooistra. Parameter values of the GTR+Γ+I model were fixed to those set by KEA, except that empirical base frequencies were used. The tree search used "as-is" addition of taxa and TBR branch swapping. Each diatom taxon name is followed by at least one generalized line drawing based on figures and generic descriptions from Round et al. (1990). For taxa with multiple line drawings, the drawing immediately following the scientific name represents the most common outline for that genus, based on Round et al. (1990).

Figure 5. Consensus tree from Bayesian analysis of structurally aligned SSU rDNA sequences for 181 diatoms and eight outgroup taxa. A 50% majority-rule consensus tree was calculated from the pooled posterior distributions of two independent MCMCMC runs. Bayesian posterior probability values greater than 0.5 are shown below nodes. Terminal taxa are identified by GenBank accession number followed by scientific name. For simplicity, several clades were collapsed to triangles, with the number of taxa per clade noted to the right. Two clades ("A" and "B") were highlighted to facilitate discussion in the text.

Figure 6. Strict consensus of 106 most parsimonious trees based on 752 parsimony-informative characters; tree length = 7151; consistency index (excluding uninformative characters) = 0.2646; retention index = 0.7040; rescaled consistency index = 0.1863. Nonparametric bootstrap values are shown below nodes. For simplicity, several clades were collapsed to triangles, with the number of taxa per clade noted to the right.