Last modified on 09 August 2004.

Literature Reference:

Doshi K.J., Cannone J.J., Cobaugh C.W., and Gutell R.R. (2004).
Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction.
BMC Bioinformatics, 5:105.

Manuscript/Supplemental Data References (PDF)


Manuscript Figures and Tables:

Figure 1. Direct Comparison of Mfold 2.3 and Mfold 3.1 Folding Accuracy PDF
A. PDF B.1 PDF B.2 PDF C. PDF D.1 PDF D.2 PDF
Figure 2. Accuracy of Comparatively Predicted Base-pairs from 496 16S rRNA Sequences and RNA Contact Distance PDF
A. PDF B. PDF
Figure 3. ΔΔG vs. Structural Variation for Pairwise Comparisons from the "Suboptimal Population" PDF
A. PDF B. PDF
Figure 4. Frequency of Base-pair Predictions within the "Suboptimal Population" for Selected 16S rRNAs PDF
A. PDF B. PDF
Table 1. Distribution of Comparative Secondary Structure Models Analyzed for this Study TABLE
Table 2. Average Accuracy of the Optimal RNA Structure Predicted with Mfold 3.1 TABLE
Table 3. Average Accuracy of the Optimal RNA Structure Predicted with Mfold 3.1 Grouped by Phylogeny TABLE
Table 4. Accuracy of Specific 16S and 23S rRNA Sequences using Mfold 2.3 and Mfold 3.1 TABLE
Table 5. Accuracy of Base-pairs Predicted with Mfold 3.1 as a Function of RNA Contact Distance TABLE
Table 6. Average, Minimum, and Maximum ΔΔG Values for Pairwise Comparisons of Different Suboptimal Folds TABLE
Table 7. Distribution of 16S rRNA Base-pairs Predicted Correctly and Incorrectly TABLE
Table 8. Occurrence of Comparative Base-pairs Predicted with Mfold 3.1 TABLE

Supplemental Information:

Dataset Characterization (Data Tables)
Basic Sequence Statistics
The summary page contains individual statistics for each of the 1,411 sequences studied, organized first by molecule type (tRNA, 5S rRNA, 16S rRNA, 23S rRNA), then by phylogenetic relationships. This page also contains links to detailed webpages for each molecule type (e.g. 16S rRNA) that provide the following information about each sequence analyzed:
  • Number of A,G,C,U nucleotides
  • Sequence length
  • %GC nucleotides
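The three per-sequence statistics listed above can be computed directly from a sequence string. The following is a minimal sketch, not the procedure used to build the data tables; the function name `basic_sequence_stats` and the example sequence are hypothetical, and ambiguous nucleotide codes are simply counted toward the length.

```python
from collections import Counter


def basic_sequence_stats(seq: str) -> dict:
    """Return A/G/C/U counts, length, and %GC for one sequence."""
    # Normalize case and treat DNA-style T as U (an assumption of this sketch).
    rna = seq.upper().replace("T", "U")
    counts = Counter(rna)
    length = len(rna)
    gc = counts["G"] + counts["C"]
    return {
        "A": counts["A"],
        "G": counts["G"],
        "C": counts["C"],
        "U": counts["U"],
        "length": length,
        # Guard against an empty sequence.
        "percent_gc": 100.0 * gc / length if length else 0.0,
    }


# Hypothetical 10-nucleotide example sequence.
stats = basic_sequence_stats("GGCUAAGCUU")
print(stats)
```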
Sequence Identity Analysis (Data Tables)
Sequence Similarities for 16S and 23S rRNA Sequences
Sequence identity for all pairwise comparisons of 16S and 23S rRNA sequences within each phylogenetic group.
Accuracy of Free Energy Minimization (Data Tables)
Prediction Accuracy for Complete Dataset
With efn2
Without efn2
The summary page contains the averages and statistics for the prediction accuracies for tRNA, 5S, 16S and 23S rRNAs, and the primary phylogenetic groups. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide the following information about each sequence analyzed:
  • Number of canonical, comparative base-pairs
  • Number and percentage of base-pairs predicted correctly
  • Comparatively predicted secondary structures available as a textual listing of the predicted base-pairings and a diagram (in PDF format) with correctly predicted base-pairs indicated in red.
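The "number and percentage of base-pairs predicted correctly" can be illustrated with a simple set comparison. This sketch assumes that a predicted base-pair counts as correct only when it matches a comparative base-pair exactly, and that the percentage is taken over the comparative base-pairs; the study's exact scoring rules are not reproduced here, and the function name and example pairs are hypothetical.

```python
def prediction_accuracy(comparative_pairs, predicted_pairs):
    """Return (correct, total comparative, percent correct)."""
    # Store each pair as (5' position, 3' position) so that order does not matter.
    comparative = {tuple(sorted(p)) for p in comparative_pairs}
    predicted = {tuple(sorted(p)) for p in predicted_pairs}
    correct = comparative & predicted
    total = len(comparative)
    percent = 100.0 * len(correct) / total if total else 0.0
    return len(correct), total, percent


# Hypothetical base-pair lists (1-based nucleotide positions).
comparative = [(1, 20), (2, 19), (3, 18), (30, 40)]
predicted = [(1, 20), (2, 19), (4, 17), (40, 30)]
result = prediction_accuracy(comparative, predicted)
print(result)
```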
Uniquely Predicted Base-pairs for Different Size Populations of Suboptimal Structure Predictions
Without efn2
Multiple folding analyses for the archaeal 16S rRNA sequences of H. volcanii and M. hungatei, with 750 and 1,000 total structure predictions, respectively.
  • The numbers of unique correctly and incorrectly predicted base-pairs are determined for both sets of analyses.
Counts of Suboptimal Predictions More or Less Accurate than the Optimal Structure Prediction for Different 16S rRNA Sequences
With efn2
Without efn2
The summary page contains counts for the major phylogenetic groups and a link to a webpage that provides counts for each 16S rRNA sequence.
  • For each sequence, the number of suboptimal structure predictions that are more accurate than, less accurate than, or as accurate as the optimal structure prediction is determined.
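The per-sequence counts described above amount to comparing each suboptimal accuracy against the optimal one. A minimal sketch follows; the function name, the tolerance used to decide "as accurate", and the accuracy values are all invented for illustration and are not taken from the data tables.

```python
def compare_suboptimals(optimal_accuracy, suboptimal_accuracies, tol=1e-9):
    """Count suboptimal predictions that are more, less, or as accurate as the optimal."""
    counts = {"more": 0, "less": 0, "equal": 0}
    for accuracy in suboptimal_accuracies:
        if abs(accuracy - optimal_accuracy) <= tol:
            counts["equal"] += 1
        elif accuracy > optimal_accuracy:
            counts["more"] += 1
        else:
            counts["less"] += 1
    return counts


# Invented accuracies (percent) for one sequence: optimal first, then suboptimals.
counts = compare_suboptimals(41.0, [45.5, 41.0, 38.2, 52.3, 40.9])
print(counts)
```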
Accuracy of Free Energy Minimization (Figures)
Prediction Accuracy Histogram Plots
With efn2 (PDF)
Without efn2 (PDF)
Bar chart depicting the average prediction accuracy for tRNA, 5S, 16S and 23S rRNA, grouped by Phylogeny.
Prediction Accuracy Min, Max, Average Plots
With efn2 (PDF)
Without efn2 (PDF)
"Stock chart" used to depict the average prediction accuracy for different molecule types, with the most accurate and least accurate prediction for each molecule type indicated. Results from previous studies using Mfold 2.3 are also indicated.
Plot of Cumulative Correctly and Incorrectly Predicted Unique Base-pairs for H. volcanii
With efn2 (PDF)
Without efn2 (PDF)
Running sum of the number of unique correct and incorrect base-pairs from the set of 750 structure predictions.
  • This analysis technique is similar to the "Any Suboptimal" analysis in Mathews et al. (ref #29, Table 1).
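The running sum described above can be sketched as follows: walk through the structure predictions in order and, each time a base-pair is seen for the first time, add it to either the correct or the incorrect tally. The ordering of the predictions, the function name, and the example data are assumptions of this sketch rather than details taken from the study.

```python
def cumulative_unique_pairs(comparative_pairs, predictions):
    """Return a running (unique correct, unique incorrect) total per prediction."""
    comparative = set(comparative_pairs)
    seen = set()
    correct = 0
    incorrect = 0
    totals = []
    for structure in predictions:
        for pair in structure:
            if pair in seen:
                continue  # only the first occurrence of a base-pair is counted
            seen.add(pair)
            if pair in comparative:
                correct += 1
            else:
                incorrect += 1
        totals.append((correct, incorrect))
    return totals


# Hypothetical comparative model and three predicted structures.
comparative = {(1, 20), (2, 19), (3, 18)}
predictions = [
    [(1, 20), (5, 15)],
    [(1, 20), (2, 19)],
    [(3, 18), (5, 15), (6, 14)],
]
totals = cumulative_unique_pairs(comparative, predictions)
print(totals)
```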
Plot of Individual Suboptimal Accuracy for:
H. volcanii 16S rRNA
With efn2 (PDF)
Without efn2 (PDF)
M. hungatei 16S rRNA
With efn2 (PDF)
Without efn2 (PDF)
The accuracy for each suboptimal structure prediction is plotted.
Accuracy of Free Energy Minimization as a Function of RNA Contact Distance (Data Tables)
Prediction Accuracy for Base-Pairs with Different RNA Contact Distances
All Distances
With efn2
Without efn2
Short-range
With efn2
Without efn2
The summary page contains statistics (such as counts and accuracy) for base-pairs of different RNA contact distances for 16S and 23S rRNA, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide the following information about each sequence:
  • Number of comparative, canonical base-pairs in different RNA contact distance categories.
  • Number of correctly predicted base-pairs in different RNA contact distance categories.
Prediction Accuracy for 191,994 16S rRNA Comparative Base-pairs Grouped by RNA Contact Distance
With efn2
Without efn2
All comparatively predicted base-pairs from 496 16S rRNA sequences are binned by RNA contact distance.
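Binning base-pairs by RNA contact distance can be sketched as below. This sketch assumes that the contact distance of a base-pair is the sequence separation between its two positions, and the bin edges (100 and 500) and bin labels are hypothetical placeholders, not the categories used in the data tables.

```python
from collections import defaultdict


def contact_distance(i, j):
    """Sequence separation between the two paired positions (assumed definition)."""
    return abs(j - i)


def bin_label(distance, edges=(100, 500)):
    """Assign a distance to a bin; the edges are hypothetical."""
    short_max, mid_max = edges
    if distance <= short_max:
        return "short"
    if distance <= mid_max:
        return "mid"
    return "long"


def accuracy_by_bin(comparative_pairs, predicted_pairs):
    """Return {bin: (correct, total, percent correct)} over the comparative base-pairs."""
    predicted = set(predicted_pairs)
    tally = defaultdict(lambda: [0, 0])  # bin -> [correct, total]
    for i, j in comparative_pairs:
        label = bin_label(contact_distance(i, j))
        tally[label][1] += 1
        if (i, j) in predicted:
            tally[label][0] += 1
    return {label: (c, t, 100.0 * c / t) for label, (c, t) in tally.items()}


# Hypothetical base-pairs (1-based positions, 5' position first).
comparative = [(10, 20), (30, 60), (100, 400), (5, 900)]
predicted = [(10, 20), (100, 400)]
by_bin = accuracy_by_bin(comparative, predicted)
print(by_bin)
```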
Accuracy of Free Energy Minimization as a Function of RNA Contact Distance (Figures)
Prediction Accuracy as a Function of Contact Distance Histograms
All Distances
With efn2 (PDF)
Without efn2 (PDF)
Short-range
With efn2 (PDF)
Without efn2 (PDF)
Bar chart depicting the average prediction accuracy of base-pairs with different RNA contact distances for 16S and 23S rRNA.
Prediction Accuracy as a Function of Contact Distance and Phylogeny Histograms
16S rRNA - All Distances
With efn2 (PDF)
Without efn2 (PDF)
16S rRNA - Short-distances
With efn2 (PDF)
Without efn2 (PDF)
23S rRNA - All Distances
With efn2 (PDF)
Without efn2 (PDF)
23S rRNA - Short-distances
With efn2 (PDF)
Without efn2 (PDF)
Bar charts depicting the average prediction accuracy of base-pairs of different RNA contact distances for different phylogenetic categories for 16S and 23S rRNA.
Scatter Plot of Prediction Accuracy of Comparative Base-pairs vs. RNA Contact Distance
With efn2 (PDF)
Without efn2 (PDF)
A plot of the contact distance for 191,994 base-pairs predicted with comparative analysis from 496 16S rRNA sequences vs. the accuracy of the Mfold prediction.
Log-Log Scatter Plot of the Distribution of Comparative Base-pairs as a Function of RNA Contact Distance for 16S rRNA
With efn2 (PDF)
Without efn2 (PDF)
A log-scale plot of the contact distance distribution of 191,994 base-pairs predicted with comparative analysis from 496 16S rRNA sequences.
Accuracy and Base-Pair Environment (Data Tables)
Prediction Accuracy for Complete Dataset as a Function of Base-Pair Environment
Without efn2
The summary page contains statistics for base-pairs closing hairpin, internal and multistem loops for the 16S and 23S rRNA datasets, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide the following information about each sequence:
  • Number of comparative base-pairs closing each loop type.
  • Number of predicted base-pairs closing each loop type.
  • Number of comparative base-pairs closing each loop type predicted correctly.
  • Number of comparative base-pairs predicted correctly where the predicted base-pair also closes the correct loop type.
Accuracy and Base-Pair Environment (Figures)
Base-Pair Loop Environment Definitions (PDF)
Using a 16S rRNA structure diagram as a template, each base-pair is marked as belonging to one of three loop environments (hairpin, internal or multistem).
Accuracy and Base-Pair Type (Data Tables)
Base-Pair Prediction Accuracy as a Function of Base-Pair Type
Without efn2
The summary page contains statistics for base-pairs of different types (GC/CG, AU/UA or GU/UG) for 16S and 23S rRNA, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide the following information about each sequence:
  • Number of comparative base-pairs of each type.
  • Number of predicted base-pairs of each type.
  • Number of correctly predicted base-pairs of each type.
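Counting base-pairs by type (GC/CG, AU/UA or GU/UG) only requires looking up the two paired nucleotides. The following is a minimal sketch that assumes an upper-case RNA sequence and 1-based positions; the function names and the example sequence and pairs are hypothetical.

```python
from collections import Counter

# Each type covers both orientations of the pair, hence the unordered frozenset key.
PAIR_TYPES = {
    frozenset("GC"): "GC/CG",
    frozenset("AU"): "AU/UA",
    frozenset("GU"): "GU/UG",
}


def pair_type(seq, i, j):
    """Classify the base-pair between 1-based positions i and j."""
    key = frozenset((seq[i - 1], seq[j - 1]))
    return PAIR_TYPES.get(key, "non-canonical")


def count_pair_types(seq, pairs):
    """Tally base-pairs by type."""
    return Counter(pair_type(seq, i, j) for i, j in pairs)


# Hypothetical 12-nucleotide sequence and base-pair list.
seq = "GAUGAAAAUGUC"
pairs = [(1, 12), (2, 11), (3, 10), (4, 9), (5, 8)]
type_counts = count_pair_types(seq, pairs)
print(type_counts)
```

Running the same tally over the comparative pairs, the predicted pairs, and their intersection would give the three counts listed above.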
Sequence and/or Structure Biases in the Comparatively Predicted Structure Models (Data Tables)
% of Non-Canonical (NC) Base-Pairs in Comparative Structure Models
Without efn2
The summary page contains average counts of non-canonical base-pairs for tRNA, 5S, 16S and 23S rRNA, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide counts of non-canonical base-pairs for each sequence analyzed.
% of Extra-Stable Tetraloops (ESTL) in Comparative Structure Models
Without efn2
The summary page contains average counts and prediction accuracies for extra-stable tetraloops (GVAA, GHGA, UWCG) for 16S and 23S rRNA, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide the following information about each sequence:
  • Count of comparative extra-stable tetraloops
  • Count of predicted extra-stable tetraloops
  • Count of extra-stable tetraloops predicted correctly
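The tetraloop motifs GVAA, GHGA and UWCG use IUPAC ambiguity codes (V = A, C or G; H = A, C or U; W = A or U). The sketch below expands those codes into regular expressions and classifies a four-nucleotide hairpin loop sequence; extracting the loops from a structure is not shown, and the function names are hypothetical.

```python
import re

# IUPAC ambiguity codes appearing in the three motifs, written for RNA.
IUPAC = {"V": "[ACG]", "H": "[ACU]", "W": "[AU]"}
MOTIFS = ("GVAA", "GHGA", "UWCG")


def motif_regex(motif):
    """Expand a motif with IUPAC codes into a compiled regular expression."""
    return re.compile("".join(IUPAC.get(ch, ch) for ch in motif))


PATTERNS = {motif: motif_regex(motif) for motif in MOTIFS}


def classify_tetraloop(loop):
    """Return the matching motif name for a 4-nt loop sequence, or None."""
    loop = loop.upper().replace("T", "U")
    if len(loop) != 4:
        return None
    for motif, pattern in PATTERNS.items():
        if pattern.fullmatch(loop):
            return motif
    return None


for loop in ("GAAA", "GUGA", "UUCG", "GUAA"):
    print(loop, classify_tetraloop(loop))
```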
Paired to Unpaired Nucleotide Ratios for Comparative Structure Models
Without efn2
The summary page contains average counts of paired and unpaired nucleotides in each comparative model for tRNA, 5S, 16S and 23S rRNA, organized by Phylogeny. This page also contains links to detailed webpages for each molecule class (e.g. 16S rRNA) that provide counts of paired and unpaired nucleotides for each sequence analyzed.