Last modified on 26 January 2016.
Literature Reference:
Doshi K.J., et al.
Unpublished (2007).
Publicly-Available Materials:
Documentation Excerpt:
The Comparative Analysis Toolkit (CAT) provides an engine for:
- Automatically aligning RNA sequences.
- Evaluating the quality of an RNA sequence alignment.
- Creating subalignments (using specific selection criteria).
- Sorting and annotating RNA sequence alignments via phylogenetic relationships.
- Creating secondary structure diagrams.
- Calculating base-pair frequencies and nucleotide frequencies for sequences in an RNA sequence alignment.
- Calculating consensus sequences for RNA sequence alignments.
- Calculating identity between aligned sequences in an RNA sequence alignment.
- Searching an RNA sequence alignment using FASTA.
- Searching an RNA sequence alignment using identity to an aligned sequence.
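Several of the calculations listed above (nucleotide frequencies, consensus, and identity between aligned sequences) can be illustrated with a minimal Python sketch. This is not CAT's implementation: the function names, the `-` gap character, and the toy alignment are assumptions made for illustration, and CAT's own definitions (for example, how gaps or ties are treated) may differ.

```python
from collections import Counter

GAP = "-"  # assumed gap character; CAT's own alignment formats may differ


def column_frequencies(alignment):
    """Return one Counter of nucleotide counts per alignment column (gaps skipped)."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    return [
        Counter(row[i] for row in alignment if row[i] != GAP)
        for i in range(length)
    ]


def consensus(alignment):
    """Most common nucleotide per column; a gap when the column holds only gaps."""
    result = []
    for counts in column_frequencies(alignment):
        result.append(counts.most_common(1)[0][0] if counts else GAP)
    return "".join(result)


def identity(seq_a, seq_b):
    """Fraction of matching positions among columns where both rows have a nucleotide."""
    overlap = [(a, b) for a, b in zip(seq_a, seq_b) if a != GAP and b != GAP]
    if not overlap:
        return 0.0
    return sum(a == b for a, b in overlap) / len(overlap)


# Hypothetical three-row alignment, used only to exercise the functions.
toy = ["GGAUC-A", "GGAUCCA", "GCAU-CA"]
print(consensus(toy))
print(identity(toy[0], toy[2]))
```

Here identity is computed only over the columns where both sequences have a nucleotide, which is one common convention for the overlap; other definitions divide by the full alignment length instead.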
Commands
Section | Command | Short Description |
---|---|---|
Application Command | cat | Command-line options when launching the CAT application. |
Basic Commands | config | Load or change CAT configuration options. |
Basic Commands | batch | Execute simple script files containing sets of CAT commands. |
Basic Commands | exit | Exit CAT. |
Basic Commands | history | View a history of commands that have been executed by a user within a given CAT session. |
Basic Commands | cmdTimer | Toggle execution timing for individual commands. |
Basic Commands | alias | View aliases currently registered with a given instance of the CAT application. |
Alignment Manipulation Commands | renameRows | Rename sequences in an alignment using the format: NCBITaxID.CellLocation.Genus.Species.Ordinal. |
Alignment Manipulation Commands | clearBuffer | Clear temporary alignment results from the autoalign command out of memory. |
Alignment Manipulation Commands | changeCurAln | Change the "current" alignment. The "current" alignment is the default alignment on which many other commands operate. After CAT is first launched, the first alignment loaded is set as current by default (see the loadAlignment command). |
Alignment Manipulation Commands | swapRows | Swap a temporary result for a given row in the "current" alignment (e.g., an autoalign command result) with the actual contents of the same row in the "current" alignment. Note: This command is not fully tested and should not be relied upon. |
Alignment Manipulation Commands | loadAlignment | Load an alignment into memory. |
Alignment Manipulation Commands | closeAlignments | Close alignment(s) already loaded in-memory. |
Alignment Manipulation Commands | saveAlignment | Save an entire alignment already loaded in-memory to a specified file, or create a subalignment from an alignment already loaded in-memory. |
Alignment Manipulation Commands | listAlignments | List the names and sizes of all alignments loaded in-memory. |
Alignment Manipulation Commands | listSequences | List the names and sizes of all sequences in the current alignment. |
Alignment Manipulation Commands | viewAlignment | View an entire alignment or selected rows from an alignment at the command-line. Note: This command works best when the CAT application is launched in a command terminal without automatic line wrapping. |
Alignment Search and Selection Commands | selectRowsWithLim | |
Alignment Search and Selection Commands | searchAlignments | Perform a FASTA search over multiple alignments loaded in memory with a given sequence. Searches can encompass the entire sequence, or fragments defined by specific selection criteria. |
Alignment Generation Commands | autoalign | Align multiple sequences using an already aligned sequence as a template. |
Alignment Generation Commands | fullAlignment | Have CAT automatically select a set of potential aligned sequences (using FASTA) as templates and then autoalign a specified sequence against each potential template. |
Alignment Generation Commands | findQueries | Have CAT select a set of unaligned sequences (using FASTA) and then autoalign each unaligned sequence against a specified template sequence. |
Analysis Commands | evaluate | Check the accuracy of the alignment of a given sequence or set of sequences using sequence-based and structure-based criteria. Percent complete and length can be calculated for the sequence(s) evaluated, and the CRWDB updated if desired. |
Analysis Commands | evalRunner | A wrapper around the evaluate command. Users should prefer the evalRunner command unless they need specific features of the evaluate command that are not available in the evalRunner command. |
Analysis Commands | consensus | Calculate the consensus for either all or a selected subset of sequences from the "current" alignment. The consensus can be calculated across either all or a specified subset of columns from the "current" alignment. |
Analysis Commands | identity | Calculate the pairwise identity and overlap between sequences in a given alignment that are already aligned. This command can also search an alignment for sequences that have a given identity and/or overlap to a specified reference sequence. |
Analysis Commands | calcBPFreqs | Compute the base-pair frequency data in an alignment, across a given row set, given a reference set of secondary structure base-pairings. |
Analysis Commands | calcNTFreqs | Compute the nucleotide frequencies for all columns in an alignment, across a given row set. |
Analysis Commands | calcInDels | Count insertion/deletion events for a specified sequence with respect to a given reference sequence. |
Analysis Commands | seqComps | Compute the nucleotide compositions of sequences in an alignment. |
Analysis Commands | checkAln | Check the accuracy of the alignment of a sequence computed using the autoalign command, assuming the given sequence is already correctly aligned in the alignment. Note: This command is useful for regression testing new parameters for the autoalign command. |
Structure Manipulation/Generation Commands | loadStructData | Map secondary structure pairings for a specific row in an alignment. |
Structure Manipulation/Generation Commands | templateDiagram | Create a new secondary structure diagram (XRNA format) using an existing diagram as a template. |
Structure Manipulation/Generation Commands | projectPairings | Create new secondary structure pairing sets by projecting an existing set of secondary structure base pairs across a set of sequences. The pairing sets are output in BPSEQ, CT, RNAml, Bracket and Alden formats. |
GenBank Commands | checkAccGB | Check whether a sequence in an alignment also exists in GenBank. The check is performed using the NCBI Accession Number stored in the CRWDB for the sequence. If the sequence exists in GenBank, the command can update the Available field in the CRWDB. Note: This command uses NCBI eUtils and Web Services to query GenBank remotely, in contrast to the rest of the commands in this section, which require a local copy of GenBank. |
GenBank Commands | checkGBHits | Crosscheck against the CRWDB a set of sequences identified by searching a copy of GenBank that is co-located on the same server from which CAT is invoked. |
GenBank Commands | getGBEntries | Retrieve GenBank entries by locus from a copy of GenBank that is co-located on the same server from which CAT is invoked. |
GenBank Commands | manageGenbank | Manipulate a copy of GenBank that is co-located on the same server from which CAT is invoked. |
GenBank Commands | searchGenbank | Search, using FASTA, a copy of GenBank that is co-located on the same server from which CAT is invoked. |
GenBank Commands | alnGBSearch | Create a new AE2-formatted alignment from a set of GenBank entries obtained with the searchGenbank command. |
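
The idea behind the base-pair frequency and pairing-set commands in the table can be sketched in Python as follows. This is a rough illustration under stated assumptions, not CAT's code or output format: the function names, the 0-based column indexing, the `-` gap character, and the reading of "Bracket" as conventional dot-bracket notation are all assumptions, and the toy alignment is invented.

```python
from collections import Counter


def base_pair_frequencies(alignment, pairings):
    """For each (i, j) column pair, return the relative frequency of each
    nucleotide pair seen across rows.

    `pairings` holds 0-based alignment column indices taken from a reference
    secondary structure; rows with a gap at either column are skipped.
    """
    freqs = {}
    for i, j in pairings:
        counts = Counter(
            row[i] + row[j]
            for row in alignment
            if row[i] != "-" and row[j] != "-"
        )
        total = sum(counts.values())
        freqs[(i, j)] = (
            {pair: n / total for pair, n in counts.items()} if total else {}
        )
    return freqs


def to_bracket(length, pairings):
    """Render a set of nested pairings as a dot-bracket string."""
    chars = ["."] * length
    for i, j in pairings:
        chars[min(i, j)] = "("
        chars[max(i, j)] = ")"
    return "".join(chars)


# Hypothetical alignment and reference pairings, for illustration only.
toy = ["GGAAACC", "GCAAAGC", "GG-AACU"]
pairs = [(0, 6), (1, 5)]
freqs = base_pair_frequencies(toy, pairs)
print(freqs[(0, 6)])
print(to_bracket(7, pairs))
```

Keying the result by column pair keeps the frequencies tied to the reference structure, so the same pairing list can be reused across different row subsets of an alignment.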