Represents a regular expression parse rule plus some additional parameters.
#include <ms_taxonomyrules.hpp>
Public Types | |
typedef unsigned int | TAX_CHOP_SRC |
Data type used for the parameter specifying how to chop a source line. This will be zero or more of the ms_parserule_plus::TAX_CHOP_TYPES values OR'ed together. | |
enum | TAX_CHOP_TYPES { TAX_CHOP_PREFIX = 0x0001 , TAX_CHOP_SUFFIX = 0x0002 , TAX_CHOP_WORDS = 0x0004 } |
Constants used for combining TAX_CHOP_SRC values. | |
Public Member Functions | |
ms_parserule_plus () | |
Default constructor. | |
ms_parserule_plus (const ms_parserule_plus &src) | |
Copy constructor. | |
~ms_parserule_plus () | |
Destructor. | |
void | copyFrom (const ms_parserule_plus *right) |
Can be used to create a copy of another instance. | |
void | defaultValues () |
Initialises the instance. | |
TAX_CHOP_SRC | getChopSource () const |
Returns additional parameter specifying how to chop a source line. | |
TAX_SPECIES_FORMAT | getFileTypeToSearch () const |
Returns the file format. | |
std::string | getNameOfDB () const |
Returns the database name. | |
const ms_parserule * | getRule () const |
Returns the regular expression-based parse rule. | |
ms_parserule_plus & | operator= (const ms_parserule_plus &right) |
Assignment operator for C++ client applications. | |
void | setChopSource (const TAX_CHOP_SRC value) |
Change the parameter specifying how to chop a source line. | |
void | setFileTypeToSearch (const TAX_SPECIES_FORMAT value) |
Change the file format. | |
void | setNameOfDB (const char *name) |
Change the database name. | |
void | setRule (const ms_parserule *src) |
Set a new parse rule. | |
Represents a regular expression parse rule plus some additional parameters.
enum TAX_CHOP_TYPES |
Constants used for combining TAX_CHOP_SRC values.
See Using enumerated values and static const ints in Perl, Java, Python and C#.
Enumerator | |
---|---|
TAX_CHOP_PREFIX | Remove all words at the start of the text specified in the PrefixRemoves section. See ms_taxonomyrules::getPrefixRemove(). |
TAX_CHOP_SUFFIX | Remove all words at the end of the text specified in the SuffixRemoves section. See ms_taxonomyrules::getSuffixRemove(). |
TAX_CHOP_WORDS | Remove one word at a time from the end of the text and try to get a taxonomy match again. |
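The enumerators are distinct bits, so they can be combined with bitwise OR and tested with bitwise AND. The sketch below illustrates this with stand-in definitions that mirror the values documented above; it does not include ms_taxonomyrules.hpp, and the helper `hasChop` is hypothetical, not part of the library.

```cpp
#include <cassert>

// Stand-in definitions copied from the values documented above.
// Real code would use ms_parserule_plus::TAX_CHOP_TYPES from the header.
typedef unsigned int TAX_CHOP_SRC;
enum TAX_CHOP_TYPES {
    TAX_CHOP_PREFIX = 0x0001,
    TAX_CHOP_SUFFIX = 0x0002,
    TAX_CHOP_WORDS  = 0x0004
};

// Hypothetical helper: true if 'flag' is set in 'value'.
inline bool hasChop(TAX_CHOP_SRC value, TAX_CHOP_TYPES flag) {
    return (value & flag) != 0;
}

// Usage: combine prefix removal with word-by-word chopping.
inline void example() {
    TAX_CHOP_SRC value = TAX_CHOP_PREFIX | TAX_CHOP_WORDS;
    assert(hasChop(value, TAX_CHOP_PREFIX));
    assert(hasChop(value, TAX_CHOP_WORDS));
    assert(!hasChop(value, TAX_CHOP_SUFFIX));
}
```

A value of zero means that no chopping is requested.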
ms_parserule_plus::TAX_CHOP_SRC getChopSource | ( | ) | const |
Returns additional parameter specifying how to chop a source line.
Parsing taxonomy information from a reference file or FASTA file cannot always be performed reliably using simple regular expressions. The 'CHOP' rules allow prefixes and suffixes to be removed and 'words' to be extracted. For example:
DefaultRule NCBI, CHOP:WP "\(.*\)" # Everything!
PrefixRemoves C;Species: and mitochondrion chloroplast plastid endogenous
This specifies that any of the listed prefixes are removed and, if that fails, that one word at a time is removed from the end of the text before trying again to get a match.
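The behaviour described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: the function `chopAndMatch`, the whitespace-based word splitting, and the set of known taxonomy names are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split text into whitespace-separated words.
static std::vector<std::string> splitWords(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) words.push_back(w);
    return words;
}

// Join words[begin, end) with single spaces.
static std::string joinWords(const std::vector<std::string> &words,
                             std::size_t begin, std::size_t end) {
    std::string out;
    for (std::size_t i = begin; i < end; ++i) {
        if (!out.empty()) out += ' ';
        out += words[i];
    }
    return out;
}

// Hypothetical helper: skip listed prefix words at the start, then drop one
// word at a time from the end until the remaining text is a known name.
// Returns the matched name, or an empty string if nothing matches.
std::string chopAndMatch(const std::string &text,
                         const std::set<std::string> &prefixes,
                         const std::set<std::string> &knownNames) {
    std::vector<std::string> words = splitWords(text);
    std::size_t begin = 0;
    while (begin < words.size() && prefixes.count(words[begin]) != 0) ++begin;
    for (std::size_t end = words.size(); end > begin; --end) {
        std::string candidate = joinWords(words, begin, end);
        if (knownNames.count(candidate) != 0) return candidate;
    }
    return std::string();
}
```

Given the prefix word "mitochondrion" and a known name "Homo sapiens", the text "mitochondrion Homo sapiens isolate 12" would first lose its prefix and then lose "12" and "isolate" before matching.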
void setChopSource | ( | const TAX_CHOP_SRC | value | ) |
Change the parameter specifying how to chop a source line.
Parsing taxonomy information from a reference file or FASTA file cannot always be performed reliably using simple regular expressions. The 'CHOP' rules allow prefixes and suffixes to be removed and 'words' to be extracted. For example:
DefaultRule NCBI, CHOP:WP "\(.*\)" # Everything!
PrefixRemoves C;Species: and mitochondrion chloroplast plastid endogenous
This specifies that any of the listed prefixes are removed and, if that fails, that one word at a time is removed from the end of the text before trying again to get a match.
value | Should be zero or more of the ms_parserule_plus::TAX_CHOP_TYPES values OR'ed together. |
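As an illustration of building such a value, the sketch below turns the letters of a rule such as CHOP:WP into OR'ed flags. The letter mapping is an assumption inferred from the example above, where WP is described as prefix removal plus word removal; treating S as the suffix flag is a guess. The helper `chopFlagsFromLetters` is hypothetical, and the enum is a stand-in that mirrors the documented values rather than the real header.

```cpp
#include <cassert>
#include <string>

// Stand-in definitions copied from the documented values.
typedef unsigned int TAX_CHOP_SRC;
enum TAX_CHOP_TYPES {
    TAX_CHOP_PREFIX = 0x0001,
    TAX_CHOP_SUFFIX = 0x0002,
    TAX_CHOP_WORDS  = 0x0004
};

// Hypothetical helper: map rule letters to OR'ed chop flags.
// Assumed mapping: P = prefix, S = suffix (a guess), W = words.
TAX_CHOP_SRC chopFlagsFromLetters(const std::string &letters) {
    TAX_CHOP_SRC value = 0;
    for (std::string::size_type i = 0; i < letters.size(); ++i) {
        switch (letters[i]) {
            case 'P': value |= TAX_CHOP_PREFIX; break;
            case 'S': value |= TAX_CHOP_SUFFIX; break;
            case 'W': value |= TAX_CHOP_WORDS;  break;
            default: break; // unknown letters are ignored in this sketch
        }
    }
    return value;
}

// Usage: "WP" yields prefix removal plus word-by-word chopping.
inline void example() {
    assert(chopFlagsFromLetters("WP") == (TAX_CHOP_PREFIX | TAX_CHOP_WORDS));
}
```

With the real class, the resulting value would be what is passed to setChopSource() and later read back through getChopSource().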