Matrix Science Mascot Parser toolkit
 
ms_parserule_plus Class Reference

Represents a regular expression parse rule plus some additional parameters.

#include <ms_taxonomyrules.hpp>

Public Types

typedef unsigned int TAX_CHOP_SRC
 Data type used for the parameter specifying how to chop a source line. This will be zero or more of the ms_parserule_plus::TAX_CHOP_TYPES values OR'ed together.
 
enum  TAX_CHOP_TYPES {
  TAX_CHOP_PREFIX = 0x0001 ,
  TAX_CHOP_SUFFIX = 0x0002 ,
  TAX_CHOP_WORDS = 0x0004
}
 Constants used for combining TAX_CHOP_SRC values.
 

Public Member Functions

 ms_parserule_plus ()
 Default constructor.
 
 ms_parserule_plus (const ms_parserule_plus &src)
 Copy constructor.
 
 ~ms_parserule_plus ()
 Destructor.
 
void copyFrom (const ms_parserule_plus *right)
 Can be used to create a copy of another instance.
 
void defaultValues ()
 Initialises the instance.
 
TAX_CHOP_SRC getChopSource () const
 Returns additional parameter specifying how to chop a source line.
 
TAX_SPECIES_FORMAT getFileTypeToSearch () const
 Returns the file format.
 
std::string getNameOfDB () const
 Returns the database name.
 
const ms_parserule * getRule () const
 Returns the regular expression-based parse rule.
 
ms_parserule_plus & operator= (const ms_parserule_plus &right)
 Assignment operator for C++ client applications.
 
void setChopSource (const TAX_CHOP_SRC value)
 Change the parameter specifying how to chop a source line.
 
void setFileTypeToSearch (const TAX_SPECIES_FORMAT value)
 Change the file format.
 
void setNameOfDB (const char *name)
 Change the database name.
 
void setRule (const ms_parserule *src)
 Set a new parse rule.
 

Detailed Description

Represents a regular expression parse rule plus some additional parameters.

Member Enumeration Documentation

◆ TAX_CHOP_TYPES

Constants used for combining TAX_CHOP_SRC values.

See Using enumerated values and static const ints in Perl, Java, Python and C#.

Enumerator
TAX_CHOP_PREFIX 

Remove, from the start of the text, all words listed in the PrefixRemoves section. See ms_taxonomyrules::getPrefixRemove().

TAX_CHOP_SUFFIX 

Remove, from the end of the text, all words listed in the SuffixRemoves section. See ms_taxonomyrules::getSuffixRemove().

TAX_CHOP_WORDS 

Remove one word at a time from the end of the text and try to get a taxonomy match again.
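The enumerators are bit flags, so a TAX_CHOP_SRC value is built with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that pattern. It uses local stand-in definitions that mirror the documented values so that it compiles without the toolkit; real code would use ms_parserule_plus::TAX_CHOP_TYPES from ms_taxonomyrules.hpp. The helpers hasChop and describeChop are hypothetical and are not part of the library.

```cpp
#include <string>

// Local stand-ins mirroring the documented values (assumption: real code
// would use ms_parserule_plus::TAX_CHOP_SRC and TAX_CHOP_TYPES instead).
typedef unsigned int TAX_CHOP_SRC;
enum TAX_CHOP_TYPES {
    TAX_CHOP_PREFIX = 0x0001,
    TAX_CHOP_SUFFIX = 0x0002,
    TAX_CHOP_WORDS  = 0x0004
};

// Hypothetical helper: true if 'flag' is set in 'mask'.
inline bool hasChop(TAX_CHOP_SRC mask, TAX_CHOP_TYPES flag) {
    return (mask & flag) != 0;
}

// Hypothetical helper: space-separated names of the flags set in 'mask',
// or "NONE" when no flag is set.
inline std::string describeChop(TAX_CHOP_SRC mask) {
    std::string out;
    if (hasChop(mask, TAX_CHOP_PREFIX)) out += "PREFIX ";
    if (hasChop(mask, TAX_CHOP_SUFFIX)) out += "SUFFIX ";
    if (hasChop(mask, TAX_CHOP_WORDS))  out += "WORDS ";
    if (out.empty()) return "NONE";
    out.erase(out.size() - 1);  // drop the trailing space
    return out;
}
```

For instance, TAX_CHOP_PREFIX | TAX_CHOP_WORDS gives a mask in which hasChop reports the prefix and words flags as set and the suffix flag as clear.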

Member Function Documentation

◆ getChopSource()

ms_parserule_plus::TAX_CHOP_SRC getChopSource ( ) const

Returns additional parameter specifying how to chop a source line.

Parsing taxonomy information from a reference file or FASTA file cannot always be performed reliably using simple regular expressions. The 'CHOP' rules allow prefixes and suffixes to be removed and 'words' to be extracted. For example:

  DefaultRule         NCBI, CHOP:WP   "\(.*\)"    # Everything!
  PrefixRemoves       C;Species: and mitochondrion chloroplast plastid endogenous

specifies that any of the listed prefixes are removed and, if that fails, that one word at a time is removed from the end of the text and the match is tried again.

Returns
Will be zero or more of the ms_parserule_plus::TAX_CHOP_TYPES values OR'ed together.
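The "remove one word at a time and try again" behaviour described for TAX_CHOP_WORDS can be pictured with the sketch below. It is an illustration only, not the toolkit's implementation: the function chopWordsUntilMatch and its callback are hypothetical, words are assumed to be separated by spaces, and the callback stands in for whatever taxonomy lookup is being used.

```cpp
#include <functional>
#include <string>

// Hypothetical illustration of word chopping: test the text, and while it
// does not match, drop the last space-separated word and test again.
// Returns the first text that matches, or an empty string if none does.
inline std::string chopWordsUntilMatch(
        std::string text,
        const std::function<bool(const std::string&)>& matches) {
    while (!text.empty()) {
        if (matches(text)) {
            return text;
        }
        const std::string::size_type lastSpace = text.find_last_of(' ');
        if (lastSpace == std::string::npos) {
            break;  // only one word left and it did not match
        }
        text.erase(lastSpace);  // remove the last word and its separator
        // Remove any further trailing spaces left by repeated separators.
        while (!text.empty() && text[text.size() - 1] == ' ') {
            text.erase(text.size() - 1);
        }
    }
    return std::string();
}
```

With a callback that accepts only "Homo sapiens", the input "Homo sapiens neanderthalensis isolate X" is shortened word by word until it matches.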

◆ setChopSource()

void setChopSource ( const TAX_CHOP_SRC  value)

Change the parameter specifying how to chop a source line.

Parsing taxonomy information from a reference file or FASTA file cannot always be performed reliably using simple regular expressions. The 'CHOP' rules allow prefixes and suffixes to be removed and 'words' to be extracted. For example:

  DefaultRule         NCBI, CHOP:WP   "\(.*\)"    # Everything!
  PrefixRemoves       C;Species: and mitochondrion chloroplast plastid endogenous

specifies that any of the listed prefixes are removed and, if that fails, that one word at a time is removed from the end of the text and the match is tried again.

Parameters
value  Should be zero or more of the ms_parserule_plus::TAX_CHOP_TYPES values OR'ed together.
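As a sketch of how a value for this parameter might be assembled, the helper below turns the letters that follow "CHOP:" in the example rule into a flag mask. The mapping is an assumption inferred from the CHOP:WP example (W for words, P for prefix); the letter S for suffix is a guess by analogy. The function chopMaskFromLetters is hypothetical, and the enum is a local stand-in mirroring the documented values so that the sketch compiles without the toolkit.

```cpp
#include <string>

// Local stand-ins mirroring the documented values (assumption: real code
// would use ms_parserule_plus::TAX_CHOP_SRC and TAX_CHOP_TYPES instead).
typedef unsigned int TAX_CHOP_SRC;
enum TAX_CHOP_TYPES {
    TAX_CHOP_PREFIX = 0x0001,
    TAX_CHOP_SUFFIX = 0x0002,
    TAX_CHOP_WORDS  = 0x0004
};

// Hypothetical helper: build a mask from letters such as "WP".
// Assumed mapping: P = prefix, S = suffix, W = words.
inline TAX_CHOP_SRC chopMaskFromLetters(const std::string& letters) {
    TAX_CHOP_SRC mask = 0;
    for (std::string::size_type i = 0; i < letters.size(); ++i) {
        switch (letters[i]) {
            case 'P': mask |= TAX_CHOP_PREFIX; break;
            case 'S': mask |= TAX_CHOP_SUFFIX; break;
            case 'W': mask |= TAX_CHOP_WORDS;  break;
            default:  break;  // unknown letters are ignored in this sketch
        }
    }
    return mask;
}
```

Under these assumptions, chopMaskFromLetters("WP") yields TAX_CHOP_WORDS | TAX_CHOP_PREFIX, a value of the kind that setChopSource() accepts.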
