Matrix Science Mascot Parser toolkit
 
ms_fileutilities Class Reference

This utility class mainly contains static members.

#include <ms_fileutilities.hpp>

Public Types

enum  CMD_ARGUMENT_PARSE_RESULT {
  CMD_ARGUMENT_PARSE_OK = 0 ,
  CMD_ARGUMENT_PARSE_UNBALANCED = -1 ,
  CMD_ARGUMENT_PARSE_COLLISION = -2 ,
  CMD_ARGUMENT_PARSE_INVALIDCHAR = -3
}
 Return status from parseCommandLineArgumentString().
 
enum  compareXMLFiles_flags {
  CMPXML_NO_FLAG = 0x0000 ,
  CMPXML_IGN_DBL_IDENTICAL = 0x0001 ,
  CMPXML_IGN_DBL_DIFF_100_PPM = 0x0002 ,
  CMPXML_IGN_DBL_DIFF_10_PPM = 0x0004 ,
  CMPXML_IGN_DBL_DIFF_1_PPM = 0x0008 ,
  CMPXML_IGN_DBL_MASK = (CMPXML_IGN_DBL_IDENTICAL | CMPXML_IGN_DBL_DIFF_100_PPM | CMPXML_IGN_DBL_DIFF_10_PPM | CMPXML_IGN_DBL_DIFF_1_PPM) ,
  CMPXML_STRIP_TRAILING_ZEROS = 0x0010 ,
  CMPXML_IGN_COL_SIZE_EQ_ONE = 0x0020 ,
  CMPXML_IGN_TODAY_DATE = 0x0040 ,
  CMPXML_IGN_INF_999999 = 0x0080 ,
  CMPXML_IGN_FILE_URL = 0x0100 ,
  CMPXML_IGN_RAWFILE_NAME = 0x0200
}
 Flags for the compareXmlFiles() function.
 
enum  err_sar {
  SAR_SUCCESS ,
  SAR_FAIL_CHMOD ,
  SAR_FAIL_GET_NAMED_SECURITY_INFO ,
  SAR_FAIL_SET_ENTRIES_IN_ACL ,
  SAR_FAIL_SET_NAMED_SECURITY_INFO ,
  SAR_FAIL_CHOWN
}
 setAccessRights() return value.
 

Public Member Functions

 ms_fileutilities ()
 Default constructor.
 
void findClose ()
 Functions to search for all files that match a wildcard.
 
bool findNext (std::string &fileName)
 Functions to search for all files that match a wildcard.
 
bool findOpen (const std::string directory, const std::string wildcard)
 Functions to search for all files that match a wildcard.
 
unsigned long getSARExtendedErrorCode () const
 
err_sar setAccessRights (const char *filename, const bool bWrite, const bool bExecute, const bool isFile, const ms_mascotoptions &Options, const bool avoid_reapply=false)
 Sets access rights for a file according to settings in mascot.dat.
 

Static Public Member Functions

static bool compareXmlFiles (const std::string &filePath1, const std::string &filePath2, const std::string &schemaPath, const int flags, const std::vector< std::string > &ignoreElements, const std::vector< std::string > &ignoreAttributes, ms_errs &errs, std::string &differences, std::string &lhsOnly, std::string &rhsOnly)
 Compare 2 XML files.
 
static int createDirectory (const std::string &directory, std::string &errorDirectory)
 Create a directory or directory tree.
 
static int deleteDirectory (const std::string &directory, const bool bDeleteSubdirectories=true)
 Delete a directory or directory tree.
 
static bool doesFileExist (const char *filename)
 Returns true if the specified file exists, false otherwise.
 
static std::string findMascotDat (const char *szMascotDatFilename, ms_errs *err=NULL, const int timeoutSec=0)
 Returns a correct path to mascot.dat if it can be found in one of the default places.
 
static UINT64 getFileSize (const char *filename, ms_errs *err=NULL)
 Returns the size of the file.
 
static time_t getLastModificationTime (const char *filename, ms_errs *err=NULL)
 Returns the last file modification time.
 
static std::string getMD5Sum (const std::string &str, const bool b32bitEncoding=true)
 Return an MD5 sum of a string.
 
static bool isDirectory (const char *path)
 Determine if the passed path is a directory or a file.
 
static bool isFileWritable (const char *filePath)
 
static int parseCommandLineArgumentString (const std::string &command, std::vector< std::string > &components)
 Split a command-line argument string into components.
 
static void setLastModificationTime (const char *filename, time_t modificationTime, ms_errs *err=NULL)
 Sets the last file modification time.
 
static std::string stripLastFolder (const std::string pathname)
 Returns a pathname without the last token, e.g. makes "C:\myfolder" from "C:\myfolder\XXX".
 

Detailed Description

This utility class mainly contains static members.

For static members, creating an instance of this class is not required.

See Static functions in Perl, Java, Python and C#

Member Enumeration Documentation

◆ CMD_ARGUMENT_PARSE_RESULT

Return status from parseCommandLineArgumentString().

Enumerator
CMD_ARGUMENT_PARSE_OK 

Command line parsed successfully.

CMD_ARGUMENT_PARSE_UNBALANCED 

Command line contains an unbalanced quote.

CMD_ARGUMENT_PARSE_COLLISION 

Command line contains a starting (ending) quote preceded (followed) by a non-whitespace character.

CMD_ARGUMENT_PARSE_INVALIDCHAR 

Command line contains invalid characters, such as linebreaks.

◆ compareXMLFiles_flags

Flags for the compareXmlFiles() function.

Enumerator
CMPXML_NO_FLAG 

No specific options.

CMPXML_IGN_DBL_IDENTICAL 

For example 0.2000 and 0.2 would be considered identical if the schema type for the value is double.

CMPXML_IGN_DBL_DIFF_100_PPM 

Ignore differences of 100 ppm (1/10,000) between floating point values if the schema type for the value is double.

CMPXML_IGN_DBL_DIFF_10_PPM 

Ignore differences of 10 ppm (1/100,000) between floating point values if the schema type for the value is double.

CMPXML_IGN_DBL_DIFF_1_PPM 

Ignore differences of 1 ppm (1/1,000,000) between floating point values if the schema type for the value is double.

CMPXML_STRIP_TRAILING_ZEROS 

For example, 0.2000 and 0.2 would be considered identical even if the schema type is not double.

CMPXML_IGN_COL_SIZE_EQ_ONE 

Needed for distiller_quantitation_2, where Distiller 2.7 has zero rows (and hence zero columns) but still records a column size of one: <calcMoverZ col_size="1" row_size="0"/>

CMPXML_IGN_TODAY_DATE 

If the XML contains today's date, it is unlikely to match the date recorded when the baseline was written. The date is of the form val="Wed Apr 18 16:06:24 2018".

CMPXML_IGN_INF_999999 

Double values of "INF" and "999999999999999970000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000" will compare.

CMPXML_IGN_FILE_URL 

Text strings starting with "file:///" will ignore directory differences and only compare the filename part. For example: "file:///c:/tmp/filename.txt" will compare OK with "file:///server/mytestdir/tmp/filename.txt" but not with "file:///c:/tmp/AnotherFile.txt".

CMPXML_IGN_RAWFILE_NAME 

Text strings for an attribute named "val" in element "info" which has parent element "rawfile" will ignore directory differences and only compare the filename part. For example: val="e:%5cbuildserver%5cfilename.wiff" will compare OK with val="C:%5cbackedup%5cgit%5cfilename.wiff".
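
The flag values are single bits, so several can be combined with bitwise OR and the floating point tolerance bits can be isolated again with CMPXML_IGN_DBL_MASK. The following is a minimal, self-contained sketch of that arithmetic and of a ppm tolerance test. The enum is a local copy of some of the values listed above so that the sketch compiles without the Mascot Parser headers, and withinPpm() is a hypothetical helper: the reference value that the library uses for the ppm comparison is not documented here, so taking the larger magnitude is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Local copies of some of the flag values listed above (not the library's
// own declaration, which is nested inside ms_fileutilities).
enum compareXMLFiles_flags {
    CMPXML_NO_FLAG              = 0x0000,
    CMPXML_IGN_DBL_IDENTICAL    = 0x0001,
    CMPXML_IGN_DBL_DIFF_100_PPM = 0x0002,
    CMPXML_IGN_DBL_DIFF_10_PPM  = 0x0004,
    CMPXML_IGN_DBL_DIFF_1_PPM   = 0x0008,
    CMPXML_IGN_DBL_MASK         = 0x000F,
    CMPXML_IGN_TODAY_DATE       = 0x0040
};

// Hypothetical helper: true if a and b differ by no more than `ppm` parts per
// million of the larger magnitude. The choice of reference value is an
// assumption of this sketch.
bool withinPpm(double a, double b, double ppm)
{
    assert(ppm >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= scale * ppm * 1e-6;
}

// Hypothetical helper: keep only the floating point tolerance bits of a
// combined flags value.
int doubleToleranceBits(int flags)
{
    return flags & CMPXML_IGN_DBL_MASK;
}
```

With these definitions, CMPXML_IGN_DBL_DIFF_10_PPM | CMPXML_IGN_TODAY_DATE is 0x0044, and doubleToleranceBits() of that value gives back CMPXML_IGN_DBL_DIFF_10_PPM. Under the stated assumption, 1000.0 and 1000.05 agree within 100 ppm but not within 10 ppm.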

◆ err_sar

enum err_sar

setAccessRights() return value.

Enumerator
SAR_SUCCESS 

Success!

SAR_FAIL_CHMOD 

Error returned by call to chmod().

SAR_FAIL_GET_NAMED_SECURITY_INFO 

Error returned by call to GetNamedSecurityInfo().

SAR_FAIL_SET_ENTRIES_IN_ACL 

Error returned by call to SetEntriesInAcl().

SAR_FAIL_SET_NAMED_SECURITY_INFO 

Error returned by call to SetNamedSecurityInfo().

SAR_FAIL_CHOWN 

Error returned by call to chown().

Constructor & Destructor Documentation

◆ ms_fileutilities()

Default constructor.

Creating an instance of this class is not required.

Member Function Documentation

◆ compareXmlFiles()

bool compareXmlFiles ( const std::string &  filePath1,
const std::string &  filePath2,
const std::string &  schemaPath,
const int  flags,
const std::vector< std::string > &  ignoreElements,
const std::vector< std::string > &  ignoreAttributes,
ms_errs &  errs,
std::string &  differences,
std::string &  lhsOnly,
std::string &  rhsOnly 
)
static

Compare 2 XML files.

See Multiple return values in Perl, Java, Python and C#.

Warning
If a relative path for the xsd is specified, it is resolved relative to the documents and not relative to the current working directory. This means that if the files to be compared are in different directories, absolute paths will be required in schemaPath.
Parameters
[in] filePath1 is the path for the first (or left) file.
[in] filePath2 is the path for the second (or right) file.
[in] schemaPath must be valid and must have space separated pairs of "namespace" "path_to_xsd". See the example in ms_mascotresfilebase::setXMLschemaFilePath.
[in] flags is one or more ms_fileutilities::compareXMLFiles_flags bitwise OR'd together.
[in] ignoreElements is a vector of element names for which the element values will be ignored. Any attributes and child elements for the specified elements will still be compared. The full path to the element should be specified, e.g. "<quantitationResults><peptideMatch><partner><xic><intensity>". See Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#
[in] ignoreAttributes is a vector of strings of attribute names to be ignored in the comparison. If a value contains an '=', then the attribute will only be ignored if it is equal to the specified value in either file. For example, 'minorVersion' would ignore all 'minorVersion' attributes, but 'minorVersion="2"' will only ignore the attribute if either side has a value of "2". See Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#
[out] errs is used to return a list of any XML parsing errors.
[out] differences is used to return a list of attribute and value differences for elements that appear in both input files.
[out] lhsOnly is used to return a list of elements that only appear in file1.
[out] rhsOnly is used to return a list of elements that only appear in file2.
Returns
true if the files could be read and validated against the schema file

◆ createDirectory()

int createDirectory ( const std::string &  directory,
std::string &  errorDirectory 
)
static

Create a directory or directory tree.

See Static functions in Perl, Java, Python and C#

See Multiple return values in Perl, Java, Python and C#.

Parameters
[in] directory is the path of the directory to be created.
[out] errorDirectory is the path of the first directory that it fails to create if the function fails.
Returns
0 for success, or errno if there is an error.

◆ deleteDirectory()

int deleteDirectory ( const std::string &  directory,
const bool  bDeleteSubdirectories = true 
)
static

Delete a directory or directory tree.

See Static functions in Perl, Java, Python and C#

Deletes the contents and subdirectories of a tree.

Parameters
[in] directory is the path of the directory to be deleted.
[in] bDeleteSubdirectories is a flag to specify if subdirectories should be deleted.
Returns
0 for success, or errno/Windows Error code if there is an error.

◆ doesFileExist()

bool doesFileExist ( const char *  filename)
static

Returns true if the specified file exists, false otherwise.

See Static functions in Perl, Java, Python and C#

Parameters
filename is the file to check
Returns
true if file exists

◆ findClose()

void findClose ( )

Functions to search for all files that match a wildcard.

To use findOpen(), findNext() and findClose() :

These functions cannot currently be called from any language apart from C++

Do not call findOpen() for a second directory on any ms_fileutilities object until findClose() has been called for the first directory.
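
A minimal sketch of the open / next / close sequence follows. To keep it compilable without the Mascot Parser headers, listMatches() is written as a template over any type that offers the three members documented here, and FakeFinder is a hypothetical stand-in used only to exercise it; both names are assumptions of this sketch and not part of the library. With the library available, an ms_fileutilities instance would be passed instead. Whether findClose() must also be called after a failed findOpen() is not stated in this documentation, so the sketch does not call it in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collects every file name reported by a finder object. Finder is any type
// with findOpen(), findNext() and findClose() members as documented here.
template <class Finder>
std::vector<std::string> listMatches(Finder &finder,
                                     const std::string &directory,
                                     const std::string &wildcard)
{
    std::vector<std::string> names;
    if (!finder.findOpen(directory, wildcard)) {
        return names;                  // the search could not be started
    }
    std::string fileName;
    while (finder.findNext(fileName)) {
        names.push_back(fileName);     // one entry per matching file
    }
    finder.findClose();                // close before any further findOpen()
    return names;
}

// Hypothetical stand-in that replays a fixed list of names. It exists only
// so that listMatches() can be exercised without the library.
struct FakeFinder {
    std::vector<std::string> files;
    std::size_t pos = 0;
    bool open = false;

    bool findOpen(const std::string, const std::string)
    {
        assert(!open);                 // one search at a time per object
        open = true;
        pos = 0;
        return true;
    }
    bool findNext(std::string &fileName)
    {
        if (pos >= files.size()) {
            return false;
        }
        fileName = files[pos++];
        return true;
    }
    void findClose() { open = false; }
};
```

Calling listMatches(finder, directory, "*.dat") returns the names in the order in which findNext() reported them and leaves the finder closed, so the same object can be reused for another directory.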

◆ findMascotDat()

std::string findMascotDat ( const char *  szMascotDatFilename,
ms_errs *  err = NULL,
const int  timeout = 0 
)
static

Returns a correct path to mascot.dat if it can be found in one of the default places.

See Static functions in Perl, Java, Python and C#

Parameters
szMascotDatFilename is an optional filename. If an empty string or a null pointer is passed, default values are used depending on the platform.
err is a pointer to an error object which can be used to return detailed errors. See Passing objects to functions in Perl, Java, Python and C#
timeout is the time in seconds to look for the file. If the file is being edited, it may not exist for a short period, so rather than returning an error immediately, this function will wait.
Returns
path to file mascot.dat

◆ findNext()

bool findNext ( std::string &  fileName)

Functions to search for all files that match a wildcard.

To use findOpen(), findNext() and findClose() :

These functions cannot currently be called from any language apart from C++

Do not call findOpen() for a second directory on any ms_fileutilities object until findClose() has been called for the first directory.

Parameters
fileName is the name of the found file.
Returns
true if a file was found.

◆ findOpen()

bool findOpen ( const std::string  directory,
const std::string  wildcard 
)

Functions to search for all files that match a wildcard.

To use findOpen(), findNext() and findClose() :

These functions cannot currently be called from any language apart from C++

Do not call findOpen() for a second directory on any ms_fileutilities object until findClose() has been called for the first directory.

Parameters
directory is the directory in which to search for files.
wildcard is a wildcard pattern. For example "*.dat".
Returns
true if the function succeeds

◆ getFileSize()

UINT64 getFileSize ( const char *  filename,
ms_errs *  err = NULL 
)
static

Returns the size of the file.

See Static functions in Perl, Java, Python and C#

Parameters
filename is the relative or absolute path to the file
err is a pointer to an optional error object which can be used to return detailed errors. See Passing objects to functions in Perl, Java, Python and C#
Returns
0 if the function fails

◆ getLastModificationTime()

time_t getLastModificationTime ( const char *  filename,
ms_errs *  err = NULL 
)
static

Returns the last file modification time.

See Static functions in Perl, Java, Python and C#

Parameters
filename is the relative or absolute path to the file
err is a pointer to an optional error object which can be used to return detailed errors. See Passing objects to functions in Perl, Java, Python and C#
Returns
0 if the function fails

◆ getMD5Sum()

std::string getMD5Sum ( const std::string &  str,
const bool  b32bitEncoding = true 
)
static

Return an MD5 sum of a string.

See Static functions in Perl, Java, Python and C#

Useful for creating a unique filename from, for example, a long pathname or a set of options.

For 64 bit encoding, the returned text is base 64 encoded with the URL and filename safe alphabet; see http://tools.ietf.org/html/rfc3548#section-4.

For 32 bit encoding, the returned text is base 32 encoded (except that lower case rather than upper case letters are used); see http://tools.ietf.org/html/rfc3548#section-5.

Parameters
str is the string (of any length) that needs to be converted. If a zero length string is passed, a zero length string will be returned.
b32bitEncoding should be set to true to give a 32 bit encoded string (essential for file systems where an upper case filename and a lower case filename are considered to be the same). If this value is false, then 64 bit encoding is used.
Returns
the MD5 sum of the string.
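
To illustrate the 32 bit encoding described above, the following self-contained sketch encodes arbitrary bytes with the RFC 3548 base 32 alphabet in lower case. It does not compute an MD5 sum and is not the library's implementation; base32Lower() is a hypothetical name, and omitting the '=' padding is an assumption, since this documentation does not say whether getMD5Sum() pads its result.

```cpp
#include <cassert>
#include <string>

// Encodes bytes with the RFC 3548 base 32 alphabet, lower-cased and without
// '=' padding (the absence of padding is an assumption of this sketch).
std::string base32Lower(const std::string &bytes)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz234567";
    std::string out;
    unsigned int buffer = 0;   // most recent input bits, newest at the low end
    int bits = 0;              // number of bits in buffer not yet emitted
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        buffer = (buffer << 8) | static_cast<unsigned char>(bytes[i]);
        bits += 8;
        while (bits >= 5) {    // emit one character per complete 5-bit group
            out += alphabet[(buffer >> (bits - 5)) & 0x1F];
            bits -= 5;
        }
    }
    assert(bits >= 0 && bits < 5);
    if (bits > 0) {            // final partial group is zero-filled on the right
        out += alphabet[(buffer << (5 - bits)) & 0x1F];
    }
    return out;
}
```

Because the output uses only lower case letters and the digits 2 to 7, two different results can never differ by case alone, which is why this form is suitable for case-insensitive file systems. Under the no-padding assumption, a 16 byte digest encodes to 26 characters.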

◆ getSARExtendedErrorCode()

unsigned long getSARExtendedErrorCode ( ) const

This can be called if setAccessRights() returns one of the following:

  • SAR_FAIL_GET_NAMED_SECURITY_INFO
  • SAR_FAIL_SET_ENTRIES_IN_ACL
  • SAR_FAIL_SET_NAMED_SECURITY_INFO

The value will be zero for any other errors and is used to return an error code for Windows only.

Returns
An optional extra errorCode if setAccessRights failed.

◆ isDirectory()

bool isDirectory ( const char *  path)
static

Determine if the passed path is a directory or a file.

See Static functions in Perl, Java, Python and C#

Parameters
path is the path to a directory or file
Returns
True if the path specifies a directory, otherwise returns false. If the path doesn't exist, the function will also return false.

◆ isFileWritable()

bool isFileWritable ( const char *  filePath)
static

See Static functions in Perl, Java, Python and C#

Parameters
filePath is the file to check
Returns
true if the file is writable

◆ parseCommandLineArgumentString()

int parseCommandLineArgumentString ( const std::string &  command,
std::vector< std::string > &  components 
)
static

Split a command-line argument string into components.

A command-line argument string consists of an executable name or filepath followed by a list of zero or more arguments. The executable and arguments are separated by one or more whitespace characters. Here are some examples at increasing levels of complexity:

  dir
  cd "C:/Program Files"
  "C:/Program Files/Perl64/bin/perl.exe" C:/inetpub/mascot/bin/dbman_download.pl SwissProt

Any of the components can include spaces, including the executable name; in this case, the component with spaces must be delimited by quotes ("). If the command string does include quotes, the quotes must be balanced, in the sense that every opening quote must be followed by a closing quote. Further, an opening quote must be preceded by whitespace (or start the string) and a closing quote must be followed by whitespace (or end the string). Otherwise the method indicates a syntax error (as such strings are ambiguous). This means the following are all syntax errors and cause the method to return a non-OK status:

   "dir
   cd"C:/Program Files"
   "C:/Program Files/Perl64/bin/perl.exe"C:/inetpub/mascot/bin/dbman_download.pl SwissProt

Note that any of the components, including the executable name, can be the empty string, e.g.

   dir ""
   "" SwissProt

You need to check each item in components before processing further, even if the command string is parsed without error.

If you pass in the empty string in command, you will get an OK status and an empty vector out.

This method does not interpret the components in any way. In particular, it does not check that the first component is an executable filepath or that it exists on disk.

Parameters
command is the command-line style string.
components receives the components parsed from command, with quotes stripped out.
Returns
Status indicating success or failure, one of ms_fileutilities::CMD_ARGUMENT_PARSE_RESULT.
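
The rules described above can be sketched as a small standalone splitter. This is an illustration of the documented behaviour under stated assumptions, not the library's implementation: splitCommandLine() and isSpace() are hypothetical names, the enum is a local copy of the documented status values, only carriage return and line feed are treated as invalid characters, and when a string breaks several rules the sketch reports whichever problem it meets first.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Local copy of the documented status values, so the sketch is self-contained.
enum CMD_ARGUMENT_PARSE_RESULT {
    CMD_ARGUMENT_PARSE_OK          =  0,
    CMD_ARGUMENT_PARSE_UNBALANCED  = -1,
    CMD_ARGUMENT_PARSE_COLLISION   = -2,
    CMD_ARGUMENT_PARSE_INVALIDCHAR = -3
};

static bool isSpace(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical splitter following the documented rules. components is only
// modified when the whole string parses successfully.
int splitCommandLine(const std::string &command,
                     std::vector<std::string> &components)
{
    if (command.find_first_of("\r\n") != std::string::npos) {
        return CMD_ARGUMENT_PARSE_INVALIDCHAR;
    }
    std::vector<std::string> parsed;
    const std::string::size_type n = command.size();
    std::string::size_type i = 0;
    while (i < n) {
        if (isSpace(command[i])) {          // skip separators
            ++i;
        } else if (command[i] == '"') {     // quoted component
            const std::string::size_type close = command.find('"', i + 1);
            if (close == std::string::npos) {
                return CMD_ARGUMENT_PARSE_UNBALANCED;
            }
            if (close + 1 < n && !isSpace(command[close + 1])) {
                return CMD_ARGUMENT_PARSE_COLLISION;
            }
            parsed.push_back(command.substr(i + 1, close - i - 1));
            i = close + 1;
        } else {                            // unquoted component
            std::string::size_type j = i;
            while (j < n && !isSpace(command[j])) {
                if (command[j] == '"') {    // quote glued to preceding text
                    return CMD_ARGUMENT_PARSE_COLLISION;
                }
                ++j;
            }
            parsed.push_back(command.substr(i, j - i));
            i = j;
        }
    }
    assert(i == n);
    components.swap(parsed);
    return CMD_ARGUMENT_PARSE_OK;
}
```

For the examples above, this sketch splits cd "C:/Program Files" into the two components cd and C:/Program Files, returns a non-OK status for each of the three malformed strings, and yields an empty second component for dir "".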

◆ setAccessRights()

ms_fileutilities::err_sar setAccessRights ( const char *  filename,
const bool  bWrite,
const bool  bExecute,
const bool  bIsFile,
const ms_mascotoptions &  Options,
const bool  avoid_reapply = false 
)

Sets access rights for a file according to settings in mascot.dat.

See Using enumerated values and static const ints in Perl, Java, Python and C#.

Windows:

The bWrite flag is ignored.

The bExecute flag is ignored.

  • Always does a chmod(filename, _S_IREAD | _S_IWRITE)
  • If the filename is a directory, and this is running on NT4, nothing further is done
  • A security descriptor is constructed using the name specified by ms_mascotoptions::getNTIUserGroup(). With NT4, Service Pack 4, it isn't possible to add a group to existing groups (Q242510), so in this case only, the existing security is replaced with the NTIUserGroup name and the name specified by ms_mascotoptions::getNTMonitorGroup()
  • For a memory mapped object (where bIsFile is false), STANDARD_RIGHTS_ALL | SPECIFIC_RIGHTS_ALL are added to the current rights for the group specified above.
  • For a file or directory (where bIsFile is true), FILE_ALL_ACCESS rights are added to the current rights for the group specified above.

Unix:

The bIsFile parameter is ignored.

If the filename passed is for a file rather than a directory, then chmod is called with the following flags:

  • S_IRUSR (Read access for user)
  • S_IWUSR (Write access for user)
  • S_IXUSR (Execute access for user) - but only if the bExecute flag is set
  • S_IRGRP (Read access for group)
  • S_IWGRP (Write access for group) - but only if the bWrite flag is set
  • S_IXGRP (Execute access for group) - but only if the bExecute flag is set
  • S_IROTH (Read access for others) - but only if ms_mascotoptions::getUnixWebUserGroup() is -1
  • S_IWOTH (Write access for others) - but only if ms_mascotoptions::getUnixWebUserGroup() is -1 and if the bWrite flag is set
  • S_IXOTH (Execute access for others) - but only if ms_mascotoptions::getUnixWebUserGroup() is -1 and if the bExecute flag is set

If the filename passed is for a directory, then chmod is called with the flags above unless ms_mascotoptions::getUnixDirPerm() is defined, in which case the permissions used are defined by that entry in mascot.dat.

If ms_mascotoptions::getUnixWebUserGroup() is not -1 and is not -2, then a chown will be performed using the group id specified by ms_mascotoptions::getUnixWebUserGroup().
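
The Unix rules for a file (not a directory) can be sketched as a function that builds the chmod mode from the two flags and the web user group value. This is a POSIX-only illustration of the list above, not the library's code: unixFileMode() is a hypothetical name, and the webUserGroup argument stands in for the value returned by ms_mascotoptions::getUnixWebUserGroup().

```cpp
#include <cassert>
#include <sys/stat.h>   // POSIX: mode_t and the S_I* permission bits

// Hypothetical helper: builds the permission bits described above for a file.
// webUserGroup stands in for ms_mascotoptions::getUnixWebUserGroup().
mode_t unixFileMode(bool bWrite, bool bExecute, int webUserGroup)
{
    // Access for "others" is only granted when no web user group is set (-1).
    const bool grantOthers = (webUserGroup == -1);

    // User read/write and group read are always set.
    mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP;

    if (bExecute) {
        mode |= S_IXUSR | S_IXGRP;
    }
    if (bWrite) {
        mode |= S_IWGRP;
    }
    if (grantOthers) {
        mode |= S_IROTH;
        if (bWrite) {
            mode |= S_IWOTH;
        }
        if (bExecute) {
            mode |= S_IXOTH;
        }
    }
    assert((mode & ~static_cast<mode_t>(0777)) == 0);  // only rwx bits used
    return mode;
}
```

For example, a writable, non-executable file with a web user group of -1 gets mode 0666, while a read-only file with a real group id gets 0640.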

Parameters
filename is the name of the file or memory mapped object that will be changed by this function.
bWrite should be set to true if the file should be writable by the group and others.
bExecute should be set to true for executable files or directories.
bIsFile should be set to true for files and directories, and false for memory mapped file objects.
Options is the options section of mascot.dat.
avoid_reapply should be set to true to avoid applying the rights if the file already has the desired access configuration. Windows only; false by default.
Returns
Will return SAR_SUCCESS if it succeeds. See also getSARExtendedErrorCode().

◆ setLastModificationTime()

void setLastModificationTime ( const char *  filename,
time_t  modificationTime,
ms_errs *  err = NULL 
)
static

Sets the last file modification time.

See Static functions in Perl, Java, Python and C#

Parameters
filename is the relative or absolute path to the file
modificationTime is the time to be applied as the file's last write time
err is a pointer to an optional error object which can be used to return detailed errors. See Passing objects to functions in Perl, Java, Python and C#

◆ stripLastFolder()

std::string stripLastFolder ( const std::string  pathname)
static

Returns a pathname without the last token, e.g. makes "C:\myfolder" from "C:\myfolder\XXX".

See Static functions in Perl, Java, Python and C#

Parameters
pathname is the file path to strip.
Returns
stripped folder name
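
The transformation in the example above can be sketched in a few lines. This is only an illustration: stripLastToken() is a hypothetical name, and the handling of forward slashes, trailing separators and paths without any separator is not specified in this documentation, so the choices below are assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: drops everything from the last path separator onwards.
// Assumptions: both '\\' and '/' count as separators, and a path without any
// separator yields an empty string.
std::string stripLastToken(const std::string &pathname)
{
    const std::string::size_type pos = pathname.find_last_of("/\\");
    if (pos == std::string::npos) {
        return std::string();
    }
    assert(pos < pathname.size());
    return pathname.substr(0, pos);
}
```

With this sketch, "C:\myfolder\XXX" becomes "C:\myfolder", matching the documented example.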
