Utility class for compressing and decompressing data in compress format, and compressing in gzip format.
#include <ms_zip.hpp>
Public Member Functions | |
ms_zip (const bool isZipped) | |
Streaming mode constructor. | |
ms_zip (const bool isZipped, const std::string &buffer) | |
Buffer mode constructor. | |
ms_zip (const bool isZipped, const unsigned char *buffer, const unsigned long len) | |
Buffer mode constructor (only for C++). | |
ms_zip (const ms_zip &src) | |
Copying constructor. | |
void | appendErrors (const ms_errors &src) |
Copies all errors from another instance and appends them at the end of own list. | |
void | clearAllErrors () |
Remove all errors from the current list of errors. | |
std::string | compressMore (const std::string &dataIn) |
Feed more data to the compressor and retrieve the next chunk of compressed data. | |
void | compressMore (const unsigned char *dataIn, const unsigned long inputLen, unsigned char *dataOut, unsigned long *outputLen) |
Feed more data to the compressor and retrieve the next chunk of compressed data. | |
void | copyFrom (const ms_errors *right) |
Use this member to make a copy of another instance. | |
void | copyFrom (const ms_zip *right) |
Call this member to copy all the information from another instance. | |
std::string | decompressMore (const std::string &dataIn) |
Retrieve the next chunk of decompressed data. | |
void | decompressMore (const unsigned char *dataIn, const unsigned long inputLen, unsigned char *dataOut, unsigned long *outputLen) |
Retrieve the next chunk of decompressed data. | |
const ms_errs * | getErrorHandler () const |
Retrieve the error object using this function to get access to all errors and error parameters. | |
int | getLastError () const |
Return the error number of the last error that occurred. | |
std::string | getLastErrorString () const |
Return the error description of the last error that occurred. | |
std::string | getUnZipped () const |
Return the uncompressed buffer. | |
unsigned long | getUnZipped (unsigned char *buffer, const unsigned long len) const |
Copy the uncompressed buffer into a buffer. | |
unsigned long | getUnZippedLen () const |
Return the length of the uncompressed buffer. | |
std::string | getZipped () const |
Return the compressed buffer. | |
unsigned long | getZipped (unsigned char *buffer, const unsigned long len) const |
Copy the compressed buffer into a buffer. | |
unsigned long | getZippedLen () const |
Return the length of the compressed buffer. | |
bool | isValid () const |
Call this function to determine if there have been any errors. | |
ms_zip & | operator= (const ms_zip &right) |
C++ style assignment operator. | |
Utility class for compressing and decompressing data in compress format, and compressing in gzip format.
ms_zip is used internally in Mascot Parser, for example for decompressing unimod.xml
and other configuration files when they are downloaded from a remote web site.
There are two ways to use the library: buffer mode and streaming mode.
Usage in buffer mode:
In buffer mode, both the compressed and decompressed data are kept in an internal buffer. The uncompressed input must not exceed 10 MB. The format of the compressed data is identical to that of the Unix command-line utility compress, except that the first four bytes of the buffer contain the length of the uncompressed data (see the buffer mode constructors). For this reason, ms_zip buffer mode should only be used to decompress data that has been compressed with ms_zip buffer mode.
Usage in streaming compression mode:
In streaming mode, only the most recent input chunk and the most recent output chunk are stored in the internal buffer. The input to ms_zip::compressMore() must not exceed 10 MB. The compressed data stream follows the gzip format with a minimal file header. The data can be decompressed with any tool that supports gzipped files.
Usage in streaming decompression mode:
The input to ms_zip::decompressMore() must not exceed 10 MB. The compressed data stream is expected to be in gzip format with a minimal file header, for example a gzipped file.
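Because each call to ms_zip::compressMore() or ms_zip::decompressMore() accepts at most 10 MB, larger inputs have to be cut into pieces first. The following is a minimal sketch of such a splitter, independent of ms_zip. The names splitIntoChunks and kAssumedMaxChunk are hypothetical, and the byte value chosen for "10 MB" is an assumption, since the text above does not define it exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes. The documentation
// does not say whether MB or MiB is meant, so a smaller value is safer.
constexpr std::size_t kAssumedMaxChunk = 10u * 1024u * 1024u;

// Split `data` into consecutive pieces of at most `chunkSize` bytes.
// Embedded zero bytes are preserved because std::string tracks its length.
std::vector<std::string> splitIntoChunks(const std::string& data,
                                         std::size_t chunkSize)
{
    assert(chunkSize > 0);
    std::vector<std::string> chunks;
    for (std::size_t pos = 0; pos < data.size(); pos += chunkSize) {
        // substr() clamps the count to the remaining length.
        chunks.push_back(data.substr(pos, chunkSize));
    }
    return chunks;
}
```

Each returned piece could then be passed to one streaming call. Using a chunk size well below the limit also follows the advice to keep the input smaller than the output buffer.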
ms_zip | ( | const bool | isZipped, |
const unsigned char * | buffer, | ||
const unsigned long | len | ||
) |
Buffer mode constructor (only for C++).
This constructor initialises ms_zip for buffer mode. ms_zip can be used to compress and uncompress a small amount of data (less than 10 MB) in a buffer. When decompressing in buffer mode, the only supported data format is data compressed by ms_zip in buffer mode.
The first four bytes of the compressed buffer store the length of the uncompressed data as an unsigned 32-bit little-endian integer. No other headers are written. The object holds a copy of both the compressed and the uncompressed data, so it is not suitable for large amounts of data. Use streaming mode in that case.
After creating the object, call isValid() to determine if there are any errors. The errors can be retrieved using getLastErrorString().
Possible errors are:
isZipped | should be true if the passed buffer contains compressed data (decompression mode), and false if it contains uncompressed data (compression mode). |
buffer | is a pointer to the compressed or uncompressed data. If it is uncompressed, there is no assumption that the data is a null terminated string. |
len | is the length of the passed buffer. |
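The four-byte length prefix described above can be illustrated without ms_zip. This sketch only shows what "unsigned 32-bit little-endian integer" means at the byte level. The helper names encodeLengthPrefix and decodeLengthPrefix are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode a 32-bit length as four bytes, least significant byte first.
std::string encodeLengthPrefix(std::uint32_t len)
{
    std::string out(4, '\0');
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<char>((len >> (8 * i)) & 0xFFu);
    }
    return out;
}

// Read the length back from the first four bytes of a buffer.
std::uint32_t decodeLengthPrefix(const std::string& buffer)
{
    assert(buffer.size() >= 4);
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        // Go through unsigned char so that bytes >= 0x80 are not sign-extended.
        const unsigned char byte = static_cast<unsigned char>(buffer[i]);
        len |= static_cast<std::uint32_t>(byte) << (8 * i);
    }
    return len;
}
```

A reader of a buffer produced in buffer mode could use the decoded value to size the destination before decompressing, which is presumably why the prefix exists.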
ms_zip | ( | const bool | isZipped, |
const std::string & | buffer | ||
) |
Buffer mode constructor.
This constructor initialises ms_zip for buffer mode. ms_zip can be used to compress and uncompress a small amount of data (less than 10 MB) in a buffer. When decompressing in buffer mode, the only supported data format is data compressed by ms_zip in buffer mode.
The first four bytes of the compressed buffer store the length of the uncompressed data as an unsigned 32-bit little-endian integer. No other headers are written. The object holds a copy of both the compressed and the uncompressed data, so it is not suitable for large amounts of data. Use streaming mode in that case.
After creating the object, call isValid() to determine if there are any errors. The errors can be retrieved using getLastErrorString().
Possible errors are:
isZipped | should be true if the passed buffer contains compressed data (decompression mode), and false if it contains uncompressed data (compression mode). |
buffer | is a string that contains the compressed or uncompressed data. If it is uncompressed, there is no assumption that the data is a null terminated string. C++ programmers should be aware that a std::string constructor needs to be passed the length parameter when creating a std::string that contains binary data. Otherwise, a 'zero' in the data will be considered to be a null terminator for a string. |
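The warning about binary data in a std::string can be made concrete. The sketch below uses only the standard library. The helper name makeBinaryString is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build a std::string from raw bytes, keeping embedded zero bytes.
// The (pointer, length) constructor copies exactly `len` bytes, whereas the
// pointer-only constructor stops at the first zero byte.
std::string makeBinaryString(const unsigned char* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    if (len == 0) {
        return std::string();
    }
    return std::string(reinterpret_cast<const char*>(data), len);
}
```

A string built this way keeps its full length and can be passed as the buffer argument of the constructor, or as dataIn to the streaming methods.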
explicit ms_zip | ( | const bool | isZipped | ) |
Streaming mode constructor.
This constructor initialises ms_zip in streaming mode. ms_zip can be used for compressing arbitrarily large data streams into gzip format. When decompressing in streaming mode, the only supported data format is gzip format.
In streaming compression mode, the compressed data stream will start with a minimal gzip header, followed by the compressed data. The constructor does not take any input data as a parameter, unlike in buffer mode. Instead, input data is fed to the object using ms_zip::compressMore(), which returns sequential chunks of compressed data. End of input data is indicated by passing the empty string to ms_zip::compressMore().
In streaming decompression mode, the compressed data stream is expected to start with a minimal gzip header, followed by the compressed data. Input data is fed to the object using ms_zip::decompressMore(), which returns sequential chunks of decompressed data. End of input data is indicated by passing the empty string to ms_zip::decompressMore().
If the underlying zlib library cannot be initialised, the following errors are possible:
isZipped | must always be false for compression mode. |
ms_zip | ( | const ms_zip & | src | ) |
Copying constructor.
Generally only used from C++, but will be called indirectly from other languages.
src | is the ms_zip to make a copy of. |
void appendErrors | ( | const ms_errors & | src | ) |
inherited
Copies all errors from another instance and appends them at the end of own list.
src | The object to copy the errors across from. See Maintaining object references: two rules of thumb. |
void clearAllErrors | ( | ) |
inherited
Remove all errors from the current list of errors.
The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.
All messages are accumulated into a list in this object, until clearAllErrors() is called.
See Error Handling.
std::string compressMore | ( | const std::string & | dataIn | ) |
Feed more data to the compressor and retrieve the next chunk of compressed data.
You may need to call ms_zip::compressMore() a few times with more input data until the compressed stream starts. This is indicated by returning a non-empty string.
Both the input and output strings are bounded: the input string length must not exceed 10 MB and the output string length will never exceed 10 MB.
End of input data is indicated by calling ms_zip::compressMore() with the empty string (length 0). The compressor will flush the output and return the last compressed bytes. If there are more bytes to return than the internal buffer allows, you must call the method again with the empty string. When there is no more data to be flushed, the returned string will be empty (length 0).
If the compressor has more output than fits in the output buffer, ms_zip pushes it into an internal output queue. The next call to ms_zip::compressMore() returns the chunk from the head of the internal queue and may append more to the tail of the queue. The operation is not visible to the caller. However, it's best to keep the size of input smaller than the output buffer, as otherwise ms_zip may use an unexpected amount of memory for the output queue.
If this method is called in non-streaming mode, an empty string object will be returned and the error ms_errs::ERR_MSP_ZIP_NOTSTREAMING set.
Other possible error conditions:
dataIn | Next chunk of raw binary data to compress. The data need not be null terminated. C++ programmers should be aware that a std::string constructor needs to be passed the length parameter when creating a std::string that contains binary data. Otherwise, a 'zero' in the data will be considered to be a null terminator for a string. |
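The calling pattern described above (feed chunks, then pass the empty string until an empty string comes back) can be sketched as a small template. It works with any object that has a compressMore(const std::string&) member returning std::string, so it can be compiled without the library. FakeZipper is a hypothetical stand-in used only for illustration, not ms_zip, and it does not compress anything. The value of kAssumedMaxInput is an assumed reading of "10 MB".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
constexpr std::size_t kAssumedMaxInput = 10u * 1024u * 1024u;

// Feed every chunk, then flush with empty strings until nothing is returned.
template <typename Zipper>
std::string compressAll(Zipper& zip, const std::vector<std::string>& chunks)
{
    std::string out;
    for (const std::string& chunk : chunks) {
        if (chunk.empty()) {
            continue;  // an empty string would signal end of input too early
        }
        assert(chunk.size() <= kAssumedMaxInput);
        out += zip.compressMore(chunk);  // may be empty until the stream starts
    }
    for (;;) {
        const std::string tail = zip.compressMore(std::string());
        if (tail.empty()) {
            break;  // nothing left to flush
        }
        out += tail;
    }
    return out;
}

// Hypothetical stand-in (NOT ms_zip): buffers all input, returns nothing
// until flushed, and then hands back at most four bytes per flush call.
struct FakeZipper {
    std::string pending;

    std::string compressMore(const std::string& dataIn)
    {
        if (!dataIn.empty()) {
            pending += dataIn;
            return std::string();
        }
        const std::string head = pending.substr(0, 4);
        pending.erase(0, head.size());
        return head;
    }
};
```

With a real ms_zip object, an empty return value can also mean that an error was set (for example when the object is not in streaming mode), so it would be prudent to call isValid() after the loop rather than rely on the returned data alone.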
void compressMore | ( | const unsigned char * | dataIn, |
const unsigned long | inputLen, | ||
unsigned char * | dataOut, | ||
unsigned long * | outputLen | ||
) |
Feed more data to the compressor and retrieve the next chunk of compressed data.
You may need to call ms_zip::compressMore() a few times with more input data until the compressed stream starts. This is indicated by outputLen being set to a non-zero value.
Both the input and output buffers are bounded: inputLen and outputLen must not exceed 10 MB, and outputLen must always exceed 0 on entry. If any of these conditions fails, a corresponding error will be set.
End of input data is indicated by calling ms_zip::compressMore() with inputLen equal to 0 (dataIn can be nullptr). The compressor will flush the output and return the last compressed bytes. If there are more bytes to return than outputLen, you must call the method again with inputLen equal to 0. When there is no more data to be flushed, outputLen is set to 0.
If the compressor has more output than fits in the output buffer, ms_zip pushes it into an internal output queue. The next call to ms_zip::compressMore() returns the chunk from the head of the internal queue and may append more to the tail of the queue. The operation is not visible to the caller. However, it's best to keep the size of input smaller than the output buffer, as otherwise ms_zip may use an unexpected amount of memory for the output queue.
If this method is called in non-streaming mode, outputLen will be set to 0 and the error ms_errs::ERR_MSP_ZIP_NOTSTREAMING set.
Other possible error conditions:
dataIn | Next chunk of raw binary data to compress. The data is not assumed to be null terminated. |
inputLen | Length of data in dataIn. |
dataOut | Pointer to the memory location where compressed data should be written. The caller must ensure at least outputLen bytes have been allocated. |
outputLen | Maximum length of data to output. outputLen will be set to the actual length of dataOut on successful return. |
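Since outputLen is both an input (capacity) and an output (bytes written), it has to be reset before every call. The sketch below shows that bookkeeping for one input block followed by a flush. It is a template over any type with a matching compressMore member, and FakeRawZipper is a hypothetical stand-in, not ms_zip, that simply passes bytes through.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Compress one block and drain all output through a fixed-size buffer.
template <typename Zipper>
std::vector<unsigned char> compressAllRaw(Zipper& zip,
                                          const unsigned char* data,
                                          unsigned long len,
                                          unsigned long outCapacity)
{
    assert(outCapacity > 0);  // outputLen must exceed 0 on entry
    std::vector<unsigned char> out;
    std::vector<unsigned char> buf(outCapacity);
    unsigned long outLen = outCapacity;  // in: capacity, out: bytes written

    if (len > 0) {
        zip.compressMore(data, len, buf.data(), &outLen);
        out.insert(out.end(), buf.begin(),
                   buf.begin() + static_cast<std::ptrdiff_t>(outLen));
    }
    for (;;) {
        outLen = outCapacity;  // reset the capacity before every call
        zip.compressMore(nullptr, 0, buf.data(), &outLen);
        if (outLen == 0) {
            break;  // nothing left to flush
        }
        out.insert(out.end(), buf.begin(),
                   buf.begin() + static_cast<std::ptrdiff_t>(outLen));
    }
    return out;
}

// Hypothetical stand-in (NOT ms_zip): stores input unchanged and returns it
// on flush, never writing more than *outputLen bytes per call.
struct FakeRawZipper {
    std::vector<unsigned char> pending;

    void compressMore(const unsigned char* dataIn, const unsigned long inputLen,
                      unsigned char* dataOut, unsigned long* outputLen)
    {
        if (inputLen > 0) {
            pending.insert(pending.end(), dataIn, dataIn + inputLen);
            *outputLen = 0;
            return;
        }
        const unsigned long n =
            std::min<unsigned long>(*outputLen,
                                    static_cast<unsigned long>(pending.size()));
        std::copy(pending.begin(),
                  pending.begin() + static_cast<std::ptrdiff_t>(n), dataOut);
        pending.erase(pending.begin(),
                      pending.begin() + static_cast<std::ptrdiff_t>(n));
        *outputLen = n;
    }
};
```

With a real ms_zip object, outputLen is also set to 0 when an error occurs, so the caller would want to check isValid() after the loop.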
void copyFrom | ( | const ms_errors * | right | ) |
inherited
Use this member to make a copy of another instance.
right | is the source to initialise from |
void copyFrom | ( | const ms_zip * | right | ) |
Call this member to copy all the information from another instance.
Simply create an instance of the class using the default constructor and call this method.
right | is a pointer to another instance to copy from. |
std::string decompressMore | ( | const std::string & | dataIn | ) |
Retrieve the next chunk of decompressed data.
You may need to call ms_zip::decompressMore() a few times with more input data until the decompressed stream starts. This is indicated by returning a non-empty string.
Both the input and output strings are bounded: the input string length must not exceed 10 MB and the output string length will never exceed 10 MB.
End of input data is indicated by calling ms_zip::decompressMore() with the empty string (length 0). The decompressor will flush the output and return the last decompressed bytes. If there are more bytes to return than the internal buffer allows, you must call the method again with the empty string. When there is no more data to be flushed, the returned string will be empty (length 0).
If the decompressor has more output than fits in the output buffer, ms_zip pushes it into an internal output queue. The next call to ms_zip::decompressMore() returns the chunk from the head of the internal queue and may append more to the tail of the queue. The operation is not visible to the caller. However, it's best to keep the size of input smaller than the output buffer, as otherwise ms_zip may use an unexpected amount of memory for the output queue.
If this method is called in non-streaming mode, an empty string object will be returned and the error ms_errs::ERR_MSP_ZIP_NOTSTREAMING set.
Other possible error conditions:
dataIn | Next chunk of compressed binary data to decompress. The data need not be a null terminated string. C++ programmers should be aware that a std::string constructor needs to be passed the length parameter when creating a std::string that contains binary data. Otherwise, a 'zero' in the data will be considered to be a null terminator for a string. |
void decompressMore | ( | const unsigned char * | dataIn, |
const unsigned long | inputLen, | ||
unsigned char * | dataOut, | ||
unsigned long * | outputLen | ||
) |
Retrieve the next chunk of decompressed data.
You may need to call ms_zip::decompressMore() a few times with more input data until the decompressed stream starts. This is indicated by outputLen being set to a non-zero value.
Both the input and output buffers are bounded: inputLen and outputLen must not exceed 10 MB, and outputLen must always exceed 0 on entry. If any of these conditions fails, a corresponding error will be set.
End of input data is indicated by calling ms_zip::decompressMore() with inputLen equal to 0 (dataIn can be nullptr). The decompressor will flush the output and return the last decompressed bytes. If there are more bytes to return than outputLen, you must call the method again with inputLen equal to 0. When there is no more data to be flushed, outputLen is set to 0.
If the decompressor has more output than fits in the output buffer, ms_zip pushes it into an internal output queue. The next call to ms_zip::decompressMore() returns the chunk from the head of the internal queue and may append more to the tail of the queue. The operation is not visible to the caller. However, it's best to keep the size of input smaller than the output buffer, as otherwise ms_zip may use an unexpected amount of memory for the output queue.
If this method is called in non-streaming mode, outputLen will be set to 0 and the error ms_errs::ERR_MSP_ZIP_NOTSTREAMING set.
Other possible error conditions:
dataIn | Next chunk of compressed binary data to decompress. The data is not assumed to be a null terminated string. |
inputLen | Length of data in dataIn. |
dataOut | Pointer to the memory location where decompressed data should be written. The caller must ensure at least outputLen bytes have been allocated. |
outputLen | Maximum length of data to output. outputLen will be set to the actual length of dataOut on successful return. |
const ms_errs * getErrorHandler | ( | ) | const |
inherited
Retrieve the error object using this function to get access to all errors and error parameters.
See Error Handling.
int getLastError | ( | ) | const |
inherited
Return the error number of the last error that occurred.
All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.
See Error Handling.
std::string getLastErrorString | ( | ) | const |
inherited
Return the error description of the last error that occurred.
All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.
See Error Handling.
std::string getUnZipped | ( | ) | const |
Return the uncompressed buffer.
In buffer mode, return the uncompressed data.
In streaming compression mode, return the most recent input value to ms_zip::compressMore().
In streaming decompression mode, return the most recent value from ms_zip::decompressMore().
unsigned long getUnZipped | ( | unsigned char * | buffer, |
const unsigned long | len | ||
) | const |
Copy the uncompressed buffer into a buffer.
In buffer mode, return the uncompressed data.
In streaming compression mode, return the most recent input value to ms_zip::compressMore().
In streaming decompression mode, return the most recent value from ms_zip::decompressMore().
buffer | is the location where the uncompressed data will be copied to. The calling application should make sure that the buffer is long enough. Use getUnZippedLen() to find the length of the uncompressed data. |
len | is the length of the passed buffer. |
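Sizing the destination with getUnZippedLen() before copying can be sketched as follows. The template accepts any object that offers the two members used here. FakeUnzipped is a hypothetical stand-in, not ms_zip. The meaning of the return value of getUnZipped() is not stated above, so treating it as the number of bytes copied is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copy the uncompressed data into a vector sized from getUnZippedLen().
template <typename Zipper>
std::vector<unsigned char> copyUnzipped(const Zipper& zip)
{
    const unsigned long len = zip.getUnZippedLen();
    std::vector<unsigned char> buffer(len);
    if (len == 0) {
        return buffer;
    }
    // Assumption: the return value is the number of bytes actually copied.
    const unsigned long copied = zip.getUnZipped(buffer.data(), len);
    assert(copied <= len);
    buffer.resize(copied);
    return buffer;
}

// Hypothetical stand-in (NOT ms_zip) that holds some bytes in a string.
struct FakeUnzipped {
    std::string data;

    unsigned long getUnZippedLen() const
    {
        return static_cast<unsigned long>(data.size());
    }

    unsigned long getUnZipped(unsigned char* buffer, const unsigned long len) const
    {
        const unsigned long n =
            std::min<unsigned long>(len, static_cast<unsigned long>(data.size()));
        std::memcpy(buffer, data.data(), n);
        return n;
    }
};
```

The same pattern would apply to getZippedLen() and getZipped() for the compressed buffer.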
unsigned long getUnZippedLen | ( | ) | const |
Return the length of the uncompressed buffer.
std::string getZipped | ( | ) | const |
Return the compressed buffer.
In buffer mode, return the compressed data.
In streaming compression mode, return the most recent value from ms_zip::compressMore().
In streaming decompression mode, return the most recent value given to ms_zip::decompressMore().
unsigned long getZipped | ( | unsigned char * | buffer, |
const unsigned long | len | ||
) | const |
Copy the compressed buffer into a buffer.
In buffer mode, return the compressed data.
In streaming compression mode, return the most recent value from ms_zip::compressMore().
In streaming decompression mode, return the most recent value given to ms_zip::decompressMore().
buffer | is the location where the compressed data will be copied to. The calling application should make sure that the buffer is long enough. Use getZippedLen() to find the length of the compressed data. |
len | is the length of the passed buffer. |
unsigned long getZippedLen | ( | ) | const |
Return the length of the compressed buffer.
bool isValid | ( | ) | const |
inherited
Call this function to determine if there have been any errors.
This will return true unless there have been any fatal errors.
See Error Handling.
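The check recommended after construction (call isValid(), then read the last error) can be wrapped in a small helper. This sketch is a template over any object exposing isValid(), getLastError() and getLastErrorString() with the signatures listed for this class. FakeStatus is a hypothetical stand-in, not ms_zip, and its error number and text are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Return an empty string when no fatal error is reported, otherwise
// "<error number>: <error description>" for the most recent error.
template <typename Obj>
std::string lastErrorSummary(const Obj& obj)
{
    if (obj.isValid()) {
        return std::string();
    }
    return std::to_string(obj.getLastError()) + ": " + obj.getLastErrorString();
}

// Hypothetical stand-in (NOT ms_zip) with fixed, invented error values.
struct FakeStatus {
    bool valid;
    int code;
    std::string text;

    bool isValid() const { return valid; }
    int getLastError() const { return code; }
    std::string getLastErrorString() const { return text; }
};
```

Because isValid() only reports fatal errors, warnings accumulated in the error list would not show up in this summary.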