The Distiller project file can be accessed using the Mascot Distiller SDK. This library includes files for opening and reading Distiller project files and raw spectrographic data files.
Note – Automatic extraction of data from a Distiller project (.rov) file requires the Mascot Distiller SDK (MDRO). The SDK is a Windows-only product, is not included in the Mascot Parser Toolkit, and is not free.
All the Distiller data is stored in a project file. By default, this has a '.rov' file extension.
The project is accessed by first creating a project manager. The file is then opened to get a project interface.
    mdroapi::IMDROProjectManagerPtr iProjectManager;
    HRESULT hr = iProjectManager.CreateInstance(
        __uuidof(mdroapi::MDROProjectManager) );

    mdroapi::IMDROProjectPtr iProject;
    hr = iProjectManager->raw_Open(
        _bstr_t(rovfilePathname.c_str()),
        mdroapi::mdroProjectFlagReadOnly,
        vtMissing,
        &iProject );
The project interface can then be used to access the collection of search statuses. The collection may contain multiple search statuses. The data streams for the Mascot results and the peptide summary are accessed through a search status.
    mdroapi::IMascotSearchStatusCollectionPtr iSearchStatusCollection;
    mdroapi::IMDROProjectIOStreamPtr ISearchStatusStm;
    hr = iProject->raw_OpenStream(
        mdroapi::mdroProjectStmFlagRead,
        mdroapi::mdroProjectStmSearchStatus,
        vtMissing,
        &ISearchStatusStm );
    hr = iSearchStatusCollection.CreateInstance(
        __uuidof(mdroapi::MascotSearchStatusCollection) );
    hr = iSearchStatusCollection->LoadFromStream( ISearchStatusStm );

    long sCount = iSearchStatusCollection->Count;
    int searchIndex = 1; // select required search, usually only one
    mdroapi::IMascotSearchStatusPtr iSearchStatus =
        iSearchStatusCollection->GetItem(searchIndex);
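The example above reads Count into sCount but does not use it. A minimal sketch of a bounds check that could precede the GetItem call follows. It assumes the collection is indexed from 1, as the use of index 1 for the usual single search suggests, and the helper name isValidSearchIndex is hypothetical and not part of the SDK.

```cpp
// Hypothetical helper, not part of the Mascot Distiller SDK.
// Assumption: search statuses are indexed from 1 to count inclusive.
inline bool isValidSearchIndex(long searchIndex, long count)
{
    return searchIndex >= 1 && searchIndex <= count;
}
```

With this in place, GetItem(searchIndex) would only be called when isValidSearchIndex(searchIndex, sCount) returns true.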
The search status summary stream is used to extract the Mascot results. It should be saved to a file so that it can be loaded by ms_mascotresfile.
    mdroapi::IMDROProjectIOStreamPtr iSummaryStm;
    hr = iSearchStatus->raw_OpenSummaryStream(
        mdroapi::mdroProjectStmFlagRead,
        iProject,
        &iSummaryStm);
    bool ok = SaveIoStreamToFile(iSummaryStm, pathnameResfile);
    iSummaryStm->Close();
The cache files for the results and the peptide summary are placed in a sub-folder of the temporary file folder passed to the ms_mascotresfile constructor. The name of this sub-folder depends on the last-modified time of the results file, so a results file that is re-extracted on each run would otherwise get a new cache folder every time. To avoid an ever-growing number of cache files, set the last-write time of the results file to a fixed value. Distiller sets it to 00:00:00 1/1/2010.
    time_t time = matrix_science::ms_distiller_data::distillerResfileTimestamp;
    matrix_science::ms_fileutilities::setLastModificationTime(
        pathnameResfile.c_str(), time);
When creating the ms_mascotresfile, the cache should be enabled (to speed up access) and the timestamp within the cache file should be ignored (to avoid timezone issues). Ignoring the timestamp does not affect generation of the folder used for cache files.
    int resFileFlags =
        matrix_science::ms_mascotresfile::RESFILE_USE_CACHE |
        matrix_science::ms_mascotresfile::RESFILE_CACHE_IGNORE_DATE_CHANGE;
    matrix_science::ms_mascotresfile * resfile =
        new matrix_science::ms_mascotresfile(
            pathnameResfile.c_str(),
            0,                        // keepAliveInterval
            "<!-- %d seconds -->\n",  // keepAliveText
            resFileFlags,
            temporaryFileFolder.c_str(),
            mascotXmlSchemaFolder.c_str());
The project data should be loaded into an ms_distiller_data object. This contains the stream number offset used in the names of the quantitation results streams.
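    matrix_science::ms_distiller_data distillerData;

    // Open the rover stream (id 0) and read it with GetIoStreamString
    mdroapi::IMDROProjectIOStreamPtr iRoverStm;
    hr = iProject->raw_OpenStream(
        mdroapi::mdroProjectStmFlagRead,
        mdroapi::mdroProjectStmRover,
        _variant_t(0L),
        &iRoverStm );
    std::string roverXml = GetIoStreamString(iRoverStm);
    iRoverStm->Close();

    distillerData.loadXml(mascotXmlSchemaFolder.c_str(), roverXml);
    int quantStreamNumber = distillerData.getQuant(1).getStreamNumber();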
The peptide summary loads a cache file in order to improve access times. The name of this cache file is dependent on the parameters passed to the constructor. A function is available from matrix_science::ms_peptidesummary to allow the name of the cache file to be determined in advance so that it can be created.
    std::string pathnameCachefile =
        matrix_science::ms_peptidesummary::getCacheFilename(
            *resfile,
            distillerData,
            1); // first search in the list

    mdroapi::IMDROProjectIOStreamPtr iCacheStm;
    hr = iProject->raw_OpenStream(
        mdroapi::mdroProjectStmFlagRead,
        mdroapi::mdroProjectStmRover,
        _variant_t(0xCAC0 + iSearchStatus->StatusId),
        &iCacheStm );
    SaveIoStreamToFile(iCacheStm, pathnameCachefile);
    iCacheStm->Close();

    matrix_science::ms_peptidesummary * peptideSummary =
        new matrix_science::ms_peptidesummary(
            *resfile,
            distillerData,
            1); // first search in the list
The quantitation data is stored in two streams of the Distiller project file. These should be extracted and saved to temporary files. There is no special requirement on the naming of these files.
The stream number, within the project, is obtained from the dataset quantitation result information. There may be multiple quantitation results sets in the project, each with a different stream number. The stream number will not necessarily be the same as the index into the quantitation results sets.
    const long QUANTRESULT_OFFSET = 3000;
    const long QUANTRESULT_CACHE_OFFSET = 8000;

    int quantStreamNumber = distillerData.getQuant(1).getStreamNumber();

    mdroapi::IMDROProjectIOStreamPtr iStream;
    hr = iProject->raw_OpenStream(
        mdroapi::mdroProjectStmFlagRead,
        mdroapi::mdroProjectStmRover,
        _variant_t(QUANTRESULT_OFFSET + quantStreamNumber),
        &iStream );
    SaveIoStreamToFile(iStream, cdbFilename);
    iStream->Close();

    hr = iProject->raw_OpenStream(
        mdroapi::mdroProjectStmFlagRead,
        mdroapi::mdroProjectStmRover,
        _variant_t(QUANTRESULT_CACHE_OFFSET + quantStreamNumber),
        &iStream );
    SaveIoStreamToFile(iStream, cacheFilename);
    iStream->Close();
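The relationship between a quantitation stream number and the two stream identifiers can be sketched in isolation. The offsets 3000 and 8000 are taken from the example above; the QuantStreamIds struct and the quantStreamIds function are hypothetical names introduced only for illustration and are not part of the SDK or of Mascot Parser.

```cpp
// Offsets as used in the example above.
const long QUANTRESULT_OFFSET = 3000;
const long QUANTRESULT_CACHE_OFFSET = 8000;

// Hypothetical helper type: the two stream identifiers that belong to
// one quantitation results set.
struct QuantStreamIds
{
    long results; // stream holding the quantitation results
    long cache;   // stream holding the matching cache data
};

// Hypothetical helper: derive both identifiers from the stream number
// reported by getStreamNumber(). The stream number is not the index of
// the quantitation results set, so it must not be replaced by the index.
inline QuantStreamIds quantStreamIds(int quantStreamNumber)
{
    QuantStreamIds ids;
    ids.results = QUANTRESULT_OFFSET + quantStreamNumber;
    ids.cache = QUANTRESULT_CACHE_OFFSET + quantStreamNumber;
    return ids;
}
```

Keeping the arithmetic in one place makes it harder to pass the results-set index where the stream number is required.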
The quantitation method is available from the Mascot results.
These files can then be used to create an ms_ms1quantitation.
    matrix_science::ms_quant_method quantMethod;
    resfile->getQuantitationMethod(&quantMethod);
    matrix_science::ms_ms1quantitation quant(peptideSummary, quantMethod);
    quant.loadCdbFile(cdbFilename, cacheFilename);
Reading the data of a Distiller stream can be accomplished by repeatedly reading it in chunks into a SAFEARRAY. The data can be concatenated in memory, if it is going to be small enough, or written to a file.
    std::string GetIoStreamString(mdroapi::IMDROProjectIOStreamPtr IDistillerStm)
    {
        std::string data;

        stmobjapi::IByteStreamSignalPtr ISignal = IDistillerStm;
        ISignal->BeginStreamOp(stmobjapi::stmobjStreamOpFlagRead);

        // One-dimensional byte array used as a 1 MB read buffer
        SAFEARRAYBOUND safeArrayBounds[1] = { 1024 * 1024, 0 };
        SAFEARRAY * safeArray = SafeArrayCreate(VT_UI1, 1, safeArrayBounds);

        stmobjapi::IByteStreamIOPtr IInStream = IDistillerStm;
        while (true) {
            long readCount = IInStream->ReadBytes(&safeArray);
            if (readCount == 0) {
                break; // end of stream
            }
            const void * readBytes = safeArray->pvData;
            data.append((const char *)readBytes,
                        ((const char *)readBytes) + readCount);
        }

        ISignal->EndStreamOp(VARIANT_TRUE);
        SafeArrayDestroy(safeArray);
        return data;
    }
    // Returns true if the whole stream was written to the file
    bool SaveIoStreamToFile(mdroapi::IMDROProjectIOStreamPtr iDistillerStm,
                            const std::string & filename)
    {
        std::ofstream file;
        file.open(filename.c_str(), std::ios_base::out|std::ios_base::binary);
        if (!file.is_open()) {
            return false;
        }

        stmobjapi::IByteStreamSignalPtr iSignal = iDistillerStm;
        iSignal->BeginStreamOp(stmobjapi::stmobjStreamOpFlagRead);

        // One-dimensional byte array used as a 1 MB read buffer
        SAFEARRAYBOUND safeArrayBounds[1] = { 1024 * 1024, 0 };
        SAFEARRAY * safeArray = SafeArrayCreate(VT_UI1, 1, safeArrayBounds);

        stmobjapi::IByteStreamIOPtr iInStream = iDistillerStm;
        while (true) {
            long readCount = iInStream->ReadBytes(&safeArray);
            if (readCount == 0) {
                break; // end of stream
            }
            const char * readBytes = (const char *)safeArray->pvData;
            file.write(readBytes, readCount);
        }

        iSignal->EndStreamOp(VARIANT_TRUE);
        SafeArrayDestroy(safeArray);
        file.close();
        return !file.fail();
    }
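The chunked-read loop used by both helpers can be tried out without the SDK. The sketch below is an illustration only: it assumes a std::istream as a stand-in for the Distiller stream and a std::vector<char> in place of the SAFEARRAY buffer, and the function name readAllInChunks is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK helpers: read a stream to its end in
// fixed-size chunks and concatenate the chunks in memory.
std::string readAllInChunks(std::istream & in, std::size_t chunkSize)
{
    std::string data;
    std::vector<char> buffer(chunkSize);
    while (true) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize readCount = in.gcount();
        if (readCount == 0) {
            break; // nothing more to read
        }
        // Only the bytes actually read are appended; the last chunk is
        // usually shorter than the buffer.
        data.append(buffer.data(), static_cast<std::size_t>(readCount));
    }
    return data;
}
```

As in the SDK helpers, the loop ends when a read returns zero bytes, and a short final chunk is handled by using the reported byte count instead of the buffer size.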