If your application is singlethreaded, you can safely skip this section.
Most Parser classes and methods are not guaranteed to be reentrant or thread-safe. However, with some care, it is possible to use Parser in a multithreaded application.
The easiest and safest solution is to only use Parser from a single thread in your application. For example, if your application has four threads, only use Parser classes and methods in thread 3 and never from threads 1, 2 or 4. If you can guarantee that no thread other than thread 3 uses Parser or has access to thread 3's variables or memory space, then Parser methods will work just fine. Note that if thread 3 should transfer data to or from the other threads, the data cannot be encapsulated in Parser objects; that is, don't pass ms_protein or ms_peptide objects between threads, or any other Parser objects.
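The single-thread rule can be sketched as follows. This is a minimal illustration, not Parser code: StandInProtein is a hypothetical stand-in for a Parser object such as ms_protein, and ProteinSummary, loadSummariesOnParserThread and the sample accessions and scores are likewise invented for the example. The point is only that the dedicated thread copies what it needs into plain data before anything crosses a thread boundary.

```cpp
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Parser object such as ms_protein.
// It is NOT the real Parser API.
class StandInProtein {
public:
    StandInProtein(std::string accession, double score)
        : accession_(std::move(accession)), score_(score) {}
    std::string accession() const { return accession_; }
    double score() const { return score_; }

private:
    std::string accession_;
    double score_;
};

// Plain data with no link back to any Parser object; safe to hand over.
struct ProteinSummary {
    std::string accession;
    double score;
};

// Does all "Parser" work on one dedicated thread and returns plain copies.
std::vector<ProteinSummary> loadSummariesOnParserThread() {
    std::promise<std::vector<ProteinSummary>> promise;
    std::future<std::vector<ProteinSummary>> future = promise.get_future();

    std::thread parserThread([&promise] {
        // The stand-in objects are created, used and destroyed here only.
        std::vector<StandInProtein> proteins;
        proteins.emplace_back("P12345", 87.5);
        proteins.emplace_back("Q67890", 42.0);

        std::vector<ProteinSummary> summaries;
        for (const StandInProtein& protein : proteins) {
            summaries.push_back({protein.accession(), protein.score()});
        }
        promise.set_value(std::move(summaries));
    });

    std::vector<ProteinSummary> result = future.get();
    parserThread.join();
    return result;
}
```

The calling thread only ever sees ProteinSummary values, so it never touches an object owned by the dedicated thread.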
A small subset of methods and classes is thread-safe. These are used by Mascot Distiller, so they are tested extensively with each release of Parser. If your application uses a single thread for most Parser classes and restricts multithreading to the classes and methods below, the application should be safe.
If you need to use arbitrary Parser classes from multiple threads, you have two choices: either create a synchronised data abstraction class (shared global data), or use thread-local storage for completely independent instances of Parser objects in each thread.
In the synchronised class scenario, you need to write a data abstraction class that uses Parser internally (storing copies of the ms_mascotresfile_msr etc. objects), whose methods are guarded by a single mutual exclusion lock (mutex). The mutex must guard access to the entire data abstraction object, and each method must start by acquiring the mutex. If thread 3, for example, makes a method call and acquires the mutex, then any other thread making the same or different method call on the object will block until thread 3's method call has finished and released the mutex. This strict mutual exclusion prevents race conditions and ensures the Parser data shared between threads is only ever accessed by one thread at a time.
However, there are two caveats: 1) the data abstraction class must not return Parser objects or otherwise expose them, as this would nullify the protection of the mutex; and 2) the data abstraction class must be instantiated in a global memory space shared between all threads, so that it is not specific to a thread. The data abstraction class need not implement the Singleton pattern; you could have multiple instances of the class, each with its own internal, independent copies of Parser classes. If you do make it a Singleton, ensure you follow a thread-safe Singleton pattern.
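A synchronised data abstraction class along these lines might look like the sketch below. It assumes a hypothetical StandInResultsFile in place of real Parser objects such as ms_mascotresfile_msr; the names SynchronisedResults, queryTitle, lookups and hammer are also invented for illustration. One mutex guards the whole object, every public method acquires it first, and only plain copies (std::string, std::size_t) are returned.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-thread-safe Parser object such as
// ms_mascotresfile_msr. It is NOT the real Parser API.
class StandInResultsFile {
public:
    explicit StandInResultsFile(std::vector<std::string> titles)
        : titles_(std::move(titles)) {}

    // Mutates internal state without locking, so concurrent calls would race.
    std::string queryTitle(std::size_t index) {
        ++lookups_;
        return titles_.at(index);
    }
    std::size_t lookups() const { return lookups_; }

private:
    std::vector<std::string> titles_;
    std::size_t lookups_ = 0;
};

// Data abstraction class: a single mutex guards the entire object.
class SynchronisedResults {
public:
    explicit SynchronisedResults(std::vector<std::string> titles)
        : file_(std::move(titles)) {}

    // Returns a copy of the title, never a reference into the wrapped object.
    std::string queryTitle(std::size_t index) {
        std::lock_guard<std::mutex> lock(mutex_);
        return file_.queryTitle(index);
    }

    std::size_t lookups() {
        std::lock_guard<std::mutex> lock(mutex_);
        return file_.lookups();
    }

private:
    std::mutex mutex_;
    StandInResultsFile file_;  // never exposed to callers
};

// Calls the shared object from several threads at once.
void hammer(SynchronisedResults& shared, int threadCount, int callsPerThread) {
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&shared, callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i) {
                shared.queryTitle(0);
            }
        });
    }
    for (std::thread& thread : threads) {
        thread.join();
    }
}
```

Because the wrapper holds a std::mutex it cannot be copied, which helps keep a single shared instance; the instance itself must live somewhere all threads can reach, for example at namespace scope or in an object created before the threads start.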
In the other scenario, each thread could have its own copies of Parser classes that are independent of each other. For example, if thread 1 creates an ms_mascotresfile_msr object resfile1, and thread 2 creates resfile2, then thread 1 can call methods of resfile1 concurrently with thread 2 calling methods of resfile2. This is in effect the same situation as the singlethreaded mode, with the same restrictions: thread 2 must not call methods of resfile1, or vice versa, and the threads must not pass Parser objects between themselves. If you do, the application is almost certain to crash sooner or later.
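The independent-instances scenario can be sketched like this. StandInResultsFile is again a hypothetical stand-in for ms_mascotresfile_msr, and describeAll and the file names used to call it are assumptions made for the example. Each thread constructs its own object, uses it, and lets it go out of scope on that same thread; only a plain std::string leaves the thread.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for ms_mascotresfile_msr; NOT the real Parser API.
class StandInResultsFile {
public:
    explicit StandInResultsFile(std::string path) : path_(std::move(path)) {}

    // Touches unsynchronised internal state, as a real object might.
    std::string describe() {
        ++calls_;
        return "results from " + path_;
    }

private:
    std::string path_;
    int calls_ = 0;
};

// Starts one thread per path; each thread owns its own stand-in object.
std::vector<std::string> describeAll(const std::vector<std::string>& paths) {
    std::vector<std::string> summaries(paths.size());
    std::vector<std::thread> threads;

    for (std::size_t i = 0; i < paths.size(); ++i) {
        threads.emplace_back([&paths, &summaries, i] {
            // Created, used and destroyed on this thread only.
            StandInResultsFile resfile(paths[i]);
            // Each thread writes to its own slot, so no lock is needed here.
            summaries[i] = resfile.describe();
        });
    }
    for (std::thread& thread : threads) {
        thread.join();
    }
    return summaries;
}
```

No thread ever sees another thread's resfile object, which is the restriction that makes this arrangement equivalent to several singlethreaded programs running side by side.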
The above design restrictions apply to all programming languages, including C++. However, the C++ version of Parser has an additional restriction related to Apache Xerces; please refer to section Using Apache Xerces and Parser in the same application.
Multithreading does not change these issues in the dynamically linked case on either platform, but you must take extra care if you are linking statically against Parser and Xerces. Section Xerces and statically linked Parser shows an example sequence of Xerces and Parser calls that is safe in a singlethreaded application. The same sequence will only be safe in a multithreaded application if you can guarantee that no two threads call Parser and Xerces functions concurrently. Otherwise you are almost certain to corrupt the internal Xerces state.
The only easy fix is to do all Parser and Xerces processing in a single thread, with each step strictly following the previous one (not interleaved in any way). It may be best to use a different XML library in statically linked multithreaded applications to avoid the problem entirely.
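One way to funnel all such processing through a single thread is a small job queue, sketched below under stated assumptions. SerialWorker and runSerialised are hypothetical names, and the jobs only append to a log; in a real application each job would contain the Parser or Xerces calls. Other threads post jobs, but every job runs on the one worker thread, strictly one after another.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Runs every posted job on one dedicated thread, one job at a time.
class SerialWorker {
public:
    SerialWorker() : worker_([this] { run(); }) {}

    // Finishes any queued jobs, then stops the worker thread.
    ~SerialWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) {
                    return;  // asked to stop and nothing left to run
                }
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // runs outside the lock, but only ever on this thread
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Posts stand-in "parser" and "xerces" jobs from several threads and
// returns the order in which the worker actually ran them.
std::vector<std::string> runSerialised(int producers) {
    std::vector<std::string> log;  // written only by the worker thread
    {
        SerialWorker worker;
        std::vector<std::thread> threads;
        for (int i = 0; i < producers; ++i) {
            threads.emplace_back([&worker, &log, i] {
                const std::string name =
                    (i % 2 == 0 ? "parser-" : "xerces-") + std::to_string(i);
                worker.post([&log, name] {
                    log.push_back("begin " + name);
                    log.push_back("end " + name);
                });
            });
        }
        for (std::thread& thread : threads) {
            thread.join();
        }
    }  // the worker's destructor drains the queue before returning
    return log;
}
```

The order in which jobs arrive is not fixed, but each "begin" entry is always followed directly by its own "end" entry, which is the non-interleaving property the text asks for.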