Introduction to Mascot Daemon
Mascot Daemon is a client to Mascot Server that can automate the processing of raw data to peak lists and submit multiple searches as a batch. It is included with the Mascot Server licence and can be installed on as many computers in the lab as you like. If you have an in-house Mascot Server licence, go to your local Mascot home page and scroll down to the Daemon installation instructions.
This page introduces various concepts. Mascot Daemon ships with tutorials and full reference documentation; start Daemon and press F1, or open the menu About, Mascot Daemon Help.
User interface and the engine
The Mascot Daemon application is made up of two parts: the user interface, through which you set up tasks, and the daemon engine, which runs in the background and actually carries out the work. You can see the background task’s cogwheel icon in the notification area of the taskbar. The two halves communicate through the task database (TaskDB), which stores all the information about the tasks, including the file names and paths.
Different ways of processing data
The simplest way to use Mascot Server is to search peak lists generated by the instrument software. The most common format is MGF (Mascot Generic Format), although other formats are also accepted. In Daemon, peak lists can be combined into a single search by checking the “Merge MS/MS files into a single search” box.
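To make the peak list format concrete, here is a minimal Python sketch that counts the MS/MS spectra in MGF text and concatenates several peak lists into one. It assumes plain-text MGF in which each spectrum sits between a BEGIN IONS line and an END IONS line. The function names are illustrative only, and this is not how Daemon itself implements the merge option.

```python
def count_spectra(mgf_text: str) -> int:
    """Count spectra by counting the BEGIN IONS lines."""
    return sum(
        1
        for line in mgf_text.splitlines()
        if line.strip().upper() == "BEGIN IONS"
    )


def merge_mgf(texts) -> str:
    """Concatenate MGF contents, separating the files with a blank line."""
    return "\n\n".join(t.strip() for t in texts) + "\n"


# Two tiny, made-up peak lists used purely for illustration.
first = "BEGIN IONS\nTITLE=scan 1\nPEPMASS=500.25\nCHARGE=2+\n100.1 10\nEND IONS\n"
second = "BEGIN IONS\nTITLE=scan 2\nPEPMASS=600.30\n200.2 20\nEND IONS\n"

merged = merge_mgf([first, second])
print(count_spectra(merged), "spectra in the merged peak list")
```

Counting spectra before and after a merge is a cheap sanity check that no file was dropped or truncated.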
Most users quickly graduate to using raw MS data and a data import filter to convert it to a peak list prior to searching. Mascot Distiller can be used as a data import filter and can handle data from any of the major instrument vendors. Using a Distiller-compatible quantitation method allows Daemon to automatically trigger the quantitation once the search is complete. Additionally, a number of third-party peak picking applications are supported, with ProteoWizard msConvert being the most popular of these.
The HUPO PSI (Proteomics Standards Initiative) data format mzML is also supported. The mzML format can be used for both raw data and peak lists, so check that the file contains peak lists before submitting it to Mascot. Some post-search data analysis tools like the Trans-Proteomic Pipeline require that you search an mzML file, as other parts of the pipeline need to use the same peak lists for their analysis.
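One way to check what an mzML file contains is to look for the PSI-MS controlled vocabulary terms that label spectra as centroided or profile. The Python sketch below scans the file line by line for those two accessions. It is a rough heuristic under the assumption that the file is annotated correctly; the function name and return strings are illustrative, and a file that lacks both terms is simply reported as unknown.

```python
CENTROID = 'accession="MS:1000127"'  # PSI-MS term "centroid spectrum"
PROFILE = 'accession="MS:1000128"'   # PSI-MS term "profile spectrum"


def classify_mzml(lines) -> str:
    """Classify an mzML file from an iterable of its text lines."""
    has_centroid = False
    has_profile = False
    for line in lines:
        if CENTROID in line:
            has_centroid = True
        if PROFILE in line:
            has_profile = True
    if has_centroid and has_profile:
        return "mixed"
    if has_centroid:
        return "peak list"
    if has_profile:
        return "raw (profile)"
    return "unknown"
```

An open file handle can be passed directly, for example `classify_mzml(open(path, encoding="utf-8"))`, so that large files are not read into memory in one go.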
Search strategies
A Mascot Daemon task is a Mascot search with a single set of search parameters. It is possible to chain multiple tasks together with follow-up tasks. A follow-up task takes the data used in one task and re-searches it with a different set of conditions. It can search all the queries in a peak list or just the ones that did not obtain a sufficiently good match. Multiple follow-up tasks can be chained together, creating a sieve approach to the analysis. An example of this technique is analysing a histone dataset.
Other uses for follow-up tasks include filtering, where matches from a search against a contaminants database are identified and removed prior to searching the main database, and searching both spectral libraries and protein and/or DNA sequence databases.
Figure 1: Histone analysis: Starting from task 28, the data file progresses upwards through the chain of follow-up tasks.
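The sieve idea behind chained follow-up tasks can be sketched in a few lines of Python. This is a conceptual illustration only: each "search" is a stand-in function that maps query identifiers to scores, and the single score threshold and all names are hypothetical. It does not reproduce how Daemon decides which queries are passed on.

```python
from typing import Callable, Dict, List, Tuple

# A stand-in for one search: takes query ids, returns a score per query.
Search = Callable[[List[str]], Dict[str, float]]


def run_sieve(
    queries: List[str],
    stages: List[Tuple[str, Search]],
    threshold: float,
) -> Tuple[Dict[str, str], List[str]]:
    """Run the stages in order; each stage sees only unmatched queries.

    Returns a mapping of query id to the stage that matched it,
    plus the list of queries that no stage matched.
    """
    accepted: Dict[str, str] = {}
    remaining = list(queries)
    for name, search in stages:
        scores = search(remaining)
        for query in remaining:
            if scores.get(query, 0.0) >= threshold:
                accepted[query] = name
        # Only queries without a good match go on to the next stage.
        remaining = [q for q in remaining if q not in accepted]
    return accepted, remaining
```

Each stage only ever receives the queries that earlier stages failed to match, which is what lets later stages use broader and more expensive search conditions on a smaller set of spectra.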
The key to automation
Mascot Daemon allows you to easily automate a batch of searches, but you can go beyond just automating the peak picking and searching. The auto-export button takes you to the configuration options for exporting results in any of the supported file formats. With quantitation data, if you are using Mascot Distiller for the peak picking, you can configure Daemon to automatically calculate and export the quantitation results as XML files. Turn this feature on in the Daemon preferences, General tab. Daemon saves all of the exported files along with any peak list file generated by the data import filter into the “MGF directory”. The default location can be edited in the Daemon preferences, Data import filters dialog box.
The key to more advanced automation is the External Processes Dialog. From here you can call programs or scripts before or after a task or search. Daemon uses tokens to pass values such as file names and paths to the external application. This way you can trigger a program that performs a task after every search is complete, for example sending an email or passing the results to another program for further analysis. We have helped customers build scripts for preprocessing peak lists prior to searching or copying and renaming results files post search ready for importing into a lab database.
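As an illustration of the kind of script an external process might call, the Python sketch below copies a results file into another directory under a new name. The argument order, the naming scheme and the script name archive_result.py are all assumptions made for this example. The tokens Daemon substitutes on the command line are listed in the Daemon help and are not reproduced here.

```python
import shutil
import sys
from pathlib import Path


def archive_result(result_file: str, dest_dir: str, label: str) -> Path:
    """Copy result_file into dest_dir as '<label>_<original name>'."""
    src = Path(result_file)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    target = dest / f"{label}_{src.name}"
    shutil.copy2(src, target)  # copy2 also preserves file timestamps
    return target


# Hypothetical usage: archive_result.py RESULT_FILE DEST_DIR LABEL
if __name__ == "__main__" and len(sys.argv) == 4:
    print(archive_result(sys.argv[1], sys.argv[2], sys.argv[3]))
```

In the External Processes dialog, the three arguments would be supplied by whichever tokens carry the results file path and task details, so the same script serves every search in the batch.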
Figure 2: An external task has been set up to run before each search that calls a script to clean up the header information on a peak list file. The Auto-export feature has also been configured so the button title is in bold too.
Tips for core labs and other multi-user environments
Daemon has a number of features that make it easier to use within a core laboratory. For example, it can be configured to allow running searches in the name of another user. To set this up, enable Mascot Server security and make the Daemon user a member of the PowerUsers group or a group with the security task “CLIENT: For Mascot Daemon, allow spoofing of another user”. The “Owner” field in the task editor tab then becomes a drop-down field. This allows a core lab member to run searches on behalf of customers or collaborators, who can then view the results but not rerun the search. Spoofing users and sharing results with collaborators is described in Mascot Security: search results.
Figure 3: List of usernames that can be selected from to “own” the search.
Within the core lab, there may be multiple computers running Mascot Daemon clients. These can be configured to share a common TaskDB, which makes it easy to track which tasks are running and sending searches to Mascot Server, and centralizes the links to results. The Mascot Server search parameters used with a Daemon task are normally saved as a text file. You could save these files to a network share so that they are accessible from all the Daemon computers. Alternatively, Daemon also allows you to save the search parameters directly in the TaskDB. Activate this feature in Daemon preferences, General tab.
Although Daemon was not originally designed to be used as the main interface to the search results, it is often used as such. If you are using a shared task database or storing the search parameters in it, it makes sense to back up the database regularly. You can do this through the ODBC connection dialog in Daemon preferences.
Anything else I should know?
If the Mascot Server is running on Linux, there is no restriction on the size of the peak list. On Windows, the IIS web server restricts uploaded files to 2 GB. When both Daemon and Server are installed on the same Windows computer, Daemon can submit searches on the command line, bypassing the file size limit. This feature is activated by default and can be changed in the Daemon preferences, General tab.
One final feature that can save you a bit of time is the “Clone” button. Cloning a task is a quick way to set up a new task that has the same or very similar settings to an existing task. It is particularly useful if you frequently use a combination of Data import filters, auto exports and external processes.