This is the root element for the Proteomics Standards Initiative (PSI) mzML schema, which is intended to capture the use of a mass spectrometer, the data generated, and the initial processing of that data (to the level of the peak list).
An optional accession number for the mzML document used for storage, e.g. in PRIDE.
The version of this mzML document.
An optional id for the mzML document used for referencing from external files. It is recommended to use LSIDs when possible.
Information pertaining to the entire mzML file (i.e. not specific to any part of the data set) is stored here.
This summarizes the different types of spectra that can be expected in the file. This is expected to aid processing software in skipping files that do not contain appropriate spectrum types for it. It should also describe the nativeID format used in the file by referring to an appropriate CV term.
List and descriptions of the source files this mzML document was generated or derived from.
Number of source files used in generating the instance document.
Description of the source file, including location and type.
An identifier for this file.
Name of the source file, without reference to location (either URI or local path).
URI-formatted location where the file was retrieved.
Information about an ontology or CV source and a short 'lookup' tag to refer to.
The short label to be used as a reference tag with which to refer to this particular Controlled Vocabulary source description (e.g., from the cvLabel attribute, in CVParamType elements).
The usual name for the resource (e.g. The PSI-MS Controlled Vocabulary).
The version of the CV from which the referred-to terms are drawn.
The URI for the resource.
Container for one or more controlled vocabulary definitions.
The number of CV definitions in this mzML file.
Container for a list of referenceableParamGroups.
The number of ParamGroups defined in this mzML file.
Structure allowing the use of a controlled (cvParam) or uncontrolled vocabulary (userParam), or a reference to a predefined set of these in this mzML file (paramGroupRef).
A collection of CVParam and UserParam elements that can be referenced from elsewhere in this mzML document by using the 'paramGroupRef' element in that location to reference the 'id' attribute value of this element.
The identifier with which to reference this ReferenceableParamGroup.
This element holds additional data or annotation. Only controlled values are allowed here.
A reference to the CV 'id' attribute as defined in the cvList in this mzML file.
The accession number of the referred-to term in the named resource (e.g.: MS:000012).
The value for the parameter; may be absent if not appropriate, or a numeric or symbolic value, or may itself be CV (legal values for a parameter should be enumerated and defined in the ontology).
The actual name for the parameter, from the referred-to controlled vocabulary. This should be the preferred name associated with the specified accession number.
An optional CV accession number for the unit term associated with the value, if any (e.g., 'UO:0000266' for 'electron volt').
An optional CV name for the unit accession number, if any (e.g., 'electron volt' for 'UO:0000266').
If a unit term is referenced, this attribute must refer to the CV 'id' attribute defined in the cvList in this mzML file.
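As a minimal sketch of how the cvParam attributes described above fit together, the following reads one element and checks that both the term and its unit point at a declared CV. The attribute names (cvRef, accession, name, value, unitCvRef, unitAccession, unitName) are assumed from the mzML 1.1 schema, and the fragment and the helper `read_cv_param` are illustrative, not part of the schema.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; attribute names are assumed from the mzML 1.1 schema.
SNIPPET = (
    '<cvParam cvRef="MS" accession="MS:1000045" name="collision energy" '
    'value="35" unitCvRef="UO" unitAccession="UO:0000266" '
    'unitName="electron volt"/>'
)


def read_cv_param(elem, known_cv_ids):
    """Return the cvParam attributes as a dict, checking the CV references."""
    attrs = elem.attrib
    if attrs["cvRef"] not in known_cv_ids:
        raise ValueError(f"unknown cvRef: {attrs['cvRef']}")
    # A unit, when given, must also point at a CV declared in cvList.
    if "unitAccession" in attrs and attrs.get("unitCvRef") not in known_cv_ids:
        raise ValueError("unit term given without a valid unitCvRef")
    return {
        "accession": attrs["accession"],
        "name": attrs["name"],
        "value": attrs.get("value"),  # may be absent
        "unit_accession": attrs.get("unitAccession"),
        "unit_name": attrs.get("unitName"),
    }


param = read_cv_param(ET.fromstring(SNIPPET), known_cv_ids={"MS", "UO"})
print(param["name"], param["value"], param["unit_name"])
```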
Uncontrolled user parameters (essentially allowing free text). Before using these, one should verify whether there is an appropriate CV term available, and if so, use the CV term instead.
The name for the parameter.
The datatype of the parameter, where appropriate (e.g.: xsd:float).
The value for the parameter, where appropriate.
An optional CV accession number for the unit term associated with the value, if any (e.g., 'UO:0000266' for 'electron volt').
An optional CV name for the unit accession number, if any (e.g., 'electron volt' for 'UO:0000266').
If a unit term is referenced, this attribute must refer to the CV 'id' attribute defined in the cvList in this mzML file.
A reference to a previously defined ParamGroup, which is a reusable container of one or more cvParams.
Reference to the id attribute in a referenceableParamGroup.
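The referencing mechanism can be sketched as follows: a reader collects each referenceable group by its 'id' and substitutes the group's parameters wherever a reference appears. This is only an illustration under stated assumptions: the element name 'paramGroupRef' follows the wording above, the attribute name 'ref' and the helper `expand_params` are hypothetical, and XML namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment for illustration only.
DOC = """
<mzML>
  <referenceableParamGroup id="commonParams">
    <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
    <userParam name="operator note" value="example"/>
  </referenceableParamGroup>
  <spectrum id="s1">
    <paramGroupRef ref="commonParams"/>
    <userParam name="local note" value="only here"/>
  </spectrum>
</mzML>
"""


def expand_params(parent, groups):
    """Return the parameters of `parent`, with group references expanded."""
    params = []
    for child in parent:
        if child.tag == "paramGroupRef":
            # Substitute the referenced group's parameters in place.
            params.extend(groups[child.get("ref")])
        elif child.tag in ("cvParam", "userParam"):
            params.append(child)
    return params


root = ET.fromstring(DOC)
groups = {g.get("id"): list(g) for g in root.iter("referenceableParamGroup")}
names = [p.get("name") for p in expand_params(root.find("spectrum"), groups)]
print(names)
```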
List and descriptions of samples.
The number of Samples defined in this mzML file.
Expansible description of the sample used to generate the dataset, named in sampleName.
A unique identifier across the samples with which to reference this sample description.
An optional name for the sample description, mostly intended as a quick mnemonic.
List and descriptions of instrument configurations. At least one instrument configuration must be specified, even if it is only to specify that the instrument is unknown. In that case, the "instrument model" term is used to indicate the unknown instrument in the instrumentConfiguration.
The number of instrument configurations present in this list.
This attribute must be used to indicate the order in which the components are encountered from source to detector (e.g., in a Q-TOF, the quadrupole would have the lower order number, and the TOF the higher number of the two).
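A reader might use the order attribute to reconstruct the ion path. The sketch below assumes components have already been read into (kind, order) pairs; that representation and the helper name `order_components` are hypothetical.

```python
def order_components(components):
    """Sort (kind, order) pairs from source to detector by their order value."""
    return sorted(components, key=lambda component: component[1])


# Example for a Q-TOF: the quadrupole carries the lower order number.
qtof = [("detector", 4), ("TOF analyzer", 3), ("source", 1), ("quadrupole analyzer", 2)]
path = [kind for kind, _ in order_components(qtof)]
print(path)
```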
This element must be used to describe a Source Component Type. This is a PRIDE3-specific modification of the core mzML schema that does not have any impact on the base schema validation.
This element must be used to describe a Detector Component Type. This is a PRIDE3-specific modification of the core mzML schema that does not have any impact on the base schema validation.
This element must be used to describe an Analyzer Component Type. This is a PRIDE3-specific modification of the core mzML schema that does not have any impact on the base schema validation.
List with the different components used in the mass spectrometer. At least one source, one mass analyzer and one detector need to be specified.
A source component.
A mass analyzer (or mass filter) component.
A detector component.
The number of components in this list.
Description of a particular hardware configuration of a mass spectrometer. Each configuration must have one (and only one) of the three different components used for an analysis. For hybrid instruments, such as an LTQ-FT, there must be one configuration for each permutation of the components that is used in the document. For software configuration, use a ReferenceableParamGroup element.
An identifier for this instrument configuration.
Reference to a previously defined software element.
This attribute must be used to reference the 'id' attribute of a software element.
List and descriptions of software used to acquire and/or process the data in this mzML file.
A piece of software.
The number of software elements defined in this mzML file.
Software information.
An identifier for this software that is unique across all SoftwareTypes.
The software version.
List and descriptions of data processing applied to this data.
The number of DataProcessingTypes in this mzML file.
Description of the way in which a particular software was used.
Description of the default peak processing method. This element describes the base method used in the generation of a particular mzML file. Variable methods should be described in the appropriate acquisition section - if no acquisition-specific details are found, then this information serves as the default.
An identifier for this data processing that is unique across all DataProcessingTypes.
This attribute allows a series of consecutive steps to be placed in the correct order.
This attribute must reference the 'id' of the appropriate SoftwareType.
List with the descriptions of the acquisition settings applied prior to the start of data acquisition.
The number of AcquisitionType elements in this list.
Description of the acquisition settings of the instrument prior to the start of the run.
List with the source files containing the acquisition settings.
Target list (or 'inclusion list') configured prior to the run.
A unique identifier for this acquisition setting.
Target list (or 'inclusion list') configured prior to the run.
The number of TargetType elements in this list.
A run in mzML should correspond to a single, consecutive and coherent set of scans on an instrument.
All mass spectra and the acquisitions underlying them are described and attached here. Subsidiary data arrays are also both described and attached here.
All chromatograms for this run.
A unique identifier for this run.
This attribute must reference the 'id' of the default instrument configuration. If a scan does not reference an instrument configuration, it implicitly refers to this configuration.
This attribute can optionally reference the 'id' of the default source file. If a spectrum or scan does not reference a source file and this attribute is set, then it implicitly refers to this source file.
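The fallback rule for these run-level defaults can be sketched as a simple lookup: an element's own reference wins, and the run default applies only when the element gives none. The function name `resolve_ref` and the example identifiers are hypothetical.

```python
from typing import Optional


def resolve_ref(own_ref: Optional[str], run_default: Optional[str]) -> Optional[str]:
    """Return the element's own reference, or the run-level default if it has none."""
    return own_ref if own_ref is not None else run_default


# A scan without its own instrument configuration reference uses the run default.
print(resolve_ref(None, "IC1"))
# A scan with an explicit reference keeps it.
print(resolve_ref("IC2", "IC1"))
```

For the source file default, which is optional at the run level, the result may be `None` when neither the element nor the run supplies a reference.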
This attribute must reference the 'id' of the appropriate sample.
The optional start timestamp of the run, in UT.
This attribute must reference the 'id' of the appropriate sourceFile.
Reference to a previously defined sourceFile.
The number of source files referenced in this list.
List and descriptions of spectra.
The number of spectra defined in this mzML file.
This attribute MUST reference the 'id' of the default data processing for the spectrum list. If an acquisition does not reference any data processing, it implicitly refers to this data processing. This attribute is required because the minimum amount of data processing that any format will undergo is "conversion to mzML".
A range of m/z values over which the instrument scans and acquires a spectrum.
The number of scan windows defined in this list.
List and descriptions of scans.
The number of scans defined in this list.
Scan or acquisition from original raw file used to create this peak list, as specified in sourceFile.
Container for a list of scan windows.
For scans that are local to this document, this attribute can be used to reference the 'id' attribute of the spectrum corresponding to the scan.
If this attribute is set, it must reference the 'id' attribute of a sourceFile representing the external document containing the spectrum referred to by 'externalSpectrumID'.
For scans that are external to this document, this string must correspond to the 'id' attribute of a spectrum in the external document indicated by 'sourceFileRef'.
This attribute can optionally reference the 'id' attribute of the appropriate instrument configuration.
List and descriptions of precursor isolations to the spectrum currently being described, ordered.
The number of precursor isolations in this list.
The method of precursor ion selection and activation.
This element captures the isolation (or 'selection') window configured to isolate one or more ions.
A list of ions that were selected.
The type and energy level used for activation.
For precursor spectra that are local to this document, this attribute must be used to reference the 'id' attribute of the spectrum corresponding to the precursor spectrum.
For precursor spectra that are external to this document, this attribute must reference the 'id' attribute of a sourceFile representing that external document.
For precursor spectra that are external to this document, this string must correspond to the 'id' attribute of a spectrum in the external document indicated by 'sourceFileRef'.
The list of selected precursor ions.
The number of selected precursor ions defined in this list.
List and descriptions of product isolations to the spectrum currently being described, ordered.
The number of product isolations in this list.
The method of product ion selection and activation in a precursor ion scan.
This element captures the isolation (or 'selection') window configured to isolate one or more ions.
List of binary data arrays.
Data point arrays for default data arrays (m/z, intensity, time) and meta data arrays. Default data arrays must not have the attributes 'arrayLength' and 'dataProcessingRef'.
The number of binary data arrays defined in this list.
The structure into which encoded binary data goes. Byte ordering is always little endian (Intel style). Computers using a different endian style must convert to/from little endian when writing/reading mzML.
The actual base64 encoded binary data. The byte order is always 'little endian'.
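A minimal sketch of reading and writing such a payload with the Python standard library is shown below. It assumes uncompressed data and that the caller already knows the precision (32-bit or 64-bit floats), which in practice would be declared alongside the array; the function names are hypothetical.

```python
import base64
import struct

_FORMAT = {32: "f", 64: "d"}  # struct codes for 32-bit and 64-bit floats


def decode_binary_array(text, bits=64):
    """Decode base64 text into a list of floats, reading little-endian values."""
    raw = base64.b64decode(text)
    width = bits // 8
    if len(raw) % width:
        raise ValueError("payload length is not a multiple of the value width")
    count = len(raw) // width
    # "<" forces little-endian regardless of the host byte order.
    return list(struct.unpack(f"<{count}{_FORMAT[bits]}", raw))


def encode_binary_array(values, bits=64):
    """Encode floats as little-endian bytes, then as base64 text."""
    raw = struct.pack(f"<{len(values)}{_FORMAT[bits]}", *values)
    return base64.b64encode(raw).decode("ascii")


mz = [100.0, 200.5, 300.25]
round_trip = decode_binary_array(encode_binary_array(mz))
print(round_trip)
```

Using the explicit `<` prefix means the same code works unchanged on big-endian hosts, which is the conversion the text asks for.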
This optional attribute may override the 'defaultArrayLength' defined in SpectrumType. The two default arrays (m/z and intensity) should NEVER use this override option, and should therefore adhere to the 'defaultArrayLength' defined in SpectrumType. Parsing software can thus safely choose to ignore arrays of lengths different from the one defined in the 'defaultArrayLength' SpectrumType element.
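The override rule can be sketched as follows, assuming the lengths have already been read as integers; the function name `effective_array_length` is hypothetical.

```python
def effective_array_length(default_array_length, array_length=None):
    """Return the array's own length if it overrides the default, else the default."""
    return default_array_length if array_length is None else array_length


# A default array (m/z or intensity) carries no override and uses the default.
print(effective_array_length(1500))
# A metadata array may declare its own length.
print(effective_array_length(1500, array_length=12))
```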
This optional attribute may reference the 'id' attribute of the appropriate dataProcessing.
The encoded length of the binary data array.
The structure that captures the generation of a peak list (including the underlying acquisitions). Also describes some of the parameters for the mass spectrometer for a given acquisition (or list of acquisitions).
The native identifier for a spectrum. For unmerged native spectra or spectra from older open file formats, the format of the identifier is defined in the PSI-MS CV and referred to in the mzML header. External documents may use this identifier together with the mzML filename or accession to reference a particular spectrum.
The identifier for the spot from which this spectrum was derived, if a MALDI or similar run.
The zero-based, consecutive index of the spectrum in the SpectrumList.
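A consumer could verify the zero-based, consecutive property with a check like the one below; the helper name is hypothetical and the indices are assumed to have been read in document order.

```python
def indices_are_consecutive(indices):
    """True if the indices read in document order are exactly 0, 1, 2, ..."""
    indices = list(indices)
    return indices == list(range(len(indices)))


print(indices_are_consecutive([0, 1, 2, 3]))
print(indices_are_consecutive([1, 2, 3]))
```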
Default length of binary data arrays contained in this element.
This attribute can optionally reference the 'id' of the appropriate dataProcessing.
This attribute can optionally reference the 'id' of the appropriate sourceFile.
List of chromatograms.
The number of chromatograms defined in this mzML file.
This attribute MUST reference the 'id' of the default data processing for the chromatogram list. If an acquisition does not reference any data processing, it implicitly refers to this data processing. This attribute is required because the minimum amount of data processing that any format will undergo is "conversion to mzML".
A single chromatogram.
A unique identifier for this chromatogram.
The zero-based index for this chromatogram in the chromatogram list.
Default length of binary data arrays contained in this element.
This attribute can optionally reference the 'id' of the appropriate dataProcessing.
This is the root element for the Proteomics Standards Initiative (PSI) mzML schema, which is intended to capture the use of a mass spectrometer, the data generated, and the initial processing of that data (to the level of the peak list).
This is a unique key constraint on spectrum identifiers stored in the id attribute. It ensures that an id is present and unique among all spectra. Note that this constrains schema validation only (full semantic validation restricts spectrum IDs to a specified nativeID format).
This is a unique key constraint on chromatogram identifiers stored in the id attribute. It ensures that an id is present and unique among all chromatograms.
This is a unique key constraint on source file identifiers stored in the id attribute. It ensures that an id is present and unique among all source files.
This is a unique key constraint on CV identifiers stored in the id attribute. It ensures that an id is present and unique among all CVs.
This is a unique key constraint on referenceable param group identifiers stored in the id attribute. It ensures that an id is present and unique among all referenceable param groups.
This is a unique key constraint on sample identifiers stored in the id attribute. It ensures that an id is present and unique among all samples.
This is a unique key constraint on instrument configuration identifiers stored in the id attribute. It ensures that an id is present and unique among all instrument configurations.
This is a unique key constraint on data processing identifiers stored in the id attribute. It ensures that an id is present and unique among all data processing elements.
This is a unique key constraint on software identifiers stored in the id attribute. It ensures that an id is present and unique among all software elements.
This is a unique key constraint on scan settings identifiers stored in the id attribute. It ensures that an id is present and unique among all scan settings elements.
This is a reference to a spectrum contained within this file. It ensures that an id is present in the file and is one of the values defined in KEY_SPECTRUM_ID.
This is a reference to a spectrum contained within this file. It ensures that an id is present in the file and is one of the values defined in KEY_SPECTRUM_ID.
This is a reference to a source file in sourceFileList. It ensures that an id is present in the file and is one of the values defined in KEY_SOURCEFILE_ID.
This is a reference to a sample in sampleList. It ensures that an id is present in the file and is one of the values defined in KEY_SAMPLE_ID.
This is a reference to a software element in softwareList. It ensures that an id is present in the file and is one of the values defined in KEY_SOFTWARE_ID.
This is a reference to a data processing element in dataProcessingList. It ensures that an id is present in the file and is one of the values defined in KEY_DP_ID.
This is a reference to a data processing element in dataProcessingList. It ensures that an id is present in the file and is one of the values defined in KEY_DP_ID.
This is a reference to a CV in cvList. It ensures that an id is present in the file and is one of the values defined in KEY_CV_ID.
This is a reference to the CV in cvList used for unit terms. It ensures that an id is present in the file and is one of the values defined in KEY_CV_ID.
This is a reference to a referenceable param group in referenceableParamGroupList. It ensures that an id is present in the file and is one of the values defined in KEY_RPG_ID.
This is a reference to an instrument configuration in instrumentConfigurationList. It ensures that an id is present in the file and is one of the values defined in KEY_IC_ID.
This is a reference to an instrument configuration in instrumentConfigurationList. It ensures that an id is present in the file and is one of the values defined in KEY_IC_ID.
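The key and reference constraints above amount to two checks per identifier family: every id is unique, and every reference names a defined id. The following is a rough sketch of that logic outside a schema validator, with a hypothetical helper name and example identifiers; a real validator would enforce it from the schema itself.

```python
def check_keys_and_refs(ids, refs):
    """Return (duplicate ids, references that match no defined id)."""
    seen = set()
    duplicates = []
    for identifier in ids:
        if identifier in seen:
            duplicates.append(identifier)
        seen.add(identifier)
    dangling = [ref for ref in refs if ref not in seen]
    return duplicates, dangling


# Example: instrument configuration ids and the references made to them.
duplicates, dangling = check_keys_and_refs(
    ids=["IC1", "IC2", "IC1"],
    refs=["IC1", "IC3"],
)
print(duplicates, dangling)
```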