Corrects retention time distortions between maps, using information from peptides identified in different maps.
potential predecessor tools | potential successor tools |
---|---|
XTandemAdapter (or another search engine adapter) | IDMerger |
IDFileConverter | FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT |
IDMapper | |
Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).
This tool provides an algorithm to align the retention time scales of multiple input files, correcting shifts and distortions between them. Retention time adjustment may be necessary to correct for chromatography differences, e.g. before data from multiple LC-MS runs can be combined (feature grouping) or when one run should be annotated with peptide identifications obtained in a different run.
All map alignment tools (MapAligner...) collect retention time data from the input files and, by fitting a model to this data, compute transformations that map all runs to a common retention time scale. They can apply the transformations right away and return output files with aligned time scales (parameter 'out'), and/or return descriptions of the transformations in trafoXML format (parameter 'trafo_out'). Transformations stored as trafoXML can be applied to arbitrary files with the MapRTTransformer tool.
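The idea of fitting a model to retention time pairs and then mapping a run onto a common scale can be illustrated with a minimal sketch. It is not OpenMS code: the function name `make_interpolated_transform` and the example data are hypothetical, and plain piecewise-linear interpolation (with linear continuation of the first and last segment outside the data range) stands in for the models the tool actually offers.

```python
from bisect import bisect_right


def make_interpolated_transform(pairs):
    """Build a function that maps a run's RT onto the common RT scale.

    pairs: list of (rt_in_run, rt_on_common_scale) tuples. At least two
    pairs with distinct rt_in_run values are assumed.
    """
    points = sorted(pairs)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def transform(rt):
        # Select the segment containing rt; clamp to the first or last
        # segment so that values outside the data range are extrapolated
        # along that segment's slope.
        i = min(max(bisect_right(xs, rt) - 1, 0), len(xs) - 2)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (rt - xs[i])

    return transform


# Hypothetical correspondence points: (RT in this run, RT on common scale).
transform = make_interpolated_transform(
    [(100.0, 110.0), (200.0, 205.0), (300.0, 320.0)]
)
aligned = [transform(rt) for rt in (150.0, 250.0, 350.0)]
print(aligned)
```

Returning a closure keeps the fitted data and the mapping together, which loosely mirrors how a stored transformation can later be applied to other files.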
The map alignment tools differ in how they obtain retention time data for the modeling of transformations, and consequently what types of data they can be applied to. The alignment algorithm implemented here is based on peptide identifications, and thus applicable to files containing peptide IDs (idXML, annotated featureXML/consensusXML). It finds peptide sequences that different input files have in common and uses them as points of correspondence between the inputs. For more details and algorithm-specific parameters (set in the INI file) see "Detailed Description" in the algorithm documentation.
Note that alignment is based on the sequence including modifications, thus an exact match is required. I.e., a peptide with oxidised methionine will not be matched to its unmodified version. This behavior is generally desired since (some) modifications can cause retention time shifts.
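The exact-match rule can be illustrated with a minimal sketch. It is not the OpenMS implementation: the function name `common_peptide_pairs`, the bracketed modification notation and the example data are assumptions made for this illustration. Each run is represented as a mapping from a peptide sequence string (including modifications) to an observed retention time, and only identical strings yield points of correspondence.

```python
def common_peptide_pairs(run_a, run_b):
    """Return sorted (rt_a, rt_b) pairs for peptides found in both runs.

    run_a, run_b: dict mapping a peptide sequence string (modifications
    included) to a retention time in seconds. Matching is by exact string
    equality, so a modified peptide never pairs with its unmodified form.
    """
    shared = run_a.keys() & run_b.keys()
    return sorted((run_a[seq], run_b[seq]) for seq in shared)


# Hypothetical data; "M(Oxidation)" marks an oxidised methionine.
run_1 = {"PEPTIDEK": 1200.0, "AM(Oxidation)LK": 845.0, "SAMPLER": 1510.0}
run_2 = {"PEPTIDEK": 1232.5, "AMLK": 902.0, "SAMPLER": 1548.0}

pairs = common_peptide_pairs(run_1, run_2)
print(pairs)  # [(1200.0, 1232.5), (1510.0, 1548.0)]
```

The oxidised and unmodified forms of the same peptide are different keys, so they contribute no pair.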
Since OpenMS 1.8, the extraction of data for the alignment has been separate from the modeling of RT transformations based on that data. It is now possible to use different models independently of the chosen algorithm. This algorithm has been tested mostly with the "b_spline" model. The available models are "linear", "b_spline" and "interpolated"; the model is selected with the "type" parameter.
The following parameters control the modeling of RT transformations (they can be set in the "model" section of the INI file):
Name | Type | Default | Restrictions | Description |
---|---|---|---|---|
type | string | interpolated | linear, b_spline, interpolated | Type of model |
linear:symmetric_regression | string | false | true, false | Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'. |
b_spline:wavelength | float | 0 | min: 0 | Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points. |
b_spline:num_nodes | int | 5 | min: 0 | Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing. |
b_spline:extrapolate | string | linear | linear, b_spline, constant, global_linear | Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range). |
b_spline:boundary_condition | int | 2 | min: 0 max: 2 | Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero) |
interpolated:interpolation_type | string | cspline | linear, cspline, akima | Type of interpolation to apply. |
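To make the 'linear:symmetric_regression' option more concrete, the following sketch regresses 'y - x' on 'y + x' with ordinary least squares and converts the result back into a line y = slope * x + intercept. The helper names are hypothetical and the code is not taken from OpenMS; it only works through the algebra implied by the parameter description. If (y - x) = a * (y + x) + b, then y = ((1 + a) / (1 - a)) * x + b / (1 - a).

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def symmetric_regression(xs, ys):
    """Fit a line by regressing (y - x) on (y + x).

    Returns (slope, intercept) of the equivalent line y = slope * x + intercept.
    """
    sums = [y + x for x, y in zip(xs, ys)]
    diffs = [y - x for x, y in zip(xs, ys)]
    a, b = least_squares(sums, diffs)
    # Solve (y - x) = a * (y + x) + b for y.
    return (1 + a) / (1 - a), b / (1 - a)


# Hypothetical retention times lying exactly on y = 1.02 * x + 15.
rt_run = [100.0, 400.0, 900.0, 1500.0]
rt_ref = [1.02 * x + 15.0 for x in rt_run]

slope, intercept = symmetric_regression(rt_run, rt_ref)
print(slope, intercept)
```

Swapping the roles of x and y in this scheme yields exactly the inverse line, which an ordinary regression of 'y' on 'x' does not guarantee for noisy data.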
The command line parameters of this tool are:
    MapAlignerIdentification -- Corrects retention time distortions between maps based on common peptide identifications.
    Version: 2.0.0 Mar 30 2016, 12:52:33, Revision: GIT-NOTFOUND

    Usage:
      MapAlignerIdentification <options>

    This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option.

    Options (mandatory options marked with '*'):
      -in <files>*               Input files separated by blanks (all must have the same file type) (valid formats: 'featureXML', 'consensusXML', 'idXML')
      -out <files>               Output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (valid formats: 'featureXML', 'consensusXML', 'idXML')
      -trafo_out <files>         Transformation output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (valid formats: 'trafoXML')

    Options to define a reference file (use either 'file' or 'index', not both; if neither is given 'index' is used).:
      -reference:file <file>     File to use as reference (same file format as input files required) (valid formats: 'featureXML', 'consensusXML', 'idXML')
      -reference:index <number>  Use one of the input files as reference ('1' for the first file, etc.). If '0', no explicit reference is set - the algorithm will select a reference. (default: '0' min: '0')

    Common TOPP options:
      -ini <file>                Use the given TOPP INI file
      -threads <n>               Sets the number of threads allowed to be used by the TOPP tool (default: '1')
      -write_ini <file>          Writes the default configuration file
      --help                     Shows options
      --helphelp                 Shows all options (including advanced)

    The following configuration subsections are valid:
     - algorithm   Algorithm parameters section
     - model       Options to control the modeling of retention time transformations from data

    You can write an example INI file using the '-write_ini' option.
    Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
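As a usage illustration, the sketch below assembles a command line from the options listed above. The helper `build_command` and all file names are hypothetical. Only the flags '-in', '-out', '-trafo_out' and '-reference:index' from the help text are used. The command is only built and printed here; running it would require an OpenMS installation with the tool on the search path.

```python
def build_command(inputs, outputs=None, trafo_outputs=None, reference_index=0):
    """Assemble an argument list for MapAlignerIdentification.

    Mirrors the rule from the help text that at least one of 'out' and
    'trafo_out' has to be provided.
    """
    if not outputs and not trafo_outputs:
        raise ValueError("Either 'out' or 'trafo_out' has to be provided")
    cmd = ["MapAlignerIdentification", "-in", *inputs]
    if outputs:
        cmd += ["-out", *outputs]
    if trafo_outputs:
        cmd += ["-trafo_out", *trafo_outputs]
    cmd += ["-reference:index", str(reference_index)]
    return cmd


# Hypothetical file names; all inputs must share one file type.
cmd = build_command(
    inputs=["run1.featureXML", "run2.featureXML"],
    outputs=["run1_aligned.featureXML", "run2_aligned.featureXML"],
    trafo_outputs=["run1.trafoXML", "run2.trafoXML"],
    reference_index=1,
)
print(" ".join(cmd))

# To execute (assumes the tool is installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.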
OpenMS / TOPP release 2.0.0 | Documentation generated on Wed Mar 30 2016 16:18:43 using doxygen 1.8.5 |