This shows you the differences between two versions of the page.
userguide:retentiontimealignment [2010/08/02 10:39] 132.168.74.230 |
userguide:retentiontimealignment [2010/08/02 16:21] 132.168.74.230 |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Retention Time Alignment ====== | ====== Retention Time Alignment ====== | ||
- | Here are the steps followed by the Retention Time algorithm: | ||
- | Collect species from the reference User context and predict their retention time using an external utility | ||
- | Collect species to align within identifications that are gathered under the reference User context | ||
- | Constitute pair groups between collected reference species & species to align: a pair group is created from species having same sequence & calculated mass | ||
+ | ===== Major steps followed by the Retention Time algorithm ===== | ||
- | - Species Retention Time of the reference UserContext are predicted with **NETPrediction v2.2.3378** utility using Kangas method (click [[http://omics.pnl.gov/software/NETPredictionUtility.php|here]] for more details). NETPrediction utility only uses species **sequences** to predict a Normalized Elution Time (NET) value. First, a list of '//reference//' species is built with the below criteria, then the corresponding sequences are exported in order to be used by the NETPrediction utility: | + | {{ :userguide:rtalignmentconcept2.png|}} |
- | * The reference species list doesn't contain any species with PTMs | + | - Collect species from the reference User context, predict their retention times using an external utility, and store the predictions in the reference species properties |
- | * If several species exist with redundant sequences, the best score species is retained | + | - For each identification gathered under the reference User context do: |
- | - All the final child species, i.e. species in identifications, are collected. ''a | + | - Collect species to align from the identification |
- | c | + | - Build several pair groups from the collected reference species and the species to align. A pair group contains two groups of species that share the same sequence and calculated mass (group1 holds species from the reference User context; group2 holds species to align from the identifications). |
- | f'' | + | - Compute one (or several) representative value(s) for group1 & group2 for each pair group |
- | - tata | + | - Compute linear regression between representative values |
- | + | - Store linear regression coefficients & reference context name in identification properties | |
- | {{:userguide:specif_alignement_tr_1.png|}} | + | |
- | {{:userguide:specif_alignement_tr_2.png|}} | + | ===== In more detail... ===== |
+ | |||
+ | {{ RtAlignmentConcept1.PNG}} | ||
+ | - The retention times of the species in the reference UserContext are predicted with the **NETPrediction v2.2.3378** utility using the Kangas method (click [[http://omics.pnl.gov/software/NETPredictionUtility.php|here]] for more details). The NETPrediction utility uses only species **sequences** to predict a Normalized Elution Time (NET) value. | ||
+ | - First, a list of '//reference//' species is built | ||
+ | * The reference species list doesn't contain any species with PTMs | ||
+ | * If several species share the same sequence, the one with the best score is retained | ||
+ | - Then, the corresponding sequences are exported in order to be used by the NETPrediction utility | ||
+ | - The predicted NET values are converted to retention times using user-defined parameters. | ||
+ | - The user can decide to exclude predicted values too far from the others: | ||
+ | * The average absolute deviation between RT and predicted RT is computed, and every predicted RT whose deviation differs from this average by more than a user-defined threshold is excluded. | ||
+ | - The predicted RT values are stored as properties in the reference species. | ||
+ | - For each identification existing under the reference User context (not necessarily directly under it): | ||
+ | - All the final child species to align are collected | ||
+ | - Several pair groups are created from the reference species and the species to align. A pair group contains two groups of species that share the same sequence and calculated mass (group1 holds species from the reference User context; group2 holds species to align from the identifications). | ||
+ | - For each group, one or several representative RT values are calculated according to the statistical method chosen by the user. | ||
+ | * Each 'Group1' always contains exactly one species, because [[userguide:proteingroups|species/protein grouping]] has been executed on the reference context and has removed sequence/calculated mass redundancy. | ||
+ | * 'Group2' may contain one or more species. The following statistical methods are available: | ||
+ | * Average {{ :userguide:rtalignmentconcept3.png|}} | ||
+ | * Median | ||
+ | * Cross-product | ||
+ | * Combined | ||
+ | * Best score | ||
+ | - Compute linear regression between pair groups | ||
+ | - Store linear regression coefficients (slope/intercept) + reference context name in the identification properties | ||
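
The pairing, representative-value and regression steps described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation. The dictionary keys (''sequence'', ''calc_mass'', ''predicted_rt'', ''rt''), the function names and the rounding of the calculated mass to 4 decimals are all hypothetical. The regression direction (reference predicted RT as a function of the observed RT) is also an assumption, since the page does not state it. Only the Average and Median methods are shown.

```python
from collections import defaultdict
from statistics import mean, median


def build_pair_groups(reference_species, species_to_align):
    """Pair species that share the same (sequence, calculated mass).

    Returns a list of (reference_predicted_rt, [observed_rt, ...]) tuples.
    Group1 is a single reference species; group2 may hold several species.
    """
    # Hypothetical key: mass rounded to 4 decimals to avoid float mismatches.
    def key(s):
        return (s["sequence"], round(s["calc_mass"], 4))

    reference = {key(s): s["predicted_rt"] for s in reference_species}
    group2 = defaultdict(list)
    for s in species_to_align:
        if key(s) in reference:  # species without a reference partner are ignored
            group2[key(s)].append(s["rt"])
    return [(reference[k], rts) for k, rts in group2.items()]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    if len(xs) < 2:
        raise ValueError("at least two pair groups are needed")
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def align(reference_species, species_to_align, representative=median):
    """Return (slope, intercept) mapping observed RT to reference predicted RT."""
    pairs = build_pair_groups(reference_species, species_to_align)
    xs = [representative(rts) for _, rts in pairs]  # one value per group2
    ys = [ref_rt for ref_rt, _ in pairs]            # group1 holds one species
    return fit_line(xs, ys)
```

Passing ''representative=mean'' instead of the default ''median'' switches to the Average method. The returned slope and intercept correspond to the coefficients that the last step stores in the identification properties.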