Differences

This shows you the differences between two versions of the page.

--- userguide:retentiontimealignment [2010/08/02 10:39]
132.168.74.230
+++ userguide:retentiontimealignment [2010/08/02 16:21]
132.168.74.230
@@ Line 1: / Line 1: @@
 ====== Retention Time Alignment ======
-Here are the steps followed by the Retention Time algorithm:
-Collect species from the reference User context and predict their retention time using an external utility
-Collect species to align within identifications that are gathered under the reference User context
-Constitute pair groups between collected reference species & species to align: a pair group is created from species having same sequence & calculated mass
+===== Major steps followed by the Retention Time algorithm =====
-  - Species Retention Time of the reference UserContext are predicted with **NETPrediction v2.2.3378** utility using Kangas method (click [[http://omics.pnl.gov/software/NETPredictionUtility.php|here]] for more details). NETPrediction utility only uses species **sequences** to predict a Normalized Elution Time (NET) value. First, a list of '//reference//' species is built with the below criteria, then the corresponding sequences are exported in order to be used by the NETPrediction utility:
+{{ :userguide:rtalignmentconcept2.png|}}
-    * The reference species list doesn't contain any species with PTMs
+  - Collect species from the reference User context, predict their retention times using an external utility, and store the predictions in the reference species properties
-    * If several species exist with redundant sequences, the best score species is retained
+  - For each identification gathered under the reference User context:
-  - All the final child species, i.e. species in identifications, are collected. ''a
+    - Collect species to align from the identification
-    c
+    - Build several pair groups between the collected reference species & the species to align. A pair group contains 2 groups of species having the same sequence & calculated mass (group1 has species from the reference User context and group2 has species to align from identifications).
-    f''
+    - Compute one (or several) representative value(s) for group1 & group2 for each pair group
-  - tata
+    - Compute linear regression between representative values
+    - Store linear regression coefficients & reference context name in identification properties
-{{:userguide:specif_alignement_tr_1.png|}}
-{{:userguide:specif_alignement_tr_2.png|}}
+===== In more detail... =====
+{{ RtAlignmentConcept1.PNG}}
+  - The retention times of the species in the reference UserContext are predicted with the **NETPrediction v2.2.3378** utility using the Kangas method (click [[http://omics.pnl.gov/software/NETPredictionUtility.php|here]] for more details). The NETPrediction utility uses only species **sequences** to predict a Normalized Elution Time (NET) value.
+    - First, a list of '//reference//' species is built
+         * The reference species list doesn't contain any species with PTMs
+         * If several species exist with redundant sequences, the best score species is retained
+    - Then, the corresponding sequences are exported in order to be used by the NETPrediction utility
+  - The predicted NET values are converted to retention times using user-defined parameters.
+  - The user can decide to exclude predicted values too far from the others:
+    * The average absolute deviation (between RT & predicted RT) is computed, and all predicted RT values that deviate from this average by more than a user-defined threshold are excluded.
+  - The predicted RT values are stored as properties in the reference species.
+  - For each identification existing under the reference User context (not necessarily directly under the reference User context):
+    -  All the final child species to align are collected
+    -  Several pair groups are created using the reference species & the species to align. A pair group contains 2 groups of species having the same sequence & calculated mass (group1 has species from the reference User context and group2 has species to align from identifications).
+    - For each group, one or several representative RT values are calculated according to the statistical method chosen by the user.
+      * Each 'Group1' always contains one species because [[userguide:proteingroups|species/protein grouping]] has been executed on the reference context and has removed sequence/calculated mass redundancy.
+      * 'Group2' may contain one or more species. The following statistical methods are available:
+        * Average {{ :userguide:rtalignmentconcept3.png|}}
+        * Median
+        * Cross-product
+        * Combined
+        * Best score
+    - Compute linear regression between pair groups
+    - Store linear regression coefficients (slope/intercept) + reference context name in the identification properties
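
The per-identification steps above (pair groups, representative RT values, linear regression) can be illustrated with a minimal Python sketch. It is not the hEIDI implementation: the dictionary keys (''sequence'', ''mass'', ''rt''), all function names, and the regression direction (observed RT of the species to align on x, predicted reference RT on y) are assumptions made for illustration. Only the Average and Median methods are shown, since the page does not define the others.

```python
from collections import defaultdict
from statistics import mean, median


def build_pair_groups(reference_species, species_to_align):
    """Pair species sharing the same (sequence, calculated mass) key.

    Returns {key: (group1, group2)}; keys present on one side only are dropped.
    Masses are compared with exact equality here, which is an assumption.
    """
    groups = defaultdict(lambda: ([], []))
    for sp in reference_species:
        groups[(sp["sequence"], sp["mass"])][0].append(sp)
    for sp in species_to_align:
        groups[(sp["sequence"], sp["mass"])][1].append(sp)
    return {key: g for key, g in groups.items() if g[0] and g[1]}


def representative_rt(group, method="average"):
    """Reduce a group of species to one representative RT value."""
    rts = [sp["rt"] for sp in group]
    if method == "average":
        return mean(rts)
    if method == "median":
        return median(rts)
    raise ValueError(f"unsupported method: {method}")


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) < 2:
        raise ValueError("at least two pair groups are needed")
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def align_identification(reference_species, species_to_align, method="average"):
    """Return (slope, intercept) mapping observed RT onto predicted reference RT."""
    pairs = build_pair_groups(reference_species, species_to_align)
    xs = [representative_rt(g2, method) for _, g2 in pairs.values()]
    ys = [representative_rt(g1, method) for g1, _ in pairs.values()]
    return fit_line(xs, ys)
```

In this sketch, the returned slope and intercept correspond to the coefficients that the last step stores in the identification properties, alongside the reference context name.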
