userguide:peptidesandproteingrouping

Differences

This shows you the differences between two versions of the page.

userguide:peptidesandproteingrouping [2010/01/08 12:38]
132.168.74.230
userguide:peptidesandproteingrouping [2012/03/29 09:02]
132.168.72.131
Line 1: Line 1:
 ====== Protein grouping ======
 +
 +Protein grouping is done from a parent context and consists of:
 +  * creating new peptides (grouped peptides) and matches from the union of all peptides referenced in the child contexts (the direct children if they exist, otherwise deeper descendants);
 +  * defining protein groups using the new set of peptides and matches associated with the parent context.
  
 ===== Algorithm =====
  
-==== Step - Peptide grouping ====
 +**Since hEIDI 1.11.0**, a new peptide/protein grouping algorithm has been implemented. It is part of a broader effort to improve overall performance in hEIDI.\\
 +Datasets keep getting bigger (e.g. VELOS), and loading all data in memory in a single hEIDI session is no longer possible (as was done before hEIDI 1.11.0).\\
 +The aim is now to load the minimum information into the hEIDI session, and to have algorithms that save their results directly to the MSIdb.\\
 +We started by optimizing the peptide/protein grouping algorithm because it requires loading a complex object tree and is therefore very memory-consuming. Other algorithms will be optimized progressively in later hEIDI versions.\\
  
-Peptides from different identifications - attached directly or indirectly to a context - are grouped.\\
-Peptides must have same sequence and same experimental mass to be grouped.\\
 +**What changes for the user:**
 +  * Paradoxically, grouping is a little slower than in previous hEIDI versions, but initial loading and save operations are faster.
 +  * The MSIdb must be saved before launching grouping.
 +  * The peptide/protein grouping result is automatically saved to the MSIdb.
 + 
 +The protein grouping mechanism is detailed below the following image.\\
 +{{:​proteingroupingmechanism.png| Protein grouping mechanism}} 
 +==== Step 1 - Peptide grouping ==== 
 +=== grouping === 
 +{{:GroupingStep1.png |}} Peptides from different child contexts or identifications - attached directly or indirectly to a context - are grouped.\\
 +Peptides must have the same sequence and the same calculated mass to be grouped.\\
 Peptide grouping results in **new** peptides attached to the parent context and having child peptides
  
-{{:GroupingStep1.png}}
 +{{ :peptidegrouping.png|}} A new peptide is constructed as follows (since hEIDI 1.13.0):
 +  * ''peptide reference (sequence, ptm), missed cleavage and calculated mass'' are copied from the first child peptide found;
 +  * ''experimental mass, charge, delta mass, score, retention time and fragmentation count'' are copied from the best child;
 +  * the ''child list'' is filled with every peptide found with the same sequence and the same mass;
 +  * to define the ''matches list'' associated with the new peptide, matches from all child peptides are grouped by matched protein. The ''score'' of each created match is set to the maximum of all child match scores, and its ''start and end'' values are equal to the child matches' start and end.
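
The construction rules above can be sketched in Python as follows. This is only an illustration, not hEIDI's actual code: the ''Peptide'' and ''Match'' classes and the ''group_peptides'' function are hypothetical names, only a subset of the copied fields is modelled, the "best child" is assumed to be the child with the highest score, and masses are compared for exact equality.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Match:
    protein: str
    score: float
    start: int
    end: int


@dataclass
class Peptide:
    sequence: str
    ptm: str
    missed_cleavage: int
    calculated_mass: float
    experimental_mass: float
    charge: int
    score: float
    matches: list = field(default_factory=list)
    children: list = field(default_factory=list)


def group_peptides(child_peptides):
    """Build one new peptide per (sequence, calculated mass) pair."""
    # Assumption: exact equality on calculated mass; a real implementation
    # may compare masses with a tolerance.
    buckets = defaultdict(list)
    for pep in child_peptides:
        buckets[(pep.sequence, pep.calculated_mass)].append(pep)

    grouped = []
    for children in buckets.values():
        first = children[0]
        # Assumption: the "best child" is the one with the highest score.
        best = max(children, key=lambda p: p.score)

        # Group the matches of all children by matched protein.
        by_protein = defaultdict(list)
        for child in children:
            for match in child.matches:
                by_protein[match.protein].append(match)
        matches = [
            Match(protein, max(m.score for m in ms), ms[0].start, ms[0].end)
            for protein, ms in by_protein.items()
        ]

        grouped.append(Peptide(
            # Copied from the first child peptide found.
            sequence=first.sequence,
            ptm=first.ptm,
            missed_cleavage=first.missed_cleavage,
            calculated_mass=first.calculated_mass,
            # Copied from the best child.
            experimental_mass=best.experimental_mass,
            charge=best.charge,
            score=best.score,
            matches=matches,
            children=children,
        ))
    return grouped
```

Each returned peptide keeps its children in ''children'', which mirrors the parent/child relation between the grouped peptide and the peptides of the child contexts.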
 + 
 +=== filtering === 
 +Several filters can be applied during grouping.
 +  - A first protein filter is applied while building the list of proteins matched by each peptide (for instance, to filter out reverse proteins). This is done before the ''new matches'' are created.
 +  - A second protein filter can be applied to each protein depending on its list of matches. This filter is applied after the ''new matches'' are created. Typically, it filters out proteins with fewer than x peptides.
 +  - The last filter allows the user to filter the new grouped peptides.
 +
 +The filtered proteins or species are not taken into account in the final grouping result: they are removed from it (unlike with the [[userguide:proteinfiltering|protein filter]]). Another difference from the ''protein filter'' operation is that filtering is applied to each protein individually.
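
The order in which the three filters act can be illustrated with a minimal Python sketch. The function ''apply_grouping_filters'', its parameters and the ''REV_'' prefix used in the usage note are assumptions made for the example, not part of hEIDI; the data is reduced to a mapping from each peptide to the set of proteins it matches.

```python
from collections import defaultdict


def apply_grouping_filters(peptide_to_proteins, is_excluded_protein,
                           min_peptides, keep_peptide):
    """Apply the three grouping filters in order (illustrative only)."""
    # Filter 1: exclude proteins before the new matches are created.
    matches = {
        pep: {p for p in proteins if not is_excluded_protein(p)}
        for pep, proteins in peptide_to_proteins.items()
    }

    # Filter 2: once matches exist, drop every protein that is matched
    # by fewer than `min_peptides` peptides.
    counts = defaultdict(int)
    for proteins in matches.values():
        for protein in proteins:
            counts[protein] += 1
    kept = {protein for protein, n in counts.items() if n >= min_peptides}
    matches = {pep: proteins & kept for pep, proteins in matches.items()}

    # Filter 3: filter the new grouped peptides themselves.
    # Peptides left without any protein are kept here with an empty set.
    return {pep: proteins for pep, proteins in matches.items()
            if keep_peptide(pep)}
```

For instance, calling it with ''lambda p: p.startswith("REV_")'' as the first filter and ''min_peptides=3'' removes reverse proteins first and then every protein left with fewer than 3 matching peptides. Removed proteins do not appear anywhere in the returned mapping, which reflects the fact that filtered proteins are removed from the result rather than just hidden.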
 + 
 ==== Step 2 - Protein grouping ====
  
-Peptides from a same protein or a same protein set are grouped.\\
 +{{:GroupingStep2.png |}} Once the new peptides have been created and associated with the parent context, the same grouping as performed by Mascot® and IRMa is applied.
 +
 +Before protein grouping is executed, the list of proteins to be considered is filtered using the optional protein filter. This means that proteins are filtered individually and that the filter is not applied at the protein group level. See the protein group filters page.
 + 
 +Protein grouping consists of:
 +  * All proteins identified by the same set of peptides are grouped together as a **protein group**. Proteins matching only a sub-set of these peptides are listed separately within each group. A **typical** protein is one of the same-set proteins. The rules used to [[how_to:changetypicalprotein|select this typical protein]] can be specified by the user.\\
 Protein grouping results in **new** groups of proteins and peptides, attached to the parent context.
 +The protein group and protein match properties are set as follows:
 
-{{:GroupingStep2.png}}
 +  * Create a //protein match// for each protein of the group, in which the list and count of matching species are set.
 +  * Calculate the score and coverage values using all matching species.
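
The same-set / sub-set logic can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Mascot®, IRMa or hEIDI implementation: ''group_proteins'' is a hypothetical function, the typical protein is arbitrarily taken as the first same-set protein in alphabetical order, and score and coverage are not computed.

```python
from collections import defaultdict


def group_proteins(protein_to_peptides):
    """Group proteins by identical peptide sets (same-set / sub-set)."""
    # Proteins identified by exactly the same peptides share one entry.
    same_sets = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        same_sets[frozenset(peptides)].append(protein)

    groups = []
    for peptides, proteins in same_sets.items():
        # A peptide set strictly contained in another one does not start
        # its own group: its proteins are sub-set proteins of that group.
        if any(peptides < other for other in same_sets):
            continue
        sub_set = sorted(
            protein
            for other, others in same_sets.items() if other < peptides
            for protein in others
        )
        same_set = sorted(proteins)
        groups.append({
            # Assumption: alphabetical choice; in hEIDI the rule is
            # user-configurable.
            "typical": same_set[0],
            "same_set": same_set,
            "sub_set": sub_set,
            "peptides": sorted(peptides),
            "peptide_count": len(peptides),
        })
    return groups
```

With ''ProtA'' and ''ProtB'' both identified by ''p1, p2, p3'' and ''ProtC'' identified by ''p1, p2'' only, the sketch yields a single group whose same-set is ''ProtA, ProtB'' and whose sub-set is ''ProtC''.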
  
 ===== Beware of protein grouping order =====
Line 50: Line 88:
  
 But in case 1, when grouping proteins at the ''Rootnode'' level, ProtA will 'gain' one more peptide (ProtA will be identified by 3 peptides instead of 2). So, ProtA will not be filtered out and will appear in the final result.
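
The effect of the level at which the filter is applied can be reproduced with a small Python sketch. The peptide names, the two child contexts and the threshold of 3 peptides are assumptions chosen to match the ProtA example; ''peptide_count'' is a hypothetical helper.

```python
def peptide_count(contexts, protein):
    """Count the distinct peptides identifying `protein` across contexts."""
    return len(set().union(*(ctx.get(protein, set()) for ctx in contexts)))


# Assumed filter: keep proteins identified by at least 3 peptides.
MIN_PEPTIDES = 3

# Hypothetical child contexts: ProtA has 2 peptides in one, 1 in the other.
child_1 = {"ProtA": {"pep1", "pep2"}}
child_2 = {"ProtA": {"pep3"}}

# Filtering while grouping at the child level: only 2 peptides are seen.
kept_in_child = peptide_count([child_1], "ProtA") >= MIN_PEPTIDES

# Filtering while grouping at the Rootnode level: the union has 3 peptides.
kept_at_root = peptide_count([child_1, child_2], "ProtA") >= MIN_PEPTIDES
```

Here ''kept_in_child'' is ''False'' while ''kept_at_root'' is ''True'': the same protein passes or fails the same filter depending on the context in which grouping is launched.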
- 
- 
userguide/peptidesandproteingrouping.txt · Last modified: 2012/03/29 09:02 by 132.168.72.131