This shows you the differences between two versions of the page.
prolineconcepts:protscoring [2013/11/22 13:52] 193.48.0.3 [Mascot Modified MudPIT Scoring] |
prolineconcepts:protscoring [2015/07/10 15:21] (current) 132.168.72.225 [Proteins and Proteins sets scoring] |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Proteins and Proteins sets scoring ====== | ====== Proteins and Proteins sets scoring ====== | ||
- | There are multiple algorithms than could be use to calculate the Proteins and Protein Sets score. | + | There are multiple algorithms that could be used to calculate the Proteins and Protein Sets score. |
- | Proteins score are computed during the importation phase while Protein Sets score are comptued during the validation phase. | + | Protein scores are computed during the import phase while Protein Set scores are computed during the validation phase. |
===== Protein ===== | ===== Protein ===== | ||
Line 10: | Line 10: | ||
* importing Mascot result file : the Mascot standard scoring is used (sum of peptide matches scores) | * importing Mascot result file : the Mascot standard scoring is used (sum of peptide matches scores) | ||
* importing OMSSA result file : FIXME | * importing OMSSA result file : FIXME | ||
+ | * importing X! Tandem result file : the X! Tandem standard hyperscore is used | ||
===== Protein Set ===== | ===== Protein Set ===== | ||
- | Each individual protein set is scored according to the validated peptide matches belonging to this protein set (see inference). | + | Each individual protein set is scored according to the validated peptide matches belonging to this protein set (see [[prolineconcepts:proteininferer|inference]]). |
+ | |||
===== Scoring schemes ===== | ===== Scoring schemes ===== | ||
Line 23: | Line 26: | ||
==== Mascot MudPIT Scoring ==== | ==== Mascot MudPIT Scoring ==== | ||
- | This is scoring scheme is also based on the sum of all non-duplicate peptide matches score. However the score for each peptide match is not its absolute value, but the amount that it is above the threshold: the score offset. Therefore, peptide matches with a score below the threshold do not contribute to the protein score. Finally, the average of the thresholds used is added to the score. For each peptide match, the "threshold" is the homology threshold if it exists, otherwise it is the identity threshold. | + | This scoring scheme is also based on the sum of all non-duplicate peptide matches score. However the score for each peptide match is not its absolute value, but the amount that it is above the threshold: the score offset. Therefore, peptide matches with a score below the threshold do not contribute to the protein score. Finally, the average of the thresholds used is added to the score. For each peptide match, the "threshold" is the homology threshold if it exists, otherwise it is the identity threshold. |
The algorithm below illustrates the MudPIT score computation procedure: | The algorithm below illustrates the MudPIT score computation procedure: | ||
Line 48: | Line 51: | ||
==== Mascot Modified MudPIT Scoring ==== | ==== Mascot Modified MudPIT Scoring ==== | ||
- | This scoring scheme, introduced by Proline, is a modified of the Mascot MudPIT one. | + | This scoring scheme, introduced by Proline, is a modified version of the Mascot MudPIT one. |
- | The difference with the latter is that we do not the average of the substracted thresholds. | + | The difference with the latter is that it does not take into account the average of the subtracted thresholds. |
This leads to the following scoring procedure: | This leads to the following scoring procedure: | ||
Line 64: | Line 67: | ||
</code> | </code> | ||
- | This score has the same benefits than the MudPIT one. The main difference is that the minimum value of this modified version will be always close to zero while the genuine MudPIT score define a minimum value which is not constant between the datasets and the proteins (i.e. the average of all the subtracted thresholds). | + | This score has the same benefits as the MudPIT one. The main difference is that the minimum value of this modified version will always be close to zero, while the genuine MudPIT score defines a minimum value which is not constant between the datasets and the proteins (i.e. the average of all the subtracted thresholds). |
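The two schemes described above can be compared side by side with a small Python sketch. This is not Proline's or Mascot's actual code (the page's own algorithm listings are not shown in this diff view); it only follows the prose description. The `PeptideMatch` class, its field names, and both function names are hypothetical. The sketch also assumes that the input has already been reduced to non-duplicate peptide matches, that a match contributes only when its score is strictly above its threshold, and that the average is taken over the thresholds that were actually subtracted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PeptideMatch:
    """Hypothetical container for one non-duplicate peptide match."""
    score: float
    identity_threshold: float
    homology_threshold: Optional[float] = None

    @property
    def threshold(self) -> float:
        # Use the homology threshold if it exists, otherwise the identity one.
        if self.homology_threshold is not None:
            return self.homology_threshold
        return self.identity_threshold


def modified_mudpit_score(matches: Sequence[PeptideMatch]) -> float:
    """Sum of score offsets (score - threshold) for matches above threshold."""
    return float(sum(pm.score - pm.threshold
                     for pm in matches
                     if pm.score > pm.threshold))


def mudpit_score(matches: Sequence[PeptideMatch]) -> float:
    """Sum of score offsets plus the average of the subtracted thresholds."""
    used = [pm.threshold for pm in matches if pm.score > pm.threshold]
    if not used:
        # Assumption: no contributing match means a score of zero.
        return 0.0
    return modified_mudpit_score(matches) + sum(used) / len(used)


matches = [
    PeptideMatch(score=50.0, identity_threshold=40.0, homology_threshold=30.0),
    PeptideMatch(score=25.0, identity_threshold=35.0),  # below threshold, ignored
    PeptideMatch(score=45.0, identity_threshold=40.0),
]
print(modified_mudpit_score(matches))  # offsets 20 + 5
print(mudpit_score(matches))           # offsets plus mean of thresholds 30 and 40
```

With these example values the two scores differ by exactly the average of the subtracted thresholds (35), which is the dataset-dependent floor that the modified version removes.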