This shows you the differences between two versions of the page.
prolineconcepts:rsvalidation [2014/02/04 09:29] 132.168.72.206 |
prolineconcepts:rsvalidation [2015/07/10 11:05] (current) 132.168.72.225 |
||
---|---|---|---|
Line 3: | Line 3: | ||
Once a result file has been imported and a search result created, the validation is performed in 4 main steps: | Once a result file has been imported and a search result created, the validation is performed in 4 main steps: | ||
- | - [[.:rsvalidation#Peptide Matches Filtering|Peptide Matches filtering]] and [[.:rsvalidation#Peptide Matches Validation|validation]] | ||
- [[prolineconcepts:PeptideMatchesFilteringAndValidation|Peptide Matches filtering and Validation]] | - [[prolineconcepts:PeptideMatchesFilteringAndValidation|Peptide Matches filtering and Validation]] | ||
- [[.:ProteinInferer|Protein Inference]] (peptides and proteins grouping) | - [[.:ProteinInferer|Protein Inference]] (peptides and proteins grouping) | ||
Line 9: | Line 8: | ||
- [[prolineconcepts:ProteinSetsFilteringAndValidation|Protein Sets Filtering and Validation]] | - [[prolineconcepts:ProteinSetsFilteringAndValidation|Protein Sets Filtering and Validation]] | ||
- | Finally, the [[.:rs_rsm|Identification Result]] issued from these steps is stored in the identification database. Several validations of a Search Result can be performed, and a new Identification Summary of this Search Result is created for each validation. | + | Finally, the [[.:rsm|Identification Result]] issued from these steps is stored in the identification database. Several validations of a Search Result can be performed, and a new Identification Summary of this Search Result is created for each validation. |
- | ===== Peptide Matches Filtering ===== | ||
- | Peptide Matches identified in a search result can be filtered using one or more predefined filters (described hereafter). Only validated peptide matches will be considered for further steps.\\ | ||
- | |||
- | |||
- | ==== Basic Score Filter ==== | ||
- | |||
- | All PSMs whose score is lower than a given threshold are invalidated. | ||
- | |||
- | ==== Pretty Rank Filter ==== | ||
- | |||
- | This filtering is performed after temporarily joining the target and decoy PSMs corresponding to the same query (only really needed for separate forward/reverse database searches). Then, for each query, the target and decoy PSMs are sorted by score. A rank (Mascot pretty rank) is computed for each PSM from its position in this ordering: PSMs with almost equal scores (difference < 0.1) are assigned the same rank. | ||
- | All PSMs with a rank greater than the specified one are invalidated. | ||
- | |||
- | |||
- | ==== Minimum Sequence length Filter ==== | ||
- | |||
- | PSMs corresponding to short peptide sequences (length lower than the specified minimum) can be invalidated using this parameter. | ||
- | |||
- | ==== Mascot eValue Filter ==== | ||
- | |||
- | Filters PSMs using the Mascot expectation value (e-value), which reflects the difference between the PSM score and the Mascot identity threshold (p=0.05). | ||
- | PSMs having an e-value greater than the specified one are invalidated. | ||
- | |||
- | ==== Mascot adjusted eValue Filter ==== | ||
- | |||
- | Proline is able to compute an adjusted e-value. It first selects the lower of the identity and homology thresholds (p=0.05), then computes the e-value using the selected threshold. | ||
- | PSMs having an adjusted e-value greater than the specified one are invalidated. | ||
- | |||
- | ==== Mascot p-value on Identity Filter ==== | ||
- | |||
- | Given a specific p-value, the Mascot identity threshold is calculated for each query, and all peptide matches associated with the query whose score is lower than the calculated identity threshold are invalidated.\\ | ||
- | When a Mascot result file is parsed, the number of PSM candidates for each spectrum is saved and can be used to recalculate the identity threshold for any p-value. | ||
- | |||
- | ==== Mascot p-value on homology Filter ==== | ||
- | |||
- | Given a specific p-value, the Mascot homology threshold is inferred for each query, and all peptide matches associated with the query whose score is lower than the inferred homology threshold are invalidated. | ||
- | |||
- | ===== Peptide Matches Validation ===== | ||
- | |||
- | Specify an expected FDR and tune a specified filter in order to obtain this FDR. | ||
- | See how [[prolineconcepts:fdrestimator|FDR is calculated]]. | ||
- | |||
- | Once the previously described pre-filters have been applied, a validation algorithm can be run to control the FDR: given a criterion, the system will estimate the best threshold value in order to reach a specific FDR. |
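
To make the Pretty Rank Filter described above more concrete, here is a minimal Python sketch. It is not Proline's implementation: the `Psm` class, its field names and the `apply_pretty_rank_filter` function are hypothetical. The text only says that PSMs with a score difference below 0.1 share a rank; the sketch assumes that this difference is measured against the best-scoring PSM of the current rank group.

```python
from dataclasses import dataclass

# Scores closer than this are treated as "almost equal" (value taken from the text).
SCORE_TOLERANCE = 0.1


@dataclass
class Psm:
    """Hypothetical PSM record; field names are illustrative only."""
    query_id: int
    score: float
    is_decoy: bool
    pretty_rank: int = 0
    is_validated: bool = True


def apply_pretty_rank_filter(psms, max_rank):
    """Assign a pretty rank per query and invalidate PSMs ranked above max_rank."""
    # Join target and decoy PSMs that belong to the same query.
    by_query = {}
    for psm in psms:
        by_query.setdefault(psm.query_id, []).append(psm)

    for query_psms in by_query.values():
        # Sort target and decoy PSMs together, best score first.
        query_psms.sort(key=lambda p: p.score, reverse=True)
        rank = 0
        ref_score = None  # best score of the current rank group (assumption)
        for psm in query_psms:
            if ref_score is None or ref_score - psm.score >= SCORE_TOLERANCE:
                rank += 1
                ref_score = psm.score
            psm.pretty_rank = rank
            if rank > max_rank:
                psm.is_validated = False
    return psms


psms = [
    Psm(query_id=1, score=50.0, is_decoy=False),
    Psm(query_id=1, score=49.95, is_decoy=True),
    Psm(query_id=1, score=40.0, is_decoy=False),
    Psm(query_id=2, score=30.0, is_decoy=True),
]
apply_pretty_rank_filter(psms, max_rank=1)
```

With `max_rank=1`, the two PSMs of query 1 whose scores differ by less than 0.1 both keep rank 1, while the third PSM of that query receives rank 2 and is invalidated.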