
Protein Sets Filtering

The filtering applied during validation is the same as the one described in protsetfiltering.

Protein Sets Validation

Once the pre-filters (see above) have been applied, a validation algorithm can be run to control the FDR (see how FDR is calculated).

At the moment, it is only possible to control the FDR by changing the Protein Set Score threshold. Three different protein set scoring functions are available.

Given an expected FDR, the system tries to estimate the score threshold that best reaches this FDR. The algorithm optimizes two validation rules (R1 and R2), which correspond to two different groups of protein sets (see the detailed procedure below). Each rule defines the optimum score threshold, that is, the threshold that yields the FDR closest to the expected one for the corresponding group of protein sets.
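The idea of picking, for one group of protein sets, the score threshold whose FDR is closest to the expected one can be sketched as follows. This is only an illustration, not the actual implementation: the function names (estimate_fdr, find_score_threshold) are hypothetical, the FDR is assumed to be the ratio of decoy to target protein sets expressed as a percentage (the real formula is described in the FDR calculation page), and ties are assumed to be broken in favour of the lowest threshold.

```python
from typing import Optional, Sequence, Tuple


def estimate_fdr(n_target: int, n_decoy: int) -> float:
    """ASSUMPTION: FDR (%) = 100 * decoy hits / target hits.

    The real formula may differ; it is replaceable here.
    """
    return 100.0 * n_decoy / n_target


def find_score_threshold(
    target_scores: Sequence[float],
    decoy_scores: Sequence[float],
    expected_fdr: float,
) -> Optional[Tuple[float, float]]:
    """Return (threshold, fdr) with the FDR closest to expected_fdr.

    A protein set is kept when its score is >= threshold.
    Returns None when no threshold keeps at least one target protein set.
    """
    # Every observed score is a candidate threshold, tried in ascending order.
    candidates = sorted(set(target_scores) | set(decoy_scores))

    best: Optional[Tuple[float, float, float]] = None  # (distance, threshold, fdr)
    for threshold in candidates:
        n_target = sum(score >= threshold for score in target_scores)
        n_decoy = sum(score >= threshold for score in decoy_scores)
        if n_target == 0:
            continue  # FDR is undefined without any target hit

        fdr = estimate_fdr(n_target, n_decoy)
        distance = abs(fdr - expected_fdr)
        # Strict "<" keeps the lowest threshold when two candidates tie.
        if best is None or distance < best[0]:
            best = (distance, threshold, fdr)

    if best is None:
        return None
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores for a single group of protein sets (illustrative values only).
    targets = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    decoys = [5, 15, 25]
    print(find_score_threshold(targets, decoys, expected_fdr=10.0))
```

Under these assumptions, the same function would be called once per group of protein sets, giving one threshold for the R1 rule and another for the R2 rule.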

Here is the procedure used for FDR optimization:

Separating the protein sets into two groups increases the power of discrimination between target and decoy hits. Indeed, the score threshold of the G1 group is often much higher than that of the G2 group. Using a single average threshold would reduce the number of validated G2 proteins, leading to a decrease in sensitivity for the same FDR value. In the future, we will try to implement such a single-threshold strategy so that users can make their own comparison.