====== Identification Tree and Dataset creation ======

Once your Result Files have been imported, you can use them to create a new Identification Dataset.
To create it from result files, you can:
  * right-click the //Identification Trees// node in the project tree, then select //New Dataset// or
  * double-click the //Identification Trees// node to show the identifications grid. This grid lists all your datasets, along with their validation results if they have been validated. For now, click the //New Dataset// button in the toolbar.

{{ :how_to:web:dse_new_dataset.png?nolink&1000 |}}

Note that you can also create an empty dataset and then assemble more complex structures using drag and drop in the project tree. However, this approach is not recommended.

{{:how_to:web:dse_new_dataset_based_on_sresults.png?nolink&200 |}}
\\
\\
You should now see a window asking you to choose a data source for your new Dataset.\\
The only option currently available is //Result Set//: it allows you to build a new dataset from the Result Files you have imported.
Click //OK// to continue.

{{ :how_to:web:dse_new_dataset_creation_panel_empty.png?nolink&1000 |}}

\\

===== Create a simple dataset (single group) =====

==== Dataset properties ====

In the right panel, two text fields allow you to enter a name and description (optional) for your dataset.

By default, the result sets will be named in the tree using the result file name. You can change that with the //Name child results using// box. The possibilities are: peaklist file name, raw file identifier, result file name, sample, search title.


=== Files selection ===

To add one or more files to your selection, select them in the grid (you can use the Ctrl and Shift keys to select several files at once), then click on //Add to dataset// (top-right of the panel). You can also double-click a file to quickly add it to the selection.

To remove files from the selection, select them and click on //Remove selected Items//.

=== Created dataset ===

The creation of your identification dataset happens as follows:
  * An “Aggregation” Dataset Node is created. It takes the name that you provided during the creation.
  * One “Identification” Dataset Node is created for each one of the Result Files you have selected. They take the name of the Result File.

{{:how_to:web:dse_newly_created_dataset.png?nolink&250 |}}
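The resulting structure can be pictured as a small tree. The following Python snippet is only an illustrative sketch of that structure, not the application's actual code: the ''DatasetNode'' class, the ''create_simple_dataset'' function and the dataset name "UPS1 study" are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetNode:
    """Hypothetical model of a node in the identification tree."""
    name: str
    kind: str  # "Aggregation" or "Identification"
    children: List["DatasetNode"] = field(default_factory=list)


def create_simple_dataset(dataset_name, result_file_names):
    """Build one Aggregation node with one Identification child per file."""
    root = DatasetNode(dataset_name, "Aggregation")
    for file_name in result_file_names:
        # By default, each child takes the name of its Result File.
        root.children.append(DatasetNode(file_name, "Identification"))
    return root


# "UPS1 study" is an assumed dataset name, used for illustration only.
dataset = create_simple_dataset("UPS1 study", ["F078594.dat", "F078596.dat"])
for child in dataset.children:
    print(dataset.name, "->", child.name, f"({child.kind})")
```
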

Once your Identification Dataset has been created, you can see it in the tree, on the left side of the window. You may need to collapse and re-expand the //Identifications// node to make it appear.\\
The white icon lets you know that the dataset is not yet validated (the icon turns green once it is validated).


Double-click the aggregation node to open the identification summary. This panel shows a list of your Identification fractions (one for each imported file) and, after the validation process, it also displays the Merged Result Summary information.

\\
\\
\\

===== Create a complex dataset (automatic grouping) =====

==== Dataset properties and files selection ====

Refer to the [[#Create a simple dataset (single group)|previous section]] for how to perform these steps.

==== Add annotations to files ====

Once you have selected your result files, click on //Add annotations//. A window shows up, with as many empty lines as selected files.

{{ :how_to:web:dse_edit_annotations.png?nolink&700 |}}


=== Example 1: multiple biological groups ===

Let's say you are comparing 2 conditions, and you have the following result table (in an Excel file, for instance). The idea is to copy and paste this table into the annotation editor.

| Result file | Peaklist File | Condition |
| F078594.dat | OVEMB150205_21.raw.mgf | UPS1 50fmol |
| F078596.dat | OVEMB150205_23.raw.mgf | UPS1 50fmol |
| F078592.dat | OVEMB150205_25.raw.mgf | UPS1 50fmol |
| F078590.dat | OVEMB150205_27.raw.mgf | UPS1 50fmol |
| F078591.dat | OVEMB150205_12.raw.mgf | UPS1 5fmol |
| F078595.dat | OVEMB150205_14.raw.mgf | UPS1 5fmol |
| F078597.dat | OVEMB150205_16.raw.mgf | UPS1 5fmol |
| F078593.dat | OVEMB150205_18.raw.mgf | UPS1 5fmol |


  * Select a property to identify the result set with the //Map annotations using...// box. It doesn't need to be the same property as for dataset children name (see [[#Dataset properties|dataset properties]]), so long as it refers to the selected files. In this example, it is convenient to select "Peaklist file name".
  * In the toolbar at the top of the grid, the annotation //Biological group// is selected by default. Click //Add// to add this column to the grid.
  * Copy the "Peaklist File" and "Condition" columns of your table, and paste them (using Ctrl + V) into the empty grid of the annotation editor. Note that the order of the files doesn't matter, so long as the number of rows matches the number of selected files.
  * The window should now look like this:

{{ :how_to:web:dse_edit_annotations_bio_groups.png?nolink&550 |}}

Click //OK// to register these annotations. The window closes and the file selection is now annotated as shown below (on the left). Click //Create Dataset//. The generated dataset (below, on the right) will have 2 levels of aggregation: biological group (named after the pasted values) and top-level (named after the dataset name you chose).

{{ :how_to:web:dse_edit_annotations_bio_groups_created.png?nolink&700 |}}
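As a rough illustration of the grouping in this example, the following Python sketch matches pasted annotation rows to the selected files by peaklist file name, then groups the files by condition. It is not the application's implementation: the function name ''group_by_biological_group'' and the dataset name "My dataset" are assumptions made for the example. The rows are deliberately pasted in a different order than the selection to show that order does not matter.

```python
from collections import defaultdict

# Selected files, keyed by the mapping property ("Peaklist file name").
selected_files = {
    "OVEMB150205_21.raw.mgf": "F078594.dat",
    "OVEMB150205_23.raw.mgf": "F078596.dat",
    "OVEMB150205_25.raw.mgf": "F078592.dat",
    "OVEMB150205_27.raw.mgf": "F078590.dat",
    "OVEMB150205_12.raw.mgf": "F078591.dat",
    "OVEMB150205_14.raw.mgf": "F078595.dat",
    "OVEMB150205_16.raw.mgf": "F078597.dat",
    "OVEMB150205_18.raw.mgf": "F078593.dat",
}

# Pasted rows: (peaklist file, condition), in arbitrary order.
pasted_rows = [
    ("OVEMB150205_12.raw.mgf", "UPS1 5fmol"),
    ("OVEMB150205_14.raw.mgf", "UPS1 5fmol"),
    ("OVEMB150205_16.raw.mgf", "UPS1 5fmol"),
    ("OVEMB150205_18.raw.mgf", "UPS1 5fmol"),
    ("OVEMB150205_21.raw.mgf", "UPS1 50fmol"),
    ("OVEMB150205_23.raw.mgf", "UPS1 50fmol"),
    ("OVEMB150205_25.raw.mgf", "UPS1 50fmol"),
    ("OVEMB150205_27.raw.mgf", "UPS1 50fmol"),
]


def group_by_biological_group(dataset_name, files, rows):
    """Return {dataset name: {biological group: [result files]}}."""
    if len(rows) != len(files):
        raise ValueError("The number of rows must match the number of files")
    groups = defaultdict(list)
    for peaklist, condition in rows:
        # A KeyError here means the row refers to a file that is not selected.
        groups[condition].append(files[peaklist])
    return {dataset_name: dict(groups)}


tree = group_by_biological_group("My dataset", selected_files, pasted_rows)
for group, result_files in tree["My dataset"].items():
    print(group, result_files)
```
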

\\
\\

=== Example 2: multiple biological groups and biological samples ===

Let's take a slightly more complex design, introducing biological replicates for each condition:

| Result file | Peaklist File | Biological replicate | Condition |
| F078594.dat | OVEMB150205_21.raw.mgf | BRep 1 | UPS1 50fmol |
| F078596.dat | OVEMB150205_23.raw.mgf | BRep 1 | UPS1 50fmol |
| F078592.dat | OVEMB150205_25.raw.mgf | BRep 2 | UPS1 50fmol |
| F078590.dat | OVEMB150205_27.raw.mgf | BRep 2 | UPS1 50fmol |
| F078591.dat | OVEMB150205_12.raw.mgf | BRep 1 | UPS1 5fmol |
| F078595.dat | OVEMB150205_14.raw.mgf | BRep 1 | UPS1 5fmol |
| F078597.dat | OVEMB150205_16.raw.mgf | BRep 2 | UPS1 5fmol |
| F078593.dat | OVEMB150205_18.raw.mgf | BRep 2 | UPS1 5fmol |

**Note**
In most cases, the definitions are as follows:\\
- biological group = biological condition\\
- biological sample = biological replicate\\
- sample analysis = technical replicate\\
However, these definitions are not strict and you can use and adapt the hierarchy to fit your needs.

  * As shown previously (see [[#Example 1: multiple biological groups | example 1]]), choose the result set mapping property and add the //Biological group// column.
  * In the toolbar box, now select the //Biological sample// annotation, and add it by clicking on //Add//.
  * You can rearrange the column order to match your data table: drag and drop the column headers to move them.
  * Copy/paste the relevant columns of your table in the annotation editor.

{{ :how_to:web:dse_edit_annotations_bio_groups_bio_rep.png?nolink&800 |}}

\\

{{ :how_to:web:dse_edit_annotations_bio_groups_bio_rep_created.png?nolink&300|}}

Click //OK// to register the annotations and see them in the creation panel.\\
Then click //Create Dataset//.\\
The generated dataset will have 3 levels of aggregation:
  * top-level (name: chosen in creation panel)
  * biological groups (names: from annotation editor column)
  * biological samples (names: from annotation editor column)
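
The three levels can be sketched as nested dictionaries. The Python snippet below is only an illustration built from the example table above; the ''build_hierarchy'' function and the dataset name "My dataset" are hypothetical and do not come from the application.

```python
# Rows from the example table: (result file, biological sample, biological group).
rows = [
    ("F078594.dat", "BRep 1", "UPS1 50fmol"),
    ("F078596.dat", "BRep 1", "UPS1 50fmol"),
    ("F078592.dat", "BRep 2", "UPS1 50fmol"),
    ("F078590.dat", "BRep 2", "UPS1 50fmol"),
    ("F078591.dat", "BRep 1", "UPS1 5fmol"),
    ("F078595.dat", "BRep 1", "UPS1 5fmol"),
    ("F078597.dat", "BRep 2", "UPS1 5fmol"),
    ("F078593.dat", "BRep 2", "UPS1 5fmol"),
]


def build_hierarchy(dataset_name, annotated_rows):
    """Nest result files as top-level > biological group > biological sample."""
    groups = {}
    for result_file, sample, group in annotated_rows:
        groups.setdefault(group, {}).setdefault(sample, []).append(result_file)
    return {dataset_name: groups}


hierarchy = build_hierarchy("My dataset", rows)
for group, samples in hierarchy["My dataset"].items():
    for sample, result_files in samples.items():
        print(group, "/", sample, "/", result_files)
```
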