Differences

This shows you the differences between two versions of the page.

--- how_to:web:createaggregate [2014/09/30 10:39]
193.48.0.3
+++ how_to:web:createaggregate [2016/07/05 15:54] (current)
193.48.0.3
@@ Line 1: / Line 1: @@
-====== Import Result Files  ======
+====== Identification Tree and Dataset creation ======
-The first thing to do in your brand new project is to import Result Files.
+Once your Result Files have been imported, you can use them to create a new Identification Dataset.
-To do so, Click on the “Import Result File” Button, in the toolbar of the project overview window or via a "right-click" on the Project node of the tree
+To create it from result files, you can:
+  * right-click the //Identification Trees// node in the project tree, then select //New Dataset// or
+  * double-click the //Identification Trees// node to show the identifications grid. This grid lists all your datasets, along with their validation results once validation has been run. For now, click the //New Dataset// button in the toolbar.
-{{ :how_to:web:dse_import_result_file_btn.png?600 |}}
+{{ :how_to:web:dse_new_dataset.png?nolink&1000 |}}
-You should then see this panel appear :
+Note that you can also create an empty dataset and then assemble complex structures by drag and drop in the project tree. However, this approach is not recommended.
-{{ :how_to:web:dse_import_result_file_form.png?700 |}}
+{{:how_to:web:dse_new_dataset_based_on_sresults.png?nolink&200 |}}
+\\
+\\
+You should now see a window asking you to choose a source of data for your new Dataset.\\
+The only option currently available is //Result Set//: it allows you to build a new dataset from the Result Files you have imported.
+Click //OK// to continue.
-It allows you to select Result Files and set up following parameters for the import process :
+{{ :how_to:web:dse_new_dataset_creation_panel_empty.png?nolink&1000 |}}
-  * Parameters
+\\
-    * Software Engine : the software which generated your interrogation file
-    * Instrument : mass-spectrometer used for sample analysis
-    * Peaklist Software : the software used for the peaklist creation
-  * Decoy Parameters
+===== Create a simple dataset (single group) =====
-    * Decoy Strategy : TODO
-    * Protein Match Decoy Rule : TODO
-  * Parser Parameters : according to your Software Engine, this will display some extra-parameters.
+==== Dataset properties ====
-    * Mascot :
-      * Ion Score Cutoff : TODO
-      * Subset Threshold : TODO
-      * Mascot Server URL : TODO
-    * Omssa :
-      * User mod file : TODO
-      * PTM Composition File : TODO
-  * Check Files before Import : let this checked to ensure that your files contain no errors. The server will perform a check operation before launching the import.
+In the right panel, two text fields allow you to enter a name and description (optional) for your dataset.
+By default, the result sets will be named in the tree using the result file name. You can change that with the //Name child results using// box. The possibilities are: peaklist file name, raw file identifier, result file name, sample, search title.
+=== Files selection ===
-In order to add files to your import selection, click on “Select Result File” to open the File Browser that will let you choose one or many result files to import :
+To add one or more files to your selection, select them in the grid (you can use the Ctrl and Shift keys to select several at once), then click //Add to dataset// (top-right of the panel). You can also double-click a file to quickly add it to the selection.
-{{ :how_to:web:dse_import_result_file_select_file.png?direct |}}
+To remove files from the selection, select them and click //Remove selected Items//.
-The left side let you browse the directories, and when you click on one of them, its content is shown on the main panel. Choose one or multiple files then click “Ok”.
+=== Created dataset ===
-Back to the “Import Result File” window, you should now see your selection appear in the grid.
+The creation of your identification dataset happens as follows:
-Choose the instrument and the peaklist software corresponding to your files, and then select a “Decoy Strategy”. You can now click on “Start Import” to launch the check and the import tasks.
+  * An “Aggregation” Dataset Node is created. It takes the name that you provided during the creation.
+  * One “Identification” Dataset Node is created for each of the Result Files you have selected. By default, each node takes the name of its Result File.
-The server will check your files first, then the import itself will be launched automatically.
+{{:how_to:web:dse_newly_created_dataset.png?nolink&250 |}}
-You can follow the current state of your tasks by clicking on the small cake in the bottom right corner of the desktop screen. It opens a small grid where you can see all your tasks.
+Once your Identification Dataset has been created, you can see it in the tree, on the left side of the window. You may need to collapse and re-expand the //Identifications// node for it to appear.\\
+The white icon lets you know that it is not yet validated (it turns green once validated).
-{{ :how_to:web:dse_import_result_file_tasks.png?direct |}}
-When a task is done, you are notified by a small message in the top of the screen, and you can see its status in the tasks window :
+Double-click the aggregation node to open the identification summary. This panel shows a list of your Identification fractions (one per imported file) and, after the validation process, it will display the Merged Result Summary information.
-{{ :how_to:web:dse_import_finished.png?direct |}}
+\\
+\\
+\\
-All the Result Files you have imported are listed in the "Search Results" panel you can access by double-clicking the "Search Results" node in your project's tree.
+===== Create a complex dataset (automatic grouping) =====
-====== Create a new Dataset  ======
+==== Dataset properties and files selection ====
-Once your Result Files have been imported, you can use them to create a new Identification Dataset. Double-Click on the “Identification Trees” element in the tree, on the left side of the window. The grid which just appeared is meant to list all your datasets. For now, click on the "New Dataset" button from the bar, or by right-clicking on the "Identification Trees" node.
+Refer to the [[#Create a simple dataset (single group)|previous section]] for these steps.
-{{ :how_to:web:dse_new_identification.png?direct |}}
+==== Add annotations to files ====
-You should now see a window asking to choose a source of data for your new Dataset.
+Once you have selected your result files, click on //Add annotations//. A window shows up, with as many empty lines as selected files.
-  * Choosing “Result Set” allows you to build a new dataset from both Result Files you have imported and existing identification datasets, whether they have been validated or not.
-  * Choosing “Result Summary” will let you build a new dataset from one or more existing validated identification dataset. Il will duplicate them into a new Dataset without their validation data.
-For now, assuming you are creating your first identification dataset, you should choose the “Result Set” option. On the two-tabbed panel, go to the “Result Sets” tabs in order to see the list of the files you have imported.
+{{ :how_to:web:dse_edit_annotations.png?nolink&700 |}}
-{{ :how_to:web:dse_new_dataset_form.png?direct |}}
-To add one or many files to your selection, select them in the grid (you can use the Ctrl and the Shift keys to make a multiple selection), then click on “Add to Selection” on the bottom of the window. You can also double click on one file to quickly add it to the selection.
+=== Example 1: multiple biological groups ===
-To remove any file from the selection, just select them and click on “Remove selected Items”.
+Let's say you are comparing 2 conditions, and you have the following result table (in an Excel file, for instance). The idea is to copy/paste this table into the annotation editor.
-Type a name for your dataset then click on “Create”.
-The creation of your identification dataset happens as follows :
+| Result file | Peaklist File | Condition |
-  * An “Aggreagation” Dataset Node is created. It takes the name that you provided during the creation.
+| F078594.dat | OVEMB150205_21.raw.mgf | UPS1 50fmol |
-  * One “Identification” Dataset Node is created for each one of the Result Files you have selected. They take the name of the Result File.
+| F078596.dat | OVEMB150205_23.raw.mgf | UPS1 50fmol |
+| F078592.dat | OVEMB150205_25.raw.mgf | UPS1 50fmol |
+| F078590.dat | OVEMB150205_27.raw.mgf | UPS1 50fmol |
+| F078591.dat | OVEMB150205_12.raw.mgf | UPS1 5fmol |
+| F078595.dat | OVEMB150205_14.raw.mgf | UPS1 5fmol |
+| F078597.dat | OVEMB150205_16.raw.mgf | UPS1 5fmol |
+| F078593.dat | OVEMB150205_18.raw.mgf | UPS1 5fmol |
+  * Select a property to identify the result sets with the //Map annotations using...// box. It doesn't need to be the same property as the one used to name the dataset children (see [[#Dataset properties|dataset properties]]), so long as it refers to the selected files. In this example, it is convenient to select "Peaklist file name".
+  * In the toolbar at the top of the grid, the annotation //Biological group// is selected by default. Click //Add// to add this column to the grid.
+  * Copy the "Peaklist File" and "Condition" columns of your table, and paste them (using Ctrl + V) into the empty grid of the annotation editor. The order of the files doesn't matter, so long as the number of rows matches the number of selected files.
+  * The window should now look like this:
+{{ :how_to:web:dse_edit_annotations_bio_groups.png?nolink&550 |}}
+Click //OK// to register these annotations. The window closes and the file selection is now annotated as shown below (on the left). Click //Create Dataset//. The generated dataset (below, on the right) will have 2 levels of aggregation: biological group (with the copy/pasted names) and top-level (the name chosen for the dataset).
-Once your Identification Dataset has been created, you can see it on the tree, in the left side of the window.
+{{ :how_to:web:dse_edit_annotations_bio_groups_created.png?nolink&700 |}}
-{{ :how_to:web:dse_new_dataset_tree.png?direct |}}
+\\
+\\
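The mapping step of Example 1 can be pictured with a short Python sketch. It is only an illustration of the idea, not how Proline implements it: the names ''SELECTED'', ''PASTED'' and ''group_by_annotation'' are hypothetical, and the pairing of peaklist files with result files is taken from the table above. The pasted rows are deliberately shuffled to show that only the mapping property and the row count matter.

```python
from collections import defaultdict

# Hypothetical stand-in for the files selected in the creation panel:
# peaklist file name (the chosen mapping property) -> result file.
SELECTED = {
    "OVEMB150205_21.raw.mgf": "F078594.dat",
    "OVEMB150205_23.raw.mgf": "F078596.dat",
    "OVEMB150205_25.raw.mgf": "F078592.dat",
    "OVEMB150205_27.raw.mgf": "F078590.dat",
    "OVEMB150205_12.raw.mgf": "F078591.dat",
    "OVEMB150205_14.raw.mgf": "F078595.dat",
    "OVEMB150205_16.raw.mgf": "F078597.dat",
    "OVEMB150205_18.raw.mgf": "F078593.dat",
}

# Two spreadsheet columns copied to the clipboard arrive as
# tab-separated lines. The row order is shuffled on purpose.
PASTED = "\n".join([
    "OVEMB150205_12.raw.mgf\tUPS1 5fmol",
    "OVEMB150205_21.raw.mgf\tUPS1 50fmol",
    "OVEMB150205_14.raw.mgf\tUPS1 5fmol",
    "OVEMB150205_23.raw.mgf\tUPS1 50fmol",
    "OVEMB150205_16.raw.mgf\tUPS1 5fmol",
    "OVEMB150205_25.raw.mgf\tUPS1 50fmol",
    "OVEMB150205_18.raw.mgf\tUPS1 5fmol",
    "OVEMB150205_27.raw.mgf\tUPS1 50fmol",
])


def group_by_annotation(selected, pasted):
    """Return {biological group: [result files]} from pasted rows."""
    rows = [line.split("\t") for line in pasted.splitlines() if line.strip()]
    # One annotation row is expected per selected file.
    if len(rows) != len(selected):
        raise ValueError("row count does not match the number of selected files")
    groups = defaultdict(list)
    for peaklist, condition in rows:
        # A peaklist name that is not in the selection raises KeyError.
        groups[condition.strip()].append(selected[peaklist.strip()])
    return dict(groups)


groups = group_by_annotation(SELECTED, PASTED)
for name, files in groups.items():
    print(name, sorted(files))
```

Each key of ''groups'' corresponds to one biological group node, sitting under the top-level node that carries the dataset name.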
-The panel of the Aggregation Node shows a list of your Identification fractions (corresponding to each imported file) and, after the validation process, it will display the Merged Result Summary infos.
+=== Example 2: multiple biological groups and biological samples ===
+Let's take a slightly more complex design, introducing biological replicates for each condition:
+| Result file | Peaklist File | Biological replicate | Condition |
+| F078594.dat | OVEMB150205_21.raw.mgf | BRep 1 | UPS1 50fmol |
+| F078596.dat | OVEMB150205_23.raw.mgf | BRep 1 | UPS1 50fmol |
+| F078592.dat | OVEMB150205_25.raw.mgf | BRep 2 | UPS1 50fmol |
+| F078590.dat | OVEMB150205_27.raw.mgf | BRep 2 | UPS1 50fmol |
+| F078591.dat | OVEMB150205_12.raw.mgf | BRep 1 | UPS1 5fmol |
+| F078595.dat | OVEMB150205_14.raw.mgf | BRep 1 | UPS1 5fmol |
+| F078597.dat | OVEMB150205_16.raw.mgf | BRep 2 | UPS1 5fmol |
+| F078593.dat | OVEMB150205_18.raw.mgf | BRep 2 | UPS1 5fmol |
+**Note**
+In most cases, the definitions are as follows:\\
+- biological group = biological condition\\
+- biological sample = biological replicate\\
+- sample analysis = technical replicate\\
+However, these definitions are not strict and you can use and adapt the hierarchy to fit your needs.
+  * As shown previously (see [[#Example 1: multiple biological groups | example 1]]), choose the result set mapping property and add the //Biological group// column.
+  * In the toolbar box, now select the //Biological sample// annotation, and add it by clicking on //Add//.
+  * You can rearrange the column order to match your data table: drag and drop the column headers to move them.
+  * Copy/paste the relevant columns of your table into the annotation editor.
+{{ :how_to:web:dse_edit_annotations_bio_groups_bio_rep.png?nolink&800 |}}
+\\
+{{ :how_to:web:dse_edit_annotations_bio_groups_bio_rep_created.png?nolink&300|}}
+Click //OK// to register the annotations and see them in the creation panel.\\
+Then click //Create Dataset//.\\
+The generated dataset will have 3 levels of aggregation:
+  * top-level (name: chosen in creation panel)
+  * biological groups (names: from annotation editor column)
+  * biological samples (names: from annotation editor column)
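The three levels listed above can be pictured as nested dictionaries. The Python sketch below is an illustration under stated assumptions rather than Proline's actual data model: ''build_tree'' and the dataset name ''"My dataset"'' are hypothetical, and the rows are copied from the Example 2 table.

```python
# Rows from the Example 2 table: (result file, biological sample, biological group).
ROWS = [
    ("F078594.dat", "BRep 1", "UPS1 50fmol"),
    ("F078596.dat", "BRep 1", "UPS1 50fmol"),
    ("F078592.dat", "BRep 2", "UPS1 50fmol"),
    ("F078590.dat", "BRep 2", "UPS1 50fmol"),
    ("F078591.dat", "BRep 1", "UPS1 5fmol"),
    ("F078595.dat", "BRep 1", "UPS1 5fmol"),
    ("F078597.dat", "BRep 2", "UPS1 5fmol"),
    ("F078593.dat", "BRep 2", "UPS1 5fmol"),
]


def build_tree(dataset_name, annotated_rows):
    """Nest result files as top-level -> biological group -> biological sample."""
    groups = {}
    for result_file, sample, group in annotated_rows:
        # setdefault creates each aggregation level the first time it is seen.
        groups.setdefault(group, {}).setdefault(sample, []).append(result_file)
    return {dataset_name: groups}


# "My dataset" is a placeholder for the name typed in the creation panel.
tree = build_tree("My dataset", ROWS)
for group, samples in tree["My dataset"].items():
    for sample, files in samples.items():
        print(group, "/", sample, "/", files)
```

Because the samples are nested under their group, the same sample label (here ''BRep 1'') can appear under both conditions without the two being merged.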
