New Implementation Study will improve reusability of proteomics datasets

ELIXIR Belgium is involved in a new Implementation Study, “Crowd-sourcing the annotation of public proteomics datasets to improve data reusability”, which aims to improve the reusability of public proteomics datasets. Ten ELIXIR nodes are involved in this study; Lennart Martens will represent ELIXIR Belgium. The study will start on 1 June 2018 and run for one year.


Scope of the study

This Implementation Study aims to improve the reusability of public proteomics datasets by substantially increasing the amount and quality of technical and biological annotations for datasets stored in the PRIDE database.

The PRIDE database is the world-leading repository for mass spectrometry proteomics data and is one of ELIXIR’s core data resources. PRIDE received 2,443 dataset submissions in 2017. The large amount of data downloaded from PRIDE in 2017, 295 TB, illustrates the growing reuse and reanalysis of these data. However, proteomics data reuse is still constrained by the limited technical and biological annotations available for PRIDE datasets. A user-friendly mechanism is needed that lets users improve the annotation quality of these datasets.

The current Implementation Study has four objectives. First, the ELIXIR nodes will develop an a posteriori annotation system for PRIDE, covering both technical and biological metadata, which will build on existing tools and pipelines developed by different ELIXIR nodes. Second, they will create data structures that can capture the most frequently used experimental designs in proteomics studies. Third, they will build an API that allows annotation tools to be developed easily. Fourth, they will reach out to actively involve the whole proteomics community in the annotation process. Taken together, these steps should substantially improve the reusability of public proteomics datasets.


More information on ELIXIR Implementation Studies

Implementation Studies are short technical projects to inform future service development, drive standards adoption, and connect ELIXIR’s 21 Nodes. The projects are funded through the budget of the ELIXIR Hub and form part of ELIXIR’s ongoing activities in a particular Platform or Community. They are proposed by Platforms, agreed with the ELIXIR Heads of Nodes committee, and approved by the ELIXIR Board.

About proteomics and ELIXIR
Proteomics activities are well represented within Europe and the ELIXIR nodes. ELIXIR recently started a Proteomics Community.