Research data lifecycle

The research data lifecycle is a scheme describing the stages that research data pass through over the course of the research process, i.e. before, during and after an experiment or project. Not every research project involves every stage of the lifecycle. Unless there is a reason to restrict them, data should be made available for reuse, ideally by anyone. Opening up research data means following principles of good governance, legal standards, recommendations and criteria that together determine the level of future reuse possible for specific data. In general, the data lifecycle consists of several phases (see Figures), each supported by different technical tools.

A useful source of information on the tools that support each phase of the research data lifecycle is the Research Data Management Kit (RDMkit), which was developed by ELIXIR to support life scientists in their efforts to improve research data management and fulfil the FAIR principles.

A similar resource, with a slightly different emphasis across the phases of the research cycle, is the Data Management Expert Guide (DMEG), produced by CESSDA ERIC, a consortium of European social science data archives, and therefore focused on the social sciences.

  1. Planning: Designing the experiment on the basis of existing knowledge and data, planning data management, setting conditions that allow the resulting data to be shared later, and designing how the data produced in the experiment will be collected.
  2. Data creation: Creating scientific data within a specific application of the scientific method, or acquiring data from third parties for the purpose of the experiment.
  3. Data processing: Primary processing of the raw data, i.e. cleaning out noise, creating data documentation (metadata describing the individual scientific data), and converting the data into the required formats.
  4. Data analysis: Analysis and interpretation of the processed data in order to test the hypotheses and evaluate the conclusions of the experiment.
  5. Data preservation: Proper long-term preservation of the necessary scientific data generated by a specific application of the scientific method, including in particular the technical processing of the scientific data, the preparation of the data for long-term preservation and the selection of a suitable repository.
  6. Data sharing: Making research data available for collaboration within a collaborative research project or with the wider research community and society at large. Sharing research data does not necessarily mean that the data is accessible to everyone and for any purpose; data may be shared with limited access.
  7. Data reuse: The use of data for further research or other purposes. The degree of data reuse is determined by the degree of data disclosure in step 6.
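
The dependency between steps 6 and 7 can be illustrated with a small sketch. This is only one possible way to model the phases listed above, not part of RDMkit or DMEG; the names `Phase`, `Access`, `Dataset` and `reuse_scope`, as well as the three access levels, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Phase(IntEnum):
    """The seven lifecycle phases, numbered as in the list above."""
    PLANNING = 1
    CREATION = 2
    PROCESSING = 3
    ANALYSIS = 4
    PRESERVATION = 5
    SHARING = 6
    REUSE = 7


class Access(Enum):
    """Hypothetical access levels chosen when data are shared (phase 6)."""
    CLOSED = "closed"          # not shared outside the project
    RESTRICTED = "restricted"  # shared with limited access
    OPEN = "open"              # available to anyone


@dataclass
class Dataset:
    name: str
    access: Access = Access.CLOSED
    # Phases the dataset has passed through; a project may skip some.
    completed: tuple = ()


def reuse_scope(dataset: Dataset) -> str:
    """Return who may reuse the data, derived from how they were shared."""
    if Phase.SHARING not in dataset.completed or dataset.access is Access.CLOSED:
        return "project team only"
    if dataset.access is Access.RESTRICTED:
        return "approved users only"
    return "anyone"


# A dataset that skipped some phases and was shared with limited access.
survey = Dataset(
    name="survey",
    access=Access.RESTRICTED,
    completed=(Phase.PLANNING, Phase.CREATION, Phase.SHARING),
)
print(reuse_scope(survey))
```

In this sketch a dataset need not pass through every phase, and the scope of reuse follows entirely from the access level set at the sharing phase.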

 


Based on input from RDMkit, DMEG and outputs from WG Education – EOSC CZ and Marek, Jiří. 2020. Masaryk University, Faculty of Law.