Data Stewards

A Data Steward in academia is responsible for the management and quality of data within a research team. The role covers a wide range of tasks, from creating a Data Management Plan to ensuring that data is stored correctly and securely. In addition, there are Data Stewards at the central level of academic institutions, typically in libraries, who support researchers and other Data Stewards, oversee the quality of training materials within the Data Steward community, and provide consultancy services.

In academia, Data Stewards act as a communication hub between individual researchers at institutions and other parties such as stakeholders, policy makers, infrastructure providers (local and external) and software developers (IT departments).

The Data Steward is still a fairly new profession in the Czech Republic. Therefore, we at the Academy of Sciences Library provide support to those who need advice on, for example, creating and managing Data Management Plans (specifically through the FAIR Wizard tool), selecting appropriate repositories, using the institutional repository we operate (the ASEP Repository of the CAS), meeting Open Science requirements, or setting internal data policies.

Why is a Data Steward needed on your team?

  • Better (meta)data quality;
  • Better data documentation;
  • Clear policies and processes for handling (meta)data;
  • Assurance that data handling complies with regulations;
  • Better data security;
  • Assurance that collected data can be reused.

Data Steward skills

  • Effective communication;
  • Knowledge of databases;
  • The ability to gain insight into the issues that their team, or the department they work with, is addressing;
  • A basic understanding of legislation and other regulations governing the handling of data;
  • The ability to solve problems and challenges.

A Data Steward should have a certain amount of technical knowledge: for example, it is useful to keep up to date with different types of storage and to know how to store data correctly and securely. A Data Steward should understand data formats. Knowledge of databases is also useful; where needed, the Data Steward should understand database structures and be able to identify corrupted data.
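One common way to identify corrupted files is to record a checksum for each file when it is stored and to compare it with a freshly computed checksum later. The following is only a minimal sketch of that idea: the choice of SHA-256, the function names `sha256_of` and `find_corrupted`, and the manifest layout (file name mapped to expected checksum) are illustrative assumptions, not a prescribed procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(manifest: dict[str, str], root: Path) -> list[str]:
    """Return file names that are missing or whose checksum differs.

    `manifest` is a hypothetical mapping of file name -> checksum
    recorded at the time the data was stored.
    """
    corrupted = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            corrupted.append(name)
    return corrupted
```

In practice the manifest would be created once, when the data is deposited, and the check repeated at regular intervals or after every transfer between storage systems.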

A Data Steward should also understand the needs of the research teams they work with and communicate effectively with them, for example about how their data will be handled. A Data Steward should also be able to explain new processes to the teams and implement them effectively.

It is indispensable for a Data Steward to keep abreast of the legislative requirements for data handling, to know the funders' data handling requirements, to comply with the institution's regulations, and to maintain best practice within the team. Ideally, data is handled according to the FAIR principles (Findable, Accessible, Interoperable, Reusable).

The knowledge a Data Steward needs may vary slightly depending on the field in which they work. A Data Steward may secure and handle qualitative data in one way and big data in another.

A Data Steward is not a data analyst

The roles of Data Steward and Data Analyst are not the same. Although both work with data, a Data Analyst organizes and analyzes data to gain insights and to draw conclusions or make predictions that support business decisions. A Data Steward does not analyze the data but ensures that it is stored securely and correctly and is easily accessible to everyone who is meant to have access to it.

Data Steward’s job description

Creating data processes

Data Stewards standardize data collection processes and set uniform rules for data use and manipulation.

Data protection

Data Stewards are responsible for data maintenance and protection. They help to remove duplicates and imperfections, detect abnormalities in data, and prevent data loss or damage. They can also provide information about potential risks to data security.
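As an illustration of the routine checks described above, the sketch below removes exact duplicate records and flags values that fall outside a plausible range. The record layout, the field names `sample_id` and `temp_c`, and the range limits are hypothetical; real checks depend on the dataset and on rules agreed with the research team.

```python
def drop_duplicates(records: list[dict]) -> list[dict]:
    """Remove exact duplicate records, keeping the first occurrence."""
    seen = set()
    unique = []
    for record in records:
        # Build a hashable, order-independent key from the record's fields.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def flag_out_of_range(records: list[dict], field: str,
                      low: float, high: float) -> list[dict]:
    """Return records whose value for `field` lies outside [low, high]."""
    return [r for r in records if not (low <= r[field] <= high)]


# Hypothetical measurements: one exact duplicate and one implausible value.
samples = [
    {"sample_id": "S1", "temp_c": 21.5},
    {"sample_id": "S1", "temp_c": 21.5},
    {"sample_id": "S2", "temp_c": 215.0},
]

unique_samples = drop_duplicates(samples)
suspicious = flag_out_of_range(unique_samples, "temp_c", low=-50.0, high=60.0)
```

Flagged records are returned rather than deleted, so that the research team can decide whether a value is an entry error or a genuine measurement.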

Data lineage management

Data lineage is the process of tracking the origin of data and recording every instance of its use. By managing data lineage, Data Stewards can detect inaccuracies or problems relating to when and how the data was entered (in what format or program) and thus correct them more quickly and efficiently.
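A lineage record can be as simple as an append-only log of events per dataset. The sketch below shows one possible shape for such a log; the class names `LineageEvent` and `LineageLog`, the chosen fields, and the example values are assumptions made for illustration and do not describe any particular lineage tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageEvent:
    """One recorded use of a dataset (hypothetical minimal schema)."""
    dataset: str
    action: str        # e.g. "created", "converted", "accessed"
    actor: str         # who performed the action
    tool: str          # program or format involved
    timestamp: datetime


@dataclass
class LineageLog:
    """Append-only list of lineage events."""
    events: list[LineageEvent] = field(default_factory=list)

    def record(self, dataset: str, action: str, actor: str, tool: str) -> LineageEvent:
        """Append a new event stamped with the current UTC time."""
        event = LineageEvent(dataset, action, actor, tool,
                             datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def history(self, dataset: str) -> list[LineageEvent]:
        """Return all events for one dataset in the order they were recorded."""
        return [e for e in self.events if e.dataset == dataset]


log = LineageLog()
log.record("survey_2024.csv", "created", "researcher_a", "LimeSurvey export")
log.record("survey_2024.csv", "converted", "data_steward", "CSV to Parquet")
log.record("interviews.zip", "accessed", "researcher_b", "archive download")
```

Calling `log.history("survey_2024.csv")` then returns the two events for that dataset in order, which is the information needed to trace when and with which program a problem was introduced.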

Maintaining data quality

Data Stewards use customer feedback and queries to create systems for maintaining high data quality. They report on internal metrics to the appropriate members of the organization or scientific team and regularly identify, track, and evaluate issues.

Community of Data Stewards in the Czech Republic

In the Czech Republic, a community of Data Stewards has already been established. Its members meet several times a year, online and in person, and communicate regularly via the Discord platform (https://discord.gg/eMpzXFRaPn) to exchange updates and best practices. Some members are also involved in testing and evaluating the outputs of the National Repository Platform (NRP), thus contributing to its improvement and more effective use.

The Data Steward position and its important role in the management of research data have also been featured on the official EOSC CZ website.


Compiled from:

Data Stewards: Overview – The Turing Way, online: 2. 9. 2024;

Data Steward, UK, online: 2. 9. 2024;

Data Steward course, DocEnhance NTK;

Salome Scholtens (2019) "Final report: Towards FAIR data steward as profession for the lifesciences. Report of a ZonMw funded collaborative approach built on existing expertise".