Research Data Management
Research Data Management (RDM) is one of the most important issues facing researchers today. In the era of open science, researchers are increasingly expected to make the results of their research available to the scholarly community and to the public; furthermore, publishers and funders are increasingly insisting upon it. This presents our research community with both opportunities and challenges. Through a process of consultation, planning, and implementation, St. Thomas University is moving toward developing an institutional RDM strategy that will help our faculty researchers achieve their data management goals.
(image courtesy of science.gc.ca)
Research data management involves the careful consideration of how data are organized throughout the entire research process:
Plan: Create a Data Management Plan (DMP) that considers what types of data will be collected, in what formats, and with what documentation, taking ethical, legal, and financial considerations into account.
Create: Produce data in accordance with your methodology; also produce associated metadata in accordance with accepted standards.
Process: Data are checked, validated, anonymized, and digitized.
Analyze: Data are interpreted to generate research findings.
Preserve: Data are saved to the best possible formats, metadata and digital identifiers are added, and due consideration is given to data security and intellectual property matters.
Share: Data (along with documentation and metadata) are deposited in a controlled repository under appropriate access conditions.
Reuse: Data may be discovered, obtained (if appropriate), and reanalyzed, with correct data citation.
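One small, practical way to support the Preserve and Share stages is to record a checksum for every data file, so that a later copy (a backup, a repository deposit) can be confirmed to match the original. The following is a minimal sketch in Python, not a prescribed STU or Tri-Agency method. The function names (sha256_of, build_manifest, changed_files) are hypothetical and introduced only for illustration, and the choice of SHA-256 is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A researcher might call build_manifest on a project folder before depositing or backing up the data, store the result alongside the documentation, and later run changed_files against the copy to see whether any file has gone missing or been altered.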
Why would I want to do this?
There are a number of reasons to implement a research data management plan:
- Preserving and protecting your data
- Complying with funder requirements
- Benefit to the scholarly community and to society
Preserving and protecting your data
Most of us don’t like to think about data loss, but we can easily imagine the professional disaster that would ensue. There are heartbreaking documented cases. In 2005, a fire at the University of Southampton destroyed one of its most important research facilities. A doctoral student in biology lost all of his research data, 400 GB worth, to a burglary, and as a result failed to get his PhD. In a rather dramatic case, an American student’s laptop took three rounds from an Israeli border security guard’s pistol as she was crossing from Egypt. In 2021, 77 TB of research data were lost at Kyoto University due to a supercomputer backup error – a total of 34 million files from 14 different research groups. If you value your research data, consider what steps you can take to guard against an unlikely but possible disaster.
Complying with funder requirements
The Tri-Agencies have released their Research Data Management Policy, which consists of three main components. First, all institutions eligible to administer Tri-Agency funding must develop an institutional strategy by March 1, 2023. This strategy is a roadmap that outlines how St. Thomas University (STU) will support best practices in RDM. Second, Tri-Agency funding opportunities will require data management plans (DMPs) at the time of application, in which applicants will describe how research data will be collected, documented, and preserved. To date, very few Tri-Agency programs have required DMPs, but we can probably expect some of our bread-and-butter programs (like SSHRC IDG) to require them by 2024. Third, grantees will be required to deposit all research data, metadata, and code that support conclusions in journal publications into a digital repository. Researchers will not be required to share their data, but they are expected to provide access to research data wherever it is ethically, culturally, and legally possible.
Benefit to the scholarly community and to society
The global movement toward open science encourages researchers to make their data “as open as possible, and as closed as necessary”. There are obvious societal benefits when research data are shared as widely as possible; for example, the full genome of the SARS-CoV-2 coronavirus was shared with the scientific community less than a month after the first COVID patient was admitted to a Wuhan hospital. Open science has the potential to increase the public’s trust in the scientific community and to make access to research data far more democratic.
One framework that underpins research data management is a set of principles known by the acronym FAIR – Findable, Accessible, Interoperable, and Reusable. The FAIR principles are intended to ensure that research data are managed in a way that maximizes benefit to the scholarly community and to society more broadly. Even with sensitive data, which are common in qualitative research, it is possible to abide by FAIR principles. A great guide to the management of sensitive research data has been developed by the University of Ottawa library.
Another important framework for some researchers to consider is CARE, the principles for Indigenous Data Governance (Collective Benefit, Authority to Control, Responsibility, and Ethics). The CARE principles focus on Indigenous rights and self-determination in the emerging context of open science. A specifically Canadian counterpart is the set of First Nations principles of OCAP (Ownership, Control, Access, and Possession), which establish how First Nations’ data should be collected, protected, used, or shared.
For more information on research data management, a comprehensive set of resources has been created by the Digital Research Alliance of Canada.
St. Thomas University has developed its first draft Research Data Management Institutional Strategy, which can be found HERE. This document is intended to raise awareness of RDM issues at STU, to highlight RDM best practices, to describe the current state of RDM culture and resources on campus, to identify gaps and challenges, and to envision an ideal future state in which all research data management needs of STU’s research community are recognized and addressed. The strategy is a “living document” and will continue to be evaluated and updated based on feedback from the STU community and the latest developments in the field.