Managing your Research Data
What is Research Data Management?
Research Data Management concerns the handling of research data throughout the research lifecycle, from its collection through to storage and potentially reuse. Research data covers a wide range of information types, including:
- Spreadsheets
- Laboratory notebooks or field notes
- Questionnaires, transcripts or codebooks
- Audio or visual files
- Protein or genetic sequences
- Slides, artifacts, specimens or samples
- Models, algorithms or scripts
Elsevier - Hierarchy of research data needs
Research Data Management Plans
Creating a Research Data Management Plan is the first step to ensuring that:
- Your research data is stored and backed up appropriately and safely
- Your research data is useful to others (should you choose to share it)
- Your research data is findable, understandable and useable to collaborators and other researchers
- Your research is compliant with ethical codes, funder policies and journal requirements
- Your research methods will stand up to scrutiny over time
Examples of Data Management Tools:
DCC Checklist for a Data Management Plan
Collecting and Analysing your Data
It's important to think carefully about your data in the early stages of your project. You need to consider:
- What file types your data will be produced in, and what implications this might have for software
- What volume of data you can expect, and how this will be stored
- How you'll manage the data in order to facilitate its analysis
- How your data will be checked and verified
- What you'll need to do to ensure you're complying with the requirements of the University and any other stakeholders such as funders or collaborating organisations
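One possible way to check and verify data files, sketched here as an illustration rather than a University-recommended method, is to record a checksum for every file when it is collected and compare against that record later. The function names (`sha256_of`, `build_manifest`, `changed_files`) and the idea of a "manifest" are assumptions introduced for this example; only Python's standard library is used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or have different contents."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built at collection time could be saved alongside the data and re-checked with `changed_files` after each copy, transfer or restore from backup. Note that this only detects accidental changes to files; it does not validate that the values in the data are correct.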
Describing your Data
Your data needs to be described and organised in such a way that someone else could understand and re-use it, or at least interpret your results for themselves. In addition to explaining your methods for data collection, preparation and analysis, it's also important to provide descriptive information about your data.
This could include:
- File names and versions
- Variable descriptions, data types and values
- Location of header columns
- Explanations of codes or classification systems
- Explanations of missing values
This information could be embedded in the documents themselves, or recorded in a separate "readme" file or data dictionary. In either case, it should be accessible alongside your data.
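As a rough sketch of what a data dictionary might contain, the example below drafts one from a CSV file: one entry per variable, with a guessed data type and a count of missing values. The function names, the sample column names and the choice of `""` and `"NA"` as missing-value codes are all assumptions made for illustration; the descriptions themselves still need to be written by the researcher.

```python
import csv
import io


def infer_type(values: list[str]) -> str:
    """Guess a simple type label from a column's non-missing values."""
    if not values:
        return "empty"
    for label, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return label
        except ValueError:
            continue
    return "text"


def describe_columns(
    csv_text: str,
    missing_codes: frozenset[str] = frozenset({"", "NA"}),
) -> list[dict]:
    """Draft one data-dictionary entry per column of a CSV file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    dictionary = []
    for name in reader.fieldnames or []:
        # Short rows yield None for absent cells; treat those as missing.
        column = [row[name] or "" for row in rows]
        present = [value for value in column if value not in missing_codes]
        dictionary.append({
            "variable": name,
            "type": infer_type(present),
            "missing": len(column) - len(present),
            "description": "",  # to be filled in by the researcher
        })
    return dictionary


# Hypothetical sample data, used only to show the shape of the output.
sample = "participant_id,age,site\n1,34,north\n2,NA,south\n3,29,\n"
for entry in describe_columns(sample):
    print(entry)
```

The type guess is deliberately crude, so the draft should be reviewed by hand, and explanations of codes, classification systems and missing-value conventions added before the dictionary is shared.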
See also:
Organising data (UK Data Service)
Version control and authenticity (UK Data Service)
Storing your Data
The way you store your data can have implications for accessibility and security. There are also logistical considerations: it's not unusual for a single data set to be over a terabyte.
The University provides several options for the storage of data. When deciding which to use for your project consider:
- Who needs to access the data
- How much data you expect to have
- Where you will need to access the data from
- How long the data will need to be kept
- How the data will be backed up
In the first instance, talk to your department's ICT Relationship Manager about which option is best for you. For more information, see the University's Storage Guidelines for Staff.
Sharing your Data
Increasingly, funding agencies require some degree of public access to research data. Publishers may also require authors to submit data alongside their manuscript.
There are many good reasons to share your data even without a mandate to do so. Data can and should be cited just like any other resource, resulting in more visibility and greater recognition for the original researcher. The reuse of data represents considerable value to the academic community because research is not being duplicated needlessly, saving everyone time and money. It also showcases the researcher's contribution as an academic citizen and can foster new collaborations.
There are many ways to share your data. You could publish it on a personal webpage or online research profile, or deposit it in a repository. The Registry of Research Data Repositories and the Repository Finder are tools for discovering data repositories.
Whichever route you choose, it's important to ensure that your dataset carries a DOI (Digital Object Identifier) so that it can be cited by researchers who use it.
File Obsolescence
File obsolescence occurs when a digital resource can no longer be read because the hardware or software required to read it is no longer available. The ANDS File format guide provides guidance on choosing appropriate file formats for your data.
Data Sovereignty
Data Sovereignty is concerned with the governance of data. Currently, data is subject to the laws of the nation in which it is stored. This can be problematic when data is stored remotely from its place of origin, particularly in cloud-based storage solutions whose servers are overseas. Indigenous Data Sovereignty argues that data should be subject to the laws of the nation it originates from. Māori data sovereignty argues that data collected by or about Māori people or Māori resources should be subject to Māori governance. If your research involves Indigenous populations or resources, it is important to think carefully about the management of the resulting data, and to consult fully with all parties when making data management decisions.
For more information, please see:
Te Mana Raraunga (Māori Data Sovereignty Network)
Indigenous data sovereignty: Toward an agenda
Principles of Māori Data Sovereignty (Te Mana Raraunga)
Māori data audit tool (Te Mana Raraunga)
See also:
Research Data (University of York)
Need help? Contact the Open Research Team at [email protected].