Plans for sharing and reusing data are an integral part of the Research Data Management Planning process.
Many funders and journal publishers have policies that encourage, expect, or require researchers to prepare their data for sharing. This is particularly true of data produced through public funding.
The OECD declaration on access to research results, to which Canada is a signatory, sets out reasons for sharing research.
Rationale for sharing data:
Preparation of data for sharing begins with the creation of a data management plan during the initial stages of the research project. Researchers should familiarize themselves with the policies of their funders as part of the planning process.
Factors to consider include:
Researchers should consult the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS2) and the Office of Research for information on contractual and ethical obligations.
The Tri-Council Policy Statement (TCPS2) stipulates that informed consent from project participants is necessary for the sharing and reuse of data containing identifiable information (TCPS2 Article 3.2 and Article 5.2). To ensure that this consent has been obtained, attention should be paid to the wording of the consent form regarding the preservation, reuse and/or sharing of data containing identifiable information, how this information will be protected, and under what conditions the data will be shared or reused. Consent from participants is not required for secondary use or reuse of anonymous or aggregated data; however, informing participants about the preservation, reuse and sharing of such data is considered ethical.
Canadian copyright legislation does not cover raw research data, although it does cover descriptions of data such as tables, graphs and databases. Sharing of data files can be controlled and protected through the use of licences. In many cases, researchers can decide on the level of access and the conditions of use for data they share or deposit in a repository. Some repositories offer licence choices built into the repository platform.
Several online licensing options can be adopted for personal use:
Conditions of use should reflect the nature of the data and level of confidentiality involved.
Setting conditions of use can include:
Personal identifying information should never be disclosed through research findings unless explicit informed consent from participants has been provided in writing.
Researchers must ensure that a person’s identity cannot be disclosed through:
Direct identifiers collected during the research process are usually not essential for data analysis and can easily be removed from the data. Consideration should be given to how long these identifiers are kept, to storing them separately and securely, and to the manner in which they will be destroyed. In many cases, the collection of direct identifiers can be avoided altogether.
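Separating direct identifiers from the analysis data can be sketched in a few lines of Python. This is a minimal illustration only: the field names (name, email, phone), the function name split_identifiers and the random study ID scheme are assumptions made for the example, not part of any policy or standard.

```python
import secrets

# Hypothetical field names; adjust these to the identifiers in your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def split_identifiers(records, identifier_fields=DIRECT_IDENTIFIERS):
    """Split records into analysis rows and a separate key file.

    Each pair of rows is linked only by a random study ID, so the
    analysis rows carry no direct identifiers.
    """
    analysis_rows, key_rows = [], []
    for record in records:
        study_id = secrets.token_hex(8)  # random ID, not derived from the person
        analysis_row = {"study_id": study_id}
        key_row = {"study_id": study_id}
        for field, value in record.items():
            if field in identifier_fields:
                key_row[field] = value
            else:
                analysis_row[field] = value
        analysis_rows.append(analysis_row)
        key_rows.append(key_row)
    return analysis_rows, key_rows
```

The key rows would then be stored separately from the analysis file, under stricter access controls, and destroyed on the schedule set out in the data management plan.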
Anonymising quantitative data may involve removing or aggregating variables. Techniques such as cell suppression, rounding, inference control and perturbation can be employed to anonymise data. Coding information using standard classifications at a higher level than that at which the data were collected is an example of a low-risk technique that can be employed in the anonymising process.
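Three of these techniques (recoding to a coarser level, rounding and cell suppression) can be illustrated with a short Python sketch. The function names, the ten-year age bands, the rounding base of 5 and the suppression threshold of 5 are all assumptions chosen for the example; appropriate values depend on the dataset and on any disclosure rules that apply to it.

```python
def age_band(age, width=10):
    """Recode an exact age into a coarser band, e.g. 37 -> '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def round_to_base(count, base=5):
    """Round a non-negative count to the nearest multiple of base."""
    return base * ((count + base // 2) // base)


def suppress_small_cells(table, threshold=5):
    """Replace counts below the threshold with None (suppressed).

    This is primary suppression only. If row or column totals are also
    published, further cells may need suppressing so that the hidden
    values cannot be recovered by subtraction.
    """
    return {
        category: (count if count >= threshold else None)
        for category, count in table.items()
    }
```

For example, age_band(37) gives "30-39", round_to_base(12) gives 10, and suppress_small_cells({"rural": 3, "urban": 12}) hides the rural count while leaving the urban count unchanged.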
Relational data requires particular attention, as connections between variables may inadvertently cause identities to be revealed. Interview transcripts may call for different techniques, such as the use of consistent pseudonyms or more generalized terms, to reduce the risk of identification without rendering the data unusable. Retain unedited versions of your data for use within the team or in the event of errors during anonymisation. Remember to log all techniques used and all instances of replacement or aggregation of variables.
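Consistent pseudonyms and a replacement log can be sketched as follows. This is a minimal illustration that assumes the list of real names is already known; the function name pseudonymise, the "Person N" naming scheme and the structure of the log entries are hypothetical choices for the example.

```python
import re


def pseudonymise(transcript, real_names, mapping, log):
    """Replace each real name with a consistent pseudonym and log the change.

    mapping and log are shared across transcripts, so the same person
    receives the same pseudonym throughout the project.
    """
    for name in real_names:
        if name not in mapping:
            mapping[name] = f"Person {len(mapping) + 1}"
        # Match whole words only, so that "Ann" does not alter "Annual".
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        transcript, occurrences = pattern.subn(mapping[name], transcript)
        if occurrences:
            log.append({
                "original": name,
                "replacement": mapping[name],
                "occurrences": occurrences,
            })
    return transcript
```

Because the mapping and the log contain the original names, they would need to be stored as securely as the unedited transcripts and excluded from any shared version of the data.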
Please refer to the UK Anonymisation Network’s UKAN Resources for additional information and documentation on data anonymisation, including comprehensive guides to performing anonymisation.
For more information on this topic please contact us.
We would like to thank the UK Data Service for use of their training materials in the creation of this guide.
We would also like to thank the EDINA and Data Library, University of Edinburgh for use of materials from the Research Data MANTRA [online course] in the creation of this guide.