Documenting Your Work

Creating documentation throughout your research project is an important component of the research process. At the very least, you will require the information to describe your research outputs in any future presentations or publications. In order to preserve your data in a repository or share your data with others, you will be expected to provide supplemental information such as citation information, an explanation of survey methodology, sampling information, question context and coding, how and why derived variables were created, and more.

Metadata

This term refers to ‘data about the data’: in other words, information that describes all aspects of your data.

Documenting Your Data

The level of structure used to document your data will depend on the complexity of the project or data collected and the number of people involved in the project. Consider documenting the following information:

Study Level

  • Document creators, collaborators, funders, rights
  • Outline the research question and rationale
  • Document the date the data was gathered or analysed
  • Describe the survey methodology
  • Describe the sampling frame
  • Describe the instruments, instrument settings or measures used

File or Database Level

  • Describe the relationship between files
  • Document information contained within the files
  • Identify the format files are stored in
  • List and document tests or analyses performed on the file(s)
  • Use a readme.txt file to document information at the file or folder level

    • Include information on file naming, abbreviations or acronyms used as well as contents of the file(s)
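As an illustration of folder-level documentation, the sketch below drafts the text of a readme.txt from the files found in a folder. It is a minimal example under stated assumptions, not a required layout: the function name build_readme, the section headings and the "NOT YET DOCUMENTED" marker are all hypothetical choices made for this example.

```python
from pathlib import Path


def build_readme(folder, descriptions, abbreviations=None):
    """Return readme text describing the files in `folder`.

    `descriptions` maps a file name to a one-line description and
    `abbreviations` maps an abbreviation to its expansion. Both are
    hypothetical inputs that you would maintain by hand.
    """
    folder = Path(folder)
    lines = [f"README for folder: {folder.name}", "", "Files", "-----"]
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        if path.name.lower() == "readme.txt":
            continue  # do not list the readme itself
        # Use the file extension as a rough indication of the format.
        file_format = path.suffix.lstrip(".").upper() or "unknown"
        description = descriptions.get(path.name, "NOT YET DOCUMENTED")
        lines.append(f"{path.name} | format: {file_format} | {description}")
    if abbreviations:
        lines += ["", "Abbreviations", "-------------"]
        for short, long_form in sorted(abbreviations.items()):
            lines.append(f"{short} = {long_form}")
    return "\n".join(lines) + "\n"
```

The returned text can be written to readme.txt in the same folder. Files that have no entry in the descriptions mapping are flagged rather than skipped, so gaps in the documentation stay visible.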

Variable Level

  • Document not only the variable name but also a variable label explaining the variable's meaning, unit of measure, sample weighting, etc.
  • This information can be recorded in a codebook
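One possible way to keep a codebook machine-readable is sketched below. The field names (name, label, unit, values, notes) and the helper functions are assumptions made for this example rather than a prescribed codebook format; adapt them to your discipline's conventions.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class VariableEntry:
    """One row of a hypothetical codebook."""
    name: str         # variable name as it appears in the data file
    label: str        # plain-language meaning of the variable
    unit: str = ""    # unit of measure, if any
    values: str = ""  # coding of categorical values, e.g. "1=yes; 2=no"
    notes: str = ""   # e.g. weighting, or how a derived variable was built


def codebook_to_csv(entries):
    """Serialize codebook entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(VariableEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


def undocumented(column_names, entries):
    """Return the data columns that have no codebook entry yet."""
    documented = {entry.name for entry in entries}
    return [name for name in column_names if name not in documented]
```

For example, calling undocumented() with the column names of a data file and the current codebook entries gives a quick check that every variable has been described before the data is deposited or shared.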

Rationale

  • To enable others to reuse your data
  • To facilitate preservation
  • To allow replication at a later date
  • To make the data understandable to others

Sources of Metadata Information

  • Standard information submitted in a Research Ethics Board (REB) request
  • Laboratory notebooks & experimental protocols
  • Questionnaires, codebooks, data dictionaries
  • Software syntax and output files
  • Information about equipment settings & instrument calibration
  • Database schema
  • Methodology reports
  • Provenance information about sources of derived data

Using Standards, Taxonomies, Classification Systems

When preserving or sharing data, you can use standards, taxonomies or classification systems to categorize and document data or other information in a widely understood way. Data repositories usually request that you use an international metadata standard.

Standards

A wide variety of standards and schemas are available for documenting research data. Most are discipline-specific, but some can be adapted for use in other fields. All have a core set of tags collecting vital information about your project, including title, author, funding sources, abstract, keywords, terms of use, and copyright information. Examples include:

Ontologies, Taxonomies, and Classifications

It is important to use discipline-specific ontologies, vocabularies, taxonomies and classification systems when creating and documenting your data. Doing so standardizes information into relational schemas, ensuring widespread understanding of concepts and descriptions.

Ontologies

Taxonomies

  • Used primarily in the sciences to depict hierarchical relationships
  • Example:

    • Biology – phylum, family, genus...

Classification Systems

Example of a Metadata Standard: Dublin Core Metadata Element Set

This fifteen-term vocabulary is considered the core set of elements for describing an item. It is part of a larger set of vocabularies known as the DCMI Metadata Terms, and it is published as an ISO standard [ISO15836] and an ANSI/NISO standard [NISOZ3985].

Term Name     Definition
Contributor   An entity responsible for making contributions to the resource
Coverage      The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant
Creator       An entity primarily responsible for making the resource
Date          A point or period of time associated with an event in the lifecycle of the resource
Description   An account of the resource
Format        The file format, physical medium, or dimensions of the resource
Identifier    An unambiguous reference to the resource within a given context
Language      A language of the resource
Publisher     An entity responsible for making the resource available
Relation      A related resource
Rights        Information about the rights held in and over the resource
Source        A related resource from which the described resource is derived
Subject       The topic of the resource
Title         A name given to the resource
Type          The nature or genre of the resource

This table has been compiled from the Dublin Core Metadata Element Set, Version 1.1 document and is used under the Creative Commons Attribution 3.0 Unported Licence.
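To show how the fifteen elements might be applied in practice, the sketch below builds a small Dublin Core description as XML using Python's standard library. The namespace URI is the one published for the Dublin Core Metadata Element Set, Version 1.1; the function name dublin_core_xml and the wrapping "metadata" element are assumptions made for this example, and a repository may expect a different container format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, Version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen element names, in lower case as used in the namespace.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_xml(record):
    """Serialize a dict of Dublin Core elements to an XML string.

    Each value may be a string or a list of strings (elements are
    repeatable). The root element name "metadata" is a hypothetical
    wrapper chosen for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for term, value in record.items():
        if term not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {term}")
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{term}")
            child.text = item
    return ET.tostring(root, encoding="unicode")
```

A record such as {"title": "Example survey", "creator": ["A. Researcher", "B. Researcher"]} produces one dc:title element and two dc:creator elements. Rejecting unknown element names is a design choice for this sketch: it catches misspelled terms early, before a record is submitted anywhere.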

Acknowledgements

We would like to thank the UK Data Service for use of their training materials in the creation of these modules.

We would also like to thank EDINA and Data Library, University of Edinburgh, for use of materials from the Research Data MANTRA [online course] in the creation of these modules.