Organising Information

Creating a strategy for how you will manage your project files throughout the research process is a fundamental element of your overall data management plan. A research project may include multiple files in a variety of formats, multiple versions of files, spreadsheets, images, lab notes, interview tapes, etc. that are essential to the project. Establishing good file management practices at the outset is vastly easier than trying to organise the work mid-way through the project.

Managing your project files well pays off later by:

  • Increasing efficiency
  • Reducing risk of loss or file redundancy
  • Increasing research impact by making it easier to share files
  • Complying with legal/ethical requirements or policies
  • Providing a clear record of the research process
  • Facilitating preservation at the conclusion of the project

Elements central to managing project files include:

  • Adopting and documenting folder and file naming conventions
  • Creating a clear hierarchy of folders
  • Documenting file contents
  • Tracking file versions
  • Understanding file formats used in long-term preservation

Directory Structure

When organising your files, consider including elements such as the project title, a unique identifier, and the date (year) in the name of the top-level project folder. The substructure should follow a clear, documented naming convention, with, for instance, a folder for each component or run of an experiment, each version of a dataset, and/or each person in the group. The structure should follow a consistent pattern that is clearly recognizable to the entire research group.
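As an illustration only, the folder layout described above could be created with a short Python script. The function name `create_project_tree`, the `title_identifier_year` pattern and the subfolder names are assumptions made for this sketch, not a prescribed standard; adapt them to the convention your group documents.

```python
from pathlib import Path


def create_project_tree(root, title, identifier, year, subfolders):
    """Create <root>/<title>_<identifier>_<year>/ and the given subfolders.

    The naming pattern is a hypothetical example of combining project
    title, unique identifier and year in the top-level folder name.
    """
    project_dir = Path(root) / f"{title}_{identifier}_{year}"
    for name in subfolders:
        # parents=True builds nested folders such as data/raw in one step;
        # exist_ok=True makes the script safe to run more than once.
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, `create_project_tree(".", "SoilSurvey", "P0042", 2024, ["data/raw", "data/processed", "docs"])` would create a `SoilSurvey_P0042_2024` folder in the current directory with the three subfolders inside it.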

The file naming conventions described below apply to folder names as well.

File Naming Conventions

Project files should be named and organised in a consistent and descriptive manner that is logical and predictable to yourself and others. Clear distinctions between files will facilitate effective and efficient file browsing and retrieval.

There are 3 things to keep in mind when labeling data:

  • Organisation - important for future access and retrieval
  • Context - this could include content-specific or descriptive information that remains meaningful regardless of where the data is stored
  • Consistency - choose a naming convention and ensure that the rules are followed systematically by always including the same information (such as date and time) in the same order and following the same format (e.g. YYYYMMDD)

Consider using several of the following elements in file names:

  • Project name, number or acronym
  • Creator surname and initials
  • Name of research team/department associated with the data
  • File version number
  • Date of creation
  • Date experiment undertaken
  • Description of content
  • Publication date

Other considerations:

  • Keep file names to a manageable length, preferably 25 characters or fewer
  • Do not give files the same name as the folder in which they reside
  • Avoid using unusual characters such as: ! - @ # $ % ^ & * ( ) [ ] { } + ? > <
  • Avoid using spaces. In place of spaces between words use one of the following methods:

    • Use a capital for the first letter of each word:

      • ProjectAcronymLastNameFirstNameTopic.txt
      • ProjectAcronymTopicOfDocumentDate.pdf
    • Use an underscore in between each word:

      • ProjectAcronym_last_name_first_name_topic.txt
      • ProjectAcronym_topic_of_document_date.pdf
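The naming guidance above can be sketched in Python as one helper that assembles a name from a few elements and another that checks an existing name. The element order (`project_description_YYYYMMDD_vNN`), the function names and the set of permitted characters are assumptions chosen for this example; the 25-character limit is taken from the guideline above and is applied here to the whole name including the extension.

```python
import re
from datetime import date

MAX_LENGTH = 25  # guideline from the list above; adjust to your convention


def build_file_name(project, description, created, version, extension):
    """Build a name such as SOIL_pH_20240305_v02.csv (hypothetical pattern)."""
    parts = [project, description, created.strftime("%Y%m%d"), f"v{version:02d}"]
    # Replace spaces with underscores, then drop any other unusual characters.
    cleaned = [re.sub(r"[^A-Za-z0-9_]", "", p.replace(" ", "_")) for p in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


def check_file_name(name):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if " " in name:
        problems.append("contains spaces")
    # Letters, digits, underscores and the dot before the extension are
    # treated as acceptable; anything else (except spaces, reported above)
    # counts as an unusual character.
    if re.search(r"[^A-Za-z0-9_. ]", name):
        problems.append("contains unusual characters")
    return problems
```

Under these assumptions, `build_file_name("SOIL", "pH", date(2024, 3, 5), 2, "csv")` returns `SOIL_pH_20240305_v02.csv`, and `check_file_name("my data (final)!.csv")` reports both the spaces and the unusual characters.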

    Consider using a bulk renaming tool where necessary.

    If, part-way through a project, you need to rename a large number of files to conform to the file naming convention you have adopted, a number of tools are available to make this process easier.
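Where a dedicated tool is not to hand, a small script can do a similar job. The following is a minimal sketch, assuming the convention is "project prefix plus underscores instead of spaces"; the function names and the prefix rule are hypothetical. It separates planning from renaming so that the proposed changes can be reviewed before any file is touched.

```python
from pathlib import Path


def plan_renames(folder, prefix):
    """Return (old_path, new_path) pairs without renaming anything."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        # Skip subfolders and hidden files such as .gitignore.
        if not path.is_file() or path.name.startswith("."):
            continue
        new_stem = path.stem.strip().replace(" ", "_")
        if not new_stem.startswith(prefix + "_"):
            new_stem = f"{prefix}_{new_stem}"
        new_path = path.with_name(new_stem + path.suffix)
        if new_path != path:
            plan.append((path, new_path))
    return plan


def apply_renames(plan):
    """Carry out a reviewed plan, refusing to overwrite existing files."""
    for old, new in plan:
        if new.exists():
            raise FileExistsError(f"{new} already exists; refusing to overwrite")
        old.rename(new)
```

Printing the result of `plan_renames` first and calling `apply_renames` only once the list looks right is a cautious way to use it; working on a backed-up copy of the folder is safer still.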

    Examples of file renaming tools:

Versioning

    Versioning or version control refers to managing file revisions. Versioning assists researchers in managing data during a project where experimentation, revisions and re-examinations are undertaken. Text files as well as data files may undergo numerous changes before the final version is set.

    Versioning mechanisms such as directory structure and file naming conventions help users distinguish between versions of a dataset and its accompanying files.
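As one possible illustration of versioning through file names, the sketch below works out the next version number for a file in a folder. The `_vNN` suffix and the function name `next_version_name` are assumptions for this example, not a required scheme.

```python
import re
from pathlib import Path

# Matches stems such as "survey_v02" and captures the base name and number.
_VERSION = re.compile(r"^(?P<base>.+)_v(?P<num>\d+)$")


def next_version_name(folder, base, extension):
    """Return the next free name, e.g. survey_v03.csv after survey_v02.csv."""
    highest = 0
    for path in Path(folder).glob(f"{base}_v*{extension}"):
        match = _VERSION.match(path.stem)
        if match and match.group("base") == base:
            highest = max(highest, int(match.group("num")))
    # Zero-padded numbers keep versions in order when sorted by name.
    return f"{base}_v{highest + 1:02d}{extension}"
```

Saving each revision under the name this returns, rather than overwriting the previous file, keeps earlier versions available until you decide which ones to discard.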

    Researchers should also consider discarding obsolete versions of files. Before discarding anything, think carefully about whether the file may be needed in future. In some instances, keeping backup copies of earlier versions may be advisable.

    A number of tools are available for file versioning including:

Backing up files

    Backing up files means creating copies of them. These copies should reside in a separate physical location from the working or stored files. A regular backup schedule reduces the risk of data loss, and backup copies can be used to restore damaged or lost original files.
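The idea of copying a file and confirming that the copy is intact can be sketched as follows. This is a minimal example using Python's standard library, not a replacement for a managed backup service; the function names are hypothetical, and `backup_dir` is assumed to point to storage in a different physical location, such as an external drive or a network share.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_file(source, backup_dir):
    """Copy source into backup_dir and verify the copy by checksum."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    # copy2 also preserves timestamps where the platform allows it.
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Checksum mismatch for {target}")
    return target
```

Running a script like this on a schedule is one way to make backups regular, and the checksum comparison gives some assurance that a copy can actually be used to restore the original.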

    On campus, CCS (Computing and Communications Services) provides file security, encryption, storage and backup support and services. The UK Data Archive website provides excellent informal risk assessment information that may be helpful in determining your particular data back-up needs.

Acknowledgements

    We would like to thank the UK Data Service for use of their training materials in the creation of these modules.

    We would also like to thank EDINA and Data Library, University of Edinburgh, for use of materials from the Research Data MANTRA [online course] in the creation of these modules.