Data Management

Help with the NIH Data Management and Sharing Plan policy, effective January 2023.

NIH Policy

Effective Jan. 25, 2023, the Final NIH Policy for Data Management and Sharing (DMS Policy) requires all NIH-supported research that generates scientific data to include a Data Management and Sharing Plan, or “Plan” for short.

The Plan is submitted as part of your application for funding, and the expected data management and sharing costs are explained in the Budget Justification.

This guide can help you write your Plan and find an appropriate repository for your data.

Things to consider in your Data Management Plan

What is the data?

  • Title
  • Description
  • Number, format, and size of files
  • Rate of file growth
  • Versions of files
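
The number, format, and size of files can be tallied with a short script rather than by hand. The following is a minimal sketch, not part of the NIH guidance: the function name `inventory` and the folder path in the usage comment are hypothetical, and it assumes your data live in a single folder tree on a local or mounted drive.

```python
from collections import defaultdict
from pathlib import Path


def inventory(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group files by lowercase extension; files without one share a label.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}


# Example usage ("my_project_data" is a placeholder folder name):
# for ext, (n, size) in sorted(inventory("my_project_data").items()):
#     print(f"{ext}\t{n} files\t{size / 1e6:.2f} MB")
```

Running the same script at intervals and comparing the totals gives a rough estimate of the rate of file growth.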

Who created, accesses, and owns the data?

  • PI/Study lead/Contact
  • University, Department, Research Core, Research Team, Consortium or Group
  • Funder
  • Authentication levels/Restrictions
  • Outside requesters

Where is the data stored?

  • Institutional servers or services (like Box)
  • Filesharing services
  • Personal account drives
  • Individuals' computers
  • Backups

How is data being created and manipulated?

  • Collection techniques and instruments
  • File naming
  • Workflows
  • Metadata
  • Analysis tools and software

When is data being collected, and what are the plans for its future?

  • Date created/modified
  • Attribution: who did what, when?
  • Data retention
  • File format migration
  • Long-term storage

Elements to Include in a Data Management and Sharing Plan - NIH

As outlined in the NIH Guide Notice “Supplemental Policy Information: Elements of an NIH Data Management and Sharing Plan,” DMS Plans should address the following recommended elements and should be two pages or fewer in length.

Data Type: Briefly describe the scientific data to be managed and shared:

  • Summarize the types (for example, 256-channel EEG data and fMRI images) and amount (for example, from 50 research participants) of scientific data to be generated and/or used in the research. Descriptions may include the data modality (e.g., imaging, genomic, mobile, survey), level of aggregation (e.g., individual, aggregated, summarized), and/or the degree of data processing.

  • Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.

  • Briefly list the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.

Related Tools, Software and/or Code: Indicate whether specialized tools are needed to access or manipulate shared scientific data to support replication or reuse, and, if so, give the name(s) of the needed tool(s) and software. If applicable, specify how the needed tools can be accessed.

Standards: Describe what standards, if any, will be applied to the scientific data and associated metadata (i.e., data formats, data dictionaries, data identifiers, definitions, unique identifiers, and other data documentation).

Data Preservation, Access, and Associated Timelines: Give plans and timelines for data preservation and access, including:

  • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. See Selecting a Data Repository for information on selecting an appropriate repository. 

  • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.

  • When the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of scientific data to be shared.

    • Note that NIH encourages scientific data to be shared as soon as possible, and no later than the time of an associated publication or end of the performance period, whichever comes first. NIH also encourages researchers to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.

Access, Distribution, or Reuse Considerations: Describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data related to:

  • Informed consent

  • Privacy and confidentiality protections consistent with applicable federal, Tribal, state, and local laws, regulations, and policies

  • Whether access to scientific data derived from humans will be controlled 

  • Any restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements

  • Any other considerations that may limit the extent of data sharing. Any potential limitations on subsequent data use should be communicated to the individuals or entities (for example, data repository managers) that will preserve and share the scientific data. The NIH Institute or Center (IC) will assess whether an applicant’s DMS Plan appropriately considers and describes these factors.

Oversight of Data Management and Sharing: Indicate how compliance with the DMS plan will be monitored and managed.

 

Creating a data management plan

Michener WK. Ten Simple Rules for Creating a Good Data Management Plan. PLoS Comput Biol. 2015;11(10):e1004525. doi:10.1371/journal.pcbi.1004525

File Naming Practices

Best practices

Harvard Medical School file naming conventions

UI Research Data Services - see file naming prompt sheet link
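
Once a team has agreed on a naming convention, a short script can check file names against it. The sketch below is illustrative only. The pattern `YYYYMMDD_project_description_vNN.ext` is an assumed convention chosen for this example, not one prescribed by NIH or by the guides linked above, and the names `PATTERN` and `check_name` are hypothetical.

```python
import re
from datetime import date

# Assumed convention for this example only:
#   YYYYMMDD_project_description_vNN.ext
#   (lowercase letters and digits, hyphens allowed in the description)
PATTERN = re.compile(
    r"^(\d{8})_([a-z0-9]+)_([a-z0-9-]+)_v(\d{2})\.([a-z0-9]+)$"
)


def check_name(filename):
    """Return True if filename follows the illustrative convention above."""
    match = PATTERN.match(filename)
    if not match:
        return False
    stamp = match.group(1)
    try:
        # Reject impossible dates such as month 13 or February 30.
        date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    except ValueError:
        return False
    return True
```

Under this assumed pattern, a name such as `20230125_eegstudy_raw-signals_v01.csv` passes, while `Final Data (2).csv` does not. Adjust the pattern to match whatever convention your group documents in its Plan.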

Data Documentation

Northwestern University. Data Documentation - Guide for improving the organization, documentation, and long-term preservation of digital research data.