CROSS-CUTTING

Data management plan

Formal document describing how a project's data will be created, organized, documented, stored, preserved, and shared across the research lifecycle. It operationalizes the FAIR principles and is required by many funders.

Extended definition

A data management plan (DMP) is the formal document that describes how a project’s data will be created, organized, documented, stored, preserved, and shared across the entire research lifecycle. It is not a bureaucratic attachment: it is the instrument that turns the abstract intention to care for data well into concrete, verifiable decisions, made before collection begins. Michener (2015) summarizes what a good DMP covers: the types and formats of data, the metadata and documentation standards, the storage and backup policies, the long-term preservation strategy, the access and licensing conditions, and the team’s roles and responsibilities. The DMP is where the FAIR principles, defined by Wilkinson and colleagues (2016), stop being an ideal and become an operational plan, specifying how the data will be findable, accessible, interoperable, and reusable. Miksa and colleagues (2019) further propose machine-actionable DMPs: plans structured so that software can read them, exchange their contents with other research systems, and keep them up to date over the course of the project.
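
As a rough illustration of what a machine-readable plan could look like, the sketch below stores the six components listed above as a structured record and reports which ones are still empty. The class name, field names, and example values are hypothetical and chosen only for this illustration; they do not follow any published DMP schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields


@dataclass
class DataManagementPlan:
    """Hypothetical record; field names are illustrative, not a published schema."""

    data_types_and_formats: list[str] = field(default_factory=list)
    metadata_standards: list[str] = field(default_factory=list)
    storage_and_backup: str = ""
    preservation_strategy: str = ""
    access_and_license: str = ""
    roles: dict[str, str] = field(default_factory=dict)


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of the sections that are still empty, in declaration order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# A draft written at the proposal stage, with several decisions still open.
draft = DataManagementPlan(
    data_types_and_formats=["CSV survey responses"],
    storage_and_backup="institutional server, nightly backup",
    roles={"data steward": "to be named"},
)

print(missing_sections(draft))
# ['metadata_standards', 'preservation_strategy', 'access_and_license']
```

A check of this kind could be rerun whenever the plan is revised, which is one simple way to treat the DMP as a living document rather than a one-time form.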

When it applies

The DMP applies at the start of a project, ideally at the proposal stage, and is today a requirement of many funding agencies as a condition for the award. It applies to infrastructure planning: defining where the data will live, how it will be versioned, and for how long it will be preserved avoids improvised decisions in the middle of the research. It applies to operationalizing the FAIR principles and open science, fixing the repository, the persistent identifier, and the license before the first observation is collected. It applies to the governance of sensitive data, declaring how privacy and consent will be respected. And it applies as a living document: a good DMP is revised as the project evolves, not filed away after the proposal is submitted.

When it does not apply

The DMP does not apply as a form filled in once and forgotten: treated as a submission obligation, it loses its function of guiding the real practice of data management. It does not apply as a guarantee that the data will actually be shared; a plan that promises deposit does not substitute for the deposit itself, and the data availability statement in the article is what materializes the commitment. It does not apply identically to every project: the level of detail and the restrictions vary with the type of data, the field, and the sensitivity, and copying a generic template without adapting it hollows it out. It does not apply as a substitute for infrastructure: planning preservation without a real repository to sustain it is a promise without backing. And it does not apply in isolation; without budget, training, and defined responsibilities, the DMP stays on paper.

Applications by field

  • Funded research: an agency requirement, where the DMP is part of the proposal and a condition of the award.
  • Life and environmental sciences: heterogeneous, long-term data, with strong demand for metadata standards.
  • Social sciences and health: sensitive data, where the DMP declares consent, anonymization, and controlled access.
  • Computational research: versioning of data and code, linking the plan to reproducibility.
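
For the computational case, one simple way to make a data-versioning commitment checkable is a checksum manifest. The sketch below is a minimal example under stated assumptions: it presumes the data live as ordinary files under a single directory, and the function names are hypothetical. It records a SHA-256 digest per file and compares two manifests to list what changed.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two manifests."""
    names = old.keys() | new.keys()
    return sorted(name for name in names if old.get(name) != new.get(name))
```

Storing the manifest alongside the analysis code, for example in the same version-control repository, would tie each result to the exact state of the data that produced it.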

Common pitfalls

The first pitfall is treating the DMP as a submission form, filled in and forgotten rather than revised over the project. The second is promising sharing without securing the infrastructure to sustain it, leaving the plan without backing. The third is copying a generic template without adapting it to the data type and the field’s restrictions. The fourth is confusing the plan with the execution: a DMP does not make the data FAIR by itself; it only describes how that will be done. The fifth is omitting budget, roles, and responsibilities, the practical conditions without which none of the plan’s promises materialize.
