Deposit your Data in a Data Repository for long-term preservation
Why to deposit
Transparency of your research and increased use
Ideally, research is reproducible (
https://blogs.plos.org/absolutely-maybe/2016/12/05/reproducibility-crisis-timeline-milestones-in-tackling-research-reliability/
): the study can be repeated and the findings can be replicated by others. For this, the methods and the documented data need to be available.
Efficiency
Cutting down on academic fraud
"If data is to be available for scrutiny, then proper archiving is needed (
https://www.knaw.nl/nl/actueel/publicaties/responsible-research-data-management-and-the-prevention-of-scientific-misconduct
)." (Schuyt Committee, 2013)
Citation advantage.
See for instance Sharing data increases citations (
https://www.liberquarterly.eu/articles/10.18352/lq.10149/
) and the OpenAIRE Guide on How identifiers can improve the dissemination of your output (
https://www.openaire.eu/how-can-identifiers-improve-the-dissemination-of-your-research-outputs
)
Negative data and negative results are also worth archiving
(
https://link.springer.com/article/10.1186/s12952-015-0033-9
)
More scientific breakthroughs
when more data are FAIR in a repository
Increased use and economic benefit
Reuse of data is only possible if they can be found in public places
(
https://www.thelancet.com/pdfs/journals/laninf/PIIS1473-3099(20)30119-5.pdf
)
When to deposit
When you start a project
: How will you make your data FAIR? (
https://www.openaire.eu/how-to-make-your-data-fair
)
Address preservation
already in the DMP (
https://www.openaire.eu/how-to-create-a-data-management-plan
)
During the research
:
Handling, processing, transferring of data
Dealing with sensitive data (
https://www.openaire.eu/sensitive-data-guide
)
Note that repositories can already help you with your choices at this stage
When significant data changes happen
: Archive essential versions.
Coming to the end of research
: Have you used common standards? Deposit your outputs. Have you assigned licences and persistent identifiers? Have you met your funder's expectations?
Last version
, when you finish the research. Read more (
https://www.tue.nl/en/our-university/library/education-research-support/scientific-publishing/data-coach/sharing-and-reusing-research-data/after-your-project/#top
)
How to deposit
Take care of sensitive data
: Check out the OpenAIRE guide on How to deal with sensitive data (
https://www.openaire.eu/sensitive-data-guide
)
Check which licence is appropriate
: see the OpenAIRE guide on How do I license my research data? (
https://www.openaire.eu/how-do-i-license-my-research-data
). EUDAT also provides a wizard (
https://ufal.github.io/public-license-selector/
) to help you choose an appropriate licence (also for software).
Distinguish between raw and processed data, and don't forget back-ups.
See some tips in the versioning and references parts of the OpenAIRE guide on Raw data, backup and versioning (
https://www.openaire.eu/raw-data-backup-and-versioning
).
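One way to keep raw and processed files distinguishable across back-ups and versions is to record a checksum for every file. The sketch below is only an illustration, not part of the OpenAIRE guide: the function names and the idea of comparing two manifests are assumptions introduced here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List paths that were added, removed or modified between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Building a manifest before and after a processing step, and checking that nothing under the raw data folder appears in changed_files, is a cheap way to confirm the raw data were left untouched.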
Use data formats that are suitable for reuse and preservation
. Check out the Data formats for preservation (
https://www.openaire.eu/data-formats-preservation-guide
) OpenAIRE guide and preferred file formats
here
.
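A quick check before depositing is to list the files whose format is not on your repository's preferred list. The following is a minimal sketch: the function name is hypothetical, and the set of preferred extensions must come from the repository you choose (the values shown in the usage line are illustrative only).

```python
from pathlib import Path


def files_outside_preferred(root: Path, preferred: set[str]) -> list[str]:
    """Return relative paths whose file extension is not in the preferred set."""
    allowed = {ext.lower() for ext in preferred}
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() not in allowed
    )


# Illustrative usage; replace the set with your repository's own list.
# files_outside_preferred(Path("my_dataset"), {".csv", ".txt", ".pdf"})
```

Files reported by this check are candidates for conversion to a more suitable format, or for an explanatory note in the documentation.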
Include necessary documentation and machine-readable metadata.
The metadata (
https://www.openaire.eu/what-is-metadata
) describing your data supports findability, citation and reuse. Follow standard metadata schemes, general ones such as Dublin Core (
https://www.dublincore.org/specifications/dublin-core/dces/
), or discipline-specific: the DCC metadata directory (
http://www.dcc.ac.uk/resources/metadata-standards
), the RDA Metadata Directory (
http://rd-alliance.github.io/metadata-directory/
) and a portal of data standards at FAIRsharing (
https://fairsharing.org/
). The repository of your choice will also provide some guidance on this.
Use domain-specific controlled vocabularies.
Schema.org (
https://schema.org/
) is widely used to build controlled vocabularies; a more specific example is bioschemas.org (
https://bioschemas.org/specifications/Dataset/
): a collection of specifications that provide guidelines to facilitate a more consistent adoption of schema.org within the life sciences. Note: building a vocabulary is an advanced activity and not part of the regular research lifecycle.
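Using an existing vocabulary is much simpler than building one. As a hedged sketch, the snippet below describes a dataset with the schema.org Dataset type as JSON-LD. The function name and all values are placeholders introduced here; check the Bioschemas Dataset profile or your repository's guidance for the properties it actually expects.

```python
import json


def dataset_jsonld(name: str, description: str, identifier: str,
                   license_url: str, keywords: list[str]) -> str:
    """Return a schema.org Dataset description as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "license": license_url,
        "keywords": keywords,
    }
    return json.dumps(doc, indent=2)


# Placeholder values for illustration only; the DOI below is not real.
print(dataset_jsonld(
    name="Example survey dataset",
    description="Responses collected for an example study.",
    identifier="https://doi.org/10.1234/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["survey", "example"],
))
```

Embedding such a block in a dataset landing page lets search engines and harvesters read the same terms that human readers see.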
Where to deposit
In an
institutional repository
In a subject repository
, e.g. certified as a Trustworthy Digital Repository (
https://www.openaire.eu/find-trustworthy-data-repository
)
In an
external, generic repository
, e.g. Zenodo (
https://zenodo.org/
)
If the options above are not enough,
search or browse the registry of data repositories
(
https://www.re3data.org/
)
What to deposit
What data and contextual documentation should be deposited?
Recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings
(
https://www2.le.ac.uk/services/research-data/rdm/what-is-rdm/research-data
).
Horizon 2020 requires
the data underlying publications - with documentation, tools and/or software needed to understand the data.
Decide if and what data should be protected
(
https://www.openaire.eu/how-do-i-know-if-my-research-data-is-protected
)
Criteria to decide what data to keep
(
https://www.data.cam.ac.uk/data-management-guide/looking-after-and-sharing-your-data#Preservation
)
Is the data needed to reproduce your work?
Could this data be re-used?
Is the data unique?
Must it be kept as evidence or for legal reasons?
Should it be kept for its potential value?
Consider costs: do the benefits outweigh the costs?
Evaluate the criteria for deciding what to keep, because they may differ among projects