Hardy Schwamm, Research Data and Repository Manager, delivered an interesting and informative session on the emerging area of data citation.
Why data citation is important. Data underpins the arguments in an article, so it is just as important as the article itself. Citing data and making it easily accessible means that research can be reproduced or verified, and can be built on more efficiently by others. It is a Common Principle of Research Councils UK (RCUK) that research data are a public good, and more and more funding bodies expect you to share data as openly as possible.
How data citation works. This is evolving, and there are not yet clear standards. Data citations are similar to bibliographic citations, but should ideally include a “persistent identifier” such as a digital object identifier (DOI), which links directly to the dataset. You may have seen DOIs when looking at journal articles online, e.g. 10.1145/1515693.1515696, or sometimes presented as a link, e.g. http://dx.doi.org/10.5255/UKDA-SN-6899-1
DataCite recommends that data citations contain the following details:
- Creator (Publication Year): Title. Version. Publisher. Resource Type. Identifier
- Example: Geofon operator (2009): GEOFON event gfz2009kciu (NW Balkan Region). GeoForschungsZentrum Potsdam (GFZ). http://dx.doi.org/10.1594/GFZ.GEOFON.gfz2009kciu
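As a rough illustration of the recommended pattern, the short Python sketch below assembles a citation string from its parts. The names `DataCitation`, `doi_to_url` and `format_citation` are made up for this example and do not come from DataCite or any citation tool. The sketch assumes that optional details (Version, Resource Type) are simply left out when they are not known, and that a bare DOI is turned into a link using the http://dx.doi.org/ form shown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the details of a data citation."""
    creator: str
    year: int
    title: str
    publisher: str
    identifier: str                      # a bare DOI or a full DOI link
    version: Optional[str] = None        # optional, omitted if unknown
    resource_type: Optional[str] = None  # optional, omitted if unknown


def doi_to_url(identifier: str) -> str:
    """Turn a bare DOI (these start with '10.') into a link; leave links as they are."""
    if identifier.startswith("10."):
        return "http://dx.doi.org/" + identifier
    return identifier


def format_citation(c: DataCitation) -> str:
    """Build 'Creator (Year): Title. Version. Publisher. Resource Type. Identifier'."""
    parts = [c.title, c.version, c.publisher, c.resource_type]
    body = ". ".join(p for p in parts if p)  # skip details that were not given
    return f"{c.creator} ({c.year}): {body}. {doi_to_url(c.identifier)}"


example = DataCitation(
    creator="Geofon operator",
    year=2009,
    title="GEOFON event gfz2009kciu (NW Balkan Region)",
    publisher="GeoForschungsZentrum Potsdam (GFZ)",
    identifier="10.1594/GFZ.GEOFON.gfz2009kciu",
)
print(format_citation(example))
```

In practice a reference manager or the publisher's own "cite this dataset" text will usually do this for you; the sketch is only meant to show how the pieces fit together.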
What is a digital object identifier (DOI)? It is a unique and persistent link to a digital object. DataCite is the organisation concerned with assigning DOIs to research data and other outputs. Institutions, such as Lancaster University, can also be granted the ability to ‘mint’ their own DOIs. We can now provide a DOI for datasets that are deposited into Pure.
Tools to help with data citation:
- DOI Citation formatter (Beta)
- EndNote includes a reference type ‘Dataset’ which you can use to cite data.
Thomson Reuters, who produce Web of Science, now publish a Data Citation Index, though the University does not currently subscribe.
The consent forms I have been advised to use state that the data I collect from participants should not be kept for more than 10 years. How can I reconcile that with a persistent DOI that will potentially last ‘forever’? The DOI is persistent, but the content it links to can change. After 10 years there could be a message saying something like ‘this data is no longer available as participants consented to make the data available for 10 years’. The responsibility for this would lie with the institution that issued the DOI.
Should researchers add an additional clause to their consent forms about data sharing? You may have good reasons why data cannot be made available, though anonymised data is usually acceptable to participants and ethics committees. Some funders (e.g. ESRC) already stipulate that consent forms must allow sharing of anonymised data. It is best to check. If you are self-funded, it is up to you to follow good research practice and seek advice.