OASPA is pleased to publish this guest post on the subject of open data and data sharing, co-authored by Fiona Murphy and Bob Samors. The post provides helpful practical advice drawn from a wealth of resources, to enable publishers and editors to play a key role in the important movement to make data accessible.
Authors: Fiona Murphy, Murphy Mitchell Consulting Ltd and Bob Samors, Coordination Officer – Belmont Forum e-Infrastructures and Data Management Project
As the global Open/FAIR Data movement continues to gather momentum, we are seeing increasing signs of convergence among the various segments of the data ecosystem (researchers, funders, publishers, data training and service providers), and across various disciplines. Examples include:
- the Belmont Forum consortium of global change research funders has co-developed data accessibility guidance with a group of scholarly publishers
- the American Geophysical Union has rolled out the Enabling FAIR Data Commitment Statement in extraordinarily close collaboration with researchers, funders, publishers, data archives and others
- the Open Research Funders Group (ORFG), comprising some of the world’s largest grant-making foundations, is partnering with the US National Academy of Sciences to implement the NAS “Open Science by Design” agenda
- the Research Data Alliance Interest Group on Data Policy Standardisation and Implementation has been conducting a cross-community effort to harmonize data policies
All of these initiatives illustrate how the role of journal editors and publishers continues to grow in importance in enabling FAIR and accelerating open data and open science. But what can these stewards of the end stage of the research process actually do to help achieve these worthy goals? The answer is plenty, and it does not have to be either onerous or overwhelming.
A number of publishers and journals have begun to develop and implement data accessibility/availability protocols. Some publishers, typically the born-digital, born-Open Access players such as PLOS, take a single-policy approach. Others (e.g., Springer Nature, Wiley, Elsevier, IOPP, Sage, Taylor & Francis) offer a stepwise continuum of requirements for journals and authors, so that titles starting from different points on the Open Science continuum can move incrementally toward making research data Open and FAIR. These approaches are often readily accessible and can help guide individual journals and smaller publishers in developing their own strategies for data accessibility. However, there is little, if any, overarching guidance on how to prioritize, take first steps, or otherwise navigate this increasingly complex landscape.
Open Access publishers, including OASPA members, will hopefully be particularly well placed to think through these recommendations and use their existing Open materials, titles and messaging to build awareness amongst their authors, editors, reviewers and suppliers.
New resources are constantly emerging: for example, the RDA Data Policy Interest Group released its work product earlier this month on Figshare. And readers of the Open Access Scholarly Publishers Association’s (OASPA) blog posts will likely have seen that OASPA hosted a webinar on June 6th, which focused on several of these initiatives. Further, the FAIR Data movement is already branching into a number of more detailed projects, such as FAIRsFAIR, GO FAIR, the DANS FAIR Data Assessment tool, and FAIRsharing.org.
To bring some order to this wealth of information, we suggest several rules of thumb:
- Encourage the use of persistent identifiers or PIDs (for example, DOIs for datasets, ORCIDs for authors, RRIDs for reagents – more information here)
- Engage with journal editors, learned societies and other domain leaders to benchmark where a specific subject or community is comfortable in terms of encouraging, expecting or mandating open data practices. You could use the RDA policy framework as the outline for the conversation.
- Prefer uploading data to a repository and linking to it from the research article, rather than hosting it as supplementary material.
- Sometimes data do need to be kept closed, but this doesn’t need to be the default situation. Ask the researcher/author why it should be closed rather than why it should be open.
- Have some information (metadata) in front of any paywall to point to where underlying data can be found. See the following examples:
Taken from Scientific Data, a Gold Open Access journal.
This example points towards a dataset being held in a repository, although it doesn’t give the specific DOI.
The inclusion of the data licence is particularly useful for potential sharing.
Taken from The Journal of Social Psychology, which is a subscription journal.
Of particular interest and merit is the fact that although the article is behind a paywall, the information about the data is freely accessible.
Taken from Mathematical Problems in Engineering, a wholly Gold Open Access journal.
Although it does not provide any information about additional data, it does enable the reader to be certain that they have access to all the relevant data underlying the article.
- Think about the workflow – pre-submission as well as the peer review and production of research outputs – from the researcher’s viewpoint.
  - Is there anything that can be simplified through using Crossref’s or DataCite’s automated services?
  - Are there any confusing or even contradictory requirements that can be clarified, thereby improving the experience and eventual quality of outputs?
  - Is it possible that your colleagues and vendors (such as copy-editors and typesetters) should be included in this reflection for the sake of consistency throughout the workflow?
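For teams that want to build the persistent-identifier and data-availability rules of thumb above into a submission workflow, the following is a minimal Python sketch rather than a definitive implementation. It checks the check character of an ORCID iD (ORCID uses the ISO 7064 MOD 11-2 algorithm), turns a DOI into a resolvable link, and assembles a short data availability statement. The function names, the statement wording, the repository name and the DOI `10.1234/example.dataset` are all hypothetical placeholders; `0000-0002-1825-0097` is ORCID's own fictional sample iD.

```python
import re

DOI_RESOLVER = "https://doi.org/"


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 ORCID digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a correct check character."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


def doi_to_url(doi: str) -> str:
    """Normalise 'doi:10.x/y' or a bare '10.x/y' into a resolvable https link."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    return DOI_RESOLVER + cleaned.strip()


def data_availability_statement(repository: str, dataset_doi: str, licence: str) -> str:
    """Build a short statement that can sit in front of any paywall (wording is illustrative)."""
    return (
        f"The data underlying this article are available in {repository} "
        f"at {doi_to_url(dataset_doi)}, under a {licence} licence."
    )


if __name__ == "__main__":
    # ORCID's documented fictional sample iD
    print(is_valid_orcid("0000-0002-1825-0097"))
    # Placeholder repository name and DOI, for illustration only
    print(data_availability_statement("Example Repository", "doi:10.1234/example.dataset", "CC0"))
```

A check like this only confirms that an identifier is well formed. It does not confirm that the ORCID iD belongs to the submitting author or that the DOI resolves to the right dataset, so it complements editorial checks rather than replacing them.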
An invaluable resource for publishers here is the Open Access article published in Scientific Data: A data citation roadmap for scientific publishers, which provides a readable, actionable how-to guide for developing and implementing sound data policies. (A companion data citation roadmap article, designed to support repositories, can be found here.)
Two points to finish with. Firstly, this isn’t a swift, one-off sort of process. Instead, change is going to be effected through incremental steps over time towards general principles and goals. Every little helps and it doesn’t need to be overwhelming. Don’t let the perfect be the enemy of the good.
And finally, remember that the vast majority of stakeholders involved in this area are active because they want to improve the transparency, reproducibility and experience of publishing research. As a result, if you have a question or comment about this post or any of the issues it raises, please feel free to get in touch and start a conversation – we’d be delighted to hear from you!
Fiona Murphy, formerly an Earth Sciences Publisher and now a scholarly communications consultant, is Secretary of the Dryad Data Repository Board, Editorial Board Member of Data Science Journal and Steering Committee Member for the FORCE11 Scholarly Communications Institute (FSCI). She regularly presents, writes and tweets about Open Science and Open Data and organizes workshops and conference sessions around topics such as PIDs, data sharing, and data policy.
Bob Samors is currently an independent consultant and most recently served as the Coordination Officer for the Belmont Forum e-Infrastructures & Data Management Initiative, overseeing and participating in the development of the project’s data policy, infrastructure and capacity building outputs and outcomes. He previously served as the Senior External Relations Manager for the Group on Earth Observations, assessing and communicating the societal impacts of the GEO community’s work products.