The open research data space is picking up pace on all fronts. Traditional academic publishers are paying increasing attention to data, and there have been recent policy pushes in the form of the NIH draft policy and CODATA’s Beijing Declaration.
The most obvious progress has come from the RDA interest group on Data policy standardisation and implementation and, more recently, from the paper “FAIRsharing Collaboration with DataCite and Publishers: Data Repository Selection, Criteria That Matter”. The RDA group came about because of a shared view that “The research data policy landscape of funders, institutions and publishers is…too complex”.
Many researchers see publishers as the dissemination point of their research, so naturally they bring their questions to the publishers. The problem is that publishers for the most part cater to a global audience, whereas policy may have nuances at the funder or national level. Cameron Neylon echoes this sentiment, noting that policies on RDM and RDS are being developed by a number of agencies, primarily in the Global North. These policies are broadly consistent in aspiration and outline but differ significantly in the details of implementation.
One thread that runs through all of the stakeholders’ policies looks something like this:
- Make your data as open as possible, as closed (read: restricted access) as necessary
- The latest point at which you should publish the data is when the paper describing your results is published. The minimum amount of data should be that which backs up or helps reproduce the findings of the paper
- The data should go into a subject-specific repository; if there isn’t one, put it in a generalist repository
There is also a general consensus across policies on the need for Findable, Accessible, Interoperable and Reusable (FAIR) data. At Figshare, we have previously noted that in moving towards FAIR data, stage one is getting the files onto an accessible platform with persistent storage, accompanied by metadata. Stage two is improving that metadata. Data from Springer Nature’s Scientific Data journal tells us that 70% of researchers did not have a subject-specific repository. However, the scope of generalist repositories varies greatly, as does the level of curation. So perhaps, as we head into 2020, the major focus for publishers should be to align on stage one: ensuring that, at the point of paper publication, the data needed to reproduce the paper is made available in a repository with a DOI. There is an advantage for publishers here too: publications that link to research data see a 25% citation advantage.
For publishers interested in improving their research data workflows, the team at Figshare is always happy to answer any questions and provide the infrastructure to move very far, very fast in this space. Alternatively, the RDA working groups provide fantastic forums for discussion and for sharing best practice.