Open research, The State of Open Data / August 7, 2020
The current climate has put a spotlight on the value of data sharing, curation and good data management for boosting the reproducibility and reliability of research. That value has never been in sharper focus: the real-life impact of data sharing is visible as we navigate this pandemic. After five years of collaboration on an annual survey of researchers, we can see increasingly positive attitudes and behaviours when it comes to data sharing. Yet many researchers, and others within the research community, still face roadblocks, whether because of challenges in working practices, a lack of tools or services to support them, or wider misconceptions about the role, use and appropriate re-use of data. This is a problem.
Since 2016, Figshare, Springer Nature and Digital Science have partnered on the State of Open Data report, based on a survey tracking researcher attitudes and behaviours towards open data sharing and research data management. The most recent survey launched in May this year and, given the global pandemic, we took the opportunity to ask researchers how COVID-19 was impacting their ability to carry out research, and for their views on reuse of data and collaboration. We wanted a better understanding of how researcher behaviour was being affected. When the survey was conducted, much of the world was under lockdown. Restrictions have since eased, but fears of a second wave are growing. We are aware of the time sensitivity of these insights, so rather than wait until October we wanted to release a snapshot of the data to the community as soon as we could. This gives stakeholders time to analyse the data and to inform policy and actions as we enter a new phase of the pandemic. The data published this week comes from survey responses collected between 24th May and 18th June (n=3,436).
As a snapshot of the full report, key takeaways at this stage indicate that:
COVID-19 has demonstrated that the research community has the ability to react to a crisis, and quickly. We have seen an increase in the publication of preprints, expedited peer review and clinical trials, an increase in collaboration and data sharing, and funders allowing the diversion of funds to COVID-19 research. Together, these changes demonstrate how responsive our sector can be under immense pressure, at a time when the use, re-use, access to and engagement with research has been, and continues to be, critical. In turn, the practices and outreach conducted during this time have led us to a greater understanding of the disease, which will hopefully result in better therapeutics and a successful vaccine.
Lockdown has also notably resulted in greater intended re-use of data: 64% of respondents are likely to reuse their own data during lockdown, and a similar percentage (65%) over the next 12-18 months. This compares to 58% who report previously reusing their own data. We see a similar increase in expected reuse of others’ data: 50% during a state of lockdown and 51% over the next 12-18 months, compared with 44% of respondents who report that they have previously reused others’ data. The inability of many researchers to gain access to their labs or carry out new research has fuelled this planned increase in the reuse of their own and others’ data.
Some early responses to our findings have raised concerns that the re-use of data (the same data for new publications) could fuel academic misconduct, while others suggest that academics have made the most of this time to analyse data they had not yet got around to. These concerns underscore the importance, and value, of releasing all data related to a publication so that its provenance is clear and the data can be scrutinised by reviewers and readers. If the data underpinning a publication is published alongside the article, there is less chance of researchers salami slicing or writing contradictory papers from the same dataset. Published, citable datasets are, as they should be, being recognised as a research output equal to the paper.
What we have also seen emerge from the survey results is a much greater focus on collaboration: collaboration across researchers, better collaboration to support the sustainable use of data, and a greater awareness among funders, research organisations and publishers of how to enable sustainable re-use of data and of the structures needed to do so. Open data, although faced with challenges, is integral to advancing the conversations and collaboration around research engagement. Where resources are limited, appropriate re-use of data enables the vital research that is needed, pandemic or not, to continue to develop, and maximises the return on investment in the original study. With the survey results indicating a heightened awareness of the role and needs of open data, we can take this as a really positive sign that attitudes and practices are changing and that collaboration across stakeholders is taking place. This in turn will enable real, effective and sustainable change in the use of open data and the benefits of open research.
What we are seeing from this snapshot of data, part way through a pandemic, is the unwavering value of having access to data and the importance of rapid data sharing. Good and appropriate data management has enabled, and will continue to enable, researchers to reuse their own data where they are not able to conduct new experiments. In an environment where many are still unable to return to active research settings, this is vital in allowing research, and collaboration, to continue.
Whilst there is arguably still hesitation around the ways in which data can be reused and shared appropriately and sustainably, throwing the spotlight on open data, its practices and its value during a pandemic opens an important conversation, and one that is needed in order to continue to effect positive change through collaboration, awareness and innovation. We all have a role to play in this: supporting uptake through policy and credit, improving the management of open research and data, and developing tools and services that enable good, high-quality research to be conducted, collaborated on and shared, both through times of crisis and as we move back to the ‘new normal’, whatever that may look like for the wider research community.
This snapshot survey dataset is available on Figshare, along with an infographic of the key findings. The survey closed in July 2020. Further analysis of this data will be available in the State of Open Data report, due to be released in October 2020 along with the complete dataset.
Recently, we’ve been looking at the reusability of data in line with the FAIR principles. The ‘R’ of FAIR stands for Reusable and refers to data for which: “R1. (meta)data have a plurality of accurate and…