Recently, we’ve been looking at the reusability of data in line with the FAIR principles. The ‘R’ of FAIR stands for Reusable, and the principle is defined as follows:
“R1. (meta)data are richly described with a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.”
To showcase the possibilities for reuse with Figshare, we enlisted Jan Tulp, data visualization expert and founder of Tulp Interactive, to reuse three datasets, each published in Figshare as part of a publication. The task was to take these three datasets and create visualizations using only the data, the metadata found on the item page, and the original publication. The following is a breakdown of the Reusable principle in line with Jan’s experience.
R1. (meta)data are richly described with a plurality of accurate and relevant attributes.
Jan could easily locate the metadata for each dataset, along with the options to download and cite it. He could also imagine using the API (Application Programming Interface) to pull the data from Figshare and reuse it.
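As a rough illustration of the API route Jan had in mind, the sketch below fetches an item's metadata and picks out the fields a reuser would check first: citation, license, references and file download links. It is a minimal sketch, not an official client. The endpoint layout (`https://api.figshare.com/v2/articles/<id>`) and the field names (`license`, `references`, `files`, `download_url`, `citation`) are assumptions about the public Figshare API, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed root of the public Figshare API.
API_BASE = "https://api.figshare.com/v2"


def article_url(article_id: int) -> str:
    """Build the metadata URL for a public item (assumed endpoint layout)."""
    return f"{API_BASE}/articles/{article_id}"


def fetch_article(article_id: int) -> dict:
    """Download an item's metadata as a dict. Requires network access."""
    with urlopen(article_url(article_id), timeout=30) as response:
        return json.load(response)


def reuse_summary(article: dict) -> dict:
    """Collect the reuse-relevant parts of an item's metadata.

    The keys read here are assumed field names; anything missing
    falls back to an empty value instead of raising an error.
    """
    license_info = article.get("license") or {}
    files = article.get("files") or []
    return {
        "title": article.get("title", ""),
        "citation": article.get("citation", ""),
        "license_name": license_info.get("name", ""),
        "license_url": license_info.get("url", ""),
        "references": list(article.get("references") or []),
        "download_urls": [f.get("download_url", "") for f in files],
    }
```

Calling `reuse_summary(fetch_article(item_id))` with the numeric id of a public item would then give a quick overview of how the data may be reused and where it came from.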
Thorough documentation of the data was hugely important in allowing Jan to understand and reuse it.
“Especially when creating a visualization, the meaning of the data is just as important as the data types available, so that you can make well informed design choices.”
R1.1. (meta)data are released with a clear and accessible data usage license.
The license, which for these datasets was Creative Commons, was clearly stated in the metadata and included a link to the license website for more information on reuse policies.
R1.2. (meta)data are associated with their provenance.
All three items included a link to the publication in which the data can be found, as well as references containing links relevant to the data.
R1.3. (meta)data meet domain-relevant community standards.
This is much more difficult to measure, especially as the data was not reused by a domain expert. However, Jan found all the relevant information he needed to reuse the data within the data itself, the metadata, and the publication.
Each of the visualizations can be found below.
Springer Nature
Heenan, Adel; Williams, Ivor; Acoba, Tomoko; DesRochers, Annette; Kanemura, Troy; Kosaki, Randall; et al. (2017): NOAA Pacific RAMP fish SPC 2010-2017 dataset. figshare. Dataset.
Link to visualization
Link to source data
Springer Nature is a leading academic and educational publisher serving the needs of researchers, students, teachers and professionals around the world.
Public Library of Science (PLoS)
Arbyn, Marc; Fabri, Valérie; Temmerman, Marleen; Simoens, Cindy (2014): Consumption of Pap smears, three-year screening coverage and overuse for women between 25 and 64 year old, by Region (Belgium, 1996–2006). PLOS ONE. Dataset.
Public Library of Science (PLoS) was founded as a nonprofit Open Access publisher, innovator and advocacy organization with a mission to advance progress in science and medicine by leading a transformation in research communication.
ChemRxiv
Wójcikowski, Maciej; Kukiełka, Michał; Stepniewska-Dziubinska, Marta; Siedlecki, Pawel (2018): Development of a Protein-Ligand Extended Connectivity (PLEC) Fingerprint and Its Application for Binding Affinity Predictions. ChemRxiv. Preprint.
ChemRxiv is an open preprint server for the global chemistry community. You can put your research immediately out on the web and share it with other scientists and colleagues, prior to formal peer review. ChemRxiv is openly accessible, with no subscription fees for readers and no submission charges for authors.