--- title: "Bibentry for FAIR datasets" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Bibentry for FAIR datasets} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(dataset) iris_dataset_2 <- iris_dataset ``` An important aim of the **dataset** package is to create native R objects where bibliographic metadata cannot be detached, thus ensuring Findability, Accessibility, Interoperability and Reusability in the long run. We provide an interface and methods to add metadata required by open data repositories according to the more general Dublin Core library metadata standard, or the more specific DataCite metadata standard. ```{r bibentryprint} print(get_bibentry(iris_dataset_2), "Bibtex") ``` ## Titles ```{r, get-title} dataset_title(iris_dataset) ``` ```{r, change-title} dataset_title(iris_dataset_2, overwrite=TRUE) <- "The Famous Iris Dataset" get_bibentry(iris_dataset_2) ``` ## Creators The \code{Creator} corresponds to [dct:creator](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/creator/) in Dublin Core and Creator in DataCite, the two most important metadata definitions for publishing datasets in repositories. They refer to the The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the dataset. This property will be used to formulate the citation. ```{r creator} creator(iris_dataset) ``` ```{r creator-modify} iris_dataset_2 <- iris_dataset # Add a new creator, with overwriting existing authorship information: creator(iris_dataset_2, overwrite=TRUE) <- person("Jane", "Doe", role = "aut") # Add a new creator, without overwriting existing authorship information: creator(iris_dataset_2, overwrite=FALSE) <- person("John", "Doe", role = "ctb") # The two new creation contributors: creator(iris_dataset_2) ``` ## Further descriptive metadata about the whole dataset ### Publication year The publication year is usually one of the most important descriptive metadata in repositories and libraries: ```{r irispublicationyear} publication_year(iris_dataset_2) ``` The default value is `:unas` for unassigned values: ```{r publicationyeardefault} # Revert to default (unassigned): publication_year(iris_dataset_2) <- NULL # Get the default value: publication_year(iris_dataset_2) ``` ### Language ```{r, language} # Get the language: language(iris_dataset) # Reset the language: language(iris_dataset_2) <- "French" language(iris_dataset_2) ``` ### Rights statement ```{r rights} # Add rights statement to the dataset rights(iris_dataset_2, overwrite = TRUE) <- "GNU-2" ``` Some metadata functions prevent accidental overwriting, except for the default `:unas` unassigned and `:tba` to-be-announced values. ```{r prevent-overwrite} rights(iris_dataset_2) <- "CC0" rights(iris_dataset_2) ``` Overwriting the rights statement needs an explicit approval: ```{r overwrite} rights(iris_dataset_2, overwrite = TRUE) <- "GNU-2" ``` DataCite currently allows the use of subproperties. For example, the _Creative Commons Attribution 4.0 International_ would be described as: ```{r rightsubproperties} list ( schemeURI="https://spdx.org/licenses/", rightsIdentifierScheme="SPDX", rightsIdentifier="CC-BY-4.0", rightsURI="https://creativecommons.org/licenses/by/4.0/") ``` The use of subproperties will be later implemented. ### Description The description is currently implemented as a character string. However, DataCite 4.6 states that if Description is used, `descriptionType` is mandatory. This will be implemented later. ``` Example abstract ``` ```{r printdescription} description(iris_dataset) ``` ### Subject ```{r} subject(iris_dataset) ``` ``` Climate change mitigation Climate change processes ``` ```{r} subject_create( term = "data sets", subjectScheme = "Library of Congress Subject Headings (LCSH)", schemeURI = "https://id.loc.gov/authorities/subjects.html", valueURI = "http://id.loc.gov/authorities/subjects/sh2018002256" ) ``` ### Identifiers ```{r identifier} # Add rights statement to the dataset identifier(iris_dataset_2) ``` ## All bibliographic information Get the metadata according to the DataCite definition: ```{r datacite} print(as_datacite(iris_dataset), "Bibtex") ``` And according to DCTERMS (Dublin Core): ```{r dc} print(as_dublincore(iris_dataset), "Bibtex") ```