The goal of this vignette is to provide an explicit map between the metadata fields used by cffr and each one of the valid keys of the Citation File Format schema version 1.2.0.
Summary
We summarize here the fields that cffr can coerce and the original source of information for each of them. The details of each key are presented in the next section of the document. The assessment of fields is based on the Guide to Citation File Format schema version 1.2.0 (Druskat et al. 2021).
The abstract key is extracted from the "Description" field of the DESCRIPTION file.
Example
library(cffr)

# Create a cff object for rmarkdown
cff_obj <- cff_create("rmarkdown")

# Get DESCRIPTION of rmarkdown to check
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cat(cff_obj$abstract)
#> Convert R Markdown documents into a variety of formats.

cat(pkg$get("Description"))
#> Convert R Markdown documents into a variety of formats.
The authors key is coerced from the "Authors" or "Authors@R" field of the DESCRIPTION file. By default, persons with the role "aut" or "cre" are considered; this can be changed via the authors_roles argument.
Example
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr")
pkg <- desc::desc(path)

# See persons listed
pkg$get_authors()
#> [1] "Diego Hernangómez <fake@gmail.com> [aut, cre] (ORCID: <https://orcid.org/0000-0001-8457-4658>, email: error, not-valid: error)"
#> [2] "Joe Doe <I am a wrong email> [aut] (affiliation: This One, country: ES, date-end: error)"
#> [3] "Pepe Doe <fake@gmail.com> [aut] (error)"
#> [4] "I am an entity [cre] (date-end: 2020-01-01, affiliation: error)"
#> [5] "ERROR entity [cph] (for the administrative boundaries.)"
#> [6] "ERROR person [cph] (ORCID: <https://orcid.org/0000-0003-2042-7063>, for the gisco_countrycode dataset.)"

# Default behaviour, use authors and creators (maintainers)
cff_obj <- cff_create(path)
cff_obj$authors
#> - family-names: Hernangómez
#>   given-names: Diego
#>   email: fake@gmail.com
#>   orcid: https://orcid.org/0000-0001-8457-4658
#> - family-names: Doe
#>   given-names: Joe
#>   affiliation: This One
#>   country: ES
#> - family-names: Doe
#>   given-names: Pepe
#>   email: fake@gmail.com
#> - name: I am an entity
#>   date-end: '2020-01-01'

# Use now Copyright holders and maintainers
cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph"))
cff_obj_alt$authors
#> - family-names: Hernangómez
#>   given-names: Diego
#>   email: fake@gmail.com
#>   orcid: https://orcid.org/0000-0001-8457-4658
#> - name: I am an entity
#>   date-end: '2020-01-01'
#> - name: ERROR entity
#> - family-names: person
#>   given-names: ERROR
#>   orcid: https://orcid.org/0000-0003-2042-7063
The commit key is extracted from the "RemoteSha" field of the DESCRIPTION file. This field is present for packages installed from the r-universe or with packages such as remotes or pak.
Example
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
pkg <- desc::desc(path)

# See RemoteSha
pkg$get("RemoteSha")
#>                                  RemoteSha 
#> "bdd9a29d7eabcc43c3195fe461f884932eba763c"

cff_read(path)
#> cff-version: 1.2.0
#> message: 'To cite package "codemetar" in publications use:'
#> type: software
#> title: 'codemetar: Generate ''CodeMeta'' Metadata for R Packages'
#> version: 0.3.5
#> authors:
#> - family-names: Boettiger
#>   given-names: Carl
#>   email: cboettig@gmail.com
#>   orcid: https://orcid.org/0000-0002-1642-628X
#> - family-names: Salmon
#>   given-names: Maëlle
#>   orcid: https://orcid.org/0000-0002-2815-0399
#> abstract: The 'Codemeta' Project defines a 'JSON-LD' format for describing software
#>   metadata, as detailed at <https://codemeta.github.io>. This package provides utilities
#>   to generate, parse, and modify 'codemeta.json' files automatically for R packages,
#>   as well as tools and examples for working with 'codemeta.json' 'JSON-LD' more generally.
#> repository: https://ropensci.r-universe.dev
#> repository-code: https://github.com/ropensci/codemetar
#> url: https://docs.ropensci.org/codemetar/
#> date-released: '2024-02-09'
#> contact:
#> - family-names: Boettiger
#>   given-names: Carl
#>   email: cboettig@gmail.com
#>   orcid: https://orcid.org/0000-0002-1642-628X
#> keywords:
#> - metadata
#> - codemeta
#> - ropensci
#> - citation
#> - credit
#> - linked-data
#> - json-ld
#> - peer-reviewed
#> - r
#> - r-package
#> - rstats
#> license: GPL-3.0-only
#> commit: bdd9a29d7eabcc43c3195fe461f884932eba763c
#> doi: 10.32614/CRAN.package.codemetar
The contact key is coerced from the "Authors" or "Authors@R" field of the DESCRIPTION file. Only persons with the role "cre" (i.e., the maintainer(s)) are considered.
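The role filter described above can be sketched with base R person objects. This is only an illustration of the rule, not cffr's internal code, and the three persons below are hypothetical.

```r
# Hypothetical authors, mirroring what an Authors@R field could contain
authors <- c(
  person("Jane", "Doe", email = "jane@example.com", role = c("aut", "cre")),
  person("John", "Smith", role = "aut"),
  person("Ann", "Lee", role = "cph")
)

# Flag every person whose roles include "cre" (the maintainer role)
is_cre <- vapply(
  unclass(authors),
  function(p) "cre" %in% p$role,
  logical(1)
)

# Only the flagged persons would be candidates for the contact key
contact <- authors[is_cre]
print(contact)
```

In this sketch only the first person is kept, since the other two carry the roles "aut" and "cph" alone.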
The identifiers key includes all the possible identifiers of the package:
From the DESCRIPTION file, it includes all the URLs not included in url or repository-code.
From the CITATION file, it includes all the DOIs not included in doi and the identifiers (if any) not included in the "identifiers" key of preferred-citation.
If the package is on CRAN and it has a CITATION file providing a DOI, the DOI provided by CRAN is added as well.
# A DESCRIPTION file without keywords
nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp2 <- tempfile("DESCRIPTION")

# Create a temporary file
file.copy(nokeywords, tmp2)
#> [1] TRUE

pkgnokeywords <- desc::desc(tmp2)
cffnokeywords <- cff_create(tmp2)

# Won't appear
cat(cffnokeywords$keywords)

pkgnokeywords
#> Type: Package
#> Package: basicdesc
#> Title: A Basic Description
#> Version: 0.1.6
#> Authors@R (parsed):
#>     * Marc Basic <marcbasic@gmail.com> [aut, cre, cph]
#> Description: A very basic description. Should parse without problems.
#> License: GPL-3
#> URL: https://github.com/basic/package, https://basic.github.io/package
#> BugReports: https://github.com/basic/package/issues
#> Encoding: UTF-8
#> LazyData: true
#> RoxygenNote: 6.0.1.9000

# Adding Keywords
desc::desc_set("X-schema.org-keywords",
  "keyword1, keyword2, keyword3",
  file = tmp2
)
#> Type: Package
#> Package: basicdesc
#> Title: A Basic Description
#> Version: 0.1.6
#> Authors@R (parsed):
#>     * Marc Basic <marcbasic@gmail.com> [aut, cre, cph]
#> Description: A very basic description. Should parse without problems.
#> License: GPL-3
#> URL: https://github.com/basic/package, https://basic.github.io/package
#> BugReports: https://github.com/basic/package/issues
#> Encoding: UTF-8
#> LazyData: true
#> RoxygenNote: 6.0.1.9000
#> X-schema.org-keywords: keyword1, keyword2, keyword3

cat(cff_create(tmp2)$keywords)
#> keyword1 keyword2 keyword3
Additionally, if the source code of the package is hosted on GitHub, cffr can retrieve the topics of the repository via the GitHub API and include those topics as keywords. This option is controlled via the gh_keywords argument.
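As a rough illustration of how DESCRIPTION keywords and repository topics could end up in a single list, the base R sketch below reads the "X-schema.org-keywords" field from a temporary DESCRIPTION file and merges it with a vector of topics. The gh_topics vector is a hypothetical stand-in for a GitHub API response, and merging with unique() is an assumption about the outcome rather than a description of cffr's code.

```r
# A minimal, hypothetical DESCRIPTION file with keywords
desc_file <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: basicdesc",
    "Version: 0.1.6",
    "X-schema.org-keywords: keyword1, keyword2, keyword3"
  ),
  desc_file
)

# Read the comma-separated field and split it into single keywords
raw <- read.dcf(desc_file, fields = "X-schema.org-keywords")[1, 1]
desc_keywords <- trimws(strsplit(raw, ",", fixed = TRUE)[[1]])

# Hypothetical topics, standing in for what the GitHub API would return
gh_topics <- c("keyword2", "r-package")

# Assumption: duplicates are dropped when both sources are combined
keywords <- unique(c(desc_keywords, gh_topics))
print(keywords)
```

With cffr itself, passing gh_keywords = FALSE to cff_create() would skip the GitHub lookup altogether.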
The license-url key is not extracted from the metadata of the package. See its description in the Guide to CFF schema v1.2.0:
description: The URL of the license text under which the software or dataset is licensed (only for non-standard licenses not included in the SPDX License List).
The preferred-citation key is extracted from the CITATION file. If several references are provided, cffr selects the first citation as the "preferred-citation" and treats the rest of them as references.
Example
cffobj <- cff_create("rmarkdown")
cffobj$`preferred-citation`
#> type: manual
#> title: 'rmarkdown: Dynamic Documents for R'
#> authors:
#> - family-names: Allaire
#>   given-names: JJ
#>   email: jj@posit.co
#> - family-names: Xie
#>   given-names: Yihui
#>   email: xie@yihui.name
#>   orcid: https://orcid.org/0000-0003-0645-5666
#> - family-names: Dervieux
#>   given-names: Christophe
#>   email: cderv@posit.co
#>   orcid: https://orcid.org/0000-0003-4474-2498
#> - family-names: McPherson
#>   given-names: Jonathan
#>   email: jonathan@posit.co
#> - family-names: Luraschi
#>   given-names: Javier
#> - family-names: Ushey
#>   given-names: Kevin
#>   email: kevin@posit.co
#> - family-names: Atkins
#>   given-names: Aron
#>   email: aron@posit.co
#> - family-names: Wickham
#>   given-names: Hadley
#>   email: hadley@posit.co
#> - family-names: Cheng
#>   given-names: Joe
#>   email: joe@posit.co
#> - family-names: Chang
#>   given-names: Winston
#>   email: winston@posit.co
#> - family-names: Iannone
#>   given-names: Richard
#>   email: rich@posit.co
#>   orcid: https://orcid.org/0000-0003-3925-190X
#> year: '2025'
#> notes: R package version 2.30
#> url: https://github.com/rstudio/rmarkdown

citation("rmarkdown")[1]
#> Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins
#> A, Wickham H, Cheng J, Chang W, Iannone R (2025). _rmarkdown: Dynamic
#> Documents for R_. R package version 2.30,
#> <https://github.com/rstudio/rmarkdown>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {rmarkdown: Dynamic Documents for R},
#>     author = {JJ Allaire and Yihui Xie and Christophe Dervieux and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
#>     year = {2025},
#>     note = {R package version 2.30},
#>     url = {https://github.com/rstudio/rmarkdown},
#>   }
The references key is extracted from the CITATION file if several references are provided. The first citation is considered the preferred-citation and the rest of them are the "references". cffr also extracts the package dependencies and adds them to this field using citation(auto = TRUE) on each dependency.
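The split between the first citation and the remaining ones can be sketched with base R bibentry objects. The two entries below are hypothetical, and the snippet only mirrors the rule stated above; it is not cffr's own code. The last lines show the kind of automatic citation that citation(auto = TRUE) generates for an installed package, using "stats" as a stand-in for a dependency.

```r
# Two hypothetical entries, as a CITATION file with several references might yield
cites <- c(
  bibentry(
    "Manual",
    title = "pkg: A Hypothetical Package",
    author = person("Jane", "Doe"),
    year = "2024",
    note = "R package version 1.0"
  ),
  bibentry(
    "Article",
    title = "A Hypothetical Paper",
    author = person("Jane", "Doe"),
    journal = "Journal of Examples",
    year = "2023"
  )
)

# First entry: candidate for preferred-citation
preferred <- cites[1]

# Remaining entries: candidates for references
references <- cites[-1]

# Auto-generated citation for an installed package ("stats" is a stand-in)
dep_cite <- citation("stats", auto = TRUE)

print(preferred)
print(references)
```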
The repository key is extracted from the "Repository" field of the DESCRIPTION file. Usually, this field is auto-populated when a package is hosted on a repository (like CRAN or the r-universe). For packages without this field in the DESCRIPTION file (which is the typical case for an in-development package), cffr tries to find the package in any of the default repositories specified in options("repos").
Bioconductor packages are identified by the presence of a "biocViews" field in the DESCRIPTION file.
The repository-code key is extracted from the "BugReports" or "URL" fields of the DESCRIPTION file. cffr tries to identify the URL of the source code on a set of known code-hosting platforms.
The url key is extracted from the "BugReports" or "URL" fields of the DESCRIPTION file. It corresponds to the first URL that is different from repository-code.
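The two checks described above (a missing "Repository" field and the presence of "biocViews") can be illustrated with base R alone. The DESCRIPTION content below is hypothetical, and the sketch only inspects the fields; it does not reproduce the repository search that cffr performs.

```r
# A hypothetical in-development DESCRIPTION: no "Repository" field
desc_file <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: devpkg",
    "Version: 0.0.0.9000",
    "biocViews: Software"
  ),
  desc_file
)

# read.dcf() returns a one-row matrix whose columns are the fields present
fields <- read.dcf(desc_file)

has_repository <- "Repository" %in% colnames(fields)
is_bioc <- "biocViews" %in% colnames(fields)

# Without a "Repository" field, the candidates are the configured repositories
fallback_repos <- if (has_repository) character(0) else getOption("repos")

print(has_repository)
print(is_bioc)
print(fallback_repos)
```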
Example
# Many urls
manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")

cat(cff_create(manyurls)$url)
#> https://test.github.io/package/

# Check
desc::desc(manyurls)
#> Type: Package
#> Package: manyurls
#> Title: A lot of urls
#> Version: 0.1.6
#> Authors@R (parsed):
#>     * Marc Basic <marcbasic@gmail.com> [aut, cre, cph]
#> Description: This package has many urls. Specifically, 1 Bug Reports and 6
#>     URLs. Expected is to have 1 repository-code, 1 url and 3 URLs, since
#>     there is 1 duplicate and 1 invalid url.
#> License: GPL-3
#> URL: https://github.com/test/package, https://test.github.io/package/,
#>     https://r-forge.r-project.org/projects/test/, http://google.ru,
#>     https://gitlab.com/r-packages/behaviorchange, this.is.not.an.url
#> BugReports: https://github.com/test/package/issues
#> Encoding: UTF-8
Druskat, Stephan, Jurriaan H. Spaaks, Neil Chue Hong, Robert Haines, James Baker, Spencer Bliven, Egon Willighagen, David Pérez-Suárez, and Alexander Konovalov. 2021. “Citation File Format.” https://doi.org/10.5281/zenodo.5171937.
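The selection described in the example (one repository-code, one url, and the remaining valid URLs) can be approximated in base R. This is a simplified sketch under stated assumptions: only github.com is treated as a code host, the "/issues" suffix is stripped from the BugReports URL to detect the duplicate, and validity is checked with a plain http(s) prefix test. cffr's real rules may differ.

```r
# Values taken from the URL and BugReports fields of the example DESCRIPTION
url_field <- c(
  "https://github.com/test/package",
  "https://test.github.io/package/",
  "https://r-forge.r-project.org/projects/test/",
  "http://google.ru",
  "https://gitlab.com/r-packages/behaviorchange",
  "this.is.not.an.url"
)
bug_reports <- "https://github.com/test/package/issues"

# Assumption: the issue tracker URL maps back to the repository URL
candidates <- unique(c(sub("/issues/?$", "", bug_reports), url_field))

# Drop entries that are not http(s) URLs
valid <- candidates[grepl("^https?://", candidates)]

# Assumption: only github.com counts as a code host in this sketch
is_code_host <- grepl("^https?://github\\.com/", valid)
repository_code <- valid[is_code_host][1]

# The first remaining URL plays the role of url; the rest are left over
remaining <- setdiff(valid, repository_code)
url <- remaining[1]
other_urls <- remaining[-1]

print(repository_code)
print(url)
print(other_urls)
```

Under these assumptions the sketch yields one repository-code, one url and three remaining URLs, in line with the counts stated in the Description field of the example.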