Manipulating Citations with cffr

cffr is a tool aimed at R package developers. Its main goal is to create a CITATION.cff file from the metadata already present in the package, namely the DESCRIPTION file and, if available, the citation information in inst/CITATION.

What is a CITATION.cff file?

Citation File Format (CFF) (Druskat et al. 2021) (v1.2.0) files are plain text files with human- and machine-readable citation information for software (and data sets). Code developers can include them in their repositories to let others know how to correctly cite their software.
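
To make the format concrete, here is a minimal sketch that writes a small CITATION.cff by hand using only base R. The field values mirror the default object printed by cff() later in this document; the file location (a temporary directory) is an assumption chosen so that nothing in a real project is overwritten.

```r
# Minimal CITATION.cff content, written by hand (cffr is not needed here).
# The values mirror the default object that cff() prints.
minimal_cff <- c(
  "cff-version: 1.2.0",
  "message: If you use this software, please cite it using these metadata.",
  "title: My Research Software",
  "authors:",
  "- family-names: Doe",
  "  given-names: John"
)

# Write to a temporary directory so no real file is overwritten
cff_path <- file.path(tempdir(), "CITATION.cff")
writeLines(minimal_cff, cff_path)

# Show the file as it would appear in a repository
cat(readLines(cff_path), sep = "\n")
```

In a real repository the file would sit at the top level, next to the README.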

This format is becoming popular within the software citation ecosystem. Recently GitHub, Zenodo and Zotero have included full support of this citation format (Druskat 2021).

GitHub support is of special interest. It was announced by Nat Friedman (@natfriedman) on July 27, 2021.

See Customize your repository/About CITATION files for more info.

Creating a CITATION.cff file for my R package

With cffr, creating a CITATION.cff file is straightforward. You just need to run cff_write():

library(cffr)

cff_write()

# You are done!

Under the hood, cff_write() extracts the package metadata (with cff_create()), writes the CITATION.cff file and validates the result (with cff_validate()).

Congratulations! Now you have a full CITATION.cff file for your R package.
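
The individual steps can also be run by hand, which helps when inspecting the metadata before anything is written. The sketch below assumes that cff_create() accepts the name of an installed package (cffr itself is used purely as an example) and that its gh_keywords and dependencies arguments can be switched off to avoid network access and keep the object short. The output file is a temporary file, not the project root.

```r
library(cffr)

# Step 1: extract the metadata of an installed package.
# "cffr" is only an example; any installed package name should work.
pkg_cff <- cff_create("cffr", gh_keywords = FALSE, dependencies = FALSE)

# Step 2: validate the object against the schema.
# The (invisible) result is assumed to be a logical value.
is_valid <- cff_validate(pkg_cff, verbose = FALSE)

# Step 3: write the file only if the validation succeeded
out <- tempfile(fileext = ".cff")
if (isTRUE(is_valid)) {
  cff_write(pkg_cff, outfile = out, verbose = FALSE)
}
```

Splitting the steps this way is optional; cff_write() alone remains the shortest path.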

Modifying your CITATION.cff file

You can customize a cff object (a custom class provided by cffr) with the coercion system included in the package and with the keys parameter.

We will create a cff object with cff() (for example purposes only) and then add or modify its contents.

Adding new fields

newobject <- cff()

newobject
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: My Research Software
#> authors:
#> - family-names: Doe
#>   given-names: John

The valid keys of the Citation File Format schema version 1.2.0 can be displayed with cff_schema_keys():

cff_schema_keys()
#>  [1] "cff-version"         "message"             "type"               
#>  [4] "license"             "title"               "version"            
#>  [7] "doi"                 "abstract"            "authors"            
#> [10] "preferred-citation"  "repository"          "repository-artifact"
#> [13] "repository-code"     "url"                 "date-released"      
#> [16] "contact"             "keywords"            "references"         
#> [19] "commit"              "identifiers"         "license-url"
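
As a small sketch of how this list can be used, the keys you plan to set can be checked against cff_schema_keys() before building the object. The vector name wanted and the misspelled key "homepage" are made up for illustration.

```r
library(cffr)

# Keys we intend to set; "homepage" is a deliberate mistake
wanted <- c("url", "version", "repository", "homepage")

# Anything not returned by cff_schema_keys() is not a valid top-level key
unknown <- setdiff(wanted, cff_schema_keys())
unknown
#> [1] "homepage"
```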

In this case, we are going to add url, version and repository. We will also overwrite the title key. We just need to pass those parameters to cff_modify():

modobject <- cff_modify(newobject,
  url = "https://ropensci.org/",
  version = "0.0.1",
  repository = "https://github.com/ropensci/cffr",
  # If the field is already present, it is overwritten
  title = "Modifying a 'cff' object"
)

modobject
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> authors:
#> - family-names: Doe
#>   given-names: John
#> url: https://ropensci.org/
#> version: 0.0.1
#> repository: https://github.com/ropensci/cffr

# Validate against the schema

cff_validate(modobject)
#> ══ Validating cff ══════════════════════════════════════════════════════════════
#> ✔ Congratulations! This <cff> is valid

Persons and references

cffr provides two functions, as_cff_person() and as_cff(), that convert person and bibentry objects (see ?person and ?bibentry) according to the Citation File Format schema.

Following the previous example, we first add a new author. To do that, we extract the current authors of the object and append the coerced person:

# Valid person keys

cff_schema_definitions_person()
#>  [1] "address"       "affiliation"   "alias"         "city"         
#>  [5] "country"       "email"         "family-names"  "fax"          
#>  [9] "given-names"   "name-particle" "name-suffix"   "orcid"        
#> [13] "post-code"     "region"        "tel"           "website"

# Create the person

chiquito <- person("Gregorio",
  "Sánchez Fernández",
  email = "fake@email2.com",
  comment = c(
    alias = "Chiquito de la Calzada",
    city = "Malaga",
    country = "ES",
    ORCID = "0000-0000-0000-0001"
  )
)

chiquito
#> [1] "Gregorio Sánchez Fernández <fake@email2.com> (Chiquito de la Calzada, Malaga, ES, <https://orcid.org/0000-0000-0000-0001>)"

# To cff
chiquito_cff <- as_cff_person(chiquito)
chiquito_cff
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001


# Append to previous authors

newauthors <- c(modobject$authors, chiquito_cff)
newauthors
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001

newauthorobject <- cff_modify(modobject, authors = newauthors)

newauthorobject
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> authors:
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001
#> url: https://ropensci.org/
#> version: 0.0.1
#> repository: https://github.com/ropensci/cffr

cff_validate(newauthorobject)
#> ══ Validating cff ══════════════════════════════════════════════════════════════
#> ✔ Congratulations! This <cff> is valid
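
as_cff_person() is applied above to a single person object. As a hedged sketch, the same call is assumed to accept a vector of person objects and to return one entry per person. The names and the email address below are invented for illustration.

```r
library(cffr)

# Two made-up persons combined into a single person vector
team <- c(
  person("Jane", "Doe", email = "jane@example.com"),
  person("John", "Smith")
)

# Coerce all of them in one call
team_cff <- as_cff_person(team)
team_cff

# One entry per person is expected
length(team_cff)
```

The result can then be passed to cff_modify() through the authors argument, as in the example above.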

Now, we may want to add references to our data. In the following example, we will add two references, one created with bibentry() and another with citation():

# Valid reference keys

cff_schema_definitions_refs()
#>  [1] "abbreviation"        "abstract"            "authors"            
#>  [4] "collection-doi"      "collection-title"    "collection-type"    
#>  [7] "commit"              "conference"          "contact"            
#> [10] "copyright"           "data-type"           "database-provider"  
#> [13] "database"            "date-accessed"       "date-downloaded"    
#> [16] "date-published"      "date-released"       "department"         
#> [19] "doi"                 "edition"             "editors"            
#> [22] "editors-series"      "end"                 "entry"              
#> [25] "filename"            "format"              "identifiers"        
#> [28] "institution"         "isbn"                "issn"               
#> [31] "issue"               "issue-date"          "issue-title"        
#> [34] "journal"             "keywords"            "languages"          
#> [37] "license"             "license-url"         "loc-end"            
#> [40] "loc-start"           "location"            "medium"             
#> [43] "month"               "nihmsid"             "notes"              
#> [46] "number"              "number-volumes"      "pages"              
#> [49] "patent-states"       "pmcid"               "publisher"          
#> [52] "recipients"          "repository"          "repository-artifact"
#> [55] "repository-code"     "scope"               "section"            
#> [58] "senders"             "start"               "status"             
#> [61] "term"                "thesis-type"         "title"              
#> [64] "translators"         "type"                "url"                
#> [67] "version"             "volume"              "volume-title"       
#> [70] "year"                "year-original"

# Auto coercion from another R package
base_r <- citation("base")

bib <- bibentry("Book",
  title = "This is a book",
  author = "Lisa Lee",
  year = 1980,
  publisher = "McGraw Hill",
  volume = 2
)

refs <- c(base_r, bib)

refs
#> R Core Team (2024). _R: A Language and Environment for Statistical
#> Computing_. R Foundation for Statistical Computing, Vienna, Austria.
#> <https://www.R-project.org/>.
#> 
#> Lee L (1980). _This is a book_, volume 2. McGraw Hill.

# Now to cff

refs_cff <- as_cff(refs)

refs_cff
#> - type: manual
#>   title: 'R: A Language and Environment for Statistical Computing'
#>   authors:
#>   - name: R Core Team
#>   institution:
#>     name: R Foundation for Statistical Computing
#>     address: Vienna, Austria
#>   year: '2024'
#>   url: https://www.R-project.org/
#> - type: book
#>   title: This is a book
#>   authors:
#>   - family-names: Lee
#>     given-names: Lisa
#>   year: '1980'
#>   publisher:
#>     name: McGraw Hill
#>   volume: '2'

Now the process is similar to the example with person: we just modify our cff object:

finalobject <- cff_modify(newauthorobject, references = refs_cff)

finalobject
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> authors:
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001
#> url: https://ropensci.org/
#> version: 0.0.1
#> repository: https://github.com/ropensci/cffr
#> references:
#> - type: manual
#>   title: 'R: A Language and Environment for Statistical Computing'
#>   authors:
#>   - name: R Core Team
#>   institution:
#>     name: R Foundation for Statistical Computing
#>     address: Vienna, Austria
#>   year: '2024'
#>   url: https://www.R-project.org/
#> - type: book
#>   title: This is a book
#>   authors:
#>   - family-names: Lee
#>     given-names: Lisa
#>   year: '1980'
#>   publisher:
#>     name: McGraw Hill
#>   volume: '2'

cff_validate(finalobject)
#> ══ Validating cff ══════════════════════════════════════════════════════════════
#> ✔ Congratulations! This <cff> is valid
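
The same conversion should work for other entry types. The sketch below coerces a journal article; the entry itself is made up for illustration, and the assumption is that as_cff() maps a BibTeX "Article" entry to the CFF reference type "article".

```r
library(cffr)

# A made-up article entry, for illustration only
art <- bibentry("Article",
  title = "This is an article",
  author = "Lisa Lee",
  journal = "Journal of Examples",
  year = 1981,
  volume = 3
)

# Coerce to the CFF reference structure
art_cff <- as_cff(art)
art_cff

# The reference type of the first (and only) entry
art_cff[[1]]$type
```

The coerced entry can be supplied to cff_modify() through the references argument, exactly as refs_cff was.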

Create your modified CITATION.cff file

The results can be written with cff_write():

# For example
tmp <- tempfile(fileext = ".cff")

see_res <- cff_write(finalobject, outfile = tmp)
#> ✔ 'C:\Users\diego\AppData\Local\Temp\RtmpYzIFLf\file710057903392.cff' generated
#> ══ Validating cff ══════════════════════════════════════════════════════════════
#> ✔ Congratulations! 'C:\Users\diego\AppData\Local\Temp\RtmpYzIFLf\file710057903392.cff' is valid

cat(readLines(tmp), sep = "\n")
#> # -----------------------------------------------------------
#> # CITATION file created with {cffr} R package, v1.0.0
#> # See also: https://docs.ropensci.org/cffr/
#> # -----------------------------------------------------------
#>  
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> version: 0.0.1
#> authors:
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001
#> repository: https://github.com/ropensci/cffr
#> url: https://ropensci.org/
#> references:
#> - type: manual
#>   title: 'R: A Language and Environment for Statistical Computing'
#>   authors:
#>   - name: R Core Team
#>   institution:
#>     name: R Foundation for Statistical Computing
#>     address: Vienna, Austria
#>   year: '2024'
#>   url: https://www.R-project.org/
#> - type: book
#>   title: This is a book
#>   authors:
#>   - family-names: Lee
#>     given-names: Lisa
#>   year: '1980'
#>   publisher:
#>     name: McGraw Hill
#>   volume: '2'

And finally we can read our created CITATION.cff file using cff_read():

reading <- cff_read(tmp)

reading
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> version: 0.0.1
#> authors:
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001
#> repository: https://github.com/ropensci/cffr
#> url: https://ropensci.org/
#> references:
#> - type: manual
#>   title: 'R: A Language and Environment for Statistical Computing'
#>   authors:
#>   - name: R Core Team
#>   institution:
#>     name: R Foundation for Statistical Computing
#>     address: Vienna, Austria
#>   year: '2024'
#>   url: https://www.R-project.org/
#> - type: book
#>   title: This is a book
#>   authors:
#>   - family-names: Lee
#>     given-names: Lisa
#>   year: '1980'
#>   publisher:
#>     name: McGraw Hill
#>   volume: '2'
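
A quick way to gain confidence in the written file is a round trip: write an object, read it back and compare a few fields. This is only a sketch; it compares individual keys instead of whole objects because, as the output above shows, the order of keys in the file can differ from the order in the original object. The verbose = FALSE argument is assumed to silence the console messages.

```r
library(cffr)

# A small object to write; version is added so there is more to compare
original <- cff_modify(cff(), version = "0.0.1")

# Write to a temporary file and read it back
path <- tempfile(fileext = ".cff")
cff_write(original, outfile = path, verbose = FALSE)
restored <- cff_read(path)

# Compare selected fields instead of the whole object
identical(restored$title, original$title)
identical(restored$version, original$version)
```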

Note that cff_write() also has a keys parameter, so the workflow can be simplified to:

allkeys <- list(
  "url" = "https://ropensci.org/",
  "version" = "0.0.1",
  "repository" = "https://github.com/ropensci/cffr",
  # If the field is already present, it is overwritten
  title = "Modifying a 'cff' object",
  authors = newauthors,
  references = refs_cff
)

tmp2 <- tempfile(fileext = ".cff")

res <- cff_write(cff(), outfile = tmp2, keys = allkeys)
#> ✔ 'C:\Users\diego\AppData\Local\Temp\RtmpYzIFLf\file710052c53a0.cff' generated
#> ══ Validating cff ══════════════════════════════════════════════════════════════
#> ✔ Congratulations! 'C:\Users\diego\AppData\Local\Temp\RtmpYzIFLf\file710052c53a0.cff' is valid

res
#> cff-version: 1.2.0
#> message: If you use this software, please cite it using these metadata.
#> title: Modifying a 'cff' object
#> version: 0.0.1
#> authors:
#> - family-names: Doe
#>   given-names: John
#> - family-names: Sánchez Fernández
#>   given-names: Gregorio
#>   email: fake@email2.com
#>   alias: Chiquito de la Calzada
#>   city: Malaga
#>   country: ES
#>   orcid: https://orcid.org/0000-0000-0000-0001
#> repository: https://github.com/ropensci/cffr
#> url: https://ropensci.org/
#> references:
#> - type: manual
#>   title: 'R: A Language and Environment for Statistical Computing'
#>   authors:
#>   - name: R Core Team
#>   institution:
#>     name: R Foundation for Statistical Computing
#>     address: Vienna, Austria
#>   year: '2024'
#>   url: https://www.R-project.org/
#> - type: book
#>   title: This is a book
#>   authors:
#>   - family-names: Lee
#>     given-names: Lisa
#>   year: '1980'
#>   publisher:
#>     name: McGraw Hill
#>   volume: '2'

References

Druskat, Stephan. 2021. “Making Software Citation Easi(er) - The Citation File Format and Its Integrations.” https://doi.org/10.5281/zenodo.5529914.
Druskat, Stephan, Jurriaan H. Spaaks, Neil Chue Hong, Robert Haines, James Baker, Spencer Bliven, Egon Willighagen, David Pérez-Suárez, and Alexander Konovalov. 2021. “Citation File Format.” https://doi.org/10.5281/zenodo.5171937.