Test data (SDTM) for the pharmaverse family of packages
This package provides a one-stop shop for SDTM test data in the pharmaverse family of packages. It includes datasets that are therapeutic area (TA)-agnostic (`DM`, `VS`, `EG`, etc.) as well as TA-specific ones (`RS`, `TR`, `OE`, etc.).
The package is available from CRAN and can be installed by running `install.packages("pharmaversesdtm")`. To install the latest development version of the package directly from GitHub, use the following code:

    if (!requireNamespace("remotes", quietly = TRUE)) {
      install.packages("remotes")
    }
    remotes::install_github("pharmaverse/pharmaversesdtm", ref = "devel")

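Once the package is installed, one way to see which datasets it ships is base R's `data()` function. The following is a minimal sketch; the variable names `available` and `dataset_names` are illustrative only.

```r
# Sketch: list the datasets shipped with an installed copy of {pharmaversesdtm}.
dataset_names <- character(0)

if (requireNamespace("pharmaversesdtm", quietly = TRUE)) {
  # data(package = ...) returns an object whose `results` matrix has one row
  # per dataset, with the dataset name in the "Item" column.
  available <- utils::data(package = "pharmaversesdtm")
  dataset_names <- available$results[, "Item"]
  print(head(dataset_names))
} else {
  message("{pharmaversesdtm} is not installed.")
}
```

The `requireNamespace()` guard keeps the snippet from failing on a machine where the package has not been installed yet.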
Some of the test datasets have been sourced from the CDISC pilot project, while others have been constructed ad hoc by the admiral team. Please check the Reference page for detailed information regarding the source of specific datasets.
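- Datasets without a TA-specific suffix are named after the SDTM domain (e.g., `dm`, `rs`).
- TA-specific datasets are named after the SDTM domain followed by a TA-specific suffix (e.g., `oe_ophtha`, `rs_onco`, `rs_onco_irecist`).

Note: If an SDTM domain is used by multiple TAs, `{pharmaversesdtm}` may provide multiple versions of the corresponding test dataset. For instance, the package contains `ex` and `ex_ophtha`, as the latter contains ophthalmology-specific variables such as `EXLAT` and `EXLOC`, and `EXROUTE` is exchanged for a plausible ophthalmology value.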
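To illustrate the note on multiple versions of a domain, the sketch below loads `ex` and `ex_ophtha` and lists the variables that appear only in the ophthalmology version. It assumes the package is installed; `extra_vars` is an illustrative name, and no particular output is claimed here.

```r
# Sketch: compare the generic and ophthalmology-specific versions of EX.
extra_vars <- character(0)

if (requireNamespace("pharmaversesdtm", quietly = TRUE)) {
  # Load both datasets by name without attaching the package.
  utils::data("ex", package = "pharmaversesdtm")
  utils::data("ex_ophtha", package = "pharmaversesdtm")

  # Variables present in ex_ophtha but absent from ex.
  extra_vars <- setdiff(names(ex_ophtha), names(ex))
  print(extra_vars)
}
```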
First, create a GitHub issue in `{pharmaversesdtm}` describing the planned updates and tag `@pharmaverse/admiral` so that a member of the core development team can sanity-check the request. There are then two main ways to extend the test data: adding new datasets, or extending existing datasets with new records/variables. Whichever method you choose, note the following:
- Programs that generate test data are stored in the `data-raw/` folder.
- Each program is a standalone R script: load any packages it needs by calling `library()` at the start of the program (but please do not call `library(pharmaversesdtm)`).
- Most packages you are likely to need are already specified in the `renv.lock` file, so they will already be installed if you have been keeping in sync; you can check this by entering `renv::status()` in the Console. However, you may also wish to install `{metatools}`, which is currently not specified in the `renv.lock` file. If you feel that you need to install any packages beyond those just mentioned, please tag `@pharmaverse/admiral` to discuss with the core development team.
- After creating a program in the `data-raw/` folder, run it as a standalone R script to generate a test dataset that will become part of the `{pharmaversesdtm}` package; you do not need to build the package.
- Each dataset is stored as an `.rda` file whose name is consistent with the name of the dataset, e.g., dataset `xx` is stored as `xx.rda`. The easiest way to achieve this is to use `usethis::use_data(xx)`.
- The programs in `data-raw/` are stored within the `{pharmaversesdtm}` GitHub repository, but they are not part of the `{pharmaversesdtm}` package, because the `data-raw/` folder is specified in `.Rbuildignore`.
- When you run a program in the `data-raw/` folder, you generate a dataset that is written to the `data/` folder, which will become part of the `{pharmaversesdtm}` package.
- Datasets are documented in `R/data.R`, for the purpose of generating documentation in the `man/` folder.

Adding new datasets:

- Create a program in the `data-raw/` folder, named `<name>.R`, where `<name>` should follow the naming convention, to generate the test data and output `<name>.rda` to the `data/` folder.
  - Optionally, use existing data such as `dm` as input in this program in order to create realistic synthetic data that remains consistent with other domains.
- Run the program.
- Add `<name>` to `R/data.R`.
- Run `devtools::document()` in order to update `NAMESPACE` and the `.Rd` files in `man/`.
- Update `.github/CODEOWNERS`.
- Update `NEWS.md`.

Extending existing datasets:

- Locate the existing program `<name>.R` in the `data-raw/` folder and update it accordingly.
- Run the program to output the updated `<name>.rda` to the `data/` folder.
- Run `devtools::document()` in order to update `NAMESPACE` and the `.Rd` files in `man/`.
- Update `.github/CODEOWNERS`.
- Update `NEWS.md`.
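The workflow above can be sketched as a standalone script. This is a minimal illustration under stated assumptions, not code from the repository: the dataset name `xx`, its variables, and its values are hypothetical, and the script writes to a temporary folder with base R's `save()` so that it can be run anywhere. Inside the package repository, `usethis::use_data(xx)` would write `data/xx.rda` instead.

```r
# data-raw/xx.R (hypothetical): standalone script that builds a toy dataset `xx`.
# Load any packages needed here with library(), but not library(pharmaversesdtm).

# Build a small synthetic dataset; all values are made up for illustration.
xx <- data.frame(
  STUDYID = "STUDY01",
  DOMAIN = "XX",
  USUBJID = sprintf("STUDY01-%03d", 1:3),
  XXSEQ = 1L,
  XXTESTCD = "TEST1",
  XXORRES = c("10", "12", "9"),
  stringsAsFactors = FALSE
)

# The file name matches the dataset name: dataset `xx` is stored as `xx.rda`.
# A temporary folder stands in for the package's data/ folder in this sketch.
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "xx.rda")
save(xx, file = out_file, compress = "bzip2")

# Check that the stored object round-trips under the same name.
check_env <- new.env()
loaded <- load(out_file, envir = check_env)
stopifnot(identical(loaded, "xx"), identical(check_env$xx, xx))
```

Keeping the object name, the script name, and the `.rda` file name identical makes it straightforward to trace a dataset in `data/` back to the program in `data-raw/` that produced it.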