EDCimport

Package-License Lifecycle: stable CRAN status Last Commit minimal R version

EDCimport is a package designed to easily import data from EDC software TrialMaster and Macro.

Installation

# Install last version available on CRAN (once published)
install.packages("EDCimport")

# Install development version on Github
devtools::install_github("DanChaltiel/EDCimport")

You will also need 7-zip installed, and preferably added to the PATH.

Windows-only

This package was developed to work on Windows and is unlikely to work on any other OS. Feel free to submit a PR if you manage to get it to work on another OS.

TrialMaster

Load the data

First, you need to request an export of type SAS Xport, with the checkbox “Include Codelists” ticked. This export should generate a .zip archive.

Then, simply use read_trialmaster() with the archive password (if any) to retrieve the data from the archive:

library(EDCimport)
tm = read_trialmaster("path/to/my/archive.zip", pw="foobar")

The resulting object tm is a list containing all the datasets, plus the date of extraction (datetime_extraction) and a dataset summary (.lookup).

You can now use load_list() to import the list in the global environment and use your tables:

load_list(tm) #this also removes `tm` to save memory
mean(dataset1$column5)

There are other options available, e.g. colnames cleaning & table splitting), see ?read_trialmaster for more details.

Utils

EDCimport include a set of useful tools that help with using the imported database.

Search the whole database

.lookup is a dataframe containing for each dataset all its column names and labels.

Its main use is to work with find_keyword(). For instance, say you do not remember in which dataset and column is located the “date of ECG”. find_keyword() will search every column name and label and will give you the answer:

find_keyword("date")
#> # A tibble: 10 x 3
#>    dataset names   labels                      
#>    <chr>   <chr>   <chr>                       
#>  1 pat     PTRNDT  Randomization Date          
#>  2 pat     RGSTDT  Registration Date           
#>  3 site    INVDAT  Deactivation date           
#>  4 site    TRGTDT  Target Enroll Date          
#>  5 trial   TRSPDT  End Date                    
#>  6 trial   TRSTDT  Start Date                  
#>  7 visit   VISIT2  Visit Date                  
#>  8 visit   EEXPVDT Earliest Expected Visit Date
#>  9 vs      ECGDAT  Date of ECG                 
#> 10 vs      VISITDT Visit Date

Note that find_keyword() uses the edc_lookup option as its second argument, automatically set by read_trialmaster().

Swimmer Plot

The edc_swimmerplot() function will create a swimmer plot of all date variables in the whole database.

There are 2 arguments of interest:

edc_swimmerplot()
edc_swimmerplot(group="enrolres$arm")
edc_swimmerplot(origin="enrolres$enroldt")

This outputs a plotly interactive graph where you can select the dates of interest and zoom in with your mouse.

Note that any modification made after running read_trialmaster() is taken into account. For instance, mutating a column with as.Date() in one of the tables will add a new group in the plot.