Visit and Period Variables

Introduction

The derivation of visit variables like AVISIT, AVISITN, AWLO, AWHI, … or period, subperiod, or phase variables like APERIOD, TRT01A, TRT02A, ASPER, PHSDTM, PHEDTM, … is highly study-specific. Therefore, admiral cannot provide functions that derive these variables directly. However, it does provide functions that support common scenarios, such as assigning visits based on time windows or deriving BDS period variables from ADSL period variables.

Required Packages

The examples of this vignette require the following packages.

library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)

Visit variables (AVISIT, AVISITN, AWLO, AWHI, …)

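The most common ways of deriving AVISIT and AVISITN are to set them to the collected visits (VISIT and VISITNUM) or to assign them based on time windows.

The former can be achieved simply by calling mutate(), as in the other vignettes and the template scripts.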
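
As a minimal sketch of the mutate() approach, assuming the collected visits are available in VISIT: the input data and the rules used here (screening and unscheduled visits stay unassigned, baseline is numbered 0, "WEEK n" is numbered n) are hypothetical choices for illustration, not something prescribed by admiral.

```r
library(tibble)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical input data with collected visits
advs <- tribble(
  ~USUBJID, ~VISIT,
  "1",      "SCREENING",
  "1",      "BASELINE",
  "1",      "WEEK 2",
  "1",      "UNSCHEDULED 1"
)

advs_visit <- advs %>%
  mutate(
    # Assumption: screening and unscheduled visits get no analysis visit
    AVISIT = case_when(
      grepl("SCREEN|UNSCHED", VISIT) ~ NA_character_,
      TRUE ~ VISIT
    ),
    # Assumption: baseline is numbered 0, "WEEK n" visits are numbered n
    AVISITN = as.numeric(case_when(
      AVISIT == "BASELINE" ~ "0",
      grepl("^WEEK ", AVISIT) ~ sub("^WEEK ", "", AVISIT),
      TRUE ~ NA_character_
    ))
  )

advs_visit
```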

For the latter, a (study-specific) reference dataset needs to be created that provides, for each visit, the start and end day (AWLO and AWHI) and the values of other visit-related variables (AVISIT, AVISITN, AWTARGET, …).

windows <- tribble(
  ~AVISIT,    ~AWLO, ~AWHI, ~AVISITN, ~AWTARGET,
  "BASELINE",   -30,     1,        0,         1,
  "WEEK 1",       2,     7,        1,         5,
  "WEEK 2",       8,    15,        2,        11,
  "WEEK 3",      16,    22,        3,        19,
  "WEEK 4",      23,    30,        4,        26
)

Then the visits can be assigned based on the analysis day (ADY) by calling derive_vars_joined():

adbds <- tribble(
  ~USUBJID, ~ADY,
  "1",       -33,
  "1",        -2,
  "1",         3,
  "1",        24,
  "2",        NA,
)

derive_vars_joined(
  adbds,
  dataset_add = windows,
  filter_join = AWLO <= ADY & ADY <= AWHI,
  join_type = "all"
)
#> # A tibble: 5 × 7
#>   USUBJID   ADY AVISIT    AWLO  AWHI AVISITN AWTARGET
#>   <chr>   <dbl> <chr>    <dbl> <dbl>   <dbl>    <dbl>
#> 1 1         -33 <NA>        NA    NA      NA       NA
#> 2 1          -2 BASELINE   -30     1       0        1
#> 3 1           3 WEEK 1       2     7       1        5
#> 4 1          24 WEEK 4      23    30       4       26
#> 5 2          NA <NA>        NA    NA      NA       NA
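
Once a window is assigned, further window-related values can be computed from the joined variables. The following is a sketch only, assuming the distance to the target day should be stored in a variable named AWTDIFF. It repeats the example data from above so that it runs on its own; adbds_win is a name chosen for illustration.

```r
library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)

windows <- tribble(
  ~AVISIT,    ~AWLO, ~AWHI, ~AVISITN, ~AWTARGET,
  "BASELINE",   -30,     1,        0,         1,
  "WEEK 1",       2,     7,        1,         5,
  "WEEK 2",       8,    15,        2,        11,
  "WEEK 3",      16,    22,        3,        19,
  "WEEK 4",      23,    30,        4,        26
)

adbds <- tribble(
  ~USUBJID, ~ADY,
  "1",       -33,
  "1",        -2,
  "1",         3,
  "1",        24,
  "2",        NA
)

adbds_win <- derive_vars_joined(
  adbds,
  dataset_add = windows,
  filter_join = AWLO <= ADY & ADY <= AWHI,
  join_type = "all"
) %>%
  # Absolute distance between the analysis day and the window's target day
  mutate(AWTDIFF = abs(ADY - AWTARGET))

adbds_win
```

Observations that fall outside all windows (such as ADY = -33) or have a missing ADY keep missing window variables, so AWTDIFF is missing for them as well.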

Period, Subperiod, and Phase Variables

If periods, subperiods, or phases are used, the corresponding variables have to be consistent across all datasets. This can be achieved by defining the periods, subperiods, or phases once and then using this definition for all datasets. The definition can be stored in ADSL or in a separate dataset. In the following examples, this separate dataset is called the period reference dataset.

Period Reference Dataset

The period reference dataset contains one observation per subject and period, subperiod, or phase. For example:

#> # A tibble: 3 × 6
#>   STUDYID USUBJID APHASEN PHSDT      PHEDT      APHASE   
#>   <chr>   <chr>     <int> <date>     <date>     <chr>    
#> 1 xyz     1             1 2021-01-04 2021-02-06 TREATMENT
#> 2 xyz     1             2 2021-02-07 2021-03-07 FUP      
#> 3 xyz     2             1 2021-02-02 2021-03-02 TREATMENT
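
The examples in the following sections refer to this dataset as phase_ref. A sketch of how it could be constructed, with all values taken from the output above:

```r
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)

phase_ref <- tribble(
  ~USUBJID, ~APHASEN, ~PHSDT,       ~PHEDT,       ~APHASE,
  "1",             1, "2021-01-04", "2021-02-06", "TREATMENT",
  "1",             2, "2021-02-07", "2021-03-07", "FUP",
  "2",             1, "2021-02-02", "2021-03-02", "TREATMENT"
) %>%
  mutate(
    STUDYID = "xyz",
    # The phase number is stored as an integer, the dates as Date objects
    APHASEN = as.integer(APHASEN),
    across(matches("PH[ES]DT"), ymd)
  ) %>%
  select(STUDYID, everything())

phase_ref
```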

The admiral functions expect separate datasets for periods, subperiods, and phases. For periods the numeric variable APERIOD is expected, for subperiods the numeric variables APERIOD and ASPER, and for phases the numeric variable APHASEN.

Creating ADSL Period, Subperiod, or Phase Variables

If a period reference dataset is available, the ADSL variables for periods, subperiods, or phases can be created from this dataset by calling derive_vars_period().

For example, the period reference dataset from the previous section (phase_ref) can be used to add the phase variables (PHwSDT, PHwEDT, and APHASEw) to ADSL:

adsl <- tibble(STUDYID = "xyz", USUBJID = c("1", "2"))

adsl <- derive_vars_period(
  adsl,
  dataset_ref = phase_ref,
  new_vars = exprs(PHwSDT = PHSDT, PHwEDT = PHEDT, APHASEw = APHASE)
) %>%
  select(STUDYID, USUBJID, PH1SDT, PH1EDT, PH2SDT, PH2EDT, APHASE1, APHASE2)

adsl
#> # A tibble: 2 × 8
#>   STUDYID USUBJID PH1SDT     PH1EDT     PH2SDT     PH2EDT     APHASE1   APHASE2
#>   <chr>   <chr>   <date>     <date>     <date>     <date>     <chr>     <chr>  
#> 1 xyz     1       2021-01-04 2021-02-06 2021-02-07 2021-03-07 TREATMENT FUP    
#> 2 xyz     2       2021-02-02 2021-03-02 NA         NA         TREATMENT <NA>
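
Subperiods work analogously, with the reference dataset keyed by both APERIOD and ASPER. The following sketch uses hypothetical data and assumes that the subperiod dates are stored as ASPRSDT and ASPREDT in the reference dataset and mapped to the ADSL variables PxxSwSDT and PxxSwEDT. The names subperiod_ref and adsl_sub are chosen for illustration.

```r
library(admiral)
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)

# Hypothetical subperiod reference dataset: one row per subject, period, and subperiod
subperiod_ref <- tribble(
  ~USUBJID, ~APERIOD, ~ASPER, ~ASPRSDT,     ~ASPREDT,
  "1",             1,      1, "2021-01-04", "2021-01-19",
  "1",             1,      2, "2021-01-20", "2021-02-06",
  "1",             2,      1, "2021-02-07", "2021-03-07",
  "2",             1,      1, "2021-02-02", "2021-03-02",
  "2",             2,      1, "2021-03-03", "2021-04-01"
) %>%
  mutate(
    STUDYID = "xyz",
    APERIOD = as.integer(APERIOD),
    ASPER = as.integer(ASPER),
    across(matches("ASPR[ES]DT"), ymd)
  )

adsl_sub <- derive_vars_period(
  tibble(STUDYID = "xyz", USUBJID = c("1", "2")),
  dataset_ref = subperiod_ref,
  # "xx" is replaced by the period number, "w" by the subperiod number
  new_vars = exprs(PxxSwSDT = ASPRSDT, PxxSwEDT = ASPREDT)
) %>%
  select(
    STUDYID, USUBJID,
    P01S1SDT, P01S1EDT, P01S2SDT, P01S2EDT, P02S1SDT, P02S1EDT
  )

adsl_sub
```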

Creating BDS and OCCDS Period, Subperiod, or Phase Variables

If a period reference dataset is available, BDS and OCCDS variables for periods, subperiods, or phases can be created by calling derive_vars_joined().

For example, the variables APHASEN, PHSDT, PHEDT, and APHASE can be derived from the period reference dataset defined above.

adae <- tribble(
  ~USUBJID, ~ASTDT,
  "1",      "2021-01-01",
  "1",      "2021-01-05",
  "1",      "2021-02-05",
  "1",      "2021-03-05",
  "1",      "2021-04-05",
  "2",      "2021-02-15",
) %>%
  mutate(
    STUDYID = "xyz",
    .before = USUBJID
  ) %>%
  mutate(ASTDT = ymd(ASTDT))

derive_vars_joined(
  adae,
  dataset_add = phase_ref,
  by_vars = exprs(STUDYID, USUBJID),
  filter_join = PHSDT <= ASTDT & ASTDT <= PHEDT,
  join_type = "all"
)
#> # A tibble: 6 × 7
#>   STUDYID USUBJID ASTDT      APHASEN PHSDT      PHEDT      APHASE   
#>   <chr>   <chr>   <date>       <int> <date>     <date>     <chr>    
#> 1 xyz     1       2021-01-01      NA NA         NA         <NA>     
#> 2 xyz     1       2021-01-05       1 2021-01-04 2021-02-06 TREATMENT
#> 3 xyz     1       2021-02-05       1 2021-01-04 2021-02-06 TREATMENT
#> 4 xyz     1       2021-03-05       2 2021-02-07 2021-03-07 FUP      
#> 5 xyz     1       2021-04-05      NA NA         NA         <NA>     
#> 6 xyz     2       2021-02-15       1 2021-02-02 2021-03-02 TREATMENT

If no period reference dataset is available but period variables are in ADSL, the period reference dataset can be created from ADSL by calling create_period_dataset().

For example, a period reference dataset for phases can be created from the ADSL dataset created above:

create_period_dataset(
  adsl,
  new_vars = exprs(PHSDT = PHwSDT, PHEDT = PHwEDT, APHASE = APHASEw)
)
#> # A tibble: 3 × 6
#>   STUDYID USUBJID APHASEN PHSDT      PHEDT      APHASE   
#>   <chr>   <chr>     <int> <date>     <date>     <chr>    
#> 1 xyz     1             1 2021-01-04 2021-02-06 TREATMENT
#> 2 xyz     1             2 2021-02-07 2021-03-07 FUP      
#> 3 xyz     2             1 2021-02-02 2021-03-02 TREATMENT

Treatment Variables (TRTxxP, TRTxxA, TRTP, TRTA, …)

In studies with multiple periods, the treatment can differ by period, e.g., in a crossover trial. CDISC defines variables for planned and actual treatments in ADSL (TRTxxP, TRTxxA, TRxxPGy, TRxxAGy, …) and corresponding variables in BDS and OCCDS datasets (TRTP, TRTA, TRTPGy, TRTAGy, …). They can be derived in the same way (and in the same step) as the period, subperiod, and phase variables.

Creating ADSL Treatment Variables

If the treatment information is included in the period reference dataset, the treatment ADSL variables can be created by calling derive_vars_period():

# Define the period reference dataset, then add period and treatment variables to ADSL
period_ref <- tribble(
  ~USUBJID, ~APERIOD, ~APERSDT,     ~APEREDT,     ~TRTA,
  "1",             1, "2021-01-04", "2021-02-06", "DRUG A",
  "1",             2, "2021-02-07", "2021-03-07", "DRUG B",
  "2",             1, "2021-02-02", "2021-03-02", "DRUG B",
  "2",             2, "2021-03-03", "2021-04-01", "DRUG B"
) %>%
  mutate(
    STUDYID = "xyz",
    APERIOD = as.integer(APERIOD),
    across(ends_with("DT"), ymd)
  )

adsl <- derive_vars_period(
  adsl,
  dataset_ref = period_ref,
  new_vars = exprs(
    APxxSDT = APERSDT,
    APxxEDT = APEREDT,
    TRTxxA = TRTA
  )
) %>%
  select(
    STUDYID, USUBJID,
    TRT01A, TRT02A,
    AP01SDT, AP01EDT, AP02SDT, AP02EDT
  )

adsl
#> # A tibble: 2 × 8
#>   STUDYID USUBJID TRT01A TRT02A AP01SDT    AP01EDT    AP02SDT    AP02EDT   
#>   <chr>   <chr>   <chr>  <chr>  <date>     <date>     <date>     <date>    
#> 1 xyz     1       DRUG A DRUG B 2021-01-04 2021-02-06 2021-02-07 2021-03-07
#> 2 xyz     2       DRUG B DRUG B 2021-02-02 2021-03-02 2021-03-03 2021-04-01

Creating BDS and OCCDS Treatment Variables

If a period reference dataset is available, BDS and OCCDS variables for treatment can be created by calling derive_vars_joined().

For example, the variables APERIOD and TRTA can be derived from the period reference dataset defined above.

adae <- tribble(
  ~USUBJID, ~ASTDT,
  "1",      "2021-01-05",
  "1",      "2021-02-05",
  "1",      "2021-03-05",
  "1",      "2021-04-05",
  "2",      "2021-02-15",
  "2",      "2021-03-10",
) %>%
  mutate(
    ASTDT = ymd(ASTDT),
    STUDYID = "xyz"
  )

derive_vars_joined(
  adae,
  dataset_add = period_ref,
  by_vars = exprs(STUDYID, USUBJID),
  new_vars = exprs(APERIOD, TRTA),
  join_vars = exprs(APERSDT, APEREDT),
  join_type = "all",
  filter_join = APERSDT <= ASTDT & ASTDT <= APEREDT
)
#> # A tibble: 6 × 5
#>   USUBJID ASTDT      STUDYID APERIOD TRTA  
#>   <chr>   <date>     <chr>     <int> <chr> 
#> 1 1       2021-01-05 xyz           1 DRUG A
#> 2 1       2021-02-05 xyz           1 DRUG A
#> 3 1       2021-03-05 xyz           2 DRUG B
#> 4 1       2021-04-05 xyz          NA <NA>  
#> 5 2       2021-02-15 xyz           1 DRUG B
#> 6 2       2021-03-10 xyz           2 DRUG B

If no period reference dataset is available but period variables are in ADSL, the period reference dataset can be created from ADSL by calling create_period_dataset().

For example, a period reference dataset for periods and treatments can be created from the ADSL dataset created above:

create_period_dataset(
  adsl,
  new_vars = exprs(APERSDT = APxxSDT, APEREDT = APxxEDT, TRTA = TRTxxA)
)
#> # A tibble: 4 × 6
#>   STUDYID USUBJID APERIOD APERSDT    APEREDT    TRTA  
#>   <chr>   <chr>     <int> <date>     <date>     <chr> 
#> 1 xyz     1             1 2021-01-04 2021-02-06 DRUG A
#> 2 xyz     1             2 2021-02-07 2021-03-07 DRUG B
#> 3 xyz     2             1 2021-02-02 2021-03-02 DRUG B
#> 4 xyz     2             2 2021-03-03 2021-04-01 DRUG B

Study Specific Code

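At some point, study-specific code is required to derive period/subperiod variables. There are two options: (1) derive the period, subperiod, or phase variables in ADSL and use create_period_dataset() to create the period reference dataset, or (2) derive the period reference dataset and use derive_vars_period() to create the ADSL variables.

Which option works best depends on how the periods/subperiods are defined. If the definition is based on other ADSL variables, the first option would work best. If the definition is based on vertically structured data like exposure data (EX dataset), the second option should be used.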
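
For the case of vertically structured exposure data, the study-specific step could look like the following sketch. It rests on strong assumptions that will not hold for every study: the EX data is hypothetical, each distinct treatment of a subject is taken to define exactly one period, and the period runs from the first start date to the last end date of that treatment. The name period_ref_ex is chosen for illustration.

```r
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)

# Hypothetical exposure data: one row per subject and dosing interval
ex <- tribble(
  ~USUBJID, ~EXTRT,   ~EXSTDTC,     ~EXENDTC,
  "1",      "DRUG A", "2021-01-04", "2021-01-20",
  "1",      "DRUG A", "2021-01-21", "2021-02-06",
  "1",      "DRUG B", "2021-02-07", "2021-03-07",
  "2",      "DRUG B", "2021-02-02", "2021-03-02"
)

period_ref_ex <- ex %>%
  mutate(
    STUDYID = "xyz",
    EXSTDT = ymd(EXSTDTC),
    EXENDT = ymd(EXENDTC)
  ) %>%
  # Assumption: one period per subject and treatment
  group_by(STUDYID, USUBJID, EXTRT) %>%
  summarise(
    APERSDT = min(EXSTDT),
    APEREDT = max(EXENDT),
    .groups = "drop"
  ) %>%
  # Number the periods chronologically within each subject
  arrange(STUDYID, USUBJID, APERSDT) %>%
  group_by(STUDYID, USUBJID) %>%
  mutate(APERIOD = row_number()) %>%
  ungroup() %>%
  rename(TRTA = EXTRT)

period_ref_ex
```

A dataset of this shape can then be passed as dataset_ref to derive_vars_period() to create the ADSL variables, as in the treatment example earlier in this vignette.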