Creating ADCE

Introduction

This article describes how to create an ADCE ADaM dataset for the analysis of vaccine reactogenicity data collected in the SDTM CE domain. The example presented here was tested using the CE SDTM domain and the ADSL ADaM dataset. Other domains can be used if needed (e.g. temperature data collected in VS).

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Programming Flow

Read in Data

Assumption: The CE domain has already been merged with the SUPPCE dataset. If this is not yet the case, join SUPPCE onto the parent CE domain using metatools::combine_supp(CE, SUPPCE).

library(admiraldev)
library(admiral)
library(dplyr)
library(lubridate)
library(admiralvaccine)
library(pharmaversesdtm)

data("ce_vaccine")
data("admiralvaccine_adsl")

adsl <- admiralvaccine_adsl
ce <- ce_vaccine

ce <- convert_blanks_to_na(ce)
adsl <- convert_blanks_to_na(adsl)

USUBJID   TRTSDT      TRTEDT      TRT01A     AP01SDT     AP01EDT     AP02SDT     AP02EDT
ABC-1001  2021-11-03  2021-12-30  VACCINE A  2021-11-03  2021-12-29  2021-12-30  2022-04-27
ABC-1002  2021-10-07  2021-12-16  VACCINE A  2021-10-07  2021-12-15  2021-12-16  2022-06-14

Pre-processing of Input Dataset

This step involves company-specific pre-processing of the input dataset required for further analysis. In this example, we keep only the reactogenicity records (CECAT == "REACTOGENICITY").

adce <- ce %>%
  filter(CECAT == "REACTOGENICITY")

Create Reference Dataset for Periods

Create a period reference dataset for joining period information onto the CE records. The period datetime variables (those starting with AP and ending in DTM) must be removed from ADSL first, as they otherwise cause duplicate issues.

adsl2 <- adsl %>%
  select(-c(starts_with("AP") & ends_with("DTM")))

adperiods <- create_period_dataset(
  adsl2,
  new_vars = exprs(APERSDT = APxxSDT, APEREDT = APxxEDT)
)

USUBJID   APERIOD  APERSDT     APEREDT
ABC-1001  1        2021-11-03  2021-12-29
ABC-1001  2        2021-12-30  2022-04-27
ABC-1002  1        2021-10-07  2021-12-15
ABC-1002  2        2021-12-16  2022-06-14
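
To make the reshaping concrete, the following base R sketch builds the same long-format period table by hand from the APxxSDT and APxxEDT columns shown earlier. It only illustrates what the period reference dataset contains, not how create_period_dataset() is implemented. The object names adsl_demo and per_period are assumptions introduced for this example.

```r
# Toy ADSL with the period start/end dates displayed earlier in this article.
adsl_demo <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1002"),
  AP01SDT = as.Date(c("2021-11-03", "2021-10-07")),
  AP01EDT = as.Date(c("2021-12-29", "2021-12-15")),
  AP02SDT = as.Date(c("2021-12-30", "2021-12-16")),
  AP02EDT = as.Date(c("2022-04-27", "2022-06-14")),
  stringsAsFactors = FALSE
)

# One data frame per period number: the "xx" in APxxSDT/APxxEDT becomes APERIOD.
per_period <- lapply(1:2, function(p) {
  xx <- sprintf("%02d", p)
  data.frame(
    USUBJID = adsl_demo$USUBJID,
    APERIOD = p,
    APERSDT = adsl_demo[[paste0("AP", xx, "SDT")]],
    APEREDT = adsl_demo[[paste0("AP", xx, "EDT")]],
    stringsAsFactors = FALSE
  )
})

# Stack the periods and sort by subject and period.
adperiods_demo <- do.call(rbind, per_period)
adperiods_demo <- adperiods_demo[order(adperiods_demo$USUBJID, adperiods_demo$APERIOD), ]
rownames(adperiods_demo) <- NULL
print(adperiods_demo)
```

Each subject contributes one row per period, which is the shape needed for the date-range join performed later.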

Derivation of Analysis Dates

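At this step, it may be useful to join ADSL to your CE domain. Only the ADSL variables used for derivations are selected here; the remaining relevant ADSL variables are added later. The analysis start and end dates (ASTDT, AENDT) are then derived from CESTDTC and CEENDTC without imputation, followed by the relative days (ASTDY, AENDY) with respect to TRTSDT.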

adsl_vars <- exprs(TRTSDT, TRTEDT)

adce <- adce %>%
  # join ADSL to CE
  derive_vars_merged(
    dataset_add = adsl,
    new_vars = adsl_vars,
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_dt(
    dtc = CESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "n"
  ) %>%
  derive_vars_dt(
    dtc = CEENDTC,
    new_vars_prefix = "AEN",
    highest_imputation = "n"
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  )

USUBJID   TRTSDT      CESTDTC     CEENDTC     ASTDT       AENDT       ASTDY  AENDY
ABC-1001  2021-11-03  NA          NA          NA          NA          NA     NA
ABC-1001  2021-11-03  2021-11-04  2021-11-07  2021-11-04  2021-11-07  2      5
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2021-11-04  2021-11-04  2      2
ABC-1001  2021-11-03  2021-11-03  2021-11-09  2021-11-03  2021-11-09  1      7
ABC-1001  2021-11-03  NA          NA          NA          NA          NA     NA
ABC-1001  2021-11-03  2021-11-03  2021-11-04  2021-11-03  2021-11-04  1      2
ABC-1001  2021-11-03  NA          NA          NA          NA          NA     NA
ABC-1001  2021-11-03  NA          NA          NA          NA          NA     NA
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2021-11-04  2021-11-04  2      2
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2021-11-04  2021-11-04  2      2
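
The relative-day values in the output above are consistent with the usual convention in which the reference date itself is day 1 and there is no day 0. The base R sketch below reproduces that arithmetic for a few of the displayed rows. The helper name study_day is hypothetical and is not an admiral function.

```r
# Hypothetical helper illustrating the no-day-0 convention;
# it is not part of admiral.
study_day <- function(date, reference_date) {
  offset <- as.integer(date - reference_date)
  # On or after the reference date: add 1 so the reference date is day 1.
  # Before the reference date: keep the negative offset (day -1, -2, ...).
  ifelse(offset >= 0L, offset + 1L, offset)
}

trtsdt <- as.Date("2021-11-03")
astdt <- as.Date(c("2021-11-04", "2021-11-03", NA))
aendt <- as.Date(c("2021-11-07", "2021-11-09", NA))

astdy <- study_day(astdt, trtsdt)
aendy <- study_day(aendt, trtsdt)
print(data.frame(ASTDT = astdt, AENDT = aendt, ASTDY = astdy, AENDY = aendy))
```

Records with a missing date keep a missing relative day, as in the rows above where CESTDTC and CEENDTC are NA.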

Join with the Periods Reference Dataset and Derive Relative Day in Period

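Each CE record is joined to the period whose start and end dates contain its analysis start date (ASTDT). The relative day within the period (APERSTDY) is then derived, together with AREL, the analysis version of CEREL.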

adce <-
  derive_vars_joined(
    adce,
    dataset_add = adperiods,
    by_vars = exprs(STUDYID, USUBJID),
    filter_join = ASTDT >= APERSDT & ASTDT <= APEREDT,
    join_type = "all"
  ) %>%
  mutate(
    APERSTDY = as.integer(ASTDT - APERSDT) + 1,
    AREL = CEREL
  )

USUBJID   TRTSDT      ASTDT       AENDT       ASTDY  AENDY  APERIOD  APERSDT     APERSTDY
ABC-1001  2021-11-03  NA          NA          NA     NA     NA       NA          NA
ABC-1001  2021-11-03  2021-11-04  2021-11-07  2      5      1        2021-11-03  2
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2      2      1        2021-11-03  2
ABC-1001  2021-11-03  2021-11-03  2021-11-09  1      7      1        2021-11-03  1
ABC-1001  2021-11-03  NA          NA          NA     NA     NA       NA          NA
ABC-1001  2021-11-03  2021-11-03  2021-11-04  1      2      1        2021-11-03  1
ABC-1001  2021-11-03  NA          NA          NA     NA     NA       NA          NA
ABC-1001  2021-11-03  NA          NA          NA     NA     NA       NA          NA
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2      2      1        2021-11-03  2
ABC-1001  2021-11-03  2021-11-04  2021-11-04  2      2      1        2021-11-03  2
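
The date-range join can be pictured as a plain merge followed by a filter. The base R sketch below is a simplified illustration of that logic, not the derive_vars_joined() implementation. It uses the ABC-1002 period dates and two onset dates shown in this article, plus one record with a missing onset date added for illustration. The helper column ROW and the object names are assumptions made for this example.

```r
# Toy events: two onset dates from this article plus one illustrative NA record.
events <- data.frame(
  USUBJID = c("ABC-1002", "ABC-1002", "ABC-1002"),
  ASTDT = as.Date(c("2021-10-11", "2021-12-17", NA)),
  stringsAsFactors = FALSE
)
events$ROW <- seq_len(nrow(events)) # hypothetical key to restore the input rows

periods <- data.frame(
  USUBJID = "ABC-1002",
  APERIOD = 1:2,
  APERSDT = as.Date(c("2021-10-07", "2021-12-16")),
  APEREDT = as.Date(c("2021-12-15", "2022-06-14")),
  stringsAsFactors = FALSE
)

# Pair every event with every period of the same subject ...
candidates <- merge(events, periods, by = "USUBJID")
# ... and keep only the pairs where the onset date falls inside the period.
in_period <- !is.na(candidates$ASTDT) &
  candidates$ASTDT >= candidates$APERSDT &
  candidates$ASTDT <= candidates$APEREDT
matched <- candidates[in_period, c("ROW", "APERIOD", "APERSDT")]

# Left join back so that events without a matching period are kept with NA.
result <- merge(events, matched, by = "ROW", all.x = TRUE)
result <- result[order(result$ROW), ]
result$APERSTDY <- as.integer(result$ASTDT - result$APERSDT) + 1L
print(result)
```

The record with a missing onset date matches no period, so its APERIOD and APERSTDY stay missing, in line with the NA rows in the output above.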

Creation of Analysis Version for GRADING Variable (Either TOXGR or SEV)

Depending on which variable is collected for grading in the CE domain (TOXGR or SEV), derive the associated analysis version. In the current example SEV is collected, so the code uses it. This step also derives an extreme flag: AOCC01FL flags the first occurrence of the most severe grade within a period for each subject.

adce <- adce %>%
  mutate(
    ASEV = CESEV,
    ASEVN = as.integer(factor(ASEV,
      levels = c("MILD", "MODERATE", "SEVERE", "DEATH THREATENING")
    ))
  ) %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(USUBJID, APERIOD),
      order = exprs(desc(ASEVN), ASTDY, CEDECOD),
      new_var = AOCC01FL,
      mode = "first"
    ),
    filter = !is.na(APERIOD) & !is.na(ASEV)
  )

USUBJID   TRTSDT      ASTDT       APERIOD  APERSDT     APERSTDY  CEDECOD              ASEVN  AOCC01FL  CESEQ
ABC-1001  2021-11-03  2021-11-03  1        2021-11-03  1         Swelling             2      Y         4
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Erythema             2      NA        3
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Injection site pain  2      NA        2
ABC-1001  2021-11-03  2021-11-03  1        2021-11-03  1         Fatigue              1      NA        6
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Arthralgia           1      NA        9
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Myalgia              1      NA        10
ABC-1002  2021-10-07  2021-10-11  1        2021-10-07  5         Headache             2      Y         8
ABC-1002  2021-10-07  2021-10-09  1        2021-10-07  3         Erythema             1      NA        3
ABC-1002  2021-10-07  2021-12-16  2        2021-12-16  1         Injection site pain  1      Y         13
ABC-1002  2021-10-07  2021-12-17  2        2021-12-16  2         Erythema             1      NA        14
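
The numeric grade and the flag can be traced by hand on a small example. The base R sketch below covers a single subject and period only, using three of the ABC-1001 events shown above plus one ungraded record added for illustration. It mimics the ordering used in the step above (descending grade, then onset day, then term) and is not the derive_var_extreme_flag() implementation. The object names are assumptions made for this example.

```r
sev_levels <- c("MILD", "MODERATE", "SEVERE", "DEATH THREATENING")

# One subject, one period; the last record is ungraded (illustrative).
events <- data.frame(
  CEDECOD = c("Swelling", "Erythema", "Fatigue", "Chills"),
  ASTDY = c(1L, 2L, 1L, NA),
  ASEV = c("MODERATE", "MODERATE", "MILD", NA),
  stringsAsFactors = FALSE
)

# The position of each value in the ordered level vector becomes the numeric grade.
events$ASEVN <- as.integer(factor(events$ASEV, levels = sev_levels))

# Only graded records are candidates for the flag.
eligible <- which(!is.na(events$ASEVN))
# Sort candidates by descending grade, then onset day, then term.
ord <- order(
  -events$ASEVN[eligible],
  events$ASTDY[eligible],
  events$CEDECOD[eligible]
)
# Flag the first record in that order; all other records stay missing.
events$AOCC01FL <- NA_character_
events$AOCC01FL[eligible[ord[1]]] <- "Y"
print(events)
```

Swelling and Erythema share the highest grade here, and Swelling is flagged because it starts one day earlier, which agrees with the output above.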

Creation of Analysis Sequence Number and Duration

adce <- adce %>%
  derive_var_obs_number(
    new_var = ASEQ,
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(CEDECOD, CELAT, CETPTREF, APERIOD),
    check_type = "error"
  ) %>%
  derive_vars_duration(
    new_var = ADURN,
    new_var_unit = ADURU,
    start_date = ASTDT,
    end_date = AENDT,
    in_unit = "days",
    out_unit = "days",
    add_one = TRUE,
    trunc_out = FALSE
  )

USUBJID   TRTSDT      ASTDT       APERIOD  APERSDT     APERSTDY  CEDECOD     ASEVN  AOCC01FL  CESEQ  ASEQ
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Arthralgia  1      NA        9      1
ABC-1001  2021-11-03  NA          NA       NA          NA        Arthralgia  NA     NA        20     2
ABC-1001  2021-11-03  NA          NA       NA          NA        Chills      NA     NA        1      3
ABC-1001  2021-11-03  NA          NA       NA          NA        Chills      NA     NA        12     4
ABC-1001  2021-11-03  NA          NA       NA          NA        Diarrhoea   NA     NA        5      5
ABC-1001  2021-11-03  NA          NA       NA          NA        Diarrhoea   NA     NA        16     6
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Erythema    2      NA        3      7
ABC-1001  2021-11-03  NA          NA       NA          NA        Erythema    NA     NA        14     8
ABC-1001  2021-11-03  2021-11-03  1        2021-11-03  1         Fatigue     1      NA        6      9
ABC-1001  2021-11-03  NA          NA       NA          NA        Fatigue     NA     NA        17     10
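
The duration derived in the code above is not shown in the displayed columns. With add_one = TRUE and both units set to days, it corresponds to an inclusive day count, which the following base R sketch works through for two pairs of dates taken from earlier in this article. This is a minimal illustration of the arithmetic only; the unit variable ADURU is left out because its exact label is not shown in this article.

```r
astdt <- as.Date(c("2021-11-04", "2021-11-03", NA))
aendt <- as.Date(c("2021-11-07", "2021-11-09", NA))

# Inclusive duration in days: the start day and the end day both count.
adurn <- as.integer(aendt - astdt) + 1L
print(data.frame(ASTDT = astdt, AENDT = aendt, ADURN = adurn))
```

An event starting and ending on the same day therefore has a duration of 1, and records with missing dates have a missing duration.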

Final Step: Get All the Remaining Variables from ADSL

Get the list of ADSL variables required for the trial. This list is trial-specific and needs to be adjusted when using the template.

adsl_list <- adsl %>%
  select(STUDYID, USUBJID, TRT01A, TRT01P, AGE, AGEU, SEX, RACE, COUNTRY, ETHNIC, SITEID, SUBJID)

adce <- adce %>%
  derive_vars_merged(
    dataset_add = adsl_list,
    by_vars = exprs(STUDYID, USUBJID)
  )

USUBJID   TRTSDT      ASTDT       APERIOD  APERSDT     APERSTDY  CEDECOD     ASEVN  AOCC01FL  CESEQ  ASEQ  AGE  SEX
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Arthralgia  1      NA        9      1     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Arthralgia  NA     NA        20     2     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Chills      NA     NA        1      3     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Chills      NA     NA        12     4     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Diarrhoea   NA     NA        5      5     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Diarrhoea   NA     NA        16     6     74   F
ABC-1001  2021-11-03  2021-11-04  1        2021-11-03  2         Erythema    2      NA        3      7     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Erythema    NA     NA        14     8     74   F
ABC-1001  2021-11-03  2021-11-03  1        2021-11-03  1         Fatigue     1      NA        6      9     74   F
ABC-1001  2021-11-03  NA          NA       NA          NA        Fatigue     NA     NA        17     10    74   F

Example Script

ADaM  Sample Code
ADCE  ad_adce.R