| Type: | Package |
| Title: | While-Alive Regression for Composite Endpoints with Cluster-Robust Inference |
| Version: | 0.1.0 |
| Description: | Provides estimation and inference for while-alive regression models targeting the while-alive loss rate for composite endpoints that include recurrent events and a terminal event. The implementation supports flexible time-varying covariate effects through user-selected time bases, including B-splines, natural splines, M-splines, step functions, truncated linear bases, interval-local bases, and piecewise polynomials. Inference can be performed using cluster-robust variance estimators for cluster-randomized trials, with subject-level (IID) variance as a special case. The package includes prediction and plotting utilities and K-fold cross-validation for selecting basis and tuning parameters. Methodology is based on Fang et al. (2025) <doi:10.1093/biostatistics/kxaf047>. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.1) |
| Imports: | dplyr, tidyr, tibble, ggplot2, survival, nleqslv, splines, MASS, magrittr, rlang |
| Suggests: | splines2, testthat (≥ 3.0.0), knitr, rmarkdown |
| Config/testthat/edition: | 3 |
| URL: | https://github.com/fancy575/WAreg |
| BugReports: | https://github.com/fancy575/WAreg/issues |
| RoxygenNote: | 7.3.3 |
| LazyData: | true |
| NeedsCompilation: | no |
| Packaged: | 2026-03-02 21:19:15 UTC; xf97 |
| Author: | Xi Fang [aut, cre], Hajime Uno [aut], Fan Li [aut] |
| Maintainer: | Xi Fang <x.fang@yale.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-03-06 13:50:07 UTC |
Pipe operator
Description
See magrittr::%>%.
Arguments
lhs |
A value or the left-hand side of the pipe. |
rhs |
A function call using the placeholder |
Value
The result of applying rhs to lhs.
K-fold cross-validation for WA configuration selection
Description
Runs K-fold CV over a grid of basis types, degrees, interior-knot counts,
and link functions. For each configuration, fits the model on K-1 folds and
accumulates the prediction error (PE) on the held-out fold using
WA_PE() (IPCW computed on the training subjects).
Usage
WA_cv(
formula,
data,
id,
cluster = NULL,
basis_set = c("il", "pl", "bz"),
degree_vec = 1:2,
n_int_vec = c(0, 2, 4),
knot_scheme = c("equidist", "quantile"),
link_set = c("log"),
time_range = NULL,
tau_grid = NULL,
w_recur,
w_term,
ipcw = c("cox", "km"),
ipcw_formula = ~1,
K = 5,
seed = 1L,
verbose = TRUE
)
Arguments
formula |
A |
data |
Long-format data frame; see |
id |
Character scalar; subject ID column name; see |
cluster |
Optional character scalar; cluster column name; see |
basis_set |
Character vector of candidate bases. |
degree_vec |
Integer vector of candidate degrees. |
n_int_vec |
Integer vector of interior-knot counts; 0 means boundaries only. |
knot_scheme |
|
link_set |
Character vector of candidate links (subset of |
time_range |
Optional numeric length-2 vector |
tau_grid |
Optional numeric vector; if |
w_recur |
Numeric vector of recurrent-event weights, one per recurrent event type. |
w_term |
Numeric scalar; terminal-event weight; see |
ipcw |
IPCW method ( |
ipcw_formula |
One-sided RHS formula for IPCW Cox model (if |
K |
Number of folds. |
seed |
RNG seed for fold assignment. |
verbose |
Logical; show a text progress bar and per-fold messages. |
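To illustrate how n_int_vec and knot_scheme could interact, the sketch below builds a knot vector from two boundary knots plus a given number of interior knots. The helper make_knots() is hypothetical and is not part of WAreg; the placement rules (equally spaced points for "equidist", empirical quantiles of the observed times for "quantile") are assumptions about what the scheme names mean, not a description of the package internals.

```r
# Hypothetical helper (not a WAreg function): two boundary knots plus
# n_int interior knots, placed by one of two assumed schemes.
make_knots <- function(times, n_int, scheme = c("equidist", "quantile"),
                       time_range = range(times)) {
  scheme <- match.arg(scheme)
  lo <- time_range[1]
  hi <- time_range[2]
  if (n_int == 0) return(c(lo, hi))  # boundaries only
  if (scheme == "equidist") {
    # n_int + 2 equally spaced points, including both boundaries
    return(seq(lo, hi, length.out = n_int + 2))
  }
  # interior knots at equally spaced quantiles of times strictly inside the range
  probs <- seq_len(n_int) / (n_int + 1)
  inner <- stats::quantile(times[times > lo & times < hi], probs, names = FALSE)
  c(lo, inner, hi)
}

times <- c(0.5, 1, 1.5, 2, 3, 5, 8, 10)  # made-up event times
k_eq <- make_knots(times, n_int = 2, scheme = "equidist", time_range = c(0, 10))
k_q  <- make_knots(times, n_int = 2, scheme = "quantile", time_range = c(0, 10))
```

With skewed event times, quantile placement concentrates knots where events are dense, while equidistant placement spreads them evenly over the time range.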
Value
A data frame with columns: basis, degree, n_int,
link, and aggregated PE. Lower PE is better.
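The following base-R sketch shows two steps around a cross-validation run of this kind: assigning folds at the subject level, and picking the configuration with the lowest PE from a result table. The helper assign_folds(), the toy data, and the PE values are all made up for illustration. Keeping all rows of a subject in one fold is an assumption about how long-format data should be split, not a statement about what WA_cv() does internally.

```r
# Toy long-format data: several rows per subject (hypothetical values)
toy <- data.frame(id   = c(1, 1, 1, 2, 3, 3, 4, 5, 5, 6),
                  time = c(1, 2, 4, 3, 1, 5, 2, 2, 6, 4))

# Hypothetical helper: assign folds per subject, not per row,
# so that all rows of a subject land in the same fold.
assign_folds <- function(ids, K, seed = 1L) {
  set.seed(seed)
  u <- unique(ids)
  fold_of_subject <- sample(rep_len(seq_len(K), length(u)))
  fold_of_subject[match(ids, u)]
}
toy$fold <- assign_folds(toy$id, K = 3)

# Mock table with the documented columns; the PE values are invented.
cv_res <- data.frame(basis = c("il", "pl", "bz"), degree = c(1, 1, 2),
                     n_int = c(0, 2, 4), link = "log",
                     PE = c(0.42, 0.37, 0.40))

# Lower PE is better, so take the row with the minimum PE.
best <- cv_res[which.min(cv_res$PE), ]
```

The selected row can then supply the basis, degree, and link arguments of a final WA_fit() call, with knots built from the chosen interior-knot count.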
While-Alive Regression (WA) for Composite Endpoints
Description
Fits the while-alive regression model targeting the while-alive loss rate
for composite endpoints with recurrent and terminal events. Time-varying
covariate effects are represented via user-chosen time bases (e.g., B-spline,
piecewise polynomial, interval-local). Robust inference supports
cluster-randomized trials (CRTs) via cluster-robust variance; if
cluster = NULL, IID (subject-as-cluster) variance is used.
Usage
WA_fit(
formula,
data,
id,
cluster = NULL,
knots,
tau_grid,
basis = c("il", "pl", "bz", "ns", "ms", "st", "tl", "tf"),
degree = 1,
link = c("log", "identity"),
w_recur,
w_term,
ipcw = c("km", "cox"),
ipcw_formula = ~1
)
Arguments
formula |
A |
data |
Long-format data frame with one row per event/checkpoint
per subject, containing |
id |
Character scalar; subject ID column name. |
cluster |
Optional character scalar; cluster column name for CRT-robust
inference. If |
knots |
Numeric vector (length |
tau_grid |
Numeric vector of evaluation times used to stack the
estimating equations. Independent of |
basis |
One of |
degree |
Integer degree for bases that use it (e.g., |
link |
Link function: |
w_recur |
Numeric vector of weights for each recurrent event type. Its
length must match the number of recurrent |
w_term |
Numeric scalar; weight for the terminal event. |
ipcw |
IPCW method: |
ipcw_formula |
A one-sided formula specifying RHS covariates for the IPCW Cox model
when |
Details
The estimating equations solve E[Z(t)\{L(t) - \mu_\beta(t)X_{\min}(t)\}V/G]=0
over tau_grid, where L(t) is the weighted composite loss
(recurrent+terminal), \mu_\beta(t) the while-alive loss rate under the chosen
link, X_{\min}(t) = \min(T, t), V the at-risk/terminal indicator, and
G the censoring survival modeled via ipcw.
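As a rough illustration of the quantities in the estimating equation, the sketch below computes a weighted composite loss L(t) and X_min(t) for a single subject. It assumes that L(t) is the weighted count of recurrent events by time t plus the terminal-event weight once death has occurred, and that T is taken as the subject's last observed time. Both are assumptions made for this example; the censoring weights G are ignored, and the helper names composite_loss() and x_min() are hypothetical.

```r
# Status coding follows crt_dt: 0 = censored, 1 and 2 = recurrent types, 3 = death.
# Hypothetical helper: weighted composite loss accumulated by time t.
composite_loss <- function(time, status, t, w_recur = c(1, 1), w_term = 2) {
  n1 <- sum(status == 1 & time <= t)  # type-1 recurrent events by time t
  n2 <- sum(status == 2 & time <= t)  # type-2 recurrent events by time t
  nd <- sum(status == 3 & time <= t)  # 1 if death has occurred by time t, else 0
  w_recur[1] * n1 + w_recur[2] * n2 + w_term * nd
}

# Hypothetical helper: X_min(t) = min(T, t), with T assumed to be the last observed time.
x_min <- function(time, t) min(max(time), t)

# One subject: three recurrent events, then death at time 6 (made-up values)
time   <- c(1.0, 2.5, 4.0, 6.0)
status <- c(1, 2, 1, 3)

L5 <- composite_loss(time, status, t = 5)  # recurrent events only
L7 <- composite_loss(time, status, t = 7)  # recurrent events plus death weight
crude_rate_7 <- L7 / x_min(time, 7)        # loss per unit of time alive
```

The ratio in the last line is only a single-subject analogue of a loss rate; the model itself targets a population-level rate through the link function and the IPCW-weighted equations.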
Value
An object of class "WA" with elements:
- est: named coefficient vector.
- vcov: cluster-robust variance matrix.
- se: standard errors.
- converged: logical.
- basis, degree, link, Z_cols, knots, tau_grid, id_var, cluster_var, w_recur, w_term, status_codes, formula.
Examples
ex_dt <- crt_dt[crt_dt$cluster %in% c(1,2,3,4,7,10), ]
fit <- WA_fit(
survival::Surv(time, status) ~ trt + Z1 + Z2,
data = ex_dt,
id = "id",
cluster = "cluster",
knots = seq(0, max(ex_dt$time, na.rm = TRUE), length.out = 6),
tau_grid = seq(0, max(ex_dt$time, na.rm = TRUE), length.out = 6),
basis = "bz", degree = 1, link = "log",
w_recur = c(1, 1), w_term = 2,
ipcw = "km"
)
s <- summary(fit)
nd <- unique(ex_dt[, c("trt","Z1","Z2")])
plot(fit, newdata = nd,
t_seq = seq(0, max(fit$tau_grid), length.out = 200),
id = 1, mode = "wa", smooth = TRUE)
Clustered Recurrent-Time Dataset: crt_dt
Description
A simulated dataset of clustered recurrent events with terminal/censoring outcomes and covariates, suitable for examples and tests.
Usage
data(crt_dt)
Format
A data frame with the following columns:
- id
Integer subject ID (within the whole sample).
- cluster
Integer cluster ID.
- time
Numeric event/censoring time.
- status
Integer event type indicator: 0 = censored, 1 = recurrent type 1, 2 = recurrent type 2, 3 = death (terminal).
- trt
Cluster-level treatment indicator carried to subjects (e.g., 0/1).
- Z1
Numeric covariate.
- Z2
Numeric covariate.
Details
Rows represent observed events (including censoring and death) for each subject.
Multiple rows per id indicate multiple recurrent events; terminal/censoring
rows mark the end of observation for that subject.
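The long-format layout described above can be checked with a few lines of base R. The sketch uses a small made-up data frame with the same column names as crt_dt (covariates omitted) rather than the packaged data, counts rows per subject by status code, and verifies that each subject has exactly one closing row (status 0 or 3) at its largest time.

```r
# Made-up data in the same layout as crt_dt (covariates omitted)
toy <- data.frame(
  id      = c(1, 1, 1, 2, 3, 3),
  cluster = c(1, 1, 1, 1, 2, 2),
  time    = c(0.8, 2.1, 3.0, 1.5, 0.4, 2.2),
  status  = c(1, 2, 3, 0, 1, 0)
)

# Rows per subject by status code
counts <- table(id = toy$id, status = toy$status)

# Each subject should have exactly one closing row (censoring or death) ...
closing <- toy[toy$status %in% c(0, 3), ]
ok_one_closing <- all(table(closing$id) == 1) && setequal(closing$id, toy$id)

# ... and that row should carry the subject's largest time.
last_time <- tapply(toy$time, toy$id, max)
ok_closing_last <- all(closing$time == last_time[as.character(closing$id)])
```

Running the same checks on real input before fitting can catch subjects with missing or duplicated end-of-observation rows.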
Source
Generated by the package's simulation utilities.
Examples
data(crt_dt)
head(crt_dt)
Individual Recurrent-Time Dataset: irt_dt
Description
A simulated dataset of recurrent events with terminal/censoring outcomes and covariates, organized in long format.
Usage
data(irt_dt)
Format
A data frame with the following columns:
- id
Integer subject ID (within the whole sample).
- time
Numeric event/censoring time.
- status
Integer event type indicator: 0 = censored, 1 = recurrent type 1, 2 = recurrent type 2, 3 = death (terminal).
- trt
Treatment indicator (e.g., 0/1).
- Z1
Numeric covariate.
- Z2
Numeric covariate.
Details
Long-format events: each row is an event (or censoring/death) for a subject.
Source
Generated by the package's simulation utilities.
Examples
data(irt_dt)
head(irt_dt)
Plot while-alive trajectory or a covariate's time-varying effect
Description
Plot while-alive trajectory or a covariate's time-varying effect
Usage
## S3 method for class 'WA'
plot(
x,
newdata,
t_seq,
id = 1,
mode = c("wa", "cov"),
covariate = NULL,
ylab_wa = "While-alive loss rate",
ylab_cov = NULL,
xlab = "Time",
level = 0.95,
smooth = FALSE,
span = 0.3,
...
)
Arguments
x |
A |
newdata |
Data used to rebuild the RHS design (same columns as in the model). |
t_seq |
Times to plot over (numeric vector). |
id |
Row index of |
mode |
|
covariate |
Character; covariate name (must appear on RHS) when |
ylab_wa |
Y-axis label for while-alive plot. |
ylab_cov |
Y-axis label for covariate-effect plot; default
|
xlab |
X-axis label. |
level |
Confidence level for ribbons (default 0.95). |
smooth |
Logical; if |
span |
LOESS span used when |
... |
Unused. |
Value
A ggplot2 object.
Examples
ex_dt <- crt_dt[crt_dt$cluster %in% c(1,2,3,4,7,10), ]
fit <- WA_fit(survival::Surv(time, status) ~ trt + Z1 + Z2,
data = ex_dt, id="id", cluster="cluster",
knots=seq(0, max(ex_dt$time), length.out=6),
tau_grid=seq(0, max(ex_dt$time), length.out=6),
basis="bz", degree=1, link="log",
w_recur=c(1,1), w_term=2, ipcw="km")
nd <- unique(ex_dt[, c("trt","Z1","Z2")])
plot(fit, newdata = nd,
t_seq = seq(0, max(fit$tau_grid), length.out = 200),
id = 1, mode = "wa", smooth = TRUE)
Predict while-alive loss rates
Description
Predict while-alive loss rates
Usage
## S3 method for class 'WA'
predict(object, newdata, t_seq, level = 0.95, ...)
Arguments
object |
A |
newdata |
Data frame with columns matching the RHS of the fitted model.
Predictions are computed for the rows of |
t_seq |
Numeric vector of times at which to evaluate predictions. |
level |
Confidence level for pointwise intervals (default 0.95). |
... |
Unused. |
Value
A data frame with columns id (row index in newdata),
t, mu (predicted while-alive rate), and CI columns lb, ub.
Examples
ex_dt <- crt_dt[crt_dt$cluster %in% c(1,2,3,4,7,10), ]
fit <- WA_fit(survival::Surv(time, status) ~ trt + Z1 + Z2,
data = ex_dt, id="id", cluster="cluster",
knots=seq(0, max(ex_dt$time), length.out=6),
tau_grid=seq(0, max(ex_dt$time), length.out=6),
basis="bz", degree=1, link="log",
w_recur=c(1,1), w_term=2, ipcw="km")
nd <- unique(ex_dt[, c("trt","Z1","Z2")])
pred <- predict(fit, newdata = nd, t_seq = seq(0, max(fit$tau_grid), by = 0.2))
head(pred)
Summarize a WA object
Description
Summarize a WA object
Usage
## S3 method for class 'WA'
summary(object, ...)
Arguments
object |
A |
... |
Unused. |
Value
An object of class "summary.WA" containing configuration and a
coefficient table with estimates, standard errors, and z-scores.
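For readers who want more than the z-scores, the sketch below shows how a z-score table relates to estimates and standard errors, and how Wald-type p-values and confidence intervals could be derived from them. The numbers are invented, and the p-value and interval columns are an assumption about what a user might compute; they are not documented output of summary().

```r
# Invented estimates and standard errors for two coefficients
est <- c(trt = -0.30, Z1 = 0.10)
se  <- c(trt =  0.10, Z1 = 0.05)

z <- est / se                         # z-scores, as in the coefficient table
p <- 2 * stats::pnorm(-abs(z))        # two-sided Wald p-values (assumed, not documented)
crit <- stats::qnorm(0.975)           # critical value for a 95% interval
ci <- cbind(lower = est - crit * se,  # Wald confidence limits on the coefficient scale
            upper = est + crit * se)

coef_table <- data.frame(est = est, se = se, z = z, p = p, ci)
```

Under a log link, exponentiating the estimate and the interval limits gives a ratio-scale summary of a coefficient.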
Examples
ex_dt <- crt_dt[crt_dt$cluster %in% c(1,2,3,4,7,10), ]
fit <- WA_fit(survival::Surv(time, status) ~ trt + Z1 + Z2,
data = ex_dt, id="id", cluster="cluster",
knots=seq(0, max(ex_dt$time), length.out=6),
tau_grid=seq(0, max(ex_dt$time), length.out=6),
basis="bz", degree=1, link="log",
w_recur=c(1,1), w_term=2, ipcw="km")
summary(fit)