Type: Package
Title: Model-Robust Standardization in Cluster-Randomized Trials
Version: 0.1.1
Description: Implements model-robust standardization for cluster-randomized trials (CRTs). Provides functions that standardize user-specified regression models to estimate marginal treatment effects. The targets include the cluster-average and individual-average treatment effects, with utilities for variance estimation and example simulation datasets. Methods are described in Li, Tong, Fang, Cheng, Kahan, and Wang (2025) <doi:10.1002/sim.70270>.
License: GPL-3
Encoding: UTF-8
LazyData: true
Depends: R (>= 4.1)
Imports: dplyr (>= 1.0.0), geepack (>= 1.3-2), lme4 (>= 1.1-25), nlme (>= 3.1-150), magrittr (>= 2.0.0), rlang (>= 1.0.0), stats
VignetteBuilder: knitr
URL: https://github.com/deckardt98/MRStdCRT
BugReports: https://github.com/deckardt98/MRStdCRT/issues
Suggests: knitr, rmarkdown
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-11-07 00:42:23 UTC; HP
Author: Jiaqi Tong [aut], Changjun Li [aut, cre], Xi Fang [aut], Chao Cheng [aut], Bingkai Wang [aut], Fan Li [aut]
Maintainer: Changjun Li <changjun.li@yale.edu>
Repository: CRAN
Date/Publication: 2025-11-11 21:40:18 UTC

Model-Robust Standardization Estimators for Cluster-Randomized Trials

Description

This function analyzes cluster-randomized trials (CRTs) using model-robust standardization estimators to estimate the cluster-average and individual-average treatment effects. It handles different outcome mean models (GLM, LMM, GEE, GLMM) and supports continuous, binary, and count outcomes, with options for different correlation structures and effect scales (risk difference, risk ratio, and odds ratio).
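The two estimands can differ when treatment effects vary with cluster size. The following base R sketch illustrates the distinction on the difference scale using made-up potential outcomes; the data frame `toy` and its values are hypothetical, and the code shows only the finite-sample definitions, not the package's estimators.

```r
# Hypothetical potential outcomes for three clusters of sizes 2, 4, and 6.
toy <- data.frame(
  cluster_id = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  Y1 = c(3, 5, 2, 2, 3, 1, 1, 0, 1, 2, 1, 1),
  Y0 = c(1, 1, 1, 1, 2, 0, 1, 0, 1, 2, 1, 1)
)

# Individual-level effects on the difference scale.
effect <- toy$Y1 - toy$Y0

# Cluster-average effect: average within each cluster first,
# then give every cluster equal weight.
cluster_means <- tapply(effect, toy$cluster_id, mean)
c_ate <- mean(cluster_means)

# Individual-average effect: every individual gets equal weight,
# so larger clusters contribute more.
i_ate <- mean(effect)

c(c_ate = c_ate, i_ate = i_ate)
```

Here the smallest cluster has the largest effect, so the cluster-average effect exceeds the individual-average effect.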

Usage

MRStdCRT_fit(
  formula,
  data,
  cluster,
  trt,
  trtprob = rep(0.5, nrow(data)),
  method,
  family = gaussian(link = "identity"),
  corstr,
  scale,
  jack = 1,
  alpha = 0.05
)

Arguments

formula

A formula for the outcome mean model, including covariates.

data

A data frame where categorical variables should already be converted to dummy variables.

cluster

A string representing the column name of the cluster ID in the data frame.

trt

A string representing the column name of the treatment assignment per cluster (0=control, 1=treatment).

trtprob

A vector of treatment probabilities with one entry per individual (row of data), giving the probability that the individual's cluster is assigned to treatment, conditional on covariates. Default is rep(0.5, nrow(data)).

method

A string specifying the outcome mean model. Possible values are:
- 'GLM': generalized linear model on cluster-level means (binary/continuous outcome).
- 'LMM': linear mixed model on individual-level observations (continuous outcome).
- 'GEE': marginal models fitted by generalized estimating equations.
- 'GLMM': generalized linear mixed model.

family

The family object (distribution and link function) for the outcome model. Can be one of the following:
- 'gaussian(link = "identity")': for continuous outcomes. Default is gaussian("identity").
- 'binomial(link = "logit")': for binary outcomes.
- 'poisson(link = "log")': for count outcomes.
- 'gaussian(link = "logit")': for binary outcomes, using a logit link in the generalized linear model.

corstr

A string specifying the correlation structure for GEE models (e.g., "exchangeable", "independence").

scale

A string specifying the risk measure of interest. Can be 'RD' (risk difference), 'RR' (relative risk), or 'OR' (odds ratio).

jack

A numeric value (1, 2, or 3) specifying the type of jackknife standard error estimate. Type 1 is the standard jackknife, and type 3 is recommended for small numbers of clusters. Default is 1.

alpha

A numeric value for the type-I error rate. Default is 0.05.
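To make the jack and alpha arguments concrete, here is a minimal base R sketch of a textbook delete-one-cluster jackknife standard error followed by a Wald-type interval at level alpha. It is an illustration under stated assumptions, not the package's implementation: the toy data, the helper `diff_in_cluster_means()`, and the use of a normal quantile are all hypothetical choices, and how the package's types 1, 2, and 3 differ is not shown here.

```r
set.seed(1)

# Hypothetical toy CRT: 12 clusters of 5 individuals, 6 clusters per arm.
m <- 12
toy <- data.frame(
  cluster_id = rep(seq_len(m), each = 5),
  A = rep(rep(c(0, 1), each = m / 2), each = 5)
)
toy$Y <- 1 + 0.5 * toy$A + rnorm(m)[toy$cluster_id] + rnorm(nrow(toy))

# Simple illustrative estimator: difference in arm-specific means
# of the cluster-level outcome means.
diff_in_cluster_means <- function(d) {
  y_bar <- tapply(d$Y, d$cluster_id, mean)
  a_bar <- tapply(d$A, d$cluster_id, mean)
  mean(y_bar[a_bar == 1]) - mean(y_bar[a_bar == 0])
}

theta_hat <- diff_in_cluster_means(toy)

# Re-estimate with each cluster left out in turn.
ids <- unique(toy$cluster_id)
theta_loo <- vapply(
  ids,
  function(id) diff_in_cluster_means(toy[toy$cluster_id != id, ]),
  numeric(1)
)

# Textbook jackknife standard error over clusters.
se_jack <- sqrt((m - 1) / m * sum((theta_loo - mean(theta_loo))^2))

# Wald-type interval at level alpha (normal quantile is an assumption).
alpha <- 0.05
ci <- theta_hat + c(-1, 1) * qnorm(1 - alpha / 2) * se_jack
```

Because whole clusters are deleted rather than individuals, the resampling respects the within-cluster correlation.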

Value

A list with the following components:
- 'estimate': A summary table of estimates.
- 'm': Number of clusters.
- 'N': Total number of observations per cluster.
- 'family': The family used for the model.
- 'model': The method used for the outcome mean model.

Examples


utils::data("ppact", package = "MRStdCRT")

fit <- MRStdCRT_fit(
  formula = PEGS ~ AGE + FEMALE + comorbid + Dep_OR_Anx + pain_count + PEGS_bl +
    BL_benzo_flag + BL_avg_daily + satisfied_primary + n,
  data     = ppact,
  cluster  = "CLUST",
  trt      = "INTERVENTION",
  trtprob  = NULL,
  method   = "GEE",
  corstr   = "independence",
  scale    = "RR"
)
summary(fit)
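The data argument expects categorical variables to be converted to dummy variables beforehand. One way to do this in base R is with model.matrix(); the sketch below uses a hypothetical data frame `toy` with a made-up factor `site_type`, neither of which comes from the package.

```r
# Hypothetical data with one three-level categorical covariate.
toy <- data.frame(
  Y = c(2.1, 3.4, 1.8, 2.9, 3.3, 2.2),
  site_type = factor(c("urban", "rural", "suburban",
                       "urban", "rural", "suburban"))
)

# model.matrix() expands the factor into 0/1 indicator columns;
# dropping the intercept column leaves one dummy per non-reference level.
dummies <- model.matrix(~ site_type, data = toy)[, -1, drop = FALSE]

# Combine the outcome with the dummy columns for use in a model formula.
toy_dummy <- cbind(toy["Y"], as.data.frame(dummies))
names(toy_dummy)
```

The reference level is the first factor level ("rural" here, by alphabetical order), so it has no column of its own.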


Model-robust standardization in CRT Point Estimate

Description

This function calculates a model-robust point estimate for a cluster-randomized trial (CRT).

Usage

MRStdCRT_point(
  formula,
  data,
  cluster,
  trt,
  trtprob,
  family = gaussian(link = "identity"),
  corstr,
  method = "GLM",
  scale
)

Arguments

formula

A formula for the outcome mean model, including covariates.

data

A data frame where categorical variables should already be converted to dummy variables.

cluster

A string representing the column name of the cluster ID in the data frame.

trt

A string representing the column name of the treatment assignment per cluster.

trtprob

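A vector of treatment probabilities with one entry per individual (row of data), giving the probability that the individual's cluster is assigned to treatment, conditional on covariates. This function's signature sets no default; MRStdCRT_fit() uses rep(0.5, nrow(data)).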

family

The family object (distribution and link function) for the outcome model. Can be one of the following:
- 'gaussian(link = "identity")': for continuous outcomes. Default is gaussian("identity").
- 'binomial(link = "logit")': for binary outcomes.
- 'poisson(link = "log")': for count outcomes.
- 'gaussian(link = "logit")': for binary outcomes, using a logit link in the generalized linear model.

corstr

A string specifying the correlation structure for GEE models (e.g., "exchangeable", "independence").

method

A string specifying the outcome mean model. Possible values are:
- 'GLM': generalized linear model on cluster-level means (continuous/binary outcome).
- 'LMM': linear mixed model on individual-level observations (continuous outcome).
- 'GEE': marginal models fitted by generalized estimating equations.
- 'GLMM': generalized linear mixed model.

scale

A string specifying the risk measure of interest. Can be 'RD' (risk difference), 'RR' (relative risk), or 'OR' (odds ratio).
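The three scales are different contrasts of the same pair of marginal means. The sketch below computes each one from two hypothetical values, `mu1` (treatment) and `mu0` (control); these numbers are made up for illustration, and the code does not show how the package computes or reports the contrasts internally.

```r
# Hypothetical standardized marginal means under treatment and control.
mu1 <- 0.30
mu0 <- 0.20

# 'RD': risk difference.
effect_rd <- mu1 - mu0

# 'RR': relative risk (risk ratio).
effect_rr <- mu1 / mu0

# 'OR': odds ratio.
effect_or <- (mu1 / (1 - mu1)) / (mu0 / (1 - mu0))

c(RD = effect_rd, RR = effect_rr, OR = effect_or)
```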

Value

A list with the following components:
- 'data1': A data frame containing all individual-level observations.
- 'data_clus': A data frame containing all cluster-level summaries.
- 'c(cate,iate,test_NICS)': A vector containing: (i) cate: point estimate for the cluster-average treatment effect; (ii) iate: point estimate for the individual-average treatment effect; (iii) test_NICS: value of the test statistic for non-informative cluster sizes.


Example Dataset: Simulated CRT (binary outcome)

Description

A simulated dataset for demonstrating MRStdCRT with a binary outcome. Treatment is assigned at the cluster level and is constant within cluster.

Usage

data(data_sim_binary)

Format

A data frame with the following variables (10 columns):

A

Cluster-level treatment assignment (0/1), constant within cluster.

H1

Cluster-level covariate 1.

H2

Cluster-level covariate 2.

N

Cluster size recorded on each row (repeats within cluster).

X1

Individual-level covariate 1 (numeric).

X2

Individual-level covariate 2 (numeric or binary coded 0/1).

Y

Observed binary outcome (0/1).

Y0

Potential outcome under control (0/1).

Y1

Potential outcome under treatment (0/1).

cluster_id

Cluster identifier (integer or factor), constant within cluster.

Source

Simulated data included with the package for examples.

Examples

data(data_sim_binary)
head(data_sim_binary)
with(data_sim_binary, table(A, Y))
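Two structural properties described above, treatment constant within cluster and a cluster size repeated on each row, can be checked with tapply(). The sketch below runs on a small hypothetical data frame `toy` that mimics the column names; it assumes that N equals the number of rows observed in each cluster, which the documentation does not state explicitly.

```r
# Hypothetical data mimicking the columns cluster_id, A, and N.
toy <- data.frame(
  cluster_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  A = c(1, 1, 1, 0, 0, 1, 1, 1, 1),
  N = c(3, 3, 3, 2, 2, 4, 4, 4, 4)
)

# Treatment should take a single value within every cluster.
trt_constant <- tapply(toy$A, toy$cluster_id,
                       function(a) length(unique(a)) == 1)

# Under the assumption above, N should equal the row count of its cluster.
size_matches <- tapply(toy$N, toy$cluster_id,
                       function(n) all(n == length(n)))

all(trt_constant)
all(size_matches)
```

To apply the same checks to the packaged data, replace `toy` with data_sim_binary.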

Example Dataset: Simulated CRT (continuous outcome)

Description

A simulated dataset for demonstrating MRStdCRT with a continuous outcome. Treatment is assigned at the cluster level and is constant within cluster.

Usage

data(data_sim_continuous)

Format

A data frame with the following variables (10 columns):

A

Cluster-level treatment assignment (0/1), constant within cluster.

H1

Cluster-level covariate 1.

H2

Cluster-level covariate 2.

N

Cluster size recorded on each row (repeats within cluster).

X1

Individual-level covariate 1 (numeric).

X2

Individual-level covariate 2 (numeric or binary coded 0/1).

Y

Observed continuous outcome.

Y0

Potential outcome under control (continuous).

Y1

Potential outcome under treatment (continuous).

cluster_id

Cluster identifier (integer or factor), constant within cluster.

Source

Simulated data included with the package for examples.

Examples

data(data_sim_continuous)
head(data_sim_continuous)
table(data_sim_continuous$cluster_id)

Example Dataset: PPACT

Description

The Pain Program of Active Coping and Training (PPACT) is a large-scale, mixed-methods, cluster-randomized trial (CRT) comparing the effectiveness of an integrated, interdisciplinary program versus usual care in treating patients with chronic pain on long-term opioid treatment (CP-LOT). The primary outcome is the impact of pain (assessed using the PEGS).

Usage

ppact

Format

A data frame containing the primary outcome and cluster-level and individual-level covariates:

SID

Study ID

CLUST

Cluster

INTERVENTION

Study arm

AGE

Patient age at randomization

FEMALE

Participant gender

comorbid

Diagnosis of 2 or more chronic medical conditions in the 6 months prior to randomization

Dep_OR_Anx

Anxiety and/or depression diagnosis in 6 months prior to randomization

pain_count

Number of different pain types from which participants have diagnoses in 12 months prior to randomization

BL_benzo_flag

Benzodiazepine dispensed in 6 months prior to randomization

BL_avg_daily

Average morphine milligram equivalents dose per day in the 6 months prior to randomization

PEGS_bl

PEGS score at baseline

satisfied_primary

Satisfaction with primary care services in prior 3 months

PEGS

PEGS score

n

Cluster size

Source

ClinicalTrials.gov: NCT02113592. The manuscript reporting the study's main outcomes is published in the Annals of Internal Medicine (https://doi.org/10.7326/M21-1436).


Summarize an MRS_obj Fit

Description

Print a concise summary of a model-robust standardization CRT fit, including the c-ATE and i-ATE estimates with SEs and CIs.

Usage

## S3 method for class 'MRS_obj'
summary(object, ...)

Arguments

object

An object of class MRS_obj, as returned by MRStdCRT_fit().

...

Additional arguments (currently ignored).

Value

Invisibly returns the original MRS_obj object, after printing: