All estimation in seine rests on the Conditional
Average Representativeness (CAR) assumption: that individual outcomes
are mean-independent of predictor group membership, conditional on the
observed covariates. ei_test_car() provides a formal test
of this assumption. However, the test has important limitations that
users should understand before interpreting its results.
The CAR assumption implies that the conditional expectation function
(CEF) of the aggregate outcome takes a specific partially linear form.
ei_test_car() tests this implication by comparing a fully
nonparametric estimate of the CEF to one constrained to that form, and
evaluating the goodness-of-fit difference via a Wald statistic. A
significant result indicates that the data are inconsistent with the
partially linear structure implied by CAR.
By default, the p-value is computed via a permutation test (Kennedy & Cade, 1996) on the Wald statistic. For large samples (2,000 or more observations), the default switches to the asymptotic chi-squared distribution, which is faster to compute.
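To make the two ways of computing a p-value concrete, the following base R sketch contrasts them on a toy linear model. It is an illustration only and not seine's implementation: the simulated data, the helper `wald_stat()`, and the particular scheme of permuting reduced-model residuals are all assumptions made for this example.

```r
set.seed(1)

# Toy data in which the null hypothesis is true: x has no effect on y.
n <- 200
z <- rnorm(n)                        # nuisance covariate, always in the model
x <- matrix(rnorm(2 * n), ncol = 2)  # two terms whose joint effect is tested
y <- 1 + 0.5 * z + rnorm(n)

# Wald statistic for H0: both coefficients on x are zero (hypothetical helper).
wald_stat <- function(y, x, z) {
    fit <- lm(y ~ z + x)
    idx <- 3:4                       # positions of the x coefficients
    b <- coef(fit)[idx]
    V <- vcov(fit)[idx, idx]
    drop(crossprod(b, solve(V, b)))
}

W_obs <- wald_stat(y, x, z)

# Permutation null: shuffle residuals from the reduced model (without x),
# rebuild the outcome, and recompute the statistic each time.
reduced <- lm(y ~ z)
iter <- 200
W_perm <- replicate(iter, {
    y_star <- fitted(reduced) + sample(resid(reduced))
    wald_stat(y_star, x, z)
})

# Permutation p-value versus the asymptotic chi-squared p-value
p_perm <- (1 + sum(W_perm >= W_obs)) / (iter + 1)
p_asym <- pchisq(W_obs, df = 2, lower.tail = FALSE)

c(permutation = p_perm, asymptotic = p_asym)
```

Under the formula used in this sketch, the smallest attainable permutation p-value is `1 / (iter + 1)`, so the resolution of the permutation p-value is limited by the number of iterations.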
library(seine)
data(elec_1968)

spec = ei_spec(
    elec_1968,
    predictors = vap_white:vap_other,
    outcome = pres_dem_hum:pres_abs,
    total = pres_total,
    covariates = c(state, pop_city:pop_rural, farm:educ_coll, inc_00_03k:inc_25_99k),
    preproc = function(x) {
        x = model.matrix(~ 0 + ., x) # convert factors to dummies
        bases::b_bart(x, trees = 200)
    }
)

ei_test_car(spec, iter = 200) # use iter = 1000 or more in practice
#> # A tibble: 4 × 4
#>   outcome          W    df p.value
#>   <chr>        <dbl> <int>   <dbl>
#> 1 pres_dem_hum  388.   157   0.005
#> 2 pres_rep_nix  253.   157   0.005
#> 3 pres_ind_wal  443.   157   0.005
#> 4 pres_abs      142.   157   0.665

The output is a data frame with one row per outcome variable. The
W column contains the Wald statistic, df its
degrees of freedom, and p.value the p-value for each
outcome. P-values are not adjusted for multiple testing by default; pass
them to p.adjust() if a correction is desired.
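As a brief illustration of that last step, the sketch below applies a Holm correction with base R's `p.adjust()`. The data frame `res` is typed in by hand from the printed output above so that the snippet runs on its own; in a real session one would use the object returned by `ei_test_car()` directly.

```r
# p-values copied by hand from the printed output above
res <- data.frame(
    outcome = c("pres_dem_hum", "pres_rep_nix", "pres_ind_wal", "pres_abs"),
    p.value = c(0.005, 0.005, 0.005, 0.665)
)

# Holm's method controls the family-wise error rate across the four tests
res$p.adj <- p.adjust(res$p.value, method = "holm")
res
```

Here the three small p-values each become 0.02 after adjustment, and the fourth is unchanged at 0.665.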
ei_test_car() is a useful diagnostic, but its
limitations are substantial and should be kept in mind when interpreting
the results.
The test only checks a necessary implication of CAR, not CAR itself. CAR is a condition on individual-level data, but only aggregate-level data are observed. The test asks whether the aggregate CEF is inconsistent with CAR; a failure to reject does not mean that CAR holds, only that the data do not conflict with this one implication. Many forms of individual-level confounding may leave the aggregate CEF approximately in the partially linear form, and the test cannot detect these.
The test requires a rich basis expansion to have
power. If the preproc argument to
ei_spec() does not include a flexible basis expansion of
the covariates and predictors, the test will have little power to detect
violations of CAR. An interaction between the predictors and covariates
that is not captured by the basis will not be flagged. A warning is
issued if preproc is absent. In general, the richer the
basis expansion, the better the test can detect violations, but also the
more data are needed for the test statistic to be well-calibrated.
The test may be anti-conservative in small samples.
The Wald statistic is only asymptotically chi-squared, and the
permutation approximation of the null distribution may also be imperfect
when the dimensionality of the basis expansion is large relative to the
sample size. In practice, this means the test may reject too often in
small samples. The undersmooth argument controls how
aggressively the partially linear component is estimated, and increasing
it can improve Type I error control at the cost of power.
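The small-sample concern can be quantified exactly in a simpler setting, which the base R sketch below uses as an analogy. It assumes a Gaussian linear model with an intercept, `p` tested coefficients, and `n` observations, where the Wald statistic divided by `p` follows an F distribution under the null. The function name `wald_type1()` and the sample sizes are hypothetical choices for illustration, and the numbers say nothing about the exact behavior of `ei_test_car()`.

```r
# Actual rejection rate of a nominal-level Wald test that refers W to a
# chi-squared(p) distribution, when in fact W / p ~ F(p, n - p - 1).
wald_type1 <- function(n, p, alpha = 0.05) {
    crit <- qchisq(1 - alpha, df = p)   # chi-squared critical value
    pf(crit / p, df1 = p, df2 = n - p - 1, lower.tail = FALSE)
}

rates <- c(
    small_n = wald_type1(n = 60, p = 30),    # many terms relative to n
    large_n = wald_type1(n = 5000, p = 30)   # same terms, much more data
)
rates
```

When the number of tested terms is large relative to the sample size, the rejection rate in this analogy exceeds the nominal 5% level; as `n` grows, it approaches the nominal level.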
A significant result does not prevent estimation.
Rejecting the null means the data suggest CAR does not hold exactly. It
does not mean that estimation with ei_est() is impossible
or useless, only that the estimates may be biased. In that case, the
sensitivity analysis tools in vignette("sensitivity") are
important for assessing how much the conclusions depend on the
assumption. Conversely, a non-significant result is only weak evidence that
the assumption holds; it does not substitute for careful subject-matter
reasoning about which confounders might be present.
Helwig, N. E. (2022). Robust permutation tests for penalized splines. Stats, 5(3), 916-933.
Kennedy, P. E., & Cade, B. S. (1996). Randomization tests for multiple regression. Communications in Statistics-Simulation and Computation, 25(4), 923-936.
McCartan, C., & Kuriwaki, S. (2025+). Identification and semiparametric estimation of conditional means from aggregate data. Working paper arXiv:2509.20194.
This vignette was originally produced by a large language model, and then reviewed and edited by the package authors.