In intensive longitudinal data, predictors often vary both
between persons (e.g. some people are higher on
average) and within person (e.g. momentary
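fluctuations). Entering the raw variable in a multilevel model conflates
these two sources of variation. tidyILD’s ild_center() makes the
decomposition explicit by splitting a variable into a between-person (BP) component, the person mean, and a within-person (WP) component, the deviation from that mean.
Use the WP component at level 1 (the within-person effect) and the BP component at level 2 or in cross-level interactions to avoid the ecological fallacy and conflation bias.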
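The decomposition itself can be sketched in a few lines of base R. This is an illustration only, not tidyILD's implementation: the data frame `toy` and its values are made up, and the column names `x_bp` and `x_wp` simply mirror the `_bp` / `_wp` naming used by ild_center().

```r
# Toy data: 3 persons, 4 observations each (hypothetical values)
toy <- data.frame(
  id = rep(1:3, each = 4),
  x  = c(2, 4, 3, 3,  6, 8, 7, 7,  1, 1, 2, 0)
)

# Between-person component: each person's mean, repeated on every row
toy$x_bp <- ave(toy$x, toy$id, FUN = mean)

# Within-person component: deviation from the person's own mean
toy$x_wp <- toy$x - toy$x_bp

# The two components add back up to the raw variable, and the
# within-person part averages to zero for every person
stopifnot(isTRUE(all.equal(toy$x_bp + toy$x_wp, toy$x)))
stopifnot(all(abs(tapply(toy$x_wp, toy$id, mean)) < 1e-12))

toy
```

Because `x_wp` has mean zero within every person, it carries no between-person information, which is what lets the two effects be estimated separately.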
library(tidyILD)
d <- ild_simulate(n_id = 5, n_obs_per = 8, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
x <- ild_center(x, y)
# New columns: y_bp (person mean), y_wp (deviation from person mean)
head(x[, c("id", "y", "y_bp", "y_wp")])
#> # A tibble: 6 × 4
#>      id       y   y_bp    y_wp
#>   <int>   <dbl>  <dbl>   <dbl>
#> 1     1 -1.04   -0.154 -0.882
#> 2     1 -0.283  -0.154 -0.129
#> 3     1 -0.0573 -0.154  0.0969
#> 4     1 -0.0386 -0.154  0.116
#> 5     1 -0.379  -0.154 -0.225
#> 6     1  0.629  -0.154  0.784
ILD often has irregular time spacing (e.g. EMA prompts at random
times). dplyr::lag() shifts values by row position, so it implicitly assumes equal spacing and can
pair an observation with one much further back in time than intended. Use ild_lag() with
mode = "gap_aware" so that a lag is set to NA when the time
distance to the previous observation exceeds a threshold (max_gap).
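The idea behind a gap-aware lag can be sketched in base R. The helper `gap_aware_lag1()` below is hypothetical and is not how ild_lag() is implemented; it assumes rows are already sorted by person and then by time, and it treats an interval strictly greater than `max_gap` as too large.

```r
# Hypothetical helper, not tidyILD's implementation: lag `value` by one
# observation within each person, but return NA when the interval to the
# previous observation is larger than `max_gap`.
# Assumes rows are sorted by id and then by time.
gap_aware_lag1 <- function(id, time, value, max_gap) {
  n <- length(value)
  prev_value <- c(NA, value[-n])   # value on the previous row
  prev_time  <- c(NA, time[-n])    # time on the previous row
  prev_id    <- c(NA, id[-n])      # person on the previous row
  same_person <- !is.na(prev_id) & prev_id == id
  dt <- time - prev_time
  ok <- same_person & dt <= max_gap
  ifelse(ok, prev_value, NA_real_)
}

# Toy data (hypothetical): person 1 has one long gap before the 4th row
toy <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2),
  time = c(0, 60, 120, 400, 0, 50, 110),
  y    = c(5, 6, 7, 8, 1, 2, 3)
)
toy$y_lag1 <- gap_aware_lag1(toy$id, toy$time, toy$y, max_gap = 90)
toy
```

A purely row-based lag would give row 4 the value 7 (recorded 280 time units earlier) and row 5 the value 8 (which belongs to a different person); the gap-aware version returns NA in both places.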
d <- ild_simulate(n_id = 3, n_obs_per = 6, irregular = TRUE, seed = 2)
x <- ild_prepare(d, id = "id", time = "time")
x <- ild_lag(x, y, mode = "gap_aware", max_gap = 4000)
# Compare: .ild_dt (interval) and y_lag1 (NA after large gaps)
x[, c(".ild_id", ".ild_dt", "y", "y_lag1")]
#> # A tibble: 18 × 4
#>    .ild_id .ild_dt      y y_lag1
#>      <int>   <dbl>  <dbl>  <dbl>
#>  1       1      NA -0.335     NA
#>  2       1   3843. -0.552 -0.335
#>  3       1   3445.  0.955 -0.552
#>  4       1   3114. -1.01   0.955
#>  5       1   4531.  0.715     NA
#>  6       1   3600.  0.394  0.715
#>  7       2      NA  0.924     NA
#>  8       2   4000.  0.745     NA
#>  9       2   3161.  1.66   0.745
#> 10       2   3698.  0.119  1.66
#> 11       2   3603.  1.61   0.119
#> 12       2   3223.  1.89   1.61
#> 13       3      NA  0.993     NA
#> 14       3   2904. -0.155  0.993
#> 15       3   3869.  1.42  -0.155
#> 16       3   4138.  0.995     NA
#> 17       3   3747.  1.79   0.995
#> 18       3   2699.  1.62   1.79
ild_spacing_class() returns “regular-ish” or
“irregular-ish” based on the variability of intervals and the proportion
of large gaps. The rule is overridable. This classification can inform
the choice of correlation structure in ild_lme() (AR1 for
regular-ish, CAR1 for irregular-ish).
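To make the two ingredients concrete, here is a base R sketch of a spacing summary and a classification rule. Everything in it is an assumption made for illustration: the data, the helper `classify()`, and in particular the thresholds (`max_rel_iqr`, `gap_factor`, `max_pct_gap`) are invented and are not the rule that ild_spacing_class() applies.

```r
# Hypothetical observation times (seconds) for two persons
obs <- data.frame(
  id   = rep(1:2, each = 5),
  time = c(0, 3600, 7200, 10800, 14400,   # person 1: evenly spaced
           0, 1000, 9000, 9500, 20000)    # person 2: unevenly spaced
)

# Intervals between consecutive observations, computed within person
dt_by_id <- lapply(split(obs$time, obs$id), diff)

# Illustrative rule (thresholds are made up): a series is "irregular-ish"
# if its intervals vary a lot relative to their median, or if too many
# intervals are large gaps (more than gap_factor times the median)
classify <- function(dt, max_rel_iqr = 0.5, gap_factor = 2, max_pct_gap = 0.1) {
  rel_iqr <- IQR(dt) / median(dt)
  pct_gap <- mean(dt > gap_factor * median(dt))
  if (rel_iqr > max_rel_iqr || pct_gap > max_pct_gap) {
    "irregular-ish"
  } else {
    "regular-ish"
  }
}

spacing <- data.frame(
  id            = names(dt_by_id),
  median_dt     = sapply(dt_by_id, median),
  iqr_dt        = sapply(dt_by_id, IQR),
  n_intervals   = lengths(dt_by_id),
  spacing_class = sapply(dt_by_id, classify)
)
spacing
```

With these toy values, person 1 has identical intervals and is labelled regular-ish, while person 2 has both highly variable intervals and one large gap and is labelled irregular-ish.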
ild_summary(x)$spacing
#> $median_dt
#> [1] 3603.228
#>
#> $iqr_dt
#> [1] 663.6274
#>
#> $n_intervals
#> [1] 15
#>
#> $pct_gap
#> [1] NA
#>
#> $by_id
#> # A tibble: 3 × 5
#>      id median_dt iqr_dt n_intervals pct_gap
#>   <int>     <dbl>  <dbl>       <int>   <dbl>
#> 1     1     3600.   398.           5      NA
#> 2     2     3603.   475.           5      NA
#> 3     3     3747.   965.           5      NA
ild_spacing_class(x)
#> [1] "regular-ish"
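If you fit models directly with nlme rather than through ild_lme(), the same mapping from spacing class to correlation structure could be written as below. The helper `pick_correlation()` is hypothetical, the column names `id` and `time` are assumptions about your data, and this document does not state whether ild_lme() builds its correlation structures this way.

```r
library(nlme)

# Hypothetical helper (not part of tidyILD): map a spacing label to an
# nlme correlation structure. Assumes the data have columns `id` and `time`.
pick_correlation <- function(spacing_class) {
  switch(
    spacing_class,
    # AR1: correlation decays with the number of observations apart
    "regular-ish"   = corAR1(form = ~ 1 | id),
    # CAR1: continuous-time AR1, correlation decays with elapsed time
    "irregular-ish" = corCAR1(form = ~ time | id),
    stop("unknown spacing class: ", spacing_class)
  )
}

cs_regular   <- pick_correlation("regular-ish")
cs_irregular <- pick_correlation("irregular-ish")
class(cs_regular)[1]
class(cs_irregular)[1]
```

The resulting object can be passed as the `correlation` argument of nlme::lme() or nlme::gls().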