| Title: | Tidy Intensive Longitudinal Data Analysis |
| Version: | 0.0.1 |
| Description: | An opinionated, tidyverse-native toolkit for intensive longitudinal data (ILD). Encodes time structure, enforces within-between decomposition, provides spacing-aware lags, and integrates diagnostics and visualization. Use ild_prepare(), ild_center(), ild_lag(), and related functions for a unified pipeline from raw EMA/diary data to interpretable models. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.0.0) |
| Imports: | tibble, dplyr, lubridate, rlang, lme4, nlme, ggplot2 |
| Suggests: | testthat (≥ 3.0.0), roxygen2, knitr, broom.mixed |
| VignetteBuilder: | knitr |
| Collate: | 'ild-class.R' 'utils.R' 'ild_prepare.R' 'ild_summary.R' 'ild_center.R' 'ild_lag.R' 'ild_spacing_class.R' 'ild_missing_pattern.R' 'ild_check_lags.R' 'ild_lme.R' 'ild_diagnostics.R' 'ild_manifest.R' 'ild_plot.R' 'ild_simulate.R' 'data.R' 'broom.R' |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-02-11 19:46:04 UTC; alexanderlitovchenko |
| Author: | Alex Litovchenko [aut, cre] |
| Maintainer: | Alex Litovchenko <al4877@columbia.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-13 16:40:02 UTC |
Coerce to ILD object
Description
If the object already has the required '.ild_*' columns and attributes, validates it and returns it, adding the ild_tbl class if it is missing. Otherwise, throws an error.
Usage
as_ild(x)
Arguments
x |
A data frame or tibble that may already be ILD-shaped. |
Value
An ILD tibble (invisibly validated).
Tidy and augment ild_lme fits with broom.mixed
Description
These S3 methods delegate to [broom.mixed::tidy()] and [broom.mixed::augment()]
on the underlying model object so that ild_lme fits work in tidy workflows.
Package broom.mixed must be attached (e.g. library(broom.mixed)).
Usage
tidy.ild_lme(x, ...)
augment.ild_lme(x, ...)
Arguments
x |
A fitted model from [ild_lme()]. |
... |
Passed to |
Value
Same as the corresponding broom.mixed method.
Example EMA-style intensive longitudinal dataset
Description
A small simulated dataset with 10 persons and 14 observations per person, irregular timing, and two variables (mood, stress). For use in examples and vignettes. Use [ild_prepare()] to convert to an ILD object.
Format
A data frame with 140 rows and 4 columns:
- id
Person identifier (1–10).
- time
POSIXct timestamp (irregular within person).
- mood
Simulated mood score.
- stress
Simulated stress score.
Source
Simulated with a fixed seed (12345) for reproducibility.
Bundle a result with a reproducibility manifest
Description
Combines a result (e.g. a fit from [ild_lme()] or output from
[ild_diagnostics()]) with a manifest and optional label for one-shot
saving. Typical use: saveRDS(ild_bundle(fit, label = "model_ar1"), "run.rds").
You can build a manifest with [ild_manifest()] and pass scenario
(e.g. from [ild_summary()]) and seed before bundling.
Usage
ild_bundle(result, manifest = NULL, label = NULL)
Arguments
result |
Any object (e.g. fitted model, diagnostics list). |
manifest |
List. Reproducibility manifest from [ild_manifest()].
If |
label |
Optional character. Short label for the run (e.g.
|
Value
A list with elements result, manifest, label,
suitable for [saveRDS()].
Examples
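# Prepare simulated data and fit a random-intercept model without AR1 errors
dat <- ild_prepare(ild_simulate(seed = 1), "id", "time")
fit <- ild_lme(y ~ 1 + (1 | id), dat, ar1 = FALSE, warn_no_ar1 = FALSE)
# Bundle with a label only
b <- ild_bundle(fit, label = "no_ar1")
names(b)
# Bundle with an explicit manifest
m <- ild_manifest(seed = 1, scenario = list(n_obs = 50))
b <- ild_bundle(fit, manifest = m, label = "run1")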
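The save-and-restore workflow described above can be sketched in base R without the package. The list below only mimics the documented element names (result, manifest, label); its contents are hypothetical stand-ins, not output from ild_bundle().

```r
# Hypothetical stand-in for a bundle: same element names as documented above
bundle <- list(
  result   = list(estimate = 0.42),                    # placeholder for a fitted model
  manifest = list(timestamp = Sys.time(), seed = 1L),  # placeholder manifest
  label    = "run1"
)

# Write to a temporary file and read it back
path <- tempfile(fileext = ".rds")
saveRDS(bundle, path)
restored <- readRDS(path)

# The restored object carries the result together with its provenance
names(restored)
restored$manifest$seed
```

Keeping the manifest inside the same file as the result means the two cannot drift apart after saving.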
Within-person and between-person decomposition (centering)
Description
For each selected variable, computes the person mean (between-person component) and the within-person deviation (variable minus person mean). Use '*_wp' at level 1 and '*_bp' at level 2 or in cross-level interactions to avoid the ecological fallacy and conflation bias.
Usage
ild_center(x, ..., type = c("person_mean", "grand_mean"))
Arguments
x |
An ILD object (see [is_ild()]). |
... |
Variables to center (tidy-select). Unquoted names or a single character vector of column names. |
type |
Character. '"person_mean"' (default) for person-mean centering (x_bp, x_wp); '"grand_mean"' for grand-mean centering (x_gm, x_wp_gm). |
Value
The same ILD tibble with additional columns: for each variable 'v', 'v_bp' (person mean), 'v_wp' (v - v_bp). If 'type = "grand_mean"', also 'v_gm' and optionally 'v_wp_gm'. ILD attributes are preserved.
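To make the decomposition concrete, here is a minimal base R sketch of person-mean centering on toy data. It illustrates the arithmetic only (v_bp is the person mean, v_wp is v minus v_bp) and is not the package's implementation; the data values are made up.

```r
# Toy data: two persons, three observations each (hypothetical values)
d <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  stress = c(2, 4, 6, 10, 10, 13)
)

# Between-person component: each person's own mean, repeated on every row
d$stress_bp <- ave(d$stress, d$id, FUN = mean)

# Within-person component: deviation from that person's mean
d$stress_wp <- d$stress - d$stress_bp

d
```

By construction the within-person column sums to zero for each person, so it carries no between-person information.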
Check lag variable validity (gap-aware)
Description
Given an ILD object and lag variable names, reports how many lagged
values are valid vs invalid (NA because the time distance to the
lagged row exceeded a threshold). Useful to audit lag columns before
modeling without re-specifying max_gap.
Usage
ild_check_lags(x, lag_vars = NULL, max_gap = NULL)
Arguments
x |
An ILD object (see [is_ild()]) that contains lag columns
(e.g. from [ild_lag()] with |
lag_vars |
Character vector of lag column names (e.g. |
max_gap |
Numeric. Threshold used to define invalid (same units as
|
Value
A data frame with one row per lag variable: var, n_valid,
n_invalid, n_first (rows that are first per person, so no lag),
n_total, pct_valid (among rows that could have a lag, i.e. excluding first).
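The counts in the returned data frame can be reproduced by hand. The sketch below assumes a lag column that is NA either on a person's first row or because the gap was too large, and that the source variable itself has no missing values; the vectors are hypothetical.

```r
# Hypothetical inputs: person id and an already computed lag-1 column
id        <- c(1, 1, 1, 1, 2, 2)
mood_lag1 <- c(NA, 5, 6, NA, NA, 3)

# First row of each person can never have a lag
is_first  <- !duplicated(id)
could_lag <- !is_first

n_first   <- sum(is_first)
n_valid   <- sum(could_lag & !is.na(mood_lag1))
n_invalid <- sum(could_lag & is.na(mood_lag1))
n_total   <- length(mood_lag1)

# Percent valid among rows that could have a lag (first rows excluded)
pct_valid <- 100 * n_valid / (n_valid + n_invalid)

data.frame(var = "mood_lag1", n_valid, n_invalid, n_first, n_total, pct_valid)
```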
Residual diagnostics for an ILD model
Description
Computes the residual ACF (by person and/or pooled), residuals vs. fitted values, residuals vs. time, and an optional Q-Q plot. For 'ild_lme' models with 'ar1 = TRUE', reports the estimated AR/CAR parameter.
Usage
ild_diagnostics(object, data = NULL, by_id = TRUE, ...)
Arguments
object |
A fitted model from [ild_lme()] (or an object with 'residuals()', and optional 'fitted()'; if not 'ild_lme', pass 'data' with '.ild_id' and '.ild_time_num' or '.ild_seq'). |
data |
Optional. ILD data (required if 'object' is not from [ild_lme()]). |
by_id |
Logical. If 'TRUE', compute ACF within each person (default 'TRUE'). |
... |
Unused. |
Value
A list with: 'acf' (ACF values or list by id), 'residuals', 'fitted', 'id', 'time' (or 'seq'), 'ar1_param' (if applicable), and 'plot' (a ggplot or list of plots).
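A per-person residual ACF can be sketched with stats::acf() alone. This is an illustration under stated assumptions, not the package's method: the residuals are simulated white noise, observations are treated as equally spaced by index, and averaging across persons is just one simple way to summarize.

```r
set.seed(1)

# Hypothetical residuals: 3 persons, 20 observations each
res <- data.frame(
  id    = rep(1:3, each = 20),
  resid = rnorm(60)
)

# ACF within each person, lags 1 to 5 (lag 0 is always 1, so drop it)
acf_by_id <- lapply(split(res$resid, res$id), function(r) {
  a <- stats::acf(r, lag.max = 5, plot = FALSE)
  drop(a$acf)[-1]
})

# Simple summary across persons: mean autocorrelation at each lag
mean_acf <- rowMeans(do.call(cbind, acf_by_id))
mean_acf
```

Computing the ACF within person avoids pairing the last residual of one person with the first residual of the next.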
Spacing-aware lag within person
Description
Computes lagged values within each person. Use this instead of [dplyr::lag()], which shifts values by row position and ignores elapsed time, so it implicitly assumes equal spacing with no gaps and is unsafe for irregular ILD.
Usage
ild_lag(
x,
...,
n = 1L,
mode = c("index", "gap_aware", "time_window"),
max_gap = Inf,
window = NULL,
resolution = c("closest_prior", "last_in_window", "mean_in_window")
)
Arguments
x |
An ILD object (see [is_ild()]). |
... |
Variables to lag (tidy-select). Unquoted names or selection. |
n |
Integer. Lag order (default 1 = previous observation). |
mode |
Character. |
max_gap |
Numeric. For |
window |
Numeric. For |
resolution |
Character. For |
Value
The same ILD tibble with new lag columns. ILD attributes preserved.
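The idea behind a gap-aware lag can be sketched in base R. The helper gap_aware_lag() below is hypothetical and handles lag 1 only; it assumes numeric time and rows already sorted by id and time, and it is not the package's implementation.

```r
# Hypothetical helper: lag-1 within person, NA when the gap exceeds max_gap
gap_aware_lag <- function(id, time, value, max_gap = Inf) {
  n <- length(value)
  out <- rep(NA_real_, n)
  if (n < 2) return(out)
  idx <- 2:n
  same_person <- id[idx] == id[idx - 1]   # previous row belongs to the same person
  dt <- time[idx] - time[idx - 1]         # elapsed time since the previous row
  ok <- same_person & dt <= max_gap
  out[idx[ok]] <- value[idx[ok] - 1]
  out
}

d <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2),
  time = c(0, 1, 2, 10, 0, 1),
  mood = c(5, 6, 7, 8, 3, 4)
)

d$mood_lag1 <- gap_aware_lag(d$id, d$time, d$mood, max_gap = 3)
d
```

In this toy data the fourth row gets NA because 8 time units have elapsed since the previous observation, and the fifth row gets NA because it is the first row of person 2.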
Fit a linear mixed-effects model to ILD
Description
When 'ar1 = FALSE', fits with [lme4::lmer()] (no residual correlation). When 'ar1 = TRUE', fits with [nlme::lme()] using a residual correlation structure: CAR1 (continuous-time) by default for irregular spacing, or AR1 when spacing is regular-ish. Use [ild_spacing_class()] to inform the choice; override with 'correlation_class'.
Usage
ild_lme(
formula,
data,
ar1 = FALSE,
correlation_class = c("auto", "AR1", "CAR1"),
random = ~1 | .ild_id,
warn_no_ar1 = TRUE,
...
)
Arguments
formula |
Fixed-effects formula. For 'ar1 = TRUE', must be fixed-only (e.g. 'y ~ x'); random structure is set to '~ 1 | .ild_id' internally. For 'ar1 = FALSE', formula may include random effects (e.g. 'y ~ x + (1|id)'). |
data |
An ILD object (see [is_ild()]). |
ar1 |
Logical. If 'TRUE', fit with nlme and residual AR1/CAR1 correlation; if 'FALSE', fit with lme4 (no residual correlation). |
correlation_class |
Character. '"auto"' (default) uses [ild_spacing_class()] to choose CAR1 (irregular-ish) or AR1 (regular-ish). Use '"CAR1"' or '"AR1"' to override. |
random |
For 'ar1 = TRUE', the random effects formula (default '~ 1 | .ild_id'). Must use '.ild_id' as grouping for correlation to match. |
warn_no_ar1 |
If 'TRUE' (default), warn when 'ar1 = FALSE' that temporal autocorrelation is not modeled. |
... |
Passed to [lme4::lmer()] or [nlme::lme()]. |
Value
A fitted model object (class 'lmerMod' or 'lme') with attributes 'ild_data' (the ILD data) and 'ild_ar1' (logical). When 'ar1 = TRUE', the returned object has class 'ild_lme' prepended for [ild_diagnostics()] and [ild_plot()].
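For orientation, a random-intercept model with a continuous-time CAR1 residual structure can be fitted directly with nlme, as sketched below. The data are simulated here under assumed parameter values, the column names (id, time, x, y) are illustrative, and the call is a plain nlme fit rather than a reproduction of what ild_lme() does internally.

```r
library(nlme)  # recommended package, included with standard R installations

set.seed(123)
n_id <- 20; n_obs <- 30; phi <- 0.5   # assumed simulation settings

# Simulate one person: irregular times and residuals whose correlation
# decays as phi^(elapsed time)
sim_one <- function(i) {
  time <- cumsum(runif(n_obs, 0.5, 1.5))
  dt   <- c(NA, diff(time))
  e    <- numeric(n_obs)
  e[1] <- rnorm(1)
  for (t in 2:n_obs) {
    rho  <- phi^dt[t]
    e[t] <- rho * e[t - 1] + sqrt(1 - rho^2) * rnorm(1)
  }
  x <- rnorm(n_obs)
  # rnorm(1) is the person's random intercept, recycled across rows
  data.frame(id = i, time = time, x = x, y = 1 + 0.3 * x + rnorm(1) + e)
}
d <- do.call(rbind, lapply(seq_len(n_id), sim_one))
d$id <- factor(d$id)

# Random intercept per person plus CAR1 residual correlation in continuous time
fit <- lme(
  y ~ x,
  random      = ~ 1 | id,
  correlation = corCAR1(form = ~ time | id),
  data        = d
)
fixef(fit)
```

CAR1 uses the actual time covariate, so unequal intervals are handled; corAR1 would instead treat consecutive rows as one step apart.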
Create a reproducibility manifest
Description
Captures timestamp, optional seed, optional scenario fingerprint, session info, and optional git SHA for use when saving or serializing results (e.g. after [ild_lme()] or [ild_diagnostics()]). The return value is a serializable list suitable for [saveRDS()] or [ild_bundle()].
Usage
ild_manifest(
seed = NULL,
scenario = NULL,
include_session = TRUE,
include_git = FALSE,
git_path = "."
)
Arguments
seed |
Optional integer. Seed used for the run (e.g. from [ild_simulate()] or set before fitting). Not captured automatically; pass explicitly if you want it in the manifest. |
scenario |
Optional. Named list or character string describing the run (e.g. formula, n_obs, n_id, ar1). Build from [ild_summary()] or a short list when calling after [ild_lme()] / [ild_diagnostics()]. |
include_session |
Logical. If 'TRUE' (default), include [utils::sessionInfo()] in the manifest. Set to 'FALSE' to reduce size. |
include_git |
Logical. If 'TRUE', attempt to record the current
git commit SHA from |
git_path |
Character. Path to the repository root (default
|
Value
A list with elements timestamp (POSIXct), seed
(integer or NULL), scenario (as provided or NULL),
session_info (list from sessionInfo() or NULL),
git_sha (length-1 character or NA). All elements are
serializable.
Examples
m <- ild_manifest()
names(m)
m <- ild_manifest(seed = 42, scenario = list(n_obs = 100, formula = "y ~ x"))
m$seed
m$scenario
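The structure of the returned list can be mimicked in a few lines of base R. The helper make_manifest() below is hypothetical; it uses the element names listed under Value, omits the git lookup entirely, and is meant only to show what such a record contains.

```r
# Hypothetical helper mirroring the documented element names
make_manifest <- function(seed = NULL, scenario = NULL, include_session = TRUE) {
  list(
    timestamp    = Sys.time(),
    seed         = seed,
    scenario     = scenario,
    session_info = if (include_session) utils::sessionInfo() else NULL,
    git_sha      = NA_character_   # git lookup is omitted in this sketch
  )
}

m <- make_manifest(
  seed            = 42L,
  scenario        = list(n_obs = 100, formula = "y ~ x"),
  include_session = FALSE
)
names(m)
m$scenario$formula
```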
Get ILD metadata attributes
Description
Returns the metadata attributes set by [ild_prepare()]: user-facing id/time column names, gap threshold, n_units, n_obs, and spacing (descriptive stats only).
Usage
ild_meta(x)
Arguments
x |
An ILD object (see [is_ild()]). |
Value
A named list of metadata (ild_id, ild_time, ild_gap_threshold,
ild_n_units, ild_n_obs, ild_spacing). ild_spacing includes overall
stats and may contain by_id, a tibble of per-person spacing stats.
Summarize missingness pattern in ILD
Description
Returns a tabular summary of missingness by person and/or by variable. Complements [ild_summary()] and supports checking data before modeling.
Usage
ild_missing_pattern(x, vars = NULL)
Arguments
x |
An ILD object (see [is_ild()]). |
vars |
Optional character vector of variable names to summarize. If 'NULL' (default), all non-.ild_* columns (except id/time) are used. |
Value
A list with:
- 'by_id': data frame with one row per person, with an id column and, for each variable, the count (or proportion) of non-missing and missing values.
- 'overall': named vector or list of overall missing counts/proportions per variable.
- 'n_complete': number of rows with no missing values in the selected variables.
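The three summaries can be approximated in base R as follows. This sketch uses made-up data and reports missing counts per person and missing proportions overall; the exact layout returned by ild_missing_pattern() may differ.

```r
# Toy data with some missing values (hypothetical)
d <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  mood   = c(5, NA, 7, 3, 4, NA),
  stress = c(NA, NA, 2, 1, 1, 1)
)
vars <- c("mood", "stress")

# Missing count per person and variable
by_id <- aggregate(d[vars], by = list(id = d$id),
                   FUN = function(v) sum(is.na(v)))

# Overall proportion missing per variable
overall <- colMeans(is.na(d[vars]))

# Rows with no missing values in the selected variables
n_complete <- sum(complete.cases(d[vars]))

by_id
overall
n_complete
```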
ILD-specific plots
Description
Produces trajectory (spaghetti), heatmap, gap, and missingness plots and, if a fitted model is provided, fitted-vs-observed and residual ACF plots.
Usage
ild_plot(
x,
type = c("trajectory", "heatmap", "gaps", "missingness", "fitted", "residual_acf"),
var = NULL,
id_var = ".ild_id",
time_var = c(".ild_time_num", ".ild_seq"),
max_ids = 20L,
seed = 42L,
...
)
Arguments
x |
An ILD tibble or a fitted [ild_lme()] model. |
type |
Character. One of '"trajectory"', '"heatmap"', '"gaps"', '"missingness"' (person x time missingness), '"fitted"' (requires fitted model), '"residual_acf"' (requires fitted model). |
var |
For 'trajectory' or 'heatmap', the variable to plot (optional; if missing and only one non-.ild_* column exists, it is used). |
id_var |
For trajectory, variable used for grouping (default '.ild_id'). |
time_var |
For trajectory/gaps, x-axis: '.ild_time_num' or '.ild_seq'. |
max_ids |
For trajectory, max number of persons to plot (sampled if larger; default 20). Set to 'Inf' to plot all. |
seed |
Integer. Seed for sampling ids when 'max_ids' is set (default 42). |
... |
Unused. |
Value
A ggplot object, or a list of plots for diagnostics.
Prepare a data frame as an ILD (intensive longitudinal data) object
Description
Validates and encodes longitudinal structure: parses time, sorts by id and time, handles duplicate timestamps, and adds internal columns ('.ild_*') and metadata. All downstream functions assume the result of 'ild_prepare()'.
Usage
ild_prepare(
data,
id,
time,
gap_threshold = Inf,
duplicate_handling = c("first", "last", "error", "collapse"),
collapse_fn = NULL
)
Arguments
data |
A data frame or tibble with at least an id and a time column. |
id |
Character. Name of the subject/unit identifier column. |
time |
Character. Name of the time column (Date, POSIXct, or numeric). |
gap_threshold |
Numeric. Time distance above which an interval is flagged as a gap ('.ild_gap' TRUE). Same units as the numeric time (e.g. seconds if time is POSIXct). Use 'Inf' to disable gap flagging. |
duplicate_handling |
Character. How to handle duplicate timestamps
within the same id: '"first"' (keep first), '"last"' (keep last),
'"error"' (stop with an error), '"collapse"' (aggregate with |
collapse_fn |
Named list of functions, one per variable to collapse.
Used only when |
Value
An ILD tibble with '.ild_*' columns and metadata attributes.
Spacing metadata (see [ild_meta()]) includes overall stats and a
by_id tibble of per-person spacing stats (median_dt, iqr_dt,
n_intervals, pct_gap). Use [ild_summary()] to inspect and check gap
flags before modeling.
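The preparation steps (sort, resolve duplicates, compute intervals, flag gaps) can be sketched in base R. The column names seq, dt, and gap are illustrative stand-ins for the package's internal '.ild_*' columns, duplicates are resolved by keeping the first row, and the threshold of 4 is arbitrary.

```r
# Unsorted toy data with one duplicated timestamp for id 1 (hypothetical)
raw <- data.frame(
  id   = c(2, 1, 1, 1, 1, 2),
  time = c(5, 3, 1, 3, 9, 0),
  y    = c(0.2, 0.5, 0.1, 0.9, 0.4, 0.7)
)

# 1. Sort by id, then time (ties keep their original order)
d <- raw[order(raw$id, raw$time), ]

# 2. Drop duplicate timestamps within id, keeping the first occurrence
d <- d[!duplicated(d[c("id", "time")]), ]

# 3. Within-person sequence number and interval since the previous row
d$seq <- ave(d$time, d$id, FUN = seq_along)
d$dt  <- ave(d$time, d$id, FUN = function(t) c(NA, diff(t)))

# 4. Flag intervals longer than the threshold as gaps
gap_threshold <- 4
d$gap <- !is.na(d$dt) & d$dt > gap_threshold

rownames(d) <- NULL
d
```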
Simulate simple ILD for examples and tests
Description
Generates a small tibble with id, time, and one or more variables, optionally with irregular spacing. Use [ild_prepare()] afterwards to get a proper ILD object.
Usage
ild_simulate(n_id = 5L, n_obs_per = 10L, irregular = FALSE, seed = 42L)
Arguments
n_id |
Integer. Number of persons (default 5). |
n_obs_per |
Integer. Observations per person (default 10). |
irregular |
Logical. If 'TRUE', add random jitter to time (default 'FALSE'). |
seed |
Integer. Random seed for reproducibility (default 42). |
Value
A data frame with columns 'id', 'time' (POSIXct or numeric), and 'y'.
Examples
d <- ild_simulate(n_id = 3, n_obs_per = 5, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
Classify spacing as regular-ish vs irregular-ish
Description
Returns a simple classification for use in documentation or when choosing correlation structure (e.g. AR1 vs CAR1 in [ild_lme()]). The rule is documented and overridable via arguments. Does not change core ILD behavior.
Usage
ild_spacing_class(x, cv_threshold = 0.2, pct_gap_threshold = 10)
Arguments
x |
An ILD object (see [is_ild()]). |
cv_threshold |
Numeric. Coefficient of variation of within-person intervals above which spacing is "irregular-ish" (default 0.2). |
pct_gap_threshold |
Numeric. Percent of intervals flagged as gaps above which spacing is "irregular-ish" (default 10). |
Value
Character: '"regular-ish"' or '"irregular-ish"'.
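A rule of this kind can be sketched as below. The helper classify_spacing() is hypothetical: it assumes the coefficient of variation is computed over all within-person intervals pooled together and that exceeding either threshold is enough to label the spacing "irregular-ish". The package's actual rule may combine these differently.

```r
# Hypothetical helper; rows are assumed sorted by time within id
classify_spacing <- function(id, time, gap_threshold = Inf,
                             cv_threshold = 0.2, pct_gap_threshold = 10) {
  # Within-person intervals, pooled across persons
  dt <- unlist(lapply(split(time, id), diff), use.names = FALSE)
  cv <- stats::sd(dt) / mean(dt)               # coefficient of variation
  pct_gap <- 100 * mean(dt > gap_threshold)    # percent of intervals that are gaps
  if (cv > cv_threshold || pct_gap > pct_gap_threshold) {
    "irregular-ish"
  } else {
    "regular-ish"
  }
}

id <- rep(1:2, each = 5)

# Equal spacing: every interval is 1, so the CV is 0
classify_spacing(id, rep(1:5, 2))

# One long interval per person inflates the CV well above 0.2
classify_spacing(id, rep(c(0, 1, 2, 10, 11), 2))
```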
One-shot summary of an ILD object
Description
Reports the number of persons, number of observations, time range, descriptive spacing (median/IQR of intervals, percent gaps), and duplicate info. Uses [ild_meta()] and '.ild_*' columns only. It assigns no hard "regular"/"irregular" label; use [ild_spacing_class()] for that.
Usage
ild_summary(x)
Arguments
x |
An ILD object (see [is_ild()]). |
Value
A list with elements: 'n_units', 'n_obs', 'time_range' (min/max of '.ild_time_num'), 'spacing' (from metadata), 'n_gaps' (sum of '.ild_gap' TRUE), 'pct_gap' (from spacing if available).
Check if an object is a valid ILD tibble
Description
Returns TRUE if the object has all required '.ild_*' columns and 'ild_*' metadata attributes (as set by [ild_prepare()]).
Usage
is_ild(x)
Arguments
x |
Any object. |
Value
Logical.
Validate an ILD object and error if invalid
Description
Checks presence and types of '.ild_*' columns and 'ild_*' attributes. Errors with a clear message if anything is missing or invalid.
Usage
validate_ild(x)
Arguments
x |
Object to validate (expected to be an ILD tibble). |
Value
Invisibly returns 'x' if valid.