Using FLORAL for survival models with longitudinal microbiome data

library(FLORAL)
library(dplyr)
library(patchwork)
library(survival)
set.seed(8192024)

In this vignette, we illustrate how to apply FLORAL to fit a Cox model with longitudinal microbiome data. Due to limited availability of public data sets with survival information, we use simulated data for illustrative purposes.

Data simulation

We use the built-in simulation function simu() to generate longitudinal compositional features and the corresponding time-to-event outcomes. The simulation draws event times from a piecewise exponential distribution, as described by Hendry (2014).
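
As a rough illustration of the piecewise exponential idea, the base R sketch below draws event times from a piecewise-constant hazard by inverting the cumulative hazard. The helper rpwexp_one() and its arguments breaks and hazards are hypothetical names introduced here; they are not part of FLORAL, and the procedure inside simu() may differ in detail.

```r
# Hypothetical helper (not part of FLORAL): draw one event time from a
# piecewise-constant hazard. Assumes breaks[1] == 0, breaks is increasing,
# and every hazard is strictly positive.
rpwexp_one <- function(breaks, hazards) {
  target <- rexp(1)               # cumulative hazard at the event time is Exp(1)
  widths <- diff(c(breaks, Inf))  # the last interval extends to infinity
  cumhaz <- 0
  for (k in seq_along(hazards)) {
    step <- hazards[k] * widths[k]  # cumulative hazard accrued over interval k
    if (cumhaz + step >= target) {
      # The event falls in interval k: solve for the time within the interval
      return(breaks[k] + (target - cumhaz) / hazards[k])
    }
    cumhaz <- cumhaz + step
  }
}

set.seed(1)
# Hazard 0.1 on [0, 5), then 0.5 from time 5 onwards
times <- replicate(1000, rpwexp_one(breaks = c(0, 5), hazards = c(0.1, 0.5)))
summary(times)
```

With time-dependent features, the hazard on each interval would depend on the feature values observed in that interval, which is what makes a piecewise formulation convenient.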

By default, the first 10 features out of the 500 features simulated below are associated with the time-to-event.


simdat <- simu(n=200, # sample size
               p=500, # number of features
               model="timedep",
               pct.sparsity = 0.8, # proportion of zeros
               rho=0, # feature-wise correlation
               longitudinal_stability = TRUE # choose to simulate longitudinal features with stable trajectories
)

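With the simulated data, the log-ratio lasso Cox model with time-dependent features can be fitted with the following call. Each argument is described in the accompanying comment:


fit <- FLORAL(x=simdat$xcount, # count matrix with one row per longitudinal observation
              y=Surv(simdat$data_unique$t,simdat$data_unique$d), # survival outcome with one entry per subject
              family="cox", # fit a Cox model
              longitudinal = TRUE, # x contains repeated measurements per subject
              id = simdat$data$id, # subject ID for each row of x
              tobs = simdat$data$t0, # observation time for each row of x
              progress=FALSE, # do not print a progress bar
              plot=TRUE) # return plots of the model fit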
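
The call above relies on x, id, and tobs being aligned row by row, while y holds one outcome per subject. The sketch below builds a small toy data set and checks these conditions before any model is fitted. All values are made up for illustration, and the requirement that y follows the order of unique(id) is an assumption of this sketch rather than documented FLORAL behavior.

```r
library(survival)

# Toy long-format data: 3 subjects with 2, 3 and 1 observations (made-up values)
id   <- c("s1", "s1", "s2", "s2", "s2", "s3")  # subject ID for each row of x
tobs <- c(0, 7, 0, 5, 12, 0)                   # observation time for each row of x
x    <- matrix(rpois(6 * 4, lambda = 3), nrow = 6,
               dimnames = list(NULL, paste0("taxa", 1:4)))

# One survival outcome per subject, ordered like unique(id) (assumption)
event_time <- c(10, 20, 4)
status     <- c(1, 0, 1)
y <- Surv(event_time, status)

stopifnot(
  nrow(x) == length(id),            # one ID per row of x
  nrow(x) == length(tobs),          # one observation time per row of x
  nrow(y) == length(unique(id)),    # one outcome per subject
  # sanity check: each sample is collected before the subject's event or censoring time
  all(tobs < event_time[match(id, unique(id))])
)
```

If any of these checks fails on real data, it usually points to a mismatch between the sample-level table and the subject-level table.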

fit$selected
#> $min
#> [1] "taxa1"   "taxa2"   "taxa27"  "taxa494" "taxa5"   "taxa6"   "taxa8"  
#> [8] "taxa9"  
#> 
#> $`1se`
#> [1] "taxa1" "taxa5" "taxa6" "taxa8" "taxa9"
#> 
#> $min.2stage
#> [1] "taxa2"   "taxa27"  "taxa494" "taxa5"   "taxa6"   "taxa8"   "taxa9"  
#> 
#> $`1se.2stage`
#> [1] "taxa5" "taxa6" "taxa8" "taxa9"

The list of selected features is saved in fit$selected as shown above.
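
Because the data are simulated, the selection can be compared with the truth. The sketch below tallies true and false positives for each selection rule, using the feature names printed above. It assumes that the 10 truly associated features are the ones named taxa1 to taxa10; the list is hard-coded here so the example runs on its own, and fit$selected could be passed instead.

```r
# Selected features copied from the printed fit$selected output above
selected <- list(
  min          = c("taxa1", "taxa2", "taxa27", "taxa494",
                   "taxa5", "taxa6", "taxa8", "taxa9"),
  `1se`        = c("taxa1", "taxa5", "taxa6", "taxa8", "taxa9"),
  min.2stage   = c("taxa2", "taxa27", "taxa494",
                   "taxa5", "taxa6", "taxa8", "taxa9"),
  `1se.2stage` = c("taxa5", "taxa6", "taxa8", "taxa9")
)

# Assumption: the first 10 simulated features are named taxa1, ..., taxa10
truth <- paste0("taxa", 1:10)

# Count correctly and incorrectly selected features for each rule
tally <- sapply(selected, function(sel) {
  c(true_pos = sum(sel %in% truth), false_pos = sum(!(sel %in% truth)))
})
tally
```

In this run, the two min rules each pick up two noise features (taxa27 and taxa494), while the two 1se rules select only truly associated features at the cost of recovering fewer of them.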

To appropriately prepare the data in practice, we have the following recommendations: