Data formats

David Garcia-Callejas and cxr team

Introduction

The cxr package provides a range of functions for handling empirical datasets, which must follow a common format to be processed. Here we review the structure of the dataset included in the package, which conforms to the formats accepted by the different cxr functions.

The Caracoles dataset

We include a dataset of an annual plant system subjected to spatial variability in a Mediterranean-type ecosystem of Southern Europe. Details of the ecosystem and sampling design are given in Lanuza et al. (2018). The main data file contains, for each focal individual sampled, its reproductive success and the number of neighbours of each plant species within a 7.5 cm buffer. This data format is not limited to plant species: the package is taxonomically agnostic, so observational data passed to cxr can contain any measure of individual performance as a function of the relative frequency and density of the interacting species.

library(cxr)
data("neigh_list", package = "cxr")

You can check the structure of the data in the help file:

?neigh_list
names(neigh_list)

##  [1] "BEMA" "CETE" "CHFU" "CHMI" "HOMA" "LEMA" "MEEL" "MESU" "PAIN" "PLCO"
## [11] "POMA" "POMO" "PUPA" "SASO" "SCLA" "SOAS" "SPRU"

head(neigh_list[[1]])

## # A tibble: 6 × 19
##   obs_ID fitness  BEMA  CETE  CHFU  CHMI  HOMA  LEMA  MEEL  MESU  PAIN  PLCO
##    <int>   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1      1     116     1     0     0     0     0     0     0     0     0     0
## 2      5      68     0     0     0     0     0     1     0     0     0     0
## 3      9      36     0     0     0     0     0     0     0     0     0     0
## 4     14      64     0     0     0     0     0     5     0     0     0     0
## 5     19     144     2     0     0     0     0     0     0     0     0     0
## 6     27      56     1     0     0     0     0     4     0     0     0     0
## # ℹ 7 more variables: POMA <dbl>, POMO <dbl>, PUPA <dbl>, SASO <dbl>,
## #   SCLA <dbl>, SOAS <dbl>, SPRU <dbl>

This is the structure accepted by the different cxr functions, except for the ‘obs_ID’ column. In particular, cxr accepts a data frame whose first column is numeric and named ‘fitness’ (constrained to positive values), followed by a variable number of numeric columns with the densities of neighbour taxa. Each row is taken to be an observation of a focal individual.
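As a minimal sketch of this layout, the snippet below builds a toy data frame by hand and drops the identifier column so that ‘fitness’ comes first. The values are made up, and the helper `is_cxr_like` is hypothetical: it only mirrors the description above (assuming "positive" means strictly greater than zero) and is not a function of cxr.

```r
# Toy observations for one focal species (values invented for illustration)
toy_obs <- data.frame(
  obs_ID  = c(1L, 5L, 9L),
  fitness = c(116, 68, 36),
  BEMA    = c(1, 0, 0),
  LEMA    = c(0, 1, 0)
)

# Drop the identifier so that 'fitness' becomes the first column
toy_data <- toy_obs[, setdiff(names(toy_obs), "obs_ID")]

# Hypothetical helper (NOT part of cxr): checks the layout described above.
# Assumption: "positive" is read as strictly greater than zero.
is_cxr_like <- function(d) {
  is.data.frame(d) &&
    ncol(d) >= 2 &&
    names(d)[1] == "fitness" &&
    all(vapply(d, is.numeric, logical(1))) &&
    all(d$fitness > 0)
}

is_cxr_like(toy_data)  # layout matches the description
is_cxr_like(toy_obs)   # fails: first column is 'obs_ID', not 'fitness'
```

A check of this kind can be run on your own data before passing it to the package, although the functions themselves may apply different or stricter validation.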

In addition to individual fitness and neighbours, we also recorded species abundances:

data("abundance", package = "cxr")
head(abundance)

##   plot subplot species individuals
## 1    1      A1    BEMA           1
## 2    1      A1    CETE           0
## 3    1      A1    CHFU           8
## 4    1      A1    CHMI           0
## 5    1      A1    HOMA          35
## 6    1      A1    LEMA           4
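In the output above, each row holds the count of one species in one subplot, so totals per species require aggregating over rows. The following sketch shows one way to do this in base R on a toy table with the same column names; the counts are invented for illustration and do not come from the package data.

```r
# Toy table with the same columns as 'abundance' (counts are made up)
toy_abundance <- data.frame(
  plot        = c(1, 1, 1, 1),
  subplot     = c("A1", "A1", "A2", "A2"),
  species     = c("BEMA", "CHFU", "BEMA", "CHFU"),
  individuals = c(1, 8, 0, 3)
)

# Sum individuals over all plots and subplots, one row per species
total_by_species <- aggregate(individuals ~ species,
                              data = toy_abundance, FUN = sum)
total_by_species
```

The same formula could be extended to `individuals ~ plot + species` to obtain totals per plot instead.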

Abundances are stored per plot and subplot, following our spatially explicit design (see Lanuza et al. 2018 for details). In the neigh_list dataset, the obs_ID column links each observation to its spatial location. The spatial_sampling dataset is a list with one element per species; each element contains the obs_ID of every observation of that species together with its spatial arrangement, i.e. the plot and subplot where it was taken.

data("spatial_sampling", package = "cxr")
names(spatial_sampling)

##  [1] "BEMA" "CETE" "CHFU" "CHMI" "HOMA" "LEMA" "MEEL" "MESU" "PAIN" "PLCO"
## [11] "POMA" "POMO" "PUPA" "SASO" "SCLA" "SOAS" "SPRU"

head(spatial_sampling[["BEMA"]])

## # A tibble: 6 × 3
##   obs_ID  plot subplot
##    <int> <int> <chr>  
## 1      1     1 A1     
## 2      5     1 A2     
## 3      9     1 A3     
## 4     14     1 A4     
## 5     19     1 A5     
## 6     27     1 B1
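Since both tables share the obs_ID column, the location of each focal observation can be recovered with a join. The sketch below uses base R `merge()` on two toy tables that imitate one species' element of neigh_list and of spatial_sampling; the toy objects are hypothetical stand-ins, not the package data.

```r
# Toy stand-ins for one species' observations and their locations
toy_neigh <- data.frame(
  obs_ID  = c(1L, 5L, 9L),
  fitness = c(116, 68, 36),
  BEMA    = c(1, 0, 0)
)
toy_spatial <- data.frame(
  obs_ID  = c(1L, 5L, 9L),
  plot    = c(1L, 1L, 1L),
  subplot = c("A1", "A2", "A3")
)

# Join on the shared identifier: one row per observation,
# now carrying its plot and subplot
toy_located <- merge(toy_neigh, toy_spatial, by = "obs_ID")
toy_located
```

With the package data, the same call would presumably be applied to matching elements of the two lists, for example `neigh_list[["BEMA"]]` and `spatial_sampling[["BEMA"]]`.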

We also provide seed survival in the soil and germination rates for each species. These vital rates were obtained independently, and they are critical for parameterizing a model that describes the population dynamics of interacting annual plant species. For more information on how to estimate seed survival and germination rates, see Godoy and Levine (2014). This file also includes the complete scientific name and the abbreviation of each species. These abbreviations are used as species identifiers in all analyses and vignettes.

data("species_rates", package = "cxr")
species_rates

##    code                 species germination.rate seed.survival
## 1  BEMA         beta_macrocarpa             0.38          0.43
## 2  CETE  centaurium_tenuiflorum             0.53          0.10
## 3  CHFU    chamaemelum_fuscatum             0.80          0.38
## 4  CHMI      chamaemelum_mixtum             0.76          0.32
## 5  HOMA         hordeum_marinum             0.94          0.25
## 6  LEMA    Leontodon_maroccanus             0.89          0.33
## 7  MEEL       melilotus_elegans             0.59          0.60
## 8  MESU      melilotus_sulcatus             0.77          0.63
## 9  PAIN      parapholis_incurva             0.33          0.60
## 10 PLCO      plantago_coronopus             0.84          0.57
## 11 POMA     polypogon_maritimus             0.85          0.46
## 12 POMO polypogon_monspeliensis             0.91          0.49
## 13 PUPA      pulicaria_paludosa             0.84          0.55
## 14 SASO            salsola_soda             0.52          0.30
## 15 SCLA    scorzonera_laciniata             0.69          0.41
## 16 SOAS           sonchus_asper             0.60          0.29
## 17 SPRU       spergularia_rubra             0.44          0.41
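Because the species codes act as identifiers across all datasets, vital rates can be looked up by code. The sketch below copies the first three rows of the table above into a toy data frame and shows two base R lookups; the object names (`toy_rates`, `germ`, `focal`) are illustrative only.

```r
# First three rows of the table above, retyped as a toy data frame
toy_rates <- data.frame(
  code             = c("BEMA", "CETE", "CHFU"),
  germination.rate = c(0.38, 0.53, 0.80),
  seed.survival    = c(0.43, 0.10, 0.38)
)

# Named vector for quick lookup of a single rate by species code
germ <- setNames(toy_rates$germination.rate, toy_rates$code)
germ[["CHFU"]]

# Reorder the table to follow an arbitrary set of focal species
focal <- c("CHFU", "BEMA")
toy_rates[match(focal, toy_rates$code), ]
```

Using `match()` keeps the rates in the same order as the focal species vector, which helps when rates are later paired with per-species results.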

The environmental covariate provided for this analysis is soil salinity, measured with a portable Time Domain Reflectometer (TDR). This technology measures the amount of salt dissolved in the soil water that is accessible to plant species. The covariate has been estimated for each subplot. There are 36 subplots in each plot, and 9 plots in total, arranged along a micro-topographic gradient:

data("salinity_list", package = "cxr")
names(salinity_list)

##  [1] "BEMA" "CETE" "CHFU" "CHMI" "HOMA" "LEMA" "MEEL" "MESU" "PAIN" "PLCO"
## [11] "POMA" "POMO" "PUPA" "SASO" "SCLA" "SOAS" "SPRU"

head(salinity_list[[1]])

## # A tibble: 6 × 2
##   obs_ID salinity
##    <int>    <dbl>
## 1      1    0.986
## 2      5    1.03 
## 3      9    1    
## 4     14    0.883
## 5     19    0.842
## 6     27    0.912
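The covariate table carries the same obs_ID values as the observations, so the two can be aligned row by row. The sketch below assumes that a downstream analysis needs one covariate value per observation in the same row order, and shows how to enforce that with `match()` on toy vectors; it also works out the number of subplots implied by the design (9 plots with 36 subplots each).

```r
# Observation identifiers in the order of the fitness data (toy example)
toy_obs_id <- c(1L, 5L, 9L)

# Covariate table deliberately stored in a different row order
toy_salinity <- data.frame(
  obs_ID   = c(9L, 1L, 5L),
  salinity = c(1.0, 0.986, 1.03)
)

# Reorder covariate rows so they follow the observations
aligned <- toy_salinity[match(toy_obs_id, toy_salinity$obs_ID), ]

# Fail early if any observation lacks a covariate or the order differs
stopifnot(identical(aligned$obs_ID, toy_obs_id))

# Number of subplots implied by the sampling design
n_subplots <- 9 * 36
n_subplots
```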

References

Godoy, O., & Levine, J. M. (2014). Phenology effects on invasion success: insights from coupling field experiments to coexistence theory. Ecology, 95(3), 726-736.

Lanuza, J. B., Bartomeus, I., & Godoy, O. (2018). Opposing effects of floral visitors and soil conditions on the determinants of competitive outcomes maintain species diversity in heterogeneous landscapes. Ecology Letters, 21(6), 865-874.