canaper


The goal of canaper is to enable categorical analysis of neo- and paleo-endemism (CANAPE) in R.

Installation

The stable version can be installed from CRAN:

install.packages("canaper")

The development version can be installed from r-universe or GitHub:

# r-universe
options(repos = c(
  ropensci = "https://ropensci.r-universe.dev/", 
  CRAN = "https://cran.rstudio.com/"
))
install.packages("canaper", dependencies = TRUE)

# OR

# GitHub (requires `remotes` or `devtools`)
remotes::install_github("ropensci/canaper")

Example usage

These examples use the dataset from Phylocom. The dataset includes a community (site x species) matrix and a phylogenetic tree.

library(canaper)

data(phylocom)

# Example community matrix including 4 "clumped" communities,
# one "even" community, and one "random" community
phylocom$comm
#>         sp1 sp10 sp11 sp12 sp13 sp14 sp15 sp17 sp18 sp19 sp2 sp20 sp21 sp22
#> clump1    1    0    0    0    0    0    0    0    0    0   1    0    0    0
#> clump2a   1    2    2    2    0    0    0    0    0    0   1    0    0    0
#> clump2b   1    0    0    0    0    0    0    2    2    2   1    2    0    0
#> clump4    1    1    0    0    0    0    0    2    2    0   1    0    0    0
#> even      1    0    0    0    1    0    0    1    0    0   0    0    1    0
#> random    0    0    0    1    0    4    2    3    0    0   1    0    0    1
#>         sp24 sp25 sp26 sp29 sp3 sp4 sp5 sp6 sp7 sp8 sp9
#> clump1     0    0    0    0   1   1   1   1   1   1   0
#> clump2a    0    0    0    0   1   1   0   0   0   0   2
#> clump2b    0    0    0    0   1   1   0   0   0   0   0
#> clump4     0    2    2    0   0   0   0   0   0   0   1
#> even       0    1    0    1   0   0   1   0   0   0   1
#> random     2    0    0    0   0   0   2   0   0   0   0

# Example phylogeny
phylocom$phy
#> 
#> Phylogenetic tree with 32 tips and 31 internal nodes.
#> 
#> Tip labels:
#>   sp1, sp2, sp3, sp4, sp5, sp6, ...
#> Node labels:
#>   A, B, C, D, E, F, ...
#> 
#> Rooted; includes branch lengths.

The main “workhorse” function of canaper is cpr_rand_test(), which conducts a randomization test to determine if observed values of phylogenetic diversity (PD) and phylogenetic endemism (PE) are significantly different from random. It also calculates the same values on an alternative phylogeny where all branch lengths have been set equal (alternative PD, alternative PE) as well as the ratio of the original value to the alternative value (relative PD, relative PE).

set.seed(071421)
rand_test_results <- cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "swap"
)
#> Warning: Abundance data detected. Results will be the same as if using
#> presence/absence data (no abundance weighting is used).
#> Warning: Dropping tips from the tree because they are not present in the community data: 
#>  sp16, sp23, sp27, sp28, sp30, sp31, sp32
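The relationship between PD, alternative PD, and relative PD described above can be illustrated with a toy example in base R. This is only a sketch under stated assumptions, not how canaper computes these values internally: the four-tip tree, the `descendants` list, and the helper `pd_prop()` are all hypothetical and exist only for this illustration. PD is expressed here as a proportion of total tree length.

```r
# Toy tree ((A:1,B:1):2,(C:0.5,D:0.5):2.5), stored as one entry per branch.
# Hypothetical data for illustration only (not from the phylocom dataset).
branch_lengths <- c(1, 1, 2, 0.5, 0.5, 2.5)

# Tips that descend from each branch, in the same order as branch_lengths
descendants <- list("A", "B", c("A", "B"), "C", "D", c("C", "D"))

# Hypothetical helper: PD as the summed length of all branches leading to
# at least one species present in the community, divided by total tree length
pd_prop <- function(present, lengths, descendants) {
  on_path <- vapply(
    descendants,
    function(tips) any(tips %in% present),
    logical(1)
  )
  sum(lengths[on_path]) / sum(lengths)
}

# A community containing species A and B
present <- c("A", "B")

# PD on the original tree
pd <- pd_prop(present, branch_lengths, descendants)

# Alternative PD: same topology, all branch lengths set equal
pd_alt <- pd_prop(present, rep(1, length(branch_lengths)), descendants)

# Relative PD: ratio of the original value to the alternative value
rpd <- pd / pd_alt
```

In this toy case the branches leading to A and B are longer than they would be on the equal-length tree, so relative PD comes out slightly above 1.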

cpr_rand_test() produces a lot of columns (nine per metric), so let’s just look at a subset of them:

rand_test_results[, 1:9]
#>            pd_obs pd_rand_mean pd_rand_sd  pd_obs_z pd_obs_c_upper
#> clump1  0.3018868    0.4692453 0.03214267 -5.206739              0
#> clump2a 0.3207547    0.4762264 0.03263836 -4.763465              0
#> clump2b 0.3396226    0.4681132 0.03462444 -3.710978              0
#> clump4  0.4150943    0.4667925 0.03180131 -1.625660              3
#> even    0.5660377    0.4660377 0.03501739  2.855724            100
#> random  0.5094340    0.4733962 0.03070539  1.173662             79
#>         pd_obs_c_lower pd_obs_q pd_obs_p_upper pd_obs_p_lower
#> clump1             100      100           0.00           1.00
#> clump2a            100      100           0.00           1.00
#> clump2b            100      100           0.00           1.00
#> clump4              91      100           0.03           0.91
#> even                 0      100           1.00           0.00
#> random               6      100           0.79           0.06

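This is a summary of the columns:

- *_obs: observed value
- *_rand_mean: mean of the random values
- *_rand_sd: standard deviation of the random values
- *_obs_z: standardized effect size (z-score) of the observed value
- *_obs_c_upper: number of times the observed value was higher than the random values
- *_obs_c_lower: number of times the observed value was lower than the random values
- *_obs_q: number of non-NA random values used for comparison
- *_obs_p_upper: proportion of times the observed value was higher than the random values
- *_obs_p_lower: proportion of times the observed value was lower than the random values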
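As a rough illustration of how these per-metric statistics relate to one another, the following base R sketch computes them for a single observed value and a vector of random values. `summarise_rand()` is a hypothetical helper written for this example, not a canaper function, and the simulated random values are made up; cpr_rand_test() may differ in details such as how ties are handled.

```r
# Hypothetical helper (not part of canaper): summarise one observed value
# against values obtained from randomized communities
summarise_rand <- function(obs, rand) {
  rand <- rand[!is.na(rand)]   # keep only non-NA random values
  q <- length(rand)            # number of random values compared
  c_upper <- sum(obs > rand)   # times observed is higher than random
  c_lower <- sum(obs < rand)   # times observed is lower than random
  data.frame(
    obs = obs,
    rand_mean = mean(rand),
    rand_sd = sd(rand),
    obs_z = (obs - mean(rand)) / sd(rand),
    obs_c_upper = c_upper,
    obs_c_lower = c_lower,
    obs_q = q,
    obs_p_upper = c_upper / q,
    obs_p_lower = c_lower / q
  )
}

# Made-up random values standing in for 100 randomizations
set.seed(1)
rand_vals <- rnorm(100, mean = 0.47, sd = 0.03)
summarise_rand(0.57, rand_vals)
```

This matches the pattern in the output above: for clump4, for example, 3 of 100 comparisons give a pd_obs_p_upper of 0.03.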

The next step in CANAPE is to classify endemism types according to the significance of PE, alternative PE, and relative PE. This adds a column called endem_type.

canape_results <- cpr_classify_endem(rand_test_results)

canape_results[, "endem_type", drop = FALSE]
#>              endem_type
#> clump1  not significant
#> clump2a not significant
#> clump2b not significant
#> clump4  not significant
#> even              mixed
#> random            mixed

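This dataset is very small, so it doesn’t include all possible endemism types. The possible types are:

- paleo: paleoendemic
- neo: neoendemic
- not significant
- mixed: mixture of both paleo and neo
- super: mixed and highly significant (p < 0.01)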

For a more complete example, please see the vignette.
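To make the classification step more concrete, here is a simplified base R sketch of the CANAPE decision rules as described by Mishler et al. (2014). It is an illustration under assumptions, not the implementation of cpr_classify_endem(): the function name `classify_endem_sketch()`, its argument names, the use of strict inequalities, and the example p-values are all hypothetical. The assumed thresholds are 0.95 (one-tailed) for PE and alternative PE, 0.975 (two-tailed) for relative PE, and 0.99 for "super".

```r
# Hypothetical sketch of CANAPE classification (not a canaper function).
# Inputs are proportions of randomizations in which the observed value
# was higher (p_upper) or lower (p_lower) than the random values.
classify_endem_sketch <- function(pe_p_upper, pe_alt_p_upper,
                                  rpe_p_upper, rpe_p_lower) {
  # Step 1: is PE or alternative PE significantly high (one-tailed)?
  pe_signif <- pe_p_upper > 0.95 | pe_alt_p_upper > 0.95
  # Are both highly significant (p < 0.01)?
  pe_highly_signif <- pe_p_upper > 0.99 & pe_alt_p_upper > 0.99

  endem_type <- rep("not significant", length(pe_p_upper))
  endem_type[pe_signif] <- "mixed"
  endem_type[pe_highly_signif] <- "super"
  # Step 2: among significant sites, a two-tailed test of relative PE
  # separates neo- from paleo-endemism
  endem_type[pe_signif & rpe_p_lower > 0.975] <- "neo"
  endem_type[pe_signif & rpe_p_upper > 0.975] <- "paleo"
  endem_type
}

# Made-up p-values for five sites, one per endemism type
example_types <- classify_endem_sketch(
  pe_p_upper     = c(0.50, 0.97, 1.00, 0.98, 0.96),
  pe_alt_p_upper = c(0.40, 0.60, 1.00, 0.90, 0.99),
  rpe_p_upper    = c(0.50, 0.50, 0.50, 0.99, 0.01),
  rpe_p_lower    = c(0.50, 0.50, 0.50, 0.01, 0.99)
)
example_types
```

The order of assignment matters in this sketch: a site is first labelled by the significance of PE, and only then relabelled as neo or paleo if relative PE is significantly low or high.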

Comparison with other software

Several other R packages are available to calculate diversity metrics for ecological communities. The non-exhaustive summary below focuses on alpha diversity metrics in comparison with canaper, and is not a comprehensive description of each package.

Other information

Poster at Botany 2021

Citing this package

If you use this package, please cite it! Here is an example:

The example DOI above is for the overall package.

Here is the latest DOI, which you should use if you are using the latest version of the package:

DOI

You can find DOIs for older versions by viewing the “Releases” menu on the right.

Papers citing canaper

Contributing and code of conduct

Contributions to canaper are welcome! For more information, please see CONTRIBUTING.md

Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Note to developers

roxyglobals is used to maintain R/globals.R, but is not available on CRAN. You will need to install this package from GitHub and use the @autoglobal or @global roxygen tags to develop functions with globals.

Licenses

References

Mishler, B., Knerr, N., González-Orozco, C. et al. Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nat Commun 5, 4473 (2014). https://doi.org/10.1038/ncomms5473