Orangutan

Orangutan is an R package for analyzing and visualizing measurements (morphometrics) from groups such as species or populations. It runs a full analysis pipeline that summarizes data, finds variables that differentiate groups, performs multivariate and univariate statistics, and produces publication-ready plots.

What Orangutan does

Orangutan workflow

Installation

Stable version (CRAN)

Install the latest stable release from CRAN (v2.0.0):

install.packages("Orangutan")

Development version (GitHub)

Install the development version directly from GitHub (v2.1.0):

install.packages("pak")
pak::pak("metalofis/Orangutan-R")
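To confirm which release ended up in your library after either route, a quick check with base R can help. This is a minimal sketch that only reports what is installed; it relies on requireNamespace() and packageVersion() from base R and assumes nothing about Orangutan's own functions.

```r
# Check whether Orangutan is installed and, if so, report its version.
installed <- requireNamespace("Orangutan", quietly = TRUE)

if (installed) {
  message("Orangutan version: ", as.character(utils::packageVersion("Orangutan")))
} else {
  message("Orangutan is not installed yet.")
}
```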

Implementation

Quick example: run_orangutan() called with default parameters (results are written next to the input file):

library(Orangutan)

run_orangutan("data/my_dataset.csv")

Full example: run_orangutan() called with all available arguments:

library(Orangutan)  # Load the Orangutan package

run_orangutan(
  # ---------- Input / output ----------
  data_path = "data/my_dataset.csv",             # Path to your input CSV dataset
  output_dir = "address/to/orangutan_outputs",   # Folder where all outputs (plots, tables) will be saved
  
  # ---------- Allometry ----------
  apply_allometry = TRUE,             # Whether to adjust measurements for allometry
  allometry_var = "SVL",              # Column used as the reference variable for allometry correction
  
  # ---------- Outlier handling ----------
  remove_outliers = TRUE,             # Whether to remove extreme values (outliers)
  outlier_vars = c("SVL"),            # Which variables to check for outliers
  outlier_tail_pct = 0.05,            # Proportion of extreme values to remove from each tail (5% here)
  
  # ---------- PCA / DAPC highlighting ----------
  species_to_encircle = c("carolinensis", "torresfundorai"), # Species to highlight on PCA/DAPC plots
  
  # ---------- Color palette ----------
  palette_name = "Paired",            # Name of the color palette for plots ("Paired", "Set3", "Dark2")
  custom_colors = c(SpeciesA = "#FF0000", SpeciesB = "#00FF00"), # Optional: custom hex codes for specific species
  
  # ---------- Point aesthetics ----------
  point_aes = list(
    point_size    = 3.5,              # Size of each individual point
    jitter_width  = 0.1,              # Horizontal jitter to prevent overplotting
    jitter_alpha  = 0.8,              # Transparency of points
    jitter_shape  = 21,               # Shape of the points (21 = filled circle with border)
    jitter_color  = "black",          # Border color of points
    jitter_stroke = 0.35              # Thickness of the point border
  ),
  
  # ---------- Mean point aesthetics ----------
  mean_aes = list(
    size   = 1.8,                      # Size of the mean point
    shape  = 21,                       # Shape of the mean point
    fill   = "white",                  # Fill color of the mean point
    color  = "black",                  # Border color of the mean point
    stroke = 0.6                       # Thickness of the mean point border
  ),
  
  # ---------- Violin aesthetics ----------
  violin_aes = list(
    alpha = 0.4                         # Transparency of violin plots
  ),
  
  # ---------- Boxplot aesthetics ----------
  box_aes = list(
    alpha = 0.4,                        # Transparency of boxplots
    width = 0.15                        # Width of boxplots
  ),
  
  # ---------- Label / text control ----------
  label_aes = list(
    text_size      = 6,                 # Size of text labels on plots
    axis_text_size = 10,                # Size of axis tick labels
    title_size     = 12,                # Size of plot titles
    label_offset   = 0.05               # Distance of labels from points
  ),
  
  # ---------- Optional label templates ----------
  label_templates = list(
    nonoverlap_title = "Non-Overlapping Pair: %s vs %s for %s", # Title template for non-overlapping variable plots
    pca_x = "PC1 (%s%% variance)",       # Label for PCA X-axis with explained variance
    pca_y = "PC2 (%s%% variance)",       # Label for PCA Y-axis with explained variance
    dapc_x = "LD1 (%s%%)",               # Label for DAPC X-axis with explained variance
    dapc_y = "LD2 (%s%%)",               # Label for DAPC Y-axis with explained variance
    dapc_title_1d = "DAPC – Single Discriminant Axis" # Title for one-dimensional DAPC plots
  ),
  
  # ---------- Multivariate test seeds ----------
  seeds = list(betadisper = 123, permanova = 456),   # Seeds for reproducible dispersion (betadisper) and permutation (PERMANOVA) tests
  
  # ---------- Messaging ----------
  verbose = FALSE                                    # Whether to print progress messages in console
)
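The label_templates entries look like sprintf-style format strings, where %s is a placeholder and %% prints a literal percent sign. Assuming the package fills them with something like base R's sprintf() (an assumption; the text above does not say how the templates are applied), the sketch below shows what two of them would produce. The variance value, the species and variable names, and the order in which they are passed are made up for illustration.

```r
# Assumption: templates are filled sprintf-style; all values below are illustrative.
pca_x_template      <- "PC1 (%s%% variance)"
nonoverlap_template <- "Non-Overlapping Pair: %s vs %s for %s"

# %s takes the supplied value; %% becomes a single literal percent sign.
pca_x_label <- sprintf(pca_x_template, "45.2")
nonoverlap_label <- sprintf(nonoverlap_template,
                            "allisoni", "carolinensis", "Head_length")

print(pca_x_label)       # "PC1 (45.2% variance)"
print(nonoverlap_label)  # "Non-Overlapping Pair: allisoni vs carolinensis for Head_length"
```

If that reading holds, a custom template should keep the same number of %s placeholders as the default it replaces, since sprintf() raises an error when a template has more placeholders than supplied values.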

Description of run_orangutan() arguments

Input data format

species         main_length  Head_length  Supralabials  Color
allisoni        86.5         25.2         9             Blue
allisoni        73.6         24.8         8             Blue
carolinensis    63.0         18.3         8             Green
carolinensis    59.0         19.17        8             Green
torresfundorai  66.9         18.7         7             Green
torresfundorai  70.9         23.6         7             Green
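To experiment with this layout before preparing your own data, the example rows above can be written to a CSV with base R. This sketch only builds and checks the file. The temporary path is a placeholder chosen for illustration, and the structure it checks (a grouping column followed by measurement columns) mirrors the table above rather than a documented requirement.

```r
# Rebuild the example table as a data frame.
example_data <- data.frame(
  species      = c("allisoni", "allisoni", "carolinensis",
                   "carolinensis", "torresfundorai", "torresfundorai"),
  main_length  = c(86.5, 73.6, 63.0, 59.0, 66.9, 70.9),
  Head_length  = c(25.2, 24.8, 18.3, 19.17, 18.7, 23.6),
  Supralabials = c(9, 8, 8, 8, 7, 7),
  Color        = c("Blue", "Blue", "Green", "Green", "Green", "Green")
)

# Write to a temporary CSV (hypothetical path; use your own folder in practice).
csv_path <- file.path(tempdir(), "my_dataset.csv")
write.csv(example_data, csv_path, row.names = FALSE)

# Read it back and confirm the structure survived the round trip.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
str(check)
table(check$species)
```

The resulting csv_path could then be passed as data_path to run_orangutan(). Six rows are only enough to illustrate the format, and a real analysis would likely need more specimens per group.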

HTML Report

Every run writes orangutan_report.html to output_dir. Open it in any web browser for a plain-language summary of each analysis section, with embedded thumbnails of the key plots. The report is generated by default, so no extra arguments are needed.
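The report can also be opened straight from the R session. The following is a minimal sketch using base R's browseURL(); the output_dir value is a placeholder and should match whatever folder was used for the run.

```r
# Hypothetical output folder; replace with the output_dir used in your run.
output_dir  <- "address/to/orangutan_outputs"
report_path <- file.path(output_dir, "orangutan_report.html")

if (file.exists(report_path)) {
  # Open the report in the system's default web browser.
  utils::browseURL(normalizePath(report_path))
} else {
  message("No report found at: ", report_path)
}
```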

Contributing / Support

Citation

Torres, J. (2026). Orangutan: An R Package for Analyzing and Visualizing Phenotypic Data in the Context of Species Descriptions and Population Comparisons. Ecology and Evolution, 16(2), e73111. https://doi.org/10.1002/ece3.73111