Type: Package
Title: Simulating Climate Data for Research and Modelling
Version: 0.1.2
Date: 2026-04-19
Maintainer: Isaac Osei <ikemillar65@gmail.com>
Description: Generate synthetic station-based monthly climate time-series including temperature and rainfall, export to Network Common Data Form (NetCDF), and provide visualization helpers for climate workflows. The approach is inspired by statistical weather generator concepts described in Wilks (1999) <doi:10.1016/S0168-1923(99)00037-4> and Richardson (1981) <doi:10.1029/WR017i001p00182>.
License: MIT + file LICENSE
Encoding: UTF-8
Imports: trend, truncnorm, ncdf4, lubridate, readr, dplyr, ggplot2, rlang, tidyr, vroom, tibble, stats
Suggests: testthat (>= 3.0.0), knitr, rmarkdown
VignetteBuilder: knitr
RoxygenNote: 7.3.3
Config/testthat/edition: 3
URL: https://github.com/ikemillar/CDSim
BugReports: https://github.com/ikemillar/CDSim/issues
NeedsCompilation: no
Packaged: 2026-04-22 13:38:18 UTC; isaacosei
Author: Isaac Osei [aut, cre], Acheampong Baafi-Adomako [aut]
Repository: CRAN
Date/Publication: 2026-04-22 14:10:02 UTC

CDSim: Climate Data Simulation Toolkit

Description

Tools for generating and exporting synthetic climate observation datasets.

Author(s)

Isaac Osei and Acheampong Baafi-Adomako and Sivaparvathi Dusari

See Also

Useful links:

https://github.com/ikemillar/CDSim

Report bugs at https://github.com/ikemillar/CDSim/issues


Create or load station metadata

Description

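Create a station metadata table (Station, LON, LAT) either by reading a CSV file, by using a supplied data.frame, or by generating synthetic stations inside a bounding box.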

Usage

create_stations(
  source = NULL,
  n = 10,
  bbox = c(-3.5, 1.5, 4.5, 11.5),
  seed = NULL
)

Arguments

source

Path to a CSV file, a data.frame with columns Station, LON and LAT, or NULL (the default) to generate synthetic stations.

n

Integer number of stations to generate when source = NULL. Default 10.

bbox

Numeric vector c(min_lon, max_lon, min_lat, max_lat). Defaults to an approximate bounding box for Ghana.

seed

Optional numeric seed that makes the generated stations reproducible.

Value

A data.frame with columns Station, LON, LAT.

Examples

create_stations(n = 5, seed = 42)
create_stations(data.frame(Station="A", LON=0, LAT=5))
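
As a rough illustration of the synthetic path (source = NULL), the sketch below draws station coordinates uniformly inside a bounding box using base R only. The helper name sketch_stations and the "Station_i" naming scheme are assumptions made for illustration (the naming is inferred from the "Station_1" label used in the plot_station_timeseries() example); the sampling scheme used by create_stations() itself may differ.

```r
# Hypothetical helper, NOT the package's implementation: uniform sampling
# inside bbox = c(min_lon, max_lon, min_lat, max_lat).
sketch_stations <- function(n = 10, bbox = c(-3.5, 1.5, 4.5, 11.5), seed = NULL) {
  stopifnot(length(bbox) == 4, bbox[1] < bbox[2], bbox[3] < bbox[4])
  if (!is.null(seed)) set.seed(seed)
  data.frame(
    Station = paste0("Station_", seq_len(n)),        # assumed naming scheme
    LON = runif(n, min = bbox[1], max = bbox[2]),    # longitude within bbox
    LAT = runif(n, min = bbox[3], max = bbox[4])     # latitude within bbox
  )
}

demo_stations <- sketch_stations(n = 5, seed = 42)
print(demo_stations)
```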

Plot Station Time Series with Seasonal Detection

Description

Creates a time-series plot for climate variables with automatic hemisphere-based season detection.

Usage

plot_station_timeseries(
  df,
  station,
  var = "Avg.Tn",
  smooth = TRUE,
  theme_dark = FALSE
)

Arguments

df

A tidy data frame containing the columns Station, Date and LAT, plus the climate variables to plot.

station

Station name.

var

Name of the climate variable (column) to plot. Default "Avg.Tn".

smooth

Logical; if TRUE (the default), add a LOESS smoothing line.

theme_dark

Logical; if TRUE, use a dark theme. Default FALSE.

Value

A ggplot object.

Examples

stations <- create_stations(n = 3)
sim <- simulate_climate_series(stations)
plot_station_timeseries(sim, station = "Station_1", var = "Avg.Tn")
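
The description mentions hemisphere-based season detection without stating the rule. One plausible reading, sketched here purely as an assumption, is to label months with meteorological seasons and shift them by six months when the station latitude is negative. The function name sketch_season and the December-to-February style mapping are hypothetical and may not match what plot_station_timeseries() does internally.

```r
# Hypothetical season lookup: NOT taken from the package source.
sketch_season <- function(month, lat) {
  stopifnot(all(month %in% 1:12))
  north <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
             "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter")
  # A six-month shift swaps summer/winter and spring/autumn.
  south <- north[((1:12 + 5) %% 12) + 1]
  lat <- rep_len(lat, length(month))   # recycle a single latitude
  ifelse(lat >= 0, north[month], south[month])
}

sketch_season(c(1, 7), lat = 5)    # northern-hemisphere station
sketch_season(c(1, 7), lat = -30)  # southern-hemisphere station
```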


Make a safe filename

Description

Ensures file names contain only safe ASCII characters.

Usage

safe_name(x)


Arguments

x

A character string to clean.

Value

A cleaned filename string.
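
A minimal sketch of this kind of cleaning, assuming that "safe" means ASCII letters, digits, dot, underscore and hyphen. The function name sketch_safe_name and the exact character set are assumptions; safe_name() may apply different rules.

```r
# Hypothetical cleaner: replace every run of characters outside
# [A-Za-z0-9._-] with a single underscore, then trim stray underscores.
sketch_safe_name <- function(x) {
  x <- gsub("[^A-Za-z0-9._-]+", "_", x)
  gsub("^_+|_+$", "", x)
}

sketch_safe_name("Accra North (2024).csv")
```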


Simulate monthly climate time series for stations

Description

Simulate monthly minimum temperature (Avg.Tn), maximum temperature (Avg.Tx), monthly total rainfall (Sum.Rf) and mean daily rainfall (Avg.Rf) for each station across a range of years.

Usage

simulate_climate_series(
  stations,
  start_year = 1996,
  end_year = 2025,
  seed = NULL,
  temp_trend_per_year = 0.02,
  rain_trend_per_year = -0.003,
  phi_temp = 0.85,
  sd = 0.4,
  Tmin_min = 18,
  Tmin_max = 30,
  Tmax_min = 24,
  Tmax_max = 42
)

Arguments

stations

data.frame from create_stations() (Station, LON, LAT)

start_year

Integer first year of the simulation (e.g., 1996).

end_year

Integer last year of the simulation (e.g., 2025).

seed

optional numeric seed

temp_trend_per_year

temperature trend per year (°C/year warming)

rain_trend_per_year

Rainfall trend per year; the negative default represents a slight drying trend.

phi_temp

AR(1) persistence coefficient for the temperature noise process.

sd

standard deviation of the AR(1) innovation process controlling temperature variability

Tmin_min

minimum value for minimum temperature

Tmin_max

maximum value for minimum temperature

Tmax_min

minimum value for maximum temperature

Tmax_max

maximum value for maximum temperature

Details

This function generates synthetic monthly climate time series using a stochastic, physically informed modelling framework. Temperature is modelled as a combination of deterministic seasonality, long-term trend, and stochastic variability. The seasonal component is represented by a sinusoidal function, while temporal persistence is introduced through a first-order autoregressive, AR(1), process driven by a random innovation term.
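
The decomposition described above (seasonal sinusoid + linear trend + AR(1) noise) can be sketched in base R as follows. The base level, seasonal amplitude and phase are hypothetical values chosen for illustration; only the trend, phi_temp and sd values are taken from the documented defaults, and the actual implementation may combine the terms differently.

```r
set.seed(1)
n_months <- 12 * 30                       # 30 years of monthly steps
t_idx <- seq_len(n_months)
month <- ((t_idx - 1) %% 12) + 1

base_temp <- 24                           # hypothetical mean level
amp <- 2                                  # hypothetical seasonal amplitude
seasonal <- amp * sin(2 * pi * (month - 1) / 12)
trend <- 0.02 * (t_idx - 1) / 12          # temp_trend_per_year = 0.02

phi <- 0.85                               # phi_temp default
eps <- rnorm(n_months, sd = 0.4)          # innovations, sd default
noise <- numeric(n_months)
noise[1] <- eps[1]
for (i in 2:n_months) {
  noise[i] <- phi * noise[i - 1] + eps[i] # AR(1) recursion
}

temp_mean <- base_temp + seasonal + trend + noise
head(round(temp_mean, 2))
```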

Minimum temperature (Avg.Tn) is simulated using a truncated normal distribution to enforce physically realistic lower and upper bounds. Maximum temperature (Avg.Tx) is generated using a gamma-distributed perturbation applied to the mean temperature, producing an asymmetric distribution consistent with observed climatological behavior.
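
To make the two distributional choices concrete, the sketch below draws a bounded minimum temperature by inverse-CDF sampling from a truncated normal, and a maximum temperature as the mean plus a gamma-distributed offset. This is a base R stand-in written for illustration (the package imports truncnorm, which offers the same kind of sampler); the helper name rtrunc_norm, the offsets and the gamma shape and scale are assumptions.

```r
# Hypothetical base R truncated-normal sampler (inverse-CDF method).
rtrunc_norm <- function(n, mean, sd, lower, upper) {
  p_lo <- pnorm(lower, mean, sd)
  p_hi <- pnorm(upper, mean, sd)
  x <- qnorm(runif(n, p_lo, p_hi), mean, sd)
  pmin(pmax(x, lower), upper)             # guard against rounding at the edges
}

set.seed(2)
temp_mean <- rep(26, 120)                 # placeholder mean temperature
# Bounds follow the documented defaults Tmin_min = 18, Tmin_max = 30.
tmin <- rtrunc_norm(120, mean = temp_mean - 3, sd = 1.5, lower = 18, upper = 30)
# A positive, right-skewed offset gives an asymmetric Tmax distribution.
tmax <- temp_mean + rgamma(120, shape = 4, scale = 1.5)

range(tmin)
summary(tmax)
```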

Rainfall occurrence is modeled using a first-order Markov chain, allowing for realistic wet–dry persistence. Conditional on occurrence, rainfall intensity is drawn from a gamma distribution with seasonally varying mean. A temporal trend term can be applied to represent long-term climatic changes such as gradual drying or wetting.
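
A two-state Markov chain with gamma-distributed wet-day amounts can be sketched as below. The sketch assumes daily steps within a single 30-day month that are then aggregated to Sum.Rf and Avg.Rf; the transition probabilities, gamma shape and wet-day mean are hypothetical, and the package may parameterise the chain differently.

```r
set.seed(3)
n_days <- 30
p_wet_given_dry <- 0.3                    # hypothetical transition probabilities
p_wet_given_wet <- 0.7

wet <- logical(n_days)
wet[1] <- runif(1) < p_wet_given_dry
for (d in 2:n_days) {
  p <- if (wet[d - 1]) p_wet_given_wet else p_wet_given_dry
  wet[d] <- runif(1) < p                  # state depends only on the previous day
}

mean_wet_day_mm <- 8                      # hypothetical seasonal mean for this month
shape <- 2
amount <- rgamma(n_days, shape = shape, scale = mean_wet_day_mm / shape)
rain <- ifelse(wet, amount, 0)            # dry days receive no rain

sum_rf <- sum(rain)                       # monthly total
avg_rf <- mean(rain)                      # mean daily rainfall
c(Sum.Rf = sum_rf, Avg.Rf = avg_rf)
```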

To ensure physical consistency between variables, a coupling mechanism is introduced whereby increased rainfall (proxy for cloud cover) reduces maximum temperature through a linear cooling adjustment. This enforces a negative dependence between precipitation and temperature consistent with atmospheric energy balance principles.
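
The coupling can be pictured as subtracting a term proportional to rainfall from the uncoupled maximum temperature. In this sketch the cooling coefficient and the toy input series are assumptions chosen only to show the induced negative correlation; they are not values from the package.

```r
set.seed(4)
sum_rf <- rgamma(120, shape = 2, scale = 60)   # toy monthly rainfall totals
tmax_raw <- 32 + rnorm(120, sd = 1)            # toy uncoupled Tmax

cooling_per_mm <- 0.01                         # hypothetical coefficient
tmax_adj <- tmax_raw - cooling_per_mm * sum_rf # linear cooling adjustment

cor(sum_rf, tmax_raw)                          # near zero by construction
cor(sum_rf, tmax_adj)                          # negative after coupling
```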

Finally, a minimum diurnal temperature difference constraint is enforced after rounding to guarantee that Avg.Tx > Avg.Tn at all time steps, while preserving the statistical distribution of the simulated variables.
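
One way to apply such a constraint is to round both series first and then lift Avg.Tx only where the gap is too small, so that rows already satisfying the constraint are untouched. The helper name enforce_gap, the 0.5 degree minimum gap and the one-decimal rounding are assumptions for illustration.

```r
# Hypothetical post-rounding fix: only rows violating the gap are changed.
enforce_gap <- function(tn, tx, min_gap = 0.5, digits = 1) {
  tn <- round(tn, digits)
  tx <- round(tx, digits)
  too_close <- (tx - tn) < min_gap
  tx[too_close] <- tn[too_close] + min_gap
  list(Avg.Tn = tn, Avg.Tx = tx)
}

res <- enforce_gap(tn = c(22.04, 25.96, 24.0), tx = c(30.1, 25.98, 23.2))
res
```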

The default parameterization reflects typical tropical conditions for Ghana, but all parameters are user-configurable, allowing adaptation to other climatic regions. The modelling approach follows established stochastic weather generation principles while extending them with distributional asymmetry and cross-variable coupling for improved physical realism.

Value

A tidy data.frame with one row per station × month containing: Station, LON, LAT, Year, Month, Date, Avg.Tn, Avg.Tx, Sum.Rf, Avg.Rf
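
The station × month layout can be sketched with a toy skeleton in base R. The station values below are made up, and dating each month on its first day is an assumption, not something the documentation states.

```r
stations <- data.frame(Station = c("A", "B"), LON = c(-1, 0), LAT = c(5, 6))

# One row per station x year x month.
grid <- expand.grid(Month = 1:12, Year = 1996:1997,
                    Station = stations$Station, stringsAsFactors = FALSE)
grid <- merge(stations, grid, by = "Station")
grid$Date <- as.Date(sprintf("%d-%02d-01", grid$Year, grid$Month))  # assumed convention
grid <- grid[order(grid$Station, grid$Date),
             c("Station", "LON", "LAT", "Year", "Month", "Date")]

nrow(grid)   # 2 stations x 2 years x 12 months
head(grid, 3)
```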

See Also

write_station_csv(), write_station_netcdf()

Examples

st <- create_stations(n = 3, seed = 1)
sim <- simulate_climate_series(st, 1996, 2025, seed = 42)
head(sim)

Validate simulated climate data against observations

Description

Performs statistical and physical validation of simulated climate data against observed datasets, including distributional tests, mean comparison, dependence structure, and temporal persistence.

Usage

validate_climate(sim, obs)

Arguments

sim

Simulated climate data.frame

obs

Observed climate data.frame

Value

A list containing validation metrics and test results
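
The kinds of checks named in the description can be approximated with functions from stats. The sketch below compares two toy temperature series with a Kolmogorov-Smirnov test, a t-test and lag-1 autocorrelation; the list element names and the choice of tests are assumptions and do not describe the structure returned by validate_climate().

```r
set.seed(5)
obs_tn <- rnorm(240, mean = 27.0, sd = 1.2)    # toy "observed" series
sim_tn <- rnorm(240, mean = 27.1, sd = 1.3)    # toy "simulated" series

lag1 <- function(x) cor(x[-1], x[-length(x)])  # lag-1 autocorrelation

metrics <- list(
  ks       = stats::ks.test(sim_tn, obs_tn),   # distributional agreement
  t_test   = stats::t.test(sim_tn, obs_tn),    # mean comparison
  bias     = mean(sim_tn) - mean(obs_tn),
  lag1_sim = lag1(sim_tn),                     # temporal persistence
  lag1_obs = lag1(obs_tn)
)
metrics$ks$p.value
metrics$bias
```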


Internal validation of simulated climate data

Description

Evaluates physical plausibility and statistical properties of simulated climate data in the absence of observational datasets. The function assesses distributional characteristics, temporal persistence, inter-variable relationships, and physical constraints.

Usage

validate_climate_internal(sim)

Arguments

sim

Simulated climate data.frame

Value

A list of validation diagnostics
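
Without observations, plausibility checks can still be run on the simulated columns alone. The sketch below uses a toy data frame with the documented column names; the diagnostic names and thresholds are hypothetical and are not the output format of validate_climate_internal().

```r
set.seed(6)
n <- 120
toy <- data.frame(
  Avg.Tn = 22 + rnorm(n),
  Sum.Rf = rgamma(n, shape = 2, scale = 50)
)
toy$Avg.Tx <- toy$Avg.Tn + 6 + abs(rnorm(n))   # keeps Tx above Tn in the toy data

diagnostics <- list(
  tx_above_tn      = all(toy$Avg.Tx > toy$Avg.Tn),         # physical constraint
  rain_nonnegative = all(toy$Sum.Rf >= 0),
  lag1_tn          = cor(toy$Avg.Tn[-1], toy$Avg.Tn[-n]),  # persistence
  cor_rain_tx      = cor(toy$Sum.Rf, toy$Avg.Tx)           # inter-variable relation
)
str(diagnostics)
```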


Visualization Functions for Climate Data

Description

Visualization Functions for Climate Data


Write station CSV

Description

Exports a simulated climate station dataset to a CSV file.

Usage

write_station_csv(df, file = "simulated_station_climate.csv")

Arguments

df

A dataframe returned by simulate_climate_series().

file

The output CSV filename.

Value

Returns the file path invisibly.

Examples

stations <- create_stations(n = 3)
sim <- simulate_climate_series(stations)
tmp <- tempfile(fileext = ".csv")
write_station_csv(sim, tmp)


Write station NetCDF (station x time)

Description

Exports a simulated climate station dataset to a NetCDF file.

Usage

write_station_netcdf(
  df,
  out_nc = "simulated_station_climate.nc",
  fillvalue = -9999
)

Arguments

df

Long-format data frame (one row per station and time step) returned by simulate_climate_series().

out_nc

Output NetCDF filename

fillvalue

Value used for missing entries

Value

Returns the file path invisibly.

Examples

stations <- create_stations(n = 3)
sim <- simulate_climate_series(stations)
tmp <- tempfile(fileext = ".nc")
write_station_netcdf(sim, tmp)
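
The "station x time" layout and the role of fillvalue can be illustrated without NetCDF at all: a long table is pivoted into a matrix with one row per station and one column per date, and any missing combination keeps the fill value. This is a base R sketch on made-up data, not the code used by write_station_netcdf().

```r
long <- data.frame(
  Station = c("A", "A", "B"),
  Date    = as.Date(c("2000-01-01", "2000-02-01", "2000-01-01")),
  Avg.Tn  = c(22.1, 22.4, 21.8)
)

fillvalue <- -9999
stations <- unique(long$Station)
dates <- sort(unique(long$Date))

# Start from a matrix full of fill values, then write the known cells.
mat <- matrix(fillvalue, nrow = length(stations), ncol = length(dates),
              dimnames = list(stations, as.character(dates)))
mat[cbind(match(long$Station, stations), match(long$Date, dates))] <- long$Avg.Tn
mat
```

Station B has no February record in the toy data, so that cell keeps the fill value.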