The CDSim package provides an easy workflow for simulating climate data such as temperature and rainfall across multiple synthetic weather stations. It is useful for testing models, teaching climate analysis, generating demo data, and creating datasets with controlled variability.
This vignette demonstrates:
how to create synthetic weather stations
how to generate multi-year climate time series
how to export results to CSV and NetCDF
how to perform a quick visualization
Stations can be created in one of three ways:
loading them from a CSV file,
supplying an existing data frame,
or auto-generating synthetic stations within a bounding box.
We begin by generating a set of synthetic weather stations. The seed ensures reproducibility.
stations <- create_stations(n = 3, seed = 123)
#> Generating synthetic station network...
#> Generated 3 synthetic stations within bounding box.
stations
#> Station LON LAT
#> 1 Station_1 -2.0621124 10.681122
#> 2 Station_2 0.4415257 11.083271
#> 3 Station_3 -1.4551154 4.818895
Each station typically contains:
station name
longitude
latitude
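To make the station layout concrete, here is a minimal base R sketch of a station table with the same three columns, plus one way random stations could be drawn inside a bounding box. The helper `random_stations()`, the site names, and the box limits are assumptions made for illustration. They are not part of CDSim and do not describe how `create_stations()` works internally.

```r
# Hand-built station table using the Station / LON / LAT layout shown above.
# Names and coordinates are made up for illustration.
my_stations <- data.frame(
  Station = c("Site_A", "Site_B"),
  LON = c(-1.20, 0.15),
  LAT = c(6.70, 9.40),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a CDSim function): draw n stations uniformly
# inside a longitude/latitude box. The box limits are illustrative only.
random_stations <- function(n, lon_range, lat_range, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  data.frame(
    Station = paste0("Station_", seq_len(n)),
    LON = runif(n, min = lon_range[1], max = lon_range[2]),
    LAT = runif(n, min = lat_range[1], max = lat_range[2]),
    stringsAsFactors = FALSE
  )
}

box_stations <- random_stations(
  3,
  lon_range = c(-3.5, 1.5),
  lat_range = c(4.5, 11.5),
  seed = 123
)
box_stations
```

Fixing the seed before drawing the coordinates is what makes the station network reproducible from run to run.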
Once the stations are created, we can generate daily or monthly climate time series using built-in stochastic models.
sim <- simulate_climate_series(stations, start_year = 2019, end_year = 2024)
head(sim)
#> Station LON LAT Year Month Date Avg.Tn Avg.Tx Sum.Rf
#> 1 Station_1 -2.062112 10.68112 2019 1 2019-01-15 19.5 31.8 0.0
#> 2 Station_1 -2.062112 10.68112 2019 2 2019-02-15 21.7 34.7 0.0
#> 3 Station_1 -2.062112 10.68112 2019 3 2019-03-15 23.9 36.0 0.0
#> 4 Station_1 -2.062112 10.68112 2019 4 2019-04-15 24.6 32.6 51.4
#> 5 Station_1 -2.062112 10.68112 2019 5 2019-05-15 22.6 33.3 0.0
#> 6 Station_1 -2.062112 10.68112 2019 6 2019-06-15 20.7 29.2 49.5
#> Avg.Rf
#> 1 0.00
#> 2 0.00
#> 3 0.00
#> 4 1.71
#> 5 0.00
#> 6 1.65
A typical simulated record includes:
date
min and max temperature
rainfall
station metadata
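Because the output is one row per station and month, ordinary data frame tools apply directly. The sketch below builds a small stand-in table with the same column names (the values are random placeholders, not CDSim output) and summarises it to annual rainfall totals and mean maximum temperature per station.

```r
# Stand-in for the simulated output: two stations, two years, monthly rows.
# Values are random placeholders, not CDSim output.
set.seed(1)
grid <- expand.grid(
  Month = 1:12,
  Year = 2019:2020,
  Station = c("Station_1", "Station_2"),
  stringsAsFactors = FALSE
)
demo <- data.frame(
  Station = grid$Station,
  Year = grid$Year,
  Month = grid$Month,
  Avg.Tn = round(runif(nrow(grid), 18, 26), 1),
  Sum.Rf = round(rexp(nrow(grid), rate = 1 / 40), 1)
)
demo$Avg.Tx <- demo$Avg.Tn + round(runif(nrow(grid), 6, 14), 1)

# Annual rainfall total and mean maximum temperature per station and year.
annual_rain <- aggregate(Sum.Rf ~ Station + Year, data = demo, FUN = sum)
annual_tx <- aggregate(Avg.Tx ~ Station + Year, data = demo, FUN = mean)
annual <- merge(annual_rain, annual_tx, by = c("Station", "Year"))
annual
```

The same pattern should work on the real `sim` object, since it carries `Station`, `Year`, `Sum.Rf` and `Avg.Tx` columns.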
CDSim includes exporters for CSV and NetCDF, so the simulated data can be stored on disk and read back by packages such as ncdf4, terra, or stars.
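The exporter function names are not shown in this text, so the sketch below uses base R only. It writes a few rows in the simulated layout to a temporary CSV file and reads them back. This shows the general round trip and is not a substitute for the CDSim exporters.

```r
# First three rows of the simulated output shown earlier, retyped by hand.
demo <- data.frame(
  Station = "Station_1",
  LON = -2.062112,
  LAT = 10.68112,
  Year = 2019,
  Month = 1:3,
  Date = as.Date(c("2019-01-15", "2019-02-15", "2019-03-15")),
  Avg.Tn = c(19.5, 21.7, 23.9),
  Avg.Tx = c(31.8, 34.7, 36.0),
  Sum.Rf = c(0, 0, 0)
)

# Write to a temporary file so nothing is left in the working directory.
csv_path <- tempfile(fileext = ".csv")
write.csv(demo, csv_path, row.names = FALSE)

# Read it back; the Date column returns as text and must be converted.
back <- read.csv(csv_path, stringsAsFactors = FALSE)
back$Date <- as.Date(back$Date)
str(back)
```

Setting `row.names = FALSE` keeps the file free of an extra index column, which makes it easier to load in other tools.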
To demonstrate a quick plot, here’s the maximum temperature series of the first station.
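A quick plot of this kind can be sketched with base graphics. The example below uses a stand-in data frame with `Station`, `Date` and `Avg.Tx` columns (placeholder values, not CDSim output). With the real `sim` object, the subsetting and plotting lines should apply unchanged.

```r
# Stand-in monthly series for two stations; values are placeholders.
set.seed(42)
dates <- seq(as.Date("2019-01-15"), by = "month", length.out = 24)
demo <- data.frame(
  Station = rep(c("Station_1", "Station_2"), each = 24),
  Date = rep(dates, times = 2),
  Avg.Tx = round(30 + 4 * sin(2 * pi * (rep(1:24, 2) - 3) / 12) + rnorm(48), 1)
)

# Keep only the rows belonging to the first station in the table.
first_station <- demo$Station[1]
one <- demo[demo$Station == first_station, ]

# Line plot of maximum temperature against date.
plot(
  one$Date, one$Avg.Tx,
  type = "l",
  xlab = "Date",
  ylab = "Avg.Tx",
  main = paste("Maximum temperature:", first_station)
)
```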
To assess the physical realism and statistical consistency of the simulated climate data, CDSim provides a validation framework.
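The function validate_climate_internal() performs a series of diagnostic checks on the simulated dataset. As the output below shows, these cover summary statistics, range and plausibility checks, distribution statistics (skewness and standard deviation), cross-variable correlation, rain-temperature coupling, autocorrelation, linear trend, and seasonality: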
validation <- validate_climate_internal(sim)
validation
#> $summary
#> Station LON LAT Year
#> Length:216 Min. :-2.0621 Min. : 4.819 Min. :2019
#> Class :character 1st Qu.:-2.0621 1st Qu.: 4.819 1st Qu.:2020
#> Mode :character Median :-1.4551 Median :10.681 Median :2022
#> Mean :-1.0252 Mean : 8.861 Mean :2022
#> 3rd Qu.: 0.4415 3rd Qu.:11.083 3rd Qu.:2023
#> Max. : 0.4415 Max. :11.083 Max. :2024
#> Month Date Avg.Tn Avg.Tx
#> Min. : 1.00 Min. :2019-01-15 Min. :18.00 Min. :24.00
#> 1st Qu.: 3.75 1st Qu.:2020-07-07 1st Qu.:18.60 1st Qu.:26.60
#> Median : 6.50 Median :2021-12-30 Median :19.80 Median :30.30
#> Mean : 6.50 Mean :2021-12-29 Mean :20.57 Mean :30.04
#> 3rd Qu.: 9.25 3rd Qu.:2023-06-22 3rd Qu.:22.23 3rd Qu.:32.90
#> Max. :12.00 Max. :2024-12-15 Max. :25.90 Max. :40.10
#> Sum.Rf Avg.Rf
#> Min. : 0.00 Min. :0.000
#> 1st Qu.: 0.00 1st Qu.:0.000
#> Median : 26.45 Median :0.850
#> Mean : 39.63 Mean :1.299
#> 3rd Qu.: 62.33 3rd Qu.:2.120
#> Max. :201.30 Max. :6.500
#>
#> $checks
#> $checks$Tmin_min
#> [1] 18
#>
#> $checks$Tmax_max
#> [1] 40.1
#>
#> $checks$Rain_min
#> [1] 0
#>
#> $checks$Tmax_gt_Tmin
#> [1] TRUE
#>
#> $checks$Tmin_plausible
#> [1] TRUE
#>
#>
#> $distribution
#> $distribution$rain_skewness
#> [1] 1.281271
#>
#> $distribution$Tmax_skewness
#> [1] 0.1560093
#>
#> $distribution$Tmin_skewness
#> [1] 0.6707342
#>
#> $distribution$sd_Tmax
#> [1] 4.049788
#>
#> $distribution$sd_Tmin
#> [1] 2.260367
#>
#> $distribution$sd_Rain
#> [1] 47.04732
#>
#>
#> $correlation
#> Sum.Rf Avg.Tx Avg.Tn
#> Sum.Rf 1.0000000 -0.4488468 -0.2087886
#> Avg.Tx -0.4488468 1.0000000 0.7185227
#> Avg.Tn -0.2087886 0.7185227 1.0000000
#>
#> $rain_temp_coupling
#> [1] TRUE
#>
#> $autocorrelation
#> $autocorrelation$Tmax
#> [1] 0.513696
#>
#> $autocorrelation$Tmin
#> [1] 0.7104567
#>
#> $autocorrelation$Rain
#> [1] 0.109139
#>
#>
#> $trend
#> $trend$Tmax_slope
#> time_index
#> -0.003346432
#>
#> $trend$Rain_slope
#> time_index
#> 0.08788007
#>
#>
#> $seasonality
#> $seasonality$peak_month
#> 8
#> 8
#>
#> $seasonality$trough_month
#> 5
#> 5
#>
#>
#> $valid
#> [1] TRUE
In addition, CDSim supports external validation against observed datasets using the function validate_climate(). This allows users to compare simulated data with real-world observations for further evaluation of model performance.
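The arguments and return value of validate_climate() are not shown here, so the following is only a generic illustration of what a simulated-versus-observed comparison can involve. The helper `compare_series()` is hypothetical and the observed values are made up. The simulated values are the first six Avg.Tx entries printed earlier.

```r
# Hypothetical helper, not the validate_climate() interface:
# bias, root-mean-square error and correlation for paired series.
compare_series <- function(sim, obs) {
  stopifnot(length(sim) == length(obs))
  ok <- !is.na(sim) & !is.na(obs)   # use only pairs with both values present
  err <- sim[ok] - obs[ok]
  c(
    bias = mean(err),
    rmse = sqrt(mean(err^2)),
    cor = cor(sim[ok], obs[ok])
  )
}

# Simulated Avg.Tx from the head(sim) output; observed values are invented.
sim_tx <- c(31.8, 34.7, 36.0, 32.6, 33.3, 29.2)
obs_tx <- c(31.0, 33.5, 35.2, 33.0, 32.1, 29.8)

scores <- compare_series(sim_tx, obs_tx)
round(scores, 3)
```

A positive bias means the simulation runs warmer than the observations on average, while the RMSE also penalises errors that cancel out in the mean.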
The internal checks are particularly useful when observed data are unavailable: they provide diagnostic evidence that the simulated outputs follow known climatological principles.
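To give a feel for what such internal diagnostics measure, the sketch below recomputes a few comparable quantities by hand on a stand-in series. The formulas (moment-based skewness, lag-1 autocorrelation, slope of a linear fit on a time index) are assumptions chosen for illustration. CDSim may use different estimators or lags, and the data here are random placeholders.

```r
# Stand-in monthly series (six years); values are placeholders.
set.seed(7)
n <- 72
month <- rep(1:12, length.out = n)
tmin <- 21 + 3 * sin(2 * pi * (month - 4) / 12) + rnorm(n, sd = 0.8)
tmax <- tmin + 8 + rnorm(n, sd = 1)
rain <- pmax(0, rexp(n, rate = 1 / 40) - 15)   # many dry months, long right tail

# Moment-based skewness: third central moment over variance^(3/2).
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Lag-1 autocorrelation: correlation of the series with itself shifted by one step.
lag1 <- function(x) cor(x[-1], x[-length(x)])

# Linear trend: slope of a least-squares fit against the time index.
slope <- function(x) {
  time_index <- seq_along(x)
  unname(coef(lm(x ~ time_index))[2])
}

checks <- list(
  Tmax_gt_Tmin = all(tmax > tmin),
  Rain_min = min(rain),
  rain_skewness = skewness(rain),
  Tmin_lag1 = lag1(tmin),
  Tmax_slope = slope(tmax)
)
str(checks)
```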