| Type: | Package | 
| Title: | Automatic Sequence Prediction by Expansion of the Distance Matrix | 
| Version: | 1.3.0 | 
| Author: | Giancarlo Vercellino | 
| Maintainer: | Giancarlo Vercellino <giancarlo.vercellino@gmail.com> | 
| Description: | Each sequence is predicted by expanding the distance matrix. The compact set of hyper-parameters is tuned through random search. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.1.1 | 
| Depends: | R (≥ 4.1) | 
| Imports: | purrr (≥ 0.3.4), abind (≥ 1.4-5), ggplot2 (≥ 3.3.5), readr (≥ 2.0.1), stringr (≥ 1.4.0), lubridate (≥ 1.7.10), narray (≥ 0.4.1.1), imputeTS (≥ 3.2), scales (≥ 1.1.1), tictoc (≥ 1.0.1), modeest (≥ 2.4.0), moments (≥ 0.14), greybox (≥ 1.0.1), dqrng (≥ 0.3.0), entropy (≥ 1.3.1), Rfast (≥ 2.0.6), philentropy (≥ 0.5.0), fastDummies (≥ 1.6.3), fANCOVA (≥ 0.6-1) | 
| URL: | https://rpubs.com/giancarlo_vercellino/tetragon | 
| NeedsCompilation: | no | 
| Packaged: | 2022-08-13 16:47:17 UTC; gvercellino | 
| Repository: | CRAN | 
| Date/Publication: | 2022-08-13 17:30:02 UTC | 
tetragon
Description
Each sequence is predicted by expanding the distance matrix. The compact set of hyper-parameters is tuned through random search.
Usage
tetragon(
  df,
  seq_len = NULL,
  smoother = FALSE,
  ci = 0.8,
  method = NULL,
  distr = NULL,
  n_windows = 3,
  n_sample = 30,
  dates = NULL,
  error_scale = "naive",
  error_benchmark = "naive",
  seed = 42
)
Arguments
| df | A data frame with time features as columns. The features may be continuous or non-continuous variables. | 
| seq_len | Positive integer. Number of time steps in the projected sequence. Default: NULL (random selection between maximum boundaries). | 
| smoother | Logical. Perform optimal smoothing using standard loess. Default: FALSE. | 
| ci | Numeric. Confidence level for the interval around the predictions. Default: 0.8. | 
| method | String. Distance method for calculating distance matrix among sequences. Options are: "euclidean", "manhattan", "maximum", "minkowski". Default: NULL (random selection among all possible options). | 
| distr | String. Distribution used to expand the distance matrix. Options are: "norm", "logis", "t", "exp", "chisq". Default: NULL (random selection among all possible options). | 
| n_windows | Positive integer. Number of validation tests used to measure/sample the error. Default: 3 (a larger value is strongly recommended for a reliable accuracy estimate). | 
| n_sample | Positive integer. Number of samples for random search. Default: 30. | 
| dates | Date. Vector with the dates for the time features. Default: NULL. | 
| error_scale | String. Scale for the scaled error metrics (only for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive". | 
| error_benchmark | String. Benchmark for the relative error metrics (only for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive". | 
| seed | Positive integer. Random seed. Default: 42. | 
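The `method` argument names the distance used between sequences. The sketch below uses base R only to show what such a distance matrix looks like for the four listed options. It is an illustration under assumptions: the toy series `x`, the rolling-window construction with `embed()`, and the Minkowski power `p = 3` are hypothetical choices, and the package's internal windowing and expansion steps are not documented here.

```r
# Toy series (made-up values, not package data)
x <- c(2, 4, 3, 7, 8, 6, 9, 12, 11, 15)
seq_len <- 4

# embed() returns each window with the most recent value first,
# so the columns are reversed to restore chronological order
windows <- embed(x, seq_len)[, seq_len:1]

# One pairwise distance matrix per option listed for `method`
d_euclid <- as.matrix(dist(windows, method = "euclidean"))
d_manh   <- as.matrix(dist(windows, method = "manhattan"))
d_max    <- as.matrix(dist(windows, method = "maximum"))
d_mink   <- as.matrix(dist(windows, method = "minkowski", p = 3))  # p is an assumed value

print(dim(d_euclid))
print(round(d_euclid[1:3, 1:3], 3))
```

Each matrix has one row and one column per window, so a series of length 10 with `seq_len = 4` gives 7 windows and a 7 x 7 matrix.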
Value
This function returns a list including:
- exploration: list of all explored models, complete with predictions, testing metrics and plots 
- history: a table with the sampled models, hyper-parameters, validation errors 
- best: results for the best model, including:
  - predictions: min, max, q25, q50, q75, quantiles at the selected ci, and several additional measures for each point of the predicted sequences
  - testing_errors: one-step and full-sequence testing errors for each time feature
  - plots: confidence interval plot for each time feature
- time_log
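The testing errors for continuous variables depend on the `error_scale` and `error_benchmark` arguments. The base R sketch below shows how the "naive" option of each could be computed for a single feature. It is only an illustration: the vectors `history`, `actual`, and `predicted` are made-up values, and mean absolute error is an assumed choice of metric, since the exact metrics reported by the package are not listed here.

```r
# Made-up values for illustration (not package data)
history   <- c(10, 12, 11, 14, 13, 15)  # historical series
actual    <- c(16, 18, 17)              # true future sequence
predicted <- c(15, 17, 19)              # predicted sequence

# Mean absolute error of the prediction (assumed metric)
mae <- mean(abs(actual - predicted))

# error_scale = "naive": average naive one-step absolute error of the history
naive_scale <- mean(abs(diff(history)))
scaled_mae <- mae / naive_scale

# error_benchmark = "naive": the last historical value extended over the horizon
naive_forecast <- rep(tail(history, 1), length(actual))
relative_mae <- mae / mean(abs(actual - naive_forecast))

print(c(scaled = scaled_mae, relative = relative_mae))
```

Values below 1 mean that the prediction improves on the naive reference in this toy case.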
Author(s)
Giancarlo Vercellino giancarlo.vercellino@gmail.com
See Also
Useful links:
- https://rpubs.com/giancarlo_vercellino/tetragon
Examples
tetragon(covid_in_europe[, c(2, 4)], seq_len = 40, n_sample = 2)
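The same call can be stored and inspected through the elements named in the Value section. This is a sketch, assuming the package is installed and that the bundled data set is reachable as `tetragon::covid_in_europe`; the internal layout of each element is not described here, so the code only prints names and top-level content.

```r
# Runs only when the tetragon package is available
if (requireNamespace("tetragon", quietly = TRUE)) {
  fit <- tetragon::tetragon(
    tetragon::covid_in_europe[, c(2, 4)],
    seq_len = 40,
    n_sample = 2
  )

  print(names(fit))       # top-level elements of the returned list
  print(fit$history)      # sampled models, hyper-parameters, validation errors
  print(names(fit$best))  # elements stored for the best model
  print(fit$time_log)     # time log of the run
}
```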
covid_in_europe data set
Description
A data frame with daily and cumulative cases of Covid infections and deaths in Europe since March 2021.
Usage
covid_in_europe
Format
A data frame with 5 columns and 163 rows.
Source
www.ecdc.europa.eu