In this vignette, we will use the Cort algorithm to fit a copula on a simulated dataset drawn from a Clayton copula. We will show how the algorithm can be used to produce a bona fide copula, and describe some of its parameters.
First, let’s create and plot the dataset we will work with. For that, we’ll use the gamma frailty model for the Clayton copula (the same construction works for any other completely monotone Archimedean generator), as is done in the copula package. The following code is taken directly from the copula package:
psi <- function(t, alpha) (1 + sign(alpha) * t)^(-1 / alpha) # generator
rClayton <- function(n, dim, alpha) {
  val <- matrix(runif(n * dim), nrow = n)         # independent uniforms
  gam <- rgamma(n, shape = 1 / alpha, rate = 1)   # one gamma frailty per observation
  gam <- matrix(gam, nrow = n, ncol = dim)        # same frailty across each row
  psi(-log(val) / gam, alpha)
}
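As a quick sanity check of the sampler above, one can compare the empirical Kendall's tau of a simulated pair with the theoretical value for a Clayton copula, tau = alpha / (alpha + 2). The sketch below restates the generator so that it runs on its own; the sample size, the seed and the names `sim`, `tau_theory` and `tau_empirical` are arbitrary choices made for illustration.

```r
# Clayton sampler via the gamma frailty construction (same as above)
psi <- function(t, alpha) (1 + sign(alpha) * t)^(-1 / alpha)
rClayton <- function(n, dim, alpha) {
  val <- matrix(runif(n * dim), nrow = n)
  gam <- matrix(rgamma(n, shape = 1 / alpha, rate = 1), nrow = n, ncol = dim)
  psi(-log(val) / gam, alpha)
}

set.seed(1)                       # arbitrary seed, for illustration only
alpha <- 7
sim <- rClayton(n = 1000, dim = 2, alpha = alpha)

tau_theory <- alpha / (alpha + 2) # Kendall's tau of a Clayton copula
tau_empirical <- cor(sim[, 1], sim[, 2], method = "kendall")
c(theory = tau_theory, empirical = tau_empirical)
```

The two numbers should be close to each other; a large gap would point to a mistake in the generator or in the frailty distribution.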
For reproducibility, we set the seed of the random number generator. This vignette was compiled with R 3.6 or later, so the corrected sampling method introduced in that version is used. To reproduce results obtained with an earlier version, use sample.kind = "Rounding". The following code simulates a dataset and then visualises it:
if (getRversion() < "3.6.0") {
  # the sample.kind argument does not exist before R 3.6.0
  set.seed(12, kind = "Mersenne-Twister", normal.kind = "Inversion")
} else {
  set.seed(12, kind = "Mersenne-Twister", normal.kind = "Inversion", sample.kind = "Rejection")
}
n = 200 # taken small to reduce runtime of the vignette.
d = 4
n_trees = 5 # taken small to reduce runtime of the vignette.
number_max_dim_forest = 2 # taken small to reduce runtime of the vignette.
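data <- matrix(nrow = n, ncol = d)
data[, c(1, 4, 3)] <- rClayton(n = n, dim = d - 1, alpha = 7) # three dependent margins
data[, 2] <- runif(n)                                         # independent second margin
data[, 3] <- 1 - data[, 3]                                    # flip the third margin
pairs(data, cex = 0.6)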
We can clearly see that the second marginal is independent from the rest. In the following, we will use this package to fit this dependence structure.
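The visual impression can be backed by numbers. The sketch below rebuilds a dataset with the same construction and prints its matrix of pairwise Kendall's tau. The generator is restated so that the block runs on its own, the object names `check` and `tau` are illustrative, and the seeding is simplified to a plain `set.seed(12)`, so the draws are not guaranteed to match the ones plotted above.

```r
# Clayton sampler (gamma frailty), restated so the block is self-contained
psi <- function(t, alpha) (1 + sign(alpha) * t)^(-1 / alpha)
rClayton <- function(n, dim, alpha) {
  val <- matrix(runif(n * dim), nrow = n)
  gam <- matrix(rgamma(n, shape = 1 / alpha, rate = 1), nrow = n, ncol = dim)
  psi(-log(val) / gam, alpha)
}

set.seed(12)
n <- 200
d <- 4
check <- matrix(nrow = n, ncol = d)
check[, c(1, 4, 3)] <- rClayton(n = n, dim = d - 1, alpha = 7)
check[, 2] <- runif(n)          # independent margin
check[, 3] <- 1 - check[, 3]    # flipped margin

# Pairwise Kendall's tau; rows and columns follow the column order of the data
tau <- cor(check, method = "kendall")
round(tau, 2)
```

Entries involving column 2 should be close to 0, the (1, 4) entry should be close to 7/9 (about 0.78), and the entries pairing column 3 with columns 1 or 4 should be close to -0.78 because of the flip.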
Now that we have a dataset, we can run the Cort algorithm on it. In the implementation proposed here, this is done via the cort::Cort() function, which takes the dataset first and then various parameters. See ?Cort for a detailed list of parameters. The verbosity level is graduated: here we set it to 4 to see the splitting decisions that the algorithm makes.
(model = Cort(data,verbose_lvl=4,p_value_for_dim_red = 0.75))
#> Splitting...
#>
#> 1 leaves to split...
#> Leaf with 200 points.
#> min max bp p_value action reason
#> 1 0 1 0.793066313 0.00000000 Splitted
#> 2 0 1 0.073362799 0.74574575 Splitted
#> 3 0 1 0.208957488 0.00000000 Splitted
#> 4 0 1 0.789018711 0.00000000 Splitted
#>
#>
#> 10 leaves to split...
#> Leaf with 10 points.
#> min max bp p_value action reason
#> 1 0.00000000 0.793066313 0.029266960 0.23323323 Removed Close to boundary
#> 2 0.00000000 0.073362799 0.024694101 1.00000000 Removed Independence test
#> 3 0.20895749 1.000000000 0.976639186 1.00000000 Removed Independence test
#> 4 0.00000000 0.789018711 0.024931546 1.00000000 Removed Independence test
#>
#> Leaf with 5 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.79306631 0.74647612 0.025025025 Removed Close to boundary
#> 2 0.073362799 1.00000000 0.46782571 0.815815816 Removed Independence test
#> 3 0.000000000 0.20895749 0.11341065 1.000000000 Removed Independence test
#> 4 0.000000000 0.78901871 0.73903510 0.001001001 Removed Close to boundary
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.79306631 0.74622987 0.059059059 Removed Close to boundary
#> 2 0.073362799 1.00000000 0.74698467 1.000000000 Removed Independence test
#> 3 0.000000000 0.20895749 0.11288653 1.000000000 Removed Independence test
#> 4 0.789018711 1.00000000 0.95782035 1.000000000 Removed Independence test
#>
#> Leaf with 132 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.79306631 0.042821551 0.00000000 Splitted
#> 2 0.073362799 1.00000000 0.537607355 0.93793794 Removed Independence test
#> 3 0.208957488 1.00000000 0.956730821 0.00000000 Splitted
#> 4 0.000000000 0.78901871 0.042245292 0.00000000 Splitted
#>
#> Leaf with 10 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.79306631 0.72963801 0.001001001 Splitted
#> 2 0.073362799 1.00000000 0.79100657 0.699699700 Splitted
#> 3 0.208957488 1.00000000 0.29191967 0.000000000 Splitted
#> 4 0.789018711 1.00000000 0.81149607 0.708708709 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.79306631 1.000000000 0.831543045 1.000000000 Removed Independence test
#> 2 0.00000000 0.073362799 0.054725149 1.000000000 Removed Independence test
#> 3 0.00000000 0.208957488 0.077939286 1.000000000 Removed Independence test
#> 4 0.00000000 0.789018711 0.779741296 0.038038038 Removed Close to boundary
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.79306631 1.000000000 0.845773688 1 Removed Independence test
#> 2 0.00000000 0.073362799 0.064447416 1 Removed Independence test
#> 3 0.00000000 0.208957488 0.025492203 1 Removed Independence test
#> 4 0.78901871 1.000000000 0.979599081 1 Removed Independence test
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 0.84166992 1.000000000 Removed Independence test
#> 2 0.073362799 1.00000000 0.32339274 0.342342342 Splitted
#> 3 0.000000000 0.20895749 0.15757214 0.326326326 Splitted
#> 4 0.000000000 0.78901871 0.74348553 0.001001001 Removed Close to boundary
#>
#> Leaf with 27 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 0.910329535 0.897897898 Removed Independence test
#> 2 0.073362799 1.00000000 0.338809561 0.349349349 Splitted
#> 3 0.000000000 0.20895749 0.062301862 0.074074074 Splitted
#> 4 0.789018711 1.00000000 0.945263364 0.055055055 Splitted
#>
#> Leaf with 5 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 0.91944990 1.000000000 Removed Independence test
#> 2 0.073362799 1.00000000 0.29163589 1.000000000 Removed Independence test
#> 3 0.208957488 1.00000000 0.23173294 0.051051051 Removed Close to boundary
#> 4 0.000000000 0.78901871 0.76680943 0.007007007 Removed Close to boundary
#>
#>
#> 10 leaves to split...
#> Leaf with 6 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.042821551 0.0065360942 1 Removed Independence test
#> 2 0.073362799 1.000000000 NaN NaN
#> 3 0.956730821 1.000000000 0.9931915688 1 Removed Independence test
#> 4 0.000000000 0.042245292 0.0062637572 1 Removed Independence test
#>
#> Leaf with 124 points.
#> min max bp p_value action reason
#> 1 0.042821551 0.79306631 0.065828676 0 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.95673082 0.935908260 0 Splitted
#> 4 0.042245292 0.78901871 0.062581190 0 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.000000000 0.72963801 0.56858981 0.17917918 Splitted
#> 2 0.073362799 0.79100657 0.27064799 0.30030030 Splitted
#> 3 0.291919675 1.00000000 0.43283826 0.42042042 Removed Close to boundary
#> 4 0.811496071 1.00000000 0.82108441 1.00000000 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.79306631 1.00000000 NaN NaN
#> 2 0.32339274 1.00000000 0.63368566 1 Removed Independence test
#> 3 0.00000000 0.15757214 0.14924937 1 Removed Independence test
#> 4 0.00000000 0.78901871 NaN NaN
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 NaN NaN
#> 2 0.073362799 0.33880956 0.29480487 0.56456456 Removed Close to boundary
#> 3 0.062301862 0.20895749 0.10448923 1.00000000 Removed Independence test
#> 4 0.789018711 0.94526336 0.88403653 1.00000000 Removed Independence test
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 NaN NaN
#> 2 0.073362799 0.33880956 0.29234682 0.42442442 Splitted
#> 3 0.062301862 0.20895749 0.12394396 0.09009009 Splitted
#> 4 0.945263364 1.00000000 0.96517297 1.00000000 Removed Independence test
#>
#> Leaf with 6 points.
#> min max bp p_value action reason
#> 1 0.79306631 1.000000000 NaN NaN
#> 2 0.33880956 1.000000000 0.832772505 1.00000000 Removed Independence test
#> 3 0.00000000 0.062301862 0.052701408 1.00000000 Removed Independence test
#> 4 0.78901871 0.945263364 0.920371108 0.11511512 Dissmissed No one-dim split
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.79306631 1.000000000 NaN NaN
#> 2 0.33880956 1.000000000 0.8706379426 1 Removed Independence test
#> 3 0.00000000 0.062301862 0.0099673589 1 Removed Independence test
#> 4 0.94526336 1.000000000 0.9615632425 1 Removed Independence test
#>
#> Leaf with 9 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 NaN NaN
#> 2 0.338809561 1.00000000 0.33887113 0.33933934 Removed Close to boundary
#> 3 0.062301862 0.20895749 0.20894104 0.85785786 Removed Independence test
#> 4 0.789018711 0.94526336 0.94524514 1.00000000 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 NaN NaN
#> 2 0.338809561 1.00000000 0.40298836 0.64264264 Removed Close to boundary
#> 3 0.062301862 0.20895749 0.18667424 0.36536537 Removed Close to boundary
#> 4 0.945263364 1.00000000 0.95041568 1.00000000 Removed Independence test
#>
#>
#> 3 leaves to split...
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.042821551 0.065828676 0.053067993 1 Removed Independence test
#> 2 0.073362799 1.000000000 NaN NaN
#> 3 0.935908260 0.956730821 0.950248756 1 Removed Independence test
#> 4 0.042245292 0.062581190 0.054726368 1 Removed Independence test
#>
#> Leaf with 120 points.
#> min max bp p_value action reason
#> 1 0.065828676 0.79306631 0.11964882 0 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.93590826 0.88115172 0 Splitted
#> 4 0.062581190 0.78901871 0.11599548 0 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.793066313 1.00000000 NaN NaN
#> 2 0.292346820 0.33880956 0.296951575 0.11411411 Removed Close to boundary
#> 3 0.062301862 0.12394396 0.069654825 1.00000000 Removed Independence test
#> 4 0.945263364 1.00000000 NaN NaN
#>
#>
#> 3 leaves to split...
#> Leaf with 7 points.
#> min max bp p_value action reason
#> 1 0.065828676 0.11964882 0.079603606 0.42242242 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.881151716 0.93590826 0.924981262 0.43143143 Splitted
#> 4 0.062581190 0.11599548 0.069823682 0.38838839 Splitted
#>
#> Leaf with 107 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.79306631 0.27188506 0 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.88115172 0.72636003 0 Splitted
#> 4 0.115995484 0.78901871 0.26648409 0 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.065828676 0.11964882 0.101680877 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.88115172 0.879194269 0.01001001 Removed Close to boundary
#> 4 0.062581190 0.11599548 0.084207853 1.00000000 Removed Independence test
#>
#>
#> 7 leaves to split...
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.065828676 0.079603606 0.078923524 1 Removed Independence test
#> 2 0.073362799 1.000000000 NaN NaN
#> 3 0.924981262 0.935908260 0.925377175 1 Removed Independence test
#> 4 0.062581190 0.069823682 0.066008939 1 Removed Independence test
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.079603606 0.11964882 0.097829469 1 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.881151716 0.92498126 0.906854689 1 Removed Independence test
#> 4 0.069823682 0.11599548 0.099455676 1 Removed Independence test
#>
#> Leaf with 21 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.27188506 0.19685645 0.012012012 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.726360027 0.88115172 0.80101288 0.012012012 Splitted
#> 4 0.115995484 0.26648409 0.19373922 0.018018018 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.27188506 0.25352940 0.641641642 Removed Close to boundary
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.726360027 0.88115172 0.76618348 1.000000000 Removed Independence test
#> 4 0.266484092 0.78901871 0.29054102 0.076076076 Removed Close to boundary
#>
#> Leaf with 78 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.79306631 0.38434960 0 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.72636003 0.61193415 0 Splitted
#> 4 0.266484092 0.78901871 0.37693310 0 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.79306631 0.33562755 0.105105105 Removed Close to boundary
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.726360027 0.88115172 0.73246193 0.596596597 Removed Close to boundary
#> 4 0.266484092 0.78901871 0.30366987 0.078078078 Removed Close to boundary
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.27188506 0.19894481 1.000000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.72636003 0.72113597 0.079079079 Removed Close to boundary
#> 4 0.115995484 0.26648409 0.17420944 1.000000000 Removed Independence test
#>
#>
#> 12 leaves to split...
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 0.18407815 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.726360027 0.80101288 0.75733968 1.00000000 Removed Independence test
#> 4 0.193739216 0.26648409 0.21153553 0.34334334 Dissmissed No one-dim split
#>
#> Leaf with 7 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 0.14108896 0.77077077 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.801012875 0.88115172 0.85572048 0.29429429 Splitted
#> 4 0.115995484 0.19373922 0.13455778 0.27827828 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 0.18879593 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.801012875 0.88115172 0.84545595 1.00000000 Removed Independence test
#> 4 0.193739216 0.26648409 0.25373082 0.63263263 Removed Close to boundary
#>
#> Leaf with 8 points.
#> min max bp p_value action reason
#> 1 0.196856447 0.27188506 0.22662820 0.74974975 Dissmissed No one-dim split
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.726360027 0.80101288 0.75125930 0.82582583 Removed Independence test
#> 4 0.193739216 0.26648409 0.21393072 0.76976977 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.196856447 0.27188506 0.22554118 1 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.801012875 0.88115172 0.82168855 1 Removed Independence test
#> 4 0.115995484 0.19373922 0.16915233 1 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.38434960 0.32303566 0.55655656 Dissmissed No one-dim split
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.61193415 0.56218594 0.16616617 Removed Close to boundary
#> 4 0.376933100 0.78901871 0.42165697 0.18318318 Removed Close to boundary
#>
#> Leaf with 10 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.38434960 0.36667239 0.55055055 Dissmissed No one-dim split
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.611934152 0.72636003 0.61195748 0.82582583 Removed Independence test
#> 4 0.266484092 0.37693310 0.32525261 0.82582583 Removed Independence test
#>
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.38434960 0.35570733 0.310310310 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.611934152 0.72636003 0.65107066 0.312312312 Splitted
#> 4 0.376933100 0.78901871 0.42288955 0.095095095 Removed Close to boundary
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.79306631 0.43274466 0.036036036 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.61193415 0.43790324 0.633633634 Splitted
#> 4 0.266484092 0.37693310 0.36074103 0.428428428 Splitted
#>
#> Leaf with 52 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.79306631 0.57453765 0.005005005 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.61193415 0.41031179 0.030030030 Splitted
#> 4 0.376933100 0.78901871 0.57220761 0.001001001 Splitted
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.79306631 0.55222598 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.611934152 0.72636003 0.66067260 0.40840841 Splitted
#> 4 0.376933100 0.78901871 0.45888203 0.16116116 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.271885061 0.38434960 0.35256476 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.61193415 0.53233104 0.22722723 Removed Close to boundary
#> 4 0.266484092 0.37693310 0.28582048 1.00000000 Removed Independence test
#>
#>
#> 12 leaves to split...
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 NaN NaN
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.801012875 0.85572048 0.82124427 0.46346346 Splitted
#> 4 0.134557776 0.19373922 0.16419882 0.40940941 Splitted
#>
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 NaN NaN
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.855720475 0.88115172 0.85660434 1 Removed Independence test
#> 4 0.115995484 0.13455778 0.12935286 1 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.43274466 0.38452287 0.38238238 Removed Close to boundary
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.43790324 0.43777821 1.00000000 Removed Independence test
#> 4 0.360741033 0.37693310 0.37690501 0.37537538 Removed Close to boundary
#>
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.57453765 0.53247665 0.58858859 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.41031179 0.36827940 0.53753754 Splitted
#> 4 0.572207608 0.78901871 0.62189266 0.52252252 Splitted
#>
#> Leaf with 12 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.57453765 0.45392145 0.52152152 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.410311786 0.61193415 0.54725983 0.17317317 Splitted
#> 4 0.376933100 0.57220761 0.50213797 0.21421421 Splitted
#>
#> Leaf with 5 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.57453765 0.41791128 0.63563564 Dissmissed No one-dim split
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.410311786 0.61193415 0.52717035 0.79779780 Removed Independence test
#> 4 0.572207608 0.78901871 0.58706794 1.00000000 Removed Independence test
#>
#> Leaf with 5 points.
#> min max bp p_value action reason
#> 1 0.574537650 0.79306631 0.78561324 0.45745746 Removed Close to boundary
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.41031179 0.40633893 0.58458458 Removed Close to boundary
#> 4 0.376933100 0.57220761 0.38812318 0.58658659 Removed Close to boundary
#>
#> Leaf with 14 points.
#> min max bp p_value action reason
#> 1 0.574537650 0.79306631 0.66905645 0.62262262 Dissmissed No one-dim split
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.41031179 0.32337244 0.95495495 Removed Independence test
#> 4 0.572207608 0.78901871 0.67361034 1.00000000 Removed Independence test
#>
#> Leaf with 5 points.
#> min max bp p_value action reason
#> 1 0.574537650 0.79306631 0.59204309 1.0000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.410311786 0.61193415 0.52113685 0.4974975 Dissmissed No one-dim split
#> 4 0.376933100 0.57220761 0.55508101 0.7977978 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.574537650 0.79306631 0.73371970 1.0000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.410311786 0.61193415 0.44958932 0.2042042 Removed Close to boundary
#> 4 0.572207608 0.78901871 0.63185248 1.0000000 Removed Independence test
#>
#> Leaf with 6 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.57453765 0.55283435 0.028028028 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.41031179 0.26190578 0.294294294 Splitted
#> 4 0.376933100 0.57220761 0.54726179 0.049049049 Splitted
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.79306631 NaN NaN
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.611934152 0.66067260 0.62269454 0.60160160 Splitted
#> 4 0.376933100 0.45888203 0.39801108 0.65365365 Splitted
#>
#>
#> 6 leaves to split...
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.119648815 0.19685645 NaN NaN
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.801012875 0.82124427 0.80180241 0.20420420 Removed Close to boundary
#> 4 0.134557776 0.16419882 0.16412616 0.16716717 Removed Close to boundary
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.53247665 0.51740520 1 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.208957488 0.36827940 0.30759802 1 Removed Independence test
#> 4 0.621892657 0.78901871 0.65174185 1 Removed Independence test
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.45392145 0.42785285 0.86686687 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.547259827 0.61193415 0.59646282 0.19019019 Dissmissed No one-dim split
#> 4 0.376933100 0.50213797 0.47070495 0.85985986 Removed Independence test
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.45392145 0.39801501 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.547259827 0.61193415 0.55232574 1.00000000 Removed Independence test
#> 4 0.502137972 0.57220761 0.56711339 0.19219219 Removed Close to boundary
#>
#> Leaf with 3 points.
#> min max bp p_value action reason
#> 1 0.453921453 0.57453765 0.50248891 1.00000000 Removed Independence test
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.410311786 0.54725983 0.46270142 0.44444444 Dissmissed No one-dim split
#> 4 0.502137972 0.57220761 0.53924121 0.84484484 Removed Independence test
#>
#> Leaf with 4 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.55283435 0.50746186 0.422422422 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.261905783 0.41031179 0.34766423 1.000000000 Removed Independence test
#> 4 0.376933100 0.54726179 0.50579039 0.057057057 Splitted
#>
#>
#> 2 leaves to split...
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.507461859 0.55283435 0.50748350 0.74674675 Removed Close to boundary
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.261905783 0.41031179 NaN NaN
#> 4 0.505790392 0.54726179 0.54724087 0.72672673 Removed Close to boundary
#>
#> Leaf with 2 points.
#> min max bp p_value action reason
#> 1 0.384349600 0.50746186 0.48002519 0.51851852 Splitted
#> 2 0.073362799 1.00000000 NaN NaN
#> 3 0.261905783 0.41031179 NaN NaN
#> 4 0.376933100 0.50579039 0.47760360 0.54054054 Splitted
#>
#>
#> 0 leaves to split...
#> Enforcing constraints...
#> -----------------------------------------------------------------
#> OSQP v0.6.0 - Operator Splitting QP Solver
#> (c) Bartolomeo Stellato, Goran Banjac
#> University of Oxford - Stanford University 2019
#> -----------------------------------------------------------------
#> problem: variables n = 152, constraints m = 287
#> nnz(P) + nnz(A) = 13008
#> settings: linear system solver = qdldl,
#> eps_abs = 1.0e-06, eps_rel = 1.0e-06,
#> eps_prim_inf = 1.0e-06, eps_dual_inf = 1.0e-06,
#> rho = 1.00e-01 (adaptive),
#> sigma = 1.00e-06, alpha = 1.60, max_iter = 100000
#> check_termination: on (interval 25),
#> scaling: on, scaled_termination: off
#> warm start: on, polish: on, time_limit: off
#>
#> iter objective pri res dua res rho time
#> 1 -8.7153e+01 2.85e-02 3.07e+04 1.00e-01 5.07e-03s
#> 200 -7.6454e+01 1.54e-04 2.74e-01 1.54e+00 2.37e-02s
#> 400 -7.6354e+01 6.17e-06 1.11e+00 4.61e+00 4.76e-02s
#> 450 -7.6346e+01 1.20e-07 8.57e-03 4.61e+00 5.18e-02s
#> plsh -7.6346e+01 5.24e-16 2.74e-12 --------- 6.76e-02s
#>
#> status: solved
#> solution polish: successful
#> number of iterations: 450
#> optimal objective: -76.3461
#> run time: 6.76e-02s
#> optimal rho estimate: 9.95e-01
#>
#> Done !
#> Cort copula model: 200x4-dataset and 152 leaves.
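The role of p_value_for_dim_red in the log above can be made explicit with a toy reading of the root leaf. This is only an interpretation of the printed table, not the package's internal code: it assumes that a dimension is flagged by the independence test when its p-value exceeds the threshold. The p-values are copied from the first table of the output, and the other reasons shown in the log (such as "Close to boundary") are not modelled here.

```r
# p-values of the four dimensions in the root leaf, copied from the log above
p_value <- c(0.00000000, 0.74574575, 0.00000000, 0.00000000)
threshold <- 0.75   # value passed as p_value_for_dim_red

# Assumed reading of the log: p-value above the threshold -> dimension removed
action <- ifelse(p_value > threshold, "Removed", "Splitted")
data.frame(dim = 1:4, p_value = p_value, action = action)
```

Under this reading, all four dimensions are split at the root, which agrees with the first table of the log; the second dimension misses the threshold by a small margin.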
Looking at the top of the output, we see that the independence test singles out the second dimension very early: at the root its p-value (0.746) falls just below the threshold of 0.75, and in the largest child leaf (132 points) the dimension is removed. Now that the copula is fitted, we have access to many of its methods. Two plotting functions are exported with this model: the pairs function is implemented at a very low level in the class hierarchy and hence works with almost all copulas of this package, whereas the plot function is only implemented for Cort.
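Beyond plotting, a fitted model can be simulated from and evaluated. The following is a minimal sketch, assuming the cort package is installed and that its rCopula, pCopula and dCopula generics take their arguments in the (n, copula) and (u, copula) order; ?Cort remains the authoritative reference. A small dependent toy dataset (`toy`, an illustrative name) stands in for the data used above so that the block runs on its own.

```r
if (requireNamespace("cort", quietly = TRUE)) {
  library(cort)
  set.seed(1)                                   # arbitrary seed
  x <- rnorm(100)
  y <- x + 0.3 * rnorm(100)                     # strongly dependent pair
  toy <- cbind(rank(x), rank(y)) / (100 + 1)    # pseudo-observations in (0, 1)

  toy_model <- Cort(toy, verbose_lvl = 0)       # silent fit

  sim <- rCopula(500, toy_model)                # simulate from the fitted copula
  u <- matrix(c(0.5, 0.5), nrow = 1)
  print(pCopula(u, toy_model))                  # distribution function at u
  print(dCopula(u, toy_model))                  # density at u
}
```

Since the fitted object is meant to be a proper copula, each column of `sim` should look uniform on [0, 1].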
Pairs-plot of original data (in black, bottom-left corner) versus a simulation from the model (in red, top-right corner)
Gray boxes representing 2-d projections of the fitted density. In red, the imputed data points.
We see that there is some noise, with points where there should not be any. A bagged version of the model is accessible via the CortForest class, and might be able to correct these problems.
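A hedged sketch of what fitting the bagged version might look like, assuming the cort package is installed and that CortForest() accepts the n_trees, number_max_dim and verbose_lvl arguments (to be checked against ?CortForest). The toy dataset and the object names `toy`, `forest` and `sim_forest` are illustrative, and the small number of trees is only there to keep the runtime low.

```r
if (requireNamespace("cort", quietly = TRUE)) {
  library(cort)
  set.seed(1)                                   # arbitrary seed
  x <- rnorm(100)
  y <- x + 0.3 * rnorm(100)                     # strongly dependent pair
  toy <- cbind(rank(x), rank(y)) / (100 + 1)    # pseudo-observations in (0, 1)

  # Bagged model: several trees fitted on resampled data, then aggregated
  forest <- CortForest(toy, n_trees = 5, number_max_dim = 2, verbose_lvl = 0)

  sim_forest <- rCopula(500, forest)            # simulate from the forest
  head(sim_forest)
}
```

Comparing a pairs plot of `sim_forest` with one of the single-tree simulation is a simple way to judge whether the bagging reduces the spurious points.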