2. The Copula Recursive Tree

2020-03-31

Introduction

In this vignette, we use the Cort algorithm to fit a copula to a dataset simulated from a Clayton copula. We show how the algorithm can be used to produce a bona fide copula, and describe some of its parameters.

Dataset

First, let’s create and plot the dataset we will work with. For that, we use the gamma frailty model for the Clayton copula (the same approach works for any other completely monotone Archimedean generator), as is done in the copula package. The following code is taken directly from the copula package:

psi <- function(t,alpha) (1 + sign(alpha)*t) ^ (-1/alpha) # generator
rClayton <- function(n,dim,alpha){
  val <- matrix(runif(n * dim), nrow = n)
  gam <- rgamma(n, shape = 1/alpha, rate = 1)
  gam <- matrix(gam, nrow = n, ncol = dim)
  psi(- log(val) / gam,alpha)
}
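As a quick sanity check of the sampler above, one can compare the empirical Kendall's tau of a simulated sample with the theoretical value for a Clayton copula, tau = alpha / (alpha + 2). The following is only a sketch: the sample size, the seed and the variable names tau_theory and tau_sample are arbitrary choices made for illustration, and the two functions are repeated so that the block runs on its own.

```r
# Generator and sampler, repeated from above so the block is self-contained.
psi <- function(t, alpha) (1 + sign(alpha) * t)^(-1 / alpha)
rClayton <- function(n, dim, alpha) {
  val <- matrix(runif(n * dim), nrow = n)        # independent uniforms
  gam <- rgamma(n, shape = 1 / alpha, rate = 1)  # one gamma frailty per observation
  gam <- matrix(gam, nrow = n, ncol = dim)       # shared across the columns of a row
  psi(-log(val) / gam, alpha)
}

set.seed(1)  # arbitrary seed, for illustration only
alpha <- 7
x <- rClayton(n = 2000, dim = 2, alpha = alpha)

tau_theory <- alpha / (alpha + 2)                      # Kendall's tau of a Clayton copula
tau_sample <- cor(x[, 1], x[, 2], method = "kendall")  # empirical counterpart

# The two values should agree up to sampling noise.
print(c(theory = tau_theory, sample = tau_sample))
```

With alpha = 7, as used below, the theoretical value is 7/9, which indicates a strong positive dependence between the simulated margins.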

For reproducibility, we set the seed of the random number generator. This vignette was compiled with R 3.6 or later, so the corrected sampling algorithm introduced in R 3.6.0 (sample.kind = "Rejection") is used. To reproduce results from an earlier version of R, use sample.kind = "Rounding" instead. The following code simulates a dataset and then visualises it:

if (getRversion() < "3.6.0") {
  # The sample.kind argument does not exist before R 3.6.0, so it is not passed here.
  set.seed(12, kind = "Mersenne-Twister", normal.kind = "Inversion")
} else {
  set.seed(12, kind = "Mersenne-Twister", normal.kind = "Inversion", sample.kind = "Rejection")
}


n = 200 # taken small to reduce runtime of the vignette.
d = 4
n_trees = 5 # taken small to reduce runtime of the vignette.
number_max_dim_forest = 2 # taken small to reduce runtime of the vignette.

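data <- matrix(nrow = n, ncol = d)
data[, c(1, 4, 3)] <- rClayton(n = n, dim = d - 1, alpha = 7) # dependent block: columns 1, 4 and 3
data[, 2] <- runif(n)                                         # column 2: independent uniform
data[, 3] <- 1 - data[, 3]                                    # flip column 3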


pairs(data,cex=0.6)

We can clearly see that the second marginal is independent from the rest. In the following, we will use the cort package to fit this dependence structure.
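The visual impression can be backed by numbers. The sketch below rebuilds a dataset in the same way and prints the matrix of pairwise Kendall's tau. It is an illustration only: the seed is set with a plain set.seed(12) call, so the values need not correspond exactly to the plot above, and the name tau is introduced here for the example. One would expect the entries involving the second column to be close to zero, and the third column to be negatively associated with the first and fourth because of the flip.

```r
# Generator and sampler, repeated so the block is self-contained.
psi <- function(t, alpha) (1 + sign(alpha) * t)^(-1 / alpha)
rClayton <- function(n, dim, alpha) {
  val <- matrix(runif(n * dim), nrow = n)
  gam <- rgamma(n, shape = 1 / alpha, rate = 1)
  gam <- matrix(gam, nrow = n, ncol = dim)
  psi(-log(val) / gam, alpha)
}

set.seed(12)  # plain seed for this sketch; RNG kinds are left at their defaults
n <- 200
d <- 4

# Same construction as in the vignette.
data <- matrix(nrow = n, ncol = d)
data[, c(1, 4, 3)] <- rClayton(n = n, dim = d - 1, alpha = 7)
data[, 2] <- runif(n)
data[, 3] <- 1 - data[, 3]

# Pairwise Kendall's tau between the four columns.
tau <- cor(data, method = "kendall")
print(round(tau, 2))
```

Rank correlations only summarise pairwise dependence, so this check complements the pairs plot rather than replacing it.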

Fitting the Cort copula

Now that we have a dataset, we can run the Cort algorithm on it. In the implementation proposed here, this is done via the cort::Cort() function, passing first the dataset and then various parameters. See ?Cort for a detailed list of parameters. The verbosity is graded, with higher levels printing more detail; here we set verbose_lvl = 4 to see the splitting decisions that the algorithm makes.

(model = Cort(data,verbose_lvl=4,p_value_for_dim_red = 0.75))
#> Splitting...
#> 
#>      1 leaves to split...
#>         Leaf with 200 points.
#>                     min   max   bp            p_value      action     reason
#>                 1   0     1     0.793066313   0.00000000   Splitted         
#>                 2   0     1     0.073362799   0.74574575   Splitted         
#>                 3   0     1     0.208957488   0.00000000   Splitted         
#>                 4   0     1     0.789018711   0.00000000   Splitted         
#> 
#> 
#>      10 leaves to split...
#>         Leaf with 10 points.
#>                     min          max           bp            p_value      action    reason           
#>                 1   0.00000000   0.793066313   0.029266960   0.23323323   Removed   Close to boundary
#>                 2   0.00000000   0.073362799   0.024694101   1.00000000   Removed   Independence test
#>                 3   0.20895749   1.000000000   0.976639186   1.00000000   Removed   Independence test
#>                 4   0.00000000   0.789018711   0.024931546   1.00000000   Removed   Independence test
#> 
#>         Leaf with 5 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.000000000   0.79306631   0.74647612   0.025025025   Removed   Close to boundary
#>                 2   0.073362799   1.00000000   0.46782571   0.815815816   Removed   Independence test
#>                 3   0.000000000   0.20895749   0.11341065   1.000000000   Removed   Independence test
#>                 4   0.000000000   0.78901871   0.73903510   0.001001001   Removed   Close to boundary
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.000000000   0.79306631   0.74622987   0.059059059   Removed   Close to boundary
#>                 2   0.073362799   1.00000000   0.74698467   1.000000000   Removed   Independence test
#>                 3   0.000000000   0.20895749   0.11288653   1.000000000   Removed   Independence test
#>                 4   0.789018711   1.00000000   0.95782035   1.000000000   Removed   Independence test
#> 
#>         Leaf with 132 points.
#>                     min           max          bp            p_value      action     reason           
#>                 1   0.000000000   0.79306631   0.042821551   0.00000000   Splitted                    
#>                 2   0.073362799   1.00000000   0.537607355   0.93793794   Removed    Independence test
#>                 3   0.208957488   1.00000000   0.956730821   0.00000000   Splitted                    
#>                 4   0.000000000   0.78901871   0.042245292   0.00000000   Splitted                    
#> 
#>         Leaf with 10 points.
#>                     min           max          bp           p_value       action     reason
#>                 1   0.000000000   0.79306631   0.72963801   0.001001001   Splitted         
#>                 2   0.073362799   1.00000000   0.79100657   0.699699700   Splitted         
#>                 3   0.208957488   1.00000000   0.29191967   0.000000000   Splitted         
#>                 4   0.789018711   1.00000000   0.81149607   0.708708709   Splitted         
#> 
#>         Leaf with 2 points.
#>                     min          max           bp            p_value       action    reason           
#>                 1   0.79306631   1.000000000   0.831543045   1.000000000   Removed   Independence test
#>                 2   0.00000000   0.073362799   0.054725149   1.000000000   Removed   Independence test
#>                 3   0.00000000   0.208957488   0.077939286   1.000000000   Removed   Independence test
#>                 4   0.00000000   0.789018711   0.779741296   0.038038038   Removed   Close to boundary
#> 
#>         Leaf with 2 points.
#>                     min          max           bp            p_value   action    reason           
#>                 1   0.79306631   1.000000000   0.845773688   1         Removed   Independence test
#>                 2   0.00000000   0.073362799   0.064447416   1         Removed   Independence test
#>                 3   0.00000000   0.208957488   0.025492203   1         Removed   Independence test
#>                 4   0.78901871   1.000000000   0.979599081   1         Removed   Independence test
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value       action     reason           
#>                 1   0.793066313   1.00000000   0.84166992   1.000000000   Removed    Independence test
#>                 2   0.073362799   1.00000000   0.32339274   0.342342342   Splitted                    
#>                 3   0.000000000   0.20895749   0.15757214   0.326326326   Splitted                    
#>                 4   0.000000000   0.78901871   0.74348553   0.001001001   Removed    Close to boundary
#> 
#>         Leaf with 27 points.
#>                     min           max          bp            p_value       action     reason           
#>                 1   0.793066313   1.00000000   0.910329535   0.897897898   Removed    Independence test
#>                 2   0.073362799   1.00000000   0.338809561   0.349349349   Splitted                    
#>                 3   0.000000000   0.20895749   0.062301862   0.074074074   Splitted                    
#>                 4   0.789018711   1.00000000   0.945263364   0.055055055   Splitted                    
#> 
#>         Leaf with 5 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.793066313   1.00000000   0.91944990   1.000000000   Removed   Independence test
#>                 2   0.073362799   1.00000000   0.29163589   1.000000000   Removed   Independence test
#>                 3   0.208957488   1.00000000   0.23173294   0.051051051   Removed   Close to boundary
#>                 4   0.000000000   0.78901871   0.76680943   0.007007007   Removed   Close to boundary
#> 
#> 
#>      10 leaves to split...
#>         Leaf with 6 points.
#>                     min           max           bp             p_value   action    reason           
#>                 1   0.000000000   0.042821551   0.0065360942     1       Removed   Independence test
#>                 2   0.073362799   1.000000000            NaN   NaN                                  
#>                 3   0.956730821   1.000000000   0.9931915688     1       Removed   Independence test
#>                 4   0.000000000   0.042245292   0.0062637572     1       Removed   Independence test
#> 
#>         Leaf with 124 points.
#>                     min           max          bp            p_value   action     reason
#>                 1   0.042821551   0.79306631   0.065828676     0       Splitted         
#>                 2   0.073362799   1.00000000           NaN   NaN                        
#>                 3   0.208957488   0.95673082   0.935908260     0       Splitted         
#>                 4   0.042245292   0.78901871   0.062581190     0       Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action     reason           
#>                 1   0.000000000   0.72963801   0.56858981   0.17917918   Splitted                    
#>                 2   0.073362799   0.79100657   0.27064799   0.30030030   Splitted                    
#>                 3   0.291919675   1.00000000   0.43283826   0.42042042   Removed    Close to boundary
#>                 4   0.811496071   1.00000000   0.82108441   1.00000000   Removed    Independence test
#> 
#>         Leaf with 2 points.
#>                     min          max          bp           p_value   action    reason           
#>                 1   0.79306631   1.00000000          NaN   NaN                                  
#>                 2   0.32339274   1.00000000   0.63368566     1       Removed   Independence test
#>                 3   0.00000000   0.15757214   0.14924937     1       Removed   Independence test
#>                 4   0.00000000   0.78901871          NaN   NaN                                  
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.793066313   1.00000000          NaN          NaN                              
#>                 2   0.073362799   0.33880956   0.29480487   0.56456456   Removed   Close to boundary
#>                 3   0.062301862   0.20895749   0.10448923   1.00000000   Removed   Independence test
#>                 4   0.789018711   0.94526336   0.88403653   1.00000000   Removed   Independence test
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value      action     reason           
#>                 1   0.793066313   1.00000000          NaN          NaN                               
#>                 2   0.073362799   0.33880956   0.29234682   0.42442442   Splitted                    
#>                 3   0.062301862   0.20895749   0.12394396   0.09009009   Splitted                    
#>                 4   0.945263364   1.00000000   0.96517297   1.00000000   Removed    Independence test
#> 
#>         Leaf with 6 points.
#>                     min          max           bp            p_value      action       reason           
#>                 1   0.79306631   1.000000000           NaN          NaN                                 
#>                 2   0.33880956   1.000000000   0.832772505   1.00000000   Removed      Independence test
#>                 3   0.00000000   0.062301862   0.052701408   1.00000000   Removed      Independence test
#>                 4   0.78901871   0.945263364   0.920371108   0.11511512   Dissmissed   No one-dim split 
#> 
#>         Leaf with 2 points.
#>                     min          max           bp             p_value   action    reason           
#>                 1   0.79306631   1.000000000            NaN   NaN                                  
#>                 2   0.33880956   1.000000000   0.8706379426     1       Removed   Independence test
#>                 3   0.00000000   0.062301862   0.0099673589     1       Removed   Independence test
#>                 4   0.94526336   1.000000000   0.9615632425     1       Removed   Independence test
#> 
#>         Leaf with 9 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.793066313   1.00000000          NaN          NaN                              
#>                 2   0.338809561   1.00000000   0.33887113   0.33933934   Removed   Close to boundary
#>                 3   0.062301862   0.20895749   0.20894104   0.85785786   Removed   Independence test
#>                 4   0.789018711   0.94526336   0.94524514   1.00000000   Removed   Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.793066313   1.00000000          NaN          NaN                              
#>                 2   0.338809561   1.00000000   0.40298836   0.64264264   Removed   Close to boundary
#>                 3   0.062301862   0.20895749   0.18667424   0.36536537   Removed   Close to boundary
#>                 4   0.945263364   1.00000000   0.95041568   1.00000000   Removed   Independence test
#> 
#> 
#>      3 leaves to split...
#>         Leaf with 3 points.
#>                     min           max           bp            p_value   action    reason           
#>                 1   0.042821551   0.065828676   0.053067993     1       Removed   Independence test
#>                 2   0.073362799   1.000000000           NaN   NaN                                  
#>                 3   0.935908260   0.956730821   0.950248756     1       Removed   Independence test
#>                 4   0.042245292   0.062581190   0.054726368     1       Removed   Independence test
#> 
#>         Leaf with 120 points.
#>                     min           max          bp           p_value   action     reason
#>                 1   0.065828676   0.79306631   0.11964882     0       Splitted         
#>                 2   0.073362799   1.00000000          NaN   NaN                        
#>                 3   0.208957488   0.93590826   0.88115172     0       Splitted         
#>                 4   0.062581190   0.78901871   0.11599548     0       Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp            p_value      action    reason           
#>                 1   0.793066313   1.00000000           NaN          NaN                              
#>                 2   0.292346820   0.33880956   0.296951575   0.11411411   Removed   Close to boundary
#>                 3   0.062301862   0.12394396   0.069654825   1.00000000   Removed   Independence test
#>                 4   0.945263364   1.00000000           NaN          NaN                              
#> 
#> 
#>      3 leaves to split...
#>         Leaf with 7 points.
#>                     min           max          bp            p_value      action     reason
#>                 1   0.065828676   0.11964882   0.079603606   0.42242242   Splitted         
#>                 2   0.073362799   1.00000000           NaN          NaN                    
#>                 3   0.881151716   0.93590826   0.924981262   0.43143143   Splitted         
#>                 4   0.062581190   0.11599548   0.069823682   0.38838839   Splitted         
#> 
#>         Leaf with 107 points.
#>                     min           max          bp           p_value   action     reason
#>                 1   0.119648815   0.79306631   0.27188506     0       Splitted         
#>                 2   0.073362799   1.00000000          NaN   NaN                        
#>                 3   0.208957488   0.88115172   0.72636003     0       Splitted         
#>                 4   0.115995484   0.78901871   0.26648409     0       Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp            p_value      action    reason           
#>                 1   0.065828676   0.11964882   0.101680877   1.00000000   Removed   Independence test
#>                 2   0.073362799   1.00000000           NaN          NaN                              
#>                 3   0.208957488   0.88115172   0.879194269   0.01001001   Removed   Close to boundary
#>                 4   0.062581190   0.11599548   0.084207853   1.00000000   Removed   Independence test
#> 
#> 
#>      7 leaves to split...
#>         Leaf with 2 points.
#>                     min           max           bp            p_value   action    reason           
#>                 1   0.065828676   0.079603606   0.078923524     1       Removed   Independence test
#>                 2   0.073362799   1.000000000           NaN   NaN                                  
#>                 3   0.924981262   0.935908260   0.925377175     1       Removed   Independence test
#>                 4   0.062581190   0.069823682   0.066008939     1       Removed   Independence test
#> 
#>         Leaf with 4 points.
#>                     min           max          bp            p_value   action    reason           
#>                 1   0.079603606   0.11964882   0.097829469     1       Removed   Independence test
#>                 2   0.073362799   1.00000000           NaN   NaN                                  
#>                 3   0.881151716   0.92498126   0.906854689     1       Removed   Independence test
#>                 4   0.069823682   0.11599548   0.099455676     1       Removed   Independence test
#> 
#>         Leaf with 21 points.
#>                     min           max          bp           p_value       action     reason
#>                 1   0.119648815   0.27188506   0.19685645   0.012012012   Splitted         
#>                 2   0.073362799   1.00000000          NaN           NaN                    
#>                 3   0.726360027   0.88115172   0.80101288   0.012012012   Splitted         
#>                 4   0.115995484   0.26648409   0.19373922   0.018018018   Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.119648815   0.27188506   0.25352940   0.641641642   Removed   Close to boundary
#>                 2   0.073362799   1.00000000          NaN           NaN                              
#>                 3   0.726360027   0.88115172   0.76618348   1.000000000   Removed   Independence test
#>                 4   0.266484092   0.78901871   0.29054102   0.076076076   Removed   Close to boundary
#> 
#>         Leaf with 78 points.
#>                     min           max          bp           p_value   action     reason
#>                 1   0.271885061   0.79306631   0.38434960     0       Splitted         
#>                 2   0.073362799   1.00000000          NaN   NaN                        
#>                 3   0.208957488   0.72636003   0.61193415     0       Splitted         
#>                 4   0.266484092   0.78901871   0.37693310     0       Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.271885061   0.79306631   0.33562755   0.105105105   Removed   Close to boundary
#>                 2   0.073362799   1.00000000          NaN           NaN                              
#>                 3   0.726360027   0.88115172   0.73246193   0.596596597   Removed   Close to boundary
#>                 4   0.266484092   0.78901871   0.30366987   0.078078078   Removed   Close to boundary
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value       action    reason           
#>                 1   0.119648815   0.27188506   0.19894481   1.000000000   Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN           NaN                              
#>                 3   0.208957488   0.72636003   0.72113597   0.079079079   Removed   Close to boundary
#>                 4   0.115995484   0.26648409   0.17420944   1.000000000   Removed   Independence test
#> 
#> 
#>      12 leaves to split...
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.119648815   0.19685645   0.18407815   1.00000000   Removed      Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.726360027   0.80101288   0.75733968   1.00000000   Removed      Independence test
#>                 4   0.193739216   0.26648409   0.21153553   0.34334334   Dissmissed   No one-dim split 
#> 
#>         Leaf with 7 points.
#>                     min           max          bp           p_value      action     reason           
#>                 1   0.119648815   0.19685645   0.14108896   0.77077077   Removed    Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                               
#>                 3   0.801012875   0.88115172   0.85572048   0.29429429   Splitted                    
#>                 4   0.115995484   0.19373922   0.13455778   0.27827828   Splitted                    
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.119648815   0.19685645   0.18879593   1.00000000   Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.801012875   0.88115172   0.84545595   1.00000000   Removed   Independence test
#>                 4   0.193739216   0.26648409   0.25373082   0.63263263   Removed   Close to boundary
#> 
#>         Leaf with 8 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.196856447   0.27188506   0.22662820   0.74974975   Dissmissed   No one-dim split 
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.726360027   0.80101288   0.75125930   0.82582583   Removed      Independence test
#>                 4   0.193739216   0.26648409   0.21393072   0.76976977   Removed      Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value   action    reason           
#>                 1   0.196856447   0.27188506   0.22554118     1       Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN   NaN                                  
#>                 3   0.801012875   0.88115172   0.82168855     1       Removed   Independence test
#>                 4   0.115995484   0.19373922   0.16915233     1       Removed   Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.271885061   0.38434960   0.32303566   0.55655656   Dissmissed   No one-dim split 
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.208957488   0.61193415   0.56218594   0.16616617   Removed      Close to boundary
#>                 4   0.376933100   0.78901871   0.42165697   0.18318318   Removed      Close to boundary
#> 
#>         Leaf with 10 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.271885061   0.38434960   0.36667239   0.55055055   Dissmissed   No one-dim split 
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.611934152   0.72636003   0.61195748   0.82582583   Removed      Independence test
#>                 4   0.266484092   0.37693310   0.32525261   0.82582583   Removed      Independence test
#> 
#>         Leaf with 3 points.
#>                     min           max          bp           p_value       action     reason           
#>                 1   0.271885061   0.38434960   0.35570733   0.310310310   Splitted                    
#>                 2   0.073362799   1.00000000          NaN           NaN                               
#>                 3   0.611934152   0.72636003   0.65107066   0.312312312   Splitted                    
#>                 4   0.376933100   0.78901871   0.42288955   0.095095095   Removed    Close to boundary
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value       action     reason
#>                 1   0.384349600   0.79306631   0.43274466   0.036036036   Splitted         
#>                 2   0.073362799   1.00000000          NaN           NaN                    
#>                 3   0.208957488   0.61193415   0.43790324   0.633633634   Splitted         
#>                 4   0.266484092   0.37693310   0.36074103   0.428428428   Splitted         
#> 
#>         Leaf with 52 points.
#>                     min           max          bp           p_value       action     reason
#>                 1   0.384349600   0.79306631   0.57453765   0.005005005   Splitted         
#>                 2   0.073362799   1.00000000          NaN           NaN                    
#>                 3   0.208957488   0.61193415   0.41031179   0.030030030   Splitted         
#>                 4   0.376933100   0.78901871   0.57220761   0.001001001   Splitted         
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value      action     reason           
#>                 1   0.384349600   0.79306631   0.55222598   1.00000000   Removed    Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                               
#>                 3   0.611934152   0.72636003   0.66067260   0.40840841   Splitted                    
#>                 4   0.376933100   0.78901871   0.45888203   0.16116116   Splitted                    
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.271885061   0.38434960   0.35256476   1.00000000   Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.208957488   0.61193415   0.53233104   0.22722723   Removed   Close to boundary
#>                 4   0.266484092   0.37693310   0.28582048   1.00000000   Removed   Independence test
#> 
#> 
#>      12 leaves to split...
#>         Leaf with 3 points.
#>                     min           max          bp           p_value      action     reason
#>                 1   0.119648815   0.19685645          NaN          NaN                    
#>                 2   0.073362799   1.00000000          NaN          NaN                    
#>                 3   0.801012875   0.85572048   0.82124427   0.46346346   Splitted         
#>                 4   0.134557776   0.19373922   0.16419882   0.40940941   Splitted         
#> 
#>         Leaf with 3 points.
#>                     min           max          bp           p_value   action    reason           
#>                 1   0.119648815   0.19685645          NaN   NaN                                  
#>                 2   0.073362799   1.00000000          NaN   NaN                                  
#>                 3   0.855720475   0.88115172   0.85660434     1       Removed   Independence test
#>                 4   0.115995484   0.13455778   0.12935286     1       Removed   Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.384349600   0.43274466   0.38452287   0.38238238   Removed   Close to boundary
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.208957488   0.43790324   0.43777821   1.00000000   Removed   Independence test
#>                 4   0.360741033   0.37693310   0.37690501   0.37537538   Removed   Close to boundary
#> 
#>         Leaf with 3 points.
#>                     min           max          bp           p_value      action     reason
#>                 1   0.384349600   0.57453765   0.53247665   0.58858859   Splitted         
#>                 2   0.073362799   1.00000000          NaN          NaN                    
#>                 3   0.208957488   0.41031179   0.36827940   0.53753754   Splitted         
#>                 4   0.572207608   0.78901871   0.62189266   0.52252252   Splitted         
#> 
#>         Leaf with 12 points.
#>                     min           max          bp           p_value      action     reason
#>                 1   0.384349600   0.57453765   0.45392145   0.52152152   Splitted         
#>                 2   0.073362799   1.00000000          NaN          NaN                    
#>                 3   0.410311786   0.61193415   0.54725983   0.17317317   Splitted         
#>                 4   0.376933100   0.57220761   0.50213797   0.21421421   Splitted         
#> 
#>         Leaf with 5 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.384349600   0.57453765   0.41791128   0.63563564   Dissmissed   No one-dim split 
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.410311786   0.61193415   0.52717035   0.79779780   Removed      Independence test
#>                 4   0.572207608   0.78901871   0.58706794   1.00000000   Removed      Independence test
#> 
#>         Leaf with 5 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.574537650   0.79306631   0.78561324   0.45745746   Removed   Close to boundary
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.208957488   0.41031179   0.40633893   0.58458458   Removed   Close to boundary
#>                 4   0.376933100   0.57220761   0.38812318   0.58658659   Removed   Close to boundary
#> 
#>         Leaf with 14 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.574537650   0.79306631   0.66905645   0.62262262   Dissmissed   No one-dim split 
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.208957488   0.41031179   0.32337244   0.95495495   Removed      Independence test
#>                 4   0.572207608   0.78901871   0.67361034   1.00000000   Removed      Independence test
#> 
#>         Leaf with 5 points.
#>                     min           max          bp           p_value     action       reason           
#>                 1   0.574537650   0.79306631   0.59204309   1.0000000   Removed      Independence test
#>                 2   0.073362799   1.00000000          NaN         NaN                                 
#>                 3   0.410311786   0.61193415   0.52113685   0.4974975   Dissmissed   No one-dim split 
#>                 4   0.376933100   0.57220761   0.55508101   0.7977978   Removed      Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value     action    reason           
#>                 1   0.574537650   0.79306631   0.73371970   1.0000000   Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN         NaN                              
#>                 3   0.410311786   0.61193415   0.44958932   0.2042042   Removed   Close to boundary
#>                 4   0.572207608   0.78901871   0.63185248   1.0000000   Removed   Independence test
#> 
#>         Leaf with 6 points.
#>                     min           max          bp           p_value       action     reason
#>                 1   0.384349600   0.57453765   0.55283435   0.028028028   Splitted         
#>                 2   0.073362799   1.00000000          NaN           NaN                    
#>                 3   0.208957488   0.41031179   0.26190578   0.294294294   Splitted         
#>                 4   0.376933100   0.57220761   0.54726179   0.049049049   Splitted         
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action     reason
#>                 1   0.384349600   0.79306631          NaN          NaN                    
#>                 2   0.073362799   1.00000000          NaN          NaN                    
#>                 3   0.611934152   0.66067260   0.62269454   0.60160160   Splitted         
#>                 4   0.376933100   0.45888203   0.39801108   0.65365365   Splitted         
#> 
#> 
#>      6 leaves to split...
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.119648815   0.19685645          NaN          NaN                              
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.801012875   0.82124427   0.80180241   0.20420420   Removed   Close to boundary
#>                 4   0.134557776   0.16419882   0.16412616   0.16716717   Removed   Close to boundary
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value   action    reason           
#>                 1   0.384349600   0.53247665   0.51740520     1       Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN   NaN                                  
#>                 3   0.208957488   0.36827940   0.30759802     1       Removed   Independence test
#>                 4   0.621892657   0.78901871   0.65174185     1       Removed   Independence test
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.384349600   0.45392145   0.42785285   0.86686687   Removed      Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.547259827   0.61193415   0.59646282   0.19019019   Dissmissed   No one-dim split 
#>                 4   0.376933100   0.50213797   0.47070495   0.85985986   Removed      Independence test
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.384349600   0.45392145   0.39801501   1.00000000   Removed   Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.547259827   0.61193415   0.55232574   1.00000000   Removed   Independence test
#>                 4   0.502137972   0.57220761   0.56711339   0.19219219   Removed   Close to boundary
#> 
#>         Leaf with 3 points.
#>                     min           max          bp           p_value      action       reason           
#>                 1   0.453921453   0.57453765   0.50248891   1.00000000   Removed      Independence test
#>                 2   0.073362799   1.00000000          NaN          NaN                                 
#>                 3   0.410311786   0.54725983   0.46270142   0.44444444   Dissmissed   No one-dim split 
#>                 4   0.502137972   0.57220761   0.53924121   0.84484484   Removed      Independence test
#> 
#>         Leaf with 4 points.
#>                     min           max          bp           p_value       action     reason           
#>                 1   0.384349600   0.55283435   0.50746186   0.422422422   Splitted                    
#>                 2   0.073362799   1.00000000          NaN           NaN                               
#>                 3   0.261905783   0.41031179   0.34766423   1.000000000   Removed    Independence test
#>                 4   0.376933100   0.54726179   0.50579039   0.057057057   Splitted                    
#> 
#> 
#>      2 leaves to split...
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action    reason           
#>                 1   0.507461859   0.55283435   0.50748350   0.74674675   Removed   Close to boundary
#>                 2   0.073362799   1.00000000          NaN          NaN                              
#>                 3   0.261905783   0.41031179          NaN          NaN                              
#>                 4   0.505790392   0.54726179   0.54724087   0.72672673   Removed   Close to boundary
#> 
#>         Leaf with 2 points.
#>                     min           max          bp           p_value      action     reason
#>                 1   0.384349600   0.50746186   0.48002519   0.51851852   Splitted         
#>                 2   0.073362799   1.00000000          NaN          NaN                    
#>                 3   0.261905783   0.41031179          NaN          NaN                    
#>                 4   0.376933100   0.50579039   0.47760360   0.54054054   Splitted         
#> 
#> 
#>      0 leaves to split...
#> Enforcing constraints...
#> -----------------------------------------------------------------
#>            OSQP v0.6.0  -  Operator Splitting QP Solver
#>               (c) Bartolomeo Stellato,  Goran Banjac
#>         University of Oxford  -  Stanford University 2019
#> -----------------------------------------------------------------
#> problem:  variables n = 152, constraints m = 287
#>           nnz(P) + nnz(A) = 13008
#> settings: linear system solver = qdldl,
#>           eps_abs = 1.0e-06, eps_rel = 1.0e-06,
#>           eps_prim_inf = 1.0e-06, eps_dual_inf = 1.0e-06,
#>           rho = 1.00e-01 (adaptive),
#>           sigma = 1.00e-06, alpha = 1.60, max_iter = 100000
#>           check_termination: on (interval 25),
#>           scaling: on, scaled_termination: off
#>           warm start: on, polish: on, time_limit: off
#> 
#> iter  objective    pri res    dua res    rho        time
#>    1  -8.7153e+01   2.85e-02   3.07e+04   1.00e-01   5.07e-03s
#>  200  -7.6454e+01   1.54e-04   2.74e-01   1.54e+00   2.37e-02s
#>  400  -7.6354e+01   6.17e-06   1.11e+00   4.61e+00   4.76e-02s
#>  450  -7.6346e+01   1.20e-07   8.57e-03   4.61e+00   5.18e-02s
#> plsh  -7.6346e+01   5.24e-16   2.74e-12  ---------   6.76e-02s
#> 
#> status:               solved
#> solution polish:      successful
#> number of iterations: 450
#> optimal objective:    -76.3461
#> run time:             6.76e-02s
#> optimal rho estimate: 9.95e-01
#> 
#> Done !
#> Cort copula model: 200x4-dataset and 152 leaves.

Looking at the top of the output, we see that the first thing the algorithm did was to remove the second dimension, based on the independence test. Now that the copula is fitted, we have access to many of its methods. Two plotting functions are exported with this model: the pairs function is implemented at a very low level in the class hierarchy and hence works with almost all copulas of this package, whereas the plot function is only implemented for Cort.

pairs(model)
Pairs-plot of original data (in black, bottom-left corner) versus a simulation from the model (in red, top-right corner)

plot(model)
Gray boxes representing 2-d projections of the fitted density. In red, the imputed data points.
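
Beyond plotting, a fitted model can also be queried numerically. The following is a minimal sketch, assuming that the rCopula, dCopula and pCopula generics exported by the cort package accept a fitted Cort object and a matrix of points. The small toy dataset and the names raw, toy and toy_model are stand-ins introduced here only to keep the example self-contained; they are not the Clayton dataset used above.

```r
library(cort)

set.seed(1)
# Hypothetical stand-in data: columns 1 and 3 are dependent, column 2 is independent.
x1  <- runif(100)
raw <- matrix(c(x1, runif(100), x1 + rnorm(100, sd = 0.1)), ncol = 3)

# Rank-transform to the copula scale, so that pseudo_data = TRUE is appropriate.
toy <- apply(raw, 2, rank) / (nrow(raw) + 1)

# Fit silently (verbose_lvl = 0 is assumed to suppress the log shown above).
toy_model <- Cort(toy, pseudo_data = TRUE, verbose_lvl = 0)

sim  <- rCopula(500, toy_model)          # simulate from the fitted copula
dens <- dCopula(sim[1:5, ], toy_model)   # density at a few simulated points
cdf  <- pCopula(sim[1:5, ], toy_model)   # distribution function at the same points
```

On the vignette's own dataset, the same calls would be applied to model instead of toy_model.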

We see some noise, with simulated points falling where there should be none. A bagged version of the model is available through the CortForest class, and might be able to correct these problems.
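
As a rough sketch of what fitting the bagged version could look like, the block below calls CortForest on a small toy dataset. It assumes that CortForest takes the n_trees, number_max_dim, pseudo_data and verbose_lvl arguments; the values 5 and 2 mirror n_trees and number_max_dim_forest defined in the Dataset section. The toy data and the names raw, toy, forest and sim_forest are hypothetical and only serve to keep the example self-contained.

```r
library(cort)

set.seed(1)
# Hypothetical stand-in data: columns 1 and 3 are dependent, column 2 is independent.
x1  <- runif(100)
raw <- matrix(c(x1, runif(100), x1 + rnorm(100, sd = 0.1)), ncol = 3)

# Rank-transform to the copula scale, so that pseudo_data = TRUE is appropriate.
toy <- apply(raw, 2, rank) / (nrow(raw) + 1)

# Fit a small forest; few trees and low-dimensional splits keep the runtime short.
forest <- CortForest(toy,
                     n_trees        = 5,
                     number_max_dim = 2,
                     pseudo_data    = TRUE,
                     verbose_lvl    = 0)

# Simulate from the bagged model, e.g. to compare against the single-tree fit.
sim_forest <- rCopula(500, forest)
```

With the vignette's dataset, data would replace toy, and pseudo_data could be left at its default if the input is not already on the copula scale.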