Introduction to sBIC

Purpose

This package allows you to compute the singular Bayesian information criterion (sBIC), as described in Drton and Plummer (2017), for collections of the following model types:

  1. Binomial mixtures
  2. Gaussian mixtures
  3. Latent class analysis
  4. Gaussian latent forests
  5. Reduced rank regression
  6. Factor analysis

All of these models, excluding Gaussian latent forests, are described in the above paper. For details regarding the use of the sBIC with Gaussian latent forests, see Drton et al. (2014).

Object oriented approach

This package makes extensive use of the R.oo package (Bengtsson 2003), which allows for the use of some object oriented principles in R. While not strictly necessary to use this package, it may be helpful to read sections 1 and 2 of Bengtsson (2003), which serve as an introduction to R.oo.

An important consequence of the use of R.oo is that objects in the sBIC package use call by reference semantics: calling a method on an object modifies that object in place rather than returning a modified copy.
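
To make the difference concrete, the following minimal sketch contrasts an ordinary R list, which is copied on assignment, with an R.oo object, which is not. The `Counter` class and its `increment` method are hypothetical and exist only for this illustration; they are not part of the sBIC package.

```r
library(R.oo)

# An ordinary R list has copy semantics: changing y leaves x untouched.
x <- list(count = 0)
y <- x
y$count <- 1
x$count  # still 0

# Hypothetical R.oo class, defined only for this illustration.
setConstructorS3("Counter", function(count = 0) {
  extend(Object(), "Counter", count = count)
})

# The method modifies the object it is called on; nothing is reassigned.
setMethodS3("increment", "Counter", function(this, ...) {
  this$count <- this$count + 1
  invisible(this)
})

a <- Counter()
b <- a        # b refers to the same object as a; no copy is made
increment(a)
b$count       # 1, because the change made through a is visible through b
```

The same applies to the model collection objects in this package: passing one to a function does not protect it from being modified by that function.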

Example

Each collection of models is defined as its own class. As an example of how to use the package, we will compute the sBIC for a collection of Gaussian mixture models with at most 8 components:

set.seed(123)

Create an object representing a collection of Gaussian mixture models with at most 8 components in 2 dimensions.

library(sBIC)
gms = GaussianMixtures(maxNumComponents = 8, dim = 2, restarts = 100)

Generate some simulated data, a mixture of 3 bivariate normals.

library(MASS)
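n = 175
# Assign each observation to one of the 3 components uniformly at random
class = sample(0:2, n, replace = TRUE)
# Draw n points from every component, then keep for row i only the draw
# from component class[i] (the logical vectors act as 0/1 row masks)
X = (class == 0) * mvrnorm(n, mu = c(0, 0), Sigma = diag(2)) +
    (class == 1) * mvrnorm(n, mu = c(2.5, 2.5), Sigma = diag(1.3, 2)) +
    (class == 2) * mvrnorm(n, mu = c(-3, 2.5), Sigma = diag(1.2, 2))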
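
Because all three covariance matrices are diagonal with equal variances in both coordinates, the same mixture can also be simulated with base R alone. The sketch below is an alternative formulation, not the code used to produce the results in this document: it draws from the same distribution but yields different random numbers, and the names `means`, `vars`, `noise` and `X_alt` are introduced here only for illustration.

```r
set.seed(123)
n <- 175
class <- sample(0:2, n, replace = TRUE)

# Component means (one row per component) and per-component variances,
# taken from the mvrnorm() calls above.
means <- rbind(c(0, 0), c(2.5, 2.5), c(-3, 2.5))
vars <- c(1, 1.3, 1.2)

# Standard normal noise; row i is scaled by the standard deviation of the
# component that observation i belongs to, then shifted by that component's mean.
noise <- matrix(rnorm(2 * n), nrow = n, ncol = 2)
X_alt <- means[class + 1, ] + noise * sqrt(vars[class + 1])

dim(X_alt)    # 175 rows, 2 columns
table(class)  # number of observations drawn from each component
```

This version draws each observation only once, whereas the masking approach draws n points per component and discards two thirds of them.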

Compute the sBIC on the mixture models with the randomly generated data.

sBIC(X, gms)
## $logLike
## [1] -732.9610 -697.5850 -683.9564 -676.2722 -668.5807 -661.2474 -653.9223
## [8] -646.2799
## 
## $sBIC
## [1] -747.6058 -729.8036 -727.4262 -728.7474 -729.9027 -731.3697 -732.8346
## [8] -733.9800
## 
## $BIC
## [1] -747.6058 -729.8037 -733.7488 -743.6385 -753.5207 -763.7613 -774.0100
## [8] -783.9413
## 
## $modelPoset
## [1] "GaussianMixtures: 0x7fde1cf43c98"

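Notice that the BIC too strongly penalizes the (true) model with 3 components: the BIC is largest for the model with 2 components, whereas the sBIC is largest for the model with 3 components.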
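
The selected model can also be read off programmatically. The sketch below hard-codes the `$sBIC` and `$BIC` vectors printed above so that it runs without refitting the models, and it assumes that the i-th entry of each vector corresponds to the mixture with i components. In practice one would store the result, for example `results = sBIC(X, gms)`, and use `results$sBIC` and `results$BIC` directly.

```r
# Values copied from the printed output above.
sbic <- c(-747.6058, -729.8036, -727.4262, -728.7474,
          -729.9027, -731.3697, -732.8346, -733.9800)
bic <- c(-747.6058, -729.8037, -733.7488, -743.6385,
         -753.5207, -763.7613, -774.0100, -783.9413)

# Both criteria are on the log scale, so larger is better.
# Assumption: entry i corresponds to the model with i components.
which.max(sbic)  # 3
which.max(bic)   # 2
```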

References