Introduction to bain

Herbert Hoijtink, Caspar van Lissa, Joris Mulder, Marlyne Bosman, Camiel van Zundert, Xin Gu

2019-09-26

Introduction

bain is an acronym for “Bayesian informative hypotheses evaluation”. It uses the Bayes factor to evaluate hypotheses specified using equality and inequality constraints among (linear combinations of) parameters in a wide range of statistical models.

A tutorial is provided by Hoijtink, Mulder, van Lissa, and Gu (2019) retrievable from the Psychological Methods website at https://www.apa.org/pubs/journals/met/ or the bain website at https://informative-hypotheses.sites.uu.nl/software/bain/

Users are advised to read the tutorial and this vignette before using bain.

The full bain reference list (all retrievable from the bain website) is

Bosman, M. and Hoijtink, H. (unpublished). Robust Bayes factors for Bayesian ANOVA: overcoming overcoming adverse effect of non-normality and outliers.

Gu, X., Mulder, J., and Hoijtink, H. (2018). Approximate adjusted fractional Bayes factors: A general method for testing informative hypotheses. British Journal of Mathematical and Statistical Psychology, 71, 229-261. DOI: 10.1111/bmsp.12110

Gu, X., Hoijtink, H., Mulder, J., and Rosseel, Y. (2019). Bain: A program for Bayesian testing of order constrained hypotheses in structural equation models. Journal of Statistical Computation and Simulation. https://doi.org/10.1080/00949655.2019.1590574

Hoijtink, H., Mulder, J., van Lissa, C., and Gu, X. (2019). A tutorial on testing hypotheses using the Bayes factor. Psychological Methods. DOI: 10.1037/met0000201

Hoijtink, H., Gu, X., and Mulder, J. (2019). Bayesian evaluation of informative hypotheses for multiple populations. British Journal of Mathematical and Statistical Psychology, 72, 219-243. DOI: 10.1111/bmsp.12145

Hoijtink, H., Gu, X., Mulder, J., and Rosseel, Y. (2019). Computing Bayes Factors from Data with Missing Values. Psychological Methods, 24, 253-268. DOI: 10.1037/met0000187

Van Lissa, C.J., Gu, X., Mulder, J., Rosseel, Y., van Zundert, C, and Hoijtink, H. (unpublished). Teacher’s corner: Evaluating informative hypotheses using the Bayes factor in structural equation models.

Usage

bain(x, hypothesis, fraction = 1, ...)

Arguments

x An R object containing the outcome of a statistical analysis. Currently, the following objects can be processed:

  1. t_test() objects (Student’s t-test, Welch’s t-test, paired samples t-test, one-group t-test, equivalence test). Note that, t_test can be used in the same way as t.test.
  2. lm() objects (ANOVA, ANCOVA, multiple regression).
  3. lavaan objects generated with the sem(), cfa(), and growth functions.
  4. A named vector containing the estimates resulting from a statistical analysis. Note that, named means that each estimate has to be named such that it can be referred to in hypotheses.

hypothesis A character string containing the informative hypotheses to evaluate (see below the section on “The Specification of Hypotheses”).

fraction = 1 A number representing the fraction of information in the data used to construct the prior distribution (see Hoijtink, Mulder, van Lissa, and Gu, 2019): The default value 1 denotes the minimal fraction, 2 denotes twice the minimal fraction, etc.

... Additional arguments (see below).

Using bain with a lm or t_test object

The following steps need to be executed:

  1. x <- lm() or x <- t_test(). Execute an analysis with lm or t_test. See the Examples section below for a complete elaboration of the calls to lm and t_test that can be processed by bain. Note that, lm and t_test will apply list-wise deletion if there are cases with missing values in the variables used.
  2. Display the estimates and their names using the command coef(x). (Unique abbreviations of) the names will be used to specify hypotheses.
  3. set.seed(seed). Set seed equal to an integer number to create a repeatable random number sequence. It is recommended to run analyses with two different seeds to ensure stability of the results.
  4. results <- bain(x,hypotheses,fraction = 1) or results <- bain(x,hypotheses,fraction = 1,standardize = FALSE). The first call to bain is used in case of lm implementations of ANOVA, ANCOVA, and t_test. The second call to bain is used in case of lm implementations of multiple regression. With standardize = TRUE hypotheses with respect to standardized regression coefficients are evaluated. With standardize = FALSE hypotheses with respect to unstandardized regression coefficients are evaluated.
  5. print(results) Print the results of an analysis with bain.
  6. summary(results, ci=0.95) Present descriptives for the parameters used to specify hypotheses. ci can be used to specify the confidence level of the credibility intervals.

Using bain with a lavaan object

The following steps need to be executed:

  1. x <- sem() or x <- cfa() or x <- growth(). Execute an analysis with the sem, cfa, or growth functions implemented in lavaan (http://lavaan.org/) . Note that, by default, lavaan will apply list-wise deletion if there are cases with missing values in the variables used. However, other options are available.
  2. Specify a lavaan model using the model <- ... command. In case of multiple group models, only models without between group restrictions can be processed by bain with a lavaan object as input.
  3. Display the estimates and their names using the command coef(x). Only parameters who’s names contain ~ (regression coefficients), ~1 (intercepts), or =~ (factor loadings) can be used in the specification of hypotheses. (Unique abbreviations of) the names can be used to specify hypotheses. For multiple group analyses the names have to end with a group label .grp. Group labels can be assigned using commands like sesamesim$sex <- factor(sesamesim$sex, labels = c("boy", "girl")). If in a lavaan model specification parameters are labeled, e.g., as in model <- 'age ~ c(a1, a2)*peabody + c(b1, b2)*1 then the labels have to be used in the specification of hypotheses.
  4. set.seed(seed). Set seed equal to an integer number to create a repeatable random number sequence. It is recommended to run analyses with two different seeds to ensure stability of the results.
  5. results <- bain(x,hypotheses,fraction = 1,standardize = FALSE). With standardize = TRUE hypotheses with respect to standardized coefficients are evaluated. With standardize = FALSE hypotheses with respect to unstandardized coefficients are evaluated.
  6. print(results) Print the results of an analysis with bain.
  7. summary(results, ci=0.95) Present descriptives for the parameters used to specify hypotheses. ci can be used to specify the confidence level of the credibility intervals.

Using bain with a named vector

The following steps need to be executed:

  1. Execute a statistical analysis.

In case of a single group analysis, the following information has to be extracted from the statistical analysis and supplied to bain:

  1. A vector containing estimates of the parameters used to specify hypotheses;
  2. A list containing the covariance matrix of these parameters; and,
  3. The sample size used for estimation of the parameters. Note that, due to missing values this sample size may be smaller than the total sample size.

In case of a multiple group analysis, the following information has to be extracted from the statistical analysis and supplied to bain:

  1. A vector containing estimates of the group specific parameters possibly augmented with the estimates of parameters that are joined by the groups. The structure of this vector is [parameters of group 1, parameters of group 2, …, the parameters that are joined by the groups];
  2. A list containing, per group, the covariance matrix of the parameters corresponding to the group at hand and, possibly, the augmented parameters. In the rows and columns of each covariance matrix the parameters of the group at hand come first, possibly followed by the augmented parameters.
  3. Per group the sample size used for estimation of the parameters. Note that, due to missing values this sample size may be smaller than the total sample size per group.
  1. Assign names to the estimates using names(estimates)<-names. Note that, names is a character vector containing new names for the estimates in estimates. Each name has to start with a letter, and may consist of “letters”, “numbers”, “.”, “_”, “:”, “~”, “=~”, and “~1”. These names are used to specify hypotheses (see below). An example is names <- c("a", "b", "c").
  2. set.seed(seed). Set seed equal to an integer number to create a repeatable random number sequence. It is recommended to run analyses with two different seeds to ensure stability of the results.
  3. results <- bain(estimates, hypotheses, n=., Sigma=., group_parameters = 2, joint_parameters = 0, fraction = 1) executes bain with the following arguments:
  1. estimates A named vector with parameter estimates.
  2. hypotheses A character string containing the informative hypotheses to evaluate (the specification is elaborated below).
  3. n A vector containing the sample size of each group in the analysis. See, Hoijtink, Gu, and Mulder (2019), for an elaboration of the difference between one and multiple group analyses. A multiple group analysis is required when group specific parameters are used to formulate hypotheses. Examples are the Student’s and Welch’s t-test, ANOVA, and ANCOVA. See the Examples section for elaborations of the specification of multiple group analyses when a named vector is input for bain.
  4. Sigma A list of covariance matrices. In case of one group analyses the list contains one covariance matrix. In case of multiple group analyses the list contains one covariance matrix for each group. See the Examples section and Hoijtink, Gu, and Mulder (2019) for further instructions.
  5. group_parameters The number of group specific parameters. In, for example, an ANOVA with three groups, estimates will contain three sample means, group_parameters = 1 because each group is characterized by one mean, and joint_parameters = 0 because there are no parameters that apply to each of the groups. In, for example, an ANCOVA with three groups and two covariates, estimates will contain five parameters (first the three adjusted means, followed by the regression coefficients of the two covariates), group_parameters = 1 because each group is characterized by one adjusted mean, and joint_parameters = 2 because there are two regression coefficients that apply to each group. In, for example, a repeated measures design with four repeated measures and two groups (a between factor with two levels and a within factor with four levels) estimates will contain eight means (first the four for group 1, followed by the four for group 2), group_parameters = 4 because each group is characterized by four means and joint_parameters = 0 because there are no parameters that apply to each of the groups.
  6. joint_parameters In case of one group joint_parameters = 0. In case of two or more groups, the number of parameters in estimates shared by the groups. In, for example, an ANCOVA, the number of joint_parameters equals the number of covariates.
  7. fraction = 1 A number representing the fraction of information in the data used to construct the prior distribution (see Hoijtink, Mulder, van Lissa, and Gu, 2019): The default value 1 denotes the minimal fraction, 2 denotes twice the minimal fraction, etc.
  1. print(results) Print the results of an analysis with bain.
  2. summary(results, ci=0.95) Present descriptives for the parameters used to specify hypotheses. ci can be used the specify the confidence level of the credibility intervals.

The specification of hypotheses

hypotheses is a character string that specifies which informative hypotheses have to be evaluated. A simple example is hypotheses <- "a > b > c; a = b = c;" which specifies two hypotheses using three estimates with names “a”, “b”, and “c”, respectively.

The hypotheses specified have to adhere to the following rules bain may still run if you deviate from the rules, however, the output will be nonsense:

  1. Each parameter name is used at most once.
  2. Each parameter name may or may not be pre-multiplied with a number.
  3. A constant may be added or subtracted from each parameter name.
  4. A linear combination can also be a single number.

    Examples are: 3 * a + 5; a + 2 * b + 3 * c - 2; a - b; and 5.

The set of hypotheses has to be compatible. For the statistical background of this requirement see Gu, Mulder, Hoijtink (2019). Usually the sets of hypotheses specified by researchers are compatible, and if not, bain will return an error message. The following steps can be used to determine if a set of hypotheses is compatible:

  1. Replace a range constraint, e.g., 1 < a1 < 3, by an equality constraint in which the parameter involved is equated to the midpoint of the range, that is, a1 = 2.
  2. Replace in each hypothesis the < and > by =. For example, a1 = a2 > a3 > a4 becomes a1 = a2 = a3 = a4.
  3. The hypotheses are compatible if there is at least one solution to the resulting set of equations. For the two hypotheses considered under 1. and 2., the solution is a1 = a2 = a3 = a4 = 2. An example of two non-compatible hypotheses is hypotheses <- "a = 0; a > 2;" because there is no solution to the equations a=0 and a=2.

Each hypothesis in a set of hypotheses has to be non-redundant. A hypothesis is redundant if it can also be specified with fewer constraints. For example, a = b & a > 0 & b > 0 is redundant because it can also be specified as a = b & a > 0. bain will work correctly if hypotheses specified using only < and > are redundant. bain will return an error message if hypotheses specified using at least one = are redundant.

Each hypothesis in a set of hypotheses has to be possible. An hypothesis is impossible if estimates in agreement with the hypothesis do not exist. For example: values for a in agreement with a = 0 & a > 2 do not exist. It is the responsibility of the user to ensure that the hypotheses specified are possible. If not, bain will either return an error message or render an output table containing Inf’s.

Output from an analysis with bain

The commands bain() or results<-bain() followed by results or print(results) will render the default (most important) output from bain. These concern for each hypothesis specified in hypotheses the fit, complexity, Bayes factor versus its complement, posterior model probability (based on equal prior model probabilities) excluding the unconstrained hypothesis, and posterior model probability including the unconstrained hypothesis. In Hoijtink, Mulder, van Lissa, and Gu (2019) it is elaborated how these quantities (and the other output presented below) should be interpreted. Additionally, using summary(results, ci=0.95), a descriptives matrix can be obtained in which for each estimate, the name, the value, and a 95% central credibility interval is presented.

The following commands can be used to retrieve the default and additional information from the bain output object:

Examples

Note that, each of the examples given below can be run by copy pasting them into the Source screen of RStudio.

Unless indicated otherwise, the examples that follow below use a simulated data set inspired by the Sesame Street data set from: Stevens, J. P. (1996). Applied Multivariate Statistics for the Social Sciences. Mahwah NJ: Lawrence Erlbaum. This data set is included in the bain package. The variables contained in sesamesim are subsequently:

The examples that follow below are organized in three categories:

  1. Running bain with a t_test object;
  2. running bain with a lm object; and,
  3. running bain with a named vector.

The ANOVA and ANCOVA examples are provided using both a lm object and a named vector as input for bain. Below you will find the following examples:

1. Examples Using a t_test Object

If a t_test object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class t_test” (which denotes that a t-test was executed).

  1. Bayesian Student’s t-test (equal within group variances)
  2. Bayesian Welch’s t-test (unequal within group variances
  3. Bayesian paired samples t-test
  4. Bayesian one group t-test
  5. Bayesian Equivalence test

2. Examples Using a lm Object

  1. Bayesian ANOVA. The example concerns a one-way ANOVA. Two-way or higher order ANOVAs can only be handled by recoding all factors into one factor. If, for example, there is a factor sex with levels man and woman, and a factor age with levels young and old, these have to be recoded in a new factor sexage with levels manyoung, manold, womanyoung, womanold. If a lm “ANOVA”object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class lm (ANOVA)”
  2. Bayesian ANCOVA. The example concerns a one-way ANCOVA. Two-way or higher order ANCOVAs can only be handled by recoding all factors into one factor. Note that calls to lm using functions of the predictors (for example, adding squared predictors to the model using commands like y ~ x + I(x^2)) can not be processed by bain. However, one can compute a new variable (for example, a squared predictor) and add this variable to the model specification of lm. If a lm “ANCOVA”object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class lm (ANCOVA)”
  3. Bayesian multiple regression. Note that calls to lm using functions of the predictors (for example, adding squared predictors to the model using commands like y ~ x + I(x^2)) can not be processed by bain. However, one can compute a new variable (for example, a squared predictor) and add this variable to the model specification of lm. Note furthermore, that only if standardize = FALSE interactions between predictors can be processed. If a lm “regression with only continuous predictors” object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class lm (continuous predictors)”. If a lm object contains two or more factors and, optionally, continuous predictors, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class lm (mixed predictors)”. In case of mixed predictors, the one group approach to computing Bayes factors is used (see, Hoijtink, Gu, and Mulder, 2019). This may render inferior results if group sizes are unequal.
  4. Bayesian ANOVA. Sensitivity analysis. See Hoijtink, Mulder, van Lissa, and Gu (2019) for elaborations.

3. Examples Using a lavaan Object

If a lavaan object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class lavaan”.

  1. Bayesian confirmatory factor analysis
  2. Bayesian latent regression
  3. Bayesian multiple group latent regression. Note that, multiple group models with between group restrictions cannot be processed.

Additional examples can be found in Gu, Hoijtink, Mulder, and Rosseel (2019).

4. Examples Using a Named Vector

If a named vector object is used, the main output table is labeled: “Bayesian informative hypothesis testing for an object of class numeric” (which denotes that named vector input was used).

  1. Bayesian ANOVA
  2. Bayesian ANCOVA
  3. Bayesian repeated measures analysis (one within factor)
  4. Bayesian repeated measures analysis (within between design)
  5. Bayesian one group logistic regression (counterpart of multiple regression)
  6. Bayesian multiple group logistic regression (counterpart of ANCOVA)
  7. Bayesian robust ANOVA (unequal within groups variances)
  8. Bayesian multiple regression with missing data
  9. Bayesian confirmatory factor analysis

Additional examples are available in BFTutorial.R downloadable from the bain website at https://informative-hypotheses.sites.uu.nl/software/bain/.

1. Examples Using a t_test Object

  1. An example of the Bayesian Student’s t-test (equal within group variances):
# load the bain package which includes the simulated sesamesim data set
library(bain)
# collect the data for the boys in the vector x and for the girls in the
# vector y
x<-sesamesim$postnumb[which(sesamesim$sex==1)]
y<-sesamesim$postnumb[which(sesamesim$sex==2)]
# execute student's t-test
ttest <- t_test(x,y,paired = FALSE, var.equal = TRUE)
# Check the names of the estimates
coef(ttest)
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(ttest, "x = y; x > y; x < y")
# display the results
results
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian Welch’s t-test (unequal within group variances):
# load the bain package which includes the simulated sesamesim data set
library(bain)
# collect the data for the boys in the vector x and for the girs in the
# vector y
x<-sesamesim$postnumb[which(sesamesim$sex==1)]
y<-sesamesim$postnumb[which(sesamesim$sex==2)]
# execute student's t-test
ttest <- t_test(x,y,paired = FALSE, var.equal = FALSE)
# Check the names of the coefficients
coef(ttest)
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(ttest, "x = y; x > y; x < y")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian paired samples t-test:
# load the bain package which includes the simulated sesamesim data set
library(bain)
# compare the pre with the post measurements
ttest <- t_test(sesamesim$prenumb,sesamesim$postnumb,paired = TRUE)
# Check name of the coefficient
coef(ttest)
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(ttest, "difference=0; difference>0; difference<0")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian one group t-test:
# load the bain package which includes the simulated sesamesim data se
library(bain)
# compare post measurements with the reference value 30
ttest <- t_test(sesamesim$postnumb)
# Check name of estimate
coef(ttest)
# set a seed value
set.seed(100)
# test hypotheses with bain versus the reference value 30
results <- bain(ttest, "x=30; x>30; x<30")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian equivalence test
# load the bain package which includes the simulated sesamesim data set
library(bain)
# collect the data for the boys in the vector x and for the girs in the
# vector y
x<-sesamesim$postnumb[which(sesamesim$sex==1)]
y<-sesamesim$postnumb[which(sesamesim$sex==2)]
# execute student's t-test
ttest <- t_test(x,y,paired = FALSE, var.equal = TRUE)
# Check the names of the estimates
coef(ttest)
# compute the pooled within standard deviation using the variance of x
# (ttest$v[1]) and y (ttest$v[2])
pwsd <- sqrt(((length(x) -1) * ttest$v[1] + (length(y)-1) * ttest$v[2])/
((length(x) -1) + (length(y) -1)))
# print pwsd in order to be able to include it in the hypothesis. Its value
# is 12.60
print(pwsd)
# set a seed value
set.seed(100)
# test hypotheses (the means of boy and girl differ less than .2 * pwsd =
# 2.52 VERSUS the means differ more than .2 * pwsd = 2.52) with bain
# note that, .2 is a value for Cohen's d reflecting a "small" effect, that
# is, the means differ less or more than .2 pwsd.
results <- bain(ttest, "x - y > -2.52 & x - y < 2.52")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)

2. Examples Using a lm Object

  1. An example of the Bayesian ANOVA
# load the bain package which includes the simulated sesamesim data set
library(bain)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# execute an analysis of variance using lm() which, due to the -1, returns
# estimates of the means per group
anov <- lm(postnumb~site-1,sesamesim)
# take a look at the estimated means and their names
coef(anov)
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(anov, "site1=site2=site3=site4=site5; site2>site5>site1>
site3>site4")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian ANCOVA
# load the bain package which includes the simulated sesamesim data se
library(bain)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# Center the covariate. If centered the coef() command below displays the
# adjusted means. If not centered the intercepts are displayed.
sesamesim$prenumb <- sesamesim$prenumb - mean(sesamesim$prenumb)
# execute an analysis of covariance using lm() which, due to the -1, returns
# estimates of the adjusted means per group
ancov <- lm(postnumb~site+prenumb-1,sesamesim)
# take a look at the estimated adjusted means, the regression coefficient
# of the covariate and their names
coef(ancov)
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(ancov, "site1=site2=site3=site4=site5;
                        site2>site5>site1>site3>site4")
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian multiple regression
# load the bain package which includes the simulated sesamesim data set
library(bain)
# execute a multiple regression using lm()
regr <- lm(postnumb ~ age + peabody + prenumb,sesamesim)
# take a look at the estimated regression coefficients and their names
coef(regr)
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that standardize = FALSE denotes that the
# hypotheses are in terms of unstandardized regression coefficients
results<-bain(regr, "age = 0 & peab=0 & pre=0 ; age > 0 & peab > 0 & pre > 0"
, standardize = FALSE)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
# Since it is only meaningful to compare regression coefficients if they are
# measured on the same scale, bain can also evaluate standardized regression
# coefficients (based on the seBeta function by Jeff Jones and Niels Waller):
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that standardize = TRUE denotes that the
# hypotheses are in terms of standardized regression coefficients
results<-bain(regr, "age = peab = pre ; pre > age > peab",standardize = TRUE)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian ANOVA: Sensitivity Analysis
# load the bain package which includes the simulated sesamesim data set
library(bain)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# execute an analysis of variance using lm() which, due to the -1, returns
# estimates of the means per group
anov <- lm(postnumb~site-1,sesamesim)
# take a look at the estimated means and their names
coef(anov)
# set a seed value
set.seed(100)
# test hypotheses with bain
results1 <- bain(anov, "site1=site2=site3=site4=site5; site2>site5>site1>
site3>site4", fraction = 1)
# display the results
print(results1)
# obtain the descriptives table
summary(results1, ci = 0.95)
# execute a sensitivity analysis. See Hoijtink, Mulder, 
# van Lissa, and Gu (2019) for elaborations.
set.seed(100)
results2 <- bain(anov, "site1=site2=site3=site4=site5; site2>site5>site1>
site3>site4",fraction = 2)
print(results2)
set.seed(100)
results3 <- bain(anov, "site1=site2=site3=site4=site5; site2>site5>site1>
site3>site4",fraction = 3)
print(results3)

3. Examples Using a lavaan object

  1. An example of Bayesian confirmatory factor analysis. See van Lissa et al. (unpublished) for further elaborations
# Load the bain and lavaan libraries. Visit www.lavaan.org for 
# lavaan mini-tutorials, examples, and elaborations
library(bain)
library(lavaan)

#  Specify and fit the confirmatory factor model
model1 <- '
    A =~ Ab + Al + Af + An + Ar + Ac 
    B =~ Bb + Bl + Bf + Bn + Br + Bc 
'
# Use the lavaan sem function to execute the confirmatory factor analysis
fit1 <- sem(model1, data = sesamesim, std.lv = TRUE)

# Inspect the parameter names
coef(fit1)

# Formulate hypotheses, call bain, obtain summary stats
hypotheses1 <-
" A=~Ab > .6 & A=~Al > .6 & A=~Af > .6 & A=~An > .6 & A=~Ar > .6 & A=~Ac >.6 & 
B=~Bb > .6 & B=~Bl > .6 & B=~Bf > .6 & B=~Bn > .6 & B=~Br > .6 & B=~Bc >.6"
set.seed(100)
y <- bain(fit1,hypotheses1,fraction=1,standardize=TRUE)
sy <- summary(y, ci = 0.95)
  1. An example of Bayesian latent regression. See van Lissa et al. (unpublished) for further elaborations.
# Load the bain and lavaan libraries. Visit www.lavaan.org for 
# lavaan mini-tutorials, examples, and elaborations
library(bain)
library(lavaan)

# Specify and fit the latent regression model
model2 <- '
    A  =~ Ab + Al + Af + An + Ar + Ac 
    B =~ Bb + Bl + Bf + Bn + Br + Bc

    A ~ B + age + peabody
'
fit2 <- sem(model2, data = sesamesim, std.lv = TRUE)

# Inspect the parameter names
coef(fit2)

# Formulate hypotheses, call bain, obtain summary stats

hypotheses2 <- "A~B > A~peabody = A~age = 0; 
               A~B > A~peabody > A~age = 0; 
A~B > A~peabody > A~age > 0"
set.seed(100)
y1 <- bain(fit2, hypotheses2, fraction = 1, standardize = TRUE)
sy1 <- summary(y1, ci = 0.99)
  1. An example of Bayesian multiple group latent regression. See van Lissa et al. (unpublished) for further elaborations.
# Load the bain and lavaan libraries. Visit www.lavaan.org for 
# lavaan mini-tutorials, examples, and elaborations
library(bain)
library(lavaan)

# Specify the multiple group latent regression model

model3 <- '
    A  =~ Ab + Al + Af + An + Ar + Ac 
    B =~ Bb + Bl + Bf + Bn + Br + Bc 

    A ~ B + age + peabody
'
# Assign labels to the groups to be used when formulating
# hypotheses
sesamesim$sex <- factor(sesamesim$sex, labels = c("boy", "girl"))
# Fit the multiple group latent regression model
fit3 <- sem(model3, data = sesamesim, std.lv = TRUE, group = "sex")
# Inspect the parameter names
coef(fit3)

# Formulate and evaluate two sets of hypotheses

# Compute the mean of the intercepts in both groups for both factors
parest <- parameterEstimates(fit3, standardize = TRUE)
ints <- parest[ parest$op == "~1", "std.all"]
intAboy <- ints[1:6]
intBboy <- ints[7:12]
intAgirl <- ints[17:22]
intBgirl <- ints[23:28]
iAb <- mean(intAboy)
iBb <- mean(intBboy)
iAg <- mean(intAgirl)
iBg <- mean(intBgirl)

# Print the means so you can include these numbers
# in the hypotheses in order to make the intercepts
# comparable between both groups
iAb
iBb
iAg
iBg

# Compute the geometric mean of the loadings in both groups for both factors
load <- parest[ parest$op == "=~", "std.all"]
loadAboy <- load[1:6]
loadBboy <- load[7:12]
loadAgirl <- load[13:18]
loadBgirl <- load[19:24]
lAb <- prod(loadAboy)**(1/6)
lBb <- prod(loadBboy)**(1/6)
lAg <- prod(loadAgirl)**(1/6)
lBg <- prod(loadBgirl)**(1/6)

# Print the inverse geometric means so you can include these numbers
# in the hypotheses in order to make the slopes comparable
# between both groups. Note: to specify hypotheses, 
# the * can be used, but the / CANNOT be used.
1/lAb
1/lBb
1/lAg
1/lBg

# Specify and evaluate the two sets of hypotheses
hypotheses31 <-
"1.214269 *A=~Ab.boy = 1.276502 *A=~Ab.girl&
1.214269 *A=~Al.boy = 1.276502 *A=~Al.girl &
1.214269 *A=~Af.boy = 1.276502 *A=~Af.girl &
1.214269 *A=~An.boy = 1.276502 *A=~An.girl &
1.214269 *A=~Ar.boy = 1.276502 *A=~Ar.girl &
1.214269 *A=~Ac.boy = 1.276502 *A=~Ac.girl &
1.26346 *B=~Bb.boy = 1.32143 *B=~Bb.girl &
1.26346 *B=~Bl.boy = 1.32143 *B=~Bl.girl &
1.26346 *B=~Bf.boy = 1.32143 *B=~Bf.girl &
1.26346 *B=~Bn.boy = 1.32143 *B=~Bn.girl &
1.26346 *B=~Br.boy = 1.32143 *B=~Br.girl &
1.26346 *B=~Bc.boy = 1.32143 *B=~Bc.girl &
Ab~1.boy -3.20752= Ab~1.girl -3.37765&
Al~1.boy -3.20752= Al~1.girl -3.37765&
Af~1.boy -3.20752= Af~1.girl -3.37765&
An~1.boy -3.20752= An~1.girl -3.37765&
Ar~1.boy -3.20752= Ar~1.girl -3.37765&
Ac~1.boy -3.20752= Ac~1.girl -3.37765&
Bb~1.boy -2.682939= Bb~1.girl -2.620678&
Bl~1.boy -2.682939= Bl~1.girl -2.620678&
Bf~1.boy -2.682939= Bf~1.girl -2.620678&
Bn~1.boy -2.682939= Bn~1.girl -2.620678&
Br~1.boy -2.682939= Br~1.girl -2.620678&
Bc~1.boy -2.682939= Bc~1.girl -2.620678
"
set.seed(100)
y1 <- bain(fit3, hypotheses31, standardize = TRUE)
sy1 <- summary(y1, ci = 0.90)

hypotheses32 <-
"1.214269 *A=~Ab.boy = 1.276502 *A=~Ab.girl &
1.214269 *A=~Al.boy = 1.276502 *A=~Al.girl &
1.214269 *A=~Af.boy = 1.276502 *A=~Af.girl &
1.214269 *A=~An.boy = 1.276502 *A=~An.girl &
1.214269 *A=~Ar.boy = 1.276502 *A=~Ar.girl &
1.214269 *A=~Ac.boy = 1.276502 *A=~Ac.girl &
1.26346 *B=~Bb.boy = 1.32143 *B=~Bb.girl &
1.26346 *B=~Bl.boy = 1.32143 *B=~Bl.girl &
1.26346 *B=~Bf.boy = 1.32143 *B=~Bf.girl &
1.26346 *B=~Bn.boy = 1.32143 *B=~Bn.girl &
1.26346 *B=~Br.boy = 1.32143 *B=~Br.girl &
1.26346 *B=~Bc.boy = 1.32143 *B=~Bc.girl &
Ab~1.boy -3.20752= Ab~1.girl -3.37765&
Al~1.boy -3.20752= Al~1.girl -3.37765&
Af~1.boy -3.20752= Af~1.girl -3.37765&
An~1.boy -3.20752= An~1.girl -3.37765&
Ar~1.boy -3.20752= Ar~1.girl -3.37765&
Ac~1.boy -3.20752= Ac~1.girl -3.37765&
Bb~1.boy -2.682939= Bb~1.girl -2.620678&
Bl~1.boy -2.682939= Bl~1.girl -2.620678&
Bf~1.boy -2.682939= Bf~1.girl -2.620678&
Bn~1.boy -2.682939= Bn~1.girl -2.620678&
Br~1.boy -2.682939= Br~1.girl -2.620678&
Bc~1.boy -2.682939= Bc~1.girl -2.620678 &
A~age.boy < A~age.girl  &
A~peabody.boy < A~peabody.girl  &
A~B.boy < A~B.girl;
1.214269 *A=~Ab.boy = 1.276502 *A=~Ab.girl &
1.214269 *A=~Al.boy = 1.276502 *A=~Al.girl &
1.214269 *A=~Af.boy = 1.276502 *A=~Af.girl &
1.214269 *A=~An.boy = 1.276502 *A=~An.girl &
1.214269 *A=~Ar.boy = 1.276502 *A=~Ar.girl &
1.214269 *A=~Ac.boy = 1.276502 *A=~Ac.girl &
1.26346 *B=~Bb.boy = 1.32143 *B=~Bb.girl &
1.26346 *B=~Bl.boy = 1.32143 *B=~Bl.girl &
1.26346 *B=~Bf.boy = 1.32143 *B=~Bf.girl &
1.26346 *B=~Bn.boy = 1.32143 *B=~Bn.girl &
1.26346 *B=~Br.boy = 1.32143 *B=~Br.girl &
1.26346 *B=~Bc.boy = 1.32143 *B=~Bc.girl &
Ab~1.boy -3.20752= Ab~1.girl -3.37765&
Al~1.boy -3.20752= Al~1.girl -3.37765&
Af~1.boy -3.20752= Af~1.girl -3.37765&
An~1.boy -3.20752= An~1.girl -3.37765&
Ar~1.boy -3.20752= Ar~1.girl -3.37765&
Ac~1.boy -3.20752= Ac~1.girl -3.37765&
Bb~1.boy -2.682939= Bb~1.girl -2.620678&
Bl~1.boy -2.682939= Bl~1.girl -2.620678&
Bf~1.boy -2.682939= Bf~1.girl -2.620678&
Bn~1.boy -2.682939= Bn~1.girl -2.620678&
Br~1.boy -2.682939= Br~1.girl -2.620678&
Bc~1.boy -2.682939= Bc~1.girl -2.620678 &
A~age.boy = A~age.girl  &
A~peabody.boy < A~peabody.girl  &
A~B.boy < A~B.girl;
1.214269 *A=~Ab.boy = 1.276502 *A=~Ab.girl &
1.214269 *A=~Al.boy = 1.276502 *A=~Al.girl &
1.214269 *A=~Af.boy = 1.276502 *A=~Af.girl &
1.214269 *A=~An.boy = 1.276502 *A=~An.girl &
1.214269 *A=~Ar.boy = 1.276502 *A=~Ar.girl &
1.214269 *A=~Ac.boy = 1.276502 *A=~Ac.girl &
1.26346 *B=~Bb.boy = 1.32143 *B=~Bb.girl &
1.26346 *B=~Bl.boy = 1.32143 *B=~Bl.girl &
1.26346 *B=~Bf.boy = 1.32143 *B=~Bf.girl &
1.26346 *B=~Bn.boy = 1.32143 *B=~Bn.girl &
1.26346 *B=~Br.boy = 1.32143 *B=~Br.girl &
1.26346 *B=~Bc.boy = 1.32143 *B=~Bc.girl &
Ab~1.boy -3.20752= Ab~1.girl -3.37765&
Al~1.boy -3.20752= Al~1.girl -3.37765&
Af~1.boy -3.20752= Af~1.girl -3.37765&
An~1.boy -3.20752= An~1.girl -3.37765&
Ar~1.boy -3.20752= Ar~1.girl -3.37765&
Ac~1.boy -3.20752= Ac~1.girl -3.37765&
Bb~1.boy -2.682939= Bb~1.girl -2.620678&
Bl~1.boy -2.682939= Bl~1.girl -2.620678&
Bf~1.boy -2.682939= Bf~1.girl -2.620678&
Bn~1.boy -2.682939= Bn~1.girl -2.620678&
Br~1.boy -2.682939= Br~1.girl -2.620678&
Bc~1.boy -2.682939= Bc~1.girl -2.620678 &
A~age.boy = A~age.girl  &
A~peabody.boy = A~peabody.girl  &
A~B.boy < A~B.girl;
1.214269 *A=~Ab.boy = 1.276502 *A=~Ab.girl &
1.214269 *A=~Al.boy = 1.276502 *A=~Al.girl &
1.214269 *A=~Af.boy = 1.276502 *A=~Af.girl &
1.214269 *A=~An.boy = 1.276502 *A=~An.girl &
1.214269 *A=~Ar.boy = 1.276502 *A=~Ar.girl &
1.214269 *A=~Ac.boy = 1.276502 *A=~Ac.girl &
1.26346 *B=~Bb.boy = 1.32143 *B=~Bb.girl &
1.26346 *B=~Bl.boy = 1.32143 *B=~Bl.girl &
1.26346 *B=~Bf.boy = 1.32143 *B=~Bf.girl &
1.26346 *B=~Bn.boy = 1.32143 *B=~Bn.girl &
1.26346 *B=~Br.boy = 1.32143 *B=~Br.girl &
1.26346 *B=~Bc.boy = 1.32143 *B=~Bc.girl &
Ab~1.boy -3.20752= Ab~1.girl -3.37765&
Al~1.boy -3.20752= Al~1.girl -3.37765&
Af~1.boy -3.20752= Af~1.girl -3.37765&
An~1.boy -3.20752= An~1.girl -3.37765&
Ar~1.boy -3.20752= Ar~1.girl -3.37765&
Ac~1.boy -3.20752= Ac~1.girl -3.37765&
Bb~1.boy -2.682939= Bb~1.girl -2.620678&
Bl~1.boy -2.682939= Bl~1.girl -2.620678&
Bf~1.boy -2.682939= Bf~1.girl -2.620678&
Bn~1.boy -2.682939= Bn~1.girl -2.620678&
Br~1.boy -2.682939= Br~1.girl -2.620678&
Bc~1.boy -2.682939= Bc~1.girl -2.620678 &
A~age.boy = A~age.girl  &
A~peabody.boy = A~peabody.girl  &
A~B.boy = A~B.girl"
set.seed(100)
y2 <- bain(fit3, hypotheses32,standardize=TRUE)

4. Examples Using a named vector

  1. An example of the Bayesian ANOVA using a named vector as input for bain
# load the bain package which includes the simulated sesamesim data se
library(bain)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# execute an analysis of variance using lm() which, due to the -1, returns
# estimates of the means per group
anov <- lm(postnumb~site-1,sesamesim)
# take a look at the estimated means and their names
coef(anov)
# collect the estimates means in a vector
estimate <- coef(anov)
# give names to the estimates in anov
names(estimate) <- c("site1", "site2", "site3","site4","site5")
# create a vector containing the sample size of each group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- table(sesamesim$site)
# compute for each group the covariance matrix of the parameters
# of that group and collect these in a list
# for the ANOVA this is simply a list containing for each group the variance
# of the mean note that, the within group variance as estimated using lm is
# used to compute the variance of each of the means! See, Hoijtink, Gu, and
# Mulder (2019) for further elaborations.
var <- summary(anov)$sigma**2
cov1 <- matrix(var/ngroup[1], nrow=1, ncol=1)
cov2 <- matrix(var/ngroup[2], nrow=1, ncol=1)
cov3 <- matrix(var/ngroup[3], nrow=1, ncol=1)
cov4 <- matrix(var/ngroup[4], nrow=1, ncol=1)
cov5 <- matrix(var/ngroup[5], nrow=1, ncol=1)
covlist <- list(cov1, cov2, cov3, cov4,cov5)
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there are multiple groups
# characterized by one mean, therefore group_parameters=0. Note that are no
# joint parameters, therefore, joint_parameters=0.
results <- bain(estimate,
"site1=site2=site3=site4=site5; site2>site5>site1>site3>site4",
n=ngroup,Sigma=covlist,group_parameters=1,joint_parameters = 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian ANCOVA using a named vector as input for bain
# load the bain package which includes the simulated sesamesim data se
library(bain)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# center the covariate. If centered the coef() command below displays the
# adjusted means. If not centered the intercepts are displayed.
sesamesim$prenumb <- sesamesim$prenumb - mean(sesamesim$prenumb)
# execute an analysis of covariance using lm() which, due to the -1, returns
# estimates of the adjusted means per group
ancov2 <- lm(postnumb~site+prenumb-1,sesamesim)
# take a look at the estimated adjusted means and their names
coef(ancov2)
# collect the estimates of the adjusted means and regression coefficient of
# the covariate in a vector (the vector has to contain first the
# adjusted means and next the regression coefficients of the covariates)
estimates <- coef(ancov2)
# assign names to the estimates
names(estimates)<- c("v.1", "v.2", "v.3", "v.4","v.5", "pre")
# compute the sample size per group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- table(sesamesim$site)
# compute for each group the covariance matrix of the parameters of that
# group and collect these in a list note that, the residual variance as
# estimated using lm is used to compute these covariance matrices
var <- (summary(ancov2)$sigma)**2
# below, for each group, the covariance matrix of the adjusted mean and
# covariate is computed
# see Hoijtink, Gu, and Mulder (2019) for further explanation and elaboration
cat1 <- subset(cbind(sesamesim$site,sesamesim$prenumb), sesamesim$site == 1)
cat1[,1] <- 1
cat1 <- as.matrix(cat1)
cov1 <- var * solve(t(cat1) %*% cat1)
#
cat2 <- subset(cbind(sesamesim$site,sesamesim$prenumb), sesamesim$site == 2)
cat2[,1] <- 1
cat2 <- as.matrix(cat2)
cov2 <- var * solve(t(cat2) %*% cat2)
#
cat3 <- subset(cbind(sesamesim$site,sesamesim$prenumb), sesamesim$site == 3)
cat3[,1] <- 1
cat3 <- as.matrix(cat3)
cov3 <- var * solve(t(cat3) %*% cat3)
#
cat4 <- subset(cbind(sesamesim$site,sesamesim$prenumb), sesamesim$site == 4)
cat4[,1] <- 1
cat4 <- as.matrix(cat4)
cov4 <- var * solve(t(cat4) %*% cat4)
#
cat5 <- subset(cbind(sesamesim$site,sesamesim$prenumb), sesamesim$site == 5)
cat5[,1] <- 1
cat5 <- as.matrix(cat5)
cov5 <- var * solve(t(cat5) %*% cat5)
#
covariances <- list(cov1, cov2, cov3, cov4,cov5)
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there are multiple groups
# characterized by one adjusted mean, therefore group_parameters=1 Note that
# there is one covariate, therefore, joint_parameters=1.
results2<-bain(estimates,"v.1=v.2=v.3=v.4=v.5;v.2 > v.5 > v.1 > v.3 >v.4;",
n=ngroup,Sigma=covariances,group_parameters=1,joint_parameters = 1)
# display the results
print(results2)
# obtain the descriptives table
summary(results2, ci = 0.95)
  1. An example of the Bayesian one within factor repeated measures analysis
# load the bain package which includes the simulated sesamesim data set
library(bain)
# estimate the means of three repeated measures of number knowledge
# the 1 denotes that these means have to be estimated
within <- lm(cbind(prenumb,postnumb,funumb)~1, data=sesamesim)
# take a look at the estimated means and their names
coef(within)
# note that the model specified in lm has three dependent variables.
# Consequently, the estimates rendered by lm are collected in a "matrix".
# Since bain needs a named vector containing the estimated means, the [1:3]
# code is used to select the three means from a matrix and store them in a
# vector.
estimate <- coef(within)[1:3]
# give names to the estimates in anov
names(estimate) <- c("pre", "post", "fu")
# compute the sample size
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- nrow(sesamesim)
# compute the covariance matrix of the three means
covmatr <- list(vcov(within))
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there is one group, with three
# means therefore group_parameters=3.
results <- bain(estimate,"pre = post = fu; pre < post < fu", n=ngroup,
Sigma=covmatr, group_parameters=3, joint_parameters = 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian one between factor and one within factor repeated measures analysis
# load the bain package which includes the simulated sesamesim data set
library(bain)
# make a factor of the variable sex
sesamesim$sex <- factor(sesamesim$sex)
# estimate the means of prenumb, postnumb, and funumb for boys and girls
# the -1 denotes that the means have to estimated
bw <- lm(cbind(prenumb, postnumb, funumb)~sex-1, data=sesamesim)
# take a look at the estimated means and their names
coef(bw)
# collect the estimated means in a vector
est1 <-coef(bw)[1,1:3] # the three means for sex = 1
est2 <-coef(bw)[2,1:3] # the three means for sex = 2
estimate <- c(est1,est2)
# give names to the estimates in anov
names(estimate) <- c("pre1", "post1", "fu1","pre2", "post2", "fu2")
# determine the sample size per group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup<-table(sesamesim$sex)
# cov1 has to contain the covariance matrix of the three means in group 1.
# cov2 has to contain the covariance matrix in group 2
# typing vcov(bw) in the console pane highlights the structure of
# the covariance matrix of all 3+3=6 means
# it has to be dissected in to cov1 and cov2
cov1 <- c(vcov(bw)[1,1],vcov(bw)[1,3],vcov(bw)[1,5],vcov(bw)[3,1],
vcov(bw)[3,3],vcov(bw)[3,5],vcov(bw)[5,1],vcov(bw)[5,3],vcov(bw)[5,5])
cov1 <- matrix(cov1,3,3)
cov2 <- c(vcov(bw)[2,2],vcov(bw)[2,4],vcov(bw)[2,6],vcov(bw)[4,2],
vcov(bw)[4,4],vcov(bw)[4,6],vcov(bw)[6,2],vcov(bw)[6,4],vcov(bw)[6,6])
cov2 <- matrix(cov2,3,3)
covariance<-list(cov1,cov2)
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there are multiple groups
# characterized by three means, therefore group_parameters=3. Note that there
# are no additional parameters, therefore, joint_parameters=0.
results <-bain(estimate, "pre1 - pre2 = post1 - post2 = fu1 -fu2;
pre1 - pre2 > post1 - post2 > fu1 -fu2"  , n=ngroup, Sigma=covariance,
group_parameters=3, joint_parameters = 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian logistic regression as the counterpart of Bayesian multiple regression
# load the bain package which includes the simulated sesamesim data set
library(bain)
# regression coefficients can only be mutually compared if the
# corresponding predictors are measured on the same scale, therefore the
# predictors are standardized
sesamesim$age <- (sesamesim$age - mean(sesamesim$age))/sd(sesamesim$age)
sesamesim$peabody <- (sesamesim$peabody - mean(sesamesim$peabody))/
sd(sesamesim$peabody)
sesamesim$prenumb <- (sesamesim$prenumb - mean(sesamesim$prenumb))/
sd(sesamesim$prenumb)
# estimate the logistic regression coefficients
logreg <- glm(viewenc ~ age + peabody + prenumb, family=binomial,
data=sesamesim)
# take a look at the estimates and their names
coef(logreg)
# collect the estimated intercept and regression coefficients in a vector
estimate <- coef(logreg)
# give names to the estimates
names(estimate) <- c("int", "age", "peab" ,"pre" )
# compute the sample size. Note that, this is an analysis with ONE group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- nrow(sesamesim)
# compute the covariance matrix of the intercept and regression coefficients
covmatr <- list(vcov(logreg))
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there is one group, with four
# parameters (intercept plus three regression coefficients) therefore
# group_parameters=4.
results <- bain(estimate, "age = peab = pre; age > pre > peab", n=ngroup,
Sigma=covmatr, group_parameters=4, joint_parameters = 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian logistic regression as the counterpart of an ANCOVA. This is Example 3 from Hoijtink, Gu, and Mulder (2019). The research question concerns the probability of encouragement to view sesame street (a dichotomous 0/1 meaning no/yes dependent variable) for boys (sex=1) and girls (sex=2) adjusted for their age. This is a kind of ANCOVA with a dichotomous instead of a continuous dependent variable.
# load the bain package which includes the simulated sesamesim data set
library(bain)
# load the numDeriv package which will be used to compute the covariance
# matrix of the adjusted group coefficients and regression coefficient of age
# for the boys and the girls # using the estimates obtained using the data for 
# the boys AND the girls.
library(numDeriv)
# make a factor of the variable sex
sesamesim$sex <- factor(sesamesim$sex)
# center the covariate age
sesamesim$age <- sesamesim$age - mean(sesamesim$age)
# determine sample size per sex group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- table(sesamesim$sex)
# execute the logistic regression, -1 ensures that the coefficients
# for boys and girl are estimated adjusted for the covariate age
anal <- glm(viewenc ~ sex + age - 1, family=binomial, data=sesamesim)
# take a look at the estimates and their names
coef(anal)
# collect the estimates obtained using the data of the boys AND the
# girls in a vector. This vector contains first the group specific
# parameters followed by the regression coefficient of the covariate.
estimates <-coef(anal)
# give names to the estimates
names(estimates) <- c("boys", "girls", "age")
# use numDeriv to compute the Hessian matrix and subsequently the
# covariance matrix for each of the two (boys and girls) groups.
# The vector f should contain the regression coefficient of the group
# at hand and the regression coefficient of the covariate.
#
# the first group
data <- subset(cbind(sesamesim$sex,sesamesim$age,sesamesim$viewenc),
sesamesim$sex==1)
f <- 1
f[1] <- estimates[1] # the regression coefficient of boys
f[2] <- estimates[3] # the regression coefficient of age
#
# within the for loop below the log likelihood of the logistic
# regression model is computed using the data for the boys
logist1 <- function(x){
  out <- 0
  for (i in 1:ngroup[1]){
    out <- out + data[i,3]*(x[1] + x[2]*data[i,2]) - log (1 +
    exp(x[1] + x[2]*data[i,2]))
  }
  return(out)
}
hes1 <- hessian(func=logist1, x=f)
# multiply with -1 and invert to obtain the covariance matrix for the
# first group
cov1 <- -1 * solve(hes1)
#
# the second group
data <- subset(cbind(sesamesim$sex,sesamesim$age,sesamesim$viewenc),
sesamesim$sex==2)
f[1] <- estimates[2] # the regression coefficient of girls
f[2] <- estimates[3] # the regression coefficient of age

# within the for loop below the log likelihood of the logistic
# regression model is computed using the data for the girls
logist2 <- function(x){
  out <- 0
  for (i in 1:ngroup[2]){ 
    out <- out + data[i,3]*(x[1] + x[2]*data[i,2]) - log (1 +
    exp(x[1] + x[2]*data[i,2]))
  }
  return(out)
}
hes2 <- hessian(func=logist2, x=f)
# multiply with -1 and invert to obtain the covariance matrix
cov2 <- -1 * solve(hes2)
#
#make a list of covariance matrices
covariance<-list(cov1,cov2)
#
# set a seed value
set.seed(100)
# test hypotheses with bain. Note that there are multiple groups
# characterized by one adjusted group coefficient,
# therefore group_parameters=1. Note that there is one covariate,
# therefore, joint_parameters=1.
results <- bain(estimates, "boys < girls & age > 0", n=ngroup,
Sigma=covariance, group_parameters=1, joint_parameters = 1)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian robust ANOVA with unequal within group variances. Robust ANOVA is elaborated in Bosman and Hoijtink (unpublished).
# load the bain package which includes the simulated sesamesim data set
library(bain)
# load the WRS2 package which renders the trimmed sample mean and
# corresponding standard error
library(WRS2)
# make a factor of variable site
sesamesim$site <- as.factor(sesamesim$site)
# create a vector containing the sample sizes of each group
# (in case of missing values in the variables used, the command
# below has to be modified such that only the cases without
# missing values are counted)
ngroup <- table(sesamesim$site)
# Compute the 20\% sample trimmed mean for each site
estimates <- c(mean(sesamesim$postnumb[sesamesim$site == 1], tr = 0.2),
               mean(sesamesim$postnumb[sesamesim$site == 2], tr = 0.2),
               mean(sesamesim$postnumb[sesamesim$site == 3], tr = 0.2),
               mean(sesamesim$postnumb[sesamesim$site == 4], tr = 0.2),
               mean(sesamesim$postnumb[sesamesim$site == 5], tr = 0.2))
# give names to the estimates
names(estimates) <- c("s1", "s2", "s3","s4","s5")
# display the estimates and their names
print(estimates)
# Compute the sample trimmed mean standard error for each site
se <- c(trimse(sesamesim$postnumb[sesamesim$site == 1]),
        trimse(sesamesim$postnumb[sesamesim$site == 2]),
        trimse(sesamesim$postnumb[sesamesim$site == 3]),
        trimse(sesamesim$postnumb[sesamesim$site == 4]),
        trimse(sesamesim$postnumb[sesamesim$site == 5]))
# Square the standard errors to obtain the variances of the sample
# trimmed means
var <- se^2
# Store the variances in a list of matrices
covlist <- list(matrix(var[1]),matrix(var[2]),
matrix(var[3]),matrix(var[4]), matrix(var[5]))
# set a seed value
set.seed(100)
# test hypotheses with bain
results <- bain(estimates,"s1=s2=s3=s4=s5;s2>s5>s1>s3>s4",
n=ngroup,Sigma=covlist,group_parameters=1,joint_parameters= 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of the Bayesian Regression with missing data. The computation of Bayes factors in the presence of missing data is elaborated in Hoijtink, Gu, Mulder, and Rosseel (2019).

RUNNING THIS EXAMPLE WILL TAKE ABOUT 60 SECONDS

# load the bain package which includes the simulated sesamesim data set
library(bain)
# load the mice (multiple imputation of missing data), psych
# (provides access to the describe function), and MASS libraries - 
# inspect the mice help file to obtain further information 
# - also surf to http://www.stefvanbuuren.nl/mi/MICE.html to obtain 
# further information and references for mice.
library(mice)
library(psych)
library(MASS)
sesamesim <- cbind(sesamesim$prenumb,sesamesim$postnumb,sesamesim$funumb,sesamesim$peabody)
colnames(sesamesim) <- c("prenumb","postnumb","funumb","peabody")
sesamesim <- as.data.frame(sesamesim)
# this examples is based on the prenumb, postnumb, funumb and peabody
# variables in the sesamesim data set. First of all, missing data are
# created in these four variables.
#
set.seed(1)
pmis1<-1
pmis2<-1
pmis3<-1
#
for (i in 1:240){
  pmis1[i] <- .80
  pmis2[i]<- .80
  pmis3[i]<- .80
#
  uni<-runif(1)
  if (pmis1[i] < uni) {
    sesamesim$funumb[i]<-NaN
  }
  uni<-runif(1)
  if (pmis2[i] < uni) {
    sesamesim$prenumb[i]<-NaN
    sesamesim$postnumb[i]<-NaN
  }
  uni<-runif(1)
  if (pmis3[i] < uni) {
    sesamesim$peabody[i]<-NaN
  }
}
# print data summaries - note that due to missing valus the n per variable is smaller than 240
print(describe(sesamesim))
# use mice to create 1000 imputed data matrices. Note that, the approach used below 
# is only one manner in which mice can be instructed. Many other options are available.
M <- 1000 
out <- mice(data = sesamesim, m = M, seed=999, meth=c("norm","norm","norm","norm"), diagnostics = FALSE, printFlag = FALSE)
# create matrices in which 1000 vectors with estimates can be stored and in which a covariance matrix can be stored
mulest <- matrix(0,nrow=1000,ncol=2)
covwithin <- matrix(0,nrow=2,ncol=2)
# execute 1000 multiple regressions using the imputed data matrices and store the estimates 
# of only the regression coefficients of funumb on prenumb and postnumband and the average 
# of the 1000 covariance matrices.
# See Hoijtink, Gu, Mulder, and Rosseel (2019) for an explanation of the latter.
for(i in 1:M) {
mulres <- lm(funumb~prenumb+postnumb,complete(out,i))
mulest[i,]<-coef(mulres)[2:3]
covwithin<-covwithin + 1/M * vcov(mulres)[2:3,2:3]
}
# Compute the average of the estimates and assign names, the between and total covariance matrix.
# See Hoijtink, Gu, Mulder, and Rosseel (2019) for an explanation.
estimates <- colMeans(mulest)
names(estimates) <- c("prenumb", "postnumb")
covbetween <- cov(mulest)
covariance <- covwithin + (1+1/M)*covbetween
# determine the sample size
samp <- nrow(sesamesim)
# compute the effective sample size
# See Hoijtink, Gu, Mulder, and Rosseel (2019) for an explanation.
nucom<-samp-length(estimates)
lam <- (1+1/M)*(1/length(estimates))* tr(covbetween %*% ginv(covariance))
nuold<-(M-1)/(lam^2)
nuobs<-(nucom+1)/(nucom+3)*nucom*(1-lam)
nu<- nuold*nuobs/(nuold+nuobs)
fracmis <- (nu+1)/(nu+3)*lam + 2/(nu+3)
neff<-samp-samp*fracmis
covariance<-list(covariance)
# set the seed
set.seed(100)
# test hypotheses with bain
results <- bain(estimates,"prenumb=postnumb=0",n=neff,Sigma=covariance,group_parameters=2,joint_parameters = 0)
# display the results
print(results)
# obtain the descriptives table
summary(results, ci = 0.95)
  1. An example of Bayesian confirmatory factor analysis with a named vector object. See van Lissa et al. (unpublished) for further elaborations
# Load the bain and lavaan libraries. Visit www.lavaan.org for 
# lavaan mini-tutorials, examples, and elaborations
library(bain)
library(lavaan)

# Specify and fit the confirmatory factor model
model1 <- '
A =~ Ab + Al + Af + An + Ar + Ac
B =~ Bb + Bl + Bf + Bn + Br + Bc
'
fit1 <- sem(model1, data = sesamesim, std.lv = TRUE)

# Obtain the required input for bain:

# the sample size
ngroup1 <- nobs(fit1)

# The parameterEstimates() function presents the estimated parameters
# in a data.frame; the additional argument standardized = TRUE add
# extra columns with standardized versions of these parameter estimates;
# for our purposes, we need the 'std.all' column, which means that both
# the observed and the latent variables have been standardized.
#
# we capture the output of parameterEstimates() in an object 'PE'
PE1 <- parameterEstimates(fit1, standardized = TRUE)

# We only need the rows that correspond to factor loadings (ie op == "=~")
# and the column "std.all":
estimate1 <- PE1[ PE1$op == "=~", "std.all"]

# Assign names to the estimates of the standardized factor loadingts
names(estimate1) <- c("Ab", "Al", "Af", "An", "Ar", "Ac",
                     "Bb", "Bl", "Bf", "Bn", "Br", "Bc")

# We will compute the full covariance matrix of standardized parameter
# estimates, and only extract the part we need: the rows/cols that
# correspond to the factor loadings
#
# to find out which rows/cols we need, we can look at the full parameter
# table:
PT1 <- parTable(fit1)
# and extract the parameter numbers (in the "free" column) that correspond
# to the factor loadings:
par.idx1 <- PT1$free[ PT1$op == "=~" ]
# Obtain the covariance matrix of the standardized factor loadings
covariance1 <- list(lavInspect(fit1, "vcov.std.all")[par.idx1, par.idx1])

# Specify the hypotheses using the syntax implementen in bain. Note that
# the & is used to combine different parts into one hypothesis. Note
# furthermore, that the ; is used to separate hypotheses
hypotheses1 <-
   " Ab > .6 & Al > .6 & Af > .6 & An > .6 & Ar > .6 & Ac >.6 &
     Bb > .6 & Bl > .6 & Bf > .6 & Bn > .6 & Br > .6 & Bc >.6"

# Evaluate the hypotheses using bain, the command used has been explained in
# the paper. Don't forget to set the seed first in order to obtain results
# that can be reproduced exactly.
set.seed(100)
y <- bain(estimate1, hypotheses1, n =ngroup1, Sigma = covariance1,
                group_parameters = 12, joint_parameters = 0, fraction = 1,
                standardize = TRUE)

# print the results of the bain analysis and obtain the estimates and 95%
# central credibility intervals
sy <- summary(z, ci = 0.95)