Adjacent category logit models using brglm2

Ioannis Kosmidis

05 February 2019

bracl

The brglm2 R package provides bracl() which is a wrapper of brglmFit() for fitting adjacent category models for ordinal responses using either maximum likelihood or maximum penalized likelihood or any of the various bias reduction methods described in brglmFit(). There is a formal equivalence between adjacent category logit models for ordinal responses and multinomial logistic regression models (see, e.g. the Multinomial logistic regression using brglm2 vignette and the brmultinom() function). bracl() utilizes that equivalence and fits the corresponding Poisson log-linear model, by appropriately re-scaling the Poisson means to match the multinomial totals (a.k.a. the “Poisson trick”). The mathematical details and algorithm on using the Poisson trick for mean-bias reduction are given in I. Kosmidis and Firth (2011).

Citation

If you found this vignette or brglm2, in general, useful, please consider citing brglm2 and the associated paper. You can find information on how to do this by typing citation("brglm2").

Opinion on stem cell research and religious fundamentalism

The stemcell data set ships with brglm2. Agresti (2015, sec. 4.1) provides a detailed description of the variables recorded in this data set (see also ?stemcell).

library("brglm2")
data("stemcell", package = "brglm2")
stem <- within(stemcell, religion <- as.numeric(religion))

Maximum likelihood estimation

The following chunk of code fits an adjacent category logit model with proportional odds and reproduces Agresti (2010, Table 4.2). Note that the intercept parameters are different because Agresti (2010, Table 4.2) uses different contrasts for the intercept parameters.

stem_formula <- research ~ religion + gender
stemcells_ml <- bracl(stem_formula, weights = frequency, data = stem,
                      parallel = TRUE, type = "ML")
summary(stemcells_ml)
#> Call:
#> bracl(formula = stem_formula, data = stem, weights = frequency, 
#>     parallel = TRUE, type = "ML")
#> 
#> Coefficients:
#>                          Estimate Std. Error z value Pr(>|z|)    
#> definitely:(Intercept)    -0.9509     0.1426   -6.67  2.6e-11 ***
#> probably:(Intercept)       0.5573     0.1451    3.84  0.00012 ***
#> probably not:(Intercept)  -0.1066     0.1648   -0.65  0.51776    
#> religion                   0.2668     0.0479    5.57  2.5e-08 ***
#> genderfemale              -0.0141     0.0767   -0.18  0.85395    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual Deviance: 2033.208 
#> Log-likelihood: -1016.604 
#> AIC: 2043.208 
#> 
#> 
#> Type of estimator: ML (maximum likelihood)
#> Number of Fisher Scoring iterations: 2

stemcells_ml is an object inheriting from

class(stemcells_ml)
#> [1] "bracl"      "brmultinom" "brglmFit"   "glm"

brglm2 implements print, coef, fitted, predict, summary, vcov and logLik methods for

We can check if a model with non-proportional odds fits the data equally well by fitting it and carrying out a likelihood ration test.

stemcells_ml_full <- bracl(stem_formula, weights = frequency, data = stemcell,
                           parallel = FALSE, type = "ML")
summary(stemcells_ml_full)
#> Call:
#> bracl(formula = stem_formula, data = stemcell, weights = frequency, 
#>     parallel = FALSE, type = "ML")
#> 
#> Coefficients:
#>                           Estimate Std. Error z value Pr(>|z|)    
#> definitely:(Intercept)    -0.37499    0.12692   -2.95   0.0031 ** 
#> probably:(Intercept)       0.99079    0.15514    6.39  1.7e-10 ***
#> probably not:(Intercept)   0.45775    0.21225    2.16   0.0310 *  
#> definitely:religion.L      0.59389    0.14794    4.01  6.0e-05 ***
#> probably:religion.L        0.36457    0.18481    1.97   0.0485 *  
#> probably not:religion.L   -0.00922    0.24461   -0.04   0.9699    
#> definitely:religion.Q      0.23646    0.14713    1.61   0.1080    
#> probably:religion.Q       -0.11603    0.18061   -0.64   0.5206    
#> probably not:religion.Q   -0.16547    0.25085   -0.66   0.5095    
#> definitely:genderfemale   -0.12598    0.16817   -0.75   0.4538    
#> probably:genderfemale      0.18153    0.20877    0.87   0.3846    
#> probably not:genderfemale -0.16828    0.28097   -0.60   0.5492    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual Deviance: 2023.391 
#> Log-likelihood: -1011.696 
#> AIC: 2047.391 
#> 
#> 
#> Type of estimator: ML (maximum likelihood)
#> Number of Fisher Scoring iterations: 3

The value of the log likelihood ratio statistic here is

(lrt <- deviance(stemcells_ml) - deviance(stemcells_ml_full))
#> [1] 9.816886

and has an asymptotic chi-squared distribution with

(df1 <- df.residual(stemcells_ml) - df.residual(stemcells_ml_full))
#> [1] 7

The p-value from testing the hypothesis that stemcells_ml_full is an as good fit as stemcells_ml is

pchisq(lrt, df1, lower.tail = FALSE)
#> [1] 0.19919

hence, the simpler model is found to be as adequate as the full model is.

Mean and median bias reduction

We can use bracl() to fit the adjacent category model using estimators with smaller mean or median bias. For mean bias reduction we do

summary(update(stemcells_ml, type = "AS_mean"))
#> Call:
#> bracl(formula = stem_formula, data = stem, weights = frequency, 
#>     parallel = TRUE, type = "AS_mean")
#> 
#> Coefficients:
#>                          Estimate Std. Error z value Pr(>|z|)    
#> definitely:(Intercept)    -0.9456     0.1424   -6.64  3.1e-11 ***
#> probably:(Intercept)       0.5562     0.1450    3.84  0.00012 ***
#> probably not:(Intercept)  -0.1097     0.1644   -0.67  0.50453    
#> religion                   0.2653     0.0478    5.55  2.8e-08 ***
#> genderfemale              -0.0138     0.0766   -0.18  0.85670    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual Deviance: 2033.215 
#> Log-likelihood: -1016.608 
#> AIC: 2043.215 
#> 
#> 
#> Type of estimator: AS_mean (mean bias-reducing adjusted score equations)
#> Number of Fisher Scoring iterations: 3

and for median

summary(update(stemcells_ml, type = "AS_median"))
#> Call:
#> bracl(formula = stem_formula, data = stem, weights = frequency, 
#>     parallel = TRUE, type = "AS_median")
#> 
#> Coefficients:
#>                          Estimate Std. Error z value Pr(>|z|)    
#> definitely:(Intercept)    -0.9481     0.1425   -6.65  2.8e-11 ***
#> probably:(Intercept)       0.5574     0.1450    3.84  0.00012 ***
#> probably not:(Intercept)  -0.1082     0.1646   -0.66  0.51105    
#> religion                   0.2659     0.0478    5.56  2.7e-08 ***
#> genderfemale              -0.0140     0.0766   -0.18  0.85522    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual Deviance: 2033.21 
#> Log-likelihood: -1016.605 
#> AIC: 2043.21 
#> 
#> 
#> Type of estimator: AS_median (median bias-reducing adjusted score equations)
#> Number of Fisher Scoring iterations: 3

The estimates from mean and median bias reduction are similar to the maximum likelihood ones, indicating that estimation bias is not a major issue here.

Prediction

We can predict the category probabilities using the predict() method

predict(stemcells_ml, type = "probs")
#>    definitely  probably probably not definitely not
#> 1   0.2138135 0.4297953    0.1911925     0.16519872
#> 2   0.2931825 0.4513256    0.1537533     0.10173853
#> 3   0.3784551 0.4461609    0.1163995     0.05898444
#> 4   0.2177773 0.4316255    0.1893146     0.16128261
#> 5   0.2975956 0.4516958    0.1517219     0.09898674
#> 6   0.3830297 0.4452227    0.1145262     0.05722143
#> 7   0.2138135 0.4297953    0.1911925     0.16519872
#> 8   0.2931825 0.4513256    0.1537533     0.10173853
#> 9   0.3784551 0.4461609    0.1163995     0.05898444
#> 10  0.2177773 0.4316255    0.1893146     0.16128261
#> 11  0.2975956 0.4516958    0.1517219     0.09898674
#> 12  0.3830297 0.4452227    0.1145262     0.05722143
#> 13  0.2138135 0.4297953    0.1911925     0.16519872
#> 14  0.2931825 0.4513256    0.1537533     0.10173853
#> 15  0.3784551 0.4461609    0.1163995     0.05898444
#> 16  0.2177773 0.4316255    0.1893146     0.16128261
#> 17  0.2975956 0.4516958    0.1517219     0.09898674
#> 18  0.3830297 0.4452227    0.1145262     0.05722143
#> 19  0.2138135 0.4297953    0.1911925     0.16519872
#> 20  0.2931825 0.4513256    0.1537533     0.10173853
#> 21  0.3784551 0.4461609    0.1163995     0.05898444
#> 22  0.2177773 0.4316255    0.1893146     0.16128261
#> 23  0.2975956 0.4516958    0.1517219     0.09898674
#> 24  0.3830297 0.4452227    0.1145262     0.05722143

Relevant resources

?brglmFit and ?brglm_control provide descriptions of the various bias reduction methods supported in brglm2. The iteration vignette describes the iteration and gives the mathematical details for the bias-reducing adjustments to the score functions for generalized linear models.

Citation

If you found this vignette or brglm2, in general, useful, please consider citing brglm2 and the associated paper. You can find information on how to do this by typing citation("brglm2").

References

Agresti, A. 2010. Analysis of Ordinal Categorical Data. 2nd ed. Wiley Series in Probability and Statistics. Wiley.
———. 2015. Foundations of Linear and Generalized Linear Models. Wiley Series in Probability and Statistics. Wiley.
Kosmidis, I., and D. Firth. 2011. “Multinomial Logit Bias Reduction via the Poisson Log-Linear Model.” Biometrika 98 (3): 755–59.
Kosmidis, Ioannis, Euloge Clovis Kenne Pagui, and Nicola Sartori. 2020. “Mean and Median Bias Reduction in Generalized Linear Models.” Statistics and Computing 30: 43–59. https://doi.org/10.1007/s11222-019-09860-6.