Logistic Regression

library(CLRtools)
library(dplyr)

The CLRtools package provides a set of functions to support the structured development of logistic regression models using the Purposeful Selection Method described by Hosmer, Lemeshow, and Sturdivant in Applied Logistic Regression (2013). This method offers a step-by-step approach to building multivariable logistic regression models through seven steps to identify variables that have a meaningful effect on the outcome while excluding variables that do not contribute useful information.

This vignette demonstrates the complete modelling process using the GLOW500 dataset, a study that examines risk factors for fracture in women over 50. The dataset includes variables such as age at enrollment, body mass index (BMI), smoking status, and prior fracture history. Each modelling step is supported by corresponding functions from the CLRtools package.

We now proceed through the seven steps of the Purposeful Selection Method, applying each one to the GLOW500 dataset with the support of CLRtools functions.

Step 1: Fit Univariable Models

The first step of the Purposeful Selection Method involves fitting separate univariable logistic regression models, one for each candidate predictor variable. The univariable.models() function fits each univariable model and generates a summary table. The output includes odds ratios (optional), confidence intervals, p-values, and the G statistic, which represents the likelihood ratio test comparing each univariable model to the intercept-only model. The resulting table corresponds to Table 4.1 in Chapter 4 of the book and serves to select candidate variables for the multivariable model.
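
In this notation, the G statistic for a given predictor is the likelihood ratio chi-square

\[
G = -2\,\bigl[\ln L_0 - \ln L_1\bigr],
\]

where \(L_0\) is the likelihood of the intercept-only model and \(L_1\) the likelihood of the univariable model. Under the null hypothesis, G follows a chi-square distribution with degrees of freedom equal to the number of parameters added (one for a continuous or binary predictor, two for the three-level raterisk), which is how the p-values in the table below are obtained.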

We retain variables with a p-value less than 0.25 as initial candidates for inclusion in the next modelling step. Based on this criterion, the variables selected for the first multivariable model are: age, height, priorfrac, momfrac, armassist, and raterisk.

# Predictor variables to evaluate
var_to_test <- c('age', 'weight', 'height', 'bmi', 'priorfrac', 'premeno', 'momfrac', 'armassist', 'smoke', 'raterisk')

# Define OR scaling factors (used to interpret ORs per meaningful unit;
# one value per coefficient in the output below, so raterisk contributes two)
OR_values <- c(5, 5, 10, 5, 1, 1, 1, 1, 1, 1, 1)

# Generate the univariable models table
univariable.models(glow500, yval = 'fracture', xval = var_to_test, OR = TRUE, inc.or = OR_values)
#>                       coeff        se    estim.OR   low.limit   up.limit
#> age              0.47542260 0.1045364 10.77375413 3.867756769 30.0106198
#> weight          -0.08541440 0.1054345  0.65241656 0.232164052  1.8333905
#> height          -0.32835922 0.1086460  0.03749333 0.004458178  0.3153194
#> bmi              0.03439538 0.1026644  1.18765041 0.434258454  3.2480968
#> priorfrac        1.06382945 0.2230811  2.89744539 1.871234880  4.4864436
#> premeno          0.05077233 0.2592086  1.05208333 0.633011166  1.7485937
#> momfrac          0.66050224 0.2809838  1.93576431 1.116037135  3.3575795
#> armassist        0.70914752 0.2097666  2.03225806 1.347178655  3.0657202
#> smoke           -0.30765421 0.4358070  0.73516949 0.312916083  1.7272176
#> rateriskSame     0.54621675 0.2664361  1.72670807 1.024302199  2.9107824
#> rateriskGreater  0.90912224 0.2711471  2.48214286 1.458901333  4.2230636
#>                           G        p_val
#> age             21.27380948 3.981342e-06
#> weight           0.66547783 4.146327e-01
#> height           9.52651603 2.025242e-03
#> bmi              0.11173856 7.381734e-01
#> priorfrac       22.26720591 2.372235e-06
#> premeno          0.03817558 8.450910e-01
#> momfrac          5.26977352 2.169884e-02
#> armassist       11.40750993 7.314781e-04
#> smoke            0.52515643 4.686503e-01
#> rateriskSame    11.75680330 2.799256e-03
#> rateriskGreater 11.75680330 2.799256e-03

Step 2: Fit Initial Multivariable Model and Refine Predictors

We now fit a multivariable logistic regression model using the variables selected in Step 1. At this stage, variables that are not statistically significant or that lack clinical relevance become candidates for removal and are assessed in more depth.

All variables in the model are statistically significant, except for one of the categories in the variable raterisk. In the book, a likelihood ratio test was used to compare the model with and without raterisk, yielding a p-value of 0.051 — just above the conventional significance threshold of 0.05. This variable will be considered a candidate for removal in the next steps.

summary(glm(fracture ~ age + height + priorfrac + momfrac + armassist + raterisk, family = binomial, data = glow500))
#> 
#> Call:
#> glm(formula = fracture ~ age + height + priorfrac + momfrac + 
#>     armassist + raterisk, family = binomial, data = glow500)
#> 
#> Coefficients:
#>                 Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)      -2.0095     0.2369  -8.484  < 2e-16 ***
#> age               0.3087     0.1173   2.632  0.00848 ** 
#> height           -0.2786     0.1161  -2.400  0.01640 *  
#> priorfrac         0.6453     0.2461   2.622  0.00873 ** 
#> momfrac           0.6212     0.3070   2.024  0.04300 *  
#> armassist         0.4458     0.2328   1.915  0.05551 .  
#> rateriskSame      0.4220     0.2792   1.511  0.13071    
#> rateriskGreater   0.7069     0.2934   2.409  0.01599 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 562.34  on 499  degrees of freedom
#> Residual deviance: 507.50  on 492  degrees of freedom
#> AIC: 523.5
#> 
#> Number of Fisher Scoring iterations: 4

Step 3: Assess Potential Confounding

In this step, we compare the coefficient estimates for the variables retained as significant in Step 2 between models that include and exclude each candidate variable for removal. This comparison helps identify potential confounding effects. For each candidate variable, we fit two models, one with and one without the variable, and assess whether any of the coefficients for the variables retained in Step 2 change substantially. The delta-beta-hat percentage (\(\Delta \hat{\beta}\%\)) quantifies this change. Following the approach described by Hosmer et al. (2013), a change greater than 20% in any coefficient suggests the presence of confounding and indicates that the candidate variable should be retained in the model for adjustment.
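
Specifically, for each coefficient,

\[
\Delta\hat{\beta}\% = 100 \times \frac{\hat{\theta} - \hat{\beta}}{\hat{\beta}},
\]

where \(\hat{\beta}\) denotes the estimate from the model that includes the candidate variable and \(\hat{\theta}\) the corresponding estimate from the model that excludes it.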

The check_coef_change() function is used to automate this comparison and calculate the percentage change in coefficients for each added variable.

# Variables that were not significant in Step 2 to check
candidate.exclusion <- c('raterisk')

# Significant variables in Step 2
var.preliminar <- c('age', 'height', 'priorfrac', 'momfrac','armassist') 

# Check for confounding 
check_coef_change(data = glow500, yval = 'fracture', xpre = var.preliminar, xcheck = candidate.exclusion)
#>               raterisk
#> (Intercept) -15.502784
#> age         -13.353135
#> height        5.733425
#> priorfrac    16.633979
#> momfrac      16.324630
#> armassist    17.481536

The category “Same” in the variable raterisk has a p-value above the conventional significance level of 0.05. When this variable is excluded from the model, the largest percent change observed in the coefficients is around 17% for the coefficient of armassist — the same result reported in the book. As noted by the authors, this suggests that raterisk is not needed to adjust the effects of the remaining variables and therefore does not appear to act as a confounder.

Step 4: Reassess Excluded Variables

In this step, the variables excluded in Step 1 are added back into the model, one at a time. Some of these variables, while not significantly associated with the outcome on their own, may become statistically significant when considered alongside the variables retained in the model. The check_coef_significant() function is used to reintroduce each excluded variable individually and fit the corresponding model. It automates the process of generating the coefficient estimates, standard errors, and Wald test statistics (z and p-values), so the user can efficiently assess the statistical significance of each variable without manually fitting multiple models.

# Variables excluded in Step 1 to check
excluded <- c('weight', 'bmi', 'premeno', 'smoke')

# Preliminary model variables (retained after Step 2 and 3)
var.preliminar <- c('age','height', 'priorfrac', 'momfrac','armassist', 'raterisk') 

check_coef_significant(data = glow500, yval = 'fracture', xpre = var.preliminar, xcheck = excluded)
#>               weight       bmi   premeno      smoke
#> Estimate   0.1084217 0.1197159 0.1209557 -0.3429650
#> Std. Error 0.1321431 0.1238882 0.2828346  0.4584643
#> z value    0.8204871 0.9663215 0.4276554 -0.7480735
#> Pr(>|z|)   0.4119385 0.3338833 0.6689020  0.4544158

None of the excluded variables became statistically significant when added individually to the multivariable model, and therefore they are not retained. For the variable raterisk, the authors considered an alternative model in which the categories “Less” and “Same” were combined. This decision was made in consultation with subject-matter experts, who agreed that the combination was reasonable.

# Transforming raterisk
glow500 <- glow500 %>%
  mutate(raterisk_cat = case_when(raterisk == 'Less' | raterisk == 'Same' ~ 'C1',
                                  raterisk == 'Greater' ~ 'C2'))

summary(glm(fracture ~ age + height + priorfrac + momfrac + armassist + raterisk_cat, family = binomial, data = glow500))
#> 
#> Call:
#> glm(formula = fracture ~ age + height + priorfrac + momfrac + 
#>     armassist + raterisk_cat, family = binomial, data = glow500)
#> 
#> Coefficients:
#>                Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)     -1.7916     0.1799  -9.959  < 2e-16 ***
#> age              0.2982     0.1164   2.563  0.01039 *  
#> height          -0.2943     0.1152  -2.555  0.01063 *  
#> priorfrac        0.6642     0.2452   2.709  0.00676 ** 
#> momfrac          0.6640     0.3056   2.173  0.02978 *  
#> armassist        0.4727     0.2313   2.044  0.04100 *  
#> raterisk_catC2   0.4579     0.2381   1.924  0.05441 .  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 562.34  on 499  degrees of freedom
#> Residual deviance: 509.82  on 493  degrees of freedom
#> AIC: 523.82
#> 
#> Number of Fisher Scoring iterations: 4

Step 5: Assess Linearity of Continuous Predictors with Respect to the Logit

This step checks whether the relationship between each continuous predictor and the log-odds of the outcome is approximately linear, as required by logistic regression. Approaches described in the book include LOWESS plots, quartile-based design variables, fractional polynomials, and spline functions. Because this assessment is highly context-dependent and several existing R packages already offer tools for these checks, no specific function is provided in this package. Users are encouraged to apply the method most suitable for their data.

For this dataset, the continuous variables to be analyzed are height and age. After performing a linearity assessment, the authors concluded that no transformation was needed, as both variables were already approximately linear with respect to the logit. Therefore, the model remains the same from the previous step.
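
As an illustration of one of these approaches, the quartile design-variable check can be carried out with base R. The sketch below is not a CLRtools function, and the column name age_q is introduced here only for illustration: it replaces the continuous age term with quartile indicators so the pattern of the estimated coefficients across quartiles can be inspected for an approximately linear trend.

# Quartile design-variable check for age (illustrative sketch, not part of CLRtools)
glow500$age_q <- cut(glow500$age,
                     breaks = quantile(glow500$age, probs = seq(0, 1, 0.25)),
                     include.lowest = TRUE)
fit_q <- glm(fracture ~ age_q + height + priorfrac + momfrac + armassist + raterisk_cat,
             family = binomial, data = glow500)
# Coefficients for quartiles 2-4 (relative to quartile 1) should increase roughly linearly
coef(summary(fit_q))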

Step 6: Assess Interaction Terms

In this step, interaction terms formed by combining the variables from the preliminary main-effects model in Step 5 are added one at a time. This allows us to evaluate whether any combination of variables is statistically significant, which would indicate that the effect of one variable is not constant across the levels of another. Only interactions that are statistically significant in the likelihood ratio test and clinically meaningful are retained, while non-significant interactions are removed from the final model. The significance level used can be 5% or 10%, depending on the context.

The check_interactions() function is used to automate this process by fitting models that include each candidate interaction and reporting the relevant test statistics. The output includes the log-likelihood of the model with the interaction, the likelihood ratio test comparing the main-effects model to the interaction model, and the associated p-value.

var.maineffects <- c('age', 'height', 'priorfrac', 'momfrac', 'armassist', 'raterisk_cat')

check_interactions(data = glow500, variables = var.maineffects, yval = 'fracture', rounding = 4)
#>                         log.likh      G p_value
#> 1                      -254.9089     NA      NA
#> age*height             -254.8421 0.1335  0.7149
#> age*priorfrac          -252.3921 5.0336  0.0249
#> age*momfrac            -254.8395 0.1387  0.7096
#> age*armassist          -254.8358 0.1462  0.7022
#> age*raterisk_cat       -254.3857 1.0464  0.3063
#> height*priorfrac       -254.8024 0.2129  0.6445
#> height*momfrac         -253.7043 2.4093  0.1206
#> height*armassist       -254.1113 1.5952  0.2066
#> height*raterisk_cat    -254.4218 0.9741  0.3237
#> priorfrac*momfrac      -253.5093 2.7991  0.0943
#> priorfrac*armassist    -254.7962 0.2253  0.6350
#> priorfrac*raterisk_cat -254.8475 0.1227  0.7262
#> momfrac*armassist      -252.5179 4.7819  0.0288
#> momfrac*raterisk_cat   -254.6423 0.5331  0.4653
#> armassist*raterisk_cat -253.7923 2.2331  0.1351

The table above, created with the check_interactions() function, corresponds to Table 4.14 in Chapter 4 of Applied Logistic Regression. Three interactions were significant at the 10% level: age*priorfrac, priorfrac*momfrac, and momfrac*armassist. These three interactions were added to the preliminary model from Step 5 to assess their contribution. One interaction, priorfrac*momfrac, was not statistically significant in this joint model and was therefore excluded. The remaining variables in the model were then reassessed. Only the binary variable raterisk_cat had a p-value above 0.05. However, the authors chose to retain this variable due to its clinical importance and the fact that its p-value was close to the 5% threshold. The final model is shown below.

final.model <- glm(fracture ~ age + height + priorfrac + momfrac + armassist + raterisk_cat + age*priorfrac + momfrac*armassist, family = binomial, data = glow500)
summary(final.model)
#> 
#> Call:
#> glm(formula = fracture ~ age + height + priorfrac + momfrac + 
#>     armassist + raterisk_cat + age * priorfrac + momfrac * armassist, 
#>     family = binomial, data = glow500)
#> 
#> Coefficients:
#>                   Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)        -1.8950     0.1909  -9.929  < 2e-16 ***
#> age                 0.5152     0.1483   3.473 0.000515 ***
#> height             -0.2970     0.1165  -2.550 0.010781 *  
#> priorfrac           0.8226     0.2578   3.191 0.001417 ** 
#> momfrac             1.2466     0.3930   3.172 0.001512 ** 
#> armassist           0.6442     0.2519   2.557 0.010561 *  
#> raterisk_catC2      0.4690     0.2408   1.948 0.051449 .  
#> age:priorfrac      -0.4969     0.2331  -2.132 0.033044 *  
#> momfrac:armassist  -1.2805     0.6230  -2.055 0.039834 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 562.34  on 499  degrees of freedom
#> Residual deviance: 500.50  on 491  degrees of freedom
#> AIC: 518.5
#> 
#> Number of Fisher Scoring iterations: 4

Step 7: Perform Model Assessment

Measures of goodness of fit

For assessing goodness of fit, the authors recommend three tests: the Hosmer–Lemeshow deciles-of-risk test, the Osius and Rojek normal approximation to the Pearson chi-square statistic, and Stukel’s test.

This package provides one function for each test, implemented using the formulas and steps described in Chapter 5 of Applied Logistic Regression.

Hosmer-Lemeshow deciles-of-risk test

The DRtest() function performs the Hosmer–Lemeshow test and returns a table of observed and expected frequencies within each decile of risk (or group), along with the chi-squared test statistic, degrees of freedom, and corresponding p-value. The function accepts either a fitted model or a pair of vectors: one for the outcome variable and one for the predicted probabilities. The number of groups can also be specified; the default is 10.
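
The test statistic is the familiar Hosmer–Lemeshow chi-square,

\[
\hat{C} = \sum_{k=1}^{g} \frac{(o_k - n_k \bar{\pi}_k)^2}{n_k \bar{\pi}_k (1 - \bar{\pi}_k)},
\]

where, within group \(k\), \(o_k\) is the observed number of events, \(n_k\) the number of observations, and \(\bar{\pi}_k\) the mean estimated probability. When the groups are based on the fitted model, \(\hat{C}\) is referred to a chi-square distribution with \(g - 2\) degrees of freedom.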

DRtest(final.model, g = 10)
#> $results
#>                cut obs.1 exp.1 obs.0 exp.0 total
#> 1  [0.0209,0.0847]     3  3.31    47 46.69    50
#> 2   (0.0847,0.111]     4  4.86    46 45.14    50
#> 3    (0.111,0.141]     7  6.27    43 43.73    50
#> 4    (0.141,0.176]    11  8.08    40 42.92    51
#> 5    (0.176,0.208]     7  9.40    42 39.60    49
#> 6    (0.208,0.249]    13 11.40    37 38.60    50
#> 7    (0.249,0.322]     9 14.27    41 35.73    50
#> 8    (0.322,0.388]    19 17.62    31 32.38    50
#> 9    (0.388,0.482]    25 21.81    25 28.19    50
#> 10   (0.482,0.747]    27 27.98    23 22.02    50
#> 
#> $chisq
#> [1] 6.391925
#> 
#> $df
#> [1] 10
#> 
#> $p.value
#> [1] 0.6034186
#> 
#> $groups
#> [1] 10

The output table corresponds to Table 5.1 in Chapter 5 and suggests that the model fits the data well. The observed and expected frequencies align closely across the deciles of risk. Although two of the expected frequencies are less than five, the authors note that the p-value remains sufficiently reliable to support the hypothesis that the model fits.

Osius and Rojek normal approximation to the Pearson chi-square statistic

The osius_rojek() function implements the 8-step procedure described in Chapter 5, including the calculation of covariate patterns to obtain the normal approximation to the Pearson chi-square statistic. Covariate patterns refer to the unique combinations of covariates in the dataset and are important to consider when assessing goodness of fit, as they can influence test performance. The package also includes a separate function, covariate_patterns(), to compute these patterns if needed; however, they are automatically handled within osius_rojek(). This function also performs the normal approximation to the distribution of the test statistic S (sum of squared residuals), as described in the same section of the book.
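
As a quick way to see what a covariate pattern is, the number of distinct patterns for the final model can be counted directly in base R; this is an independent sketch, not a call to the covariate_patterns() helper.

# Number of distinct covariate patterns among the covariates of the final model
nrow(unique(model.matrix(final.model)))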

osius_rojek(final.model)
#> $z_chisq
#> [1] -0.3902703
#> 
#> $p_value
#> [1] 0.6963367
#> 
#> $z_s
#> [1] 0.03119926
#> 
#> $p_value_S
#> [1] 0.9751106

The resulting tests have a p-value greater than 0.05, supporting the conclusion that there is no evidence that the model fits poorly.

Stukel’s test

The stukels_test() function implements the 4-step procedure described in Chapter 5. It tests whether the logistic regression model is correctly specified, that is, whether the assumed S-shape of the logistic curve fits the data well. If the test is significant, it may indicate that the relationship between the predictors and the outcome is not well captured by the standard logistic model.
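
Following the description in the book, the test augments the fitted model with two auxiliary covariates constructed from the estimated logits \(\hat{g}_j\),

\[
z_{1j} = 0.5\,\hat{g}_j^{\,2}\, I(\hat{\pi}_j \ge 0.5), \qquad
z_{2j} = -0.5\,\hat{g}_j^{\,2}\, I(\hat{\pi}_j < 0.5),
\]

and the two added coefficients are tested jointly with a likelihood ratio test on 2 degrees of freedom; this is the G statistic reported below.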

stukels_test(final.model) 
#> $G
#> [1] 5.202026
#> 
#> $p_value
#> [1] 0.07419838
#> 
#> $model_summary
#> 
#> Call:
#> glm(formula = fracture ~ age + height + priorfrac + momfrac + 
#>     armassist + raterisk_cat + z1 + z2 + age:priorfrac + momfrac:armassist, 
#>     family = binomial, data = X.design)
#> 
#> Coefficients:
#>                   Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)        -2.4122     0.6765  -3.566 0.000363 ***
#> age                 0.6997     0.2847   2.458 0.013972 *  
#> height             -0.4693     0.1755  -2.674 0.007496 ** 
#> priorfrac           1.1574     0.4283   2.702 0.006889 ** 
#> momfrac             1.9332     0.6910   2.798 0.005143 ** 
#> armassist           0.8732     0.3615   2.416 0.015703 *  
#> raterisk_catC2      0.6428     0.2925   2.198 0.027971 *  
#> z1                 -6.6164     3.2964  -2.007 0.044736 *  
#> z2                 -0.2534     0.3710  -0.683 0.494549    
#> age:priorfrac      -0.7088     0.3382  -2.095 0.036129 *  
#> momfrac:armassist  -1.9541     0.8593  -2.274 0.022957 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 562.34  on 499  degrees of freedom
#> Residual deviance: 495.29  on 489  degrees of freedom
#> AIC: 517.29
#> 
#> Number of Fisher Scoring iterations: 5

In this case, the estimated coefficient for \(z_1\) was large, negative, and marginally significant (Wald test p = 0.045). This result suggests that the upper tail of the observed data may be longer than what the logistic model predicts. However, because only 41 subjects in the dataset have predicted probabilities greater than 0.5, the authors chose not to modify the fitted model at this stage.

Classification tables

In addition to the formal goodness-of-fit tests described above, the authors also describe methods for evaluating the model’s classification performance and examining diagnostic plots. However, they caution that these methods should be interpreted carefully, as they depend heavily on the distribution of predicted probabilities in the sample and the choice of classification threshold. They recommend using these tools primarily when the goal is to identify a model that can effectively discriminate between the two outcome groups.

The cutpoints() function helps assess classification accuracy by calculating sensitivity and specificity across a range of threshold values. This supports the selection of an appropriate cutoff for converting predicted probabilities into binary outcomes. The user can specify the lower and upper bounds of the threshold range, the step size, and whether to display a plot. The optional plot highlights, with a red line, the threshold that maximizes the trade-off between sensitivity and specificity. The table below corresponds to Table 5.7 in Chapter 5 of the book.

cutpoints(final.model, cmin = 0.05, cmax = 0.75, byval = 0.05, plot = TRUE)
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).

#>    Cutpoint Sensitivity Specificity Specificity1
#> 1      0.05       100.0     1.60000   98.4000000
#> 2      0.10        95.2    19.73333   80.2666667
#> 3      0.15        84.8    38.40000   61.6000000
#> 4      0.20        76.8    55.20000   44.8000000
#> 5      0.25        64.0    68.53333   31.4666667
#> 6      0.30        62.4    76.53333   23.4666667
#> 7      0.35        48.8    82.93333   17.0666667
#> 8      0.40        39.2    88.00000   12.0000000
#> 9      0.45        29.6    92.26667    7.7333333
#> 10     0.50        17.6    94.93333    5.0666667
#> 11     0.55         8.8    96.80000    3.2000000
#> 12     0.60         3.2    98.13333    1.8666667
#> 13     0.65         2.4    99.46667    0.5333333
#> 14     0.70         0.0    99.46667    0.5333333
#> 15     0.75         0.0   100.00000    0.0000000

The diagnosticplots_class() function generates the four recommended diagnostic plots described in Chapter 5 to assess a model’s discrimination, i.e., its ability to distinguish between outcome groups:

1. Scatter plot of the outcome versus the estimated probabilities from the fitted model
2. Plot of sensitivity versus 1 – specificity for all possible cutpoints
3. Density plot of the estimated probabilities for Outcome = 0
4. Density plot of the estimated probabilities for Outcome = 1

diagnosticplots_class(final.model)
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).

There is some overlap between the two histograms for the outcome classes. Excellent discrimination would be indicated by almost complete separation. This overlap is also visible in the jittered plot of the outcome versus the estimated probabilities. However, the ROC curve suggests that the model demonstrates acceptable discriminatory ability.
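
If a single numeric summary of discrimination is desired, the area under the ROC curve can be computed with an external package such as pROC (this is not a CLRtools function and assumes pROC is installed); by the conventions in the book, values between 0.7 and 0.8 indicate acceptable discrimination.

# Area under the ROC curve for the final model (requires the pROC package)
library(pROC)
auc(roc(glow500$fracture, fitted(final.model)))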

\(R^2\) measures

Several \(R^2\)-type measures are presented, depending on whether the number of covariate patterns equals the number of observations (J = n) or is smaller (J < n). Both cases are supported in the package through the functions r_measures() and rcv_measures(). However, the authors note that these \(R^2\) measures tend to be lower than those typically observed in linear regression models, and they therefore recommend focusing on overall goodness-of-fit tests for model assessment.

# Measures assuming J = n, with J being the number of covariate patterns
r_measures(final.model)
#> $squared.Pearson.cc
#> [1] 0.1235509
#> 
#> $R_ss
#> [1] 0.1235491
#> 
#> $adjusted_R
#> [1] 0.1092689
#> 
#> $adjust_shrink_R
#> [1] 0.1075657
#> 
#> $R_L
#> [1] 0.1099674
# Measures assuming J < n
rcv_measures(final.model)
#> $squared.Pearson.cc
#> [1] 0.1420452
#> 
#> $R_ss
#> [1] 0.1380405
#> 
#> $adjusted_R
#> [1] 0.1174675
#> 
#> $adjust_shrink_R
#> [1] 0.1201824
#> 
#> $R_LS
#> [1] 0.1163538
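
For the J = n case, the reported \(R_L\) appears to coincide with the familiar deviance-based (McFadden-type) measure, which can be checked directly from the fitted model:

# Deviance-based measure: (null deviance - residual deviance) / null deviance
# (approximately 0.10997, matching the $R_L value reported above)
(final.model$null.deviance - final.model$deviance) / final.model$null.deviance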

Logistic regression diagnostics

To complement the summary tests that assess the agreement between observed and fitted values, it is recommended to examine regression diagnostics. These involve analyzing residuals and influence measures for each covariate pattern, including Pearson and standardized Pearson residuals, deviance residuals, leverage values, and the influence measures \(\Delta\chi^2\), \(\Delta D\), and \(\Delta\hat{\beta}\).

These diagnostics help identify data points that are outliers, are poorly fit by the model, or exert high influence on the estimated coefficients. They provide insight into whether specific observations are disproportionately affecting the model. All of these measures can be obtained using the residuals_logistic() function included in the package, which also produces the four recommended plots for the diagnostic analysis.
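
For reference, writing \(r_j\) for the Pearson residual, \(d_j\) for the deviance residual, and \(h_j\) for the leverage of covariate pattern \(j\), the influence measures reported in the output below are consistent with the forms

\[
\Delta\chi^2_j = \frac{r_j^2}{1 - h_j}, \qquad
\Delta D_j = \frac{d_j^2}{1 - h_j}, \qquad
\Delta\hat{\beta}_j = \frac{r_j^2\, h_j}{(1 - h_j)^2}.
\]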

head(residuals_logistic(final.model)$x.cv)
#>   X.Intercept.        age      height priorfrac momfrac armassist
#> 1            1 -0.7299597 -0.52930593         0       0         0
#> 2            1 -0.3962384 -0.21461750         0       0         0
#> 3            1  2.1622915 -0.68665014         1       1         1
#> 4            1  1.4948489 -0.21461750         0       0         0
#> 5            1 -0.8412001 -1.47337119         0       0         0
#> 6            1 -0.1737576 -0.05727329         1       0         0
#>   raterisk_catC2 age.priorfrac momfrac.armassist y_j m  est.prob    leverage
#> 1              0     0.0000000                 0   1 3 0.1077543 0.014956388
#> 2              0     0.0000000                 0   0 1 0.1155329 0.004117951
#> 3              0     2.1622915                 1   0 1 0.4455498 0.070803386
#> 4              0     0.0000000                 0   0 1 0.2570885 0.015026547
#> 5              0     0.0000000                 0   0 1 0.1311556 0.009009089
#> 6              0    -0.1737576                 0   0 1 0.2575707 0.015721832
#>      pearson   spearson   deviance  delta.beta delta.chi delta.deviance
#> 1  1.2600850  1.2696152  1.0453580 0.024474592 1.6119227      1.1093654
#> 2 -0.3614198 -0.3621663 -0.4955198 0.000542362 0.1311644      0.2465552
#> 3 -0.8964311 -0.9299575 -1.0860740 0.065898051 0.8648209      1.2694372
#> 4 -0.5882647 -0.5927350 -0.7709454 0.005359889 0.3513347      0.6034242
#> 5 -0.3885282 -0.3902902 -0.5302665 0.001384798 0.1523265      0.2837388
#> 6 -0.5890073 -0.5936927 -0.7717870 0.005630005 0.3524710      0.6051696
# Plot of the leverage values
residuals_logistic(final.model)$leverage

# Plot of the change in the Pearson chi-square statistic (delta chi-square)
residuals_logistic(final.model)$change.Pearsonchi

# Plot of the change in deviance (delta D)
residuals_logistic(final.model)$change.deviance

# Plot of the Cook's distance-type influence measure
residuals_logistic(final.model)$cooks

# Additional influence plot (change.Pb component)
residuals_logistic(final.model)$change.Pb

Based on the diagnostic plots, the model appears to fit reasonably well. To explore individual outliers more closely, the data frame of residuals returned by the residuals_logistic() function can be used. Outliers can be identified by inspecting the plots and looking for values that deviate substantially from the rest of the data. The authors recommend focusing on points that are clearly separated from the general pattern. The code below flags and filters these potential outliers, recovering the eight covariate patterns identified by the authors in Table 5.10 for further investigation.

# Extract the residuals and other measures
residuals.data <- residuals_logistic(final.model)$x.cv

# Restore variable names for readability
colnames(residuals.data)[4:7] <- c('priorfrac', 'momfrac', 'armassist', 'raterisk_cat')

# Flag observations with large influence or poor fit, and filter them
residuals.data <- residuals.data %>%
  mutate(outliers = ifelse(delta.chi > 10 | delta.deviance > 4.5 | delta.beta > 0.12, 1, 0))
out <- residuals.data %>% filter(outliers == 1)

# Select and format relevant columns for display
out <- out[, c(2:7, 10:12, 18:19, 13, 17)]
out[order(out[, 1], decreasing = FALSE), ]
#>          age     height priorfrac momfrac armassist raterisk_cat y_j m
#> 6 -1.3974023 -1.0013386         0       0         0            0   1 1
#> 7 -1.2861619  0.7294478         0       0         0            0   1 1
#> 5 -0.9524406  0.1000709         0       1         1            1   1 1
#> 2 -0.6187193 -1.3160270         1       1         0            1   0 1
#> 3 -0.3962384  0.8867920         0       0         0            0   1 1
#> 4 -0.3962384  1.0441362         0       1         0            0   2 2
#> 1  0.1599637 -3.0468133         1       1         0            0   0 1
#> 8  0.7161659  2.1455457         0       1         1            0   1 1
#>     est.prob delta.chi delta.deviance    leverage delta.beta
#> 6 0.08968070 10.231569       4.861437 0.007906718 0.08154287
#> 7 0.05872715 16.102438       5.696073 0.004628971 0.07488436
#> 5 0.20813234  3.973973       3.278881 0.042611687 0.17687462
#> 2 0.73550298  2.913644       2.786956 0.045607139 0.13923298
#> 3 0.08607058 10.665946       4.927152 0.004460424 0.04778780
#> 4 0.23818050  6.738049       6.044876 0.050616446 0.35923953
#> 1 0.74689284  3.130850       2.915459 0.057477728 0.19092826
#> 8 0.17463758  4.934131       3.643676 0.042152698 0.21713998