This document shows examples for using the sjp.glm()
function of the sjPlot package. However, many other functions for plotting regression models, like sjp.lm()
, sjp.lmer()
or sjp.glmer()
work in a similar way and also offer the various plot-types (predictions, marginal effects, fixed effects…).
library(sjPlot)
library(sjmisc)
library(sjlabelled)
data(efc)
# set basic theme options
set_theme("forest",
axis.title.size = .85,
axis.textsize = .85,
legend.size = .8,
geom.label.size = 3.5)
First, we fit some models (binomial logit, poisson and a negative binomial), which will be used in the following examples.
# create binary response
y <- ifelse(efc$neg_c_7 < median(na.omit(efc$neg_c_7)), 0, 1)
# create data frame for fitting model
df <- data.frame(y = to_factor(y),
sex = to_factor(efc$c161sex),
dep = to_factor(efc$e42dep),
barthel = efc$barthtot,
education = to_factor(efc$c172code))
# set variable label for response
set_label(df$y) <- "High Negative Impact"
# fit model
fit <- glm(y ~.,
data = df,
family = binomial(link = "logit"))
# set variable label for service usage
set_label(efc$tot_sc_e) <- "Total number of services used by carer"
# fit poisson model
fit2 <- glm(tot_sc_e ~ neg_c_7 + e42dep + c161sex,
data = efc, family = poisson(link = "log"))
# fit negative binomial model as well
library(MASS)
fit3 <- glm.nb(tot_sc_e ~ neg_c_7 + e42dep + c161sex, data = efc)
With the sjp.glm()
function you can plot the odds ratios (or e.g. incidents ratios for poisson models) with confidence intervals as so called forest plots.
sjp.glm(fit)
Due to the log-scaling of the x-axis - which should be done when plotting odds ratios (see here and here) - the x-axis values have an exponential growth. However, you can transform the ticks with trns.ticks
(defaults to TRUE
) to get proportional distances between the values. The x-axis-tick marks are set accordingly.
sjp.glm(fit, trns.ticks = FALSE)
By default, the odds ratios are sorted from highest to lowest value. You can also keep the order of predictors as they were introduced into the model if you set sort.est
to FALSE
.
sjp.glm(fit, sort.est = FALSE)
As you can see, the fitted model contains two continuous variables. The odds ratios for these predictors may a bit more difficult to interprete than categorical or factor variables, because of the missing reference category. Thus, you can also plot predicted probability or incidents of all predictors (covariates, coefficients) with type = "slope"
, marginal effects with type = "eff"
and predictions for the response with type = "pred"
.
The predicted values from this plot type are based on the intercept’s estimate and each specific term’s estimate. All other co-variates are set to zero (i.e. ignored), which corresponds to family(fit)$linkinv(eta = b0 + bi * xi)
(where xi
is the estimate).
sjp.glm(fit, type = "slope")
A probability curve of all predictors is plotted, which indicates the probability that the event (indicated by the response) occurs for each value of the predictor (not adjusted for remaining co-variates). In the above example, the first facet plot would be interpreted as: with increasing Barthel-Index (which means, better functional / physical status), the probability that caring for a dependent person is negatively perceived, decreases (in short: the less dependent a person I care for is, the less negative is the impact of care).
This kind of plot may be more informative then the odds ratio of 0.97
for the predictor Total score BARTHEL INDEX.
The same works for other model families or link functions. Confidence intervals are shown when show.ci = TRUE
, and data points are not plotted when show.scatter = FALSE
. You can also plot single plots for each coefficient when facet.grid = FALSE
. To get selected plots for particular predictors only, pass the term names to the vars
argument. In the following example, only the relationship between barthel and negative impact is shown.
sjp.glm(fit, type = "slope", facet.grid = FALSE, show.ci = TRUE, vars = "barthel")
With type = "eff"
, you can plot marginal effects (predicted marginal probabilities resp. predicted marginal incident rates), where all remaining co-variates are set to the mean. Unlike type = "slope"
, this plot type adjusts for co-variates.
# the binary outcome
sjp.glm(fit, type = "eff")
As you can see in the above examples, multiple plots for type = "eff"
are plotted as facet grid resp. as facet wrap. Since this does not allow to set a different x-scale for each plot, x-axis are not properly labelled. However, facet.grid = FALSE
produces a single plot for each predictor.
To arrange all predictors of multiple in one plot, as grid, use the plot_grid()
function on multiple plot objects. plot_grid()
requires multiple plots, so you have to set facet.grid = FALSE
to get a plot.list
-value as return value from the function (see ?sjp.lm
on Return Value for more details). This allows you arrange multiple plots as grid in one plot, but with different x-axis-scales.
# get list of all plots
p <- sjp.glm(fit, type = "eff", facet.grid = FALSE,
show.ci = TRUE, prnt.plot = FALSE)$plot.list
# plot all marginal effects, as grid, proper x-axes
# also, set margins for this example
plot_grid(p, margin = c(0.3, 0.3, 0.3, 0.3))
With type = "pred"
, you can plot predicted values for the response, related to specific model predictors. The predicted values of the response are computed, based on the predict.glm
method and corresponds to predict(fit, type = "response")
. This plot type requires the vars
argument to select specific terms that should be used for the x-axis and - optional - as first or second grouping factor. Hence, vars
must be a character vector with the names of one to three model predictors.
# the binary outcome
sjp.glm(fit, type = "pred", vars = "barthel")
# the count outcome
sjp.glm(fit3, type = "pred", vars = c("neg_c_7", "e42dep"), show.ci = TRUE)
# the count outcome, non faceted
sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "e42dep"), facet.grid = FALSE)
# the count outcome, grouped gender and education, w/o data points
# and adjusted y-limits, to completely show CI bands
sjp.glm(fit2, type = "pred", vars = c("neg_c_7", "c161sex","e42dep"), facet.grid = FALSE, show.ci = TRUE, show.scatter = FALSE, axis.lim = c(0, 4))