rmsBMA: Reduced Model Space Bayesian Model Averaging

Krzysztof Beck (Lazarski University)

2026-03-06

Acknowledgment: This article was prepared within the research project “Bayesian simultaneous equations model averaging — theoretical development and R package,” funded by the Polish National Science Centre, decision No. DEC-2021/43/B/HS4/01745.

Abstract. This article introduces the rmsBMA R package for Bayesian model averaging in settings with many candidate regressors relative to the available sample size. The package implements reduced model space BMA by restricting attention to models with at most \(M\) regressors. This restriction preserves degrees of freedom for estimation and inference and concentrates posterior mass on parsimonious specifications, while still accounting for model uncertainty within the admissible subset of models.

rmsBMA supports standard prior structures used in the BMA literature, including Zellner-type \(g\)-priors and alternative model priors with appropriate truncation under model space reduction. The package further provides heteroscedasticity-consistent inference, Extreme Bounds Analysis, jointness measures, and graphical tools for exploring prior and posterior model probabilities, model size distributions, and coefficient distributions. In addition, it introduces group dilution, a non-empirical rule that penalizes models containing multiple proxies for the same theoretical construct. An empirical application illustrates the workflow.

1 Introduction

The problem of model uncertainty plays a crucial role in scientific research that relies on statistical inference (Leamer 1978). In recent decades, the most prominent solution to this problem has been Bayesian model averaging (BMA) (Kass and Raftery 1995; Raftery 1995; Raftery, Madigan, and Hoeting 1997). This approach has become especially popular in economics (e.g., Liu and Maheu 2009; Ductor and Leiva-Leon 2016; Figini and Giudici 2017; D’Andrea 2022; Horvath, Horvatova, and Siranova 2024; K. Beck and Grabowski 2025); however, it has also been widely adopted in other fields (e.g., Sloughter, Gneiting, and Raftery 2013; Baran and Möller 2015; Aller, Ductor, and Grechyna 2021; Guliyev 2024; Payne, Ray, and Thomann 2024; J. Beck et al. 2025). To this day, BMA remains one of the primary tools for examining the determinants of economic growth (Fernández, Ley, and Steel 2001a, 2001b; Sala-I-Martin, Doppelhofer, and Miller 2004; Eicher, Papageorgiou, and Roehn 2007; Ley and Steel 2012; Moser and Hofmarcher 2014; Arin, Braunfels, and Doppelhofer 2019).

The growing interest in BMA has been fueled by the availability of R packages such as BMA (Raftery, Painter, and Volinsky 2005), BAS (Clyde, Ghosh, and Littman 2011), BMS (Feldkircher and Zeugner 2015), and badp (K. Beck, Wyszyński, and Dubel 2025), as well as the gretl BMA package developed by Błażejowski and Kwiatkowski (2015). Each of these packages provides unique functionalities. The BMA package enables model averaging for generalized linear models and survival models, BAS implements efficient sampling algorithms, BMS offers a comprehensive prior structure and adaptive sampling procedures, while badp addresses the issue of simultaneity. The package proposed by Błażejowski and Kwiatkowski (2015) extends BMA methodology to the gretl environment.

rmsBMA extends this line of software by addressing issues that remain unexplored in existing implementations. In many empirical applications, researchers face situations in which the number of potential determinants is very large relative to the available sample size. In extreme cases, the number of candidate regressors may even exceed the number of observations. Such situations frequently arise when long questionnaires generate rich information on a limited number of subjects. This setting often produces an extensive list of theory-driven explanatory variables intended to capture specific dimensions of behavior. Consequently, researchers aim to extract a limited subset of regressors that explain variation in the dependent variable in a robust and economically meaningful way. This objective is consistent with the principle of model parsimony in applied research. A classical example is the literature on the money demand function, which considers a broad set of potential determinants. Nevertheless, a workable theory of money demand typically emphasizes only a small number of key variables (Judd and Scadding 1982; Laidler 1993).

The challenge posed by a large number of potential regressors relative to a limited sample size can be addressed within the BMA framework by restricting the model space. In particular, the researcher may limit attention to models containing at most \(M\) regressors. This constraint ensures that a sufficient number of observations remains available for meaningful statistical inference, while simultaneously focusing the analysis on the most relevant determinants. rmsBMA facilitates Bayesian model averaging within a reduced model space and provides graphical and analytical tools that allow users to fully exploit this framework.

Another challenge in selecting robust determinants from a large pool of regressors is multicollinearity. In the Bayesian model averaging and model selection literature, this issue is typically addressed through the dilution prior proposed by George (2010). However, this approach is empirical in nature, as it relies on the correlation structure of the regressors and therefore departs from a fully Bayesian treatment of prior specification.

To address this limitation, the rmsBMA package introduces a non-empirical dilution rule, termed group dilution. This approach allows users to specify dilution parameters for groups of variables associated with the same underlying theory. As a result, prior probabilities can be appropriately adjusted for models that include multiple proxies capturing the same fundamental determinant.

Finally, the rmsBMA package allows users to specify a rich prior structure, compute jointness measures (Doppelhofer and Weeks 2009; Ley and Steel 2007; Hofmarcher et al. 2018), conduct Extreme Bounds Analysis (EBA), perform model selection procedures, and generate a wide range of graphical representations of the results. Consequently, rmsBMA combines capabilities that are not jointly available in existing R packages or alternative software environments and further contributes original methodological innovations.

The remainder of the manuscript is structured as follows. Section 2 provides a theoretical overview of Bayesian model averaging, with a particular focus on Bayesian estimation, prior structures, jointness measures, and Extreme Bounds Analysis. Data preparation is described in Section 3, while Section 4 discusses the construction and estimation of the model space. Section 5 presents an overview of the rmsBMA functions for performing Bayesian model averaging, computing jointness measures, and presenting the estimation results. The details of the model prior specifications are discussed in Section 6. Finally, Section 7 concludes.

2 Bayesian model averaging: A theoretical overview

This section outlines the main theoretical foundations of the package. Section 2.1 presents the model setup, \(g\)-regression, and the calculation of the marginal likelihood. The principal BMA statistics are described in Section 2.2. Sections 2.3, 2.4, and 2.5 discuss the \(g\)-prior, model prior, and dilution prior specifications. Jointness measures are examined in Section 2.6. Finally, Section 2.7 introduces Extreme Bounds Analysis.

2.1 Model setup and Bayesian estimation

rmsBMA is designed to accommodate variations of the standard linear regression model:

\[ y = \alpha l_N + X_j \beta_j + \varepsilon, \]

where \(y = (y_1, y_2, \ldots, y_N)'\) denotes the \(N \times 1\) vector of the dependent variable and \(N\) is the total number of observations. The parameter \(\alpha\) is a scalar intercept term, and \(l_N\) is an \(N \times 1\) vector of ones. The matrix \(X_j\) is an \(N \times k_j\) matrix of regressors included in model \(j\), and \(\beta_j\) is the corresponding \(k_j \times 1\) vector of coefficients. The index \(j\) denotes the model specification, the total number of possible models is \(2^K\) (including the model with no regressors), \(K\) denotes the total number of potential regressors, and \(k_j\) is the number of regressors in model \(j\). The error term \(\varepsilon\) is an \(N \times 1\) vector assumed to follow a normal distribution,

\[ \varepsilon \sim N(0_N, h^{-1} I_N), \]

where \(h = \sigma^{-2}\) is the error precision. As is common in the literature, noninformative priors are imposed on parameters that are common to all models (Koop 2003). The prior on the precision parameter is given by

\[ p(h) \propto \frac{1}{h}, \]

while the prior on the intercept is specified as

\[ p(\alpha) \propto 1. \]

The prior on the coefficient vector \(\beta_j\) is assumed to be normally distributed:

\[ \beta_j \mid h \sim N(\underline{\beta_j}, h^{-1}\underline{V_j}). \]

The prior mean reflects the assumption that regressors have no systematic effect on the dependent variable. Accordingly,

\[ \underline{\beta_j} = 0_{k_j}. \]

The prior covariance matrix \(\underline{V_j}\) is specified using Zellner’s \(g\)-prior (Zellner 1986), such that

\[ \underline{V_j} = \left(g_j X_j^{\top} X_j\right)^{-1}, \]

where \(g_j\) is a hyperparameter chosen by the user. Consequently,

\[ \beta_j \mid h \sim N\!\left(0_{k_j}, h^{-1}\left(g_j X_j^{\top} X_j\right)^{-1}\right). \]

The posterior distribution of \(\beta_j\) follows a multivariate \(t\) distribution with mean

\[ E(\beta_j|y,M_j) \equiv \bar{\beta_j} = \bar{V_j}X_j^{\top}y \]

and covariance matrix

\[ Var(\beta_j|y,M_j) = \frac{N\bar{s_{j}^2}}{N-2}\bar{V_j} \]

where \(M_j\) denotes model \(j\), \(\bar{V_j} = \left[(1+g_j)\,X_j^{\top}X_j\right]^{-1}\), \(N\) is the number of degrees of freedom,

\[ \bar{s}_{j}^2 = \frac{\frac{1}{1+g_j}y^{\top}P_{Z_j}y+\frac{g_j}{1+g_j}(y-\bar{y}l_N)^{\top}(y-\bar{y}l_N)}{N} \]

and

\[ P_{Z_j} = I_N - Z_j(Z_j^{\top}Z_j)^{-1}Z_j^{\top} \]

with \(Z_j = (l_N, X_j)\). Within this framework, the marginal likelihood of model \(j\) is given by

\[ p(y|M_j) \propto \left(\frac{g_j}{1+g_j}\right)^{\frac{k_j}{2}}\left[\frac{1}{1+g_j}y^{\top}P_{Z_j}y+\frac{g_j}{1+g_j}(y-\bar{y}l_N)^{\top}(y-\bar{y}l_N)\right]^{-\frac{N-1}{2}} \]

There are two main extensions to this framework. First, the user may opt out of \(g\)-regression and perform Bayesian model averaging using classical OLS estimates, the so-called Bayesian averaging of classical estimates (BACE) approach proposed by Sala-I-Martin, Doppelhofer, and Miller (2004). Within this approach, the marginal likelihood of model \(M_j\) may be written as (Leamer 1978):

\[ l(y|M_j) \propto N^{-\frac{k_{j}}{2}}\left(y^{\top}P_{Z_j}y\right)^{-\frac{N}{2}} \]
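As an illustration, the BACE weight of a single model can be computed on the log scale from its OLS residual sum of squares. The helper below is a minimal sketch under the approximation \(N^{-k_j/2}\,\mathrm{SSE}_j^{-N/2}\); `log_ml_bace` is a hypothetical function for exposition, not part of the rmsBMA API.

```r
# Log BACE weight log[N^(-k/2) * SSE^(-N/2)] for one model
# (log_ml_bace is an illustrative helper, not exported by rmsBMA)
log_ml_bace <- function(y, X) {
  Z   <- cbind(1, as.matrix(X))      # add intercept column
  e   <- lm.fit(Z, y)$residuals      # OLS residuals
  sse <- sum(e^2)                    # sum of squared residuals
  N   <- length(y)
  k   <- ncol(as.matrix(X))          # number of regressors (excl. intercept)
  -(k / 2) * log(N) - (N / 2) * log(sse)
}

set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
log_ml_bace(y, x)
```

Models with smaller residual variance receive larger weights, while the \(N^{-k_j/2}\) term penalizes additional regressors.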

The second extension allows the user to utilize a heteroscedasticity-consistent (HC) covariance matrix. In this case, the homoscedastic variance expression derived under the normal error assumption is replaced with the HC sandwich estimator of MacKinnon and White (1985),

\[ \widehat{\mathrm{Var}}_{HC}(\hat{\beta}_j) = \frac{N}{N-k_j}(Z_j^{\top} Z_j)^{-1}Z_j^{\top}\mathrm{diag}(\hat{\varepsilon}_j^{2})Z_j(Z_j^{\top} Z_j)^{-1}, \]

where \(\hat{\varepsilon}_j = y - Z_j \hat{\beta}_{OLS,j}\) denotes the vector of OLS residuals. This modification provides inference that remains valid under unknown forms of heteroscedasticity. While posterior means are derived under the \(g\)-prior specification, the robust covariance matrix replaces the homoscedastic variance estimator for inference purposes.
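The sandwich estimator above can be constructed by hand for a single model. The following sketch uses simulated heteroscedastic toy data and applies the degrees-of-freedom correction to the column count of \(Z_j\); it illustrates the formula rather than the package's internal code.

```r
# Manual construction of the MacKinnon-White HC covariance matrix
# for one model (simulated toy data, not rmsBMA internals)
set.seed(123)
N <- 60
x <- rnorm(N)
y <- 1 + 0.5 * x + rnorm(N) * (1 + abs(x))   # heteroscedastic errors
Z <- cbind(1, x)                             # Z_j = (l_N, X_j)
k <- ncol(Z)                                 # columns of Z used for the correction
ZtZi  <- solve(crossprod(Z))                 # (Z'Z)^{-1}
b_ols <- ZtZi %*% crossprod(Z, y)            # OLS coefficients
e     <- drop(y - Z %*% b_ols)               # OLS residuals
V_hc  <- N / (N - k) * ZtZi %*% t(Z) %*% diag(e^2) %*% Z %*% ZtZi
sqrt(diag(V_hc))                             # HC standard errors
```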

2.2 Bayesian model averaging

The setup described in Section 2.1 can be used to construct the model space for a given set of potential regressors. The total number of possible models equals \(2^K\). However, within the reduced model space approach, the user may restrict attention to models containing at most \(M\) regressors. In this case, the total number of admissible models \((J)\) is given by

\[ J = \sum_{m=0}^{M} \binom{K}{m}. \]
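The size of the reduced model space can be computed directly; `reduced_space_size` below is a small illustrative helper, not a function exported by rmsBMA.

```r
# Number of admissible models with K candidate regressors
# and at most M regressors per model (illustrative helper)
reduced_space_size <- function(K, M) {
  sum(choose(K, 0:M))
}

reduced_space_size(K = 20, M = 3)   # 1351 models, versus 2^20 = 1048576
```

The reduction is dramatic: with \(K = 20\) and \(M = 3\), only 1,351 of the 1,048,576 possible models need to be estimated.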

Given the considered model space, it is possible to perform Bayesian model averaging. The posterior model probability (PMP) of model \(j\) given the data is

\[ P(M_{j}|y)=\frac{P(y|M_{j})P(M_{j})}{\sum_{j=1}^{J}P(y|M_{j})P(M_{j})} \]

where \(P(M_{j})\) denotes the prior model probability. In other words, the PMP represents the share of model \(j\) in the total posterior probability mass. Alternatively, for BACE, the PMP of model \(j\) is calculated as:

\[ P(M_{j}|y)=\frac{l(y|M_{j})P(M_{j})}{\sum_{j=1}^{J}l(y|M_{j})P(M_{j})}. \]
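Numerically, the PMPs are just the marginal-likelihood-times-prior products normalized to sum to one. A toy sketch with hypothetical values:

```r
# PMPs from (marginal likelihood x model prior), hypothetical toy values
ml    <- c(1.0, 0.5, 0.25)        # p(y | M_j), up to a common constant
prior <- rep(1 / 3, 3)            # uniform model prior
pmp   <- ml * prior / sum(ml * prior)
pmp                               # c(4, 2, 1) / 7
```

Because only ratios matter, any common proportionality constant in the marginal likelihoods cancels in the normalization.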

With PMPs, it is possible to calculate useful BMA statistics. Let \(\pi_k\) denote the random variable that is equal to one if the \(k^{\text{th}}\) regressor is a determinant of the dependent variable, and zero otherwise. The posterior inclusion probability (PIP) for the regressor is given by:

\[ P(\pi_k = 1 |y) = \sum_{j=1}^{J} \mathbf{1} (k^{\text{th}} \text{ regressor is in model } M_{j}) \cdot P(M_{j}|y) \]

where the indicator function \(\mathbf{1}\) equals one if the regressor is part of model \(M_j\) and zero otherwise. In other words, the PIP measures how likely it is that the given regressor has an impact on the variable of interest.
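The PIP is simply the PMP mass of the models that contain the regressor. With hypothetical PMPs and inclusion indicators:

```r
# PIP as the PMP mass of models containing each regressor (toy values)
pmp <- c(0.5, 0.3, 0.2)              # posterior model probabilities
inc <- rbind(x1 = c(1, 1, 0),        # x1 appears in models 1 and 2
             x2 = c(1, 0, 1))        # x2 appears in models 1 and 3
pip <- drop(inc %*% pmp)             # one PIP per regressor
pip                                  # x1 = 0.8, x2 = 0.7
```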

Another important statistic is the posterior mean (PM) of a given parameter \(\beta_k\). Let \(\pi_{\beta_k}\) denote the random variable that is equal to one if the given parameter is present in the model, and zero otherwise. The posterior mean of \(\beta_k\) is given by:

\[ \mathbb{E}(\beta_k|y)=\sum_{j=1}^{J}\widehat{\beta}_{k,j} \cdot P(M_{j}, \pi_{\beta_k} = 1 |y) \]

where \(\widehat{\beta}_{k,j}\) is the value of the coefficient \(\beta_k\) in model \(j\). It gives the expected value of the parameter, averaged over all considered models.

The posterior variance of the parameter \(\beta_k\) is equal to:

\[ \begin{split} Var(\beta_k|y)=\ &\sum_{j=1}^{J} Var(\beta_{k,j}|y,M_j)\cdot P(M_{j}|y) +\\[1ex] &\sum_{j=1}^{J}\Bigl[\widehat{\beta}_{k,j}-\mathbb{E}(\beta_k|y)\Bigr]^{2}\cdot P(M_{j}|y) \end{split} \]

where \(Var(\beta_{k,j}|y,M_{j})\) denotes the conditional variance of the coefficient \(\beta_k\) in model \(M_{j}\) (that is, assuming that \(M_j\) is the true model), and \(\widehat{\beta}_{k,j} = 0\) and \(Var(\beta_{k,j}|y,M_j) = 0\) for models that exclude regressor \(k\). The posterior standard deviation (PSD) of \(\beta_k\) is then defined as the square root of the variance:

\[ SD(\beta_k | y) = \sqrt{ Var(\beta_k | y) } \]

Alternatively, one might be interested in the mean and variance conditional on the inclusion of a given parameter, i.e., assuming that it is definitely part of the model. Note that inclusion is determined by the presence of the related regressor. The conditional posterior mean (PMcon) of a parameter \(\beta_k\) is given by:

\[ \mathbb{E}(\beta_k | \pi_{\beta_k}=1,y)=\frac{\mathbb{E}(\beta_k|y)}{P(\pi_{\beta_k} = 1|y)}. \]

Similarly, the conditional variance is:

\[ Var(\beta_k|\pi_{\beta_k}=1,y)=\frac{Var(\beta_k|y)+\mathbb{E}(\beta_k|y)^2}{P(\pi_{\beta_k}=1|y)}-\mathbb{E}(\beta_k|\pi_{\beta_k}=1,y)^2 \]

and so the conditional standard deviation (PSDcon) is:

\[ SD(\beta_k | \pi_{\beta_k}=1,y) = \sqrt{ Var(\beta_k | \pi_{\beta_k}=1,y)} \]
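The unconditional and conditional moments can be illustrated with three toy models, the third of which excludes the regressor. The sketch below averages over all models, with excluded models contributing zero estimates and zero conditional variances; all numbers are hypothetical.

```r
# Toy illustration of posterior moments of a coefficient beta_k
pmp      <- c(0.5, 0.3, 0.2)        # posterior model probabilities
beta_hat <- c(0.3, 0.5, 0.0)        # model-specific estimates (0 when excluded)
var_con  <- c(0.01, 0.02, 0.0)      # conditional variances (0 when excluded)
incl     <- c(1, 1, 0)              # inclusion indicator for the regressor

pip <- sum(pmp * incl)                                    # 0.8
pm  <- sum(beta_hat * pmp)                                # posterior mean: 0.30
pv  <- sum(var_con * pmp) + sum((beta_hat - pm)^2 * pmp)  # posterior variance
psd <- sqrt(pv)                                           # posterior SD

pm_con  <- pm / pip                                       # conditional mean
var_c   <- (pv + pm^2) / pip - pm_con^2                   # conditional variance
psd_con <- sqrt(var_c)                                    # conditional SD
```

Note that the conditional mean (0.375) exceeds the unconditional mean (0.30) exactly by the factor \(1/\mathrm{PIP}\), since excluded models pull the unconditional average toward zero.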

The next set of statistics that can be calculated within the BMA framework concerns the assessment of the sign of a coefficient associated with a specific regressor. The first measure is the posterior sign certainty (PSC), defined as the posterior probability that a coefficient has the same sign as its posterior mean. PSC is given by

\[ P(\operatorname{sign}(\beta_k) \mid y) = \begin{cases} \displaystyle \sum_{j=1}^{J} P(M_j \mid y)\, CDF\!\left(t_{k,j}; \mathrm{DF}_j\right), & \text{if } \operatorname{sign}\!\left[E(\beta_k \mid y)\right] = 1, \\[12pt] \displaystyle 1 - \sum_{j=1}^{J} P(M_j \mid y)\, CDF\!\left(t_{k,j}; \mathrm{DF}_j\right), & \text{if } \operatorname{sign}\!\left[E(\beta_k \mid y)\right] = -1, \end{cases} \]

where

\[ t_{k,j} \equiv \frac{\hat{\beta}_{k \mid M_j}} {\widehat{\mathrm{SD}}_{k \mid M_j}}, \]

\(\mathrm{DF}_j\) denotes the number of degrees of freedom in model \(j\), and \(\mathrm{CDF}(\cdot;\mathrm{DF}_j)\) denotes the cumulative distribution function of the Student-\(t\) distribution with \(\mathrm{DF}_j\) degrees of freedom.

The second measure is the posterior probability of a positive sign of the coefficient, denoted by \(P(+)\). It is defined as

\[ P(\beta_k > 0 \mid y) = \sum_{j \in \mathcal{M}_k} P(M_j \mid y)\, CDF\!\left( t_{k,j}; \mathrm{DF}_j \right). \]

While PSC measures the probability that the coefficient retains the same sign as its posterior mean, \(P(+)\) directly measures the posterior probability that the coefficient is strictly positive, where the sum runs over \(\mathcal{M}_k\), the set of models that include regressor \(k\).
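\(P(+)\) can be computed from the model-specific \(t\)-ratios with the Student-\(t\) CDF (`pt` in R). A toy sketch with hypothetical PMPs, \(t\)-ratios, and degrees of freedom:

```r
# Posterior probability of a positive coefficient, P(+),
# from model-specific t-ratios (hypothetical toy values)
pmp <- c(0.6, 0.4)               # PMPs of the models including the regressor
t   <- c(2.1, 1.5)               # t-ratios of the coefficient in each model
df  <- c(40, 38)                 # degrees of freedom of each model
p_pos <- sum(pmp * pt(t, df))    # PMP-weighted average of CDF values
p_pos
```

Here both \(t\)-ratios are positive, so \(P(+)\) is close to one; a mixture of positive and negative \(t\)-ratios would pull it toward 0.5.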

The BMA statistics allow the assessment of the robustness of the examined regressors. Raftery (1995) classifies a variable as weak, positive, strong, and very strong when the posterior inclusion probability (PIP) is between 0.5 and 0.75, between 0.75 and 0.95, between 0.95 and 0.99, and above 0.99, respectively. Raftery (1995) also refers to a variable as robust when the absolute value of the ratio of the posterior mean (PM) to the posterior standard deviation (PSD) exceeds 1, indicating that the regressor improves the explanatory power of the regression. Masanjala and Papageorgiou (2008) propose a more stringent criterion, requiring the ratio to exceed 1.3, while Sala-I-Martin, Doppelhofer, and Miller (2004) argue for 2.

2.3 \(g\)-priors

In order to obtain the marginal likelihood function given in Section 2.1, the user needs to specify the hyperparameter \(g\) in Zellner’s \(g\)-prior. The parameter \(g\) controls the prior precision of the regression coefficients and thus determines the degree of shrinkage toward zero. Under this parameterization, smaller values of \(g\) imply weaker shrinkage (i.e., more diffuse priors), while larger values impose stronger shrinkage toward the null model.

The rmsBMA package offers the user a choice between the following conventionally applied values of \(g\):

  1. Unit information prior (Kass and Wasserman 1995)

    \[g = \frac{1}{N}\]

  2. Risk inflation criterion (Foster and George 1994)

    \[g = \frac{1}{K^{2}}\]

  3. Benchmark prior (Fernández, Ley, and Steel 2001a)

    \[g = \frac{1}{\max\!\left(N, K^{2}\right)}\]

  4. Prior that mimics Hannan-Quinn information criterion (Fernández, Ley, and Steel 2001a)

    \[g = \frac{1}{\left(\log N\right)^{3}}\]

  5. Square root of unit information prior (Fernández, Ley, and Steel 2001a)

    \[g = \sqrt{\frac{1}{N}}\]

Moreover, the user may specify any strictly positive numerical value for the hyperparameter \(g\).
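The five conventional choices are easily computed from the sample size \(N\) and the number of candidate regressors \(K\). The helper below is illustrative only (`g_prior_value` is not a function exported by rmsBMA), and it assumes the sample size \(N\) enters the Hannan-Quinn-mimicking rule:

```r
# Conventional g values (illustrative helper, not the rmsBMA API)
g_prior_value <- function(rule, N, K) {
  switch(rule,
         uip       = 1 / N,              # unit information prior
         ric       = 1 / K^2,            # risk inflation criterion
         benchmark = 1 / max(N, K^2),    # Fernandez-Ley-Steel benchmark
         hq        = 1 / log(N)^3,       # mimics Hannan-Quinn criterion
         sqrt_uip  = sqrt(1 / N))        # square root of UIP
}

g_prior_value("benchmark", N = 100, K = 15)   # 1 / max(100, 225) = 1/225
```

With \(N = 100\) and \(K = 15\), the benchmark prior coincides with the risk inflation criterion because \(K^2 > N\).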

2.4 Model priors

To perform BMA, one needs to specify the prior model probability. The package offers two main options. The first is the binomial model prior (Sala-I-Martin, Doppelhofer, and Miller 2004):

\[ P(M_{j})=\left(\frac{EMS}{K}\right)^{k_{j}}\left(1-\frac{EMS}{K}\right)^{K-k_{j}} \]

where \(EMS\) is the expected model size and \(k_{j}\) is the number of regressors in model \(j\). If \(EMS = \frac{K}{2}\), the binomial model prior simplifies to the uniform model prior with \(P(M_{j}) = \frac{1}{2^K}\) for every \(j\), meaning that all models are assigned equal probability. The second is the binomial-beta model prior of Ley and Steel (2009), given by:

\[ P(M_{j}) \propto \Gamma(1+k_{j}) \cdot \Gamma(\frac{K-EMS}{EMS}+K-k_{j}). \]

where \(\Gamma\) is the gamma function. In the context of the binomial-beta prior, \(EMS = \frac{K}{2}\) corresponds to equal probabilities across model sizes.

When reduced model space Bayesian model averaging is performed, both the binomial and binomial-beta model priors are truncated. As a result, the prior distribution over model size is restricted to models containing at most \(M\) regressors (where \(M\) is specified by the user), and the prior is renormalized so that the total probability mass over this admissible subset equals one.
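The truncation can be sketched for the implied prior on model size. Aggregating the binomial model prior over all models of each size \(m\) gives the binomial probability mass function, which is then cut at \(M\) and renormalized (toy values for \(K\), \(M\), and \(EMS\)):

```r
# Binomial prior over model sizes, truncated at M and renormalized
K <- 20; M <- 3; EMS <- 2
theta <- EMS / K                     # prior inclusion probability per regressor
sizes <- 0:M                         # admissible model sizes
size_prior <- choose(K, sizes) * theta^sizes * (1 - theta)^(K - sizes)
size_prior <- size_prior / sum(size_prior)   # renormalize over sizes 0..M
round(size_prior, 3)
```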

2.5 Dilution priors

In order to account for potential multicollinearity between regressors, one can use the dilution prior introduced by George (2010). The dilution prior augments the model prior (binomial or binomial-beta) with a function that accounts for multicollinearity:

\[ P_{D}(M_{j}) \propto P(M_{j})|\mathrm{COR}_j|^{\omega} \]

where \(P_{D}(M_{j})\) is the diluted model prior, \(|\mathrm{COR}_j|\) is the determinant of the correlation matrix of regressors in model \(j\), and \(\omega\) is the dilution parameter. The lower the correlation between regressors, the closer \(|\mathrm{COR}_j|\) is to one, resulting in a smaller degree of dilution.
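The behavior of the dilution factor \(|\mathrm{COR}_j|\) is easy to verify on simulated data: for nearly collinear regressors the determinant of the correlation matrix approaches zero, while for unrelated regressors it stays close to one.

```r
# The dilution factor |COR_j| shrinks toward zero for collinear regressors
set.seed(7)
x1 <- rnorm(200)
x2 <- x1 + rnorm(200, sd = 0.1)   # near-duplicate of x1
x3 <- rnorm(200)                  # unrelated regressor
det(cor(cbind(x1, x2)))           # close to 0: strong dilution
det(cor(cbind(x1, x3)))           # close to 1: little dilution
```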

The dilution prior proposed by George (2010) is empirical in nature, as it relies on the determinant of the correlation matrix of the regressors, denoted by \(|\mathrm{COR}_j|\), and therefore departs from a fully Bayesian treatment of prior specification. To address this limitation, the rmsBMA package introduces a non-empirical dilution rule, termed group dilution. This approach enables users to specify dilution parameters for groups of variables representing the same underlying theoretical construct. Consequently, prior model probabilities can be systematically adjusted for specifications that include multiple proxies for a common fundamental determinant.

As group dilution is introduced here for the first time, it is presented in two stages. First, the case of a single group-specific dilution parameter is considered. Second, the framework is extended to allow for a vector of group-specific dilution parameters.

For the case of a single group-specific dilution parameter, let \(\gamma_j = (\gamma_{j,1}, \gamma_{j,2}, \ldots, \gamma_{j,K}) \in \{0,1\}^K\) denote the indicator vector associated with model \(j\), where \(\gamma_{j,k} = 1\) if regressor \(k\) is included in the model and \(\gamma_{j,k} = 0\) otherwise.

Each regressor \(k\) is assigned to a group \(\mathrm{GR}(k) \in \{0,1,2,\ldots\}\), where \(\mathrm{GR}(k) = 0\) indicates that the regressor does not belong to any group, while positive integers identify groups of related regressors. The grouping structure is specified by the researcher, who assigns regressors associated with the same theoretical construct to common groups labeled \(h=1,2,\ldots\).

Let \(p \in (0,1]\) denote a scalar group-specific dilution parameter that reflects the researcher’s prior belief regarding the extent to which multiple proxies capture the same underlying determinant. For a given model \(j\) and group \(h \ge 1\), define the number of included regressors from group \(h\) as

\[ c_{j,h} = \sum_{k=1}^{K} \gamma_{j,k} \, \mathbb{I}\{ \mathrm{GR}(k) = h \}, \]

where \(\mathbb{I}\{\cdot\}\) denotes the indicator function. The group-specific dilution exponent is defined as

\[ d_{j,h} = \max\{0,\, c_{j,h} - 1\}. \]

Thus, models including zero or one regressor from group \(h\) incur no penalty, while each additional regressor beyond the first contributes one unit to the dilution exponent. The total dilution exponent for model \(j\) is obtained by summing across all groups,

\[ D_j = \sum_{h \ge 1} d_{j,h} = \sum_{h \ge 1} \max\{0,\, c_{j,h} - 1\}. \]

The resulting group-based dilution prior component for model \(j\) is given by

\[ \phi_j \propto p^{D_j}. \]

This construction penalizes models that include multiple regressors from the same group, while allowing regressors from different groups to enter the model independently. Regressors not assigned to any group (\(\mathrm{GR}(k)=0\)) do not affect the group dilution prior. Finally, the diluted model prior is given by:

\[ P_{G}(M_{j}) \propto P(M_{j})p^{D_j}. \]

The second case, involving a vector of group-specific dilution parameters, generalizes the previous specification by allowing heterogeneous dilution across groups. This extension acknowledges that the researcher may assign different degrees of prior redundancy to different sets of proxies. Let

\[ \mathbf{p} = (p_h)_{h \ge 1} \]

denote a collection of dilution parameters, where \(p_h \in (0,1]\) is assigned to group \(h\). Using the previously defined quantities \(c_{j,h}\) and \(d_{j,h}\), the group-based dilution prior component for model \(j\) is given by

\[ \phi_j = \prod_{h \ge 1} p_h^{\, d_{j,h}} = \prod_{h \ge 1} p_h^{\, \max\{0,\, c_{j,h} - 1\}}. \]

Thus, each group contributes independently to the overall dilution of model \(j\), with the strength of the penalty governed by its corresponding parameter \(p_h\). Regressors not assigned to any group (\(\mathrm{GR}(k)=0\)) do not affect the dilution prior. When all group-specific parameters are equal, i.e., \(p_h = p\) for all \(h \ge 1\), the dilution prior reduces to

\[ \phi_j = p^{D_j}, \]

which corresponds to the scalar-parameter case.
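The group dilution component \(\phi_j\) can be sketched in a few lines of R. The helper below is purely illustrative (it is not the rmsBMA interface); it assumes groups are labeled \(1, 2, \ldots\) and that the vector `p` is ordered accordingly, with a scalar `p` recycled across groups:

```r
# Group dilution component phi_j (illustrative helper, not the rmsBMA API)
group_dilution <- function(gamma, gr, p) {
  # gamma: 0/1 inclusion vector; gr: group labels (0 = ungrouped)
  # p: per-group dilution parameters in (0, 1], recycled if scalar
  counts <- tapply(gamma[gr > 0], gr[gr > 0], sum)   # c_{j,h} per group
  d <- pmax(0, counts - 1)                           # d_{j,h} = max(0, c - 1)
  prod(rep_len(p, length(d))^d)                      # phi_j
}

gamma <- c(1, 1, 1, 0, 1)          # model j includes x1, x2, x3, x5
gr    <- c(1, 1, 2, 2, 0)          # x1,x2 in group 1; x3,x4 in group 2; x5 ungrouped
group_dilution(gamma, gr, p = 0.5)            # 0.5^1 * 0.5^0 = 0.5
group_dilution(gamma, gr, p = c(0.5, 0.9))    # heterogeneous parameters
```

Only group 1 contributes a penalty here, since it is the only group with more than one included proxy.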

2.6 Jointness measures

To determine whether regressors are substitutes or complements, various authors have developed jointness measures. For two covariates \(a\) and \(b\), let \(P(a\cap b)\) be the posterior probability of the inclusion of both variables, \(P(\overline{a}\cap \overline{b})\) the posterior probability of the exclusion of both variables, and \(P(\overline{a}\cap b)\) and \(P(a\cap \overline{b})\) the posterior probabilities of including each variable separately. The first measure of jointness is simply \(P(a\cap b)\). However, this measure ignores much of the information about the relationships between the regressors. The Doppelhofer and Weeks (2009) measure is defined as:

\[ J_{DW}=\text{log}\left[\frac{P(a\cap b) \cdot P(\overline{a}\cap \overline{b})}{P(\overline{a}\cap b) \cdot P(a\cap \overline{b})}\right]. \]

If \(J_{DW} < -2\), \(-2 < J_{DW} < -1\), \(-1 < J_{DW} < 1\), \(1 < J_{DW} < 2\), and \(J_{DW} > 2\), the authors classify the regressors as strong substitutes, significant substitutes, not significantly related, significant complements, and strong complements, respectively. The jointness measure proposed by Ley and Steel (2007) is given by:

\[ J_{LS}=\frac{P(a\cap b)}{P(\overline{a}\cap b)+P(a\cap \overline{b})}. \]

The measure takes values in the range \([0, \infty)\), with higher values indicating a stronger complementary relationship.
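Both measures follow directly from the four joint posterior probabilities. A toy computation with hypothetical values:

```r
# J_DW and J_LS from the four joint posterior probabilities (toy values)
p_ab   <- 0.30   # both a and b included
p_nanb <- 0.40   # both excluded
p_nab  <- 0.10   # only b included
p_anb  <- 0.20   # only a included

J_DW <- log((p_ab * p_nanb) / (p_nab * p_anb))   # log(6): significant complements
J_LS <- p_ab / (p_nab + p_anb)                   # 1
```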

Finally, the Hofmarcher et al. (2018) measure of jointness is:

\[ J_{HCGHM}=\frac{(P(a\cap b)+\rho) \cdot (P(\overline{a}\cap \overline{b})+\rho)-(P(\overline{a}\cap b)+\rho) \cdot (P(a\cap \overline{b})+\rho)}{(P(a\cap b)+\rho) \cdot (P(\overline{a}\cap \overline{b})+\rho)+(P(\overline{a}\cap b)+\rho) \cdot (P(a\cap \overline{b})+\rho)+\rho}. \]

Hofmarcher et al. (2018) advocate the use of the Jeffreys (1946) prior, which results in \(\rho=\frac{1}{2}\). The measure takes values from -1 to 1, where values close to -1 indicate substitutes, and those close to 1 complements.

2.7 Extreme Bounds Analysis

Extreme Bounds Analysis (EBA) was developed by Leamer and Leonard (1981) and Leamer (1983, 1985). EBA provides one of the most restrictive tests of robustness for the regressors under examination. The application of this approach led to well-known findings, such as the fragility of growth determinants reported by Levine and Renelt (1992). The restrictiveness of EBA subsequently motivated the development of less stringent variants of the method (Granger and Uhlig 1990; Sala-I-Martin 1997), and ultimately contributed to the emergence of Bayesian model averaging.

However, the restrictiveness of EBA is also one of its main virtues: variables that pass the EBA test can be regarded as robust determinants in the strongest possible sense. The rmsBMA package allows the user to implement a highly restrictive version of EBA to identify which regressors satisfy this demanding robustness criterion.

The idea of the test involves estimating all considered \(J\) models and identifying the specifications that yield the lowest and highest estimated coefficient values for a given regressor \(k\). The upper extreme bound (\(B_{k,U}\)) for regressor \(k\) is calculated as

\[ B_{k,U} = \beta_{k,\max} + 2\,SE_{k,\max}, \]

and the lower extreme bound (\(B_{k,L}\)) as

\[ B_{k,L} = \beta_{k,\min} - 2\,SE_{k,\min}, \]

where \(\beta_{k,\max}\) and \(\beta_{k,\min}\) denote the highest and lowest estimated values of the coefficient for variable \(k\), respectively, and \(SE_{k,\max}\) and \(SE_{k,\min}\) are the corresponding standard errors.

The EBA test is passed if

\[ \mathrm{sgn}(B_{k,U}) = \mathrm{sgn}(B_{k,L}). \]
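The bound computation and sign check can be sketched with hypothetical per-model estimates for a single regressor:

```r
# Extreme bounds for one regressor across J estimated models (toy values)
beta <- c(0.42, 0.35, 0.51, 0.29)   # coefficient estimates across models
se   <- c(0.10, 0.08, 0.12, 0.09)   # corresponding standard errors

i_up <- which.max(beta)             # model with the highest estimate
i_lo <- which.min(beta)             # model with the lowest estimate
upper <- beta[i_up] + 2 * se[i_up]  # 0.51 + 0.24 = 0.75
lower <- beta[i_lo] - 2 * se[i_lo]  # 0.29 - 0.18 = 0.11
passes_eba <- sign(upper) == sign(lower)
passes_eba                          # TRUE: both bounds share the same sign
```

A single model with an estimate near zero and a large standard error would flip the sign of one bound and fail the test, which is exactly what makes EBA so stringent.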

Consequently, this test is extremely stringent, as even a single poorly specified model may lead to its failure—a situation analogous to a false negative result. Nevertheless, if a regressor passes this test, its robustness is difficult to question.

3 Data preparation

This section demonstrates how to prepare the data for estimation. The first step involves installing the package and subsequently loading it into the R session.

install.packages("rmsBMA")
library(rmsBMA)

Throughout the manuscript, we use the data from K. Beck (2020) on the determinants of international trade in the European Union (Trade_data and Trade_data_small). The package includes these datasets together with a detailed description of all variables. Because Trade_data is cross-sectional, while the package provides dedicated functionality for panel data analysis, we begin the presentation of the package using panel data on the determinants of international migration (migration_panel) from Afonso, Alves, and Beck (2025). Detailed information on the dataset can be accessed by:

?migration_panel

A visual inspection of the dataset can be performed by typing:

migration_panel[1:10,1:7]
##    Year_0         Pair_ID    Migration MigrationLAG Unempl      Earn    Tax
## 1    2000 Austria-Belgium 0.2129378562  0.045619177   3.00  1603.506 10.052
## 2    2000 Austria-Czechia 0.2253256590  0.248093396   2.96 16950.652  8.924
## 3    2000 Austria-Denmark 0.0438465263  0.050595928   0.68  3344.318  7.284
## 4    2000 Austria-Estonia 0.0006882383  0.003676347   5.58 17559.100 10.484
## 5    2000 Austria-Finland 0.0186243528  0.014729586   4.04  1384.974  1.312
## 6    2000  Austria-France 0.0997188822  0.043595237   3.46  1112.894  3.586
## 7    2000 Austria-Germany 0.2866766860  0.041541835   4.20   399.480 10.260
## 8    2000  Austria-Greece 0.2938173505  0.022388828   5.58  7028.734  9.238
## 9    2000 Austria-Hungary 0.0530022686  0.172597748   1.00 17712.410  2.918
## 10   2000 Austria-Ireland 0.0236846386  0.118876974   0.70  5127.134  9.300

The first two columns represent the time identifier and the cross-sectional identifier; the cross-sectional unit in this dataset is a country pair. The third column contains the dependent variable, while the remaining columns correspond to the regressors.

The researcher may wish to estimate a fixed effects model or standardize the data. This can be accomplished using the data_preparation function. As an illustration, we introduce both time and cross-sectional fixed effects, along with data standardization. The user should specify which column indicates time (time) and which indicates the cross-sectional unit (id). To include both time and cross-sectional fixed effects, effect = "twoway" should be set. Finally, to standardize the data, the user should specify standardize = TRUE.

data <- data_preparation(migration_panel,
                         time = "Year_0",
                         id = "Pair_ID",
                         fixed_effects = TRUE,
                         effect = "twoway",
                         standardize = TRUE)
data[1:10,1:6]
##      Migration MigrationLAG     Unempl        Earn         Tax      Social
## 1   0.70147677  -0.17353020  0.1761765  0.68749881  0.63980838 -0.06951980
## 2  -0.05598238  -0.22073133  0.2605423 -0.08453441 -0.69734342 -0.91727505
## 3   0.14474255   0.13184524 -0.2108058 -0.25694993  1.67466773 -4.56049857
## 4   0.07771323   0.03533121  0.7190521  0.52405682 -2.27192796 -1.44458719
## 5   0.16120052   0.06954068  0.3339039  0.78919703 -1.08013135  0.05181206
## 6   0.20350209  -0.08221904 -0.1429463  0.28403595 -0.96170452 -0.62389309
## 7  -0.72208939  -1.00970587  0.5576566  0.50115030  1.35152754  0.29407464
## 8   0.12113608  -1.12842641 -2.0228364 -1.30269692  0.14380556 -1.24323369
## 9  -3.32285180  -1.09109771 -0.5867838 -0.82988872  0.32622341  0.58728702
## 10 -0.08231775   0.20050315 -1.0361234  0.69604548 -0.07277945 -1.19482749

As shown above, the data_preparation function removes the columns corresponding to the time and cross-sectional identifiers. This produces the data format required by the model_space function, where the dependent variable is placed in the first column and the regressors occupy the remaining columns. If the user has panel data without explicit time and cross-sectional identifiers, the data_preparation function can be used to introduce fixed effects and perform standardization.
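Conceptually, the transformation applied by data_preparation can be sketched in a few lines of base R. The snippet below is an illustrative sketch only, assuming a balanced panel; it is not the package's actual implementation: twoway fixed effects are absorbed by a within-transformation, and standardization rescales each column to mean zero and unit variance.

```r
# Illustrative sketch only (assumes a balanced panel); the package's
# internal implementation may differ.
demean_twoway <- function(x, time, id) {
  # subtract time means and unit means, then add back the overall mean
  x - ave(x, time) - ave(x, id) + mean(x)
}
standardize <- function(x) (x - mean(x)) / sd(x)
```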

As mentioned earlier, the main dataset used in this manuscript is a cross-sectional dataset on the determinants of international trade. In this case, the user can still employ the data_preparation function to standardize the data.

data <- data_preparation(Trade_data,
                         standardize = TRUE)
data[1:10,1:6]
##         LNTRADE          B     LNDGEO          L  LNRGDPPROD  RGDPpcDIFF
## 1   0.685661658 -0.3412902 -0.4526402  4.8914368  0.37588807 -1.15600149
## 2  -0.135645463 -0.3412902 -0.5931909 -0.2038099 -0.33175575  1.80791818
## 3  -1.205596707 -0.3412902  0.4563165 -0.2038099 -1.13129985 -0.22585889
## 4   0.910798215  2.9210424 -2.4681456 -0.2038099  0.16270947 -0.06282635
## 5   0.105898299 -0.3412902 -0.4733266 -0.2038099  0.09997771 -1.35414601
## 6  -0.944259347 -0.3412902  0.1662654 -0.2038099 -1.01660137  0.80007366
## 7   0.001856051 -0.3412902  0.2434943 -0.2038099  0.04544252 -1.24811400
## 8   0.960858672 -0.3412902 -0.2855048 -0.2038099  1.26747625 -1.08291700
## 9   2.020557180  2.9210424 -1.1960182  4.8914368  1.44481745 -1.28218726
## 10 -0.157994767 -0.3412902  0.2164248 -0.2038099  0.26432517 -0.39265022

However, in the remainder of the manuscript, we use the unstandardized version of Trade_data. For detailed information on the dataset, type:

?Trade_data

4 Estimation of the model space

Before conducting Bayesian model averaging or Extreme Bounds Analysis, the model space must first be specified. In the rmsBMA package, the model_space function is used to construct the model space. The first argument of the function is df, a data matrix in which the first column contains the dependent variable and the remaining columns correspond to the regressors. The second argument, M, determines the maximum number of variables allowed in the considered models. Consequently, this parameter governs the size of the model space given the total number of regressors and enables reduced model space Bayesian model averaging. If the user does not specify M, it is set equal to the total number of regressors, and the standard version of Bayesian model averaging is performed.
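The effect of M on the size of the model space is easy to quantify: with K candidate regressors, the reduced model space contains all models with at most M regressors. The helper below is hypothetical, not a package function, and is shown only to illustrate the counting.

```r
# Number of models with at most M of K regressors (including the null model)
model_space_size <- function(K, M) sum(choose(K, 0:M))
model_space_size(10, 10)  # 1024 = 2^10, the full model space
model_space_size(10, 7)   # 968: only a slight reduction
model_space_size(10, 3)   # 176: a drastic reduction
```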

The third argument, g, allows the user to specify the \(g\)-prior. The user may select one of the predefined options described in g_prior or provide any positive numerical value. The default option is "UIP" (the unit information prior). Setting g = "None" instructs the function to omit \(g\)-regression and compute results based on Bayesian averaging of classical estimates. When HC = TRUE, the function replaces the conventional covariance matrix of the estimators with the heteroscedasticity-consistent covariance matrix described in equation hc, and the corresponding robust standard errors are computed from its diagonal elements.
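The heteroscedasticity-consistent covariance matrix referenced above follows the usual White-type sandwich form. A minimal sketch is given below, assuming the HC0 variant; the package's formula (equation hc) may include finite-sample corrections that differ from this illustration.

```r
# HC0-style sandwich estimator (assumed form, for illustration only):
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
hc_cov <- function(X, e) {
  XtXinv <- solve(crossprod(X))
  XtXinv %*% t(X) %*% diag(e^2) %*% X %*% XtXinv
}
# Robust standard errors are the square roots of the diagonal:
# sqrt(diag(hc_cov(X, resid)))
```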

Suppose we wish to consider a model space that includes all models with at most 10 regressors, estimated using the benchmark prior and the conventional covariance matrix:

modelSpace10 <- model_space(Trade_data, M = 10, g = "Benchmark")

Alternatively, the user could consider a model space of identical size, but constructed using Bayesian averaging of classical estimates and the heteroscedasticity-consistent covariance matrix:

modelSpace10_none <- model_space(Trade_data, M = 10, g = "None", HC = TRUE)

For the remainder of the manuscript, we use the smaller model space available within the package, constructed from Trade_data_small:

modelSpace <- model_space(Trade_data_small, M = 7, g = "UIP")

The object modelSpace, constructed according to the specification of the model_space function described above, is included in the package. The model_space function returns a list consisting of five components, comprising the full model space and additional elements required for subsequent use in the bma function.

5 Performing Bayesian model averaging

5.1 Bayesian model averaging: The bma function

The bma function enables users to perform Bayesian model averaging using the object obtained with the model_space function. The round parameter specifies the decimal place to which the BMA statistics should be rounded in the results.

bma_results <- bma(modelSpace, round = 3)

The bma function returns a list containing 15 elements. However, most of these elements are only required by other functions. The main objects of interest are the two tables with the BMA statistics. The results obtained with the binomial model prior come first in the list.

bma_results[[1]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.608 0.714  7.608  0.714 1.000 1.000
## B          0.709  0.299 0.230  0.422  0.151 0.707 0.852
## LNDGEO     1.000 -1.234 0.076 -1.234  0.076 0.000 1.000
## L          0.519  0.258 0.288  0.497  0.204 0.515 0.755
## LNRGDPPROD 1.000  0.911 0.019  0.911  0.019 1.000 1.000
## HUMAN      0.052 -0.005 0.051 -0.087  0.205 0.018 0.509
## INFVAR     0.999 -0.049 0.011 -0.049  0.011 0.000 1.000
## ARABLE     0.115  0.001 0.002  0.005  0.004 0.105 0.547
## ARABLEpw   0.113  0.000 0.001 -0.003  0.002 0.011 0.546
## LAND       0.050  0.000 0.000  0.000  0.000 0.019 0.506
## LANDpc     0.984  0.000 0.000  0.000  0.000 0.984 0.992

PIP denotes the posterior inclusion probability, PM the posterior mean, and PSD the posterior standard deviation. PMcon and PSDcon denote the posterior mean and posterior standard deviation, respectively, conditional on the inclusion of the regressor in the model. Users should base their interpretation of the results on conditional BMA statistics only when they believe that certain regressors must be included. PSC is the posterior sign certainty, i.e., the posterior probability that the sign of a regressor's coefficient matches the sign of PM, and P(+) is the posterior probability of a positive sign.
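The relationship between these statistics can be illustrated with a toy three-model example (made-up numbers, not package output): BMA statistics are posterior-model-probability-weighted averages over the model space.

```r
pmp  <- c(0.6, 0.3, 0.1)      # posterior model probabilities
beta <- c(0.40, 0.45, 0.00)   # coefficient in each model (0 when excluded)
incl <- c(TRUE, TRUE, FALSE)  # does the model include the regressor?

PIP   <- sum(pmp[incl])                     # 0.9
PM    <- sum(pmp * beta)                    # 0.375 (unconditional mean)
PMcon <- sum(pmp[incl] * beta[incl]) / PIP  # ~0.417 (conditional on inclusion)
```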

Two variables stand out: LNDGEO and LNRGDPPROD, representing geographical distance and the product of real GDPs, respectively. Both exhibit posterior inclusion probabilities (PIP) approximately equal to one, and the ratio of the posterior mean to the posterior standard deviation is well above two in terms of absolute value. The posterior sign certainty (PSC) and \(P(+)\) indicate that the signs of the coefficients are extremely stable across the model space. These results demonstrate that the two variables are robust determinants of international trade and justify their central role in the gravity model of trade (Tinbergen 1962), which has become a workhorse framework in the field (Feenstra 2015).

INFVAR and LANDpc, representing the difference in inflation variability and land per capita, respectively, are also characterized by high posterior inclusion probabilities (PIP) and a large absolute ratio of the posterior mean (PM) to the posterior standard deviation (PSD). However, the reported values of PM and PSD for LANDpc are both equal to zero. This results from the difference in measurement units between the dependent variable and this regressor. To mitigate this issue, the user may standardize the data before passing them to the model_space function. The variable B (common border) also satisfies the least stringent robustness criterion proposed by Raftery (1995). The remaining regressors may be regarded as fragile. The results obtained with the binomial-beta model prior are included as the second object in the bma list.

bma_results[[2]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.599 0.720  7.599  0.720 1.000 1.000
## B          0.728  0.302 0.225  0.415  0.152 0.725 0.861
## LNDGEO     1.000 -1.231 0.076 -1.231  0.076 0.000 1.000
## L          0.564  0.277 0.288  0.491  0.204 0.560 0.778
## LNRGDPPROD 1.000  0.911 0.019  0.911  0.019 1.000 1.000
## HUMAN      0.068 -0.006 0.058 -0.089  0.205 0.022 0.511
## INFVAR     0.999 -0.049 0.011 -0.049  0.011 0.000 1.000
## ARABLE     0.149  0.001 0.002  0.005  0.003 0.136 0.561
## ARABLEpw   0.145  0.000 0.001 -0.003  0.002 0.014 0.559
## LAND       0.065  0.000 0.000  0.000  0.000 0.025 0.508
## LANDpc     0.984  0.000 0.000  0.000  0.000 0.984 0.992

The results obtained under the binomial–beta model prior corroborate those derived from the binomial model prior. The third element of the bma output list contains the results of the Extreme Bounds Analysis:

bma_results[[3]]
##      Variable Lower bound Minimum   Mean Maximum Upper bound EBA test    %(+)
## 1       CONST      -4.695  -2.764 14.627  32.947      35.090     FAIL  74.483
## 2           B       0.011   0.311  1.477   2.720       3.440     PASS 100.000
## 3      LNDGEO      -1.815  -1.551 -1.274  -1.043      -0.642     PASS   0.000
## 4           L      -0.786   0.311  1.287   3.247       4.341     FAIL 100.000
## 5  LNRGDPPROD       0.834   0.880  0.914   0.967       1.045     PASS 100.000
## 6       HUMAN      -2.228  -0.963 -0.108   0.843       1.997     FAIL  32.618
## 7      INFVAR      -0.200  -0.134 -0.077  -0.048      -0.027     PASS   0.000
## 8      ARABLE      -0.037  -0.019 -0.001   0.012       0.023     FAIL  49.571
## 9    ARABLEpw      -0.065  -0.054 -0.026   0.003       0.011     FAIL  13.090
## 10       LAND       0.000   0.000  0.000   0.000       0.000     FAIL  65.880
## 11     LANDpc       0.000   0.000  0.000   0.000       0.000     FAIL  35.193

The columns of the table report the following statistics: the lower bound (Equation lb), the minimum value of the coefficient, the average value of the coefficient, the maximum value of the coefficient, the upper bound (Equation ub), the result of the EBA test (Equation eba_test), and the percentage of positive coefficients across the model space.
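The logic of the EBA test can be sketched as follows. This is an illustrative Leamer-style version assuming a factor of two on the standard errors; the exact formulas used by the package are those in the equations referenced above.

```r
# Toy example: coefficient estimates and standard errors across models
beta <- c(0.31, 0.35, 0.28)
se   <- c(0.05, 0.06, 0.04)

lower <- min(beta - 2 * se)          # extreme lower bound
upper <- max(beta + 2 * se)          # extreme upper bound
pass  <- sign(lower) == sign(upper)  # PASS only if the bounds share a sign
```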

Notably, four variables—B, LNDGEO, LNRGDPPROD, and INFVAR—pass the stringent EBA criterion. However, the issue related to the measurement units of the LANDpc regressor reappears. When standardized data are used, this variable also satisfies the EBA test.

The fourth object in the list is a table containing the prior and posterior expected model sizes for the binomial and binomial-beta model priors.

bma_results[[4]]
##               Prior model size Posterior model size
## Binomial                     5                5.542
## Binomial-beta                5                5.702

The results indicate that, after observing the data, the researcher should expect approximately five to six regressors in the model under both the binomial and the binomial–beta model priors. The posterior expected model size can be computed as the sum of the posterior inclusion probabilities.
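This identity is easy to verify directly from the binomial-prior table shown earlier:

```r
# PIPs of the ten regressors from the binomial-prior table (the constant is
# excluded); the 0.001 gap to the reported 5.542 is due to rounding.
pips <- c(0.709, 1.000, 0.519, 1.000, 0.052, 0.999, 0.115, 0.113, 0.050, 0.984)
sum(pips)  # 5.541
```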

The table also illustrates a potential pitfall of using reduced model space Bayesian model averaging. Both the prior and posterior expected model sizes may exceed the size of the largest model considered. If this occurs for the posterior expected model size, the researcher should consider enlarging the maximum model size. However, a high posterior expected model size may simply reflect a high prior expected model size. Consequently, careful experimentation and robustness checks are recommended.

5.2 Prior and posterior model probabilities

The model_pmp function allows the user to compare prior and posterior model probabilities over the entire model space in the form of a graph. The models are ranked from the one with the highest to the one with the lowest posterior model probability. The function returns a list with three objects:

  1. a graph for the binomial model prior;
  2. a graph for the binomial-beta model prior;
  3. a combined graph for both binomial and binomial-beta model priors.

The user can retrieve each graph separately from the list; however, the function automatically displays a combined graph.

for_models <- model_pmp(bma_results)

The graphs demonstrate that a large share of the posterior probability mass is concentrated in just a few models. To view the results for only the best models, the user can use the top parameter.

for_models <- model_pmp(bma_results, top = 10)

The last graph is particularly illuminating: it demonstrates that the three best models have vastly higher posterior model probabilities than the rest. It is instructive to inspect these models. They can be identified with the best_models function, which is covered below, after the model_sizes function.

The model_sizes function displays prior and posterior model probabilities on a graph for models of different sizes. Similarly to the model_pmp function, it returns a list with three objects:

  1. a graph for the binomial model prior;
  2. a graph for the binomial-beta model prior;
  3. a combined graph for both binomial and binomial-beta model priors.

Again, the user can retrieve each graph separately from the list; however, the function automatically displays a combined graph.

size_graphs <- model_sizes(bma_results)

The graph illustrates that the posterior distribution over model sizes is largely independent of the prior specification, as it appears very similar under both the binomial and the binomial–beta model priors.

5.3 Selecting the best models

The best_models function allows the user to view a chosen number of the best models in terms of posterior model probability. The function returns a list containing six objects:

  1. An inclusion table stored as an array object;
  2. A table with estimation results, stored as an array object;
  3. An inclusion table stored as a knitr object;
  4. A table with estimation results, stored as a knitr object;
  5. An inclusion table stored as a gTree object;
  6. A table with estimation results, stored as a gTree object.

The parameter estimate pertains only to the results that are automatically displayed after running the function. The parameter criterion determines whether the models are ranked according to posterior model probabilities calculated using the binomial (1) or binomial-beta (2) model prior. To obtain the inclusion array for the eight best models ranked with the binomial model prior, the user needs to run:

best_8_models <- best_models(bma_results, criterion = 1, best = 8)
best_8_models[[1]]
##            'No. 1' 'No. 2' 'No. 3' 'No. 4' 'No. 5' 'No. 6' 'No. 7' 'No. 8'
## Const        1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.00
## B            1.000   1.000   0.000   1.000   0.000   0.000   1.000    1.00
## LNDGEO       1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.00
## L            0.000   1.000   1.000   0.000   0.000   1.000   1.000    1.00
## LNRGDPPROD   1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.00
## HUMAN        0.000   0.000   0.000   0.000   0.000   0.000   0.000    0.00
## INFVAR       1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.00
## ARABLE       0.000   0.000   0.000   1.000   0.000   0.000   1.000    0.00
## ARABLEpw     0.000   0.000   0.000   0.000   0.000   1.000   0.000    1.00
## LAND         0.000   0.000   0.000   0.000   0.000   0.000   0.000    0.00
## LANDpc       1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.00
## PMP          0.310   0.190   0.150   0.038   0.037   0.034   0.032    0.03
## R^2          0.918   0.919   0.918   0.919   0.915   0.918   0.920    0.92

A value of 1 indicates the inclusion of a given regressor in a model, while the second-to-last row reports the posterior model probability associated with each specification. The table shows that the best model alone accounts for roughly 31% of the posterior probability mass, and the three best models jointly for about 65%.

To generate a knitr table with the estimation output for the three best models ranked according to the binomial–beta model prior, the user should run:

best_3_models <- best_models(bma_results, criterion = 2, best = 3)
best_3_models[[4]]
|            | No. 1             | No. 2             | No. 3             |
|------------|-------------------|-------------------|-------------------|
| Const      | 7.462 (0.646)***  | 7.363 (0.643)***  | 7.929 (0.608)***  |
| B          | 0.456 (0.143)***  | 0.37 (0.147)**    | NA                |
| LNDGEO     | -1.215 (0.067)*** | -1.201 (0.067)*** | -1.289 (0.057)*** |
| L          | NA                | 0.425 (0.193)**   | 0.554 (0.188)***  |
| LNRGDPPROD | 0.911 (0.018)***  | 0.911 (0.018)***  | 0.915 (0.018)***  |
| HUMAN      | NA                | NA                | NA                |
| INFVAR     | -0.05 (0.011)***  | -0.048 (0.011)*** | -0.048 (0.011)*** |
| ARABLE     | NA                | NA                | NA                |
| ARABLEpw   | NA                | NA                | NA                |
| LAND       | NA                | NA                | NA                |
| LANDpc     | NA                | NA                | NA                |
| PMP        | 0.258             | 0.19              | 0.125             |
| R^2        | 0.918             | 0.919             | 0.918             |

The table shows that the best model accounts for roughly 26% of the posterior probability mass under the binomial-beta model prior, and the three best models jointly for about 57%.

Finally, to obtain a gTree table with estimation output for the top 3 models ranked by the binomial-beta model prior, the user needs to run:

best_3_models <- best_models(bma_results, criterion = 2, best = 3)
grid::grid.draw(best_3_models[[6]])

5.4 Calculating jointness measures

Within the BMA framework, the nature of the relationship between pairs of examined regressors can be characterized using jointness measures. This can be accomplished with the jointness function. The most recent jointness measure, introduced by Hofmarcher et al. (2018), has been shown to outperform the older alternatives developed by Ley and Steel (2007) and Doppelhofer and Weeks (2009). Therefore, the Hofmarcher et al. (2018) measure is the default option in the jointness function.

jointness(bma_results)[1:9,1:9]
##                 B LNDGEO      L LNRGDPPROD  HUMAN INFVAR ARABLE ARABLEpw   LAND
## B              NA  0.418 -0.417      0.418 -0.363  0.418 -0.301   -0.379 -0.368
## LNDGEO      0.456     NA  0.038      1.000 -0.895  0.999 -0.770   -0.774 -0.899
## L          -0.291  0.129     NA      0.038 -0.047  0.037 -0.006    0.059 -0.040
## LNRGDPPROD  0.456  1.000  0.129         NA -0.895  0.999 -0.770   -0.774 -0.899
## HUMAN      -0.376 -0.865 -0.115     -0.865     NA -0.894  0.667    0.672  0.794
## INFVAR      0.456  0.999  0.128      0.999 -0.864     NA -0.769   -0.773 -0.898
## ARABLE     -0.280 -0.703 -0.035     -0.703  0.574 -0.702     NA    0.547  0.670
## ARABLEpw   -0.365 -0.710  0.029     -0.710  0.583 -0.709  0.426       NA  0.674
## LAND       -0.382 -0.870 -0.108     -0.870  0.735 -0.869  0.577    0.584     NA

The entries above the main diagonal report the results obtained under the binomial model prior, whereas the entries below the diagonal correspond to the binomial-beta model prior. Positive values of the measure indicate a complementary relationship between regressors, while negative values indicate that they are substitutes. Consistent with the previous findings, LNDGEO and LNRGDPPROD are strong complements. By contrast, LNRGDPPROD and LAND are relatively strong substitutes. Notably, LNDGEO and B are identified as complements, implying that geographical distance and the presence of a common border capture distinct dimensions of the forces driving international trade.

To obtain the results for the Ley and Steel (2007) measure, the user should run:

jointness(bma_results, measure = "LS")[1:9,1:9]
##                B   LNDGEO     L LNRGDPPROD HUMAN   INFVAR ARABLE ARABLEpw  LAND
## B             NA    2.437 0.418      2.437 0.053    2.434  0.122    0.094 0.050
## LNDGEO     2.677       NA 1.078        Inf 0.055 1736.689  0.130    0.127 0.053
## L          0.564    1.295    NA      1.078 0.047    1.077  0.125    0.153 0.048
## LNRGDPPROD 2.677      Inf 1.295         NA 0.055 1736.689  0.130    0.127 0.053
## HUMAN      0.070    0.072 0.065      0.072    NA    0.055  0.027    0.028 0.019
## INFVAR     2.674 1809.490 1.294   1809.490 0.072       NA  0.130    0.127 0.053
## ARABLE     0.168    0.174 0.175      0.174 0.035    0.174     NA    0.037 0.025
## ARABLEpw   0.134    0.170 0.203      0.170 0.038    0.170  0.050       NA 0.024
## LAND       0.067    0.069 0.065      0.069 0.025    0.069  0.032    0.032    NA

For the Doppelhofer and Weeks (2009) measure type:

jointness(bma_results, measure = "DW")[1:9,1:9]
##                 B LNDGEO      L LNRGDPPROD  HUMAN INFVAR ARABLE ARABLEpw   LAND
## B              NA    NaN -1.965        NaN -0.057  0.347 -0.034   -0.669 -0.121
## LNDGEO        NaN     NA    NaN        NaN    NaN    NaN    NaN      NaN    NaN
## L          -1.601    NaN     NA        NaN -0.205 -0.705  0.151    0.645 -0.115
## LNRGDPPROD    NaN    NaN    NaN         NA    NaN    NaN    NaN      NaN    NaN
## HUMAN      -0.005    NaN -0.122        NaN     NA -0.021 -0.402   -0.333 -0.386
## INFVAR      0.534    NaN -0.414        NaN  0.230     NA  0.170    0.425 -0.101
## ARABLE      0.088    NaN  0.281        NaN -0.417  0.456     NA   -0.593 -0.461
## ARABLEpw   -0.513    NaN  0.718        NaN -0.330  0.708 -0.625       NA -0.465
## LAND       -0.052    NaN -0.033        NaN -0.380  0.149 -0.474   -0.468     NA

A drawback of the measures proposed by Ley and Steel (2007) and Doppelhofer and Weeks (2009), as illustrated in the previous two tables, is that they may yield problematic outputs such as NaN or Inf. This issue does not arise with the measure introduced by Hofmarcher et al. (2018). Avoiding such numerical irregularities constitutes one of several arguments advanced by Hofmarcher et al. (2018) in favor of their approach and helps explain why their measure performs more reliably than the alternatives.
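The source of these irregularities is easy to see if the Doppelhofer and Weeks (2009) measure is written as a log posterior odds ratio of joint inclusion (toy probabilities below, not package output): whenever one of the four joint inclusion probabilities is zero, the ratio degenerates.

```r
# Joint posterior inclusion probabilities for a pair of regressors i and j:
p11 <- 0.60  # both included
p10 <- 0.00  # only i included -- an empty cell
p01 <- 0.25  # only j included
p00 <- 0.15  # neither included

log((p11 * p00) / (p10 * p01))  # Inf: the empty cell breaks the odds ratio
```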

5.5 Visualizing model coefficients and posterior distributions

The coef_hist function allows the user to plot the distribution of estimated coefficients. It returns a list containing a number of objects equal to the number of regressors plus one (for the constant). The first object in the list is a graph of the constants, while the remaining objects are graphs of the coefficients for the other regressors. The graph for the constants collects coefficients from the entire model space, whereas the graphs for the other regressors only collect coefficients from the models that include the given regressor.

There are two principal approaches to visualizing the posterior distributions of the coefficients. The first approach relies on a histogram. The coef_hist function allows the user to control the binning of the histogram through several arguments (BW, binW, BN, and num). To specify 80 bins for each graph, the following code can be used:

bin_sizes <- matrix(80, nrow = 11, ncol = 1)
coef_plots <- coef_hist(bma_results, BN = 1, num = bin_sizes)
coef_plots[[3]]

The second option allows the user to plot kernel densities.

coef_plots2 <- coef_hist(bma_results, kernel = 1)
coef_plots2[[5]]

The choice of appropriate plotting options is left to the user’s preferences regarding the style of presentation and the size of the model space.

library(gridExtra)
grid.arrange(coef_plots[[3]], coef_plots[[5]], coef_plots2[[3]],
             coef_plots2[[5]], nrow = 2, ncol = 2)

One additional option available to the user with the coef_hist function is weighting coefficients by posterior model probabilities. This can be accomplished using the weight parameter, whose default value is set to NULL. Setting weight to "binomial" or "beta" results in plotting histograms based on the binomial or binomial–beta model prior, respectively. For example, in the case of the binomial–beta model prior, it can be observed that the value of the posterior mean is mainly driven by the model characterized by the highest posterior model probability.

coef_plots3 <- coef_hist(bma_results, weight = "beta", BN = 1, num = bin_sizes)
coef_plots3[[5]]

The posterior_dens function allows the user to plot posterior distributions of coefficients. Similarly to the coef_hist function, posterior_dens returns a list containing a number of objects equal to the number of regressors plus one. The first object in the list is a graph of the posterior distribution of the constant, while the remaining objects are graphs of the posterior distributions of the other regressors. The user needs to specify whether to plot posterior distributions based on the binomial (prior = "binomial") or binomial-beta (prior = "beta") model prior, as well as whether regular (SE = "standard") or robust (SE = "robust") standard errors are used in the calculation of the posterior objects.

distPlots <- posterior_dens(bma_results, prior = "binomial")
grid.arrange(distPlots[[3]], distPlots[[5]], nrow = 2, ncol = 1)

6 Changes in model priors

This section provides a more detailed description of the available model prior options. Subsection ems discusses the consequences of changes in the expected model size, while subsection dil describes the dilution priors.

6.1 Changing expected model size

The bma function calculates BMA statistics using both the binomial and binomial-beta model priors. By default, the bma function sets the expected model size (EMS) to \(K/2\), where \(K\) denotes the total number of regressors. The binomial model prior with \(EMS = K/2\) leads to a uniform model prior, assigning equal probabilities to all models. In contrast, the binomial-beta model prior with \(EMS = K/2\) assumes equal probabilities across all model sizes. However, the user can modify the prior model specification by changing the EMS parameter. First, consider the consequence of concentrating prior probability mass on small models by setting \(EMS = 2\).
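The role of EMS can be made concrete with the standard form of the binomial model prior (the usual form in the BMA literature; the package's internals may differ): each regressor enters independently with probability \(\theta = EMS/K\), so a model with \(k\) regressors receives prior probability \(\theta^k (1-\theta)^{K-k}\).

```r
# Standard binomial model prior (illustrative, not package code)
binom_model_prior <- function(k, K, EMS) {
  theta <- EMS / K
  theta^k * (1 - theta)^(K - k)
}
binom_model_prior(0:3, K = 10, EMS = 5)  # constant in k: uniform over models
binom_model_prior(0:3, K = 10, EMS = 2)  # decreasing in k: favors small models
```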

bma_results2 <- bma(modelSpace, round = 3, EMS = 2)

Before turning to the main BMA results, let us focus on the changes in the posterior probability mass with respect to model sizes.

bma_results2[[4]]
##               Prior model size Posterior model size
## Binomial                     2                4.871
## Binomial-beta                2                5.385

The results show that decreasing the prior expected model size led to a small decline in the posterior expected model size. The consequences of this change in the prior expected model size are best illustrated using the prior and posterior probability mass over model sizes.

size_graphs2 <- model_sizes(bma_results2)

For the dataset considered in the manuscript, the resulting differences in posterior inference are negligible, as the likelihood contribution from the data substantially outweighs the influence of the prior specification.

The corresponding changes in the distribution of posterior probability mass over the model space are likewise negligible.

model_graphs2 <- model_pmp(bma_results2)

The principal posterior summary statistics under the binomial model prior exhibit only minimal change.

bma_results2[[1]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.754 0.720  7.754  0.720 1.000 1.000
## B          0.531  0.235 0.246  0.443  0.147 0.530 0.765
## LNDGEO     1.000 -1.257 0.081 -1.257  0.081 0.000 1.000
## L          0.307  0.162 0.267  0.527  0.198 0.306 0.652
## LNRGDPPROD 1.000  0.913 0.018  0.913  0.018 1.000 1.000
## HUMAN      0.015 -0.001 0.027 -0.085  0.206 0.005 0.502
## INFVAR     0.998 -0.050 0.011 -0.050  0.011 0.000 0.999
## ARABLE     0.033  0.000 0.001  0.005  0.004 0.030 0.513
## ARABLEpw   0.032  0.000 0.001 -0.003  0.002 0.003 0.513
## LAND       0.014  0.000 0.000  0.000  0.000 0.006 0.501
## LANDpc     0.941  0.000 0.000  0.000  0.000 0.941 0.970

The results for the binomial-beta model prior are very similar.

bma_results2[[2]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.642 0.718  7.642  0.718 1.000 1.000
## B          0.668  0.285 0.235  0.426  0.151 0.666 0.832
## LNDGEO     1.000 -1.239 0.078 -1.239  0.078 0.000 1.000
## L          0.468  0.235 0.286  0.502  0.203 0.465 0.731
## LNRGDPPROD 1.000  0.912 0.019  0.912  0.019 1.000 1.000
## HUMAN      0.044 -0.004 0.046 -0.087  0.205 0.015 0.507
## INFVAR     0.999 -0.050 0.011 -0.050  0.011 0.000 1.000
## ARABLE     0.096  0.000 0.002  0.005  0.004 0.087 0.539
## ARABLEpw   0.094  0.000 0.001 -0.003  0.002 0.009 0.538
## LAND       0.042  0.000 0.000  0.000  0.000 0.016 0.505
## LANDpc     0.975  0.000 0.000  0.000  0.000 0.975 0.987

It is also very instructive to examine the jointness measures calculated under the new prior specification.

jointness(bma_results2, measure = "HCGHM", rho = 0.5, round = 3)[1:9,1:9]
##                 B LNDGEO      L LNRGDPPROD  HUMAN INFVAR ARABLE ARABLEpw   LAND
## B              NA  0.062 -0.459      0.062 -0.059  0.063 -0.055   -0.081 -0.060
## LNDGEO      0.336     NA -0.385      1.000 -0.970  0.996 -0.934   -0.937 -0.972
## L          -0.420 -0.064     NA     -0.385  0.368 -0.385  0.362    0.385  0.371
## LNRGDPPROD  0.336  1.000 -0.064         NA -0.970  0.996 -0.934   -0.937 -0.972
## HUMAN      -0.291 -0.913  0.055     -0.913     NA -0.966  0.905    0.907  0.942
## INFVAR      0.336  0.998 -0.064      0.998 -0.911     NA -0.930   -0.933 -0.967
## ARABLE     -0.240 -0.809  0.088     -0.809  0.725 -0.807     NA    0.871  0.906
## ARABLEpw   -0.305 -0.813  0.141     -0.813  0.729 -0.811  0.626       NA  0.908
## LAND       -0.296 -0.916  0.060     -0.916  0.829 -0.914  0.727    0.731     NA

The jointness measures likewise exhibit only minor changes.

Next, to examine the consequences of concentrating prior probability mass on larger models, EMS was set to eight, which exceeds the maximum number of regressors permitted in the model.

bma_results8 <- bma(modelSpace, round = 3, EMS = 8)
bma_results8[[4]]
##               Prior model size Posterior model size
## Binomial                     8                6.336
## Binomial-beta                8                5.825

The posterior expected model size increased under both the binomial and the binomial-beta model priors. The most interesting aspect is the new graphs of prior and posterior probability mass over model sizes.

size_graphs8 <- model_sizes(bma_results8)

In both cases, the posterior probability mass is now concentrated on models with a higher number of regressors. This conclusion is further supported by the graphs of posterior model probability across the entire model space.

model_graphs8 <- model_pmp(bma_results8)

The figure demonstrates that the change in the expected model size led to a substantial increase in the posterior model probabilities of the best models. The increase in the expected model size also influenced the main BMA statistics.

bma_results8[[1]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.540 0.731  7.540  0.731 1.000 1.000
## B          0.824  0.325 0.204  0.394  0.152 0.820 0.908
## LNDGEO     1.000 -1.218 0.073 -1.218  0.073 0.000 1.000
## L          0.747  0.357 0.273  0.478  0.204 0.740 0.867
## LNRGDPPROD 1.000  0.909 0.020  0.909  0.020 1.000 1.000
## HUMAN      0.122 -0.011 0.077 -0.091  0.204 0.040 0.521
## INFVAR     1.000 -0.049 0.011 -0.049  0.011 0.000 1.000
## ARABLE     0.270  0.001 0.003  0.005  0.003 0.248 0.613
## ARABLEpw   0.262 -0.001 0.002 -0.003  0.002 0.024 0.608
## LAND       0.117  0.000 0.000  0.000  0.000 0.045 0.514
## LANDpc     0.993  0.000 0.000  0.000  0.000 0.993 0.997

The PIPs increased considerably; however, the main conclusions of the exercise remain the same. A similar pattern is present in the case of the binomial-beta model prior.

bma_results8[[2]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.586 0.722  7.586  0.722 1.000 1.000
## B          0.748  0.307 0.222  0.411  0.152 0.745 0.871
## LNDGEO     1.000 -1.229 0.076 -1.229  0.076 0.000 1.000
## L          0.600  0.293 0.287  0.488  0.204 0.595 0.795
## LNRGDPPROD 1.000  0.910 0.019  0.910  0.019 1.000 1.000
## HUMAN      0.078 -0.007 0.062 -0.089  0.205 0.026 0.513
## INFVAR     1.000 -0.049 0.011 -0.049  0.011 0.000 1.000
## ARABLE     0.171  0.001 0.002  0.005  0.003 0.157 0.571
## ARABLEpw   0.167 -0.001 0.001 -0.003  0.002 0.015 0.568
## LAND       0.075  0.000 0.000  0.000  0.000 0.029 0.509
## LANDpc     0.987  0.000 0.000  0.000  0.000 0.986 0.993

Again, it is instructive to examine the jointness measures.

jointness(bma_results8, measure = "HCGHM", rho = 0.5, round = 3)[1:9,1:9]
##                 B LNDGEO      L LNRGDPPROD  HUMAN INFVAR ARABLE ARABLEpw   LAND
## B              NA  0.648  0.088      0.648 -0.496  0.648 -0.302   -0.419 -0.503
## LNDGEO      0.496     NA  0.494      1.000 -0.756  1.000 -0.459   -0.475 -0.765
## L          -0.226  0.200     NA      0.494 -0.420  0.493 -0.234   -0.164 -0.411
## LNRGDPPROD  0.496  1.000  0.200         NA -0.756  1.000 -0.459   -0.475 -0.765
## HUMAN      -0.401 -0.845 -0.175     -0.845     NA -0.755  0.234    0.256  0.518
## INFVAR      0.495  0.999  0.200      0.999 -0.844     NA -0.459   -0.475 -0.765
## ARABLE     -0.285 -0.658 -0.072     -0.658  0.511 -0.657     NA   -0.015  0.238
## ARABLEpw   -0.377 -0.666 -0.006     -0.666  0.522 -0.665  0.342       NA  0.253
## LAND       -0.407 -0.851 -0.167     -0.851  0.695 -0.850  0.514    0.522     NA

6.2 Dilution prior

One of the main issues associated with identifying robust regressors is multicollinearity. Some regressors may approximate the same underlying factor influencing the dependent variable. Multicollinearity may result from the absence of observable variables associated with a specific theory or from a theory failing to provide a unique candidate for a regressor. Moreover, some regressors may share a common determinant. The dilution prior proposed by George (2010), described in detail in the section below, constitutes one of the available approaches for addressing this issue.
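The mechanics of dilution can be illustrated with a small self-contained sketch. George (2010) builds dilution weights from the determinant of the correlation matrix of a model's regressors, \(|R_M|\); the sketch below additionally assumes the weight takes the form \(|R_M|^{\omega}\), with \(\omega\) playing the role of the dil.Par parameter. The data and the function name are illustrative, not part of the package API, and the exact weighting used internally by bma() may differ in detail.

```r
# Sketch of a dilution weight in the spirit of George (2010): the prior
# probability of model M is scaled by |R_M|^omega, where R_M is the
# correlation matrix of the regressors included in M. |R_M| equals 1 for
# orthogonal regressors and approaches 0 under severe multicollinearity,
# so models with redundant regressors are "diluted".
dilution_weight <- function(X, included, omega = 0.5) {
  R <- cor(X[, included, drop = FALSE])
  det(R)^omega
}

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.1)  # near-duplicate proxy for x1
x3 <- rnorm(100)                 # unrelated regressor
X  <- cbind(x1, x2, x3)

dilution_weight(X, c(1, 3))  # near-orthogonal pair: weight close to 1
dilution_weight(X, c(1, 2))  # collinear pair: weight close to 0
```

Because \(|R_M|\) is close to 1 for roughly orthogonal regressors and close to 0 for near-duplicates, models that include several proxies for the same factor receive a much smaller prior weight.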

To apply the dilution prior, the user must set dilution = 1 in the bma function. The user can also adjust the dilution parameter \(\omega\) through dil.Par; the default, dil.Par = 0.5, follows the recommendation of George (2010).

bma_results_dil <- bma(
  modelSpace = modelSpace,
  round      = 3,
  dilution   = 1
)

The effect of implementing the dilution prior is well depicted by the distribution of prior probability mass over the model sizes.

size_graphs_dil <- model_sizes(bma_results_dil)

The change in the prior distribution is more visible for the binomial-beta model prior. In panel b, the prior probability mass has decreased for larger models and increased for smaller models. However, this change is not uniform, as models characterized by the highest degree of multicollinearity are subject to the greatest penalty in terms of prior probability mass.

Before moving to the BMA statistics, it is instructive to examine the effect of changing the dil.Par parameter.

bma_results_dil01 <- bma(
  modelSpace = modelSpace,
  round      = 3,
  dilution   = 1,
  dil.Par    = 0.1
)
size_graphs_dil01 <- model_sizes(bma_results_dil01)

As we can see, decreasing the value of \(\omega\) diminishes the impact of dilution on the model prior. Conversely, raising the dil.Par parameter increases the degree of dilution.
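The direction of this effect follows directly from the assumed form of the weight: for \(0 < |R_M| < 1\), raising the exponent shrinks \(|R_M|^{\omega}\) toward 0, while lowering it pulls the weight toward 1. A one-line numerical check, using an illustrative determinant value rather than package output:

```r
# Assuming a dilution weight of the form |R|^omega for a collinear model
# whose correlation-matrix determinant is |R| = 0.01:
detR <- 0.01
sapply(c(0.1, 0.5, 2), function(omega) detR^omega)
# omega = 0.1 gives ~0.63 (mild dilution); 0.5 gives 0.1; 2 gives 1e-04 (harsh)
```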

bma_results_dil2 <- bma(
  modelSpace = modelSpace,
  round      = 3,
  dilution   = 1,
  dil.Par    = 2
)
size_graphs_dil2 <- model_sizes(bma_results_dil2)

An especially strong impact can be seen for the binomial-beta prior. However, even with such a heavy penalty on multicollinearity, the main BMA statistics remain stable.

bma_results_dil2[[2]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.712 0.721  7.712  0.721 1.000 1.000
## B          0.530  0.225 0.238  0.424  0.151 0.528 0.763
## LNDGEO     1.000 -1.251 0.078 -1.251  0.078 0.000 1.000
## L          0.556  0.289 0.299  0.520  0.202 0.552 0.775
## LNRGDPPROD 1.000  0.912 0.019  0.912  0.019 1.000 1.000
## HUMAN      0.065 -0.005 0.056 -0.084  0.205 0.022 0.510
## INFVAR     0.999 -0.049 0.011 -0.049  0.011 0.000 1.000
## ARABLE     0.131  0.001 0.002  0.005  0.004 0.119 0.554
## ARABLEpw   0.096  0.000 0.001 -0.003  0.002 0.008 0.540
## LAND       0.047  0.000 0.000  0.000  0.000 0.018 0.506
## LANDpc     0.978  0.000 0.000  0.000  0.000 0.978 0.989

As discussed in the dilution prior section, the dilution prior proposed by George (2010) is empirical in nature and, as such, departs from a strictly Bayesian framework. To address this concern, the present manuscript proposes a non-empirical alternative termed group dilution.

Within the dataset considered here, three pairs of variables may plausibly capture the same underlying determinants of international trade: ARABLE and ARABLEpw, LAND and LANDpc, as well as B and L. The first two pairs are self-explanatory, as they represent level and per-worker/person transformations of the same underlying quantities. In the case of B and L, countries that share a common border are more likely to share a common language, suggesting potential informational overlap between these regressors.

In order to implement group dilution, the user must provide two vectors. The first vector must have a length equal to the number of regressors. In this vector, a value of 0 indicates that a given regressor is not associated with any other regressor, that is, it does not represent the same underlying determinant as any other variable. Consecutive positive integers indicate membership in a group of related regressors, namely those capturing the same underlying determinant. Consequently, for the case outlined in the previous paragraph:

group_vec <- c(1,0,1,0,0,0,2,2,3,3)

The numerical indices can be displayed alongside the regressor names by executing:

cbind(modelSpace[[1]],group_vec)
##                    group_vec
##  [1,] "B"          "1"      
##  [2,] "LNDGEO"     "0"      
##  [3,] "L"          "1"      
##  [4,] "LNRGDPPROD" "0"      
##  [5,] "HUMAN"      "0"      
##  [6,] "INFVAR"     "0"      
##  [7,] "ARABLE"     "2"      
##  [8,] "ARABLEpw"   "2"      
##  [9,] "LAND"       "3"      
## [10,] "LANDpc"     "3"

The second vector must have a length equal to the number of groups of related regressors. Each element of this vector specifies the dilution parameter associated with the corresponding group. If the researcher considers it unlikely that B and L represent the same underlying determinant, the dilution parameter for the group containing these variables may be set to 0.8. If, by contrast, the researcher regards the relationship between ARABLE and ARABLEpw as more probable, and the association between LAND and LANDpc as even stronger, the dilution parameters for these groups may be set to 0.6 and 0.4, respectively. Consequently, the dilution parameter vector is given by:

par_vec <- c(0.8,0.6,0.4)
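The penalty implied by these two vectors can be sketched as follows. The function below is a hypothetical illustration consistent with the description above, not the package's internal code: it multiplies a model's prior weight by the dilution parameter of every group from which the model includes two or more regressors.

```r
# Hypothetical group dilution weight: for each group of related regressors,
# the model's prior weight is multiplied by that group's dilution parameter
# whenever the model includes at least two of the group's members.
group_dilution_weight <- function(included, group_vec, par_vec) {
  weight <- 1
  for (g in seq_along(par_vec)) {
    if (sum(group_vec[included] == g) >= 2) {
      weight <- weight * par_vec[g]
    }
  }
  weight
}

group_vec <- c(1, 0, 1, 0, 0, 0, 2, 2, 3, 3)
par_vec   <- c(0.8, 0.6, 0.4)

group_dilution_weight(c(1, 2, 4), group_vec, par_vec)    # at most one proxy per group: 1
group_dilution_weight(c(1, 3, 7, 8), group_vec, par_vec) # B with L, and both ARABLEs: 0.8 * 0.6 = 0.48
```

Under this rule, models drawing a single proxy from each group are not penalized at all, which is what makes the approach non-empirical: the penalties come from the researcher's prior beliefs about the groups, not from the sample correlations.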

The results under the binomial model prior with group dilution can be obtained by executing:

bma_results_dil3 <- bma(
  modelSpace = modelSpace,
  Narrative  = 1,
  Nar_vec    = group_vec,
  p          = par_vec,
  round      = 3
)
bma_results_dil3[[1]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.622 0.714  7.622  0.714 1.000 1.000
## B          0.693  0.295 0.233  0.426  0.151 0.692 0.845
## LNDGEO     1.000 -1.236 0.076 -1.236  0.076 0.000 1.000
## L          0.491  0.247 0.289  0.503  0.204 0.487 0.742
## LNRGDPPROD 1.000  0.911 0.019  0.911  0.019 1.000 1.000
## HUMAN      0.054 -0.005 0.051 -0.088  0.205 0.018 0.509
## INFVAR     0.999 -0.049 0.011 -0.050  0.011 0.000 1.000
## ARABLE     0.113  0.001 0.002  0.005  0.003 0.102 0.546
## ARABLEpw   0.111  0.000 0.001 -0.003  0.002 0.011 0.545
## LAND       0.022  0.000 0.000  0.000  0.000 0.008 0.502
## LANDpc     0.984  0.000 0.000  0.000  0.000 0.984 0.992

For the binomial-beta prior with group dilution, execute:

bma_results_dil3[[2]]
##              PIP     PM   PSD  PMcon PSDcon  P(+)   PSC
## CONST         NA  7.614 0.720  7.614  0.720 1.000 1.000
## B          0.710  0.298 0.229  0.419  0.151 0.708 0.853
## LNDGEO     1.000 -1.233 0.077 -1.233  0.077 0.000 1.000
## L          0.532  0.265 0.289  0.498  0.204 0.528 0.762
## LNRGDPPROD 1.000  0.911 0.019  0.911  0.019 1.000 1.000
## HUMAN      0.069 -0.006 0.058 -0.089  0.205 0.023 0.512
## INFVAR     0.999 -0.050 0.011 -0.050  0.011 0.000 1.000
## ARABLE     0.144  0.001 0.002  0.005  0.003 0.131 0.559
## ARABLEpw   0.141  0.000 0.001 -0.003  0.002 0.013 0.557
## LAND       0.028  0.000 0.000  0.000  0.000 0.011 0.503
## LANDpc     0.984  0.000 0.000  0.000  0.000 0.983 0.992

Similarly to the dilution prior proposed by George (2010), the application of the group dilution prior does not lead to any meaningful changes in the results. This finding suggests that multicollinearity does not constitute a substantive issue in the dataset under consideration.

7 Concluding remarks

The manuscript introduces the rmsBMA package, which implements reduced model space Bayesian model averaging. This framework enables the user to conduct BMA and perform meaningful inference even in situations where the number of regressors exceeds the number of observations. By restricting the model space, a sufficient number of observations is preserved for reliable statistical inference, while the analysis remains focused on the most relevant determinants. To the best of our knowledge, this functionality is currently unique to the rmsBMA package.

Another distinctive feature of the rmsBMA package is the newly introduced group dilution prior, which addresses the challenge of identifying robust determinants in the presence of multicollinearity. This approach enables users to specify dilution parameters for groups of variables associated with the same underlying theoretical construct. As a consequence, prior probabilities can be systematically adjusted for models that include multiple proxies capturing the same fundamental determinant. Moreover, the group dilution prior provides a non-empirical alternative to the dilution prior proposed by George (2010). The latter is empirical in nature, as it relies on the correlation structure of the regressors and therefore departs from a fully Bayesian treatment of prior specification.

Finally, the rmsBMA package enables users to specify a flexible class of \(g\)-priors, model priors, and dilution structures; compute jointness measures (Doppelhofer and Weeks 2009; Ley and Steel 2007; Hofmarcher et al. 2018); conduct Extreme Bounds Analysis; perform model selection procedures; and generate a comprehensive range of graphical outputs. These include visualizations of prior and posterior probabilities over the model space and model size, as well as histograms, kernel density estimates, weighted histograms of coefficients, and posterior density plots.

Consequently, rmsBMA integrates functionalities that are not jointly available in existing R packages or alternative software environments, while also introducing original methodological contributions.

References

Afonso, António, José Alves, and Krzysztof Beck. 2025. “Drivers of Migration Flows in the European Union: Earnings or Unemployment?” International Labour Review 164 (2): 1–23. https://doi.org/10.16995/ilr.18845.
Aller, Carlos, Lorenzo Ductor, and Daryna Grechyna. 2021. “Robust Determinants of CO2 Emissions.” Energy Economics 96 (105154). https://doi.org/10.1016/j.eneco.2021.105154.
Arin, K. Peren, Elias Braunfels, and Gernot Doppelhofer. 2019. “Revisiting the growth effects of fiscal policy: A Bayesian model averaging approach.” Journal of Macroeconomics 62 (103158): 1–16. https://doi.org/10.1016/j.jmacro.2019.103158.
Baran, S., and A. Möller. 2015. “Joint Probabilistic Forecasting of Wind Speed and Temperature Using Bayesian Model Averaging.” Environmetrics 26 (2): 120–32. https://doi.org/10.1002/env.2316.
Beck, Joanna, Krzysztof Beck, Marta de Białynia Woycikiewicz, and Katarzyna Jednoróg. 2025. “The Trajectory of Letter and Speech Sounds Association in Children: Insights from a Cross-Sectional Study.” Reading and Writing. https://doi.org/10.1007/s11145-025-10645-9.
Beck, Krzysztof. 2020. “What Drives International Trade? Robust Analysis for the European Union.” Journal of International Studies 13 (3): 68–84. https://doi.org/10.14254/2071-8330.2020/13-3/5.
Beck, Krzysztof, and Wojciech Grabowski. 2025. “Economic Crisis Contagion in the Eurozone: Results from Cross-Quantilogram and Network Approach.” Economics Letters 255: 112507. https://doi.org/10.1016/j.econlet.2025.112507.
Beck, Krzysztof, Mateusz Wyszyński, and Marcin Dubel. 2025. “Bdsm: Bayesian Dynamic Systems Modelling. Bayesian Model Averaging for Dynamic Panels with Weakly Exogenous Regressors.” Working paper 5243366. SSRN. https://doi.org/10.2139/ssrn.5243366.
Błażejowski, M., and J. Kwiatkowski. 2015. “Bayesian Model Averaging and Jointness Measures for gretl.” Journal of Statistical Software 68 (5): 1–24. https://doi.org/10.18637/jss.v068.i05.
Chen, Huigang, Alin Mirestean, and Charalambos G. Tsangarides. 2018. “Limited information Bayesian model averaging for dynamic panels with application to a trade gravity model.” Econometric Reviews 37 (7): 777–805. https://doi.org/10.1080/07474938.2016.1167857.
Clyde, Merlise A., Jayanta Ghosh, and Michael L. Littman. 2011. “Bayesian Adaptive Sampling for Variable Selection and Model Averaging.” Journal of Computational and Graphical Statistics 20 (1): 80–101. https://doi.org/10.1198/jcgs.2010.09049.
D’Andrea, Sara. 2022. “Are There Any Robust Determinants of Growth in Europe? A Bayesian Model Averaging Approach.” International Economics 171: 143–73. https://doi.org/10.1016/j.inteco.2022.06.001.
Doppelhofer, Gernot, and Melvyn Weeks. 2009. “Jointness of growth determinants.” Journal of Applied Econometrics 24 (2): 209–44. https://doi.org/10.1002/jae.1046.
Ductor, Lorenzo, and Danilo Leiva-Leon. 2016. “Dynamics of Global Business Cycles Interdependence.” Journal of International Economics 102: 1109–27. https://doi.org/10.1016/j.jinteco.2016.07.003.
Eicher, Theo S., Chris Papageorgiou, and Olivier Roehn. 2007. “Unraveling the fortunes of the fortunate: An Iterative Bayesian Model Averaging (IBMA) approach.” Journal of Macroeconomics 29 (3): 494–514. https://doi.org/10.1016/j.jmacro.2007.01.008.
Feenstra, Robert C. 2015. Advanced International Trade: Theory and Evidence. 2nd ed. Princeton, NJ: Princeton University Press.
Feldkircher, Martin, and Stefan Zeugner. 2015. “Bayesian Model Averaging Employing Fixed and Flexible Priors: The BMS Package for R.” Journal of Statistical Software 68 (4): 1–37. https://doi.org/10.18637/jss.v068.i04.
Fernández, Carmen, Eduardo Ley, and Mark F. J. Steel. 2001a. “Benchmark priors for Bayesian model averaging.” Journal of Econometrics 100 (2): 381–427. https://doi.org/10.1016/S0304-4076(00)00076-2.
———. 2001b. “Model uncertainty in cross-country growth regressions.” Journal of Applied Econometrics 16 (5): 563–76. https://doi.org/10.1002/jae.623.
Figini, Silvia, and Paolo Giudici. 2017. “Credit Risk Assessment with Bayesian Model Averaging.” Communications in Statistics - Theory and Methods 46 (19): 9507–17. https://doi.org/10.1080/03610926.2016.1212070.
Foster, Dean P., and Edward I. George. 1994. “The Risk Inflation Criterion for Multiple Regression.” The Annals of Statistics 22: 1947–75. https://doi.org/10.1214/aos/1176325766.
George, Edward I. 2010. “Dilution priors: Compensating for model space redundancy.” In Borrowing Strength: Theory Powering Applications – a Festschrift for Lawrence D. Brown, edited by James O. Berger, Tony Cai, and Iain M. Johnstone, 158–65. Beachwood, OH: Institute of Mathematical Statistics. https://doi.org/10.1214/10-IMSCOLL611.
Granger, Clive W. J., and Harald F. Uhlig. 1990. “Reasonable Extreme-Bounds Analysis.” Journal of Econometrics 44 (1): 159–70. https://doi.org/10.1016/0304-4076(90)90077-7.
Guliyev, Hasraddin. 2024. “Determinants of Ecological Footprint in European Countries: Fresh Insight from Bayesian Model Averaging for Panel Data Analysis.” Science of The Total Environment 912: 169455. https://doi.org/10.1016/j.scitotenv.2023.169455.
Hlavac, Marek. 2016. “ExtremeBounds: Extreme Bounds Analysis in r.” Journal of Statistical Software 72 (9): 1–22. https://doi.org/10.18637/jss.v072.i09.
Hofmarcher, Paul, Jesus Crespo Cuaresma, Bettina Grün, Stefan Humer, and Mathias Moser. 2018. “Bivariate jointness measure in Bayesian Model Averaging: Solving the conundrum.” Journal of Macroeconomics 57: 150–65. https://doi.org/10.1016/j.jmacro.2018.05.005.
Horvath, Roman, Eva Horvatova, and Maria Siranova. 2024. “The Determinants of Financial Development: Evidence from Bayesian Model Averaging.” Economic Systems, no. 101274. https://doi.org/10.1016/j.ecosys.2024.101274.
Jeffreys, Harold. 1946. “An invariant form for the prior probability in estimation problems.” Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 186 (1007): 453–61. https://doi.org/10.1098/rspa.1946.0056.
Judd, John P., and John L. Scadding. 1982. “The Search for a Stable Money Demand Function: A Survey of the Post-1973 Literature.” Journal of Economic Literature 20 (3): 993–1023. https://www.jstor.org/stable/2724409.
Kass, Robert E., and Adrian E. Raftery. 1995. “Bayes Factors.” Journal of the American Statistical Association 90 (430): 773–95. https://doi.org/10.1080/01621459.1995.10476572.
Kass, Robert E., and Larry Wasserman. 1995. “A Reference Bayesian Test for Nested Hypotheses and its Relationship to the Schwarz Criterion.” Journal of the American Statistical Association 90 (431): 928–34. https://doi.org/10.2307/2291327.
Koop, Gary. 2003. Bayesian Econometrics. 1st ed. Hoboken, NJ, USA: John Wiley & Sons.
Laidler, David E. W. 1993. The Demand for Money: Theories, Evidence, and Problems. 4th ed. New York, NY: HarperCollins College Publishers.
Leamer, Edward E. 1978. Specification Searches: Ad Hoc Inference with Nonexperimental Data. New York: John Wiley & Sons.
Leamer, Edward E., and Herman Leonard. 1981. “Reporting the fragility of regression estimates.” The Review of Economics and Statistics 65 (2): 306–17. https://doi.org/10.2307/1924497.
Lenkoski, Alex, Theo S. Eicher, and Adrian E. Raftery. 2014. “Two-Stage Bayesian Model Averaging in Endogenous Variable Models.” Econometric Reviews 33 (1-4): 122–51. https://doi.org/10.1080/07474938.2013.807150.
León-González, Roberto, and Daniel Montolio. 2015. “Endogeneity and panel data in growth regressions: A Bayesian model averaging approach.” Journal of Macroeconomics 24: 23–39. https://doi.org/10.1016/j.jmacro.2015.07.003.
Levine, Ross, and David Renelt. 1992. “A Sensitivity Analysis of Cross-Country Growth Regressions.” The American Economic Review 82 (4): 942–63.
Ley, Eduardo, and Mark F. J. Steel. 2007. “Jointness in Bayesian variable selection with applications to growth regression.” Journal of Macroeconomics 29 (3): 476–93. https://doi.org/10.1016/j.jmacro.2006.12.002.
———. 2009. “On the effect of prior assumptions in Bayesian model averaging with applications to growth regression.” Journal of Applied Econometrics 24 (4): 651–74. https://doi.org/10.1002/jae.1057.
———. 2012. “Mixtures of g-priors for Bayesian model averaging with economic applications.” Journal of Econometrics 171 (2): 251–66. https://doi.org/10.1016/j.jeconom.2012.06.009.
Liu, Chun, and John M. Maheu. 2009. “Forecasting Realized Volatility: A Bayesian Model-Averaging Approach.” Journal of Applied Econometrics 24 (5): 709–33. https://doi.org/10.1002/jae.1070.
MacKinnon, James G., and Halbert White. 1985. “Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties.” Journal of Econometrics 29 (3): 305–25. https://doi.org/10.1016/0304-4076(85)90158-7.
Masanjala, Winford H., and Chris Papageorgiou. 2008. “Rough and lonely road to prosperity: a reexamination of the growth in Africa using Bayesian model averaging.” Journal of Applied Econometrics 23 (5): 671–82. https://doi.org/10.1002/jae.1020.
Mirestean, Alin, and Charalambos G. Tsangarides. 2016. “Growth Determinants Revisited Using Limited‐Information Bayesian Model Averaging.” Journal of Applied Econometrics 31 (1): 106–32. https://doi.org/10.1002/jae.2472.
Moral-Benito, Enrique. 2012. “Determinants of Economic Growth: A Bayesian Panel Data Approach.” The Review of Economics and Statistics 92 (4): 566–79. https://doi.org/10.1162/REST_a_00154.
———. 2013. “Likelihood-Based Estimation of Dynamic Panels with Predetermined Regressors.” Journal of Business and Economic Statistics 31 (4): 451–72. https://doi.org/10.1080/07350015.2013.818003.
———. 2015. “Model Averaging in Economics: An Overview.” Journal of Economic Surveys 29 (1): 46–75. https://doi.org/10.1111/joes.12044.
———. 2016. “Growth Empirics in Panel Data Under Model Uncertainty and Weak Exogeneity.” Journal of Applied Econometrics 31 (3): 584–602. https://doi.org/10.1002/jae.2429.
Moral-Benito, Enrique, Paul Allison, and Richard Williams. 2019. “Dynamic panel data modelling using maximum likelihood: An alternative to Arellano-Bond.” Applied Economics 51 (20): 2221–32. https://doi.org/10.1080/00036846.2018.1540854.
Moser, Mathias, and Paul Hofmarcher. 2014. “Model priors revisited: Interaction terms in BMA growth applications.” Journal of Applied Econometrics 29 (2): 344–47. https://doi.org/10.1002/jae.2365.
Payne, Richard D., Pallavi Ray, and Mitchell A. Thomann. 2024. “Bayesian Model Averaging of Longitudinal Dose-Response Models.” Journal of Biopharmaceutical Statistics 34 (3): 349–65. https://doi.org/10.1080/10543406.2023.2292214.
Raftery, Adrian E. 1995. “Bayesian model selection in social research.” Sociological Methodology 25: 111–63. https://doi.org/10.2307/271063.
Raftery, Adrian E., David Madigan, and Jennifer A. Hoeting. 1997. “Bayesian Model Averaging for Linear Regression Models.” Journal of the American Statistical Association 92 (437): 179–91. https://doi.org/10.1080/01621459.1997.10473615.
Raftery, Adrian E., Ian S. Painter, and Christopher T. Volinsky. 2005. “BMA: An R Package for Bayesian Model Averaging.” R News 5 (2): 2–8.
Sala-I-Martin, Xavier. 1997. “I Just Ran Two Million Regressions.” The American Economic Review 87 (2): 178–83.
Sala-I-Martin, Xavier, Gernot Doppelhofer, and Ronald I. Miller. 2004. “Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach.” The American Economic Review 94: 813–35. https://doi.org/10.1257/0002828042002570.
Sloughter, J. McLean, Tilmann Gneiting, and Adrian E. Raftery. 2013. “Probabilistic Wind Vector Forecasting Using Ensembles and Bayesian Model Averaging.” Monthly Weather Review 141 (6): 2107–19. https://doi.org/10.1175/MWR-D-12-00002.1.
Steel, Mark F. J. 2020. “Model Averaging and Its Use in Economics.” Journal of Economic Literature 58 (3): 644–719. https://doi.org/10.1257/jel.20191385.
Tinbergen, Jan. 1962. Shaping the World Economy: Suggestions for an International Economic Policy. New York: Twentieth Century Fund.
Zellner, A. 1986. “On Assessing Prior Distributions and Bayesian Regression Analysis with g Prior Distributions.” In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti. Studies in Bayesian Econometrics 6, edited by P. K. Goel and A. Zellner, 233–43. New York.

  1. For a detailed review of BMA applications in economics, see (Moral-Benito 2015; Steel 2020).↩︎

  2. For more on the simultaneity problem within the BMA framework, see (Moral-Benito 2012, 2013, 2016; Lenkoski, Eicher, and Raftery 2014; León-González and Montolio 2015; Mirestean and Tsangarides 2016; Chen, Mirestean, and Tsangarides 2018; Moral-Benito, Allison, and Williams 2019).↩︎

  3. Hlavac (2016) developed an R package for EBA.↩︎

  4. This option is chosen by setting the parameter g = "None" in the model_space function, as described in the model_space section.↩︎

  5. This option is chosen by setting the parameter HC = TRUE in the model_space function, as described in the model_space section.↩︎

  6. For an introduction to BMA, see Kass and Raftery (1995), Raftery (1995), Raftery, Madigan, and Hoeting (1997), Doppelhofer and Weeks (2009), Amini and Parmeter (2011), Beck (2017), and Fragoso et al. (2018).↩︎

  7. For a thorough discussion of model priors, see Sala-I-Martin, Doppelhofer, and Miller (2004), Ley and Steel (2009), George (2010), and Eicher et al. (2011).↩︎

  8. To learn more about jointness measures, we recommend reading Doppelhofer and Weeks (2009), Ley and Steel (2007), and Hofmarcher et al. (2018), in that order.↩︎

  9. See the section on jointness for the interpretations of the jointness measures.↩︎

  10. The subsetting operator [1:9, 1:9] is applied to the output of the jointness command to restrict it to the first nine rows and columns for presentation purposes.↩︎