| Title: | Reduced Model Space Bayesian Model Averaging |
| Version: | 0.1.1 |
| Description: | Implements Bayesian model averaging for settings with many candidate regressors relative to the available sample size, including cases where the number of regressors exceeds the number of observations. By restricting attention to models with at most M regressors, the package supports reduced model space inference, thereby preserving degrees of freedom for estimation. It provides posterior summaries, Extreme Bounds Analysis, model selection procedures, joint inclusion measures, and graphical tools for exploring model probabilities, model size distributions, and coefficient distributions. The methodological approach follows Doppelhofer and Weeks (2009) <doi:10.1002/jae.1046>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Imports: | ggplot2, ggpubr, grid, gridExtra, Matrix, stats, tidyr, utils |
| Depends: | R (≥ 3.5.0) |
| LazyData: | true |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-03-06 19:53:25 UTC; becku |
| Author: | Krzysztof Beck |
| Maintainer: | Krzysztof Beck <beckkrzysztof@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-03-11 19:30:03 UTC |
Determinants of International Trade in the European Union
Description
The dataset contains bilateral trade and its economic determinants for 26 European Union countries over the period 1995–2015. Each observation represents a country pair, resulting in 325 unique trading pairs.
The countries included are: Austria, Belgium, Bulgaria, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, the Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, and the United Kingdom.
The dataset was used in: Beck (2020), What drives international trade? Robust analysis for the European Union, Journal of International Studies, 13(3), 68–84.
Usage
data(Trade_data)
Format
A data frame with 325 rows and 18 variables:
- LNTRADE
Natural logarithm of bilateral trade between two countries.
- B
Border dummy (1 if countries share a common border; 0 otherwise).
- LNDGEO
Natural logarithm of the shortest distance between capital cities.
- L
Common language dummy (1 if countries share at least one official language; 0 otherwise).
- LNRGDPPROD
Natural logarithm of the product of real GDPs of the two countries.
- RGDPpcDIFF
Absolute difference in real GDP per capita.
- GOV
Absolute difference in government expenditure shares of GDP.
- HUMAN
Absolute difference in human capital indicators.
- CPW
Absolute difference in capital per worker.
- INFVAR
Absolute difference in the standard deviation of inflation rates.
- ARABLE
Absolute difference in arable land.
- ARABLEpw
Absolute difference in arable land per worker.
- LAND
Absolute difference in total land area.
- LANDpc
Absolute difference in land area per capita.
- EPCpc
Absolute difference in electricity consumption per capita.
- FDI
Absolute difference in foreign direct investment flows.
- KSI
Krugman specialization index (value added, 35 sectors).
- BCIDIFF
Absolute difference in the Bayesian Corruption Index.
Source
Harvard Dataverse. doi:10.7910/DVN/JMMOEA
Determinants of International Trade in the European Union
Description
The dataset contains bilateral trade and its economic determinants for 26 European Union countries over the period 1995–2015. Each observation represents a country pair, resulting in 325 unique trading pairs.
The countries included are: Austria, Belgium, Bulgaria, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, the Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, and the United Kingdom.
The dataset was used in: Beck (2020), What drives international trade? Robust analysis for the European Union, Journal of International Studies, 13(3), 68–84.
Usage
data(Trade_data_small)
Format
A data frame with 325 rows and 11 variables:
- LNTRADE
Natural logarithm of bilateral trade between two countries.
- B
Border dummy (1 if countries share a common border; 0 otherwise).
- LNDGEO
Natural logarithm of the shortest distance between capital cities.
- L
Common language dummy (1 if countries share at least one official language; 0 otherwise).
- LNRGDPPROD
Natural logarithm of the product of real GDPs of the two countries.
- HUMAN
Absolute difference in human capital indicators.
- INFVAR
Absolute difference in the standard deviation of inflation rates.
- ARABLE
Absolute difference in arable land.
- ARABLEpw
Absolute difference in arable land per worker.
- LAND
Absolute difference in total land area.
- LANDpc
Absolute difference in land area per capita.
Source
Harvard Dataverse. doi:10.7910/DVN/JMMOEA
Table with the best models according to one of the posterior criteria
Description
This function creates a ranking of best models according to one of the possible criterion (PMP under binomial model prior, PMP under binomial-beta model prior, R^2 under binomial model prior, R^2 under binomial-beta model prior). The function gives two types of tables in three different formats: inclusion table (where 1 indicates presence of the regressor in the model and 0 indicates that the variable is excluded from the model) and estimation results table (it displays the best models and estimation output for those models: point estimates, standard errors, significance level, and R^2).
Usage
best_models(bma_list, criterion = 1, best = 5, round = 3, estimate = TRUE)
Arguments
bma_list |
bma object (the result of the bma function) |
criterion |
The criterion that will be used for a basis of the model ranking: |
best |
The number of the best models to be considered |
round |
Parameter indicating the decimal place to which number in the tables should be rounded (default round = 3) |
estimate |
A parameter with values TRUE or FALSE indicating which table should be displayed when
TRUE - table with estimation to the results |
Value
A list with best_models objects:
matrix with inclusion of the regressors in the best models
matrix with estimation output in the best models with regular standard errors
knitr_kable table with inclusion of the regressors in the best models (the best for the display on the console - up to 11 models)
knitr_kable table with estimation output in the best models with regular standard errors (the best for the display on the console - up to 6 models)
gTree table with inclusion of the regressors in the best models (displayed as a plot). Use grid::grid.draw() to display.
gTree table with estimation output in the best models with regular standard errors (displayed as a plot). Use grid::grid.draw() to display.
Examples
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
models <- best_models(bma_list, best = 3)
models[[4]]
Calculation of the bma objects
Description
This function calculates bma and related objects for the modelSpace object obtained using model_space function.
Usage
bma(
modelSpace,
EMS = NULL,
dilution = 0,
dil.Par = 0.5,
Narrative = 0,
p = NULL,
Nar_vec = NULL,
round = 6
)
Arguments
modelSpace |
Model space object (the result of the model_space function) |
EMS |
Expected model size for model binomial and binomial-beta model prior. |
dilution |
Binary parameter: 0 - NO application of a dilution prior; 1 - application of a dilution prior (George 2010). |
dil.Par |
Parameter associated with dilution prior - the exponent of the determinant (George 2010). Used only if parameter dilution=1. |
Narrative |
Binary parameter: 0 - NO application of a Narrative dilution prior; 1 - application of a Narrative dilution prior. |
p |
Parameter or vector that indicates by how much we cut probability of a model with substitutes. |
Nar_vec |
Vector with information on narrative dilution prior where: 0 - the variable has no substitutes; numbers different than 0 denote consecutive groups of variables considered to be substitutes. |
round |
Parameter indicating to which place the function should round up the results in final tables. |
Value
A list with Posterior objects:
pmp_uniform_table - table with results with PMP under binomial model prior
pmp_random_table - table with results with PMP under binomial-beta model prior
eba_object - table with results of Extreme Bounds Analysis
pms_table - table with prior and posterior model sizes
x_names - vector with names of the regressors - to be used by the functions
K - total number of regressors
MS - size of the mode space
EMS - expected model size for binomial and binomial-beta model prior specified by the user (default EMS=K/2)
dilution - parameter indicating use of dilution
for_jointness - table for jointness function
for_best_models - table for best_models function
for_model_pmp - table for model_pmp function
for_model_sizes - table for model_sizes function
alphas - vector with the values of the constant
betas_nonzero - matrix with coefficients on regressors
Examples
data("Trade_data", package = "rmsBMA")
data <- Trade_data
modelSpace <- model_space(data, M = 6)
bma_list <- bma(modelSpace)
bma_list[[1]]
Graphs of the distribution of the coefficients over the model space
Description
This function draws graphs of the distribution (in the form of histogram or kernel density) of the coefficients for all the considered regressors over the part of the model space that includes this regressors (half of the model space).
Arguments
bma_list |
bma object (the result of the bma function) |
weight |
Parameter indicating whether the coefficients should be weighted by posterior model probabilities:
|
BW |
Parameter indicating what method should be chosen to find bin widths for the histograms:
|
binW |
A vector with bin widths to be used to construct histograms for the regressors. The vector must be of the size equal to total number of regressors. The vector with bin widths is used only if parameter BW="vec". |
BN |
Parameter taking the values (default: BN = 0): |
num |
A vector with the numbers of bins used to be used to construct histograms for the regressors. The vector must be of the size equal to total number of regressors. The vector with bin widths is used only if parameter BN=1. |
kernel |
A parameter taking the values (default: kernel = 0): |
Value
A list with the graphs of the distribution of coefficients for all the considered regressors.
Examples
data("Trade_data", package = "rmsBMA")
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
coefs <- coef_hist(bma_list, kernel = 1)
coefs[[1]]
coefs[[2]]
Extracting coefficients from g_regression and fast_ols function
Description
The function extracts coefficients or standard errors from the results of g_regression and fast_ols function
Usage
coef_to_full(model_coefs, model_row)
Arguments
model_coefs |
a vector with estimated coefficients or standard errors |
model_row |
inclusion vector - row of a model space matrix |
Value
A vector of coefficients coefficients or standard errors from the results of g_regression and fast_ols function
Introduction of time and section fixed effects and data standardization.
Description
If the data is in the panel form the function assumes it has the following structure
section_1 year_1 y x1 x2 x3 ....
section_2 year_1 y x1 x2 x3 ....
section_3 year_1 y x1 x2 x3 ....
........
section_n year_1 y x1 x2 x3 ....
section_1 year_2 y x1 x2 x3 ....
section_2 year_2 y x1 x2 x3 ....
section_3 year_2 y x1 x2 x3 ....
........
section_n year_2 y x1 x2 x3 ....
........
section_n year_(T-1) y x1 x2 x3 ....
section_1 year_T y x1 x2 x3 ....
section_2 year_T y x1 x2 x3 ....
section_3 year_T y x1 x2 x3 ....
........
section_n year_T y x1 x2 x3 ....
Usage
data_prep(
data,
FE = FALSE,
Time = 0,
Section = 0,
Time_FE = 0,
Section_FE = 0,
STD = 0
)
Arguments
data |
A data file. |
FE |
Binary variable: TRUE - include fixed effect, FALSE - do not include fixed effects. |
Time |
The number of time periods - works only if FE=1. |
Section |
The number of cross-sections - works only if EF=1. |
Time_FE |
Binary variable: 1 - include time fixed effect, 0 - do not include time fixed effects. Works only if EF=1. |
Section_FE |
Binary variable: 1 - include cross-section fixed effect, 0 - do not include cross-section fixed effects. Works only if EF=1. |
STD |
Binary variable: 1 - standardize the data set, 0 - do not standardize the data set. By standardization we mean subtraction of a mean and division by standard deviation of each variable. |
Value
Formatted data set.
Examples
y <- matrix(1:20,nrow=20,ncol=1)
x1 <- matrix(21:40,nrow=20,ncol=1)
x2 <- matrix(41:60,nrow=20,ncol=1)
data <- cbind(y,x1,x2)
new_data <- data_prep(data,FE=1,Time=5,Section=4,Time_FE=1,Section_FE=1,STD=0)
y <- rnorm(20, mean = 0, sd = 1)
x1 <- rnorm(20, mean = 0, sd = 1)
x2 <- rnorm(20, mean = 0, sd = 1)
data <- cbind(y,x1,x2)
new_data <- data_prep(data,FE=1,Time=5,Section=4,Time_FE=1,Section_FE=1,STD=1)
Fixed-effects demeaning and data standardization
Description
Prepares a dataset for econometric analysis by applying fixed-effects demeaning (within transformation) and/or standardization to numeric variables. The behavior of the function depends on whether panel identifiers are supplied and whether fixed effects are explicitly requested.
Usage
data_preparation(
data,
id = NULL,
time = NULL,
fixed_effects = FALSE,
effect = c("twoway", "section", "time"),
standardize = FALSE
)
Arguments
data |
A data.frame containing the data. |
id |
An optional character string specifying the cross-sectional
(section) identifier. Must be supplied together with |
time |
An optional character string specifying the time identifier.
Must be supplied together with |
fixed_effects |
Logical. If |
effect |
A character string indicating the fixed-effects structure
when |
standardize |
Logical. If |
Details
If both id and time are provided and fixed_effects = TRUE,
the function applies section, time, or two-way fixed-effects demeaning and may
optionally standardize the transformed variables. If fixed_effects = FALSE,
fixed-effects demeaning is skipped even when identifiers are present, and only
standardization (if requested) is applied.
If either id or time is missing, fixed-effects demeaning is not
available and the function requires standardize = TRUE.
For two-way fixed effects, the transformation is:
x_{it}^{*} = x_{it} - \bar{x}_{i\cdot} - \bar{x}_{\cdot t} + \bar{x}_{\cdot\cdot}
Standardization consists of subtracting the mean and dividing by the standard deviation of each variable and is applied after fixed-effects demeaning (if any).
The function operates in three modes:
-
Fixed effects only:
fixed_effects = TRUE,standardize = FALSE. -
Fixed effects + standardization:
fixed_effects = TRUE,standardize = TRUE. -
Standardization only:
fixed_effects = FALSE,standardize = TRUE.
When id and time are not provided, only the standardization-only
mode is available.
Missing values are ignored when computing means and standard deviations. After fixed-effects demeaning, an intercept term is redundant in subsequent linear regressions.
Value
A data.frame containing only numeric variables used in estimation.
Panel identifiers (id, time) are removed from the output.
Transformed variables preserve their original column names.
Examples
df <- migration_panel
# Standardization only (panel identifiers present but FE skipped)
X <- data_preparation(
df,
id = "Pair_ID",
time = "Year_0",
fixed_effects = FALSE,
standardize = TRUE
)
# Two-way fixed effects with standardization
X <- data_preparation(
df,
id = "Pair_ID",
time = "Year_0",
fixed_effects = TRUE,
effect = "twoway",
standardize = TRUE
)
# Section fixed effects only
X <- data_preparation(
df,
id = "Pair_ID",
time = "Year_0",
fixed_effects = TRUE,
effect = "section"
)
# Standardization only (no panel identifiers)
X <- data_preparation(df, standardize = TRUE)
Extreme bounds analysis summaries from a Bayesian model space
Description
Computes Extreme Bounds Analysis (EBA) summaries for the intercept and each
regressor across a model space. For each coefficient, the function reports:
the minimum coefficient ("Low"), maximum coefficient ("High"), the mean
coefficient ("Mean_coef"), and corresponding "extreme bounds" defined as
\mathrm{Low} - 2\cdot \mathrm{SE} and \mathrm{High} + 2\cdot \mathrm{SE},
where \mathrm{SE}=\sqrt{\mathrm{VAR}} is the standard error associated
with the coefficient estimate in the model attaining the minimum/maximum.
Usage
eba(betas, VAR, Reg_ID, var_tol = 0)
Arguments
betas |
Numeric matrix of dimension |
VAR |
Numeric matrix of dimension |
Reg_ID |
Numeric or integer matrix of dimension |
var_tol |
Nonnegative numeric scalar used as a tolerance when checking
variance positivity. Entries with |
Details
The intercept (constant) is assumed to be included in all models. Each
regressor is summarized only over models in which it is included, as indicated
by the model-inclusion matrix Reg_ID.
Value
A numeric matrix of dimension (K+1) x 5 with columns:
- Lower_bound
\min(\beta) - 2\cdot \mathrm{SE}evaluated at the model where\betais minimal.- Low
Minimum coefficient value across relevant models.
- Mean_coef
Mean coefficient across relevant models (intercept: all models; regressor: included models only).
- High
Maximum coefficient value across relevant models.
- Upper_bound
\max(\beta) + 2\cdot \mathrm{SE}evaluated at the model where\betais maximal.
Rows correspond to the intercept (row 1) and regressors (rows 2..K+1).
If a regressor is never included (no 1s in its column of Reg_ID), its row
will contain NA.
OLS calculation with additional objects
Description
OLS calculation with additional objects
Usage
fast_ols(y, x)
Arguments
y |
A vector with the dependent variable. |
x |
A matrix with with regressors as columns. |
Value
A list with OLS objects: Coefficients, Standard errors, Marginal likelihood, R^2, Degrees of freedom, Determinant of the regressors' matrix.
Examples
x1<-rnorm(10, mean = 0, sd = 1)
x2<-rnorm(10, mean = 0, sd = 2)
e<-rnorm(10, mean = 0, sd = 0.5)
y<-2+x1+2*x2+e
x<-cbind(x1,x2)
fast_ols(y,x)
OLS calculation with heteroscedasticity consistent covariance matrix (MacKinnon & White 1985).
Description
OLS calculation with heteroscedasticity consistent covariance matrix (MacKinnon & White 1985).
Usage
fast_ols_HC(y, x)
Arguments
y |
A vector with the dependent variable. |
x |
A matrix with with regressors as columns. |
Value
A list with OLS objects: Coefficients, Standard errors, Marginal likelihood, R^2, Degrees of freedom, Determinant of the regressors' matrix.
Examples
x1<-rnorm(10, mean = 0, sd = 1)
x2<-rnorm(10, mean = 0, sd = 2)
e<-rnorm(10, mean = 0, sd = 0.5)
y<-2+x1+2*x2+e
x<-cbind(x1,x2)
fast_ols_HC(y,x)
OLS calculation with additional objects for model with just a constant.
Description
OLS calculation with additional objects for model with just a constant.
Usage
fast_ols_const(y)
Arguments
y |
A vector with the dependent variable. |
Value
A list with OLS objects: Coefficients, Standard errors, Marginal likelihood, R^2, Degrees of freedom, Determinant of the regressors matrix.
Examples
x1<-rnorm(10, mean = 0, sd = 1)
x2<-rnorm(10, mean = 0, sd = 2)
e<-rnorm(10, mean = 0, sd = 0.5)
y<-2+x1+2*x2+e
x<-cbind(x1,x2)
fast_ols_const(y)
Regression with g prior
Description
The function implements Bayesian regression with g prior (Zellner, 1986)
Usage
g_regression(data, g = "UIP")
Arguments
data |
A matrix with data. The first column is interpreted as with the dependent variable, while the remaining columns are interpreted as regressors. |
g |
Value for g in the g prior. Either a number above zero specified by the user or: |
Value
A list with g_regression objects:
Expected values of coefficients
Posterior standard errors
Natural logarithm of marginal likelihood
R^2 form ols model
Degrees of freedom
Determinant of the regressors' matrix
Examples
x1 <- rnorm(100, mean = 0, sd = 1)
x2 <- rnorm(100, mean = 0, sd = 2)
e <- rnorm(100, mean = 0, sd = 5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression(data, g = "UIP")
g_result[[1]]
g_result[[2]]
x1 <- rnorm(50, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 2)
e <- rnorm(50, mean = 0, sd = 0.5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression(data, g = "benchmark")
g_result[[1]]
g_result[[2]]
Fast regression with g prior
Description
The function implements Bayesian regression with g prior (Zellner, 1986)
Usage
g_regression_fast(data, g = 0.5)
Arguments
data |
A matrix with data. The first column is interpreted as with the dependent variable, while the remaining columns are interpreted as regressors. |
g |
Value for g in the g prior. Default value: g = 0.5. |
Value
A list with g_regression objects:
Expected values of coefficients
Posterior standard errors
Natural logarithm of marginal likelihood
R^2 form ols model
Degrees of freedom
Determinant of the regressors' matrix
Examples
x1 <- rnorm(100, mean = 0, sd = 1)
x2 <- rnorm(100, mean = 0, sd = 2)
e <- rnorm(100, mean = 0, sd = 5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression_fast(data, g = 0.99)
g_result[[1]]
g_result[[2]]
x1 <- rnorm(50, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 2)
e <- rnorm(50, mean = 0, sd = 0.5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression_fast(data, g = 1.1)
g_result[[1]]
g_result[[2]]
Fast regression with g prior and with heteroscedasticity consistent covariance matrix (MacKinnon & White 1985).
Description
The function implements Bayesian regression with g prior (Zellner, 1986)
Usage
g_regression_fast_HC(data, g = 0.5)
Arguments
data |
A matrix with data. The first column is interpreted as with the dependent variable, while the remaining columns are interpreted as regressors. |
g |
Value for g in the g prior. Default value: g = 0.5. |
Value
A list with g_regression objects:
Expected values of coefficients
Posterior standard errors
Natural logarithm of marginal likelihood
R^2 form ols model
Degrees of freedom
Determinant of the regressors' matrix
Examples
x1 <- rnorm(100, mean = 0, sd = 1)
x2 <- rnorm(100, mean = 0, sd = 2)
e <- rnorm(100, mean = 0, sd = 5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression_fast_HC(data, g = 0.99)
g_result[[1]]
g_result[[2]]
x1 <- rnorm(50, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 2)
e <- rnorm(50, mean = 0, sd = 0.5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2)
g_result <- g_regression_fast_HC(data, g = 1.1)
g_result[[1]]
g_result[[2]]
Fast regression with g prior for a model with just a constant
Description
The function implements Bayesian regression with g prior (Zellner, 1986) for a model with just a constant
Usage
g_regression_fast_const(y, g = 0.5)
Arguments
y |
A vector with data - only the dependent variable. |
g |
Value for g in the g prior. Default value: g = 0.5. |
Value
A list with g_regression objects:
Expected values of coefficients
Posterior standard errors
Natural logarithm of marginal likelihood
R^2 form ols model
Degrees of freedom
Determinant of the regressors' matrix
Examples
x1 <- rnorm(100, mean = 0, sd = 1)
x2 <- rnorm(100, mean = 0, sd = 2)
e <- rnorm(100, mean = 0, sd = 5)
y <- 2 + x1 + 2*x2 + e
g_result <- g_regression_fast_const(y, g = 0.99)
g_result[[1]]
g_result[[2]]
x1 <- rnorm(50, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 2)
e <- rnorm(50, mean = 0, sd = 0.5)
y <- 2 + x1 + 2*x2 + e
g_result <- g_regression_fast_const(y, g = 1.1)
g_result[[1]]
g_result[[2]]
Compute group-based dilution factors for a model space
Description
Computes a group-based dilution factor for each model (row) in a model-inclusion
matrix. Regressors are assigned to "relatedness groups" via Nar_vec. For each
model and each group, the dilution exponent increases by 1 for every additional
regressor from that group beyond the first. The model's dilution factor is the
product of group-specific penalties.
Usage
group_dilution(Reg_ID, Nar_vec, p)
Arguments
Reg_ID |
An |
Nar_vec |
An integer vector of length |
p |
Either:
If
|
Details
Formally, for model i and group h >= 1, let
c_{ih} = \sum_{j=1}^K \gamma_{ij} I\{g(j)=h\}
and
D_i = \sum_{h \ge 1} \max(0, c_{ih} - 1).
If p is a scalar, the dilution factor is
p^{D_i}
.
If p is group-specific with values p_h, then
\mathrm{NarDilution}_i = \prod_{h \ge 1} p_h^{\max(0, c_{ih}-1)}.
Value
A numeric vector of length MS containing the dilution factor for each model.
Examples
# Example model space: MS models, K regressors
Reg_ID <- rbind(
c(0,0,0,0,0), # null
c(1,1,0,0,0), # two from group 1 -> penalty p_1^(1)
c(1,1,1,0,0), # three from group 1 -> penalty p_1^(2)
c(1,1,0,1,1) # two from group 1 and two from group 2 -> p_1^1 * p_2^1
)
Nar_vec <- c(1,1,1,2,2)
# Scalar p (same for all groups)
group_dilution(Reg_ID, Nar_vec, p = 0.7)
# Group-specific p (vector in group order: group 1 then group 2)
group_dilution(Reg_ID, Nar_vec, p = c(0.7, 0.5))
# Group-specific p with explicit mapping via names
group_dilution(Reg_ID, Nar_vec, p = c("1"=0.7, "2"=0.5))
Calculation of of the jointness measures
Description
This function calculates four types of the jointness measures based on the posterior model probabilities calculated using binomial and binomial-beta model prior. The four measures are:
HCGHM - for Hofmarcher et al. (2018) measure;
LS - for Ley & Steel (2007) measure;
DW - for Doppelhofer & Weeks (2009) measure;
PPI - for posterior probability of including both variables.
The measures under binomial model prior will appear in a table above the diagonal, and the measure calculated under binomial-beta model prior below the diagonal.
REFERENCES
Doppelhofer G, Weeks M (2009) Jointness of growth determinants. Journal of Applied Econometrics., 24(2), 209-244. doi: 10.1002/jae.1046
Hofmarcher P, Crespo Cuaresma J, Grün B, Humer S, Moser M (2018) Bivariate jointness measures in Bayesian Model Averaging: Solving the conundrum. Journal of Macroeconomics, 57, 150-165. doi: 10.1016/j.jmacro.2018.05.005
Ley E, Steel M (2007) Jointness in Bayesian variable selection with applications to growth regression. Journal of Macroeconomics, 29(3), 476-493. doi: 10.1016/j.jmacro.2006.12.002
Usage
jointness(bma_list, measure = "HCGHM", rho = 0.5, round = 3)
Arguments
bma_list |
bma object (the result of the bma function) |
measure |
Parameter for choosing the measure of jointness: |
rho |
The parameter "rho" ( |
round |
Parameter indicating the decimal place to which the jointness measures should be rounded (default round = 3). |
Value
A table with jointness measures for all the pairs of regressors used in the analysis. The results obtained with the binomial model prior are above the diagonal, while the ones obtained with the binomial-beta prior are below.
Examples
data("Trade_data", package = "rmsBMA")
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
jointness_table <- jointness(bma_list)
Determinants of International Migration in the European Union
Description
The dataset contains bilateral migration and its economic determinants for 23 European Union countries over the period 1995–2020. Each observation represents a country pair in a given 5 year period, resulting in 353 unique country pairs and four periods (the first period is utilized for lagged data only).
The countries included are: Austria, Belgium, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Netherlands, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden and the United Kingdom
The dataset was used in: Afonso, Aves, Beck (2025), Drivers of migration flows in the European Union: Earnings or unemployment?, International Labour Review, 164(2), 1-23. doi: 10.16995/ilr.18845
Usage
data(migration_panel)
Format
A data frame with 1012 rows and 14 variables:
- Year_0
Column indication observation period.
- Pair_ID
Column indication country pair.
- Migration
The absolute value of the net migration flows scaled by the sum of the population of a given pair of countries.
- MigrationLAG
The absolute value of the net migration flows scaled by the sum of the population of a given pair of countries lagged by one period (5 years).
- Unempl
The absolute value of the difference in unemployment rates, averaged over the 5-year period.
- Earn
The absolute value of the difference in net earnings expressed in PPP, averaged over the 5-year period.
- Tax
The absolute value of the difference in mean income tax, averaged over the 5-year period.
- Social
The absolute value of the difference in mean social benefits per person, averaged over the 5-year period.
- LNDGEO
A natural logarithm of the distance between the capital of a given pair of countries based on the shortest route.
- Temp
The absolute value of the difference in mean annual temperature, averaged over the 5-year period.
- HC
The absolute value of the difference in the human capital index (Barro and Lee 2013), averaged over the 5-year period.
- GOV
The absolute value of the difference in government spending share of GDP, averaged over the 5-year period.
- Gini
The absolute value of the difference in the Gini coefficient between a pair of countries, averaged over the 5-year period.
- FER
The absolute value of the difference in the fertility rate between a pair of countries, averaged over the 5-year period.
- Corruption
The absolute difference in the value of control of corruption measure from the Worldwide Governance Indicator, averaged over the 5-year period.
- Crime
The absolute value of the difference in the number of intentional homicides per 1 000 inhabitants, averaged over the 5-year period.
Source
Harvard Dataverse. doi:10.7910/DVN/GTOFJB
Model space for the Trade_data_small dataset
Description
A pre-computed model space object obtained using the
model_space function on the Trade_data_small
dataset with a maximum of 7 regressors and the Unit Information Prior.
Format
A list of length 5 with the following components:
- x_names
Character vector containing the names of the regressors.
- ols_results
Matrix containing the full model space. Each row corresponds to a model specification and includes:
binary inclusion indicators for regressors,
estimated coefficients (including intercept),
standard errors,
log-marginal likelihood,
-
R^2, degrees of freedom,
dilution prior term.
- MS
Integer. Total number of models in the model space.
- M
Integer. Maximum number of regressors allowed in each model.
- K
Integer. Total number of available regressors.
Details
The object contains all possible linear regression models constructed from the available regressors (up to 7 included simultaneously), together with their estimated coefficients, standard errors, log-marginal likelihood values, R-squared statistics, degrees of freedom, and dilution prior components.
The g-prior specification corresponds to the Unit Information Prior
(UIP), i.e. g = 1/m, where m denotes the sample size.
Source
Generated using:
modelSpace <- model_space(Trade_data_small, M = 7, g = "UIP")
See Also
Matrix reflecting model space
Description
The function creates a matrix with ones indicating inclusion and zeros indicating inclusion of a regressor in a model
Usage
model_matrix(K, M)
Arguments
K |
total number of regressors |
M |
maximum number of regressor in a model |
Value
A matrix with ones indicating inclusion and zeros indicating inclusion of a regressor in a model
Graphs of the prior and posterior model probabilities for the best individual models
Description
This function draws four graphs of prior and posterior model probabilities for the best individual models:
a) The results with binomial model prior (based on PMP - posterior model probability)
b) The results with binomial-beta model prior (based on PMP - posterior model probability)
Models on the graph are ordered according to their posterior model probability.
Arguments
bma_list |
bma_list object (the result of the bma function) |
top |
The number of the best model to be placed on the graphs |
Value
A list with three graphs with prior and posterior model probabilities for individual models:
The results with binomial model prior (based on PMP - posterior model probability)
The results with binomial-beta model prior (based on PMP - posterior model probability)
On graph combining the aforementioned graphs
Examples
data("Trade_data", package = "rmsBMA")
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
model_pmps <- model_pmp(bma_list, top = 100)
model_pmps[[1]]
Graphs of the prior and posterior model probabilities of the model sizes
Description
This function draws four graphs of prior and posterior model probabilities:
a) The results with binomial model prior (based on PMP - posterior model probability)
b) The results with binomial-beta model prior (based on PMP - posterior model probability)
Arguments
bma_list |
bma_list object (the result of the bma function) |
Value
A list with three graphs with prior and posterior model probabilities for model sizes:
The results with binomial model prior (based on PMP - posterior model probability)
The results with binomial-beta model prior (based on PMP - posterior model probability)
One graph combining all the aforementioned graphs
Examples
data("Trade_data", package = "rmsBMA")
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
sizes <- model_sizes(bma_list)
sizes[[1]]
Calculation of the model space
Description
This function calculates all possible models with M regressors that can be constructed out of K regressors.
Usage
model_space(data, M = NULL, g = "UIP", HC = FALSE)
Arguments
data |
Data set to work with. The first column is the data for the dependent variable, and the other columns is the data for the regressors. |
M |
Maximum number of regressor in the estimated models (default is K - total number of regressors). |
g |
Value for g in the g prior. Either a number above zero specified by the user or: |
HC |
Logical indicator (default = FALSE) specifying whether a heteroscedasticity-consistent covariance matrix should be used for the estimation of standard errors (MacKinnon & White 1985). |
Value
A list with model_space objects:
x_names - vector with names of the regressors
ols_results - table with the model space - contains ols objects for all the estimated models
MS - size of the mode space
M - maximum number of regressors in a model
K- total number of regressors
Examples
x1 <- rnorm(20, mean = 0, sd = 1)
x2 <- rnorm(20, mean = 0, sd = 2)
x3 <- rnorm(20, mean = 0, sd = 3)
x4 <- rnorm(20, mean = 0, sd = 1)
x5 <- rnorm(20, mean = 0, sd = 2)
x6 <- rnorm(20, mean = 0, sd = 4)
e <- rnorm(20, mean = 0, sd = 0.5)
y <- 2 + x1 + 2*x2 + e
data <- cbind(y,x1,x2,x3,x4,x5,x6)
modelSpace <- model_space(data, M = 3)
OLS calculation with additional objects
Description
OLS calculation with additional objects
Usage
ols(y, x, const, Norm = NULL)
Arguments
y |
A vector with the dependent variable. |
x |
A matrix with with regressors as columns or 0 for model without any regressors. |
const |
Binary variable: 1 - include a constant in the estimation, 0 - do not include a constant in the estimation. |
Norm |
A parameter used to correct likelihood function when it gets to close to zero in the case of high number of observations. |
Value
A list with OLS objects: Coefficients, Standard errors, Marginal likelihood, R^2, Degrees of freedom, Determinant of the regressors' matrix, log(Marginal likelihood).
Examples
x1<-rnorm(10, mean = 0, sd = 1)
x2<-rnorm(10, mean = 0, sd = 2)
y<-2+x1+2*x2
x<-cbind(x1,x2)
const<-1
ols(y,x,const)
x1<-rnorm(10, mean = 0, sd = 1)
x2<-rnorm(10, mean = 0, sd = 2)
e<-rnorm(10, mean = 0, sd = 0.5)
y<-2+x1+2*x2+e
x<-cbind(x1,x2)
const<-1
ols(y,x,const)
Posterior probability of a positive coefficient sign (P(+))
Description
Computes posterior probabilities of a positive coefficient sign, P(+), for the intercept and each regressor by averaging model-specific probabilities across the model space, weighted by posterior model probabilities.
Usage
plus_pmp_from_pmp(pmp_uniform, pmp_random, betas, VAR, DF, Reg_ID)
Arguments
pmp_uniform |
Numeric vector of length |
pmp_random |
Numeric vector of length |
betas |
Numeric matrix of dimension |
VAR |
Numeric matrix of dimension |
DF |
Numeric vector of length |
Reg_ID |
Numeric or integer matrix of dimension |
Details
For a given model i and coefficient j, the contribution is
p(M_i \mid y) \cdot F_t\!\left(
\frac{\beta_{ij}}{\sqrt{\mathrm{VAR}_{ij}}}; \mathrm{DF}_i
\right),
where F_t(\cdot;\mathrm{DF}_i) is the CDF of the Student-t
distribution with \mathrm{DF}_i degrees of freedom.
The intercept is included in all models, while each regressor contributes
only in those models in which it is included, as indicated by the model
inclusion matrix Reg_ID.
The posterior probability of a positive sign for coefficient j is
computed as
P(\beta_j > 0 \mid y)
=
\sum_{i \in \mathcal{M}_j}
p(M_i \mid y)\,
F_t\!\left(
\frac{\beta_{ij}}{\sqrt{\mathrm{VAR}_{ij}}}; \mathrm{DF}_i
\right),
where \mathcal{M}_j denotes the set of models that include regressor
j. For the intercept, \mathcal{M}_j contains all models.
This definition follows the sign-probability interpretation in Doppelhofer and Weeks (2009).
Value
A list with two elements:
- Plus_PMP_uniform
A
(K+1) x 1numeric matrix containing posterior probabilities of a positive coefficient sign under the uniform model prior. The first row corresponds to the intercept.- Plus_PMP_random
A
(K+1) x 1numeric matrix containing posterior probabilities of a positive coefficient sign under the random model prior. The first row corresponds to the intercept.
References
Doppelhofer, G. and Weeks, M. (2009). Jointness of growth determinants. Journal of Applied Econometrics, 24(2), 209–244.
Graphs of the posterior densities of the coefficients
Description
This function draws graphs of the posterior densities of all the coefficients of interest.
Arguments
bma_list |
bma object (the result of the bma function) |
prior |
Parameter indicating which model prior should be used for calculations:
|
Value
A list with the graphs of the posterior densities of coefficients for all the considered regressors.
Examples
data <- Trade_data[,1:10]
modelSpace <- model_space(data, M = 9, g = "UIP")
bma_list <- bma(modelSpace)
distPlots <- posterior_dens(bma_list, prior = "binomial")
distPlots[[1]]
distPlots[[2]]
Posterior sign certainty probability
Description
Computes the posterior probability that a regression coefficient has the same sign as its posterior mean, following the sign-certainty measure used in Sala-i-Martin, Doppelhofer and Miller (2004) and Doppelhofer and Weeks (2009).
Usage
posterior_sign_certainty(pmp_uniform, pmp_random, betas, VAR, DF, Reg_ID)
Arguments
pmp_uniform |
Numeric vector of length |
pmp_random |
Numeric vector of length |
betas |
Numeric matrix of dimension |
VAR |
Numeric matrix of dimension |
DF |
Numeric vector of length |
Reg_ID |
Numeric or integer matrix of dimension |
Details
For each coefficient j, define
S_j = \sum_{i=1}^{MS} p(M_i \mid y)\,
F_t\!\left(
\frac{\beta_{ij}}{\sqrt{\mathrm{VAR}_{ij}}}; \mathrm{DF}_i
\right),
where F_t(\cdot;\mathrm{DF}_i) is the CDF of the Student-t
distribution with \mathrm{DF}_i degrees of freedom.
The posterior sign certainty probability is then defined as
p(\mathrm{sign}_j \mid y) =
\begin{cases}
S_j, & \text{if } \mathrm{sign}(E[\beta_j \mid y]) > 0, \\
1 - S_j, & \text{if } \mathrm{sign}(E[\beta_j \mid y]) < 0, \\
0.5, & \text{if } E[\beta_j \mid y] = 0.
\end{cases}
The intercept is included in all models. For slope coefficients that are
excluded from a given model, the contribution to S_j is set to
F_t(0)=0.5, reflecting a symmetric distribution centered at zero.
Value
A list with four elements:
- PSC_uniform
A
(K+1) x 1numeric matrix containing posterior sign certainty probabilities under the uniform model prior.- PSC_random
A
(K+1) x 1numeric matrix containing posterior sign certainty probabilities under the random model prior.- PostMean_uniform
A
(K+1) x 1numeric matrix of posterior means under the uniform model prior.- PostMean_random
A
(K+1) x 1numeric matrix of posterior means under the random model prior.
References
Sala-i-Martin, X., Doppelhofer, G., and Miller, R. I. (2004). Determinants of long-term growth: A Bayesian averaging of classical estimates. American Economic Review, 94(4), 813–835.
Doppelhofer, G. and Weeks, M. (2009). Jointness of growth determinants. Journal of Applied Econometrics, 24(2), 209–244.
Regressors choice based on the inclusion vector
Description
The function creates a matrix with regressors that should be included in a specific model based on inclusion vector
Usage
subset_design(x, model_row)
Arguments
x |
matrix with regressors |
model_row |
inclusion vector - row of a model space matrix |
Value
A matrix with regressors to be used in a specific model