Which of the following cities are the capital of the corresponding country?

  • True. Astana is the capital of Kazakhstan.
  • True. Tokyo is the capital of Japan.
  • False. The capital of Turkey is Ankara.
  • False. The capital of Brazil is Brasilia.
  • False. The de facto capital of Switzerland is Bern.

What is the seat of the federal authorities in Switzerland (i.e., the de facto capital)?

There is no de jure capital but the de facto capital and seat of the federal authorities is Bern.

  • False
  • False
  • False
  • True
  • False

What is the seat of the federal authorities in Switzerland (i.e., the de facto capital)?

There is no de jure capital but the de facto capital and seat of the federal authorities is Bern.

  • False
  • True
  • False
  • False
  • False

What is the derivative of \(f(x) = x^{8} e^{2.3x}\), evaluated at \(x = 0.77\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{8}\) and \(h(x) := e^{2.3x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 8 x^{8 - 1} \cdot e^{2.3x} + x^{8} \cdot e^{2.3x} \cdot 2.3 \\ & = & e^{2.3x} \cdot(8 x^7 + 2.3 x^{8}) \\ & = & e^{2.3x} \cdot x^7 \cdot (8 + 2.3x). \end{aligned}\] Evaluated at \(x = 0.77\), the answer is \[e^{2.3\cdot 0.77} \cdot 0.77^7 \cdot (8 + 2.3\cdot 0.77) = 9.215303.\] Thus, rounded to two digits we have \(f'(0.77) = 9.22\).
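The closed-form result can be cross-checked numerically. The following R sketch evaluates the factored derivative \(e^{bx} \cdot x^{a-1} \cdot (a + bx)\) for \(a = 8\), \(b = 2.3\) and compares it with a central finite difference. The helper names `fprime_closed` and `fprime_numeric` and the step size `h` are illustrative assumptions, not part of the exercise.

```r
## f(x) = x^a * exp(b * x), with a = 8 and b = 2.3 as in the exercise
f <- function(x, a = 8, b = 2.3) x^a * exp(b * x)

## closed form from the product rule: exp(b x) * x^(a - 1) * (a + b x)
fprime_closed <- function(x, a = 8, b = 2.3) {
  exp(b * x) * x^(a - 1) * (a + b * x)
}

## central finite difference as an independent check (h is an assumed step size)
fprime_numeric <- function(x, h = 1e-6) (f(x + h) - f(x - h)) / (2 * h)

closed_val <- fprime_closed(0.77)
approx_val <- fprime_numeric(0.77)

round(closed_val, 2)          # value rounded to two digits
abs(closed_val - approx_val)  # discrepancy between the two approaches
```

Changing the defaults `a` and `b` and the evaluation point covers the other variants of this exercise.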

What is the derivative of \(f(x) = x^{6} e^{3.7x}\), evaluated at \(x = 0.66\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{6}\) and \(h(x) := e^{3.7x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 6 x^{6 - 1} \cdot e^{3.7x} + x^{6} \cdot e^{3.7x} \cdot 3.7 \\ & = & e^{3.7x} \cdot(6 x^5 + 3.7 x^{6}) \\ & = & e^{3.7x} \cdot x^5 \cdot (6 + 3.7x). \end{aligned}\] Evaluated at \(x = 0.66\), the answer is \[e^{3.7\cdot 0.66} \cdot 0.66^5 \cdot (6 + 3.7\cdot 0.66) = 12.153802.\] Thus, rounded to two digits we have \(f'(0.66) = 12.15\).

What is the derivative of \(f(x) = x^{8} e^{2.6x}\), evaluated at \(x = 0.53\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{8}\) and \(h(x) := e^{2.6x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 8 x^{8 - 1} \cdot e^{2.6x} + x^{8} \cdot e^{2.6x} \cdot 2.6 \\ & = & e^{2.6x} \cdot(8 x^7 + 2.6 x^{8}) \\ & = & e^{2.6x} \cdot x^7 \cdot (8 + 2.6x). \end{aligned}\] Evaluated at \(x = 0.53\), the answer is \[e^{2.6\cdot 0.53} \cdot 0.53^7 \cdot (8 + 2.6\cdot 0.53) = 0.437018.\] Thus, rounded to two digits we have \(f'(0.53) = 0.44\).

What is the derivative of \(f(x) = x^{8} e^{2.6x}\), evaluated at \(x = 0.79\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{8}\) and \(h(x) := e^{2.6x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 8 x^{8 - 1} \cdot e^{2.6x} + x^{8} \cdot e^{2.6x} \cdot 2.6 \\ & = & e^{2.6x} \cdot(8 x^7 + 2.6 x^{8}) \\ & = & e^{2.6x} \cdot x^7 \cdot (8 + 2.6x). \end{aligned}\] Evaluated at \(x = 0.79\), the answer is \[e^{2.6\cdot 0.79} \cdot 0.79^7 \cdot (8 + 2.6\cdot 0.79) = 15.058073.\] Thus, rounded to two digits we have \(f'(0.79) = 15.06\).

  • True
  • False
  • False
  • False
  • False

What is the derivative of \(f(x) = x^{7} e^{2.2x}\), evaluated at \(x = 0.7\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{7}\) and \(h(x) := e^{2.2x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 7 x^{7 - 1} \cdot e^{2.2x} + x^{7} \cdot e^{2.2x} \cdot 2.2 \\ & = & e^{2.2x} \cdot(7 x^6 + 2.2 x^{7}) \\ & = & e^{2.2x} \cdot x^6 \cdot (7 + 2.2x). \end{aligned}\] Evaluated at \(x = 0.7\), the answer is \[e^{2.2\cdot 0.7} \cdot 0.7^6 \cdot (7 + 2.2\cdot 0.7) = 4.686619.\] Thus, rounded to two digits we have \(f'(0.7) = 4.69\).

  • False
  • False
  • False
  • True
  • False
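As an alternative to differentiating by hand, base R can derive simple expressions symbolically with `D()`. The sketch below applies it to \(x^{7} e^{2.2x}\) and evaluates the result at \(x = 0.7\); the object names `dfdx` and `val` are illustrative.

```r
## symbolic derivative of x^7 * exp(2.2 * x) with respect to x
dfdx <- D(quote(x^7 * exp(2.2 * x)), "x")
dfdx

## evaluate the derivative expression at x = 0.7
val <- eval(dfdx, list(x = 0.7))
round(val, 2)
```

The expression returned by `D()` is not simplified, so it will look different from the factored form in the solution even though both evaluate to the same number.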

What is the derivative of \(f(x) = x^{8} e^{2.6x}\), evaluated at \(x = 0.67\)?

Using the product rule for \(f(x) = g(x) \cdot h(x)\), where \(g(x) := x^{8}\) and \(h(x) := e^{2.6x}\), we obtain \[\begin{aligned} f'(x) & = & [g(x) \cdot h(x)]' = g'(x) \cdot h(x) + g(x) \cdot h'(x) \\ & = & 8 x^{8 - 1} \cdot e^{2.6x} + x^{8} \cdot e^{2.6x} \cdot 2.6 \\ & = & e^{2.6x} \cdot(8 x^7 + 2.6 x^{8}) \\ & = & e^{2.6x} \cdot x^7 \cdot (8 + 2.6x). \end{aligned}\] Evaluated at \(x = 0.67\), the answer is \[e^{2.6\cdot 0.67} \cdot 0.67^7 \cdot (8 + 2.6\cdot 0.67) = 3.370643.\] Thus, rounded to two digits we have \(f'(0.67) = 3.37\).

  • False
  • False
  • True
  • False
  • False

Given the following information:

pineapple \(+\) orange \(+\) orange = \(647\)
banana \(+\) pineapple \(+\) pineapple = \(1008\)
pineapple \(+\) pineapple \(+\) orange = \(1018\)

Compute:

banana \(+\) orange \(+\) pineapple = \(\text{?}\)

The information provided can be interpreted as the price for three fruit baskets with different combinations of the three fruits. This corresponds to a system of linear equations where the price of the three fruits is the vector of unknowns \(x\):

\(x_1 =\) banana
\(x_2 =\) orange
\(x_3 =\) pineapple

The system of linear equations is then: \[ \begin{aligned} \left( \begin{array}{rrr} 0 & 2 & 1 \\ 1 & 0 & 2 \\ 0 & 1 & 2 \end{array} \right) \cdot \left( \begin{array}{r} x_1 \\ x_2 \\ x_3 \end{array} \right) & = & \left( \begin{array}{r} 647 \\ 1008 \\ 1018 \end{array} \right) \end{aligned} \] This can be solved using any solution algorithm, e.g., elimination: \[ x_1 = 82, \, x_2 = 92, \, x_3 = 463. \] Based on the three prices for the different fruits it is straightforward to compute the total price of the fourth fruit basket via:

banana \(+\) orange \(+\) pineapple =
\(x_1\) \(+\) \(x_2\) \(+\) \(x_3\) =
\(82\) \(+\) \(92\) \(+\) \(463\) = \(637\)
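The system can also be solved numerically. A minimal R sketch using `solve()`, with the coefficient matrix and right-hand side copied from the solution above (the object names `A`, `b`, and `x` are illustrative):

```r
## rows are the three baskets, columns are banana, orange, pineapple
A <- matrix(c(0, 2, 1,
              1, 0, 2,
              0, 1, 2), nrow = 3, byrow = TRUE)
b <- c(647, 1008, 1018)

## solve A %*% x = b for the three unknown prices
x <- solve(A, b)
names(x) <- c("banana", "orange", "pineapple")
x

## price of the fourth basket: one of each fruit
sum(x)
```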

Given the following information:

pineapple \(+\) banana \(+\) banana = \(320\)
orange \(+\) pineapple \(+\) pineapple = \(606\)
pineapple \(+\) pineapple \(+\) banana = \(559\)

Compute:

banana \(+\) orange \(+\) pineapple = \(\text{?}\)

The information provided can be interpreted as the price for three fruit baskets with different combinations of the three fruits. This corresponds to a system of linear equations where the price of the three fruits is the vector of unknowns \(x\):

\(x_1 =\) banana
\(x_2 =\) orange
\(x_3 =\) pineapple

The system of linear equations is then: \[ \begin{aligned} \left( \begin{array}{rrr} 2 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 0 & 2 \end{array} \right) \cdot \left( \begin{array}{r} x_1 \\ x_2 \\ x_3 \end{array} \right) & = & \left( \begin{array}{r} 320 \\ 606 \\ 559 \end{array} \right) \end{aligned} \] This can be solved using any solution algorithm, e.g., elimination: \[ x_1 = 27, \, x_2 = 74, \, x_3 = 266. \] Based on the three prices for the different fruits it is straightforward to compute the total price of the fourth fruit basket via:

banana \(+\) orange \(+\) pineapple =
\(x_1\) \(+\) \(x_2\) \(+\) \(x_3\) =
\(27\) \(+\) \(74\) \(+\) \(266\) = \(367\)
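One possible elimination path, written out as a sketch (the order of the steps is a choice made here, not prescribed by the exercise): the first and third equations involve only \(x_1\) and \(x_3\), so they can be solved first, and \(x_2\) then follows from the second equation. \[ \begin{aligned} 2 x_1 + x_3 = 320 & \;\Rightarrow\; x_3 = 320 - 2 x_1 \\ x_1 + 2 \cdot (320 - 2 x_1) = 559 & \;\Rightarrow\; 3 x_1 = 81 \;\Rightarrow\; x_1 = 27 \\ x_3 & = 320 - 2 \cdot 27 = 266 \\ x_2 & = 606 - 2 \cdot 266 = 74 \end{aligned} \]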

Given the following information:

pineapple \(+\) orange \(+\) pineapple = \(773\)
orange \(+\) banana \(+\) orange = \(261\)
banana \(+\) banana \(+\) orange = \(249\)

Compute:

banana \(+\) orange \(+\) pineapple = \(\text{?}\)

The information provided can be interpreted as the price for three fruit baskets with different combinations of the three fruits. This corresponds to a system of linear equations where the price of the three fruits is the vector of unknowns \(x\):

\(x_1 =\) banana
\(x_2 =\) orange
\(x_3 =\) pineapple

The system of linear equations is then: \[ \begin{aligned} \left( \begin{array}{rrr} 0 & 1 & 2 \\ 1 & 2 & 0 \\ 2 & 1 & 0 \end{array} \right) \cdot \left( \begin{array}{r} x_1 \\ x_2 \\ x_3 \end{array} \right) & = & \left( \begin{array}{r} 773 \\ 261 \\ 249 \end{array} \right) \end{aligned} \] This can be solved using any solution algorithm, e.g., elimination: \[ x_1 = 79, \, x_2 = 91, \, x_3 = 341. \] Based on the three prices for the different fruits it is straightforward to compute the total price of the fourth fruit basket via:

banana \(+\) orange \(+\) pineapple =
\(x_1\) \(+\) \(x_2\) \(+\) \(x_3\) =
\(79\) \(+\) \(91\) \(+\) \(341\) = \(511\)

In the following figure the distributions of a variable given by two samples (A and B) are represented by parallel boxplots. Which of the following statements are correct? (Comment: Each statement is either approximately correct or clearly wrong.)

  • False. Distribution A has on average higher values than distribution B.
  • True. Both distributions have no observations which deviate more than 1.5 times the interquartile range from the box.
  • False. The interquartile range in sample A is not clearly bigger than in B.
  • True. The skewness of both distributions is similar, both are left-skewed.
  • False. Distribution B is left-skewed.
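The criterion of 1.5 times the box length can be reproduced in R with `boxplot.stats()`, which applies the same rule that `boxplot()` uses for drawing whiskers and flagging outlying points. The data behind the figure are not available here, so the sketch below uses a small hypothetical sample `x`.

```r
## hypothetical sample; the data behind the figure are not available here
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

## coef = 1.5 is the default multiplier of the box length (hinge spread)
s <- boxplot.stats(x, coef = 1.5)

s$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out    # observations beyond the whiskers
```

In this sample the hinges are 3 and 8, so any value above \(8 + 1.5 \cdot 5 = 15.5\) is flagged, which applies only to the observation 30.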

In the following figure the distributions of a variable given by two samples (A and B) are represented by parallel boxplots. Which of the following statements are correct? (Comment: Each statement is either approximately correct or clearly wrong.)

  • True. Both distributions have a similar location.
  • False. There are observations which deviate more than 1.5 times the interquartile range from the box.
  • False. The interquartile range in sample A is not clearly bigger than in B.
  • True. The skewness of both distributions is similar, both are about symmetric.
  • False. Distribution A is about symmetric.

In the following figure the distributions of a variable given by two samples (A and B) are represented by parallel boxplots. Which of the following statements are correct? (Comment: Each statement is either approximately correct or clearly wrong.)

  • True. Both distributions have a similar location.
  • True. Both distributions have no observations which deviate more than 1.5 times the interquartile range from the box.
  • False. The interquartile range in sample A is not clearly bigger than in B.
  • True. The skewness of both distributions is similar, both are about symmetric.
  • False. Distribution B is about symmetric.

The waiting time (in minutes) at the cashier of two supermarket chains with different cashier systems is compared. The following statistical test was performed:


    Two Sample t-test

data:  Waiting by Supermarket
t = -2.7382, df = 96, p-value = 0.003682
alternative hypothesis: true difference in means between group Sparag and group Consumo is less than 0
95 percent confidence interval:
       -Inf -0.7194132
sample estimates:
 mean in group Sparag mean in group Consumo 
             4.215957              6.044487 

Which of the following statements are correct? (Significance level 5%)

  • True. The absolute value of the test statistic is equal to 2.738.
  • True. The test aims at showing that the difference of means is smaller than 0.
  • False. The p-value is equal to 0.00368.
  • False. The test aims at showing that the waiting time is shorter at Sparag than at Consumo.
  • True. The test result is significant (\(p < 0.05\)) and hence the alternative is shown that the difference of means is smaller than 0.
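The reported p-value and confidence bound can be reconstructed from the test statistic, the degrees of freedom, and the two group means. The following R sketch copies these values from the output above; the object names are illustrative, and the standard error is backed out from \(t = \text{difference} / \text{se}\) rather than taken from the raw data, which are not available here.

```r
## values copied from the test output above
tstat <- -2.7382
dof <- 96
diff_means <- 4.215957 - 6.044487  # Sparag minus Consumo

## one-sided p-value for the alternative "less": P(T <= t)
p_less <- pt(tstat, df = dof)

## standard error implied by t = diff / se, then the one-sided upper bound
se <- diff_means / tstat
upper <- diff_means + qt(0.95, df = dof) * se

round(p_less, 6)
round(upper, 4)
```

Because the upper bound is below 0, the confidence interval leads to the same decision as the p-value at the 5% level.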

The waiting time (in minutes) at the cashier of two supermarket chains with different cashier systems is compared. The following statistical test was performed:


    Two Sample t-test

data:  Waiting by Supermarket
t = 2.1456, df = 114, p-value = 0.03402
alternative hypothesis: true difference in means between group Sparag and group Consumo is not equal to 0
95 percent confidence interval:
 0.09978863 2.50123208
sample estimates:
 mean in group Sparag mean in group Consumo 
             4.825278              3.524768 

Which of the following statements are correct? (Significance level 5%)

  • True. The absolute value of the test statistic is equal to 2.146.
  • False. The test aims at showing that the difference of means is unequal to 0.
  • False. The p-value is equal to 0.034.
  • True. The test result is significant (\(p < 0.05\)) and hence the alternative is shown that the difference of means is unequal to 0.
  • False.
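For a two-sided alternative the p-value counts both tails, and the confidence interval is symmetric around the estimated difference. A short R sketch with the values copied from the output above (object names are illustrative; the standard error is again backed out from the reported statistic):

```r
## values copied from the test output above
tstat <- 2.1456
dof <- 114
diff_means <- 4.825278 - 3.524768  # Sparag minus Consumo

## two-sided p-value: probability mass in both tails beyond |t|
p_two <- 2 * pt(-abs(tstat), df = dof)

## 95% confidence interval: estimate -/+ quantile * standard error
se <- diff_means / tstat
ci <- diff_means + c(-1, 1) * qt(0.975, df = dof) * se

round(p_two, 5)
round(ci, 4)
```

The interval excludes 0, which matches the significant result at the 5% level.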

The waiting time (in minutes) at the cashier of two supermarket chains with different cashier systems is compared. The following statistical test was performed:


    Two Sample t-test

data:  Waiting by Supermarket
t = 0.71983, df = 108, p-value = 0.7634
alternative hypothesis: true difference in means between group Sparag and group Consumo is less than 0
95 percent confidence interval:
     -Inf 1.716066
sample estimates:
 mean in group Sparag mean in group Consumo 
             6.841936              6.322675 

Which of the following statements are correct? (Significance level 5%)

  • False. The absolute value of the test statistic is equal to 0.72.
  • True. The test aims at showing that the difference of means is smaller than 0.
  • True. The p-value is equal to 0.763.
  • False. The test aims at showing that the waiting time is shorter at Sparag than at Consumo. The test result is not significant (\(p \ge 0.05\)).
  • False. The test result is not significant (\(p \ge 0.05\)).

What is the name of the R function for extracting the estimated coefficients from a fitted (generalized) linear model object?

coef is the R function for extracting the estimated coefficients from a fitted (generalized) linear model object. See ?coef for the corresponding manual page.

What is the name of the R function for extracting the estimated covariance matrix from a fitted (generalized) linear model object?

vcov is the R function for extracting the estimated covariance matrix from a fitted (generalized) linear model object. See ?vcov for the corresponding manual page.
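Both extractor functions can be tried on any fitted model. The sketch below uses the built-in `cars` data purely for illustration; the model itself is an assumption and not part of the questions.

```r
## built-in 'cars' data, used purely for illustration
m <- lm(dist ~ speed, data = cars)

coef(m)  # named vector of estimated coefficients
vcov(m)  # estimated covariance matrix of the coefficient estimates

## standard errors are the square roots of the diagonal of vcov(m)
se <- sqrt(diag(vcov(m)))
se
```

The values in `se` agree with the "Std. Error" column reported by `summary(m)`.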

What is the name of the R function for Poisson regression?

glm is the R function for Poisson regression. See ?glm for the corresponding manual page.
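A minimal sketch of a Poisson regression call, using the built-in `warpbreaks` data as an assumed example (the choice of data and formula is illustrative only):

```r
## count response 'breaks' modeled with a Poisson family (log link by default)
m <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

coef(m)            # coefficients on the log scale
family(m)$family   # confirms the Poisson family
family(m)$link     # confirms the log link
```

The key argument is `family = poisson`; without it, `glm()` defaults to a Gaussian family.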

Theory: Consider a linear regression of y on x. It is usually estimated with which estimation technique (three-letter abbreviation)?

This estimator yields the best linear unbiased estimator (BLUE) under the assumptions of the Gauss-Markov theorem. Which of the following properties are required for the errors of the linear regression model under these assumptions?

Application: Using the data provided in linreg.csv estimate a linear regression of y on x. What are the estimated parameters?

Intercept:

Slope:

In terms of significance at 5% level:

Theory: Linear regression models are typically estimated by ordinary least squares (OLS). The Gauss-Markov theorem establishes certain optimality properties: Namely, if the errors have expectation zero, constant variance (homoscedastic), no autocorrelation and the regressors are exogenous and not linearly dependent, the OLS estimator is the best linear unbiased estimator (BLUE).
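To make the OLS estimator concrete, the following R sketch computes it from the normal equations \((X^\top X) \hat\beta = X^\top y\) and compares it with `lm()`. The data are simulated here with an assumed intercept of 1 and slope of 2; they are hypothetical and unrelated to linreg.csv.

```r
## simulated data (hypothetical; true intercept 1 and slope 2 are assumptions)
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

## design matrix with a column of ones for the intercept
X <- cbind(1, x)

## OLS via the normal equations: solve (X'X) beta = X'y
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))
beta_hat

## same estimates from lm()
coef(lm(y ~ x))
```

In practice `lm()` uses a QR decomposition rather than the normal equations, which is numerically more stable but yields the same estimates for well-conditioned problems like this one.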

Application: The estimated coefficients along with their significances are reported in the summary of the fitted regression model, showing that y increases significantly with x (at 5% level).

Call:
lm(formula = y ~ x, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.48592 -0.10856 -0.00288  0.11152  0.52160 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.02453    0.01964  -1.249    0.215    
x            0.89555    0.03305  27.099   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1947 on 98 degrees of freedom
Multiple R-squared:  0.8823,    Adjusted R-squared:  0.8811 
F-statistic: 734.4 on 1 and 98 DF,  p-value: < 2.2e-16

Code: The analysis can be replicated in R using the following code.

## data
d <- read.csv("linreg.csv")
## regression
m <- lm(y ~ x, data = d)
summary(m)
## visualization
plot(y ~ x, data = d)
abline(m)

Theory: Consider a linear regression of y on x. It is usually estimated with which estimation technique (three-letter abbreviation)?

This estimator yields the best linear unbiased estimator (BLUE) under the assumptions of the Gauss-Markov theorem. Which of the following properties are required for the errors of the linear regression model under these assumptions?

Application: Using the data provided in linreg.csv estimate a linear regression of y on x. What are the estimated parameters?

Intercept:

Slope:

In terms of significance at 5% level:

Theory: Linear regression models are typically estimated by ordinary least squares (OLS). The Gauss-Markov theorem establishes certain optimality properties: Namely, if the errors have expectation zero, constant variance (homoscedastic), no autocorrelation and the regressors are exogenous and not linearly dependent, the OLS estimator is the best linear unbiased estimator (BLUE).

Application: The estimated coefficients along with their significances are reported in the summary of the fitted regression model, showing that y decreases significantly with x (at 5% level).

Call:
lm(formula = y ~ x, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.51701 -0.14686 -0.03693  0.15171  0.97472 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.03469    0.02437   1.424    0.158    
x           -0.74685    0.04156 -17.971   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2398 on 98 degrees of freedom
Multiple R-squared:  0.7672,    Adjusted R-squared:  0.7648 
F-statistic:   323 on 1 and 98 DF,  p-value: < 2.2e-16

Code: The analysis can be replicated in R using the following code.

## data
d <- read.csv("linreg.csv")
## regression
m <- lm(y ~ x, data = d)
summary(m)
## visualization
plot(y ~ x, data = d)
abline(m)
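
The significance statement for the slope can be traced back to the coefficient table: the t value is the estimate divided by its standard error, and the p-value is the two-sided tail probability of a t distribution with the residual degrees of freedom. A small sketch with the rounded values copied from the summary above (object names are illustrative):

```r
## values copied from the coefficient table above
est <- -0.74685
se <- 0.04156
dof <- 98

## t statistic and two-sided p-value
tval <- est / se
pval <- 2 * pt(-abs(tval), df = dof)

round(tval, 2)
pval < 0.05  # significant at the 5% level?
```

Because the inputs are rounded, the recomputed t value can differ slightly from the printed one in the last digit.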

Theory: Consider a linear regression of y on x. It is usually estimated with which estimation technique (three-letter abbreviation)?

This estimator yields the best linear unbiased estimator (BLUE) under the assumptions of the Gauss-Markov theorem. Which of the following properties are required for the errors of the linear regression model under these assumptions?

Application: Using the data provided in linreg.csv estimate a linear regression of y on x. What are the estimated parameters?

Intercept:

Slope:

In terms of significance at 5% level:

Theory: Linear regression models are typically estimated by ordinary least squares (OLS). The Gauss-Markov theorem establishes certain optimality properties: Namely, if the errors have expectation zero, constant variance (homoscedastic), no autocorrelation and the regressors are exogenous and not linearly dependent, the OLS estimator is the best linear unbiased estimator (BLUE).

Application: The estimated coefficients along with their significances are reported in the summary of the fitted regression model, showing that y decreases significantly with x (at 5% level).

Call:
lm(formula = y ~ x, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.60702 -0.15874  0.01421  0.17645  0.45455 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.03966    0.02322   1.708   0.0908 .  
x           -0.80664    0.04072 -19.811   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2321 on 98 degrees of freedom
Multiple R-squared:  0.8002,    Adjusted R-squared:  0.7982 
F-statistic: 392.5 on 1 and 98 DF,  p-value: < 2.2e-16

Code: The analysis can be replicated in R using the following code.

## data
d <- read.csv("linreg.csv")
## regression
m <- lm(y ~ x, data = d)
summary(m)
## visualization
plot(y ~ x, data = d)
abline(m)
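
In a regression with a single regressor, the F-statistic equals the squared t value of the slope, and \(R^2 = F / (F + \text{df})\) with the residual degrees of freedom. The following sketch checks both relations against the rounded values in the summary above (object names are illustrative):

```r
## values copied from the summary output above
tval <- -19.811
dof <- 98

## F statistic of the overall test equals the squared slope t value
fstat <- tval^2

## R-squared recovered from the F statistic (1 numerator degree of freedom)
r2 <- fstat / (fstat + dof)

round(fstat, 1)
round(r2, 4)
```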

An industry-leading company seeks a qualified candidate for a management position. A management consultancy carries out an assessment center which concludes in making a positive or negative recommendation for each candidate: From previous assessments they know that of those candidates that are actually eligible for the position (event \(E\)) \(60\%\) get a positive recommendation (event \(R\)). However, out of those candidates that are not eligible \(65\%\) get a negative recommendation. Overall, they know that only \(6\%\) of all job applicants are actually eligible.

What is the corresponding fourfold table of the joint probabilities? (Specify all entries in percent.)

                    \(R\)  \(\overline{R}\)     sum
\(E\)                   %                 %       %
\(\overline{E}\)        %                 %       %
sum                     %                 %       %

Using the information from the text, we can directly calculate the following joint probabilities: \[ \begin{aligned} P(E \cap R) & = P(R | E) \cdot P(E) = 0.6 \cdot 0.06 = 0.036 = 3.6\%\\ P(\overline{E} \cap \overline{R}) & = P(\overline{R} | \overline{E}) \cdot P(\overline{E}) = 0.65 \cdot 0.94 = 0.611 = 61.1\%. \end{aligned} \] The remaining probabilities can then be found by calculating sums and differences in the fourfold table:

                    \(R\)  \(\overline{R}\)     sum
\(E\)                3.60              2.40    6.00
\(\overline{E}\)    32.90             61.10   94.00
sum                 36.50             63.50  100.00

  • \(P(E \cap R) = 3.6\%\)
  • \(P(\overline{E} \cap R) = 32.9\%\)
  • \(P(E \cap \overline{R}) = 2.4\%\)
  • \(P(\overline{E} \cap \overline{R}) = 61.1\%\)
  • \(P(R) = 36.5\%\)
  • \(P(\overline{R}) = 63.5\%\)
  • \(P(E) = 6.0\%\)
  • \(P(\overline{E}) = 94.0\%\)
  • \(P(\Omega) = 100.0\%\)
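
The construction of the fourfold table can be sketched in R. The three input probabilities are taken from the exercise text; the object names (`p_E`, `tab`, `full`) and the row and column labels are illustrative.

```r
## inputs from the exercise text
p_E <- 0.06                # P(E)
p_R_given_E <- 0.60        # P(R | E)
p_notR_given_notE <- 0.65  # P(not R | not E)

## joint probabilities: two by multiplication, two by difference
tab <- matrix(0, nrow = 2, ncol = 2,
              dimnames = list(c("E", "not E"), c("R", "not R")))
tab["E", "R"] <- p_R_given_E * p_E
tab["E", "not R"] <- p_E - tab["E", "R"]
tab["not E", "not R"] <- p_notR_given_notE * (1 - p_E)
tab["not E", "R"] <- (1 - p_E) - tab["not E", "not R"]

## add the marginal sums and express everything in percent
full <- cbind(tab, sum = rowSums(tab))
full <- rbind(full, sum = colSums(full))
round(100 * full, 2)
```

Replacing the three inputs reproduces the tables for the other variants of this exercise.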

An industry-leading company seeks a qualified candidate for a management position. A management consultancy carries out an assessment center which concludes in making a positive or negative recommendation for each candidate: From previous assessments they know that of those candidates that are actually eligible for the position (event \(E\)) \(72\%\) get a positive recommendation (event \(R\)). However, out of those candidates that are not eligible \(66\%\) get a negative recommendation. Overall, they know that only \(9\%\) of all job applicants are actually eligible.

What is the corresponding fourfold table of the joint probabilities? (Specify all entries in percent.)

                    \(R\)  \(\overline{R}\)     sum
\(E\)                   %                 %       %
\(\overline{E}\)        %                 %       %
sum                     %                 %       %

Using the information from the text, we can directly calculate the following joint probabilities: \[ \begin{aligned} P(E \cap R) & = P(R | E) \cdot P(E) = 0.72 \cdot 0.09 = 0.0648 = 6.48\%\\ P(\overline{E} \cap \overline{R}) & = P(\overline{R} | \overline{E}) \cdot P(\overline{E}) = 0.66 \cdot 0.91 = 0.6006 = 60.06\%. \end{aligned} \] The remaining probabilities can then be found by calculating sums and differences in the fourfold table:

                    \(R\)  \(\overline{R}\)     sum
\(E\)                6.48              2.52    9.00
\(\overline{E}\)    30.94             60.06   91.00
sum                 37.42             62.58  100.00

  • \(P(E \cap R) = 6.48\%\)
  • \(P(\overline{E} \cap R) = 30.94\%\)
  • \(P(E \cap \overline{R}) = 2.52\%\)
  • \(P(\overline{E} \cap \overline{R}) = 60.06\%\)
  • \(P(R) = 37.42\%\)
  • \(P(\overline{R}) = 62.58\%\)
  • \(P(E) = 9.00\%\)
  • \(P(\overline{E}) = 91.00\%\)
  • \(P(\Omega) = 100.00\%\)
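
Written out, the sums and differences used to fill the remaining cells are as follows (a sketch of one possible order of steps): \[ \begin{aligned} P(E \cap \overline{R}) & = P(E) - P(E \cap R) = 9.00\% - 6.48\% = 2.52\% \\ P(\overline{E} \cap R) & = P(\overline{E}) - P(\overline{E} \cap \overline{R}) = 91.00\% - 60.06\% = 30.94\% \\ P(R) & = P(E \cap R) + P(\overline{E} \cap R) = 6.48\% + 30.94\% = 37.42\% \\ P(\overline{R}) & = 100.00\% - P(R) = 62.58\% \end{aligned} \]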

An industry-leading company seeks a qualified candidate for a management position. A management consultancy carries out an assessment center which concludes in making a positive or negative recommendation for each candidate: From previous assessments they know that of those candidates that are actually eligible for the position (event \(E\)) \(76\%\) get a positive recommendation (event \(R\)). However, out of those candidates that are not eligible \(73\%\) get a negative recommendation. Overall, they know that only \(9\%\) of all job applicants are actually eligible.

What is the corresponding fourfold table of the joint probabilities? (Specify all entries in percent.)

                    \(R\)  \(\overline{R}\)     sum
\(E\)                   %                 %       %
\(\overline{E}\)        %                 %       %
sum                     %                 %       %

Using the information from the text, we can directly calculate the following joint probabilities: \[ \begin{aligned} P(E \cap R) & = P(R | E) \cdot P(E) = 0.76 \cdot 0.09 = 0.0684 = 6.84\%\\ P(\overline{E} \cap \overline{R}) & = P(\overline{R} | \overline{E}) \cdot P(\overline{E}) = 0.73 \cdot 0.91 = 0.6643 = 66.43\%. \end{aligned} \] The remaining probabilities can then be found by calculating sums and differences in the fourfold table:

                    \(R\)  \(\overline{R}\)     sum
\(E\)                6.84              2.16    9.00
\(\overline{E}\)    24.57             66.43   91.00
sum                 31.41             68.59  100.00

  • \(P(E \cap R) = 6.84\%\)
  • \(P(\overline{E} \cap R) = 24.57\%\)
  • \(P(E \cap \overline{R}) = 2.16\%\)
  • \(P(\overline{E} \cap \overline{R}) = 66.43\%\)
  • \(P(R) = 31.41\%\)
  • \(P(\overline{R}) = 68.59\%\)
  • \(P(E) = 9.00\%\)
  • \(P(\overline{E}) = 91.00\%\)
  • \(P(\Omega) = 100.00\%\)