# Binary Data

## Bernoulli Distribution

### Probabilistic Parametrization

#### Parameter

• Probability parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | p] &= \begin{cases} 1 - p & \text{ for } y = 0 \\ p & \text{ for } y = 1 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= p \\ \mathrm{var}[Y] &= p (1 - p) \\ \end{aligned}

#### Score

$\nabla_{p} (y; p) = \begin{cases} \frac{1}{p - 1} & \text{ for } y = 0 \\ \frac{1}{p} & \text{ for } y = 1 \\ \end{cases}$

#### Fisher Information

\begin{aligned} \mathcal{I}_{p, p} (p) &= \frac{1}{p (1 - p)} \\ \end{aligned}
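
As a quick numerical check of the Bernoulli formulas above, the following Python sketch evaluates the score with respect to $$p$$ and confirms that it has zero mean and that its variance equals the Fisher information. The function names are illustrative only and do not refer to any existing package.

```python
def bernoulli_pmf(y, p):
    """P[Y = y | p] for y in {0, 1}."""
    return p if y == 1 else 1.0 - p


def bernoulli_score(y, p):
    """Derivative of the log-pmf with respect to p."""
    return 1.0 / p if y == 1 else 1.0 / (p - 1.0)


def bernoulli_fisher(p):
    """Closed-form Fisher information 1 / (p (1 - p))."""
    return 1.0 / (p * (1.0 - p))


p = 0.3
# Expectations are exact two-term sums over the support {0, 1}.
score_mean = sum(bernoulli_pmf(y, p) * bernoulli_score(y, p) for y in (0, 1))
score_var = sum(bernoulli_pmf(y, p) * bernoulli_score(y, p) ** 2 for y in (0, 1))
```

Up to floating-point error, `score_mean` is zero and `score_var` agrees with `bernoulli_fisher(p)`.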

# Categorical Data

## Categorical Distribution

### Worth Parametrization

#### Parameters

• Worth parameters $$w_i \in (0, \infty), i = 1, \ldots, n$$

#### Vector Notation

• Worth vector $$\boldsymbol{w}$$ of length $$n$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [\boldsymbol{Y} = \boldsymbol{y} | \boldsymbol{w}] &= \frac{1}{\sum_{i=1}^n w_i} \prod_{i=1}^n w_i^{y_i} \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n w_i} \boldsymbol{w} \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n w_i} \mathrm{diag} (\boldsymbol{w}) - \frac{1}{\left( \sum_{i=1}^n w_i \right)^2} \boldsymbol{w} \boldsymbol{w}^\intercal \\ \end{aligned}

#### Score

$\nabla_{\boldsymbol{w}} (\boldsymbol{y}; \boldsymbol{w}) = \boldsymbol{y} \oslash \boldsymbol{w} - \frac{1}{\sum_{i=1}^n w_i} \boldsymbol{1}_n$

#### Fisher Information

$\mathcal{I}_{\boldsymbol{w}, \boldsymbol{w}} (\boldsymbol{w}) = \frac{1}{\sum_{i=1}^n w_i} \mathrm{diag} \left( \boldsymbol{1}_n \oslash \boldsymbol{w} \right) - \frac{1}{\left( \sum_{i=1}^n w_i \right)^2} \boldsymbol{1}_{n \times n}$

### Notes

• We treat the categorical distribution as a multivariate distribution. For $$n$$ categories, observations are in the form of vectors of length $$n$$ with exactly one element equal to 1 and the others to 0.

• The probability mass function is invariant to multiplying all worth parameters by the same positive constant. Under the logarithmic transformation, it is correspondingly invariant to adding the same constant to all transformed worth parameters. The parameters therefore need to be standardized, e.g. to sum to zero in the latter case.
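
The invariance described in the note above can be illustrated with a minimal Python sketch. It evaluates the categorical probability mass function for a worth vector, for the same vector multiplied by a constant, and for the worths rebuilt from zero-sum log-worths. The helper names and the example values are assumptions made for illustration.

```python
import math


def categorical_pmf(y, w):
    """P[Y = y | w] for a one-hot vector y and positive worths w."""
    prob = 1.0 / sum(w)
    for y_i, w_i in zip(y, w):
        prob *= w_i ** y_i
    return prob


def standardize_log_worths(w):
    """Return log-worths shifted so that they sum to zero."""
    logs = [math.log(w_i) for w_i in w]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


w = [3.0, 1.0, 0.5]
y = [0, 1, 0]  # the second of three categories is observed

p_original = categorical_pmf(y, w)
p_scaled = categorical_pmf(y, [10.0 * w_i for w_i in w])

# Zero-sum log-worths describe the same distribution as w.
theta = standardize_log_worths(w)
p_standardized = categorical_pmf(y, [math.exp(t) for t in theta])
```

All three probabilities coincide, which is why a standardization such as the zero-sum constraint is needed to identify the parameters.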

# Ranking Data

## Plackett-Luce Distribution

### Worth Parametrization

#### Parameters

• Worth parameters $$w_i \in (0, \infty), i = 1, \ldots, n$$

#### Ranking Notation

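• Worth parameters by rank $$w_{j^{\mathrm{th}}}, j = 1, \ldots, n$$, where $$w_{j^{\mathrm{th}}}$$ is the worth of the item ranked $$j$$-th and $$y_i$$ is the rank of item $$i$$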

#### Probability Mass Function

$\mathrm{P} [\boldsymbol{Y} = \boldsymbol{y} | w_1, \ldots, w_n] = \prod_{j=1}^n \frac{w_{j^{\mathrm{th}}}}{\sum_{k=j}^n w_{k^{\mathrm{th}}}}$

#### Score

$\nabla_{w_i} (\boldsymbol{y}; w_1, \ldots, w_n) = \frac{1}{w_i} - \sum_{j=1}^{y_i} \frac{1}{\sum_{k = j}^n w_{k^{\mathrm{th}}}}$
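
The following Python sketch shows one way to evaluate the Plackett-Luce probability mass function and score, and to enumerate all rankings for a small $$n$$. It assumes that `y[i]` holds the rank of item `i` (1 for the best item); the function names are hypothetical.

```python
import itertools


def worths_by_rank(y, w):
    """Reorder worths so that position j - 1 holds the worth of the item ranked j-th."""
    ordered = [0.0] * len(w)
    for item, rank in enumerate(y):
        ordered[rank - 1] = w[item]
    return ordered


def pl_pmf(y, w):
    """Plackett-Luce probability of ranking y, where y[i] is the rank of item i."""
    ordered = worths_by_rank(y, w)
    prob = 1.0
    for j in range(len(ordered)):
        prob *= ordered[j] / sum(ordered[j:])
    return prob


def pl_score(y, w):
    """Gradient of the log-pmf with respect to each worth w_i."""
    ordered = worths_by_rank(y, w)
    tail_sums = [sum(ordered[j:]) for j in range(len(ordered))]
    return [
        1.0 / w[i] - sum(1.0 / tail_sums[j] for j in range(y[i]))
        for i in range(len(w))
    ]


w = [1.5, 0.7, 2.0, 1.0]
all_rankings = list(itertools.permutations(range(1, len(w) + 1)))

# Exhaustive enumeration is feasible here because there are only 4! = 24 rankings.
total_prob = sum(pl_pmf(y, w) for y in all_rankings)
mean_score = [
    sum(pl_pmf(y, w) * pl_score(y, w)[i] for y in all_rankings)
    for i in range(len(w))
]
```

The probabilities sum to one and the score has zero mean in every coordinate, up to floating-point error. The same enumeration could be used to accumulate other expectations, although its cost grows factorially with the number of items.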

### Notes

• The expected value, the variance, and the Fisher information are computed directly from their definitions as sums over all possible rankings. Because the number of permutations, $$n!$$, grows rapidly with $$n$$, we use this exhaustive approach only for $$n \leq 6$$. For $$n \geq 7$$, we instead randomly sample 1 000 rankings, with the random seed set locally so that the results are reproducible.

• The probability mass function is invariant to multiplying all worth parameters by the same positive constant. Under the logarithmic transformation, it is correspondingly invariant to adding the same constant to all transformed worth parameters. The parameters therefore need to be standardized, e.g. to sum to zero in the latter case.

• Alvo, M. and Yu, P. L. H. (2014). Statistical Methods for Ranking Data. Springer. doi: 10.1007/978-1-4939-1471-5.

• Holý, V. and Zouhar, J. (2022). Modelling Time-Varying Rankings with Autoregressive and Score-Driven Dynamics. Journal of the Royal Statistical Society: Series C (Applied Statistics), 71(5). doi: 10.1111/rssc.12584.

• Luce, R. D. (1977). The Choice Axiom after Twenty Years. Journal of Mathematical Psychology, 15(3), 215–233. doi: 10.1016/0022-2496(77)90032-3.

• Plackett, R. L. (1975). The Analysis of Permutations. Journal of the Royal Statistical Society: Series C (Applied Statistics), 24(2), 193–202. doi: 10.2307/2346567.

# Count Data

## Double Poisson Distribution

### Mean Parametrization

#### Parameters

• Mean parameter $$m \in (0, \infty)$$
• Dispersion parameter $$s \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m, s] \approx \frac{1}{1 + \frac{1 - s}{12 s m} \left(1 + \frac{1}{s m} \right)} \sqrt{s} \frac{y^y}{y!} \left( \frac{m}{y} \right)^{s y} \exp(s y - s m - y)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &\approx m \\ \mathrm{var}[Y] &\approx \frac{m}{s} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &\approx \frac{s}{m} (y - m) \\ \nabla_{s} (y; m, s) &\approx \frac{1}{2 s} - m + y + y \ln(m) - y \ln(y) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{s}{m} \\ \mathcal{I}_{m, s} (m, s) &\approx 0 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{1}{2 s^2} \\ \end{aligned}
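
To give a feel for the quality of the approximation, the sketch below evaluates the approximate probability mass function on a truncated support and computes its total mass, mean, and variance. It is an illustration under assumed parameter values, with $$y \ln(y)$$ taken as 0 at $$y = 0$$; the function name is hypothetical.

```python
import math


def double_poisson_log_pmf(y, m, s):
    """Approximate log-pmf of the double Poisson distribution."""
    # Approximate normalizing constant.
    log_c = -math.log(1.0 + (1.0 - s) / (12.0 * s * m) * (1.0 + 1.0 / (s * m)))
    # y * log(y) is taken as 0 at y = 0 (its limit as y -> 0).
    y_log_y = y * math.log(y) if y > 0 else 0.0
    return (
        log_c
        + 0.5 * math.log(s)
        + y_log_y
        - math.lgamma(y + 1.0)
        + s * (y * math.log(m) - y_log_y)
        + s * y
        - s * m
        - y
    )


m, s = 10.0, 0.7
support = range(0, 200)  # truncation point chosen generously for these values
probs = [math.exp(double_poisson_log_pmf(y, m, s)) for y in support]

total = sum(probs)
mean = sum(y * q for y, q in zip(support, probs))
var = sum((y - mean) ** 2 * q for y, q in zip(support, probs))
```

For these values, `total` should be close to 1, `mean` close to `m`, and `var` close to `m / s`, in line with the approximate moments stated above.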

### Note

• The probability mass function is not available in a closed form. We use the approximation of Efron (1986) for the probability mass function, the mean, the variance, the score, and the Fisher information.

• Aragon, D. C., Achcar, J. A., and Martinez, E. Z. (2018). Maximum Likelihood and Bayesian Estimators for the Double Poisson Distribution. Journal of Statistical Theory and Practice, 12(4), 886–911. doi: 10.1080/15598608.2018.1489919.

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Efron, B. (1986). Double Exponential Families and Their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81(395), 709–721. doi: 10.1080/01621459.1986.10478327.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

• Holý, V. and Tomanová, P. (2022). Modeling Price Clustering in High-Frequency Prices. Quantitative Finance. doi: 10.1080/14697688.2022.2050285.

• Sellers, K. F. and Morris, D. S. (2017). Underdispersion Models: Models That Are “Under the Radar.” Communications in Statistics - Theory and Methods, 46(24), 12075–12086. doi: 10.1080/03610926.2017.1291976.

• Zou, Y., Geedipally, S. R., and Lord, D. (2013). Evaluating the Double Poisson Generalized Linear Model. Accident Analysis and Prevention, 59, 497–505. doi: 10.1016/j.aap.2013.07.017.

## Geometric Distribution

### Mean Parametrization

#### Parameter

• Mean parameter $$m \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m] = \frac{1}{1 + m} \left( \frac{m}{1 + m} \right)^{y}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m (1 + m) \\ \end{aligned}

#### Score

$\nabla_{m} (y; m) = \frac{y - m}{m (1 + m) }$

#### Fisher Information

$\mathcal{I}_{m, m} (m) = \frac{1}{m (1 + m)}$

### Probabilistic Parametrization

#### Parameter

• Probability parameter $$p \in (0, 1)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | p] = p (1 - p)^{y}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{1 - p}{p} \\ \mathrm{var}[Y] &= \frac{1 - p}{p^2} \\ \end{aligned}

#### Score

$\nabla_{p} (y; p) = \frac{p y + p - 1}{p (p - 1)}$

#### Fisher Information

$\mathcal{I}_{p, p} (p) = \frac{1}{p^2 (1 - p)}$
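
Comparing the two probability mass functions shows that the parametrizations are linked by $$p = 1 / (1 + m)$$. The following sketch checks this link numerically and confirms that the two Fisher information formulas agree under the usual reparametrization rule. Function names are illustrative.

```python
def geometric_pmf_mean(y, m):
    """Geometric pmf in the mean parametrization."""
    return (1.0 / (1.0 + m)) * (m / (1.0 + m)) ** y


def geometric_pmf_prob(y, p):
    """Geometric pmf in the probabilistic parametrization."""
    return p * (1.0 - p) ** y


def fisher_mean(m):
    """Fisher information with respect to m."""
    return 1.0 / (m * (1.0 + m))


def fisher_prob(p):
    """Fisher information with respect to p."""
    return 1.0 / (p ** 2 * (1.0 - p))


m = 2.5
p = 1.0 / (1.0 + m)

max_pmf_gap = max(
    abs(geometric_pmf_mean(y, m) - geometric_pmf_prob(y, p)) for y in range(50)
)

# Reparametrization rule: I_m = I_p * (dp/dm)^2 with dp/dm = -1 / (1 + m)^2.
fisher_from_prob = fisher_prob(p) * (1.0 / (1.0 + m) ** 2) ** 2
```

Both `max_pmf_gap` and the difference between `fisher_from_prob` and `fisher_mean(m)` vanish up to floating-point error.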

## Negative Binomial Distribution

### NB2 Parametrization

#### Parameters

• Mean parameter $$m \in (0, \infty)$$
• Dispersion parameter $$s \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m, s] = \frac{\Gamma (y + s^{-1})}{\Gamma (y + 1) \Gamma (s^{-1})} \left( \frac{1}{1 + s m} \right)^{s^{-1}} \left( \frac{s m}{1 + s m} \right)^{y}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m (1 + s m) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{y - m}{m (1 + s m) } \\ \nabla_{s} (y; m, s) &= \frac{ y - m}{s (1 + s m)} + \frac{1}{s^2} \left( \ln(1 + s m) + \psi_0 \left( \frac{1}{s} \right) - \psi_0 \left( y + \frac{1}{s} \right) \right) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{m (1 + s m)} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{1}{s^4} \left( \ln(1 + s m) + \psi_0 \left( \frac{1}{s} \right) - \psi_0 \left( m + \frac{1}{s} \right) \right)^2 \\ \end{aligned}

### Probabilistic Parametrization

#### Parameters

• Probability parameter $$p \in (0, 1)$$
• Size parameter $$r \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | p, r] = \frac{\Gamma(y + r)}{\Gamma(y + 1) \Gamma(r)} (1 - p)^y p^r$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{r (1 - p)}{p} \\ \mathrm{var}[Y] &= \frac{r (1 - p)}{p^2} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{p} (y; p, r) &= \frac{p r + p y - r}{p (p - 1)} \\ \nabla_{r} (y; p, r) &= \ln(p) - \psi_0(r) + \psi_0(y + r) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{p, p} (p, r) &= \frac{r}{p^2 (1 - p)} \\ \mathcal{I}_{p, r} (p, r) &= -\frac{1}{p} \\ \mathcal{I}_{r, r} (p, r) &\approx \left( \ln(p) - \psi_0(r) + \psi_0 \left( \frac{r}{p} \right) \right)^2 \\ \end{aligned}
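
Matching the two probability mass functions term by term gives the mapping $$r = s^{-1}$$ and $$p = 1 / (1 + s m)$$ between the NB2 and probabilistic parametrizations. The sketch below verifies this mapping and the NB2 moments by direct summation; the function names and parameter values are assumptions for illustration.

```python
import math


def nb2_log_pmf(y, m, s):
    """Log-pmf of the negative binomial distribution in the NB2 parametrization."""
    inv_s = 1.0 / s
    return (
        math.lgamma(y + inv_s) - math.lgamma(y + 1.0) - math.lgamma(inv_s)
        - inv_s * math.log(1.0 + s * m)
        + y * (math.log(s * m) - math.log(1.0 + s * m))
    )


def nb_prob_log_pmf(y, p, r):
    """Log-pmf of the negative binomial distribution in the probabilistic parametrization."""
    return (
        math.lgamma(y + r) - math.lgamma(y + 1.0) - math.lgamma(r)
        + y * math.log(1.0 - p) + r * math.log(p)
    )


def nb2_to_prob(m, s):
    """Map (m, s) to (p, r)."""
    return 1.0 / (1.0 + s * m), 1.0 / s


m, s = 4.0, 0.5
p, r = nb2_to_prob(m, s)
support = range(0, 400)  # the tail beyond this point is negligible here

max_gap = max(abs(nb2_log_pmf(y, m, s) - nb_prob_log_pmf(y, p, r)) for y in support)

probs = [math.exp(nb2_log_pmf(y, m, s)) for y in support]
mean = sum(y * q for y, q in zip(support, probs))
var = sum((y - mean) ** 2 * q for y, q in zip(support, probs))
```

The two log-pmfs agree on the whole support, and the summed moments match $$m$$ and $$m (1 + s m)$$.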

### Note

• The Fisher information for the dispersion or size parameter, $$\mathcal{I}_{s, s} (m, s)$$ or $$\mathcal{I}_{r, r} (p, r)$$, is not available in a closed form. To speed up calculations, we use a rough approximation by replacing $$y$$ with its expected value.

• Cameron, A. C. and Trivedi, P. K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1(1), 29–53. doi: 10.1002/jae.3950010104.

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

## Poisson Distribution

### Mean Parametrization

#### Parameter

• Mean parameter $$m \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m] = \frac{m^y}{y!} \exp(-m)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m \\ \end{aligned}

#### Score

$\nabla_{m} (y; m) = \frac{y - m}{m}$

#### Fisher Information

$\mathcal{I}_{m, m} (m) = \frac{1}{m}$

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Davis, R. A., Dunsmuir, W. T. M., and Street, S. B. (2003). Observation-Driven Models for Poisson Counts. Biometrika, 90(4), 777–790. doi: 10.1093/biomet/90.4.777.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

## Zero-Inflated Geometric Distribution

### Mean Parametrization

#### Parameters

• Mean parameter $$m \in (0, \infty)$$
• Zero inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | m, p] &= \begin{cases} p + (1 - p) \left( \frac{1}{1 + m} \right) & \text{ for } y = 0 \\ (1 - p) \left( \frac{1}{1 + m} \right) \left( \frac{m}{1 + m} \right)^{y} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m + m) \\ \end{aligned}
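
These moments follow from those of the underlying geometric distribution. As a short derivation sketch, let $$X$$ be a geometric variable with mean $$m$$ and variance $$m (1 + m)$$, and let $$Y = 0$$ with probability $$p$$ and $$Y = X$$ otherwise (the symbol $$X$$ is introduced here only for illustration):

\begin{aligned} \mathrm{E}[Y] &= (1 - p) \mathrm{E}[X] = m (1 - p) \\ \mathrm{E}[Y^2] &= (1 - p) \left( \mathrm{var}[X] + \mathrm{E}[X]^2 \right) = (1 - p) \left( m (1 + m) + m^2 \right) \\ \mathrm{var}[Y] &= \mathrm{E}[Y^2] - \mathrm{E}[Y]^2 = (1 - p) \left( m + 2 m^2 \right) - (1 - p)^2 m^2 = m (1 - p) (1 + p m + m) \\ \end{aligned}

The same argument applies to any zero-inflated distribution: only $$\mathrm{var}[X]$$ changes with the base distribution.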

#### Score

\begin{aligned} \nabla_{m} (y; m, p) &= \begin{cases} \frac{p - 1}{(1 + m) (1 + p m)} & \text{ for } y = 0 \\ \frac{y - m}{m (1 + m) } & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, p) &= \begin{cases} \frac{m}{1 + p m} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, p) &= \frac{(1 - p) (1 + m + p m^2)}{m (1 + m)^2 (1 + p m)} \\ \mathcal{I}_{m, p} (m, p) &= - \frac{1}{ (1 + m) ( 1 + p m) } \\ \mathcal{I}_{p, p} (m, p) &= \frac{m}{(1 - p) ( 1 + p m)} \\ \end{aligned}

• Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

• Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

## Zero-Inflated Negative Binomial Distribution

### NB2 Parametrization

#### Parameters

• Mean parameter $$m \in (0, \infty)$$
• Dispersion parameter $$s \in (0, \infty)$$
• Zero inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \left( \frac{1}{1 + s m} \right)^{s^{-1}} & \text{ for } y = 0 \\ (1 - p) \frac{\Gamma (y + s^{-1})}{\Gamma (y + 1) \Gamma (s^{-1})} \left( \frac{1}{1 + s m} \right)^{s^{-1}} \left( \frac{s m}{1 + s m} \right)^{y} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m + s m) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{p - 1}{(1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} & \text{ for } y = 0 \\ \frac{y - m}{m (1 + s m) } & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{(1 - p) \left( (1 + s m) \ln(1 + s m) -s m \right) }{ s^2 (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}}- p \right) } & \text{ for } y = 0 \\ \frac{ s (y - m) + (1 + s m) \left( \ln(1 + s m) + \psi_0 \left( s^{-1} \right) - \psi_0 \left( y + s^{-1} \right) \right) }{s^2 (1 + s m)} & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{(1 + s m)^{s^{-1}} - 1}{1 + p (1 + s m)^{s^{-1}}- p} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s, p) &= \frac{p(p - 1)}{(1 + s m)^2 \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} + \frac{1 -p}{m(1 + s m)} \\ \mathcal{I}_{m, s} (m, s, p) &= \frac{\left( p - p^2 \right) \left( (1 + s m) \ln(1 + s m) - s m \right) }{s^2 (1 + s m)^2 \left( 1 + p (1 + s m)^{s^{-1}} -p \right)} \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{-1}{ (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right) }\\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{1}{s^4} \left( \ln(1 + s m) + \psi_0 \left( s^{-1} \right) - \psi_0 \left( m + s^{-1} \right) \right)^2 \left( 1 - p - (1 - p) \left( 1 + s m \right)^{-s^{-1}} \right) \\ & \qquad + \frac{(1 - p)^2 \left( (1 + s m) \ln(1 + s m) - s m \right)^2} {s^4 (1 + s m)^{2 + s^{-1}} \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(1 + s m) \ln(1 + s m) - s m}{s^2 (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{1 - (1 + s m)^{s^{-1}}}{(p - 1) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \end{aligned}

### Note

• The Fisher information for the dispersion parameter, $$\mathcal{I}_{s, s} (m, s, p)$$, is not available in a closed form. To speed up calculations, we use an approximation by replacing $$y$$ with its expected value combined with the zero value.

• Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

• Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

## Zero-Inflated Poisson Distribution

### Mean Parametrization

#### Parameters

• Mean parameter $$m \in (0, \infty)$$
• Zero inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | m, p] &= \begin{cases} p + (1 - p) \exp(-m) & \text{ for } y = 0 \\ (1 - p) \frac{m^y}{y!} \exp(-m) & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, p) &= \begin{cases} \frac{p - 1}{p \exp(m) - p + 1} & \text{ for } y = 0 \\ \frac{y - m}{m} & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, p) &= \begin{cases} \frac{\exp(m) - 1}{p \exp(m) - p + 1} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, p) &= \frac{p (p - 1)}{p \exp(m) - p + 1} - \frac{p - 1}{m} \\ \mathcal{I}_{m, p} (m, p) &= - \frac{1}{p \exp(m) - p + 1} \\ \mathcal{I}_{p, p} (m, p) &= \frac{\exp(m) - 1}{(1 - p) (p \exp(m) - p + 1)} \\ \end{aligned}
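
The closed-form Fisher information of the zero-inflated Poisson distribution can be checked against its definition as the expected outer product of the score. The sketch below does this by summing over a truncated support for the parameters $$(m, p)$$; the function names and parameter values are assumptions for illustration.

```python
import math


def zip_pmf(y, m, p):
    """Zero-inflated Poisson pmf with mean parameter m and zero inflation p."""
    base = math.exp(-m + y * math.log(m) - math.lgamma(y + 1.0))
    return p + (1.0 - p) * base if y == 0 else (1.0 - p) * base


def zip_score(y, m, p):
    """Return the derivatives of the log-pmf with respect to m and p."""
    denom = p * math.exp(m) - p + 1.0
    if y == 0:
        return (p - 1.0) / denom, (math.exp(m) - 1.0) / denom
    return (y - m) / m, 1.0 / (p - 1.0)


def zip_fisher(m, p):
    """Closed-form Fisher information entries (I_mm, I_mp, I_pp)."""
    denom = p * math.exp(m) - p + 1.0
    i_mm = p * (p - 1.0) / denom - (p - 1.0) / m
    i_mp = -1.0 / denom
    i_pp = (math.exp(m) - 1.0) / ((1.0 - p) * denom)
    return i_mm, i_mp, i_pp


m, p = 3.0, 0.2
numeric = [0.0, 0.0, 0.0]
for y in range(0, 150):  # the tail beyond this point is negligible here
    q = zip_pmf(y, m, p)
    g_m, g_p = zip_score(y, m, p)
    numeric[0] += q * g_m * g_m
    numeric[1] += q * g_m * g_p
    numeric[2] += q * g_p * g_p

closed_form = zip_fisher(m, p)
```

The entries of `numeric` agree with `closed_form` up to floating-point error, so no approximation is needed for this distribution.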

• Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

• Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

• Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

• Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

• Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

# Integer Data

## Skellam Distribution

### Difference Parametrization

#### Parameters

• First rate parameter $$r_1 \in (0, \infty)$$
• Second rate parameter $$r_2 \in (0, \infty)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | r_1, r_2] &= \exp(-r_1 - r_2) \left( \frac{r_1}{r_2} \right)^{\frac{y}{2}} I_y \left( 2 \sqrt{r_1 r_2} \right) \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= r_1 - r_2 \\ \mathrm{var}[Y] &= r_1 + r_2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r_1} (y; r_1, r_2) &= \sqrt{\frac{r_2}{r_1}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } - 1 \\ \nabla_{r_2} (y; r_1, r_2) &= \sqrt{\frac{r_1}{r_2}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } -\frac{y}{r_2} - 1 \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r_1, r_1} (r_1, r_2) &\approx \frac{r_2}{r_1} \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + 1 \\ \mathcal{I}_{r_1, r_2} (r_1, r_2) &\approx \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + \frac{r_1}{r_2} \\ \mathcal{I}_{r_2, r_2} (r_1, r_2) &\approx \frac{r_1}{r_2} \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \left( \frac{r_1}{r_2} \right)^{\frac{3}{2}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + \left( \frac{r_1}{r_2} \right)^2 \\ \end{aligned}

### Mean-Dispersion Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Dispersion parameter $$s \in (0, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m, s] = \exp(-|m| - s) \left( \frac{|m| + m + s}{|m| - m + s} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 + 2 |m| s} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= |m| + s \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{y}{2|m| + s} + \frac{\mathrm{sgn}(m) s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - \mathrm{sgn}(m) \\ \nabla_{s} (y; m, s) &= - \frac{m y}{s^2 + 2 |m| s} + \frac{|m| + s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - 1 \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{s^2}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \mathcal{I}_{m, s} (m, s) &\approx \frac{\mathrm{sgn}(m) (|m| + s) s}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{(|m| + s)^2}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \end{aligned}

### Mean-Variance Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Variance parameter $$s \in (|m|, \infty)$$

#### Probability Mass Function

$\mathrm{P} [Y = y | m, s] = \exp(-s) \left( \frac{s + m}{s - m} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 - m^2} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{s y}{s^2 - m^2} - \frac{m}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } \\ \nabla_{s} (y; m, s) &= -\frac{m y}{s^2 - m^2} + \frac{s}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } - 1\\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{m^2}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \mathcal{I}_{m, s} (m, s) &\approx - \frac{m s}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{s^2}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \end{aligned}
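
The three Skellam parametrizations describe the same distribution. Matching the probability mass functions gives $$m = r_1 - r_2$$ in both alternative parametrizations, with $$s = r_1 + r_2 - |m|$$ for the mean-dispersion case and $$s = r_1 + r_2$$ for the mean-variance case. The sketch below checks this numerically. It uses a plain power series for the modified Bessel function $$I_y$$, which is an implementation choice made here to keep the example self-contained; the function names are hypothetical.

```python
import math


def bessel_i(order, x, terms=80):
    """Modified Bessel function of the first kind for integer order and x > 0."""
    n = abs(order)  # I_{-n}(x) = I_n(x) for integer n
    log_half_x = math.log(x / 2.0)
    return sum(
        math.exp((2 * k + n) * log_half_x - math.lgamma(k + 1.0) - math.lgamma(k + n + 1.0))
        for k in range(terms)
    )


def skellam_pmf_diff(y, r1, r2):
    """Difference parametrization with rates r1 and r2."""
    return math.exp(-r1 - r2) * (r1 / r2) ** (y / 2.0) * bessel_i(y, 2.0 * math.sqrt(r1 * r2))


def skellam_pmf_mean_disp(y, m, s):
    """Mean-dispersion parametrization: mean m, variance |m| + s."""
    root = math.sqrt(s * s + 2.0 * abs(m) * s)
    ratio = (abs(m) + m + s) / (abs(m) - m + s)
    return math.exp(-abs(m) - s) * ratio ** (y / 2.0) * bessel_i(y, root)


def skellam_pmf_mean_var(y, m, s):
    """Mean-variance parametrization: mean m, variance s > |m|."""
    root = math.sqrt(s * s - m * m)
    return math.exp(-s) * ((s + m) / (s - m)) ** (y / 2.0) * bessel_i(y, root)


r1, r2 = 2.0, 3.5
m = r1 - r2  # mean
v = r1 + r2  # variance
support = range(-40, 41)  # wide enough that the omitted tails are negligible

probs = [skellam_pmf_diff(y, r1, r2) for y in support]
total = sum(probs)
mean = sum(y * q for y, q in zip(support, probs))
var = sum((y - mean) ** 2 * q for y, q in zip(support, probs))

# Largest pointwise disagreement between the three parametrizations.
max_gap = max(
    max(
        abs(q - skellam_pmf_mean_disp(y, m, v - abs(m))),
        abs(q - skellam_pmf_mean_var(y, m, v)),
    )
    for y, q in zip(support, probs)
)
```

The summed mass, mean, and variance match 1, $$r_1 - r_2$$, and $$r_1 + r_2$$, and `max_gap` is zero up to floating-point error. For large arguments, a library Bessel routine would likely be more robust than the series used here.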

### Note

• The computation of the Fisher information is quite intricate and we resort to an approximation by replacing $$y$$ with its expected value.

• Alzaid, A. A. and Omair, M. A. (2010). On the Poisson Difference Distribution Inference and Applications. Bulletin of the Malaysian Mathematical Sciences Society, 33(1), 17–45. EuDML: 244475.

• Karlis, D. and Ntzoufras, I. (2009). Bayesian Modelling of Football Outcomes: Using the Skellam’s Distribution for the Goal Difference. IMA Journal of Management Mathematics, 20(2), 133–145. doi: 10.1093/imaman/dpn026.

• Koopman, S. J. and Lit, R. (2019). Forecasting Football Match Results in National League Competitions Using Score-Driven Time Series Models. International Journal of Forecasting, 35(2), 797–809. doi: 10.1016/j.ijforecast.2018.10.011.

• Koopman, S. J., Lit, R., Lucas, A., and Opschoor, A. (2018). Dynamic Discrete Copula Models for High-Frequency Stock Price Changes. Journal of Applied Econometrics, 33(7), 966–985. doi: 10.1002/jae.2645.

• Skellam, J. G. (1946). The Frequency Distribution of the Difference Between Two Poisson Variates Belonging to Different Populations. Journal of the Royal Statistical Society, 109(3), 296. doi: 10.2307/2981372.

## Zero-Inflated Skellam Distribution

### Difference Parametrization

#### Parameters

• First rate parameter $$r_1 \in (0, \infty)$$
• Second rate parameter $$r_2 \in (0, \infty)$$
• Inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | r_1, r_2, p] &= \begin{cases} p + (1 - p) \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-r_1 - r_2) \left( \frac{r_1}{r_2} \right)^{\frac{y}{2}} I_y \left( 2 \sqrt{r_1 r_2} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= (1 - p) (r_1 - r_2) \\ \mathrm{var}[Y] &= (1 - p) \left( p \left( r_1 - r_2 \right)^2 + r_1 + r_2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r_1} (y; r_1, r_2, p) &= \begin{cases} \frac{(p - 1) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} & \text{ for } y = 0 \\ \sqrt{\frac{r_2}{r_1}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{r_2} (y; r_1, r_2, p) &= \begin{cases} \frac{(p - 1) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} & \text{ for } y = 0 \\ \sqrt{\frac{r_1}{r_2}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } -\frac{y}{r_2} - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; r_1, r_2, p) &= \begin{cases} \frac{\exp(r_1 + r_2) - I_0 \left( 2 \sqrt{r_1 r_2} \right)}{p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r_1, r_1} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( 1 - \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2 -1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)^2}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \mathcal{I}_{r_1, r_2} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( 1 - \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right) \\ & \qquad \times \left( \frac{r_1}{r_2} - \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right) \\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{r_1, p} (r_1, r_2, p) &= \frac{(p - 1) \left( 1 - \exp(-r_1 - r_2 ) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{r_2, r_2} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( \frac{r_1}{r_2} - \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right)^2 
\\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)^2}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \mathcal{I}_{r_2, p} (r_1, r_2, p) &= \frac{(p - 1) \left( 1 - \exp(-r_1 - r_2 ) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{p, p} (r_1, r_2, p) &= \frac{\exp(r_1 + r_2) - I_0 \left( 2 \sqrt{r_1 r_2} \right)}{(1 - p) \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \end{aligned}

### Mean-Dispersion Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Dispersion parameter $$s \in (0, \infty)$$
• Inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-|m| - s) \left( \frac{|m| + m + s}{|m| - m + s} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 + 2 |m| s} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= (1 - p) m \\ \mathrm{var}[Y] &= (1 - p) \left( |m| + s + p m^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{\mathrm{sgn}(m) (p - 1) \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) + p \exp(|m| + s) \right)} & \text{ for } y = 0 \\ \frac{y}{2|m| + s} + \frac{\mathrm{sgn}(m) s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - \mathrm{sgn}(m) & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{ (p - 1) \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) }{\sqrt{s^2 + 2 |m| s} \left( (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) + p \exp(|m| + s) \right)} & \text{ for } y = 0 \\ - \frac{m y}{s^2 + 2 |m| s} + \frac{|m| + s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{\exp(|m| + s) - I_0 \left( \sqrt{s^2 + 2 |m| s} \right)}{p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s, p) &\approx \frac{s^2 (1 - p) \left( 1 - \exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-|m| - s) }{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)^2 \\ \mathcal{I}_{m, s} (m, s, p) &\approx \frac{\mathrm{sgn}(m) s (1 - p) (|m| + s) \left( 1 - \exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{\mathrm{sgn}(m) (1 - p)^2 \exp(-|m| - s)}{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{\mathrm{sgn}(m) (p - 1) \left( 1 - \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{(1 - p) (|m| + s)^2 \left( 1 - 
\exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-|m| - s)}{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)^2 \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(p - 1) \left( 1 - \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{\exp(|m| + s) - I_0 \left( \sqrt{s^2 + 2 |m| s} \right)}{(1 - p) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ \end{aligned}

### Mean-Variance Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Variance parameter $$s \in (|m|, \infty)$$
• Inflation parameter $$p \in (0, 1)$$

#### Probability Mass Function

\begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-s) \left( \frac{s + m}{s - m} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 - m^2} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Moments

\begin{aligned} \mathrm{E}[Y] &= (1 - p) m \\ \mathrm{var}[Y] &= (1 - p) \left( s + p m^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{m (p - 1) I_{1} \left( \sqrt{s^2 - m^2} \right)}{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)} & \text{ for } y = 0 \\ \frac{s y}{s^2 - m^2} - \frac{m}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{ (p - 1) \left( \sqrt{s^2 - m^2} I_{0} \left( \sqrt{s^2 - m^2} \right) - s I_{1} \left( \sqrt{s^2 - m^2} \right) \right) }{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)} & \text{ for } y = 0 \\ -\frac{m y}{s^2 - m^2} + \frac{s}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{\exp(s) - I_{0} \left( \sqrt{s^2 - m^2} \right)}{p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s, p) &\approx \frac{m^2 (1 - p) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)}{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{m^2 (1 - p)^2 \exp(-s) I_{1} \left( \sqrt{s^2 - m^2} \right)^2}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{m, s} (m, s, p) &\approx \frac{m s (p - 1) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right) }{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{m (1 - p)^2 \exp(-s) I_{1} \left( \sqrt{s^2 - m^2} \right)}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right) \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{m (p - 1) \left( 1 - \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) \right) I_1 \left( \sqrt{s^2 - m^2} \right)}{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{s^2 (1 - p) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)}{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-s) \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right)^2}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) I_0 
\left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(p - 1) \left( 1 - \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) \right) }{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right) \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{\exp(s) - I_0 \left( \sqrt{s^2 - m^2} \right)}{(1 - p) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right) } \\ \end{aligned}

### Note

• The exact Fisher information for the first two parameters is intricate to compute, so the entries marked with $$\approx$$ are approximations: the case $$y = 0$$ is treated exactly, while every nonzero $$y$$ is replaced by the expected value $$m$$ of the non-inflated distribution.

• Karlis, D. and Ntzoufras, I. (2009). Bayesian Modelling of Football Outcomes: Using the Skellam’s Distribution for the Goal Difference. IMA Journal of Management Mathematics, 20(2), 133–145. doi: 10.1093/imaman/dpn026.
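
The mean-variance parametrization of the zero-inflated Skellam distribution can be sanity-checked numerically. The following Python sketch evaluates the probability mass function on a truncated support and compares the resulting total probability, mean, and variance with the stated moments. The helper names `bessel_i` and `zi_skellam_pmf`, the parameter values, and the truncation to $$[-60, 60]$$ are assumptions made for illustration only.

```python
import math


def bessel_i(order, x, terms=80):
    """Modified Bessel function I_order(x) for integer order and x > 0.

    Uses the power series sum_k (x/2)^(2k+n) / (k! (k+n)!) with n = |order|,
    relying on the symmetry I_{-n}(x) = I_n(x).
    """
    n = abs(order)
    half = x / 2.0
    term = math.exp(n * math.log(half) - math.lgamma(n + 1))  # k = 0 term
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + n))
        total += term
    return total


def zi_skellam_pmf(y, m, s, p):
    """Zero-inflated Skellam PMF, mean-variance parametrization (s > |m|)."""
    x = math.sqrt(s * s - m * m)
    skellam = math.exp(-s) * ((s + m) / (s - m)) ** (y / 2.0) * bessel_i(y, x)
    return (p if y == 0 else 0.0) + (1.0 - p) * skellam


# Illustrative parameter values (assumed for this example only).
m, s, p = 0.5, 2.0, 0.2
support = range(-60, 61)
probs = [zi_skellam_pmf(y, m, s, p) for y in support]

total = sum(probs)
mean = sum(y * q for y, q in zip(support, probs))
variance = sum((y - mean) ** 2 * q for y, q in zip(support, probs))

print("total probability:", total)
print("mean:", mean, "vs formula:", (1 - p) * m)
print("variance:", variance, "vs formula:", (1 - p) * (s + p * m * m))
```

The series-based Bessel helper is adequate for the small arguments used here; for large arguments a scaled implementation from a numerical library would likely be preferable.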

# Circular Data

## von Mises Distribution

### Mean-Concentration Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Concentration parameter $$v \in (0, \infty)$$

#### Density Function

$f(y | m, v) = \frac{1}{2 \pi I_0(v)} \exp \left( v \cos(y - m) \right)$

#### Circular Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= 1 - \frac{I_1(v)}{I_0(v)} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, v) &= v \sin(y - m) \\ \nabla_{v} (y; m, v) &= \cos(y - m) - \frac{I_1(v)}{I_0(v)} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, v) &= v \frac{I_1(v)}{I_0(v)} \\ \mathcal{I}_{m, v} (m, v) &= 0 \\ \mathcal{I}_{v, v} (m, v) &= \frac{1}{2} - \left( \frac{I_1(v)}{I_0(v)} \right)^2 + \frac{I_2(v)}{2 I_0(v)} \\ \end{aligned}
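
The Fisher information entries of the von Mises distribution can be verified by integrating the squared score against the density over one full circle. The Python sketch below does this with a trapezoid rule; the helper names, the series-based Bessel function, and the parameter values are assumptions chosen for illustration.

```python
import math


def bessel_i(order, x, terms=60):
    """Modified Bessel function I_order(x) for integer order >= 0 and x > 0 (power series)."""
    half = x / 2.0
    term = math.exp(order * math.log(half) - math.lgamma(order + 1))  # k = 0 term
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + order))
        total += term
    return total


def von_mises_pdf(y, m, v):
    """Density of the von Mises distribution with mean m and concentration v."""
    return math.exp(v * math.cos(y - m)) / (2.0 * math.pi * bessel_i(0, v))


def circle_expectation(g, m, v, n=2000):
    """E[g(Y)] by the trapezoid rule on [-pi, pi), which suits periodic integrands."""
    h = 2.0 * math.pi / n
    grid = (-math.pi + i * h for i in range(n))
    return h * sum(g(y) * von_mises_pdf(y, m, v) for y in grid)


# Illustrative parameter values (assumed for this example only).
m, v = 0.7, 2.5
ratio = bessel_i(1, v) / bessel_i(0, v)

# Expected squared scores, computed numerically.
fisher_mm_numeric = circle_expectation(lambda y: (v * math.sin(y - m)) ** 2, m, v)
fisher_vv_numeric = circle_expectation(lambda y: (math.cos(y - m) - ratio) ** 2, m, v)

# Closed-form entries as listed above.
fisher_mm_formula = v * ratio
fisher_vv_formula = 0.5 - ratio ** 2 + bessel_i(2, v) / (2.0 * bessel_i(0, v))

print(fisher_mm_numeric, fisher_mm_formula)
print(fisher_vv_numeric, fisher_vv_formula)
```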

• Harvey, A., Hurn, S., and Thiele, S. (2019). Modeling Directional (Circular) Time Series. Cambridge Working Papers in Economics, CWPE 1971. doi: 10.17863/cam.43915.

# Interval Data

## Beta Distribution

### Concentration Parametrization

#### Parameters

• First concentration parameter $$a_1 \in (0, \infty)$$
• Second concentration parameter $$a_2 \in (0, \infty)$$

#### Density Function

$f(y | a_1, a_2) = \frac{1}{B(a_1, a_2)} y^{a_1 - 1} (1 - y)^{a_2 - 1}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{a_1}{a_1 + a_2} \\ \mathrm{var}[Y] &= \frac{a_1 a_2}{(a_1 + a_2)^2 (a_1 + a_2 + 1)} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{a_1} (y; a_1, a_2) &= \psi_0(a_1 + a_2) - \psi_0(a_1) + \ln(y) \\ \nabla_{a_2} (y; a_1, a_2) &= \psi_0(a_1 + a_2) - \psi_0(a_2) + \ln(1 - y) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{a_1, a_1} (a_1, a_2) &= \psi_1(a_1) - \psi_1(a_1 + a_2) \\ \mathcal{I}_{a_1, a_2} (a_1, a_2) &= -\psi_1(a_1 + a_2) \\ \mathcal{I}_{a_2, a_2} (a_1, a_2) &= \psi_1(a_2) - \psi_1(a_1 + a_2) \\ \end{aligned}

### Mean-Size Parametrization

#### Parameters

• Mean parameter $$m \in (0, 1)$$
• Size parameter $$v \in (0, \infty)$$

#### Density Function

$f(y | m, v) = \frac{1}{B(m v, (1 - m) v)} y^{m v - 1} (1 - y)^{(1 - m) v - 1}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= \frac{m (1 - m)}{v + 1} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, v) &= \frac{v}{1 - m} (\psi_0(v) - \psi_0(m v) + \ln(y)) \\ \nabla_{v} (y; m, v) &= \psi_0(v) - \psi_0(v - m v) + \ln(1 - y) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, v) &= \frac{v^2}{(1 - m)^2} (\psi_1(m v) - \psi_1(v)) \\ \mathcal{I}_{m, v} (m, v) &= \frac{v}{m - 1} \psi_1(v) \\ \mathcal{I}_{v, v} (m, v) &= \psi_1(v - m v) - \psi_1(v) \\ \end{aligned}

### Mean-Variance Parametrization

#### Parameters

• Mean parameter $$m \in (0, 1)$$
• Variance parameter $$s \in (0, m (1 - m))$$

#### Density Function

$f(y | m, s) = \frac{1}{B \left( m \left( \frac{m - m^2}{s} - 1 \right), (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right)} y^{m \left( \frac{m - m^2}{s} - 1 \right) - 1} (1 - y)^{(1 - m) \left( \frac{m - m^2}{s} - 1 \right) - 1}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned}
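
The density above is the concentration parametrization with $$a_1 = m \left( \frac{m - m^2}{s} - 1 \right)$$ and $$a_2 = (1 - m) \left( \frac{m - m^2}{s} - 1 \right)$$. A minimal Python sketch of this mapping, with hypothetical function names, confirms that the implied mean and variance are indeed $$m$$ and $$s$$:

```python
def beta_concentrations(m, s):
    """Map the mean-variance parameters (m, s) to the concentrations (a1, a2)."""
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie in (0, 1)")
    if not 0.0 < s < m * (1.0 - m):
        raise ValueError("s must lie in (0, m (1 - m))")
    size = (m - m * m) / s - 1.0
    return m * size, (1.0 - m) * size


def beta_moments(a1, a2):
    """Mean and variance of the beta distribution in the concentration parametrization."""
    total = a1 + a2
    return a1 / total, a1 * a2 / (total ** 2 * (total + 1.0))


# Illustrative values (assumed for this example only).
a1, a2 = beta_concentrations(0.3, 0.02)
mean, variance = beta_moments(a1, a2)
print(a1, a2, mean, variance)
```

The upper bound $$m (1 - m)$$ on the variance parameter is what keeps both concentrations positive.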

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{m^2 - m + s}{(m - 1) s} \left( \psi_0 \left( \frac{m - m^2}{s} - 1 \right) - \psi_0 \left( m \left( \frac{m - m^2}{s} - 1 \right) \right) + \ln(y) \right) \\ \nabla_{s} (y; m, s) &= \frac{s^2 (3 m^2 - 2 m + s)}{m (m - 1) (m^2 - m + s)} \left( \psi_0 \left( \frac{m - m^2}{s} - 1 \right) - \psi_0 \left( (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right) + \ln(1 - y) \right) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{(m^2 - m + s)^2}{(m - 1)^2 s^2} \left( \psi_1 \left( m \left( \frac{m - m^2}{s} - 1 \right) \right) - \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \right) \\ \mathcal{I}_{m, s} (m, s) &= \frac{s (2 m - 3 m^2 - s)}{m (m^2 - 2 m + 1)} \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \\ \mathcal{I}_{s, s} (m, s) &= \frac{s^2 (3 m^2 - 2 m + s)^2}{(m - 1)^4 m^2} \left( \psi_1 \left( (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right) - \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \right) \\ \end{aligned}

# Compositional Data

## Dirichlet Distribution

### Concentration Parametrization

#### Parameters

• Concentration parameters $$a_i \in (0, \infty)$$, $$i = 1,\ldots,n$$

#### Vector Notation

• Concentration vector $$\boldsymbol{a}$$ of length $$n$$

#### Density Function

$f(\boldsymbol{y} | \boldsymbol{a}) = \frac{1}{B(\boldsymbol{a})} \prod_{i=1}^n y_i^{a_i - 1}$

#### Moments

\begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n a_i} \boldsymbol{a} \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{1}{1 + \sum_{i=1}^n a_i} \left( \frac{1}{\sum_{i=1}^n a_i} \mathrm{diag}(\boldsymbol{a}) - \frac{1}{\left( \sum_{i=1}^n a_i \right)^2} \boldsymbol{a} \boldsymbol{a}^\intercal \right) \\ \end{aligned}

#### Score

$\nabla_{\boldsymbol{a}} (\boldsymbol{y}; \boldsymbol{a}) = \ln(\boldsymbol{y}) - \psi_0 (\boldsymbol{a}) + \psi_0 \left( \sum_{i=1}^n a_i \right) \boldsymbol{1}_n$

#### Fisher Information

$\mathcal{I}_{\boldsymbol{a}, \boldsymbol{a}} (\boldsymbol{a}) = \mathrm{diag} \left( \psi_1 \left( \boldsymbol{a} \right) \right) - \psi_1 \left( \sum_{i=1}^n a_i \right) \boldsymbol{1}_{n \times n}$
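
The Dirichlet score can be checked against a finite-difference derivative of the log-density. The Python sketch below assumes a simple digamma approximation (recurrence plus asymptotic series); the function names, the observation, and the concentration values are illustrative and not part of the original text.

```python
import math


def digamma(x):
    """Approximate psi_0(x) for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:  # shift the argument upward: psi_0(x) = psi_0(x + 1) - 1 / x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_log_density(y, a):
    """Log-density of the Dirichlet distribution at a point y on the simplex."""
    log_beta = sum(math.lgamma(ai) for ai in a) - math.lgamma(sum(a))
    return -log_beta + sum((ai - 1.0) * math.log(yi) for yi, ai in zip(y, a))


def dirichlet_score(y, a):
    """Score vector: ln(y) - psi_0(a) + psi_0(sum(a)), element by element."""
    common = digamma(sum(a))
    return [math.log(yi) - digamma(ai) + common for yi, ai in zip(y, a)]


def finite_difference_score(y, a, h=1e-5):
    """Central finite differences of the log-density with respect to each a_i."""
    grads = []
    for i in range(len(a)):
        up = list(a)
        down = list(a)
        up[i] += h
        down[i] -= h
        grads.append((dirichlet_log_density(y, up) - dirichlet_log_density(y, down)) / (2.0 * h))
    return grads


# Illustrative observation and concentrations (assumed for this example only).
y = [0.2, 0.3, 0.5]
a = [2.0, 3.5, 1.2]
analytic = dirichlet_score(y, a)
numeric = finite_difference_score(y, a)
print(analytic)
print(numeric)
```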

• Calvori, F., Cipollini, F., and Gallo, G. M. (2013). Go with the Flow: A GAS Model For Predicting Intra-Daily Volume Shares. SSRN, 2363483. doi: 10.2139/ssrn.2363483.

# Duration Data

## Exponential Distribution

### Rate Parametrization

#### Parameter

• Rate parameter $$r \in (0, \infty)$$

#### Density Function

$f(y | r) = r \exp \left( -r y \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \\ \mathrm{var}[Y] &= \frac{1}{r^2} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r} (y; r) &= \frac{1}{r} - y \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r, r} (r) &= \frac{1}{r^2} \\ \end{aligned}

### Scale Parametrization

#### Parameter

• Scale parameter $$s \in (0, \infty)$$

#### Density Function

$f(y | s) = \frac{1}{s} \exp \left( - \frac{y}{s} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= s \\ \mathrm{var}[Y] &= s^2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{s} (y; s) &= \frac{y - s}{s^2} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{s, s} (s) &= \frac{1}{s^2} \\ \end{aligned}
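
Two defining properties of the score can be confirmed numerically for the scale parametrization: its expectation is zero and its second moment equals the Fisher information. The Python sketch below uses a plain trapezoid rule on a truncated range; the function names, the truncation at $$60 s$$, and the value of $$s$$ are assumptions for illustration.

```python
import math


def exp_pdf(y, s):
    """Density of the exponential distribution with scale s."""
    return math.exp(-y / s) / s


def exp_score(y, s):
    """Score with respect to the scale parameter s."""
    return (y - s) / (s * s)


def expectation(g, s, n=200_000, upper_multiple=60.0):
    """Trapezoid approximation of E[g(Y)] on [0, upper_multiple * s]."""
    upper = upper_multiple * s
    h = upper / n
    total = 0.5 * (g(0.0) * exp_pdf(0.0, s) + g(upper) * exp_pdf(upper, s))
    for i in range(1, n):
        y = i * h
        total += g(y) * exp_pdf(y, s)
    return total * h


s = 2.0  # illustrative scale (assumed for this example only)
mean_score = expectation(lambda y: exp_score(y, s), s)
fisher_numeric = expectation(lambda y: exp_score(y, s) ** 2, s)
fisher_formula = 1.0 / (s * s)
print(mean_score, fisher_numeric, fisher_formula)
```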

• Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

## Gamma Distribution

### Rate Parametrization

#### Parameters

• Rate parameter $$r \in (0, \infty)$$
• Shape parameter $$a \in (0, \infty)$$

#### Density Function

$f(y | r, a) = \frac{r}{\Gamma(a)} (r y)^{a - 1} \exp \left( -r y \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{a}{r} \\ \mathrm{var}[Y] &= \frac{a}{r^2} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r} (y; r, a) &= \frac{a - r y}{r} \\ \nabla_{a} (y; r, a) &= \ln(r y) - \psi_0(a) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r, r} (r, a) &= \frac{a}{r^2} \\ \mathcal{I}_{r, a} (r, a) &= - \frac{1}{r} \\ \mathcal{I}_{a, a} (r, a) &= \psi_1(a) \\ \end{aligned}

### Scale Parametrization

#### Parameters

• Scale parameter $$s \in (0, \infty)$$
• Shape parameter $$a \in (0, \infty)$$

#### Density Function

$f(y | s, a) = \frac{1}{\Gamma(a)} \frac{1}{s} \left( \frac{y}{s} \right)^{a - 1} \exp \left( - \frac{y}{s} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= a s \\ \mathrm{var}[Y] &= a s^2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{s} (y; s, a) &= \frac{y - a s}{s^2} \\ \nabla_{a} (y; s, a) &= \ln \left( \frac{y}{s} \right) - \psi_0(a) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{s, s} (s, a) &= \frac{a}{s^2} \\ \mathcal{I}_{s, a} (s, a) &= \frac{1}{s} \\ \mathcal{I}_{a, a} (s, a) &= \psi_1(a) \\ \end{aligned}
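
The rate and scale parametrizations are linked by $$r = 1 / s$$, so the scores and Fisher information entries transform by the chain rule with $$\mathrm{d} r / \mathrm{d} s = -1 / s^2$$. This explains the sign change of the cross term between the two parametrizations. A short Python sketch, with hypothetical function names and illustrative values, checks the relation:

```python
import math


def score_rate(y, r, a):
    """Score with respect to the rate r."""
    return (a - r * y) / r


def score_scale(y, s, a):
    """Score with respect to the scale s."""
    return (y - a * s) / (s * s)


def fisher_rate(r, a):
    """Fisher information entries involving the rate r."""
    return {"rr": a / (r * r), "ra": -1.0 / r}


def fisher_scale(s, a):
    """Fisher information entries involving the scale s."""
    return {"ss": a / (s * s), "sa": 1.0 / s}


# Illustrative values (assumed for this example only).
y, s, a = 4.0, 1.5, 2.0
r = 1.0 / s
jacobian = -1.0 / (s * s)  # dr/ds

score_ok = math.isclose(score_scale(y, s, a), score_rate(y, r, a) * jacobian)
diag_ok = math.isclose(fisher_scale(s, a)["ss"], fisher_rate(r, a)["rr"] * jacobian ** 2)
cross_ok = math.isclose(fisher_scale(s, a)["sa"], fisher_rate(r, a)["ra"] * jacobian)
print(score_ok, diag_ok, cross_ok)
```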

• Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

## Generalized Gamma Distribution

### Rate Parametrization

#### Parameters

• Rate parameter $$r \in (0, \infty)$$
• First shape parameter $$a \in (0, \infty)$$
• Second shape parameter $$b \in (0, \infty)$$

#### Density Function

$f(y | r, a, b) = \frac{r b}{\Gamma(a)} (r y)^{a b - 1} \exp \left( -(r y)^b \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \\ \mathrm{var}[Y] &= \frac{1}{r^2} \left( \frac{\Gamma \left(a + 2 b^{-1} \right)}{\Gamma \left( a \right) } - \left( \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \right)^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r} (y; r, a, b) &= \frac{b}{r} \left( a - (r y)^b \right) \\ \nabla_{a} (y; r, a, b) &= b \ln(r y) - \psi_0(a) \\ \nabla_{b} (y; r, a, b) &= \left( a - (r y)^b \right) \ln (r y) + \frac{1}{b} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r, r} (r, a, b) &= \frac{a b^2}{r^2} \\ \mathcal{I}_{r, a} (r, a, b) &= - \frac{b}{r} \\ \mathcal{I}_{r, b} (r, a, b) &= \frac{a \psi_0(a) + 1}{r} \\ \mathcal{I}_{a, a} (r, a, b) &= \psi_1(a) \\ \mathcal{I}_{a, b} (r, a, b) &= - \frac{\psi_0(a)}{b} \\ \mathcal{I}_{b, b} (r, a, b) &= \frac{a \psi_0(a)^2 + 2 \psi_0(a) + a \psi_1(a) + 1}{b^2} \\ \end{aligned}

### Scale Parametrization

#### Parameters

• Scale parameter $$s \in (0, \infty)$$
• First shape parameter $$a \in (0, \infty)$$
• Second shape parameter $$b \in (0, \infty)$$

#### Density Function

$f(y | s, a, b) = \frac{1}{\Gamma(a)} \frac{b}{s} \left( \frac{y}{s} \right)^{a b - 1} \exp \left( - \left( \frac{y}{s} \right)^b \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= s \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \\ \mathrm{var}[Y] &= s^2 \left( \frac{\Gamma \left(a + 2 b^{-1} \right)}{\Gamma \left( a \right) } - \left( \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \right)^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{s} (y; s, a, b) &= \frac{b}{s} \left( \left( \frac{y}{s} \right)^b - a \right) \\ \nabla_{a} (y; s, a, b) &= b \ln \left( \frac{y}{s} \right) - \psi_0(a) \\ \nabla_{b} (y; s, a, b) &= \left( a - \left( \frac{y}{s} \right)^b \right) \ln \left( \frac{y}{s} \right) + \frac{1}{b} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{s, s} (s, a, b) &= \frac{a b^2}{s^2} \\ \mathcal{I}_{s, a} (s, a, b) &= \frac{b}{s} \\ \mathcal{I}_{s, b} (s, a, b) &= - \frac{a \psi_0(a) + 1}{s} \\ \mathcal{I}_{a, a} (s, a, b) &= \psi_1(a) \\ \mathcal{I}_{a, b} (s, a, b) &= - \frac{\psi_0(a)}{b} \\ \mathcal{I}_{b, b} (s, a, b) &= \frac{a \psi_0(a)^2 + 2 \psi_0(a) + a \psi_1(a) + 1}{b^2} \\ \end{aligned}
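
The generalized gamma distribution nests the other duration distributions listed here: setting $$b = 1$$ gives the gamma distribution, $$a = 1$$ gives the Weibull distribution, and $$a = b = 1$$ gives the exponential distribution. The Python sketch below compares the scale-parametrized densities at a few points; the function names and test values are illustrative assumptions.

```python
import math


def gen_gamma_pdf(y, s, a, b):
    """Generalized gamma density, scale parametrization."""
    z = y / s
    return b / (math.gamma(a) * s) * z ** (a * b - 1.0) * math.exp(-(z ** b))


def gamma_pdf(y, s, a):
    """Gamma density, scale parametrization."""
    z = y / s
    return z ** (a - 1.0) * math.exp(-z) / (math.gamma(a) * s)


def weibull_pdf(y, s, b):
    """Weibull density, scale parametrization."""
    z = y / s
    return b / s * z ** (b - 1.0) * math.exp(-(z ** b))


def exponential_pdf(y, s):
    """Exponential density, scale parametrization."""
    return math.exp(-y / s) / s


# Illustrative evaluation points and parameters (assumed for this example only).
points = [0.1, 0.8, 2.5, 7.0]
s, a, b = 1.3, 2.4, 0.7

gamma_case = all(math.isclose(gen_gamma_pdf(y, s, a, 1.0), gamma_pdf(y, s, a)) for y in points)
weibull_case = all(math.isclose(gen_gamma_pdf(y, s, 1.0, b), weibull_pdf(y, s, b)) for y in points)
exponential_case = all(math.isclose(gen_gamma_pdf(y, s, 1.0, 1.0), exponential_pdf(y, s)) for y in points)
print(gamma_case, weibull_case, exponential_case)
```

The same reductions apply to the score and Fisher information formulas, for example by substituting $$a = 1$$ into the entries above to obtain the Weibull case.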

• Park, T. R. (2014). Derivation of the Fisher Information Matrix for 4-Parameter Generalized Gamma Distribution Using Mathematica. Journal of the Chosun Natural Science, 7(2), 138–144. doi: 10.13160/ricns.2014.7.2.138.

• Stacy, E. W. (1962). A Generalization of the Gamma Distribution. The Annals of Mathematical Statistics, 33(3), 1187–1192. doi: 10.1214/aoms/1177704481.

• Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

## Weibull Distribution

### Rate Parametrization

#### Parameters

• Rate parameter $$r \in (0, \infty)$$
• Shape parameter $$b \in (0, \infty)$$

#### Density Function

$f(y | r, b) = r b (r y)^{b - 1} \exp \left( -(r y)^b \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \Gamma \left(1 + b^{-1} \right) \\ \mathrm{var}[Y] &= \frac{1}{r^2} \left( \Gamma \left(1 + 2 b^{-1} \right) - \Gamma \left(1 + b^{-1} \right)^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{r} (y; r, b) &= \frac{b}{r} \left( 1 - (r y)^b \right) \\ \nabla_{b} (y; r, b) &= \left( 1 - (r y)^b \right) \ln (r y) + \frac{1}{b} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{r, r} (r, b) &= \frac{b^2}{r^2} \\ \mathcal{I}_{r, b} (r, b) &= \frac{\psi_0(1) + 1}{r} \\ \mathcal{I}_{b, b} (r, b) &= \frac{\psi_0(1)^2 + 2 \psi_0(1) + \psi_1(1) + 1}{b^2} \\ \end{aligned}

### Scale Parametrization

#### Parameters

• Scale parameter $$s \in (0, \infty)$$
• Shape parameter $$b \in (0, \infty)$$

#### Density Function

$f(y | s, b) = \frac{b}{s} \left( \frac{y}{s} \right)^{b - 1} \exp \left( - \left( \frac{y}{s} \right)^b \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= s \Gamma \left(1 + b^{-1} \right) \\ \mathrm{var}[Y] &= s^2 \left( \Gamma \left(1 + 2 b^{-1} \right) - \Gamma \left(1 + b^{-1} \right)^2 \right) \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{s} (y; s, b) &= \frac{b}{s} \left( \left( \frac{y}{s} \right)^b - 1 \right) \\ \nabla_{b} (y; s, b) &= \left( 1 - \left( \frac{y}{s} \right)^b \right) \ln \left( \frac{y}{s} \right) + \frac{1}{b} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{s, s} (s, b) &= \frac{b^2}{s^2} \\ \mathcal{I}_{s, b} (s, b) &= - \frac{\psi_0(1) + 1}{s} \\ \mathcal{I}_{b, b} (s, b) &= \frac{\psi_0(1)^2 + 2 \psi_0(1) + \psi_1(1) + 1}{b^2} \\ \end{aligned}
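
The constants in the Weibull Fisher information are $$\psi_0(1) = -\gamma$$, with $$\gamma$$ the Euler-Mascheroni constant, and $$\psi_1(1) = \pi^2 / 6$$. Since $$(Y / s)^b$$ follows a unit exponential distribution, the entries can be checked by one-dimensional numerical integration. The Python sketch below does this for the scale parametrization; the function names, the integration grid, and the parameter values are assumptions for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
PSI0_1 = -EULER_GAMMA             # digamma function at 1
PSI1_1 = math.pi ** 2 / 6.0       # trigamma function at 1


def weibull_fisher_scale(s, b):
    """Closed-form Fisher information entries (ss, sb, bb), scale parametrization."""
    i_ss = b * b / (s * s)
    i_sb = -(PSI0_1 + 1.0) / s
    i_bb = (PSI0_1 ** 2 + 2.0 * PSI0_1 + PSI1_1 + 1.0) / (b * b)
    return i_ss, i_sb, i_bb


def expectation_unit_exponential(g, lower=-40.0, upper=6.0, step=1e-3):
    """E[g(Z)] for Z ~ Exp(1), using z = exp(u) and the trapezoid rule in u."""
    n = int(round((upper - lower) / step))
    total = 0.0
    for i in range(n + 1):
        u = lower + i * step
        z = math.exp(u)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * g(z) * math.exp(u - z)  # density of Z times dz/du
    return total * step


# Illustrative parameter values (assumed for this example only).
s, b = 2.0, 1.7


def score_s(z):
    """Score for s written in terms of z = (y / s)^b."""
    return b / s * (z - 1.0)


def score_b(z):
    """Score for b written in terms of z = (y / s)^b, using ln(y / s) = ln(z) / b."""
    return (1.0 + (1.0 - z) * math.log(z)) / b


numeric = (
    expectation_unit_exponential(lambda z: score_s(z) ** 2),
    expectation_unit_exponential(lambda z: score_s(z) * score_b(z)),
    expectation_unit_exponential(lambda z: score_b(z) ** 2),
)
formula = weibull_fisher_scale(s, b)
print(numeric)
print(formula)
```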

• Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

# Real Data

## Laplace Distribution

### Mean-Scale Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Scale parameter $$s \in (0, \infty)$$

#### Density Function

$f(y | m, s) = \frac{1}{2s} \exp \left( - \frac{\lvert y - m \rvert}{s} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= 2s^2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{\mathrm{sgn}(y - m)}{s} \\ \nabla_{s} (y; m, s) &= \frac{\lvert y - m \rvert}{s^2} - \frac{1}{s} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s^2} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{s^2} \\ \end{aligned}

## Normal Distribution

### Mean-Variance Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Variance parameter $$s \in (0, \infty)$$

#### Density Function

$f(y | m, s) = \frac{1}{\sqrt{2 \pi s}} \exp \left( -\frac{(y - m)^2}{2 s} \right)$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s) &= \frac{y - m}{s} \\ \nabla_{s} (y; m, s) &= \frac{(y - m)^2 - s}{2 s^2} \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{2 s^2} \\ \end{aligned}

## Student’s t Distribution

### Mean-Variance Parametrization

#### Parameters

• Mean parameter $$m \in \mathbb{R}$$
• Variance parameter $$s \in (0, \infty)$$
• Degrees of freedom parameter $$v \in (0, \infty)$$

#### Density Function

$f(y | m, s, v) = \frac{\Gamma \left( \frac{v + 1}{2} \right)}{\Gamma \left( \frac{v}{2} \right) \sqrt{\pi s v}} \left( 1 + \frac{(y - m)^2}{s v} \right)^{-\frac{v + 1}{2}}$

#### Moments

\begin{aligned} \mathrm{E}[Y] &= m, & \quad \text{for } v &> 1 \\ \mathrm{var}[Y] &= \frac{v}{v - 2} s, & \quad \text{for } v &> 2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{m} (y; m, s, v) &= \frac{(v + 1) (y - m) }{(y - m)^2 + s v} \\ \nabla_{s} (y; m, s, v) &= \frac{v}{2s} \frac{(y - m)^2 - s}{(y - m)^2 + s v} \\ \nabla_{v} (y; m, s, v) &= \frac{1}{2} \frac{(y - m)^2 - s}{(y - m)^2 + s v} - \frac{1}{2} \ln \left(1 + \frac{1}{v} \frac{(y - m)^2}{s} \right) - \frac{1}{2} \psi_0 \left( \frac{v}{2} \right) + \frac{1}{2} \psi_0 \left( \frac{v + 1}{2} \right) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{m, m} (m, s, v) &= \frac{v + 1}{s (v + 3)} \\ \mathcal{I}_{m, s} (m, s, v) &= 0 \\ \mathcal{I}_{m, v} (m, s, v) &= 0 \\ \mathcal{I}_{s, s} (m, s, v) &= \frac{v}{2 s^2 (v + 3)} \\ \mathcal{I}_{s, v} (m, s, v) &= \frac{-1}{s (v + 1) (v + 3)} \\ \mathcal{I}_{v, v} (m, s, v) &= - \frac{1}{2} \frac{v + 5}{v (v + 1) (v + 3)} + \frac{1}{4} \psi_1 \left( \frac{v}{2} \right) - \frac{1}{4} \psi_1 \left( \frac{v + 1}{2} \right) \\ \end{aligned}
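
Unlike the normal score, the Student's t score for the mean is bounded in $$y$$: it attains its largest magnitude $$\frac{v + 1}{2 \sqrt{s v}}$$ at $$(y - m)^2 = s v$$ and then decays, so extreme observations are downweighted. For large $$v$$ it approaches the normal score. A small Python sketch with hypothetical function names and illustrative values demonstrates both properties:

```python
import math


def t_score(y, m, s, v):
    """Scores of the Student's t distribution with respect to m and s."""
    e = y - m
    denom = e * e + s * v
    score_m = (v + 1.0) * e / denom
    score_s = v / (2.0 * s) * (e * e - s) / denom
    return score_m, score_s


def normal_score(y, m, s):
    """Scores of the normal distribution with respect to m and s."""
    e = y - m
    return e / s, (e * e - s) / (2.0 * s * s)


# Illustrative parameter values (assumed for this example only).
m, s, v = 0.0, 1.0, 5.0

# An extreme observation barely moves the t score but dominates the normal score.
outlier_t = t_score(1000.0, m, s, v)[0]
outlier_normal = normal_score(1000.0, m, s)[0]

# Largest magnitude of the mean score, attained at (y - m)^2 = s v.
peak = t_score(m + math.sqrt(s * v), m, s, v)[0]
peak_formula = (v + 1.0) / (2.0 * math.sqrt(s * v))

# With very many degrees of freedom the t scores approach the normal scores.
near_normal = t_score(1.5, m, s, 1e8)
reference = normal_score(1.5, m, s)
print(outlier_t, outlier_normal, peak, peak_formula, near_normal, reference)
```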

• Blazsek, S. and Villatoro, M. (2015). Is Beta-t-EGARCH(1,1) Superior to GARCH(1,1)? Applied Economics, 47(17), 1764–1774. doi: 10.1080/00036846.2014.1000536.

• Harvey, A. C. and Chakravarty, T. (2008). Beta-t-(E)GARCH. Cambridge Working Papers in Economics, CWPE 0840. doi: 10.17863/cam.5286.

• Harvey, A. C. and Lange, R. J. (2018). Modeling the Interactions Between Volatility and Returns using EGARCH-M. Journal of Time Series Analysis, 39(6), 909–919. doi: 10.1111/jtsa.12419.

• Lange, K. L., Little, R. J. A., and Taylor, J. M. G. (1989). Robust Statistical Modeling Using the t Distribution. Journal of the American Statistical Association, 84(408), 881–896. doi: 10.1080/01621459.1989.10478852.

# Multivariate Real Data

## Multivariate Normal Distribution

### Mean-Variance Parametrization

#### Parameters

• Mean parameters $$m_i \in \mathbb{R}, i = 1, \ldots, n$$
• Variance parameters $$s_i \in (0, \infty), i = 1, \ldots, n$$
• Covariance parameters $$c_{ij} \in \mathbb{R}, i = 2, \ldots, n, j = 1, \ldots, i - 1$$

#### Vector and Matrix Notation

• Mean vector $$\boldsymbol{m}$$ of length $$n$$
• Variance-covariance matrix $$\boldsymbol{K}$$ of size $$n \times n$$

#### Density Function

$f(\boldsymbol{y} | \boldsymbol{m}, \boldsymbol{K}) = \frac{1}{\sqrt{(2 \pi)^n | \boldsymbol{K}|}} \exp \left( - \frac{1}{2} (\boldsymbol{y} - \boldsymbol{m})^\intercal \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m}) \right)$

#### Moments

\begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \boldsymbol{m} \\ \mathrm{var}[\boldsymbol{Y}] &= \boldsymbol{K} \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{\boldsymbol{m}} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \\ \nabla_{\mathrm{vec}(\boldsymbol{K})} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}) &= \mathrm{vec} \left( \frac{1}{2} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} - \frac{1}{2} \boldsymbol{K}^{-1} \right) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{\boldsymbol{m}, \boldsymbol{m}} (\boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{K}^{-1} \\ \mathcal{I}_{\boldsymbol{m}, \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{0} \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}) &= \frac{1}{4} \boldsymbol{K}^{-1} \otimes \boldsymbol{K}^{-1} + \frac{1}{4} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \mathrm{vec}\left(\boldsymbol{K}^{-1} \right)^\intercal \\ \end{aligned}

## Multivariate Student’s t Distribution

### Mean-Variance Parametrization

#### Parameters

• Mean parameters $$m_i \in \mathbb{R}, i = 1, \ldots, n$$
• Variance parameters $$s_i \in (0, \infty), i = 1, \ldots, n$$
• Covariance parameters $$c_{ij} \in \mathbb{R}, i = 2, \ldots, n, j = 1, \ldots, i - 1$$
• Degrees of freedom parameter $$v \in (0, \infty)$$

#### Vector and Matrix Notation

• Mean vector $$\boldsymbol{m}$$ of length $$n$$
• Variance-covariance matrix $$\boldsymbol{K}$$ of size $$n \times n$$

#### Density Function

$f(\boldsymbol{y} | \boldsymbol{m}, \boldsymbol{K}, v) = \frac{\Gamma \left( \frac{v + n}{2} \right)}{\Gamma \left( \frac{v}{2} \right) \sqrt{(v \pi)^n | \boldsymbol{K}|}} \left( 1 + \frac{1}{v} (\boldsymbol{y} - \boldsymbol{m})^\intercal \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m}) \right)^{-\frac{v + n}{2}}$

#### Moments

\begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \boldsymbol{m}, & \quad \text{for } v &> 1 \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{v}{v - 2} \boldsymbol{K}, & \quad \text{for } v &> 2 \\ \end{aligned}

#### Score

\begin{aligned} \nabla_{\boldsymbol{m}} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \frac{v + n}{v + \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right)} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \\ \nabla_{\mathrm{vec}(\boldsymbol{K})} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \mathrm{vec} \left( \frac{1}{2} \frac{v + n}{v + \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right)} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} - \frac{1}{2} \boldsymbol{K}^{-1} \right) \\ \nabla_{v} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \frac{1}{2} \frac{ \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) - n }{ \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) + v} - \frac{1}{2} \ln \left( 1 + \frac{1}{v} \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \right) \\ & \qquad - \frac{1}{2} \psi_0 \left( \frac{v}{2} \right) + \frac{1}{2} \psi_0 \left( \frac{v + n}{2} \right) \\ \end{aligned}

#### Fisher Information

\begin{aligned} \mathcal{I}_{\boldsymbol{m}, \boldsymbol{m}} (\boldsymbol{m}, \boldsymbol{K}, v) &= \frac{v + n}{v + n + 2} \boldsymbol{K}^{-1} \\ \mathcal{I}_{\boldsymbol{m}, \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}, v) &= \boldsymbol{0} \\ \mathcal{I}_{\boldsymbol{m}, v} (\boldsymbol{m}, \boldsymbol{K}, v) &= \boldsymbol{0} \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}, v) &= \frac{1}{4} \frac{v + n}{v + n + 2} \boldsymbol{K}^{-1} \otimes \boldsymbol{K}^{-1} + \frac{1}{4} \frac{v + n - 2}{v + n + 2} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \mathrm{vec}\left(\boldsymbol{K}^{-1} \right)^\intercal \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), v} (\boldsymbol{m}, \boldsymbol{K}, v) &= - \frac{1}{(v + n + 2)(v + n)} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \\ \mathcal{I}_{v, v} (\boldsymbol{m}, \boldsymbol{K}, v) &= - \frac{1}{2} \frac{n (v + n + 4)}{v (v + n + 2)(v + n)} + \frac{1}{4} \psi_1 \left( \frac{v}{2} \right) - \frac{1}{4} \psi_1 \left( \frac{v + n}{2} \right) \\ \end{aligned}
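
The score for the mean vector is the multivariate normal score $$\boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m})$$ multiplied by the scalar weight $$\frac{v + n}{v + (\boldsymbol{y} - \boldsymbol{m})^\intercal \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m})}$$, which shrinks for observations far from the mean and tends to one as $$v$$ grows. The Python sketch below illustrates this for $$n = 2$$ with an explicit matrix inverse; the function names and numerical values are assumptions for illustration.

```python
def inverse_2x2(K):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mvt_mean_score(y, m, K, v):
    """Mean score of the bivariate Student's t distribution.

    Returns the score vector, the scalar weight, and the squared Mahalanobis distance.
    """
    n = len(y)
    e = [yi - mi for yi, mi in zip(y, m)]
    K_inv = inverse_2x2(K)
    K_inv_e = [
        K_inv[0][0] * e[0] + K_inv[0][1] * e[1],
        K_inv[1][0] * e[0] + K_inv[1][1] * e[1],
    ]
    distance = e[0] * K_inv_e[0] + e[1] * K_inv_e[1]
    weight = (v + n) / (v + distance)
    return [weight * x for x in K_inv_e], weight, distance


# Illustrative values (assumed for this example only).
m = [0.0, 0.0]
K = [[2.0, 0.5], [0.5, 1.0]]

score, weight, distance = mvt_mean_score([1.0, -1.0], m, K, v=5.0)
_, outlier_weight, _ = mvt_mean_score([30.0, -30.0], m, K, v=5.0)
_, normal_limit_weight, _ = mvt_mean_score([1.0, -1.0], m, K, v=1e9)
print(score, weight, distance, outlier_weight, normal_limit_weight)
```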