Probability Distributions

Vladimír Holý

2024-02-02

Binary Data

Bernoulli Distribution

Probabilistic Parametrization

Parameter

  • Probability parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | p] &= \begin{cases} 1 - p & \text{ for } y = 0 \\ p & \text{ for } y = 1 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= p \\ \mathrm{var}[Y] &= p (1 - p) \\ \end{aligned} \]

Score

\[ \nabla_{p} (y; p) = \begin{cases} \frac{1}{p - 1} & \text{ for } y = 0 \\ \frac{1}{p} & \text{ for } y = 1 \\ \end{cases} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{p, p} (p) &= \frac{1}{p (1 - p)} \\ \end{aligned} \]
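
The score and the Fisher information above can be checked against each other by taking expectations over the two possible outcomes. The following Python lines are a minimal sketch of such a check; the function names and the value \(p = 0.3\) are illustrative assumptions, not part of any package.

```python
def bernoulli_pmf(y, p):
    """P[Y = y | p] for y in {0, 1}."""
    return p if y == 1 else 1.0 - p


def bernoulli_score(y, p):
    """Derivative of the log-probability with respect to p."""
    return 1.0 / p if y == 1 else 1.0 / (p - 1.0)


def bernoulli_fisher(p):
    """Closed-form Fisher information for p."""
    return 1.0 / (p * (1.0 - p))


p = 0.3  # illustrative value, chosen only for this sketch

# Expectations over the two possible outcomes y = 0 and y = 1.
mean_score = sum(bernoulli_pmf(y, p) * bernoulli_score(y, p) for y in (0, 1))
fisher_from_score = sum(
    bernoulli_pmf(y, p) * bernoulli_score(y, p) ** 2 for y in (0, 1)
)

print(mean_score)           # zero up to rounding
print(fisher_from_score)    # expected squared score
print(bernoulli_fisher(p))  # closed form, should agree with the line above
```

The expected score is zero and the expected squared score coincides with \(1 / (p (1 - p))\), as the formulas require.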

Categorical Data

Categorical Distribution

Worth Parametrization

Parameters

  • Worth parameters \(w_i \in (0, \infty), i = 1, \ldots, n\)

Vector Notation

  • Worth vector \(\boldsymbol{w}\) of length \(n\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [\boldsymbol{Y} = \boldsymbol{y} | \boldsymbol{w}] &= \frac{1}{\sum_{i=1}^n w_i} \prod_{i=1}^n w_i^{y_i} \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n w_i} \boldsymbol{w} \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n w_i} \mathrm{diag} (\boldsymbol{w}) - \frac{1}{\left( \sum_{i=1}^n w_i \right)^2} \boldsymbol{w} \boldsymbol{w}^\intercal \\ \end{aligned} \]

Score

\[ \nabla_{\boldsymbol{w}} (\boldsymbol{y}; \boldsymbol{w}) = \boldsymbol{y} \oslash \boldsymbol{w} - \frac{1}{\sum_{i=1}^n w_i} \boldsymbol{1}_n \]

Fisher Information

\[ \mathcal{I}_{\boldsymbol{w}, \boldsymbol{w}} (\boldsymbol{w}) = \frac{1}{\sum_{i=1}^n w_i} \mathrm{diag} \left( \boldsymbol{1}_n \oslash \boldsymbol{w} \right) - \frac{1}{\left( \sum_{i=1}^n w_i \right)^2} \boldsymbol{1}_{n \times n} \]

Notes

  • We treat the categorical distribution as a multivariate distribution. For \(n\) categories, observations are in the form of vectors of length \(n\) with exactly one element equal to 1 and the others to 0.

  • The probability mass function is invariant to multiplying all worth parameters by the same positive constant. Under the logarithmic transformation, it is invariant to adding the same constant to all transformed worth parameters. The parameters therefore need to be standardized, e.g. to zero sum in the latter case.
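
The invariance described in the notes can be illustrated numerically. The sketch below is written in Python with hypothetical helper names (`categorical_pmf`, `standardize_log_worths`) and arbitrary worth values; it only mirrors the formulas of this section and makes no claim about how any package implements them.

```python
import math


def categorical_pmf(y, w):
    """P[Y = y | w] for a one-hot vector y and positive worths w."""
    prob = 1.0 / sum(w)
    for y_i, w_i in zip(y, w):
        prob *= w_i ** y_i
    return prob


def standardize_log_worths(w):
    """Shift the log-worths so that they sum to zero (one possible standardization)."""
    logs = [math.log(w_i) for w_i in w]
    shift = sum(logs) / len(logs)
    return [log_w - shift for log_w in logs]


w = [2.0, 1.0, 0.5]                 # illustrative worths
y = [0, 1, 0]                       # observation: second category
scaled = [10.0 * w_i for w_i in w]  # same worths multiplied by a constant

p_original = categorical_pmf(y, w)
p_scaled = categorical_pmf(y, scaled)
log_w = standardize_log_worths(w)
log_scaled = standardize_log_worths(scaled)

print(p_original, p_scaled)  # identical probabilities
print(log_w)                 # zero-sum log-worths
print(log_scaled)            # the same vector, up to rounding
```

Both worth vectors give the same probability, and after the zero-sum standardization of the log-worths they map to the same parameter vector.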

Ranking Data

Plackett–Luce Distribution

Worth Parametrization

Parameters

  • Worth parameters \(w_i \in (0, \infty), i = 1, \ldots, n\)

Ranking Notation

  • Worth parameters by rank \(w_{j^{\mathrm{th}}}, j = 1, \ldots, n\)

Probability Mass Function

\[ \mathrm{P} [\boldsymbol{Y} = \boldsymbol{y} | w_1, \ldots, w_n] = \prod_{j=1}^n \frac{w_{j^{\mathrm{th}}}}{\sum_{k=j}^n w_{k^{\mathrm{th}}}} \]

Score

\[ \nabla_{w_i} (\boldsymbol{y}; w_1, \ldots, w_n) = \frac{1}{w_i} - \sum_{j=1}^{y_i} \frac{1}{\sum_{k = j}^n w_{k^{\mathrm{th}}}} \]

Notes

  • The expected value, the variance, and the Fisher information are computed directly from the definitions as sums over all possible rankings. As the number of permutations grows factorially with \(n\), we only use this approach for \(n \leq 6\). For \(n \geq 7\), we instead randomly sample 1 000 rankings. We set the seed locally so that the results are reproducible.

  • The probability mass function is invariant to multiplying all worth parameters by the same positive constant. Under the logarithmic transformation, it is invariant to adding the same constant to all transformed worth parameters. The parameters therefore need to be standardized, e.g. to zero sum in the latter case.
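
To make the enumeration over all rankings concrete, the following Python sketch evaluates the probability mass function for every permutation of a small set of items. It assumes, purely for illustration, that a ranking is passed as an ordering (the item in first place, then the item in second place, and so on) rather than as the vector of ranks \(\boldsymbol{y}\); the function name and the worth values are hypothetical.

```python
import itertools
import math


def plackett_luce_pmf(ordering, w):
    """Probability of a complete ranking.

    ordering[j] is the index of the item placed (j + 1)-th and
    w holds the positive worth parameters of the items.
    """
    worths_by_rank = [w[item] for item in ordering]
    prob = 1.0
    for j in range(len(worths_by_rank)):
        # Worth of the item in place j + 1 relative to all items not yet placed.
        prob *= worths_by_rank[j] / sum(worths_by_rank[j:])
    return prob


w = [3.0, 2.0, 1.0]  # illustrative worths
orderings = list(itertools.permutations(range(len(w))))

total = sum(plackett_luce_pmf(o, w) for o in orderings)
first_example = plackett_luce_pmf((0, 1, 2), w)  # (3/6) * (2/3) * (1/1)
scaled = plackett_luce_pmf((0, 1, 2), [5.0 * w_i for w_i in w])

print(len(orderings), total)   # 3! = 6 rankings, probabilities sum to one
print(first_example, scaled)   # equal, since a common factor cancels
print(math.factorial(6), math.factorial(7))  # 720 versus 5040 rankings
```

The last line shows why exhaustive summation quickly becomes expensive: the number of rankings is \(n!\), so each additional item multiplies the work by \(n\).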

Further Reading

  • Alvo, M. and Yu, P. L. H. (2014). Statistical Methods for Ranking Data. Springer. doi: 10.1007/978-1-4939-1471-5.

  • Holý, V. and Zouhar, J. (2022). Modelling Time-Varying Rankings with Autoregressive and Score-Driven Dynamics. Journal of the Royal Statistical Society: Series C (Applied Statistics), 71(5). doi: 10.1111/rssc.12584.

  • Luce, R. D. (1977). The Choice Axiom after Twenty Years. Journal of Mathematical Psychology, 15(3), 215–233. doi: 10.1016/0022-2496(77)90032-3.

  • Plackett, R. L. (1975). The Analysis of Permutations. Journal of the Royal Statistical Society: Series C (Applied Statistics), 24(2), 193–202. doi: 10.2307/2346567.

Count Data

Double Poisson Distribution

Mean Parametrization

Parameters

  • Mean parameter \(m \in (0, \infty)\)
  • Dispersion parameter \(s \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m, s] \approx \frac{1}{1 + \frac{1 - s}{12 s m} \left(1 + \frac{1}{s m} \right)} \sqrt{s} \frac{y^y}{y!} \left( \frac{m}{y} \right)^{s y} \exp(s y - s m - y) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &\approx m \\ \mathrm{var}[Y] &\approx \frac{m}{s} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &\approx \frac{s}{m} (y - m) \\ \nabla_{s} (y; m, s) &\approx \frac{1}{2 s} + y \ln \left( \frac{m}{y} \right) + y - m \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{s}{m} \\ \mathcal{I}_{m, s} (m, s) &\approx 0 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{1}{2 s^2} \\ \end{aligned} \]

Note

  • The probability mass function is not available in a closed form. We use the approximation of Efron (1986) for the probability mass function, the mean, the variance, the score, and the Fisher information.
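
As a rough illustration of how well the approximation behaves, the Python sketch below evaluates the approximate probability mass function on a truncated support and computes its total mass, mean, and variance. The parameter values \(m = 10\), \(s = 2\), the truncation point, and the function name are assumptions made only for this example; the term for \(y = 0\) is handled separately because \(y^y\) and \((m / y)^{s y}\) are both taken as 1 there.

```python
import math


def double_poisson_pmf(y, m, s):
    """Approximate double Poisson probability with the approximate normalizing constant."""
    norm = 1.0 / (1.0 + (1.0 - s) / (12.0 * s * m) * (1.0 + 1.0 / (s * m)))
    # Logarithm of sqrt(s) * exp(-s m); the remaining factors equal 1 for y = 0.
    log_kernel = 0.5 * math.log(s) - s * m
    if y > 0:
        log_kernel += (
            y * math.log(y) - math.lgamma(y + 1)      # y^y / y!
            + s * y * (math.log(m) - math.log(y))     # (m / y)^(s y)
            + s * y - y                               # exp(s y - y)
        )
    return norm * math.exp(log_kernel)


m, s = 10.0, 2.0        # illustrative values (underdispersion, since s > 1)
support = range(0, 200)  # truncation point chosen generously for m = 10

probs = [double_poisson_pmf(y, m, s) for y in support]
total = sum(probs)
mean = sum(y * q for y, q in zip(support, probs))
variance = sum((y - mean) ** 2 * q for y, q in zip(support, probs))

print(total)     # close to 1, but not exactly 1
print(mean)      # close to m
print(variance)  # close to m / s
```

Because the normalizing constant is itself approximate, the total mass is only close to one, and the mean and variance only approximately equal \(m\) and \(m / s\).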

Further Reading

  • Aragon, D. C., Achcar, J. A., and Martinez, E. Z. (2018). Maximum Likelihood and Bayesian Estimators for the Double Poisson Distribution. Journal of Statistical Theory and Practice, 12(4), 886–911. doi: 10.1080/15598608.2018.1489919.

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Efron, B. (1986). Double Exponential Families and Their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81(395), 709–721. doi: 10.1080/01621459.1986.10478327.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

  • Holý, V. and Tomanová, P. (2022). Modeling Price Clustering in High-Frequency Prices. Quantitative Finance. doi: 10.1080/14697688.2022.2050285.

  • Sellers, K. F. and Morris, D. S. (2017). Underdispersion Models: Models That Are “Under the Radar.” Communications in Statistics - Theory and Methods, 46(24), 12075–12086. doi: 10.1080/03610926.2017.1291976.

  • Zou, Y., Geedipally, S. R., and Lord, D. (2013). Evaluating the Double Poisson Generalized Linear Model. Accident Analysis and Prevention, 59, 497–505. doi: 10.1016/j.aap.2013.07.017.

Geometric Distribution

Mean Parametrization

Parameter

  • Mean parameter \(m \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m] = \frac{1}{1 + m} \left( \frac{m}{1 + m} \right)^{y} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m (1 + m) \\ \end{aligned} \]

Score

\[ \nabla_{m} (y; m) = \frac{y - m}{m (1 + m) } \]

Fisher Information

\[ \mathcal{I}_{m, m} (m) = \frac{1}{m (1 + m)} \]

Probabilistic Parametrization

Parameter

  • Probability parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | p] = p (1 - p)^{y} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{1 - p}{p} \\ \mathrm{var}[Y] &= \frac{1 - p}{p^2} \\ \end{aligned} \]

Score

\[ \nabla_{p} (y; p) = \frac{p y + p - 1}{p (p - 1)} \]

Fisher Information

\[ \mathcal{I}_{p, p} (p) = \frac{1}{p^2 (1 - p)} \]
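
The two parametrizations are linked by \(p = 1 / (1 + m)\). The Python sketch below, with hypothetical function names and an arbitrary value of \(m\), checks that both probability mass functions agree under this mapping and that the Fisher information transforms with the squared derivative \(\left( \mathrm{d} p / \mathrm{d} m \right)^2 = p^4\).

```python
def geometric_pmf_mean(y, m):
    """Probability mass function in the mean parametrization."""
    return (1.0 / (1.0 + m)) * (m / (1.0 + m)) ** y


def geometric_pmf_prob(y, p):
    """Probability mass function in the probabilistic parametrization."""
    return p * (1.0 - p) ** y


def mean_to_prob(m):
    """Map the mean parameter to the probability parameter."""
    return 1.0 / (1.0 + m)


m = 3.0              # illustrative mean
p = mean_to_prob(m)  # 0.25

pmf_mean = [geometric_pmf_mean(y, m) for y in range(5)]
pmf_prob = [geometric_pmf_prob(y, p) for y in range(5)]

fisher_m = 1.0 / (m * (1.0 + m))       # Fisher information for m
fisher_p = 1.0 / (p ** 2 * (1.0 - p))  # Fisher information for p
fisher_m_from_p = fisher_p * p ** 4    # chain rule: dp/dm = -p^2

print(pmf_mean)
print(pmf_prob)                   # same values as the line above
print(fisher_m, fisher_m_from_p)  # both equal 1 / 12
```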

Negative Binomial Distribution

NB2 Parametrization

Parameters

  • Mean parameter \(m \in (0, \infty)\)
  • Dispersion parameter \(s \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m, s] = \frac{\Gamma (y + s^{-1})}{\Gamma (y + 1) \Gamma (s^{-1})} \left( \frac{1}{1 + s m} \right)^{s^{-1}} \left( \frac{s m}{1 + s m} \right)^{y} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m (1 + s m) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{y - m}{m (1 + s m) } \\ \nabla_{s} (y; m, s) &= \frac{ y - m}{s (1 + s m)} + \frac{1}{s^2} \left( \ln(1 + s m) + \psi_0 \left( \frac{1}{s} \right) - \psi_0 \left( y + \frac{1}{s} \right) \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{m (1 + s m)} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{1}{s^4} \left( \ln(1 + s m) + \psi_0 \left( \frac{1}{s} \right) - \psi_0 \left( m + \frac{1}{s} \right) \right)^2 \\ \end{aligned} \]

Probabilistic Parametrization

Parameters

  • Probability parameter \(p \in (0, 1)\)
  • Size parameter \(r \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | p, r] = \frac{\Gamma(y + r)}{\Gamma(y + 1) \Gamma(r)} (1 - p)^y p^r \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{r (1 - p)}{p} \\ \mathrm{var}[Y] &= \frac{r (1 - p)}{p^2} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{p} (y; p, r) &= \frac{p r + p y - r}{p (p - 1)} \\ \nabla_{r} (y; p, r) &= \ln(p) - \psi_0(r) + \psi_0(y + r) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{p, p} (p, r) &= \frac{r}{p^2 (1 - p)} \\ \mathcal{I}_{p, r} (p, r) &= -\frac{1}{p} \\ \mathcal{I}_{r, r} (p, r) &\approx \left( \ln(p) - \psi_0(r) + \psi_0 \left( \frac{r}{p} \right) \right)^2 \\ \end{aligned} \]

Note

  • The Fisher information for the dispersion or size parameter, \(\mathcal{I}_{s, s} (m, s)\) or \(\mathcal{I}_{r, r} (p, r)\), is not available in a closed form. To speed up calculations, we use a rough approximation by replacing \(y\) with its expected value.
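
To give a feeling for this approximation, the Python sketch below works in the probabilistic parametrization and compares the approximate \(\mathcal{I}_{r, r}\) with the expected squared score obtained by summing over a truncated support. The digamma function is not part of the Python standard library, so it is replaced here by a central finite difference of `math.lgamma`; this substitution, the function names, the truncation point, and the values \(p = 0.4\), \(r = 3\) are all assumptions of this example.

```python
import math


def digamma(x, h=1e-5):
    """Numerical digamma via a central difference of math.lgamma (an assumption of this sketch)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)


def nb_pmf(y, p, r):
    """Negative binomial probability in the probabilistic parametrization."""
    log_pmf = (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + y * math.log(1.0 - p) + r * math.log(p)
    )
    return math.exp(log_pmf)


def nb_score(y, p, r):
    """Score with respect to p and r."""
    score_p = (p * r + p * y - r) / (p * (p - 1.0))
    score_r = math.log(p) - digamma(r) + digamma(y + r)
    return score_p, score_r


def nb_fisher_rr_approx(p, r):
    """Approximate Fisher information for r: y replaced by its expected value."""
    return (math.log(p) - digamma(r) + digamma(r / p)) ** 2


p, r = 0.4, 3.0  # illustrative values
mean_score_r = fisher_pp = fisher_pr = fisher_rr = 0.0
for y in range(0, 400):  # truncated support; the tail is negligible here
    prob = nb_pmf(y, p, r)
    s_p, s_r = nb_score(y, p, r)
    mean_score_r += prob * s_r
    fisher_pp += prob * s_p * s_p
    fisher_pr += prob * s_p * s_r
    fisher_rr += prob * s_r * s_r

print(mean_score_r)                          # zero up to numerical error
print(fisher_pp, r / (p ** 2 * (1.0 - p)))   # summed versus closed form
print(fisher_pr, -1.0 / p)                   # summed versus closed form
print(fisher_rr, nb_fisher_rr_approx(p, r))  # summed versus rough approximation
```

The first three lines confirm the closed-form expressions, while the last line shows how far the rough approximation is from the summed value for these particular parameters.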

Further Reading

  • Cameron, A. C. and Trivedi, P. K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1(1), 29–53. doi: 10.1002/jae.3950010104.

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

Poisson Distribution

Mean Parametrization

Parameter

  • Mean parameter \(m \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m] = \frac{m^y}{y!} \exp(-m) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= m \\ \end{aligned} \]

Score

\[ \nabla_{m} (y; m) = \frac{y - m}{m} \]

Fisher Information

\[ \mathcal{I}_{m, m} (m) = \frac{1}{m} \]

Further Reading

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Davis, R. A., Dunsmuir, W. T. M., and Street, S. B. (2003). Observation-Driven Models for Poisson Counts. Biometrika, 90(4), 777–790. doi: 10.1093/biomet/90.4.777.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

Zero-Inflated Geometric Distribution

Mean Parametrization

Parameters

  • Mean parameter \(m \in (0, \infty)\)
  • Zero inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | m, p] &= \begin{cases} p + (1 - p) \left( \frac{1}{1 + m} \right) & \text{ for } y = 0 \\ (1 - p) \left( \frac{1}{1 + m} \right) \left( \frac{m}{1 + m} \right)^{y} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m + m) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, p) &= \begin{cases} \frac{p - 1}{(1 + m) (1 + p m)} & \text{ for } y = 0 \\ \frac{y - m}{m (1 + m) } & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, p) &= \begin{cases} \frac{m}{1 + p m} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, p) &= \frac{(1 - p) (1 + m + p m^2)}{m (1 + m)^2 (1 + p m)} \\ \mathcal{I}_{m, p} (m, p) &= - \frac{1}{ (1 + m) ( 1 + p m) } \\ \mathcal{I}_{p, p} (m, p) &= \frac{m}{(1 - p) ( 1 + p m)} \\ \end{aligned} \]
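
The moments and the Fisher information entries involving \(p\) can be verified by direct summation over the support. The Python sketch below does this for illustrative values \(m = 2\) and \(p = 0.3\); the function names and the truncation point are assumptions of this example.

```python
def zig_pmf(y, m, p):
    """Zero-inflated geometric probability mass function."""
    base = (1.0 / (1.0 + m)) * (m / (1.0 + m)) ** y
    return p + (1.0 - p) * base if y == 0 else (1.0 - p) * base


def zig_score(y, m, p):
    """Score with respect to m and p."""
    if y == 0:
        return (p - 1.0) / ((1.0 + m) * (1.0 + p * m)), m / (1.0 + p * m)
    return (y - m) / (m * (1.0 + m)), 1.0 / (p - 1.0)


m, p = 2.0, 0.3          # illustrative values
support = range(0, 300)  # truncated support; the geometric tail is negligible

total = mean = second_moment = 0.0
mean_score_m = fisher_mp = fisher_pp = 0.0
for y in support:
    prob = zig_pmf(y, m, p)
    s_m, s_p = zig_score(y, m, p)
    total += prob
    mean += prob * y
    second_moment += prob * y * y
    mean_score_m += prob * s_m
    fisher_mp += prob * s_m * s_p
    fisher_pp += prob * s_p * s_p
variance = second_moment - mean ** 2

print(total, mean_score_m)                            # 1 and 0 up to rounding
print(mean, m * (1.0 - p))                            # summed versus closed form
print(variance, m * (1.0 - p) * (1.0 + p * m + m))    # summed versus closed form
print(fisher_mp, -1.0 / ((1.0 + m) * (1.0 + p * m)))  # summed versus closed form
print(fisher_pp, m / ((1.0 - p) * (1.0 + p * m)))     # summed versus closed form
```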

Further Reading

  • Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

  • Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

Zero-Inflated Negative Binomial Distribution

NB2 Parametrization

Parameters

  • Mean parameter \(m \in (0, \infty)\)
  • Dispersion parameter \(s \in (0, \infty)\)
  • Zero inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \left( \frac{1}{1 + s m} \right)^{s^{-1}} & \text{ for } y = 0 \\ (1 - p) \frac{\Gamma (y + s^{-1})}{\Gamma (y + 1) \Gamma (s^{-1})} \left( \frac{1}{1 + s m} \right)^{s^{-1}} \left( \frac{s m}{1 + s m} \right)^{y} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m + s m) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{p - 1}{(1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} & \text{ for } y = 0 \\ \frac{y - m}{m (1 + s m) } & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{(1 - p) \left( (1 + s m) \ln(1 + s m) -s m \right) }{ s^2 (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}}- p \right) } & \text{ for } y = 0 \\ \frac{ s (y - m) + (1 + s m) \left( \ln(1 + s m) + \psi_0 \left( s^{-1} \right) - \psi_0 \left( y + s^{-1} \right) \right) }{s^2 (1 + s m)} & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{(1 + s m)^{s^{-1}} - 1}{1 + p (1 + s m)^{s^{-1}}- p} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s, p) &= \frac{p(p - 1)}{(1 + s m)^2 \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} + \frac{1 -p}{m(1 + s m)} \\ \mathcal{I}_{m, s} (m, s, p) &= \frac{\left( p - p^2 \right) \left( (1 + s m) \ln(1 + s m) - s m \right) }{s^2 (1 + s m)^2 \left( 1 + p (1 + s m)^{s^{-1}} -p \right)} \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{-1}{ (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right) }\\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{1}{s^4} \left( \ln(1 + s m) + \psi_0 \left( s^{-1} \right) - \psi_0 \left( m + s^{-1} \right) \right)^2 \left( 1 - p - (1 - p) \left( 1 + s m \right)^{-s^{-1}} \right) \\ & \qquad + \frac{(1 - p)^2 \left( (1 + s m) \ln(1 + s m) - s m \right)^2} {s^4 (1 + s m)^{2 + s^{-1}} \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(1 + s m) \ln(1 + s m) - s m}{s^2 (1 + s m) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{1 - (1 + s m)^{s^{-1}}}{(p - 1) \left( 1 + p (1 + s m)^{s^{-1}} - p \right)} \end{aligned} \]

Note

  • The Fisher information for the dispersion parameter, \(\mathcal{I}_{s, s} (m, s, p)\), is not available in a closed form. To speed up calculations, we use an approximation by replacing \(y\) with its expected value combined with the zero value.

Further Reading

  • Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

  • Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

Zero-Inflated Poisson Distribution

Mean Parametrization

Parameters

  • Mean parameter \(m \in (0, \infty)\)
  • Zero inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | m, p] &= \begin{cases} p + (1 - p) \exp(-m) & \text{ for } y = 0 \\ (1 - p) \frac{m^y}{y!} \exp(-m) & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m (1 - p) \\ \mathrm{var}[Y] &= m(1 - p) (1 + p m) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, p) &= \begin{cases} \frac{p - 1}{p \exp(m) - p + 1} & \text{ for } y = 0 \\ \frac{y - m}{m} & \text{ for } y \geq 1 \\ \end{cases} \\ \nabla_{p} (y; m, p) &= \begin{cases} \frac{\exp(m) - 1}{p \exp(m) - p + 1} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \geq 1 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, p) &= \frac{p (p - 1)}{p \exp(m) - p + 1} - \frac{p - 1}{m} \\ \mathcal{I}_{m, p} (m, p) &= - \frac{1}{p \exp(m) - p + 1} \\ \mathcal{I}_{p, p} (m, p) &= \frac{\exp(m) - 1}{(1 - p) (p \exp(m) - p + 1)} \\ \end{aligned} \]

Further Reading

  • Blasques, F., Holý, V., and Tomanová, P. (2022). Zero-Inflated Autoregressive Conditional Duration Model for Discrete Trade Durations with Excessive Zeros. Working Paper. arXiv: 1812.07318.

  • Cameron, A. C. and Trivedi, P. K. (2013). Regression Analysis of Count Data. Second Edition. Cambridge University Press. doi: 10.1017/cbo9781139013567.

  • Greene, W. H. (1994). Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Stern School of Business Research Paper Series, EC-94-10. SSRN: 1293115.

  • Hilbe, J. M. (2011). Negative Binomial Regression. Second Edition. Cambridge University Press. doi: 10.1017/cbo9780511973420.

  • Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1–14. doi: 10.2307/1269547.

Integer Data

Skellam Distribution

Difference Parametrization

Parameters

  • First rate parameter \(r_1 \in (0, \infty)\)
  • Second rate parameter \(r_2 \in (0, \infty)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | r_1, r_2] &= \exp(-r_1 - r_2) \left( \frac{r_1}{r_2} \right)^{\frac{y}{2}} I_y \left( 2 \sqrt{r_1 r_2} \right) \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= r_1 - r_2 \\ \mathrm{var}[Y] &= r_1 + r_2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r_1} (y; r_1, r_2) &= \sqrt{\frac{r_2}{r_1}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } - 1 \\ \nabla_{r_2} (y; r_1, r_2) &= \sqrt{\frac{r_1}{r_2}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } -\frac{y}{r_2} - 1 \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r_1, r_1} (r_1, r_2) &\approx \frac{r_2}{r_1} \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + 1 \\ \mathcal{I}_{r_1, r_2} (r_1, r_2) &\approx \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + \frac{r_1}{r_2} \\ \mathcal{I}_{r_2, r_2} (r_1, r_2) &\approx \frac{r_1}{r_2} \left( \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } \right)^2 - 2 \left( \frac{r_1}{r_2} \right)^{\frac{3}{2}} \frac{I_{r_1 - r_2 - 1} \left(2 \sqrt{r_1 r_2} \right) }{I_{r_1 - r_2} \left(2 \sqrt{r_1 r_2} \right) } + \left( \frac{r_1}{r_2} \right)^2 \\ \end{aligned} \]

Mean-Dispersion Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Dispersion parameter \(s \in (0, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m, s] = \exp(-|m| - s) \left( \frac{|m| + m + s}{|m| - m + s} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 + 2 |m| s} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= |m| + s \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{y}{2|m| + s} + \frac{\mathrm{sgn}(m) s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - \mathrm{sgn}(m) \\ \nabla_{s} (y; m, s) &= - \frac{m y}{s^2 + 2 |m| s} + \frac{|m| + s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - 1 \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{s^2}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \mathcal{I}_{m, s} (m, s) &\approx \frac{\mathrm{sgn}(m) (|m| + s) s}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{(|m| + s)^2}{4 \left( s^2 + 2|m|s \right)} \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ \end{aligned} \]

Mean-Variance Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Variance parameter \(s \in (|m|, \infty)\)

Probability Mass Function

\[ \mathrm{P} [Y = y | m, s] = \exp(-s) \left( \frac{s + m}{s - m} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 - m^2} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{s y}{s^2 - m^2} - \frac{m}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } \\ \nabla_{s} (y; m, s) &= -\frac{m y}{s^2 - m^2} + \frac{s}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } - 1\\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &\approx \frac{m^2}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \mathcal{I}_{m, s} (m, s) &\approx - \frac{m s}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \mathcal{I}_{s, s} (m, s) &\approx \frac{s^2}{4 \left( s^2 - m^2 \right)} \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ \end{aligned} \]

Note

  • The computation of the Fisher information is quite intricate and we resort to an approximation by replacing \(y\) with its expected value.
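
The parametrizations of the Skellam distribution can be related numerically. The Python sketch below evaluates the difference parametrization, compares it with the direct convolution of two Poisson distributions, and then checks the mean-variance parametrization using \(m = r_1 - r_2\) and \(s = r_1 + r_2\). The modified Bessel function of integer order is computed from its power series because the Python standard library does not provide it; this choice, the function names, the truncation points, and the rate values are assumptions of this example.

```python
import math


def bessel_i(order, x, terms=80):
    """Modified Bessel function of the first kind for integer order and x > 0 (power series)."""
    n = abs(order)  # I_{-n}(x) = I_n(x) for integer n
    log_half = math.log(x / 2.0)
    return sum(
        math.exp((2 * k + n) * log_half - math.lgamma(k + 1) - math.lgamma(k + n + 1))
        for k in range(terms)
    )


def poisson_pmf(k, rate):
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def skellam_pmf_difference(y, r1, r2):
    """Difference parametrization."""
    return math.exp(-r1 - r2) * (r1 / r2) ** (y / 2.0) * bessel_i(y, 2.0 * math.sqrt(r1 * r2))


def skellam_pmf_mean_variance(y, m, s):
    """Mean-variance parametrization."""
    return math.exp(-s) * ((s + m) / (s - m)) ** (y / 2.0) * bessel_i(y, math.sqrt(s * s - m * m))


def skellam_pmf_convolution(y, r1, r2, terms=200):
    """P[N1 - N2 = y] for independent Poisson counts N1 and N2, summed over N2 = k."""
    return sum(
        poisson_pmf(k + y, r1) * poisson_pmf(k, r2)
        for k in range(max(0, -y), terms)
    )


r1, r2 = 3.0, 1.5          # illustrative rates
m, s = r1 - r2, r1 + r2    # implied mean and variance

max_gap = max(
    abs(skellam_pmf_difference(y, r1, r2) - skellam_pmf_convolution(y, r1, r2))
    for y in range(-5, 6)
)
max_gap_mv = max(
    abs(skellam_pmf_difference(y, r1, r2) - skellam_pmf_mean_variance(y, m, s))
    for y in range(-5, 6)
)
support = range(-40, 41)   # truncated support; the tails are negligible here
probs = [skellam_pmf_difference(y, r1, r2) for y in support]
mean = sum(y * q for y, q in zip(support, probs))
variance = sum((y - mean) ** 2 * q for y, q in zip(support, probs))

print(max_gap, max_gap_mv)  # both zero up to rounding
print(mean, variance)       # r1 - r2 and r1 + r2
```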

Further Reading

  • Alzaid, A. A. and Omair, M. A. (2010). On the Poisson Difference Distribution Inference and Applications. Bulletin of the Malaysian Mathematical Sciences Society, 33(1), 17–45. EuDML: 244475.

  • Karlis, D. and Ntzoufras, I. (2009). Bayesian Modelling of Football Outcomes: Using the Skellam’s Distribution for the Goal Difference. IMA Journal of Management Mathematics, 20(2), 133–145. doi: 10.1093/imaman/dpn026.

  • Koopman, S. J. and Lit, R. (2019). Forecasting Football Match Results in National League Competitions Using Score-Driven Time Series Models. International Journal of Forecasting, 35(2), 797–809. doi: 10.1016/j.ijforecast.2018.10.011.

  • Koopman, S. J., Lit, R., Lucas, A., and Opschoor, A. (2018). Dynamic Discrete Copula Models for High-Frequency Stock Price Changes. Journal of Applied Econometrics, 33(7), 966–985. doi: 10.1002/jae.2645.

  • Skellam, J. G. (1946). The Frequency Distribution of the Difference Between Two Poisson Variates Belonging to Different Populations. Journal of the Royal Statistical Society, 109(3), 296. doi: 10.2307/2981372.

Zero-Inflated Skellam Distribution

Difference Parametrization

Parameters

  • First rate parameter \(r_1 \in (0, \infty)\)
  • Second rate parameter \(r_2 \in (0, \infty)\)
  • Inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | r_1, r_2, p] &= \begin{cases} p + (1 - p) \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-r_1 - r_2) \left( \frac{r_1}{r_2} \right)^{\frac{y}{2}} I_y \left( 2 \sqrt{r_1 r_2} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= (1 - p) (r_1 - r_2) \\ \mathrm{var}[Y] &= (1 - p) \left( p \left( r_1 - r_2 \right)^2 + r_1 + r_2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r_1} (y; r_1, r_2, p) &= \begin{cases} \frac{(p - 1) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} & \text{ for } y = 0 \\ \sqrt{\frac{r_2}{r_1}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{r_2} (y; r_1, r_2, p) &= \begin{cases} \frac{(p - 1) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} & \text{ for } y = 0 \\ \sqrt{\frac{r_1}{r_2}} \frac{I_{y-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_y \left( 2 \sqrt{r_1 r_2} \right) } -\frac{y}{r_2} - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; r_1, r_2, p) &= \begin{cases} \frac{\exp(r_1 + r_2) - I_0 \left( 2 \sqrt{r_1 r_2} \right)}{p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r_1, r_1} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( 1 - \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2 -1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)^2}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \mathcal{I}_{r_1, r_2} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( 1 - \sqrt{\frac{r_2}{r_1}} \frac{I_{r_1 - r_2-1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right) \\ & \qquad \times \left( \frac{r_1}{r_2} - \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} \right) \\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{r_1, p} (r_1, r_2, p) &= \frac{(p - 1) \left( 1 - \exp(-r_1 - r_2 ) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_2 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{r_2, r_2} (r_1, r_2, p) &\approx (1 - p) \left( 1 - \exp(-r_1 - r_2) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right) \left( \frac{r_1}{r_2} - \sqrt{\frac{r_1}{r_2}} \frac{I_{r_1 - r_2 - 1} \left( 2 \sqrt{r_1 r_2} \right)}{I_{r_1 - r_2} \left( 2 \sqrt{r_1 r_2} \right)} 
\right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-r_1 - r_2) \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right)^2}{r_1 r_2 \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \mathcal{I}_{r_2, p} (r_1, r_2, p) &= \frac{(p - 1) \left( 1 - \exp(-r_1 - r_2 ) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)}{\sqrt{r_1 r_2} \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ & \qquad \times \left( \sqrt{r_1 r_2} I_0 \left( 2 \sqrt{r_1 r_2} \right) - r_1 I_1 \left( 2 \sqrt{r_1 r_2} \right) \right) \\ \mathcal{I}_{p, p} (r_1, r_2, p) &= \frac{\exp(r_1 + r_2) - I_0 \left( 2 \sqrt{r_1 r_2} \right)}{(1 - p) \left( p \exp(r_1 + r_2) + (1 - p) I_0 \left( 2 \sqrt{r_1 r_2} \right) \right)} \\ \end{aligned} \]

Mean-Dispersion Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Dispersion parameter \(s \in (0, \infty)\)
  • Inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-|m| - s) \left( \frac{|m| + m + s}{|m| - m + s} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 + 2 |m| s} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= (1 - p) m \\ \mathrm{var}[Y] &= (1 - p) \left( |m| + s + p m^2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{\mathrm{sgn}(m) (p - 1) \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) + p \exp(|m| + s) \right)} & \text{ for } y = 0 \\ \frac{y}{2|m| + s} + \frac{\mathrm{sgn}(m) s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - \mathrm{sgn}(m) & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{ (p - 1) \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) }{\sqrt{s^2 + 2 |m| s} \left( (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) + p \exp(|m| + s) \right)} & \text{ for } y = 0 \\ - \frac{m y}{s^2 + 2 |m| s} + \frac{|m| + s}{2 \sqrt{s^2 + 2 |m| s}} \frac{ I_{y-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{y+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_y \left( \sqrt{s^2 + 2 |m| s} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{\exp(|m| + s) - I_0 \left( \sqrt{s^2 + 2 |m| s} \right)}{p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s, p) &\approx \frac{s^2 (1 - p) \left( 1 - \exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-|m| - s) }{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)^2 \\ \mathcal{I}_{m, s} (m, s, p) &\approx \frac{\mathrm{sgn}(m) s (1 - p) (|m| + s) \left( 1 - \exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{\mathrm{sgn}(m) (1 - p)^2 \exp(-|m| - s)}{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{\mathrm{sgn}(m) (p - 1) \left( 1 - \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - s I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{(1 - p) (|m| + s)^2 \left( 1 - 
\exp(-|m|-s) I_{0} \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{4 s^2 + 8 |m| s} \\ & \qquad \times \left( \frac{2 (|m| + s)}{\sqrt{s^2 + 2 |m| s}} - \frac{ I_{m-1} \left( \sqrt{s^2 + 2 |m| s} \right) + I_{m+1} \left( \sqrt{s^2 + 2 |m| s} \right) }{ I_m \left( \sqrt{s^2 + 2 |m| s} \right)} \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-|m| - s)}{\left( s^2 + 2 |m| s \right) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right)^2 \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(p - 1) \left( 1 - \exp(-|m| - s) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)}{\sqrt{s^2 + 2 |m| s} \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 + 2 |m| s} I_0 \left( \sqrt{s^2 + 2 |m| s} \right) - (|m| + s) I_1 \left( \sqrt{s^2 + 2 |m| s} \right) \right) \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{\exp(|m| + s) - I_0 \left( \sqrt{s^2 + 2 |m| s} \right)}{(1 - p) \left( p \exp(|m| + s) + (1 - p) I_0 \left( \sqrt{s^2 + 2 |m| s} \right) \right)} \\ \end{aligned} \]

Mean-Variance Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Variance parameter \(s \in (|m|, \infty)\)
  • Inflation parameter \(p \in (0, 1)\)

Probability Mass Function

\[ \begin{aligned} \mathrm{P} [Y = y | m, s, p] &= \begin{cases} p + (1 - p) \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) & \text{ for } y = 0 \\ (1 - p) \exp(-s) \left( \frac{s + m}{s - m} \right)^{\frac{y}{2}} I_y \left( \sqrt{s^2 - m^2} \right) & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= (1 - p) m \\ \mathrm{var}[Y] &= (1 - p) \left( s + p m^2 \right) \\ \end{aligned} \]
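
The mass function and moments above can be checked against each other numerically. The following is a minimal sketch, not the package's implementation: the helper names `bessel_i` and `zi_skellam_pmf`, the truncation of the support to \([-60, 60]\), and the parameter values are all assumptions chosen for illustration. The modified Bessel function is evaluated by its power series, which is adequate for moderate arguments only.

```python
import math

def bessel_i(order, x, terms=80):
    """Modified Bessel function I_order(x) for integer order, via its power series."""
    n = abs(order)  # I_{-n}(x) = I_n(x) for integer n
    half = x / 2.0
    term = half ** n / math.factorial(n)  # k = 0 term of the series
    total = term
    for k in range(terms):
        term *= half * half / ((k + 1) * (k + 1 + n))
        total += term
    return total

def zi_skellam_pmf(y, m, s, p):
    """Zero-inflated mass function in the mean-variance parametrization."""
    x = math.sqrt(s * s - m * m)
    base = math.exp(-s) * ((s + m) / (s - m)) ** (y / 2.0) * bessel_i(y, x)
    if y == 0:
        return p + (1.0 - p) * base
    return (1.0 - p) * base

# Illustrative parameter values (assumed, must satisfy s > |m| and 0 < p < 1).
m, s, p = 1.5, 4.0, 0.2
support = range(-60, 61)  # truncated support; the tails are negligible here
probs = [zi_skellam_pmf(y, m, s, p) for y in support]
total = sum(probs)
mean = sum(y * q for y, q in zip(support, probs))
var = sum(y * y * q for y, q in zip(support, probs)) - mean ** 2

print(total, mean, var)
print((1 - p) * m, (1 - p) * (s + p * m * m))  # closed-form moments
```

The summed probabilities should be close to one, and the empirical mean and variance should agree with \((1 - p) m\) and \((1 - p)(s + p m^2)\).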

Score

\[ \begin{aligned} \nabla_{m} (y; m, s, p) &= \begin{cases} \frac{m (p - 1) I_{1} \left( \sqrt{s^2 - m^2} \right)}{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)} & \text{ for } y = 0 \\ \frac{s y}{s^2 - m^2} - \frac{m}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{s} (y; m, s, p) &= \begin{cases} \frac{ (p - 1) \left( \sqrt{s^2 - m^2} I_{0} \left( \sqrt{s^2 - m^2} \right) - s I_{1} \left( \sqrt{s^2 - m^2} \right) \right) }{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)} & \text{ for } y = 0 \\ -\frac{m y}{s^2 - m^2} + \frac{s}{2 \sqrt{s^2 - m^2}} \frac{ I_{y-1} \left( \sqrt{s^2 - m^2} \right) + I_{y+1} \left( \sqrt{s^2 - m^2} \right) }{ I_y \left( \sqrt{s^2 - m^2} \right) } - 1 & \text{ for } y \neq 0 \\ \end{cases} \\ \nabla_{p} (y; m, s, p) &= \begin{cases} \frac{\exp(s) - I_{0} \left( \sqrt{s^2 - m^2} \right)}{p \exp(s) + (1 - p) I_{0} \left( \sqrt{s^2 - m^2} \right)} & \text{ for } y = 0 \\ \frac{1}{p - 1} & \text{ for } y \neq 0 \\ \end{cases} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s, p) &\approx \frac{m^2 (1 - p) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)}{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{m^2 (1 - p)^2 \exp(-s) I_{1} \left( \sqrt{s^2 - m^2} \right)^2}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{m, s} (m, s, p) &\approx \frac{m s (p - 1) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right) }{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{m (1 - p)^2 \exp(-s) I_{1} \left( \sqrt{s^2 - m^2} \right)}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right) \\ \mathcal{I}_{m, p} (m, s, p) &= \frac{m (p - 1) \left( 1 - \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) \right) I_1 \left( \sqrt{s^2 - m^2} \right)}{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{s, s} (m, s, p) &\approx \frac{s^2 (1 - p) \left( 1 - \exp(-s) I_{0} \left( \sqrt{s^2 - m^2} \right) \right)}{4 \left( s^2 - m^2 \right)} \\ & \qquad \times \left( \frac{2 s}{\sqrt{s^2 - m^2}} - \frac{ I_{m-1} \left( \sqrt{s^2 - m^2} \right) + I_{m+1} \left( \sqrt{s^2 - m^2} \right) }{ I_m \left( \sqrt{s^2 - m^2} \right) } \right)^2 \\ & \qquad + \frac{(1 - p)^2 \exp(-s) \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right)^2}{\left( s^2 - m^2 \right) \left( p \exp(s) + (1 - p) 
I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ \mathcal{I}_{s, p} (m, s, p) &= \frac{(p - 1) \left( 1 - \exp(-s) I_0 \left( \sqrt{s^2 - m^2} \right) \right) }{\sqrt{s^2 - m^2} \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right)} \\ & \qquad \times \left( \sqrt{s^2 - m^2} I_0 \left( \sqrt{s^2 - m^2} \right) - s I_1 \left( \sqrt{s^2 - m^2} \right) \right) \\ \mathcal{I}_{p, p} (m, s, p) &= \frac{\exp(s) - I_0 \left( \sqrt{s^2 - m^2} \right)}{(1 - p) \left( p \exp(s) + (1 - p) I_0 \left( \sqrt{s^2 - m^2} \right) \right) } \\ \end{aligned} \]

Note

  • Computing the Fisher information for the first two parameters exactly is intricate. We therefore approximate the entries marked with \(\approx\) by replacing \(y\) with its expected value and combining the result with the zero value.

Further Reading

  • Karlis, D. and Ntzoufras, I. (2009). Bayesian Modelling of Football Outcomes: Using the Skellam’s Distribution for the Goal Difference. IMA Journal of Management Mathematics, 20(2), 133–145. doi: 10.1093/imaman/dpn026.

Circular Data

von Mises Distribution

Mean-Concentration Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Concentration parameter \(v \in (0, \infty)\)

Density Function

\[ f(y | m, v) = \frac{1}{2 \pi I_0(v)} \exp \left( v \cos(y - m) \right) \]

Circular Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= 1 - \frac{I_1(v)}{I_0(v)} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, v) &= v \sin(y - m) \\ \nabla_{v} (y; m, v) &= \cos(y - m) - \frac{I_1(v)}{I_0(v)} \\ \end{aligned} \]
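
One way to sanity-check the score is to compare it with a finite-difference derivative of the log-density. The sketch below does this under stated assumptions: the helper names, the evaluation point, and the step size `h` are hypothetical, and the Bessel functions are computed by a truncated power series.

```python
import math

def bessel_i(order, x, terms=60):
    """Modified Bessel function I_order(x) for integer order, via its power series."""
    n = abs(order)
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for k in range(terms):
        term *= half * half / ((k + 1) * (k + 1 + n))
        total += term
    return total

def von_mises_logpdf(y, m, v):
    """Log of the von Mises density in the mean-concentration parametrization."""
    return v * math.cos(y - m) - math.log(2.0 * math.pi * bessel_i(0, v))

def von_mises_score(y, m, v):
    """Score with respect to (m, v), as given in the formulas above."""
    ratio = bessel_i(1, v) / bessel_i(0, v)
    return v * math.sin(y - m), math.cos(y - m) - ratio

# Illustrative evaluation point and step size (assumed values).
y, m, v, h = 0.7, 0.2, 1.8, 1e-6
score_m, score_v = von_mises_score(y, m, v)
fd_m = (von_mises_logpdf(y, m + h, v) - von_mises_logpdf(y, m - h, v)) / (2 * h)
fd_v = (von_mises_logpdf(y, m, v + h) - von_mises_logpdf(y, m, v - h)) / (2 * h)
print(score_m, fd_m)
print(score_v, fd_v)
```

The analytic and finite-difference values should agree to several decimal places.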

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, v) &= v \frac{I_1(v)}{I_0(v)} \\ \mathcal{I}_{m, v} (m, v) &= 0 \\ \mathcal{I}_{v, v} (m, v) &= \frac{1}{2} - \left( \frac{I_1(v)}{I_0(v)} \right)^2 + \frac{I_2(v)}{2 I_0(v)} \\ \end{aligned} \]

Further Reading

  • Harvey, A., Hurn, S., and Thiele, S. (2019). Modeling Directional (Circular) Time Series. Cambridge Working Papers in Economics, CWPE 1971. doi: 10.17863/cam.43915.

Interval Data

Beta Distribution

Concentration Parametrization

Parameters

  • First concentration parameter \(a_1 \in (0, \infty)\)
  • Second concentration parameter \(a_2 \in (0, \infty)\)

Density Function

\[ f(y | a_1, a_2) = \frac{1}{B(a_1, a_2)} y^{a_1 - 1} (1 - y)^{a_2 - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{a_1}{a_1 + a_2} \\ \mathrm{var}[Y] &= \frac{a_1 a_2}{(a_1 + a_2)^2 (a_1 + a_2 + 1)} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{a_1} (y; a_1, a_2) &= \psi_0(a_1 + a_2) - \psi_0(a_1) + \ln(y) \\ \nabla_{a_2} (y; a_1, a_2) &= \psi_0(a_1 + a_2) - \psi_0(a_2) + \ln(1 - y) \\ \end{aligned} \]
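
The two score components can be verified against a finite-difference derivative of the log-density. This is a sketch under assumptions: Python's standard library has no digamma function, so `digamma` below is a hypothetical helper based on the recurrence \(\psi_0(x) = \psi_0(x + 1) - 1/x\) and a short asymptotic series, accurate to roughly eight digits. The evaluation point is arbitrary.

```python
import math

def digamma(x):
    """Digamma function psi_0(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:           # shift the argument upwards
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def beta_logpdf(y, a1, a2):
    """Log-density of the beta distribution in the concentration parametrization."""
    log_b = math.lgamma(a1) + math.lgamma(a2) - math.lgamma(a1 + a2)
    return (a1 - 1) * math.log(y) + (a2 - 1) * math.log(1 - y) - log_b

def beta_score(y, a1, a2):
    """Score with respect to the first and second concentration parameter."""
    common = digamma(a1 + a2)
    return (common - digamma(a1) + math.log(y),
            common - digamma(a2) + math.log(1 - y))

# Illustrative values (assumed).
y, a1, a2, h = 0.3, 2.0, 3.5, 1e-5
s1, s2 = beta_score(y, a1, a2)
fd1 = (beta_logpdf(y, a1 + h, a2) - beta_logpdf(y, a1 - h, a2)) / (2 * h)
fd2 = (beta_logpdf(y, a1, a2 + h) - beta_logpdf(y, a1, a2 - h)) / (2 * h)
print(s1, fd1)
print(s2, fd2)
```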

Fisher Information

\[ \begin{aligned} \mathcal{I}_{a_1, a_1} (a_1, a_2) &= \psi_1(a_1) - \psi_1(a_1 + a_2) \\ \mathcal{I}_{a_1, a_2} (a_1, a_2) &= -\psi_1(a_1 + a_2) \\ \mathcal{I}_{a_2, a_2} (a_1, a_2) &= \psi_1(a_2) - \psi_1(a_1 + a_2) \\ \end{aligned} \]

Mean-Size Parametrization

Parameters

  • Mean parameter \(m \in (0, 1)\)
  • Size parameter \(v \in (0, \infty)\)

Density Function

\[ f(y | m, v) = \frac{1}{B(m v, (1 - m) v)} y^{m v - 1} (1 - y)^{(1 - m) v - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= \frac{m (1 - m)}{v + 1} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, v) &= \frac{v}{1 - m} (\psi_0(v) - \psi_0(m v) + \ln(y)) \\ \nabla_{v} (y; m, v) &= \psi_0(v) - \psi_0(v - m v) + \ln(1 - y) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, v) &= \frac{v^2}{(1 - m)^2} (\psi_1(m v) - \psi_1(v)) \\ \mathcal{I}_{m, v} (m, v) &= \frac{v}{m - 1} \psi_1(v) \\ \mathcal{I}_{v, v} (m, v) &= \psi_1(v - m v) - \psi_1(v) \\ \end{aligned} \]

Mean-Variance Parametrization

Parameters

  • Mean parameter \(m \in (0, 1)\)
  • Variance parameter \(s \in (0, m (1 - m))\)

Density Function

\[ f(y | m, s) = \frac{1}{B \left( m \left( \frac{m - m^2}{s} - 1 \right), (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right)} y^{m \left( \frac{m - m^2}{s} - 1 \right) - 1} (1 - y)^{(1 - m) \left( \frac{m - m^2}{s} - 1 \right) - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned} \]
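
Comparing the three densities shows how the parametrizations relate: the size is \(v = (m - m^2) / s - 1\), and the concentrations are \(a_1 = m v\) and \(a_2 = (1 - m) v\). The sketch below maps a mean-variance pair to concentrations and recovers the moments from the concentration formulas. The function names and input values are assumptions for illustration.

```python
def beta_from_mean_variance(m, s):
    """Map (mean, variance) to the concentration parameters (a1, a2)."""
    if not (0.0 < m < 1.0 and 0.0 < s < m * (1.0 - m)):
        raise ValueError("need 0 < m < 1 and 0 < s < m (1 - m)")
    v = (m - m * m) / s - 1.0  # size parameter of the mean-size parametrization
    return m * v, (1.0 - m) * v

def beta_moments(a1, a2):
    """Mean and variance in the concentration parametrization."""
    total = a1 + a2
    return a1 / total, a1 * a2 / (total ** 2 * (total + 1.0))

a1, a2 = beta_from_mean_variance(0.3, 0.02)
mean, var = beta_moments(a1, a2)
print(a1, a2)     # concentrations implied by m = 0.3, s = 0.02
print(mean, var)  # should recover the mean and variance that were put in
```

The parameter check mirrors the restriction \(s \in (0, m (1 - m))\): outside this range the implied size \(v\) would not be positive.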

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{m^2 - m + s}{(m - 1) s} \left( \psi_0 \left( \frac{m - m^2}{s} - 1 \right) - \psi_0 \left( m \left( \frac{m - m^2}{s} - 1 \right) \right) + \ln(y) \right) \\ \nabla_{s} (y; m, s) &= \frac{s^2 (3 m^2 - 2 m + s)}{m (m - 1) (m^2 - m + s)} \left( \psi_0 \left( \frac{m - m^2}{s} - 1 \right) - \psi_0 \left( (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right) + \ln(1 - y) \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{(m^2 - m + s)^2}{(m - 1)^2 s^2} \left( \psi_1 \left( m \left( \frac{m - m^2}{s} - 1 \right) \right) - \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \right) \\ \mathcal{I}_{m, s} (m, s) &= \frac{s (2 m - 3 m^2 - s)}{m (m^2 - 2 m + 1)} \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \\ \mathcal{I}_{s, s} (m, s) &= \frac{s^2 (3 m^2 - 2 m + s)^2}{(m - 1)^4 m^2} \left( \psi_1 \left( (1 - m) \left( \frac{m - m^2}{s} - 1 \right) \right) - \psi_1 \left( \frac{m - m^2}{s} - 1 \right) \right) \\ \end{aligned} \]

Kumaraswamy Distribution

Concentration Parametrization

Parameters

  • First concentration parameter \(a \in (0, \infty)\)
  • Second concentration parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | a, b) = a b y^{a - 1} \left(1 - y^a \right)^{b - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= b B \left(1 + \frac{1}{a}, b \right) \\ \mathrm{var}[Y] &= b B \left(1 + \frac{2}{a}, b \right) - b^2 B \left(1 + \frac{1}{a}, b \right)^2 \\ \end{aligned} \]
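
The moment formulas can be evaluated with the log-gamma function and compared with a direct numerical integration of the density. The sketch below is illustrative only: the helper names, the midpoint rule with `n = 20000` cells, and the parameter values are assumptions.

```python
import math

def beta_fn(x, y):
    """Beta function B(x, y) computed through log-gamma."""
    return math.exp(math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y))

def kumaraswamy_moments(a, b):
    """Mean and variance from the closed-form expressions above."""
    first = b * beta_fn(1 + 1 / a, b)
    second = b * beta_fn(1 + 2 / a, b)
    return first, second - first ** 2

def kumaraswamy_pdf(y, a, b):
    return a * b * y ** (a - 1) * (1 - y ** a) ** (b - 1)

def midpoint_moments(a, b, n=20000):
    """Mean and variance by the midpoint rule on (0, 1)."""
    h = 1.0 / n
    ys = [(i + 0.5) * h for i in range(n)]
    first = sum(y * kumaraswamy_pdf(y, a, b) for y in ys) * h
    second = sum(y * y * kumaraswamy_pdf(y, a, b) for y in ys) * h
    return first, second - first ** 2

closed = kumaraswamy_moments(2.0, 3.0)
numeric = midpoint_moments(2.0, 3.0)
print(closed)
print(numeric)
```

For \(a = b = 1\) the distribution reduces to the uniform distribution on \((0, 1)\), which gives a second simple check of the formulas.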

Score

\[ \begin{aligned} \nabla_{a} (y; a, b) &= \frac{(b - 1) \ln(y)}{y^a - 1} + b \ln(y) + \frac{1}{a} \\ \nabla_{b} (y; a, b) &= \ln \left( 1 - y^a \right) + \frac{1}{b} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{a, a} (a, b) &= \frac{1}{a^2} + \frac{b}{a^2 (b - 2)} \left( \left(\psi_0(b) - \psi_0(2) \right)^2 - \left( \psi_1(b) - \psi_1(2) \right) \right) \\ \mathcal{I}_{a, b} (a, b) &= - \frac{\psi_0(b + 1) - \psi_0(2)}{a (b - 1)} \\ \mathcal{I}_{b, b} (a, b) &= \frac{1}{b^2} \\ \end{aligned} \]

Further Reading

  • Jones, M. C. (2009). Kumaraswamy’s Distribution: A Beta-Type Distribution with Some Tractability Advantages. Statistical Methodology, 6(1), 70–81. doi: 10.1016/j.stamet.2008.04.001.

Logit-Normal Distribution

Logit-Mean-Variance Parametrization

Parameters

  • Logit-mean parameter \(m \in \mathbb{R}\)
  • Logit-variance parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | m, s) = \frac{1}{y (1 - y)}\frac{1}{\sqrt{2 \pi s}} \exp \left( - \frac{1}{2 s} \left( \ln \left( \frac{y}{1-y} \right) - m \right)^2 \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &\approx \frac{1}{K - 1} \sum_{k = 1}^{K - 1} \frac{1}{1 + \exp \left( - \Phi^{-1}_{m,s} (k / K) \right)} \\ \mathrm{var}[Y] &\approx \frac{1}{K - 1} \sum_{k = 1}^{K - 1} \left( \frac{1}{1 + \exp \left( - \Phi^{-1}_{m,s} (k / K) \right)} \right)^2 - \left( \frac{1}{K - 1} \sum_{k = 1}^{K - 1} \frac{1}{1 + \exp \left( - \Phi^{-1}_{m,s} (k / K) \right)} \right)^2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{1}{s} \left( \ln \left( \frac{y}{1-y} \right) - m \right) \\ \nabla_{s} (y; m, s) &= \frac{1}{2 s^2} \left( \ln \left( \frac{y}{1-y} \right) - m \right)^2 - \frac{1}{2s} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{2 s^2} \\ \end{aligned} \]

Note

  • The mean and variance have no closed-form expression. We use a quasi-Monte Carlo approximation with \(K = 1000\).
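
The approximation of the moments can be sketched in a few lines. The code assumes that \(\Phi^{-1}_{m,s}\) denotes the quantile function of the normal distribution with mean \(m\) and variance \(s\), so the standard deviation passed to `NormalDist` is \(\sqrt{s}\). The function name `logit_normal_moments` is hypothetical.

```python
import math
from statistics import NormalDist

def logit_normal_moments(m, s, K=1000):
    """Approximate mean and variance on the equally spaced quantile grid k / K."""
    dist = NormalDist(mu=m, sigma=math.sqrt(s))  # s is a variance
    values = [1.0 / (1.0 + math.exp(-dist.inv_cdf(k / K))) for k in range(1, K)]
    mean = sum(values) / (K - 1)
    second = sum(v * v for v in values) / (K - 1)
    return mean, second - mean ** 2

mean0, var0 = logit_normal_moments(0.0, 1.0)
mean1, var1 = logit_normal_moments(1.0, 1.0)
print(mean0, var0)
print(mean1, var1)
```

For \(m = 0\) the grid is symmetric around zero, so the approximate mean is one half up to rounding error. Shifting \(m\) upwards moves the mean above one half.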

Compositional Data

Dirichlet Distribution

Concentration Parametrization

Parameters

  • Concentration parameters \(a_i \in (0, \infty)\), \(i = 1,\ldots,n\)

Vector Notation

  • Concentration vector \(\boldsymbol{a}\) of length \(n\)

Density Function

\[ f(\boldsymbol{y} | \boldsymbol{a}) = \frac{1}{B(\boldsymbol{a})} \prod_{i=1}^n y_i^{a_i - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \frac{1}{\sum_{i=1}^n a_i} \boldsymbol{a} \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{1}{1 + \sum_{i=1}^n a_i} \left( \frac{1}{\sum_{i=1}^n a_i} \mathrm{diag}(\boldsymbol{a}) - \frac{1}{\left( \sum_{i=1}^n a_i \right)^2} \boldsymbol{a} \boldsymbol{a}^\intercal \right) \\ \end{aligned} \]
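
The mean vector and variance matrix can be assembled directly from the concentration vector. The sketch below uses plain Python lists; the function name and the example vector are assumptions for illustration.

```python
def dirichlet_moments(a):
    """Mean vector and variance matrix of the Dirichlet distribution."""
    total = sum(a)
    n = len(a)
    mean = [ai / total for ai in a]
    cov = [
        [
            ((a[i] / total if i == j else 0.0) - a[i] * a[j] / total ** 2)
            / (1.0 + total)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov

mean, cov = dirichlet_moments([1.0, 2.0, 3.0])
print(mean)
for row in cov:
    print(row)
```

Because the components of \(\boldsymbol{Y}\) sum to one, every row of the variance matrix sums to zero, which is a convenient check on an implementation.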

Score

\[ \nabla_{\boldsymbol{a}} (\boldsymbol{y}; \boldsymbol{a}) = \ln(\boldsymbol{y}) - \psi_0 (\boldsymbol{a}) + \psi_0 \left( \sum_{i=1}^n a_i \right) \boldsymbol{1}_n \]

Fisher Information

\[ \mathcal{I}_{\boldsymbol{a}, \boldsymbol{a}} (\boldsymbol{a}) = \mathrm{diag} \left( \psi_1 \left( \boldsymbol{a} \right) \right) - \psi_1 \left( \sum_{i=1}^n a_i \right) \boldsymbol{1}_{n \times n} \]

Further Reading

  • Calvori, F., Cipollini, F., and Gallo, G. M. (2013). Go with the Flow: A GAS Model For Predicting Intra-Daily Volume Shares. SSRN, 2363483. doi: 10.2139/ssrn.2363483.

Duration Data

Birnbaum–Saunders Distribution

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • Shape parameter \(a \in (0, \infty)\)

Density Function

\[ f(y | s, a) = \frac{\sqrt{\frac{s}{y}} \left( 1 + \frac{s}{y} \right)}{2 a s \sqrt{2 \pi}} \exp \left( \frac{2 - \frac{y}{s} - \frac{s}{y}}{2 a^2} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \left( 1 + \frac{a^2}{2} \right) \\ \mathrm{var}[Y] &= s^2 a^2 \left( 1 + \frac{5 a^2}{4} \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, a) &= \frac{y}{2 a^2 s^2} - \frac{1}{2 a^2 y} + \frac{1}{s + y} - \frac{1}{2 s} \\ \nabla_{a} (y; s, a) &= \frac{y}{a^3 s} + \frac{s}{a^3 y} - \frac{2 + a^2}{a^3} \\ \end{aligned} \]
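
The score expressions follow from differentiating the log-density, which can be confirmed numerically. The following sketch compares them with central finite differences. The function names, the evaluation point, and the step size are assumptions.

```python
import math

def bs_logpdf(y, s, a):
    """Log-density of the Birnbaum-Saunders distribution (scale s, shape a)."""
    ratio = s / y
    log_norm = math.log(2.0 * a * s * math.sqrt(2.0 * math.pi))
    return (0.5 * math.log(ratio) + math.log1p(ratio) - log_norm
            + (2.0 - y / s - ratio) / (2.0 * a * a))

def bs_score(y, s, a):
    """Score with respect to (s, a), as given in the formulas above."""
    score_s = (y / (2 * a**2 * s**2) - 1 / (2 * a**2 * y)
               + 1 / (s + y) - 1 / (2 * s))
    score_a = y / (a**3 * s) + s / (a**3 * y) - (2 + a**2) / a**3
    return score_s, score_a

# Illustrative values (assumed).
y, s, a, h = 1.7, 1.2, 0.8, 1e-6
score_s, score_a = bs_score(y, s, a)
fd_s = (bs_logpdf(y, s + h, a) - bs_logpdf(y, s - h, a)) / (2 * h)
fd_a = (bs_logpdf(y, s, a + h) - bs_logpdf(y, s, a - h)) / (2 * h)
print(score_s, fd_s)
print(score_a, fd_a)
```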

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, a) &= \frac{1}{a^2 s^2} \left( 1 + \frac{a}{\sqrt{2 \pi}} \left( a \sqrt{\frac{\pi}{2}} - \pi \exp \left( \frac{2}{a^2} \right) \left(1 - \Phi \left( \frac{2}{a} \right) \right) \right) \right) \\ \mathcal{I}_{s, a} (s, a) &= 0 \\ \mathcal{I}_{a, a} (s, a) &= \frac{2}{a^2} \\ \end{aligned} \]

Further Reading

  • Lemonte, A. J. (2016). A Note on the Fisher Information Matrix of the Birnbaum–Saunders Distribution. Journal of Statistical Theory and Applications, 15(2), 196–205. doi: 10.2991/jsta.2016.15.2.9.

Burr Distribution

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • First shape parameter \(a \in (0, \infty)\)
  • Second shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | s, a, b) = \frac{a b}{s} \left( \frac{y}{s} \right)^{a - 1} \left( 1 + \left( \frac{y}{s} \right)^a \right)^{-b - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s b B \left( b - \frac{1}{a}, 1 + \frac{1}{a} \right), & \quad \text{for } a b &> 1 \\ \mathrm{var}[Y] &= s^2 b B \left( b - \frac{2}{a}, 1 + \frac{2}{a} \right) - s^2 b^2 B \left( b - \frac{1}{a}, 1 + \frac{1}{a} \right)^2, & \quad \text{for } a b &> 2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, a, b) &= \frac{a}{s} \left(b \left (\frac{y}{s} \right)^a - 1 \right) \left( \left( \frac{y}{s} \right)^a + 1 \right)^{-1} \\ \nabla_{a} (y; s, a, b) &= \frac{1}{a} - \left( b \left( \frac{y}{s} \right)^a - 1 \right) \ln \left( \frac{y}{s} \right) \left( \left( \frac{y}{s} \right)^a + 1 \right)^{-1} \\ \nabla_{b} (y; s, a, b) &= \frac{1}{b} - \ln \left( \left( \frac{y}{s} \right)^a + 1 \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, a, b) &= \frac{a^2 b}{s^2 (b + 2)} \\ \mathcal{I}_{s, a} (s, a, b) &= - \frac{b ( 1 - \gamma - \psi_0(b + 1))}{s (b + 2)} \\ \mathcal{I}_{s, b} (s, a, b) &= - \frac{a}{s (b + 1)} \\ \mathcal{I}_{a, a} (s, a, b) &= \frac{1}{a^2} \left( 1 + \frac{b}{b + 2} \left( \frac{\pi^2}{6} + \gamma^2 - 2 \gamma + 2 (\gamma - 1) \psi_0(b + 1) + \psi_0(b + 1)^2 + \psi_1(b + 1) \right) \right) \\ \mathcal{I}_{a, b} (s, a, b) &= \frac{1 - \gamma - \psi_0(b)}{a (b + 1)} \\ \mathcal{I}_{b, b} (s, a, b) &= \frac{1}{b^2} \\ \end{aligned} \]

Further Reading

  • Watkins, A. J. (1997). A Note on Expected Fisher Information for the Burr XII Distribution. Microelectronics Reliability, 37(12), 1849–1852. doi: 10.1016/s0026-2714(97)00030-9.

Exponential Distribution

Rate Parametrization

Parameter

  • Rate parameter \(r \in (0, \infty)\)

Density Function

\[ f(y | r) = r \exp \left( -r y \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \\ \mathrm{var}[Y] &= \frac{1}{r^2} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r} (y; r) &= \frac{1}{r} - y \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r, r} (r) &= \frac{1}{r^2} \\ \end{aligned} \]

Scale Parametrization

Parameter

  • Scale parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | s) = \frac{1}{s} \exp \left( - \frac{y}{s} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \\ \mathrm{var}[Y] &= s^2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s) &= \frac{y - s}{s^2} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s) &= \frac{1}{s^2} \\ \end{aligned} \]
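
The rate and scale parametrizations are linked by \(r = 1 / s\), so the score transforms by the Jacobian \(\mathrm{d} r / \mathrm{d} s = -1 / s^2\) and the Fisher information by its square. The sketch below checks this numerically. The function names and the values of `y` and `s` are assumptions.

```python
def score_rate(y, r):
    return 1.0 / r - y

def info_rate(r):
    return 1.0 / r ** 2

def score_scale(y, s):
    return (y - s) / s ** 2

def info_scale(s):
    return 1.0 / s ** 2

y, s = 2.3, 1.6
r = 1.0 / s
jacobian = -1.0 / s ** 2  # derivative of r = 1 / s with respect to s

print(score_rate(y, r) * jacobian, score_scale(y, s))   # chain rule for the score
print(info_rate(r) * jacobian ** 2, info_scale(s))      # squared Jacobian for the information
```

The same chain-rule argument connects the rate and scale versions of the other duration distributions in this section.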

Further Reading

  • Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

Exponential-Logarithmic Distribution

Rate Parametrization

Parameters

  • Rate parameter \(r \in (0, \infty)\)
  • Shape parameter \(p \in (0, 1)\)

Density Function

\[ f(y | r, p) = \frac{r}{- \ln(p)} \frac{(1 - p) \exp(-r y)}{1 - (1 - p) \exp(-r y)} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= - \frac{\mathrm{Li}_2(1 - p)}{r \ln(p)} \\ \mathrm{var}[Y] &= - 2 \frac{\mathrm{Li}_3(1 - p)}{r^2 \ln(p)} - \left( \frac{\mathrm{Li}_2(1 - p)}{r \ln(p)} \right)^2 \end{aligned} \]
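
The moments involve the polylogarithm \(\mathrm{Li}_n\), which is not part of Python's standard library. The sketch below evaluates it by its defining power series \(\sum_{k \geq 1} z^k / k^n\), which converges for \(|z| < 1\) but slowly when \(p\) is close to zero, and compares the resulting moments with a midpoint-rule integration of the density. The helper names, the integration range, and the parameter values are assumptions.

```python
import math

def polylog(order, z, tol=1e-16, max_terms=100000):
    """Polylogarithm Li_order(z) for |z| < 1 via its power series."""
    total, power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        power *= z
        term = power / k ** order
        total += term
        if abs(term) < tol:
            break
    return total

def exp_log_moments(r, p):
    """Mean and variance from the closed-form expressions above."""
    log_p = math.log(p)
    mean = -polylog(2, 1 - p) / (r * log_p)
    var = -2 * polylog(3, 1 - p) / (r ** 2 * log_p) - mean ** 2
    return mean, var

def exp_log_pdf(y, r, p):
    e = (1 - p) * math.exp(-r * y)
    return (r / (-math.log(p))) * e / (1 - e)

def numeric_moments(r, p, upper=30.0, n=60000):
    """Mean and variance by the midpoint rule on (0, upper)."""
    h = upper / n
    ys = [(i + 0.5) * h for i in range(n)]
    ws = [exp_log_pdf(y, r, p) * h for y in ys]
    mean = sum(y * w for y, w in zip(ys, ws))
    second = sum(y * y * w for y, w in zip(ys, ws))
    return mean, second - mean ** 2

closed = exp_log_moments(2.0, 0.5)
numeric = numeric_moments(2.0, 0.5)
print(closed)
print(numeric)
```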

Score

\[ \begin{aligned} \nabla_{r} (y; r, p) &= \frac{1}{r} - y - \frac{y (1 - p) \exp(-r y)}{1 - (1 - p) \exp(-r y)} \\ \nabla_{p} (y; r, p) &= -\frac{1}{p \ln(p)} - \frac{1}{1 - p} - \frac{\exp(-r y)}{1 - (1 - p) \exp(-r y)} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r, r} (r, p) &= -\frac{\mathrm{Li}_2(1 - p)}{r^2 \ln(p)} \\ \mathcal{I}_{r, p} (r, p) &= \frac{1 - p + p \ln(p)}{2 r p (1 - p) \ln(p)} \\ \mathcal{I}_{p, p} (r, p) &= \frac{1}{(1 - p)^2} - \frac{\ln(p) + 1}{(p \ln(p))^2} + \frac{1 - 4 p + 3 p^2 - 2 p^2 \ln(p)}{2 p^2 (1 - p)^2 \ln(p)} \\ \end{aligned} \]

Further Reading

  • Tahmasbi, R. and Rezaei, S. (2008). A Two-Parameter Lifetime Distribution with Decreasing Failure Rate. Computational Statistics and Data Analysis, 52(8), 3889–3901. doi: 10.1016/j.csda.2007.12.002.

Fisk Distribution

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • Shape parameter \(a \in (0, \infty)\)

Density Function

\[ f(y | s, a) = \frac{a}{s} \left( \frac{y}{s} \right)^{a - 1} \left( 1 + \left ( \frac{y}{s} \right)^a \right)^{-2} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \frac{\pi / a}{\sin(\pi / a)}, & \quad \text{for } a &> 1 \\ \mathrm{var}[Y] &= s^2 \left( \frac{2 \pi / a}{\sin(2 \pi / a)} - \frac{\pi^2 / a^2}{\sin(\pi / a)^2} \right), & \quad \text{for } a &> 2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, a) &= \frac{a}{s} \left( \left( \frac{y}{s} \right)^a - 1 \right) \left( \left( \frac{y}{s} \right)^a + 1 \right)^{-1} \\ \nabla_{a} (y; s, a) &= \frac{1}{a} - \left( \left( \frac{y}{s} \right)^a - 1 \right) \ln \left( \frac{y}{s} \right) \left( \left( \frac{y}{s} \right)^a + 1 \right)^{-1} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, a) &= \frac{a^2}{3 s^2} \\ \mathcal{I}_{s, a} (s, a) &= 0 \\ \mathcal{I}_{a, a} (s, a) &= \frac{\pi^2 + 3}{9 a^2} \\ \end{aligned} \]

Gamma Distribution

Rate Parametrization

Parameters

  • Rate parameter \(r \in (0, \infty)\)
  • Shape parameter \(a \in (0, \infty)\)

Density Function

\[ f(y | r, a) = \frac{r}{\Gamma(a)} (r y)^{a - 1} \exp \left( -r y \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{a}{r} \\ \mathrm{var}[Y] &= \frac{a}{r^2} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r} (y; r, a) &= \frac{a - r y}{r} \\ \nabla_{a} (y; r, a) &= \ln(r y) - \psi_0(a) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r, r} (r, a) &= \frac{a}{r^2} \\ \mathcal{I}_{r, a} (r, a) &= - \frac{1}{r} \\ \mathcal{I}_{a, a} (r, a) &= \psi_1(a) \\ \end{aligned} \]

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • Shape parameter \(a \in (0, \infty)\)

Density Function

\[ f(y | s, a) = \frac{1}{\Gamma(a)} \frac{1}{s} \left( \frac{y}{s} \right)^{a - 1} \exp \left( - \frac{y}{s} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= a s \\ \mathrm{var}[Y] &= a s^2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, a) &= \frac{y - a s}{s^2} \\ \nabla_{a} (y; s, a) &= \ln \left( \frac{y}{s} \right) - \psi_0(a) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, a) &= \frac{a}{s^2} \\ \mathcal{I}_{s, a} (s, a) &= \frac{1}{s} \\ \mathcal{I}_{a, a} (s, a) &= \psi_1(a) \\ \end{aligned} \]
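
The Fisher information matrix of the scale parametrization can be assembled once the trigamma function \(\psi_1\) is available. Python's standard library does not provide it, so the sketch below uses a hypothetical helper `trigamma` based on the recurrence \(\psi_1(x) = \psi_1(x + 1) + 1 / x^2\) and a short asymptotic series, accurate to roughly eight digits. The function names and parameter values are assumptions.

```python
import math

def trigamma(x):
    """Trigamma function psi_1(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:              # shift the argument upwards
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))
    return result + inv + 0.5 * inv2 + series

def gamma_fisher_scale(s, a):
    """Fisher information matrix for (s, a) in the scale parametrization."""
    return [[a / s ** 2, 1.0 / s],
            [1.0 / s, trigamma(a)]]

s, a = 3.0, 2.5
info = gamma_fisher_scale(s, a)
det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
print(info)
print(det, (a * trigamma(a) - 1.0) / s ** 2)
```

The determinant equals \((a \psi_1(a) - 1) / s^2\), which is positive for every \(a > 0\), so the matrix is invertible.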

Further Reading

  • Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

Generalized Gamma Distribution

Rate Parametrization

Parameters

  • Rate parameter \(r \in (0, \infty)\)
  • First shape parameter \(a \in (0, \infty)\)
  • Second shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | r, a, b) = \frac{r b}{\Gamma(a)} (r y)^{a b - 1} \exp \left( -(r y)^b \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \\ \mathrm{var}[Y] &= \frac{1}{r^2} \left( \frac{\Gamma \left(a + 2 b^{-1} \right)}{\Gamma \left( a \right) } - \left( \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \right)^2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r} (y; r, a, b) &= \frac{b}{r} \left( a - (r y)^b \right) \\ \nabla_{a} (y; r, a, b) &= b \ln(r y) - \psi_0(a) \\ \nabla_{b} (y; r, a, b) &= \left( a - (r y)^b \right) \ln (r y) + \frac{1}{b} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r, r} (r, a, b) &= \frac{a b^2}{r^2} \\ \mathcal{I}_{r, a} (r, a, b) &= - \frac{b}{r} \\ \mathcal{I}_{r, b} (r, a, b) &= \frac{a \psi_0(a) + 1}{r} \\ \mathcal{I}_{a, a} (r, a, b) &= \psi_1(a) \\ \mathcal{I}_{a, b} (r, a, b) &= - \frac{\psi_0(a)}{b} \\ \mathcal{I}_{b, b} (r, a, b) &= \frac{a \psi_0(a)^2 + 2 \psi_0(a) + a \psi_1(a) + 1}{b^2} \\ \end{aligned} \]

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • First shape parameter \(a \in (0, \infty)\)
  • Second shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | s, a, b) = \frac{1}{\Gamma(a)} \frac{b}{s} \left( \frac{y}{s} \right)^{a b - 1} \exp \left( - \left( \frac{y}{s} \right)^b \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \\ \mathrm{var}[Y] &= s^2 \left( \frac{\Gamma \left(a + 2 b^{-1} \right)}{\Gamma \left( a \right) } - \left( \frac{\Gamma \left(a + b^{-1} \right)}{\Gamma \left( a \right) } \right)^2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, a, b) &= \frac{b}{s} \left( \left( \frac{y}{s} \right)^b - a \right) \\ \nabla_{a} (y; s, a, b) &= b \ln \left( \frac{y}{s} \right) - \psi_0(a) \\ \nabla_{b} (y; s, a, b) &= \left( a - \left( \frac{y}{s} \right)^b \right) \ln \left( \frac{y}{s} \right) + \frac{1}{b} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, a, b) &= \frac{a b^2}{s^2} \\ \mathcal{I}_{s, a} (s, a, b) &= \frac{b}{s} \\ \mathcal{I}_{s, b} (s, a, b) &= - \frac{a \psi_0(a) + 1}{s} \\ \mathcal{I}_{a, a} (s, a, b) &= \psi_1(a) \\ \mathcal{I}_{a, b} (s, a, b) &= - \frac{\psi_0(a)}{b} \\ \mathcal{I}_{b, b} (s, a, b) &= \frac{a \psi_0(a)^2 + 2 \psi_0(a) + a \psi_1(a) + 1}{b^2} \\ \end{aligned} \]

Further Reading

  • Park, T. R. (2014). Derivation of the Fisher Information Matrix for 4-Parameter Generalized Gamma Distribution Using Mathematica. Journal of the Chosun Natural Science, 7(2), 138–144. doi: 10.13160/ricns.2014.7.2.138.

  • Stacy, E. W. (1962). A Generalization of the Gamma Distribution. The Annals of Mathematical Statistics, 33(3), 1187–1192. doi: 10.1214/aoms/1177704481.

  • Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

Log-Normal Distribution

Log-Mean-Variance Parametrization

Parameters

  • Log-mean parameter \(m \in \mathbb{R}\)
  • Log-variance parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | m, s) = \frac{1}{y}\frac{1}{\sqrt{2 \pi s}} \exp \left( - \frac{\left( \ln(y) - m \right)^2}{2 s} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \exp \left( m + \frac{s}{2} \right) \\ \mathrm{var}[Y] &= \left( \exp(s) - 1 \right) \exp \left( 2 m + s \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{\ln(y) - m}{s} \\ \nabla_{s} (y; m, s) &= \frac{\left( \ln(y) - m \right)^2}{2 s^2} - \frac{1}{2s} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{2 s^2} \\ \end{aligned} \]

Lomax Distribution

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • Shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | s, b) = \frac{b}{s} \left( 1 + \frac{y}{s} \right)^{-b - 1} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{s}{b - 1}, & \quad \text{for } b &> 1 \\ \mathrm{var}[Y] &= \frac{s^2 b}{(b - 1)^2 (b - 2)}, & \quad \text{for } b &> 2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, b) &= \frac{1}{s} \left( b \frac{y}{s} - 1 \right) \left( \frac{y}{s} + 1 \right)^{-1} \\ \nabla_{b} (y; s, b) &= \frac{1}{b} - \ln \left( \frac{y}{s} + 1 \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, b) &= \frac{b}{s^2 (b + 2)} \\ \mathcal{I}_{s, b} (s, b) &= - \frac{1}{s (b + 1)} \\ \mathcal{I}_{b, b} (s, b) &= \frac{1}{b^2} \\ \end{aligned} \]

Rayleigh Distribution

Scale Parametrization

Parameter

  • Scale parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | s) = \frac{y}{s^2} \exp \left(- \frac{y^2}{2 s^2} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \sqrt{\frac{\pi}{2}} \\ \mathrm{var}[Y] &= s^2 \frac{4 - \pi}{2}\\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s) &= \frac{y^2 - 2 s^2}{s^3} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s) &= \frac{4}{s^2} \\ \end{aligned} \]

Weibull Distribution

Rate Parametrization

Parameters

  • Rate parameter \(r \in (0, \infty)\)
  • Shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | r, b) = r b (r y)^{b - 1} \exp \left( -(r y)^b \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= \frac{1}{r} \Gamma \left(1 + b^{-1} \right) \\ \mathrm{var}[Y] &= \frac{1}{r^2} \left( \Gamma \left(1 + 2 b^{-1} \right) - \Gamma \left(1 + b^{-1} \right)^2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{r} (y; r, b) &= \frac{b}{r} \left( 1 - (r y)^b \right) \\ \nabla_{b} (y; r, b) &= \left( 1 - (r y)^b \right) \ln (r y) + \frac{1}{b} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{r, r} (r, b) &= \frac{b^2}{r^2} \\ \mathcal{I}_{r, b} (r, b) &= \frac{\psi_0(1) + 1}{r} \\ \mathcal{I}_{b, b} (r, b) &= \frac{\psi_0(1)^2 + 2 \psi_0(1) + \psi_1(1) + 1}{b^2} \\ \end{aligned} \]

Scale Parametrization

Parameters

  • Scale parameter \(s \in (0, \infty)\)
  • Shape parameter \(b \in (0, \infty)\)

Density Function

\[ f(y | s, b) = \frac{b}{s} \left( \frac{y}{s} \right)^{b - 1} \exp \left( - \left( \frac{y}{s} \right)^b \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= s \Gamma \left(1 + b^{-1} \right) \\ \mathrm{var}[Y] &= s^2 \left( \Gamma \left(1 + 2 b^{-1} \right) - \Gamma \left(1 + b^{-1} \right)^2 \right) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{s} (y; s, b) &= \frac{b}{s} \left( \left( \frac{y}{s} \right)^b - 1 \right) \\ \nabla_{b} (y; s, b) &= \left( 1 - \left( \frac{y}{s} \right)^b \right) \ln \left( \frac{y}{s} \right) + \frac{1}{b} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{s, s} (s, b) &= \frac{b^2}{s^2} \\ \mathcal{I}_{s, b} (s, b) &= - \frac{\psi_0(1) + 1}{s} \\ \mathcal{I}_{b, b} (s, b) &= \frac{\psi_0(1)^2 + 2 \psi_0(1) + \psi_1(1) + 1}{b^2} \\ \end{aligned} \]
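
The constants in the Fisher information have known values: \(\psi_0(1) = -\gamma\), with \(\gamma \approx 0.5772\) the Euler–Mascheroni constant, and \(\psi_1(1) = \pi^2 / 6\). The sketch below builds the matrix for the scale parametrization from these constants. The function name and parameter values are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant; psi_0(1) = -EULER_GAMMA
PSI1_AT_1 = math.pi ** 2 / 6      # psi_1(1)

def weibull_fisher_scale(s, b):
    """Fisher information matrix for (s, b) in the scale parametrization."""
    psi0 = -EULER_GAMMA
    i_ss = b ** 2 / s ** 2
    i_sb = -(psi0 + 1.0) / s
    i_bb = (psi0 ** 2 + 2.0 * psi0 + PSI1_AT_1 + 1.0) / b ** 2
    return [[i_ss, i_sb], [i_sb, i_bb]]

s, b = 2.0, 1.5
info = weibull_fisher_scale(s, b)
det = info[0][0] * info[1][1] - info[0][1] ** 2
print(info)
print(det, math.pi ** 2 / (6.0 * s ** 2))
```

Since \(\psi_0(1)^2 + 2 \psi_0(1) + 1 = (1 - \gamma)^2\), the determinant simplifies to \(\pi^2 / (6 s^2)\) and does not depend on the shape \(b\).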

Further Reading

  • Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.

Real Data

Asymmetric Laplace Distribution

Mean-Scale Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Scale parameter \(s \in (0, \infty)\)
  • Asymmetry parameter \(a \in (0, \infty)\)

Density Function

\[ f(y | m, s, a) = \frac{1}{s \left( 1 / a + a\right)} \exp \left\{- \frac{\lvert y - m \rvert}{s} a^{\mathrm{sign}(y - m)} \right\} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m + s (1 / a - a) \\ \mathrm{var}[Y] &= s^2 (1 / a^2 + a^2) \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s, a) &= \frac{\mathrm{sign}(y - m) a^{\mathrm{sign}(y - m)}}{s} \\ \nabla_{s} (y; m, s, a) &= \frac{\lvert y - m \rvert a^{\mathrm{sign}(y - m)}}{s^2} - \frac{1}{s} \\ \nabla_{a} (y; m, s, a) &= -\frac{(y - m) a^{\mathrm{sign}(y - m) - 1}}{s} + \frac{1 - a^2}{a + a^3} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s, a) &= \frac{1}{s^2} \\ \mathcal{I}_{m, s} (m, s, a) &= 0 \\ \mathcal{I}_{m, a} (m, s, a) &= -\frac{2}{s (1 + a^2)} \\ \mathcal{I}_{s, s} (m, s, a) &= \frac{1}{s^2} \\ \mathcal{I}_{s, a} (m, s, a) &= -\frac{1}{s a} \frac{1 - a^2}{1 + a^2} \\ \mathcal{I}_{a, a} (m, s, a) &= \frac{1}{a^2} + \frac{4}{(1 + a^2)^2} \\ \end{aligned} \]
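
The density and moments above can be cross-checked by numerical integration. This is a rough sketch under stated assumptions: the helper names are hypothetical, the parameter values are arbitrary, and the integration window of 60 units on each side of \(m\) is wide enough only for scales of this order. The midpoint rule is applied separately on each side of the kink at \(y = m\).

```python
import math

def al_pdf(y, m, s, a):
    """Density of the asymmetric Laplace distribution as given above."""
    sign = 1.0 if y > m else (-1.0 if y < m else 0.0)
    return math.exp(-abs(y - m) / s * a ** sign) / (s * (1.0 / a + a))

def al_moments_numeric(m, s, a, half_width=60.0, steps=60000):
    """Midpoint-rule estimates of total mass, mean, and variance."""
    h = half_width / steps
    total = mean = second = 0.0
    for k in range(steps):
        # One midpoint to the left of m and one to the right of m.
        for y in (m - (k + 0.5) * h, m + (k + 0.5) * h):
            w = al_pdf(y, m, s, a) * h
            total += w
            mean += w * y
            second += w * y * y
    return total, mean, second - mean ** 2

# Arbitrary illustrative values: m = 0.5, s = 1.2, a = 0.8.
m, s, a = 0.5, 1.2, 0.8
total, mean, var = al_moments_numeric(m, s, a)
print(total)                                  # close to 1
print(mean, m + s * (1.0 / a - a))            # numerical vs. closed form
print(var, s ** 2 * (1.0 / a ** 2 + a ** 2))  # numerical vs. closed form
```

Note that \(\mathrm{E}[Y]\) differs from \(m\) unless \(a = 1\), in which case the distribution reduces to the symmetric Laplace distribution.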

Laplace Distribution

Mean-Scale Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Scale parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | m, s) = \frac{1}{2s} \exp \left\{- \frac{\lvert y - m \rvert}{s} \right\} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= 2s^2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{\mathrm{sign}(y - m)}{s} \\ \nabla_{s} (y; m, s) &= \frac{\lvert y - m \rvert}{s^2} - \frac{1}{s} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s^2} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{s^2} \\ \end{aligned} \]
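
Setting the summed score to zero gives the maximum likelihood estimates: the sample median for \(m\) and the mean absolute deviation around it for \(s\). The sketch below uses hypothetical function names and a made-up data set. It takes \(\mathrm{sign}(0) = 0\) as a convention, since the score for \(m\) is not defined at \(y = m\).

```python
import statistics

def laplace_score_sums(data, m, s):
    """Sum of the score over all observations, with sign(0) taken as 0."""
    sum_m = sum((y > m) - (y < m) for y in data) / s
    sum_s = sum(abs(y - m) for y in data) / s ** 2 - len(data) / s
    return sum_m, sum_s

def laplace_mle(data):
    """Median and mean absolute deviation solve the score equations."""
    m_hat = statistics.median(data)
    s_hat = sum(abs(y - m_hat) for y in data) / len(data)
    return m_hat, s_hat

data = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up observations
m_hat, s_hat = laplace_mle(data)
score_sums = laplace_score_sums(data, m_hat, s_hat)
print(m_hat, s_hat)
print(score_sums)  # both entries are zero up to rounding
```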

Logistic Distribution

Mean-Scale Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Scale parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | m, s) = \frac{1}{s} \exp \left( - \frac{y - m}{s} \right) \left( 1 + \exp \left( - \frac{y - m}{s} \right) \right)^{-2} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= \frac{\pi^2}{3} s^2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{1}{s} \tanh \left( \frac{y - m}{2 s} \right) \\ \nabla_{s} (y; m, s) &= \frac{y - m}{s^2} \tanh \left( \frac{y - m}{2 s} \right) - \frac{1}{s} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{3 s^2} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{3 s^2} \left( \frac{\pi^2}{3} + 1 \right) \\ \end{aligned} \]
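
The hyperbolic tangent in the score follows from the log-density in a few steps. As a short worked derivation, with the auxiliary variable \(z\) introduced here only for convenience:

\[ \begin{aligned} z &= \frac{y - m}{s} \\ \ln f(y | m, s) &= - \ln s - z - 2 \ln \left( 1 + e^{-z} \right) \\ \frac{\partial \ln f}{\partial z} &= -1 + \frac{2 e^{-z}}{1 + e^{-z}} = - \frac{1 - e^{-z}}{1 + e^{-z}} = - \tanh \left( \frac{z}{2} \right) \\ \end{aligned} \]

Applying the chain rule with \(\partial z / \partial m = -1 / s\) and \(\partial z / \partial s = -z / s\) gives

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{1}{s} \tanh \left( \frac{z}{2} \right) \\ \nabla_{s} (y; m, s) &= - \frac{1}{s} + \frac{z}{s} \tanh \left( \frac{z}{2} \right) \\ \end{aligned} \]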

Normal Distribution

Mean-Variance Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Variance parameter \(s \in (0, \infty)\)

Density Function

\[ f(y | m, s) = \frac{1}{\sqrt{2 \pi s}} \exp \left( -\frac{(y - m)^2}{2 s} \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m \\ \mathrm{var}[Y] &= s \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s) &= \frac{y - m}{s} \\ \nabla_{s} (y; m, s) &= \frac{(y - m)^2 - s}{2 s^2} \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s) &= \frac{1}{s} \\ \mathcal{I}_{m, s} (m, s) &= 0 \\ \mathcal{I}_{s, s} (m, s) &= \frac{1}{2 s^2} \\ \end{aligned} \]
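
One common use of the score together with the Fisher information is to scale the score by the inverse information, as in a Fisher scoring step. Because \(\mathcal{I}_{m, s} = 0\), the scaling acts on each parameter separately. The following is a minimal sketch with hypothetical function names and arbitrary input values.

```python
def normal_score(y, m, s):
    """Score of the normal distribution with mean m and variance s."""
    score_m = (y - m) / s
    score_s = ((y - m) ** 2 - s) / (2.0 * s ** 2)
    return score_m, score_s

def normal_scaled_score(y, m, s):
    """Score divided by the diagonal entries of the Fisher information."""
    score_m, score_s = normal_score(y, m, s)
    info_mm = 1.0 / s
    info_ss = 1.0 / (2.0 * s ** 2)
    return score_m / info_mm, score_s / info_ss

# Arbitrary illustrative values: y = 2, m = 0.5, s = 4.
scaled = normal_scaled_score(2.0, 0.5, 4.0)
print(scaled)  # equals (y - m, (y - m)^2 - s)
```

The scaled score reduces to the prediction error \(y - m\) for the mean and to \((y - m)^2 - s\) for the variance.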

Student’s t Distribution

Mean-Variance Parametrization

Parameters

  • Mean parameter \(m \in \mathbb{R}\)
  • Variance parameter \(s \in (0, \infty)\)
  • Degrees of freedom parameter \(v \in (0, \infty)\)

Density Function

\[ f(y | m, s, v) = \frac{\Gamma \left( \frac{v + 1}{2} \right)}{\Gamma \left( \frac{v}{2} \right) \sqrt{\pi s v}} \left( 1 + \frac{(y - m)^2}{s v} \right)^{-\frac{v + 1}{2}} \]

Moments

\[ \begin{aligned} \mathrm{E}[Y] &= m, & \quad \text{for } v &> 1 \\ \mathrm{var}[Y] &= \frac{v}{v - 2} s, & \quad \text{for } v &> 2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{m} (y; m, s, v) &= \frac{(v + 1) (y - m) }{(y - m)^2 + s v} \\ \nabla_{s} (y; m, s, v) &= \frac{v}{2s} \frac{(y - m)^2 - s}{(y - m)^2 + s v} \\ \nabla_{v} (y; m, s, v) &= \frac{1}{2} \frac{(y - m)^2 - s}{(y - m)^2 + s v} - \frac{1}{2} \ln \left(1 + \frac{1}{v} \frac{(y - m)^2}{s} \right) - \frac{1}{2} \psi_0 \left( \frac{v}{2} \right) + \frac{1}{2} \psi_0 \left( \frac{v + 1}{2} \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{m, m} (m, s, v) &= \frac{v + 1}{s (v + 3)} \\ \mathcal{I}_{m, s} (m, s, v) &= 0 \\ \mathcal{I}_{m, v} (m, s, v) &= 0 \\ \mathcal{I}_{s, s} (m, s, v) &= \frac{v}{2 s^2 (v + 3)} \\ \mathcal{I}_{s, v} (m, s, v) &= \frac{-1}{s (v + 1) (v + 3)} \\ \mathcal{I}_{v, v} (m, s, v) &= - \frac{1}{2} \frac{v + 5}{v (v + 1) (v + 3)} + \frac{1}{4} \psi_1 \left( \frac{v}{2} \right) - \frac{1}{4} \psi_1 \left( \frac{v + 1}{2} \right) \\ \end{aligned} \]
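
Unlike the normal distribution, the score for \(m\) is bounded: its absolute value peaks at \(\lvert y - m \rvert = \sqrt{s v}\) with the value \((v + 1) / (2 \sqrt{s v})\) and then decays, so extreme observations have little influence. The sketch below uses hypothetical function names and arbitrary parameter values. It compares the two scores and checks the formula for \(\nabla_{m}\) against a finite difference of the log-density.

```python
import math

def t_logpdf(y, m, s, v):
    """Log-density of Student's t distribution as parametrized above."""
    return (math.lgamma((v + 1.0) / 2.0) - math.lgamma(v / 2.0)
            - 0.5 * math.log(math.pi * s * v)
            - (v + 1.0) / 2.0 * math.log1p((y - m) ** 2 / (s * v)))

def t_score_m(y, m, s, v):
    """Score of Student's t distribution with respect to m."""
    return (v + 1.0) * (y - m) / ((y - m) ** 2 + s * v)

def normal_score_m(y, m, s):
    """Score of the normal distribution with respect to m, for comparison."""
    return (y - m) / s

m, s, v = 0.0, 1.0, 5.0  # arbitrary illustrative values
for y in (0.5, 2.0, 100.0):
    print(y, t_score_m(y, m, s, v), normal_score_m(y, m, s))

# Finite-difference check of the score at y = 2.
h = 1e-6
fd = (t_logpdf(2.0, m + h, s, v) - t_logpdf(2.0, m - h, s, v)) / (2.0 * h)
peak = (v + 1.0) / (2.0 * math.sqrt(s * v))
print(fd, t_score_m(2.0, m, s, v), peak)
```

For very large \(v\), `t_score_m` approaches the normal score, in line with the t distribution converging to the normal distribution.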

Further Reading

  • Blazsek, S. and Villatoro, M. (2015). Is Beta-t-EGARCH(1,1) Superior to GARCH(1,1)? Applied Economics, 47(17), 1764–1774. doi: 10.1080/00036846.2014.1000536.

  • Harvey, A. C. and Chakravarty, T. (2008). Beta-t-(E)GARCH. Cambridge Working Papers in Economics, CWPE 0840. doi: 10.17863/cam.5286.

  • Harvey, A. C. and Lange, R. J. (2018). Modeling the Interactions Between Volatility and Returns using EGARCH-M. Journal of Time Series Analysis, 39(6), 909–919. doi: 10.1111/jtsa.12419.

  • Lange, K. L., Little, R. J. A., and Taylor, J. M. G. (1989). Robust Statistical Modeling Using the t Distribution. Journal of the American Statistical Association, 84(408), 881–896. doi: 10.1080/01621459.1989.10478852.

Multivariate Real Data

Multivariate Normal Distribution

Mean-Variance Parametrization

Parameters

  • Mean parameters \(m_i \in \mathbb{R}, i = 1, \ldots, n\)
  • Variance parameters \(s_i \in (0, \infty), i = 1, \ldots, n\)
  • Covariance parameters \(c_{ij} \in \mathbb{R}, i = 2, \ldots, n, j = 1, \ldots, i - 1\)

Vector and Matrix Notation

  • Mean vector \(\boldsymbol{m}\) of length \(n\)
  • Variance-covariance matrix \(\boldsymbol{K}\) of size \(n \times n\)

Density Function

\[ f(\boldsymbol{y} | \boldsymbol{m}, \boldsymbol{K}) = \frac{1}{\sqrt{(2 \pi)^n | \boldsymbol{K}|}} \exp \left( - \frac{1}{2} (\boldsymbol{y} - \boldsymbol{m})^\intercal \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m}) \right) \]

Moments

\[ \begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \boldsymbol{m} \\ \mathrm{var}[\boldsymbol{Y}] &= \boldsymbol{K} \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{\boldsymbol{m}} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \\ \nabla_{\mathrm{vec}(\boldsymbol{K})} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}) &= \mathrm{vec} \left( \frac{1}{2} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} - \frac{1}{2} \boldsymbol{K}^{-1} \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{\boldsymbol{m}, \boldsymbol{m}} (\boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{K}^{-1} \\ \mathcal{I}_{\boldsymbol{m}, \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}) &= \boldsymbol{0} \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}) &= \frac{1}{4} \boldsymbol{K}^{-1} \otimes \boldsymbol{K}^{-1} + \frac{1}{4} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \mathrm{vec}\left(\boldsymbol{K}^{-1} \right)^\intercal \\ \end{aligned} \]
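
For a concrete feel of the score, the bivariate case can be computed by hand. The sketch below is written in plain Python for \(n = 2\) only; the helper names and the numbers are illustrative assumptions. It uses the fact that, with \(\boldsymbol{u} = \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m})\), the score for \(\boldsymbol{K}\) in matrix form is \(\frac{1}{2} (\boldsymbol{u} \boldsymbol{u}^\intercal - \boldsymbol{K}^{-1})\).

```python
def inv2(K):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mvn_score(y, m, K):
    """Score for the mean vector and, in matrix form, for K (n = 2 only)."""
    K_inv = inv2(K)
    e = [y[0] - m[0], y[1] - m[1]]
    # u = K^{-1} (y - m) is the score with respect to m.
    u = [K_inv[i][0] * e[0] + K_inv[i][1] * e[1] for i in range(2)]
    # Applying vec() to this matrix gives the score with respect to vec(K).
    score_K = [[0.5 * (u[i] * u[j] - K_inv[i][j]) for j in range(2)]
               for i in range(2)]
    return u, score_K

# Made-up illustrative values.
K = [[2.0, 0.5], [0.5, 1.0]]
score_m, score_K = mvn_score([1.0, 2.0], [0.0, 0.0], K)
print(score_m)
print(score_K)
```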

Multivariate Student’s t Distribution

Mean-Variance Parametrization

Parameters

  • Mean parameters \(m_i \in \mathbb{R}, i = 1, \ldots, n\)
  • Variance parameters \(s_i \in (0, \infty), i = 1, \ldots, n\)
  • Covariance parameters \(c_{ij} \in \mathbb{R}, i = 2, \ldots, n, j = 1, \ldots, i - 1\)
  • Degrees of freedom parameter \(v \in (0, \infty)\)

Vector and Matrix Notation

  • Mean vector \(\boldsymbol{m}\) of length \(n\)
  • Variance-covariance matrix \(\boldsymbol{K}\) of size \(n \times n\)

Density Function

\[ f(\boldsymbol{y} | \boldsymbol{m}, \boldsymbol{K}, v) = \frac{\Gamma \left( \frac{v + n}{2} \right)}{\Gamma \left( \frac{v}{2} \right) \sqrt{(v \pi)^n | \boldsymbol{K}|}} \left( 1 + \frac{1}{v} (\boldsymbol{y} - \boldsymbol{m})^\intercal \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{m}) \right)^{-\frac{v + n}{2}} \]

Moments

\[ \begin{aligned} \mathrm{E}[\boldsymbol{Y}] &= \boldsymbol{m}, & \quad \text{for } v &> 1 \\ \mathrm{var}[\boldsymbol{Y}] &= \frac{v}{v - 2} \boldsymbol{K}, & \quad \text{for } v &> 2 \\ \end{aligned} \]

Score

\[ \begin{aligned} \nabla_{\boldsymbol{m}} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \frac{v + n}{v + \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right)} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \\ \nabla_{\mathrm{vec}(\boldsymbol{K})} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \mathrm{vec} \left( \frac{1}{2} \frac{v + n}{v + \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right)} \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} - \frac{1}{2} \boldsymbol{K}^{-1} \right) \\ \nabla_{v} (\boldsymbol{y}; \boldsymbol{m}, \boldsymbol{K}, v) &= \frac{1}{2} \frac{ \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) - n }{ \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) + v} - \frac{1}{2} \ln \left( 1 + \frac{1}{v} \left(\boldsymbol{y} - \boldsymbol{m} \right)^\intercal \boldsymbol{K}^{-1} \left(\boldsymbol{y} - \boldsymbol{m} \right) \right) \\ & \qquad - \frac{1}{2} \psi_0 \left( \frac{v}{2} \right) + \frac{1}{2} \psi_0 \left( \frac{v + n}{2} \right) \\ \end{aligned} \]

Fisher Information

\[ \begin{aligned} \mathcal{I}_{\boldsymbol{m}, \boldsymbol{m}} (\boldsymbol{m}, \boldsymbol{K}, v) &= \frac{v + n}{v + n + 2} \boldsymbol{K}^{-1} \\ \mathcal{I}_{\boldsymbol{m}, \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}, v) &= \boldsymbol{0} \\ \mathcal{I}_{\boldsymbol{m}, v} (\boldsymbol{m}, \boldsymbol{K}, v) &= \boldsymbol{0} \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), \mathrm{vec}(\boldsymbol{K})} (\boldsymbol{m}, \boldsymbol{K}, v) &= \frac{1}{4} \frac{v + n}{v + n + 2} \boldsymbol{K}^{-1} \otimes \boldsymbol{K}^{-1} + \frac{1}{4} \frac{v + n - 2}{v + n + 2} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \mathrm{vec}\left(\boldsymbol{K}^{-1} \right)^\intercal \\ \mathcal{I}_{\mathrm{vec}(\boldsymbol{K}), v} (\boldsymbol{m}, \boldsymbol{K}, v) &= - \frac{1}{(v + n + 2)(v + n)} \mathrm{vec}\left(\boldsymbol{K}^{-1} \right) \\ \mathcal{I}_{v, v} (\boldsymbol{m}, \boldsymbol{K}, v) &= - \frac{1}{2} \frac{n (v + n + 4)}{v (v + n + 2)(v + n)} + \frac{1}{4} \psi_1 \left( \frac{v}{2} \right) - \frac{1}{4} \psi_1 \left( \frac{v + n}{2} \right) \\ \end{aligned} \]
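
For \(n = 1\) and \(\boldsymbol{K} = s\), these expressions should reduce to the Fisher information of the univariate Student's t distribution. The sketch below checks only the rational coefficients, using exact arithmetic and integer values of \(v\). The function names are hypothetical, and the trigamma terms are left out because they coincide trivially at \(n = 1\).

```python
from fractions import Fraction

def mvt_coefficients_at_n1(v):
    """Rational coefficients of the multivariate formulas evaluated at n = 1."""
    n = 1
    return {
        # Coefficient of K^{-1} = 1 / s.
        "mm": Fraction(v + n, v + n + 2),
        # At n = 1 both the Kronecker term and the vec term equal 1 / s^2,
        # so their coefficients simply add up.
        "KK": Fraction(v + n, 4 * (v + n + 2)) + Fraction(v + n - 2, 4 * (v + n + 2)),
        # Coefficient of vec(K^{-1}) = 1 / s.
        "Kv": Fraction(-1, (v + n + 2) * (v + n)),
        # Rational part of the (v, v) entry.
        "vv": Fraction(-n * (v + n + 4), 2 * v * (v + n + 2) * (v + n)),
    }

def t_coefficients(v):
    """Matching coefficients of the univariate Student's t formulas."""
    return {
        "mm": Fraction(v + 1, v + 3),
        "KK": Fraction(v, 2 * (v + 3)),
        "Kv": Fraction(-1, (v + 1) * (v + 3)),
        "vv": Fraction(-(v + 5), 2 * v * (v + 1) * (v + 3)),
    }

matches = all(mvt_coefficients_at_n1(v) == t_coefficients(v) for v in range(1, 50))
print(matches)
```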

Further Reading

  • Blazsek, S. and Villatoro, M. (2015). Is Beta-t-EGARCH(1,1) Superior to GARCH(1,1)? Applied Economics, 47(17), 1764–1774. doi: 10.1080/00036846.2014.1000536.

  • Harvey, A. C. and Chakravarty, T. (2008). Beta-t-(E)GARCH. Cambridge Working Papers in Economics, CWPE 0840. doi: 10.17863/cam.5286.

  • Harvey, A. C. and Lange, R. J. (2018). Modeling the Interactions Between Volatility and Returns using EGARCH-M. Journal of Time Series Analysis, 39(6), 909–919. doi: 10.1111/jtsa.12419.

  • Lange, K. L., Little, R. J. A., and Taylor, J. M. G. (1989). Robust Statistical Modeling Using the t Distribution. Journal of the American Statistical Association, 84(408), 881–896. doi: 10.1080/01621459.1989.10478852.