Picking Priors

Overview

Picking priors is both an important and difficult part of Bayesian statistics. This vignette is not intended to be an introduction to Bayesian statistics, here I assume readers are already know what a prior/posterior is. Just to review, a prior is a probability distribution representing an analysts belief in model parameters prior to seeing the data. The posterior is the (in some sense optimal) probability distribution representing what you should belief having seen the data (given your prior beliefs).

Since priors represent an analysts belief prior to seeing the data, it makes sense that priors will often be specific for a given study. For example, we don’t necessarily believe the parameters learned for an RNA-seq data analysis will be the same as for someone studying microbial communities or political gerrymandering. What’s more, we probably have different prior beliefs depending on what microbial community we are studying or how a study is set up.

There are (at least) two important reasons to think carefully about priors. First, the meaning of the posterior is conditioned on your prior accurately reflecting your beliefs. The posterior represents an optimal belief given data and given prior beliefs. If a specified prior does not reflect your beliefs well then the prior won’t have the right meaning. Of course all priors are imperfect but we do the best we can. Second, on a practical note, some really weird priors can lead to numerical issues during optimization and uncertainty quantification in fido. This later problem can appear as failure to reach the MAP estimate or an error when trying to invert the Hessian.

Overall, a prior is a single function (a probability distribution) specified jointly on parameters of interest. Still, it can be confusing to think about the prior in the joint form. Here I will instead try to simplify this and break the prior down into distinct components. While there are numerous models in fido, here I will focus on the prior for the pibble model as it is, in my opinion, the heart of fido.

Just to review, the pibble model is given by: \[ \begin{align} Y_j & \sim \text{Multinomial}\left(\pi_j \right) \\ \pi_j & = \phi^{-1}(\eta_j) \\ \eta_j &\sim N(\Lambda X_j, \Sigma) \\ \Lambda &\sim N(\Theta, \Sigma, \Gamma) \\ \Sigma &\sim W^{-1}(\Xi, \upsilon). \end{align} \]