Notation of parts of Bayes’s Theorem

The symbol \(P\) used to denote probability is a bit overloaded. To keep the notation clear, we will use the following conventions going forward in the class.

  • Probabilities or probability densities describing measured data are denoted with \(f\).

  • Probabilities or probability densities describing parameter values, hypotheses, or other non-measured quantities are denoted with \(g\).

  • A set of parameters for a given model is denoted \(\theta\).

So, if we were to write down Bayes’s theorem for a parameter estimation problem, it would be

\[\begin{aligned} g(\theta \mid y) = \frac{f(y\mid \theta)\,g(\theta)}{f(y)}. \end{aligned}\]

Probabilities or probability densities written with a \(g\) denote the prior or posterior, and those with an \(f\) denote the likelihood or evidence.
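To make this bookkeeping concrete, here is a minimal sketch of a grid-based posterior calculation in Python. The model is an assumption for illustration only (a Normal likelihood with known scale and a Normal prior on the location parameter \(\theta\)), and the variable names `f_likelihood`, `g_prior`, `f_evidence`, and `g_posterior` are hypothetical, chosen to mirror the notation.

```python
import numpy as np
import scipy.stats as st

# Illustrative model (an assumption, not from the text): Normal likelihood
# with known scale sigma = 1 and a Normal prior on the location theta.
rng = np.random.default_rng(seed=3252)
y = rng.normal(loc=2.0, scale=1.0, size=10)  # simulated measured data

theta = np.linspace(-2.0, 6.0, 400)  # grid of parameter values
dtheta = theta[1] - theta[0]

# f(y | theta): the likelihood, a density describing the measured data
f_likelihood = np.array(
    [np.prod(st.norm.pdf(y, loc=t, scale=1.0)) for t in theta]
)

# g(theta): the prior, a density describing the (non-measured) parameter
g_prior = st.norm.pdf(theta, loc=0.0, scale=3.0)

# f(y): the evidence, the normalization constant of the posterior,
# approximated here by a Riemann sum over the grid
f_evidence = np.sum(f_likelihood * g_prior) * dtheta

# g(theta | y): the posterior, by Bayes's theorem
g_posterior = f_likelihood * g_prior / f_evidence
```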

We can also define a joint probability, \(\pi(y, \theta) = f(y\mid \theta)\,g(\theta)\), such that

\[\begin{aligned} g(\theta \mid y) = \frac{\pi(y,\theta)}{f(y)}. \end{aligned}\]
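This factorization is convenient because the evidence is the marginal of the joint; integrating over the parameters gives

\[\begin{aligned} f(y) = \int \pi(y, \theta)\,\mathrm{d}\theta = \int f(y\mid \theta)\,g(\theta)\,\mathrm{d}\theta, \end{aligned}\]

so the posterior is completely determined by \(\pi(y, \theta)\).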

Note that we will use this notation specifically in the context of Bayesian inference; when speaking generally about joint probability density functions outside of that context, we may write, for example, \(f(x, y)\). The use of \(f\) for likelihoods and evidence, \(g\) for priors and posteriors, and \(\pi\) for joint probabilities in the context of Bayesian modeling helps us keep track of what is what conceptually.