Homework 2.1: Overwhelming a prior (45 pts)

[1]:

import numpy as np

As we will see as we continue in this class, it is important to carefully choose the prior distribution when building your generative models. Nonetheless, for many parameter estimation problems where there are many data, the contribution of the prior is overwhelmed by the likelihood, so the exact details (within some constraints) of the prior are less important. We will explore this computationally in this problem.

To do this exploration, we will work with a data set that comes from David Prober’s lab. In the Prober lab, they monitor activity of zebrafish larvae of different genotypes in efforts to understand how sleep is controlled. In one experiment, published in Gandhi, et al., 2015, they measured how many minutes of activity each of 17 fish larvae had over the course of the nine hours of their third night. The results are most easily stored directly in a NumPy array, which we will assign to a variable called y.

[2]:

y = np.array(
    [200, 190, 200, 249, 232, 319, 104, 93, 233,
     287, 49, 311, 225, 243, 113, 133, 179]
)

For our generative model, we will use a Normal likelihood; that is, we model the data as drawn from a Normal distribution with location parameter \(\mu\) and scale parameter \(\sigma\).

\begin{align} y_i \mid \mu, \sigma \sim \text{Normal}(\mu, \sigma)\;\forall i. \end{align}
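As a minimal sketch of what this likelihood looks like in code, the function below sums the Normal log density over independent measurements. The name `log_likelihood` is our own choice for illustration and is not prescribed by the problem.

```python
import numpy as np

y = np.array(
    [200, 190, 200, 249, 232, 319, 104, 93, 233,
     287, 49, 311, 225, 243, 113, 133, 179]
)


def log_likelihood(mu, sigma, y):
    """Log likelihood for i.i.d. Normal data with scalar mu and sigma.

    Hypothetical helper for illustration; sigma must be positive.
    """
    n = len(y)
    return (
        -n * np.log(sigma)
        - 0.5 * n * np.log(2 * np.pi)
        - np.sum((y - mu) ** 2) / (2 * sigma**2)
    )


# Evaluate at one arbitrary (mu, sigma) pair, in minutes
print(log_likelihood(200.0, 70.0, y))
```

Working with the logarithm avoids numerical underflow, since the product of 17 small densities can be very small.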

Your task in this problem is to plot the posterior distribution, \(g(\mu, \sigma \mid \{y_i\})\) for each of the following priors.

a) For the first set of priors, use improper uninformative priors. We say they are uninformative because they are very broad (infinitely so; they are not even normalizable).

\begin{align} &g(\mu) = \text{constant},\\[1em] &g(\sigma) = 1/\sigma. \end{align}

Of course, \(\mu\) is assumed to be nonnegative and \(\sigma\) is assumed to be positive.
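One possible way to evaluate this posterior is on a grid of \(\mu\) and \(\sigma\) values. The sketch below assumes the improper priors above, so the unnormalized log posterior is the log likelihood minus \(\ln \sigma\). The function name `log_post_improper` and the grid ranges are our own illustrative choices, and the sum of squares is rewritten as \(\sum_i (y_i - \mu)^2 = n(\bar{y} - \mu)^2 + \sum_i (y_i - \bar{y})^2\) so that it broadcasts over the grid.

```python
import numpy as np

y = np.array(
    [200, 190, 200, 249, 232, 319, 104, 93, 233,
     287, 49, 311, 225, 243, 113, 133, 179]
)


def log_post_improper(mu, sigma, y):
    """Unnormalized log posterior on a grid for g(mu) = const, g(sigma) = 1/sigma.

    mu and sigma are 1D arrays; the result has shape (len(sigma), len(mu)).
    Hypothetical helper for illustration.
    """
    MU, SIGMA = np.meshgrid(mu, sigma)
    n = len(y)
    sum_sq = n * (y.mean() - MU) ** 2 + np.sum((y - y.mean()) ** 2)
    # n factors of 1/sigma from the likelihood plus one from the prior
    return -(n + 1) * np.log(SIGMA) - sum_sq / (2 * SIGMA**2)


# Grid ranges are assumptions chosen by eye for these data (minutes)
mu = np.linspace(100, 300, 201)
sigma = np.linspace(40, 160, 201)

log_post = log_post_improper(mu, sigma, y)

# Subtract the maximum before exponentiating to avoid underflow
post = np.exp(log_post - log_post.max())

# Location of the grid point with the highest posterior density
i, j = np.unravel_index(post.argmax(), post.shape)
print(mu[j], sigma[i])
```

The array `post` can then be passed to whichever contour or image plotting function you prefer.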

b) Again assume \(g(\sigma) = 1/\sigma\). Now use a Normal prior for \(\mu\) with a location parameter of 225 minutes and a scale parameter of 150 minutes.

\begin{align} \mu \sim \text{Normal}(225 \text{ min}, 150 \text{ min}). \end{align}

Compared with the uniform prior for \(\mu\) in part (a), what effect does choosing this prior have on the posterior?
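To make the comparison concrete, here is a hedged sketch that adds an optional Normal log prior for \(\mu\) to a grid-based log posterior and then compares the marginal posterior mean of \(\mu\) under the two priors. The helper names `log_posterior` and `marginal_mean_mu`, the grid ranges, and the use of a simple sum over the grid to marginalize are all our own assumptions, not part of the problem statement.

```python
import numpy as np

y = np.array(
    [200, 190, 200, 249, 232, 319, 104, 93, 233,
     287, 49, 311, 225, 243, 113, 133, 179]
)


def log_posterior(mu, sigma, y, mu_0=None, sigma_0=None):
    """Unnormalized log posterior on a grid with g(sigma) = 1/sigma.

    If mu_0 is None, the prior on mu is flat; otherwise it is
    Normal(mu_0, sigma_0). Hypothetical helper for illustration.
    """
    MU, SIGMA = np.meshgrid(mu, sigma)
    n = len(y)
    sum_sq = n * (y.mean() - MU) ** 2 + np.sum((y - y.mean()) ** 2)
    log_post = -(n + 1) * np.log(SIGMA) - sum_sq / (2 * SIGMA**2)
    if mu_0 is not None:
        log_post = log_post - (MU - mu_0) ** 2 / (2 * sigma_0**2)
    return log_post


def marginal_mean_mu(mu, log_post):
    """Approximate posterior mean of mu by summing over the sigma axis."""
    post = np.exp(log_post - log_post.max())
    marginal = post.sum(axis=0)
    marginal /= marginal.sum()
    return np.sum(mu * marginal)


# Grid ranges are assumptions; mu starts at zero because mu is nonnegative
mu = np.linspace(0, 400, 401)
sigma = np.linspace(20, 250, 231)

mean_flat = marginal_mean_mu(mu, log_posterior(mu, sigma, y))
mean_wide = marginal_mean_mu(mu, log_posterior(mu, sigma, y, 225.0, 150.0))
print(mean_flat, mean_wide)
```

A single summary number is only a rough check; plotting both posteriors side by side shows any difference in shape as well as location.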

c) Now assume a scale parameter of 20 minutes for the prior for \(\mu\). That is, again take \(g(\sigma) = 1/\sigma\), but take

\begin{align} \mu \sim \text{Normal}(225 \text{ min}, 20 \text{ min}). \end{align}

How is the posterior affected? Comment on why you see the effect you do.
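One way to build intuition for the effect of the prior width is the standard conjugate result for a Normal likelihood with known \(\sigma\) and a Normal prior on \(\mu\): the posterior mean of \(\mu\) is a precision-weighted average of the prior location and the sample mean. The sketch below is only a simplification of this problem, because it fixes \(\sigma\) at a plug-in value (the sample standard deviation, our assumption) rather than treating it as unknown. The function name `conditional_posterior_mu` is hypothetical.

```python
import numpy as np

y = np.array(
    [200, 190, 200, 249, 232, 319, 104, 93, 233,
     287, 49, 311, 225, 243, 113, 133, 179]
)


def conditional_posterior_mu(y, sigma, mu_0, sigma_0):
    """Posterior of mu for known sigma and a Normal(mu_0, sigma_0) prior.

    Returns the posterior mean, the posterior standard deviation, and
    the weight placed on the prior location mu_0.
    """
    n = len(y)
    precision_likelihood = n / sigma**2
    precision_prior = 1 / sigma_0**2
    total_precision = precision_prior + precision_likelihood
    weight_prior = precision_prior / total_precision
    post_mean = weight_prior * mu_0 + (1 - weight_prior) * y.mean()
    post_sd = 1 / np.sqrt(total_precision)
    return post_mean, post_sd, weight_prior


# Plug-in value for sigma; an assumption made only for this illustration
sigma_plug_in = y.std(ddof=1)

for sigma_0 in (150.0, 20.0):
    post_mean, post_sd, weight_prior = conditional_posterior_mu(
        y, sigma_plug_in, 225.0, sigma_0
    )
    print(sigma_0, post_mean, post_sd, weight_prior)
```

The weight on the prior grows as the prior scale shrinks relative to \(\sigma / \sqrt{n}\), which is the quantity to compare against when judging whether the data overwhelm the prior.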