Calculating derivatives from data with GPs
[2]:
import os

import numpy as np
import polars as pl
import scipy.optimize
import scipy.stats as st

import cmdstanpy
import arviz as az
import bebi103

import bokeh.io
import bokeh.plotting

bokeh.io.output_notebook()
Imagine doing an experiment in bacterial growth. You can measure the optical density (OD) over time. The growth rate is given by the derivative of the log OD over time. The problem is that you cannot directly observe a derivative.
There are various ways to smooth data so that you can then compute derivatives from the smoothed curve. Gaussian processes provide a convenient way to go about this. We use a GP prior and Normal likelihood to obtain a nonparametric curve defined by a GP posterior. However, instead of directly differentiating the nonparametric curve using, e.g., finite differences, we can actually infer characteristics about its derivative because the derivative of a GP is also a GP! This is remarkably convenient, as explained in this nice paper by Peter Swain and coworkers. Specifically, if we have a Normal likelihood and a GP prior, we know the posterior distribution is also a GP. Therefore, we can get the posterior distribution for derivatives directly from the posterior distribution for nonparametric functions.
In this lesson, we will put this to use. We will use a data set from our own Tom Röschinger. I will not go into detail about the experiment, but will rather extract a single bacterial growth curve out of the data set for wild type E. coli growth measured by optical density (OD).
[3]:
# Load in data set
df = pl.read_csv(
    os.path.join(data_path, "roeschinger_growth_rate_data.csv"), comment_prefix="#"
).filter(
    (pl.col("tetracycline_conc_µg_per_mL") == 0)
    & (pl.col("promoter") == "MG1655")
    & (pl.col("well") == "A01")
)
t = df["time_min"].to_numpy() / 60
od600 = df["OD600"].to_numpy()
# Plot
p = bokeh.plotting.figure(
    frame_height=250,
    frame_width=350,
    x_axis_label='time (hr)',
    y_axis_label='OD',
    y_axis_type='log',
)
p.scatter(t, od600)
bokeh.io.show(p)
We see that we have what looks like an exponential growth phase that then reaches stationary phase with a very slow die-off. We seek to measure the rate of growth over time, which is given by the derivative of the nonparametric function we are using to describe this growth process.
Before proceeding, we will do the usual setup of centered and scaled data and the points at which we want to evaluate the nonparametric curve. For the growth curves, we will only evaluate the curves between the first and last time points; we will not extrapolate beyond them.
[4]:
t_scaled = (t - t.mean()) / t.std()
od600_scaled = (od600 - od600.mean()) / od600.std()
# Sample at 250 points
Nstar = 250
# Set up xstar
xstar = np.linspace(t_scaled.min(), t_scaled.max(), Nstar)
Derivatives of GPs
Applying linear operators to Gaussian processes results in Gaussian processes. Since differentiation is a linear operation, the result of differentiating a GP is a GP. In what follows, we will state results without proof, but you can reference Solak et al. and Swain et al. for details. We will limit ourselves to cases where the x-variables are scalar (as time is in the present example); the analysis generalizes to multidimensional \(\mathbf{x}\), but the notation gets more cumbersome, and we often encounter scalar x-variables in practice.
Recall that if I have a Normal likelihood and a GP prior with kernel \(k(x, x';\theta_k)\), then the posterior predictive distribution for the nonparametric function \(f(x)\) is
\begin{align} f(x) \mid \mathbf{y}, \mathbf{x} \sim \text{GP}(m(x), k(x, x') + \sigma^2_x \delta_{x,x'}). \end{align}
We can sample values of \(f(x)\) at specific points \(\mathbf{x}_*\) according to
\begin{align} \mathbf{f}_* \mid \mathbf{x}_*, \mathbf{x}, \mathbf{y} \sim \text{MultiNorm}(\mathbf{m}_*, \mathsf{\Sigma}_*), \end{align}
where
\begin{align} &\mathbf{m}_* = \mathsf{K}_*^\mathsf{T}\cdot\mathsf{K}_\mathbf{y}^{-1}\cdot \mathbf{y},\\[1em] &\mathsf{\Sigma}_* = \mathsf{K}_{**} - \mathsf{K}_*^\mathsf{T}\cdot\mathsf{K}_\mathbf{y}^{-1}\cdot\mathsf{K}_*. \end{align}
The covariance matrices \(\mathsf{K}\) are defined such that \(K_{ij}(x, z) = k(x_i, z_j;\theta_k)\). Specifically,
\begin{align} &\mathsf{K}_\mathbf{y} = K(\mathbf{x}, \mathbf{x}) + \text{diag}(\boldsymbol{\sigma}^2),\\[1em] &\mathsf{K}_* = K(\mathbf{x}, \mathbf{x}_*),\\[1em] &\mathsf{K}_{**} = K(\mathbf{x}_*, \mathbf{x}_*). \end{align}
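If it helps to see the linear algebra laid out, here is a minimal NumPy sketch of these expressions. It is not part of our analysis; the kernel(x1, x2) argument is a hypothetical function returning the matrix of kernel evaluations for all pairs of points, and the error is taken to be homoscedastic. In practice, we use bebi103.gp.posterior_mean_cov(), as we do below.
def posterior_mean_cov_sketch(x, y, xstar, sigma, kernel):
    """Sketch of m_* and Sigma_* for homoscedastic error sigma.

    `kernel(x1, x2)` is assumed to return the matrix of kernel
    evaluations k(x1[i], x2[j]).
    """
    # Covariance matrices built from the kernel
    Ky = kernel(x, x) + sigma**2 * np.eye(len(x))
    Kstar = kernel(x, xstar)
    Kstarstar = kernel(xstar, xstar)

    # m_* = K_*ᵀ · K_y⁻¹ · y (use a solve instead of an explicit inverse)
    mstar = Kstar.T @ np.linalg.solve(Ky, y)

    # Σ_* = K_** - K_*ᵀ · K_y⁻¹ · K_*
    Sigmastar = Kstarstar - Kstar.T @ np.linalg.solve(Ky, Kstar)

    return mstar, Sigmastar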
Let \(\mathbf{g}_*\) be the derivatives of \(f(x)\) evaluated at positions \(\mathbf{x}_*\). In order to compute these, we do exactly the same procedure as above, but with covariance matrices computed by differentiating the kernel.
Let \(\partial_1 \mathsf{K}(\mathbf{x}_1, \mathbf{x}_2)\) be the covariance matrix constructed using a kernel given by the derivative of the kernel with respect to the first variable,
\begin{align} \frac{\partial}{\partial x_1}\,k(x_1, x_2;\theta_k). \end{align}
Let \(\partial_1 \partial_2 \mathsf{K}(\mathbf{x}_1, \mathbf{x}_2)\) be the covariance matrix constructed using a kernel given by the mixed second derivative of the kernel,
\begin{align} \frac{\partial^2}{\partial x_1\,\partial x_2}\,k(x_1, x_2;\theta_k). \end{align}
Derivative of the squared exponential kernel
Since it is widely used, we show here the pertinent derivatives of the squared exponential kernel. For this kernel,
\begin{align} k_\mathrm{SE}(x_1, x_2; \alpha, \rho) = \alpha^2\,\exp\left[-\frac{(x_1-x_2)^2}{2\rho^2}\right], \end{align}
we have
\begin{align} &\frac{\partial}{\partial x_1}\,k_\mathrm{SE}(x_1, x_2;\alpha, \rho) = -\frac{\alpha^2}{\rho^2}\,(x_1 - x_2)\,\exp\left[-\frac{(x_1-x_2)^2}{2\rho^2}\right],\\[1em] &\frac{\partial^2}{\partial x_1\,\partial x_2}\,k_\mathrm{SE}(x_1, x_2;\alpha, \rho) = \frac{\alpha^2}{\rho^2}\left(1 - \frac{(x_1 - x_2)^2}{\rho^2}\right)\exp\left[-\frac{(x_1-x_2)^2}{2\rho^2}\right]. \end{align}
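These derivative kernels are straightforward to code up. Here is a sketch in NumPy that evaluates them for all pairs of points; the names d1_k_se and d1_d2_k_se are just for illustration.
def d1_k_se(x1, x2, alpha, rho):
    """∂k_SE/∂x1, evaluated for all pairs (x1[i], x2[j])."""
    dx = np.subtract.outer(x1, x2)
    return -alpha**2 / rho**2 * dx * np.exp(-dx**2 / (2 * rho**2))


def d1_d2_k_se(x1, x2, alpha, rho):
    """∂²k_SE/∂x1∂x2, evaluated for all pairs (x1[i], x2[j])."""
    dx = np.subtract.outer(x1, x2)
    return alpha**2 / rho**2 * (1 - dx**2 / rho**2) * np.exp(-dx**2 / (2 * rho**2))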
Derivatives of the Matérn kernel
For a Matérn kernel,
\begin{align} k_{\mathrm{Matern}}(x_1, x_2;\alpha, \rho, \nu) = \alpha^2\,\frac{2^{1-\nu}}{\Gamma(\nu)} \,\beta^\nu\,K_\nu\left(\beta\right), \end{align}
where
\begin{align} \beta = \left(2\nu\,\frac{(x_1-x_2)^2}{\rho^2}\right)^{1/2}, \end{align}
and \(K_\nu\) is a modified Bessel function of the second kind, we have
\begin{align} &\frac{\partial}{\partial x_1}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, \nu) = -\alpha^2\,\frac{2^{1-\nu}}{\Gamma(\nu)} \,\frac{\beta^{\nu + 1}}{x_1-x_2}\,K_{\nu-1}\left(\beta\right),\\[1em] &\frac{\partial^2}{\partial x_1\,\partial x_2}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, \nu) = \alpha^2\,\frac{2^{2-\nu}\,\nu}{\Gamma(\nu)\rho^2}\,\left(\beta^{\nu-1}\,K_{\nu-1}(\beta) - \beta^\nu\,K_{\nu-2}(\beta)\right). \end{align}
The special cases of \(\nu = 3/2\) and \(\nu = 5/2\) are worth computing, since they are widely used. First, for \(\nu = 3/2\),
\begin{align} &\frac{\partial}{\partial x_1}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, 3/2) = -\frac{3\alpha^2}{\rho^2}(x_1-x_2)\exp\left[-\left(\frac{3(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right],\\[1em] &\frac{\partial^2}{\partial x_1\,\partial x_2}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, 3/2) = \frac{3\alpha^2}{\rho^2}\left(1 - \left(\frac{3(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right)\exp\left[-\left(\frac{3(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right]. \end{align}
Now for \(\nu = 5/2\),
\begin{align} &\frac{\partial}{\partial x_1}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, 5/2) = -\frac{5\alpha^2}{3\rho^2}(x_1-x_2)\left(1 + \left(\frac{5(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right)\exp\left[-\left(\frac{5(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right],\\[1em] &\frac{\partial^2}{\partial x_1\,\partial x_2}\,k_\mathrm{Matern}(x_1, x_2;\alpha, \rho, 5/2) = \frac{5\alpha^2}{3\rho^4}\left(\left(1 + \left(\frac{5(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right)\rho^2 - 5(x_1-x_2)^2\right)\exp\left[-\left(\frac{5(x_1-x_2)^2}{\rho^2}\right)^{1/2}\right]. \end{align}
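Expressions like these are easy to get wrong, so it is worth checking them against a finite difference of the kernel itself. Below is a quick sketch of such a check for the first derivative of the \(\nu = 3/2\) kernel; the test point, parameter values, and step size are arbitrary choices for illustration.
def k_matern_32(x1, x2, alpha, rho):
    """Matérn kernel with ν = 3/2."""
    beta = np.sqrt(3 * (x1 - x2)**2) / rho
    return alpha**2 * (1 + beta) * np.exp(-beta)


def d1_k_matern_32(x1, x2, alpha, rho):
    """Analytical ∂k/∂x1 for ν = 3/2, from the expression above."""
    beta = np.sqrt(3 * (x1 - x2)**2) / rho
    return -3 * alpha**2 / rho**2 * (x1 - x2) * np.exp(-beta)


# Central finite difference vs. analytical result at an arbitrary test point
x1_test, x2_test, alpha_test, rho_test, h = 0.7, -0.3, 1.5, 0.8, 1e-6
fin_diff = (
    k_matern_32(x1_test + h, x2_test, alpha_test, rho_test)
    - k_matern_32(x1_test - h, x2_test, alpha_test, rho_test)
) / (2 * h)

print(fin_diff, d1_k_matern_32(x1_test, x2_test, alpha_test, rho_test))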
Expressions for the gradient and covariance of the gradient
For notational convenience, we define \(\partial_1\mathsf{K}_* = \partial_1 \mathsf{K}(\mathbf{x}, \mathbf{x}_*)\) and \(\partial_1\partial_2\mathsf{K}_{**} = \partial_1 \partial_2 \mathsf{K}(\mathbf{x}_*, \mathbf{x}_*)\).
With all of these definitions in place, we can write the expressions for \(\mathbf{g}_*\) and its corresponding covariance matrix \(\mathsf{\Sigma}_*^\mathbf{g}\).
\begin{align} &\mathbf{g}_* = -(\partial_1 \mathsf{K}_*)^\mathsf{T}\cdot\mathsf{K}_\mathbf{y}^{-1}\cdot \mathbf{y},\\[1em] &\mathsf{\Sigma}_*^\mathbf{g} = \partial_1\partial_2\mathsf{K}_{**} - (\partial_1\mathsf{K}_{*})^\mathsf{T}\cdot\mathsf{K}_\mathbf{y}^{-1}\cdot\partial_1\mathsf{K}_{*}. \end{align}
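To make the recipe concrete, here is a minimal NumPy sketch of these two expressions for the squared exponential kernel, using the illustrative d1_k_se and d1_d2_k_se functions sketched above. In practice, we will use bebi103.gp.posterior_mean_cov_deriv(), which does this for us.
def posterior_mean_cov_deriv_sketch(x, y, xstar, sigma, alpha, rho):
    """Sketch of g_* and Σ_*^g for the squared exponential kernel."""
    # K_y is built from the ordinary (underived) kernel
    dx = np.subtract.outer(x, x)
    Ky = alpha**2 * np.exp(-dx**2 / (2 * rho**2)) + sigma**2 * np.eye(len(x))

    # Covariance matrices built from the differentiated kernel
    d1_Kstar = d1_k_se(x, xstar, alpha, rho)
    d1_d2_Kstarstar = d1_d2_k_se(xstar, xstar, alpha, rho)

    # g_* = -(∂₁K_*)ᵀ · K_y⁻¹ · y
    gstar = -d1_Kstar.T @ np.linalg.solve(Ky, y)

    # Σ_*^g = ∂₁∂₂K_** - (∂₁K_*)ᵀ · K_y⁻¹ · ∂₁K_*
    Sigma_g_star = d1_d2_Kstarstar - d1_Kstar.T @ np.linalg.solve(Ky, d1_Kstar)

    return gstar, Sigma_g_star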
Derivatives of GPs in practice using optimization
Let’s now put these results to use. For the growth curve, we will assume the following model for the centered and scaled data.
\begin{align} &\alpha \sim \text{HalfNorm}(0, 2),\\[1em] &\rho \sim \text{InvGamma}(2, 10), \\[1em] &\sigma \sim \text{HalfNorm}(0, 1),\\[1em] &f(t) \sim \text{GP}(0, k_\mathrm{SE}(t, t'; \alpha, \rho)),\\[1em] &\text{OD}_i \sim \text{Norm}(f(t_i), \sigma)\;\;\forall i. \end{align}
We will first find the MAP values of the parameters \(\alpha\), \(\rho\), and \(\sigma\) using the optimization techniques we have already learned. We start by coding up the model.
[5]:
def log_prior(sigma, alpha, rho):
    """Log prior for homoscedastic error and GP hyperparameters"""
    # Choose 1e-6 as lower bound for sigma to protect against
    # a singular covariance matrix
    if sigma <= 1e-6 or alpha <= 0 or rho <= 0:
        return -np.inf

    lp = st.halfnorm.logpdf(sigma, 0, 1)
    lp += st.halfnorm.logpdf(alpha, 0, 2)
    lp += st.invgamma.logpdf(rho, 2, loc=0, scale=10)

    return lp


def log_likelihood(x, y, sigma, alpha, rho):
    """Normal log likelihood for centered and scaled GP."""
    diag_to_add = sigma ** 2 * np.eye(len(x))
    Ky = bebi103.gp.cov_exp_quad(x, alpha=alpha, rho=rho) + diag_to_add

    return st.multivariate_normal.logpdf(y, np.zeros_like(x), Ky)


def log_posterior(params, x, y):
    """Log posterior for GP with normal likelihood."""
    sigma, alpha, rho = params

    lp = log_prior(sigma, alpha, rho)
    if lp == -np.inf:
        return lp

    return lp + log_likelihood(x, y, sigma, alpha, rho)


def neg_log_posterior(params, x, y):
    return -log_posterior(params, x, y)
We can now perform the optimization to get the parameter values.
[6]:
# Initial guess
params_0 = np.ones(3)
# Perform optimization
args = (t_scaled, od600_scaled)
res = scipy.optimize.minimize(neg_log_posterior, params_0, args=args, method="powell")
# Extract results
sigma, alpha, rho = res.x
# Show results
print("α = {0:.4f}, ρ = {1:.4f}, σ = {2:.4f}".format(alpha, rho, sigma))
α = 1.1420, ρ = 0.5917, σ = 0.0118
/Users/bois/miniconda3/envs/bebi103_build/lib/python3.12/site-packages/scipy/optimize/_optimize.py:2473: RuntimeWarning: invalid value encountered in scalar multiply
tmp2 = (x - v) * (fx - fw)
With the parameter values in hand, we can perform the above calculations for the derivative using the covariance matrices built from the differentiated kernel. This is implemented in the bebi103.gp.posterior_mean_cov_deriv() function. We can get \(\mathbf{m}_*\) and \(\mathsf{\Sigma}_*\) using the bebi103.gp.posterior_mean_cov() function.
[7]:
mstar, Sigmastar = bebi103.gp.posterior_mean_cov(
    t_scaled, od600_scaled, xstar, sigma=sigma, kernel=bebi103.gp.exp_quad_kernel, rho=rho, alpha=alpha
)

gstar, Sigma_g_star = bebi103.gp.posterior_mean_cov_deriv(
    t_scaled, od600_scaled, xstar, sigma=sigma, kernel=bebi103.gp.exp_quad_kernel, rho=rho, alpha=alpha
)
Now that we have \(\mathbf{m}_*\) and \(\mathsf{\Sigma}_*\), along with \(\mathbf{g}_*\) and \(\mathsf{\Sigma}_*^\mathbf{g}\), we can put them to use. First, we’ll plot the nonparametric function with the data.
[8]:
# Bound of credible interval
high = mstar + 1.96 * np.sqrt(np.diag(Sigmastar))
low = mstar - 1.96 * np.sqrt(np.diag(Sigmastar))
# Uncenter/scale
tstar = t.std() * xstar + t.mean()
od600star = od600.std() * mstar + od600.mean()
low = od600.std() * low + od600.mean()
high = od600.std() * high + od600.mean()
# Make plot
p = bokeh.plotting.figure(
    frame_height=250,
    frame_width=350,
    x_axis_label='time (hr)',
    y_axis_label='OD',
    y_axis_type='log',
)

# Data and function
p.line(tstar, od600star, line_width=2, color="orange")
p = bebi103.viz.fill_between(
    tstar,
    high,
    tstar,
    low,
    show_line=False,
    patch_kwargs=dict(color="orange", alpha=0.3),
    p=p,
)
p.scatter(t, od600)
bokeh.io.show(p)
If you zoom in on the above plot, the credible interval is easier to see. Now, let’s plot the derivative! When we uncenter and unscale the derivative, we need to multiply by the standard deviation of the y-values of the observed data and divide by that of the x-values (in this case time).
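This factor follows from the chain rule applied to the centered and scaled variables,
\begin{align} \hat{t} = \frac{t - \bar{t}}{\sigma_t}, \quad \widehat{\mathrm{OD}} = \frac{\mathrm{OD} - \overline{\mathrm{OD}}}{\sigma_\mathrm{OD}} \quad\Longrightarrow\quad \frac{\mathrm{d}\,\mathrm{OD}}{\mathrm{d}t} = \frac{\sigma_\mathrm{OD}}{\sigma_t}\,\frac{\mathrm{d}\,\widehat{\mathrm{OD}}}{\mathrm{d}\hat{t}}, \end{align}
where \(\sigma_t\) and \(\sigma_\mathrm{OD}\) denote the standard deviations used for scaling.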
[9]:
# Bound of credible interval for derivative
deriv_high = gstar + 1.96 * np.sqrt(np.diag(Sigma_g_star))
deriv_low = gstar - 1.96 * np.sqrt(np.diag(Sigma_g_star))
# Uncenter/scale
deriv_star = gstar * od600.std() / t.std()
deriv_low *= od600.std() / t.std()
deriv_high *= od600.std() / t.std()
# Make plot
p = bokeh.plotting.figure(
    frame_height=250,
    frame_width=350,
    x_axis_label="time (hr)",
    y_axis_label="d OD / dt (1/hr)",
)

p.line(tstar, deriv_star, line_width=2)
p = bebi103.viz.fill_between(
    tstar,
    deriv_high,
    tstar,
    deriv_low,
    show_line=False,
    patch_kwargs=dict(alpha=0.3),
    p=p,
)
bokeh.io.show(p)
We now have a measure of the derivative of the OD, but we need the derivative of the log of the OD to get what we commonly refer to as the growth rate. To compute the value of \(\sigma\) for the growth rate, we use an approximation for the variance of the ratio of Gaussian distributed random variables. If \(z = x/y\), where \(x\) and \(y\) are both Gaussian distributed, then
\begin{align} \frac{\sigma^2_z}{\mu_z^2} \approx \frac{\sigma^2_x}{\mu_x^2} + \frac{\sigma^2_y}{\mu_y^2}. \end{align}
I will not derive this result here, since I favor sampling anyway (in which there are no approximations); we will do that shortly.
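If you want to convince yourself the approximation is reasonable, a quick Monte Carlo comparison like the sketch below suffices; the means and (small) standard deviations are arbitrary choices for illustration.
rng = np.random.default_rng(3252)

# Arbitrary means and (small) standard deviations for x and y
mu_x, sigma_x = 2.0, 0.1
mu_y, sigma_y = 5.0, 0.2

# Monte Carlo estimate of the relative variance of z = x / y
x_samples = rng.normal(mu_x, sigma_x, size=100_000)
y_samples = rng.normal(mu_y, sigma_y, size=100_000)
z = x_samples / y_samples

print("Monte Carlo:  ", np.var(z) / np.mean(z) ** 2)
print("approximation:", sigma_x ** 2 / mu_x ** 2 + sigma_y ** 2 / mu_y ** 2)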
[10]:
# Compute growth rate
growth_rate = deriv_star / od600star
# Compute sigma for growth rate
sigma_growth_rate = np.sqrt(
    (deriv_star / od600star) ** 2
    * (
        np.diag(Sigmastar) * od600.std() ** 2 / od600star ** 2
        + np.diag(Sigma_g_star) * (od600.std() / t.std()) ** 2 / deriv_star ** 2
    )
)
# Compute bounds
gr_high = growth_rate + 1.96 * sigma_growth_rate
gr_low = growth_rate - 1.96 * sigma_growth_rate
Now we’re ready to plot!
[11]:
# Make plot
p = bokeh.plotting.figure(
    frame_height=250,
    frame_width=350,
    x_axis_label='time (hr)',
    y_axis_label='growth rate (1/hr)'
)

p.line(tstar, growth_rate, line_width=2)
p = bebi103.viz.fill_between(
    tstar,
    gr_high,
    tstar,
    gr_low,
    show_line=False,
    patch_kwargs=dict(alpha=0.3),
    p=p,
)
bokeh.io.show(p)
Derivatives with a Matérn kernel
A Matérn kernel provides for functions that are rougher than those from a squared exponential kernel, which tends to give very smooth functions. Let us see how the growth rate estimate changes with a Matérn kernel with \(\nu = 5/2\). The code cell below is a bit long, since we need to redefine the posterior and again do the post-processing.
[12]:
# Update posterior
def log_likelihood(x, y, sigma, alpha, rho, nu):
    """Normal log likelihood for centered and scaled GP."""
    diag_to_add = sigma ** 2 * np.eye(len(x))
    Ky = bebi103.gp.cov_matern(x, alpha=alpha, rho=rho, nu=nu) + diag_to_add

    return st.multivariate_normal.logpdf(y, np.zeros_like(x), Ky)


def log_posterior(params, x, y, nu):
    """Log posterior for GP with normal likelihood."""
    sigma, alpha, rho = params

    lp = log_prior(sigma, alpha, rho)
    if lp == -np.inf:
        return lp

    return lp + log_likelihood(x, y, sigma, alpha, rho, nu)


def neg_log_posterior(params, x, y, nu):
    return -log_posterior(params, x, y, nu)


# Set the roughness parameter
nu = 5 / 2

# Perform optimization
args = (t_scaled, od600_scaled, nu)
res = scipy.optimize.minimize(neg_log_posterior, params_0, args=args, method="powell")

# Extract results
sigma, alpha, rho = res.x

# Compute GP derivatives
gstar, Sigma_g_star = bebi103.gp.posterior_mean_cov_deriv(
    t_scaled, od600_scaled, xstar, sigma=sigma, kernel=bebi103.gp.matern_kernel, rho=rho, alpha=alpha, nu=nu
)

# Bound of credible interval for derivative
deriv_high = gstar + 1.96 * np.sqrt(np.diag(Sigma_g_star))
deriv_low = gstar - 1.96 * np.sqrt(np.diag(Sigma_g_star))

# Uncenter/scale
deriv_star = gstar * od600.std() / t.std()
deriv_low *= od600.std() / t.std()
deriv_high *= od600.std() / t.std()

# Compute growth rate
growth_rate = deriv_star / od600star

# Compute sigma for growth rate
sigma_growth_rate = np.sqrt(
    (deriv_star / od600star) ** 2
    * (
        np.diag(Sigmastar) * od600.std() ** 2 / od600star ** 2
        + np.diag(Sigma_g_star) * (od600.std() / t.std()) ** 2 / deriv_star ** 2
    )
)

# Compute bounds
gr_high = growth_rate + 1.96 * sigma_growth_rate
gr_low = growth_rate - 1.96 * sigma_growth_rate

# Add to plot
p.line(tstar, growth_rate, line_width=2, line_color='orange')
p = bebi103.viz.fill_between(
    tstar,
    gr_high,
    tstar,
    gr_low,
    show_line=False,
    patch_kwargs=dict(alpha=0.3, fill_color='orange'),
    p=p,
)

bokeh.io.show(p)
/Users/bois/miniconda3/envs/bebi103_build/lib/python3.12/site-packages/scipy/optimize/_optimize.py:2473: RuntimeWarning: invalid value encountered in scalar multiply
tmp2 = (x - v) * (fx - fw)
The growth rate for the Matérn kernel, in orange, shows a bit more wiggle and, again, low precision at short times.
Sampling derivatives with Stan
As we have already seen, sampling out of the GP posterior allows us to properly consider all values of the hyperparameters weighted by their posterior probability density. We can adjust the code from the last part of this lesson to include the derivative calculations outlined above. In the generated quantities
block, I add the extra calculations of the values of the derivatives \(\mathbf{g}_*\).
functions {
  #include gp_one_dimensional.stanfunctions
}


data {
  int<lower=1> N;
  array[N] real x;
  vector[N] y;

  int<lower=1> Nstar;
  array[Nstar] real xstar;
}


transformed data {
  real delta = 1e-8;
}


parameters {
  real<lower=0> alpha;
  real<lower=0> rho;
  real<lower=0> sigma;
}


model {
  alpha ~ normal(0.0, 2.0);
  rho ~ inv_gamma(0.5, 2.0);
  sigma ~ normal(0.0, 1.0);

  matrix[N, N] Ky = gp_exp_quad_cov(x, alpha, rho)
                    + diag_matrix(rep_vector(square(sigma), N));
  matrix[N, N] Ly = cholesky_decompose(Ky);

  y ~ multi_normal_cholesky(rep_vector(0, N), Ly);
}


generated quantities {
  vector[Nstar] fstar;
  vector[Nstar] dfstar;
  array[Nstar] real y_ppc;

  {
    // Build covariance matrices
    matrix[N, N] Ky = gp_exp_quad_cov(x, alpha, rho)
                      + diag_matrix(rep_vector(square(sigma), N));
    matrix[N, N] Ly = cholesky_decompose(Ky);
    matrix[N, Nstar] Kstar = gp_exp_quad_cov(x, xstar, alpha, rho);
    matrix[Nstar, Nstar] Kstarstar = gp_exp_quad_cov(xstar, xstar, alpha, rho);

    // Derivative covariance matrices
    matrix[N, Nstar] d1_Kstar = d1_cov_exp_quad(x, xstar, alpha, rho);
    matrix[Nstar, Nstar] d1_d2_Kstarstar = d1_d2_cov_exp_quad(xstar, xstar, alpha, rho);

    // Obtain m* and Sigma*
    vector[Nstar] mstar = gp_posterior_mstar(y, Ly, Kstar);
    matrix[Nstar, Nstar] Lstar = gp_posterior_sigmastar_cholesky(
      y, Ly, Kstar, Kstarstar, delta);

    // Obtain g* and Sigmag*
    vector[Nstar] gstar = -gp_posterior_mstar(y, Ly, d1_Kstar);
    matrix[Nstar, Nstar] L_g_star = gp_posterior_sigmastar_cholesky(
      y, Ly, d1_Kstar, d1_d2_Kstarstar, delta);

    // Sample nonparametric function
    fstar = multi_normal_cholesky_rng(mstar, Lstar);

    // Sample derivative of nonparametric function
    dfstar = multi_normal_cholesky_rng(gstar, L_g_star);

    // Posterior predictive check
    y_ppc = normal_rng(fstar, sigma);
  }
}
Note that the contents of the functions block are brought in with #include gp_one_dimensional.stanfunctions, using the Stan include files from the bebi103 package, so the individual helper functions are not displayed here.
Let’s give it a go!
[13]:
# Create data dictionary for input
data = dict(N=len(t), x=t_scaled, y=od600_scaled, Nstar=Nstar, xstar=xstar)

# Compile and sample!
with bebi103.stan.disable_logging():
    sm = cmdstanpy.CmdStanModel(
        stan_file="gp_growth_curve.stan",
        stanc_options={'include-paths': bebi103.stan.include_path()},
    )

    samples = sm.sample(data=data)

# Convert to ArviZ
samples = az.from_cmdstanpy(samples, posterior_predictive=["fstar", "dfstar", "y_ppc"])

# Check diagnostics
bebi103.stan.check_all_diagnostics(samples)
Effective sample size looks reasonable for all parameters.
Rhat looks reasonable for all parameters.
0 of 4000 (0.0%) iterations ended with a divergence.
0 of 4000 (0.0%) iterations saturated the maximum tree depth of 10.
E-BFMI indicated no pathological behavior.
[13]:
0
Great! Let’s start by checking out the hyperparameters.
[14]:
bokeh.io.show(bebi103.viz.corner(samples))
It looks good, and \(\sigma\) is small, as we saw in our MAP calculation. Now, let’s plot a posterior predictive check. To do so, we need to uncenter and unscale.
[15]:
y_ppc = (
    samples.posterior_predictive["y_ppc"]
    .stack({"sample": ("chain", "draw")})
    .transpose("sample", "y_ppc_dim_0")
)

# Uncenter and unscale
od600_ppc = od600.std() * y_ppc + od600.mean()
tstar = t.std() * xstar + t.mean()

# Make the plot
bokeh.io.show(
    bebi103.viz.predictive_regression(
        od600_ppc,
        tstar,
        data=np.stack((t, od600)).transpose(),
        x_axis_label="time (hr)",
        y_axis_label="OD600",
        y_axis_type='log',
    )
)
There is not much noise in the data, so the nonparametric curve from this GP goes right through the data points.
Now, we can look at the derivatives. We have no data to compare these with, so we just plot the posterior predictive samples of the derivative of \(f(t)\).
[16]:
dfstar_scaled = (
    samples.posterior_predictive["dfstar"]
    .stack({"sample": ("chain", "draw")})
    .transpose("sample", "dfstar_dim_0")
)

# Uncenter and unscale
dfstar = dfstar_scaled * od600.std() / t.std()

# Plot posterior predictive
bokeh.io.show(
    bebi103.viz.predictive_regression(
        dfstar, tstar, x_axis_label="time (hr)", y_axis_label="d OD / dt (1/hr)"
    )
)
We can also compute the growth rate from the samples and plot the result.
[17]:
# Grab fstar
fstar_scaled = (
    samples.posterior_predictive["fstar"]
    .stack({"sample": ("chain", "draw")})
    .transpose("sample", "fstar_dim_0")
)

# Uncenter and unscale
fstar = od600.std() * fstar_scaled + od600.mean()

# Sample of growth rate
growth_rate = dfstar.values / fstar.values

# Make plot
bokeh.io.show(
    bebi103.viz.predictive_regression(
        growth_rate, tstar, x_axis_label="time (hr)", y_axis_label="growth rate (1/hr)"
    )
)
It is apparently very hard to know the growth rate early on in the growth curve. We can also see that, due to the highly variable growth rate, these bacteria are not really experiencing purely exponential growth.
[18]:
bebi103.stan.clean_cmdstan()
Computing environment
[19]:
%load_ext watermark
%watermark -v -p numpy,scipy,polars,cmdstanpy,arviz,bokeh,bebi103,jupyterlab
print("cmdstan :", bebi103.stan.cmdstan_version())
Python implementation: CPython
Python version : 3.12.5
IPython version : 8.27.0
numpy : 1.26.4
scipy : 1.13.1
polars : 1.8.1
cmdstanpy : 1.2.4
arviz : 0.19.0
bokeh : 3.4.1
bebi103 : 0.1.26
jupyterlab: 4.2.5
cmdstan : 2.35.0