# BE/Bi 103 b: Statistical Inference in the Biological Sciences

In the prequel to this course, we developed tools to build data analysis pipelines, including the organization, preservation, sharing, and display of quantitative data. We also learned basic techniques of statistical inference, using resampling methods and taking a frequentist approach.

In this class, we go deeper into statistical modeling and inference, mostly taking a Bayesian approach. We discuss generative modeling, parameter estimation, model comparison, hierarchical modeling, Markov chain Monte Carlo, graphical display of inference results, and principled workflows. All of these topics are explored through analysis of real biological data sets.

If you are enrolled in the course, please read the Course policies below. We will not go over them in detail in class, and it is your responsibility to understand them.

## Useful links

Ed (used for course communications)

Course Zoom link (password protected)

Google doc for help queue (password protected)

Meeting recordings (password protected)

Homework solutions (password protected)

## People

Instructor

Justin Bois (bois at caltech dot edu)

TAs

Rosita Fu (rfu at caltech dot edu)

Matteo Guareschi (mmguar at caltech dot edu)

Arjuna Subramanian (amsubram at caltech dot edu)

## Lessons

- 0. Preparing for the course
- 1. Probability and the logic of scientific reasoning
- 2. Plotting posteriors
- 3. Marginalization by numerical quadrature
- 4. Conjugacy
- E1. To be completed after lesson 4
- 5. Introduction to Bayesian modeling
- 6. Parameter estimation by optimization
- E2. To be completed after lesson 6
- 7. Introduction to Markov chain Monte Carlo
- 8. Cloud computing setup and usage
- 9. Introduction to MCMC with Stan
- 10. Mixture models and label switching with MCMC
- 11. Regression with MCMC
- E3. To be completed after lesson 11
- 12. Display of MCMC results
- 13. Model building with prior predictive checks
- 14. Posterior predictive checks
- E4. To be completed after lesson 14
- 15. Collector’s box of distributions
- 16. MCMC diagnostics
- 17. A diagnostics case study: Artificial funnel of hell
- E5. To be completed after lesson 17
- 18. Model comparison
- 19. Model comparison in practice
- E6. To be completed after lesson 19
- 20. Hierarchical models
- 21. Implementation of hierarchical models
- E7. To be completed after lesson 21
- 22. Principled analysis pipelines
- 23. Simulation-based calibration and related checks in practice
- E8. To be completed after lesson 23
- 24. Introduction to Gaussian processes
- 25. Implementation of Gaussian processes
- E9. To be completed after lesson 25
- 26. Variational Bayesian inference
- 27. Wrap-up

## Recitations

- R1. Review of probability
- R2. Review of MLE
- R3. Choosing priors
- R4. Stan installation and use of AWS
- R5. A Bayesian modeling case study: Ant traffic jams
- R6. Practice model building
- R7. Introduction to Hamiltonian Monte Carlo
- R8. Discussion of HW 10 project proposals
- R9. Sampling discrete parameters with Stan

## Homework

- 0. Configuring your team
- 1. Intuitive generative modeling
- 2. Analytical and graphical methods for analysis of the posterior
- 3. Maximum a posteriori parameter estimation
- 4. Sampling with MCMC
- 5. Inference with Stan
- 6. Practice building and assessing Bayesian models
- 7. Model comparison
- 8. Hierarchical models
- 9. Principled pipelines
- 10. The grand finale
- 11. Course feedback