BE/Bi 103 b: Statistical Inference in the Biological Sciences
In the prequel to this course, we developed tools to build data analysis pipelines, including the organization, preservation, sharing, and display of quantitative data. We also learned basic techniques in statistical inference using resampling methods, taking a frequentist approach.
In this class, we go deeper into statistical modeling and inference, mostly taking a Bayesian approach. We discuss generative modeling, parameter estimation, model comparison, hierarchical modeling, Markov chain Monte Carlo, graphical display of inference results, and principled workflows. All of these topics are explored through analysis of real biological data sets.
If you are enrolled in the course, please read the Course policies below. We will not go over them in detail in class, and it is your responsibility to understand them.
Useful links
Ed (used for course communications)
Canvas (used for assignment submission/return)
Homework solutions (password protected)
People
Instructor
Justin Bois (bois at caltech dot edu)
TAs
Kayla Jackson
Zach Martinez
Kellan Moorse
Lessons and exercises
- 0. Setting up computing resources
- 1. Probability and the logic of scientific reasoning
- 2. Introduction to Bayesian modeling
- 3. Plotting posteriors
- 4. Marginalization by numerical quadrature
- 5. Conjugacy
- E1. To be completed after lesson 5
- 6. Parameter estimation by optimization
- E2. To be completed after lesson 6
- 7. Introduction to Markov chain Monte Carlo
- 8. Introduction to MCMC with Stan
- 9. Mixture models and label switching with MCMC
- 10. Variate-covariate models with MCMC
- 11. Display of MCMC results
- E3. To be completed after lesson 11
- 12. Model building with prior predictive checks
- 13. Posterior predictive checks
- E4. To be completed after lesson 13
- 14. Collector’s box of distributions
- 15. MCMC diagnostics
- 16. A diagnostics case study: Artificial funnel of hell
- E5. To be completed after lesson 16
- 17. Model comparison
- 18. Model comparison in practice
- E6. To be completed after lesson 18
- 19. Hierarchical models
- 20. Implementation of hierarchical models
- 21. Principled analysis pipelines
- E7. To be completed after lesson 21
- 22. Simulation-based calibration and related checks in practice
- 23. Introduction to Gaussian processes
- E8. To be completed after lesson 23
- 24. Implementation of Gaussian processes
- 25. Variational Bayesian inference
- E9. To be completed after lesson 25
- 26. Wrap-up
Recitations
- R1: Review of probability
- R2: Choosing priors and review of optimization
- R3: Just homework help
- R4: Introduction to Hamiltonian Monte Carlo
- R5: Bayesian model building
- R6: MCMC using Caltech’s HPC
- R7: Sampling discrete parameters with Stan
- R8: Discussion of HW 10 project proposals
- R9: Just homework help
Homework
- 1. Intuitive generative modeling
- 2. Analytical and graphical methods for analysis of the posterior
- 3. Maximum a posteriori parameter estimation
- 4. Sampling with MCMC
- 5. Inference with Stan
- 6. MCMC with ion channels
- 7. Model comparison
- 8. Hierarchical models
- 9. Principled pipelines and/or VI and/or hierarchical modeling
- 10. The grand finale
- 11. Course feedback