Reading material¶
Different authors present content in different ways. Try a few and pick the one you find most readable and accessible.
Books about Bayesian inference¶
General¶
“Bayesian data analysis” (Gelman): http://www.stat.columbia.edu/~gelman/book/
“Introduction to Probability” (Dimitri Bertsekas): covers the basics: random variables, combinatorics, derived distributions, what a PDF is and how to work with one, expectation values
“Probability and Statistics” (Morris H. DeGroot): covers classical approaches to hypothesis testing and frequentist analysis, so you can really understand, e.g., the classical tests, what p-values really are, why they are used, and how and when they are useful.
For physical sciences¶
“Bayesian Logical Data Analysis for the Physical Sciences” (Gregory)
“Data Analysis - A Bayesian Tutorial” (Sivia): building an intuition for how Bayesian statistics works
“Scientific Inference” (Simon Vaughan): basic probability theory, statistical thinking, model and data representation in computers, likelihood function, graphical summaries, basic Monte Carlo.
“Bayesian reasoning in data analysis - A critical introduction” (D’Agostini) https://www.roma1.infn.it/~dagos/WSPC/: differences in uncertainties, confidence vs. credible intervals, frequentist/maximum likelihood and Bayesian inference, and how they relate
“An Introduction to Statistical Learning with Applications in R” (James, Witten, Hastie & Tibshirani) http://www-bcf.usc.edu/~gareth/ISL/ (available online)
“Machine Learning: An Algorithmic Perspective” (Stephen Marsland) http://seat.massey.ac.nz/personal/s.r.marsland/MLbook.html
“Statistics, Data Mining, and Machine Learning in Astronomy” (Zeljko Ivezic, Andy Connolly, Jake VanderPlas, Alex Gray): for scikit-learn and astroML
“Wahrscheinlichkeitstheorie, Statistik und Datenanalyse” (von der Linden & Prüll), in German (“Probability Theory, Statistics and Data Analysis”): https://itp.tugraz.at/LV/wvl/Statistik/A_WS_pdf.pdf
Lectures¶
A similar course is available in full online at https://kipac.github.io/StatisticalMethods/
Annual astrostatistics summer school at Penn State: http://astrostatistics.psu.edu/
Stefano Andreon’s page http://www.brera.mi.astro.it/~stefano.andreon/ (click on “teaching”)
Brendon Brewer’s teaching page https://www.stat.auckland.ac.nz/~brewer/teaching.html (click on “STATS 331”)
James M. Cordes’s teaching page http://hosting.astro.cornell.edu/~cordes/A6523/
James Long’s SAMSI course https://www.stat.tamu.edu/~jlong/astrostat/
Jarle Brinchmann’s Databases & Data mining course https://github.com/jbrinchmann/DDM2017
Astrostatistics & Machine Learning course https://github.com/dirac-institute/uw-astr598-w18
Research Papers on State-of-the-Art Methods¶
In the course, we will go through some of these (you do not have to read them beforehand).
Markov Chain Monte Carlo¶
(see also the end of the 3-MCMC notebook for references)
https://arxiv.org/abs/1701.02434: A Conceptual Introduction to Hamiltonian Monte Carlo
https://arxiv.org/abs/1909.12313: A Conceptual Introduction to Markov Chain Monte Carlo Methods
Talks:
By Michael Betancourt: https://www.youtube.com/results?search_query=michael+betancourt
Diagnostics:
https://arxiv.org/abs/1903.08008: Modern convergence diagnostics
https://arxiv.org/abs/2003.07900: Convergence diagnostics with machine learning
Recommended state-of-the-art tools:
Stan https://mc-stan.org
There are also newer implementations in Python (PyMC3, https://docs.pymc.io/) and Julia (BAT, https://bat.github.io/BAT.jl/stable/).
Ensemble MCMC methods like emcee and zeus are popular for gradient-free inference. These are wrapped in autoemcee https://johannesbuchner.github.io/autoemcee/, which runs until convergence is reached.
Importance Sampling¶
(there are many resources online)
Combination with Variational Bayes and initialization: https://arxiv.org/abs/1304.7808
Recommended tools:
pypmc: https://pypmc.github.io/ is a programming library that supports Variational Bayes and Expectation Maximization algorithms for proposal optimization, and MPI for sampling.
snowline: https://johannesbuchner.github.io/snowline/ wraps the above and uses minuit to find an initial guess.
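For orientation, the basic idea that these tools build on can be sketched in a few lines. This is a minimal illustration of self-normalised importance sampling, not the API of pypmc or snowline: the 1D Gaussian target, the fixed Gaussian proposal, and all function and variable names are assumptions made for this example. In practice the proposal would be optimized rather than fixed by hand.

```python
import math
import random

def target_unnormalized(x):
    # Hypothetical toy target: Gaussian shape with mean 1.0 and width 0.5,
    # deliberately left unnormalized (its integral is 0.5 * sqrt(2*pi)).
    return math.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

def gaussian_pdf(x, mu, sigma):
    # Normalized Gaussian density, used as the proposal q(x).
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

rng = random.Random(42)
mu_q, sigma_q = 0.5, 1.0  # proposal chosen by hand; wider than the target

# Draw from the proposal and weight each sample by target / proposal.
samples = [rng.gauss(mu_q, sigma_q) for _ in range(20000)]
weights = [target_unnormalized(x) / gaussian_pdf(x, mu_q, sigma_q) for x in samples]

# The mean weight estimates the integral of the unnormalized target.
z_est = sum(weights) / len(weights)
# Self-normalized estimate of the target mean.
mean_est = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
# Effective sample size: low values signal a poorly matched proposal.
ess = sum(weights) ** 2 / sum(w * w for w in weights)

print(f"integral estimate: {z_est:.3f}")
print(f"mean estimate:     {mean_est:.3f}")
print(f"effective samples: {ess:.0f} of {len(samples)}")
```

The effective sample size is the quantity to watch: a proposal that matches the target poorly wastes most of the samples, which is why proposal optimization matters.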
Nested Sampling¶
https://arxiv.org/abs/2101.09675: Nested Sampling Methods
https://ui.adsabs.harvard.edu/search/?q=author%3A%22%5ESkilling%22+title%3ANested : papers by Skilling
https://arxiv.org/abs/1805.03924: Connection to Sequential/Population Monte Carlo
LRPS techniques:
https://arxiv.org/abs/astro-ph/0508461: The Ellipsoidal Nested sampling technique
https://arxiv.org/abs/1904.02180: Illustrations of various techniques
Diagnostics:
https://arxiv.org/abs/1804.06406: several visualisations and tests, for example checking consistency across runs.
https://arxiv.org/abs/2006.03371: checks where each newly sampled point falls in the likelihood distribution of the existing points. Is its rank (order) uniformly distributed?
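The rank check in the last item can be illustrated with a toy example. The sketch below is not taken from the paper: the toy problem (uniform prior on [0, 1], log-likelihood equal to -x), the function names, the `shrink` parameter, and the KS-like distance are all assumptions chosen for brevity. If replacement points are drawn correctly from the prior above the likelihood threshold, their insertion ranks should be uniformly distributed; a sampler that misses part of the allowed region should skew them.

```python
import random

def insertion_ranks(n_live, n_iter, shrink=1.0, seed=1):
    """Run a toy nested sampling loop and record each new point's insertion rank.

    Toy setup (assumed for illustration): prior uniform on [0, 1] and
    logL(x) = -x, so "likelihood above the threshold" means x < x_worst.
    shrink=1.0 samples the allowed region correctly; shrink<1 mimics a
    faulty sampler that only covers part of it.
    """
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    ranks = []
    for _ in range(n_iter):
        # Remove the live point with the lowest likelihood (largest x).
        worst = max(live)
        live.remove(worst)
        # Draw a replacement restricted to logL > logL(worst).
        new = rng.random() * worst * shrink
        # Rank: how many remaining live points have a lower likelihood.
        ranks.append(sum(1 for x in live if x > new))
        live.append(new)
    return ranks

def uniformity_distance(ranks, n_live):
    """Largest gap between the empirical rank CDF and the discrete uniform CDF."""
    counts = [0] * n_live
    for r in ranks:
        counts[r] += 1
    cumulative, largest_gap = 0, 0.0
    for k, c in enumerate(counts):
        cumulative += c
        gap = abs(cumulative / len(ranks) - (k + 1) / n_live)
        largest_gap = max(largest_gap, gap)
    return largest_gap

n_live, n_iter = 50, 2000
d_good = uniformity_distance(insertion_ranks(n_live, n_iter), n_live)
d_bad = uniformity_distance(insertion_ranks(n_live, n_iter, shrink=0.8), n_live)
print(f"correct sampler: distance from uniform = {d_good:.3f}")
print(f"faulty sampler:  distance from uniform = {d_bad:.3f}")
```

With these settings the faulty sampler should show a clearly larger distance than the correct one, because its new points never land among the lowest-likelihood live points.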
Recommended state-of-the-art, open-source tools: