Bayes' Theorem
Scope
This article is meant to set up the mathematical foundation of Bayesian statistics. Step-wise modeling will be explained in another article.
Set Notation
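Written in terms of two events A and B (generic symbols, which may differ from the article's original notation), Bayes' theorem, presumably the equation 1 referenced below, states:

$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \qquad (1) $$

The denominator P(B) is obtained from the law of total probability over the levels of A.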
For binary A this simplifies to:
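$$ P(B) = P(B \mid A)\, P(A) + P(B \mid \neg A)\, P(\neg A) \qquad (2) $$

with ¬A denoting the complement of A; this two-term expansion is the law of total probability for a binary event.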
Substituting equation 2 in equation 1, we get:
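$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B \mid A)\, P(A) + P(B \mid \neg A)\, P(\neg A)} \qquad (3) $$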
For categorical A with multiple levels:
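$$ P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{\sum_{j} P(B \mid A_j)\, P(A_j)} $$

where A_1, A_2, ..., A_k denote the mutually exclusive and exhaustive levels of A.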
Continuous A
The continuous form of Bayes' theorem, stated in terms of probability densities:
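$$ f(a \mid b) = \frac{f(b \mid a)\, f(a)}{\int f(b \mid a')\, f(a')\, da'} $$

where f denotes the relevant density (the symbols are again generic); the integral in the denominator plays the role of the sum in the categorical case.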
General Note
In commonly used statistical modeling methods such as GLMs, we stop with P(B|A) as given by the data; this is the likelihood function. Bayesian modeling lets us introduce prior beliefs about A into the system, either through a probability mass function or through a probability density function.
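As a concrete illustration of this note, the sketch below (not from the original article; the counts and the Beta(2, 2) prior are assumed example values) contrasts a likelihood-only estimate with a conjugate Bayesian update. Here the success probability p plays the role of A, and the observed counts play the role of B.

```python
# Minimal sketch with made-up data: 14 successes in 20 Bernoulli trials,
# and an assumed Beta(2, 2) prior on the success probability p.
import numpy as np
from scipy import stats

n, x = 20, 14

# Likelihood-only summary (what a GLM-style analysis stops at):
p_mle = x / n  # maximum-likelihood estimate of p

# Bayesian update: the Beta prior is conjugate to the binomial likelihood,
# so the posterior is Beta(2 + x, 2 + n - x) in closed form.
posterior = stats.beta(2 + x, 2 + n - x)

print(f"MLE of p:              {p_mle:.3f}")
print(f"Posterior mean of p:   {posterior.mean():.3f}")
print(f"95% credible interval: {np.round(posterior.interval(0.95), 3)}")
```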
Posterior Predictive Distribution
Introduction
A posterior predictive distribution is the distribution of possible unobserved values conditioned on the values that have been observed.
Further Reading
The Wikipedia page provides a rigorous treatment of the posterior predictive distribution.
Mathematical Form
(i) Unobserved Parameter
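If the unobserved quantity is a parameter θ and the observed data are X (generic symbols, assumed here for illustration), the distribution of interest is the posterior density of θ given X:

$$ p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{\int p(X \mid \theta')\, p(\theta')\, d\theta'} $$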
(ii) Unobserved Random Variable
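If the unobserved quantity is a future observation x̃ from the same process, the posterior predictive density marginalizes the likelihood of x̃ over the posterior of θ (same generic symbols as above):

$$ p(\tilde{x} \mid X) = \int p(\tilde{x} \mid \theta)\, p(\theta \mid X)\, d\theta = \frac{\int p(\tilde{x} \mid \theta)\, p(X \mid \theta)\, p(\theta)\, d\theta}{\int p(X \mid \theta)\, p(\theta)\, d\theta} $$

The rightmost form makes explicit the denominator discussed in the closing notes below.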
Closing Notes
a) It is not always possible to obtain an analytical solution for the posterior predictive distribution.
b) In many practical cases where an analytical solution does exist, the denominator (the normalizing constant) either equals 1 or does not affect the functional form of the posterior predictive distribution, although this is not universally true. Hence only the numerator is retained for further analysis.
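As an illustration of note (b) (an assumed example, not taken from the original article), consider a Binomial(n, p) likelihood with x observed successes and a Beta(α, β) prior on p. Working with the numerator alone,

$$ p(p \mid x) \propto p^{x}(1-p)^{n-x} \cdot p^{\alpha - 1}(1-p)^{\beta - 1} = p^{x + \alpha - 1}(1-p)^{n - x + \beta - 1}, $$

which is recognizable as a Beta(x + α, n − x + β) density without computing the normalizing constant; the posterior predictive probability of success on one further trial is then its mean, (x + α) / (n + α + β).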