Bayes' Theorem

Naveen Mathew Nathan S.
3 min read · Aug 21, 2018


Scope

This article is meant to set up the mathematical foundation of Bayesian statistics. Step-wise modeling will be covered in another article.

Set Notation

From the definition of conditional probability:

P(A|B) = P(A∩B)/P(B) = P(B|A)·P(A)/P(B)

Equation 1

For binary A, the law of total probability simplifies the denominator to:

P(B) = P(B|A)·P(A) + P(B|Aᶜ)·P(Aᶜ)

Equation 2

Substituting equation 2 in equation 1, we get:

P(A|B) = P(B|A)·P(A) / [P(B|A)·P(A) + P(B|Aᶜ)·P(Aᶜ)]
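As a minimal numerical sketch of the binary form above, consider a hypothetical diagnostic test (the prevalence, sensitivity, and false-positive rate below are assumed values for illustration, not from the article):

```python
# Hypothetical diagnostic-test example of the binary form of Bayes' theorem:
# P(A|B) = P(B|A) P(A) / (P(B|A) P(A) + P(B|A^c) P(A^c))
p_a = 0.01              # assumed prior P(A): disease prevalence
p_b_given_a = 0.95      # assumed sensitivity P(B|A)
p_b_given_not_a = 0.05  # assumed false-positive rate P(B|A^c)

# Denominator: total probability of a positive test, P(B) (equation 2)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior probability of disease given a positive test
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # much smaller than the sensitivity, due to the low prior
```

Even with a sensitive test, the posterior P(A|B) stays modest because the prior P(A) is small — the substitution above makes that trade-off explicit.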

For categorical A with multiple levels A₁, …, A_k:

P(Aᵢ|B) = P(B|Aᵢ)·P(Aᵢ) / Σⱼ P(B|Aⱼ)·P(Aⱼ)
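The categorical case can be sketched in a few lines; the prior and likelihood values here are assumptions chosen only to show the normalization:

```python
# Bayes' theorem for a categorical A with three levels:
# P(A_i|B) = P(B|A_i) P(A_i) / sum_j P(B|A_j) P(A_j)
prior = [0.5, 0.3, 0.2]       # assumed P(A_i)
likelihood = [0.1, 0.4, 0.8]  # assumed P(B|A_i)

evidence = sum(l * p for l, p in zip(likelihood, prior))  # P(B)
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]
print(posterior)  # sums to 1 by construction
```

Note how a level with a small prior can still dominate the posterior if its likelihood is large enough.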

Continuous A

The continuous form of Bayes' theorem, written in terms of densities:

f(θ|x) = f(x|θ)·f(θ) / ∫ f(x|θ′)·f(θ′) dθ′

General Note

In commonly used statistical modeling methods such as GLM, we stop with P(B|A) as given by the data — viewed as a function of the parameters A, this is the likelihood function. Bayesian modeling additionally introduces prior beliefs about A into the system, either through a probability mass function or through a probability density function.
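A minimal sketch of how a prior enters the model, using an assumed conjugate Beta–Bernoulli example (none of these numbers come from the article): with a Beta(a, b) prior on a Bernoulli success probability, the posterior after k successes in n trials is Beta(a + k, b + n − k).

```python
# Conjugate Beta-Bernoulli update: the prior density enters directly,
# and the posterior stays in the Beta family.
a, b = 2.0, 2.0  # assumed prior Beta(2, 2): weak belief centered at 0.5
k, n = 7, 10     # assumed data: 7 successes in 10 trials

a_post, b_post = a + k, b + (n - k)       # posterior is Beta(9, 5)
posterior_mean = a_post / (a_post + b_post)
print(posterior_mean)  # shrunk from the MLE 0.7 toward the prior mean 0.5
```

A GLM-style analysis would report only the MLE 0.7; the Bayesian posterior mean is pulled toward the prior, with the amount of shrinkage controlled by the prior's strength (a + b).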

Posterior Predictive Distribution

Introduction

A posterior predictive distribution is the distribution of unobserved (future) values conditioned on the observed values; it is obtained by averaging the sampling distribution of the unobserved values over the posterior distribution of the parameters.

Further Reading

The Wikipedia page provides a rigorous treatment of the posterior predictive distribution.

Mathematical Form

(i) Unobserved Parameter

For an unobserved parameter θ given observed data y₁, the posterior distribution is:

p(θ|y₁) = p(y₁|θ)·p(θ) / ∫ p(y₁|θ′)·p(θ′) dθ′

(ii) Unobserved Random Variable

If Y₂ and Y₁ are not independent given θ:

p(y₂|y₁) = ∫ p(y₂|θ, y₁)·p(θ|y₁) dθ

If Y₂ and Y₁ are conditionally independent given θ:

p(y₂|y₁) = ∫ p(y₂|θ)·p(θ|y₁) dθ
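The independence case can be approximated by Monte Carlo: draw θ from the posterior, then draw Y₂ given θ. The model below is an assumed continuation of a Beta–Bernoulli setup — a Beta(2, 2) prior updated with 7 successes in 10 trials gives a Beta(9, 5) posterior — chosen only to illustrate the integral.

```python
import random

# Monte Carlo sketch of the posterior predictive under conditional independence:
# p(y2|y1) = integral of p(y2|theta) p(theta|y1) dtheta
random.seed(0)

draws = 100_000
successes = 0
for _ in range(draws):
    theta = random.betavariate(9, 5)      # theta ~ p(theta|y1), assumed Beta(9, 5)
    successes += random.random() < theta  # y2 ~ Bernoulli(theta)

print(successes / draws)  # close to the analytic predictive mean 9/14
```

Each sampled θ induces one predictive draw, so the loop integrates the sampling distribution of Y₂ against the posterior of θ without evaluating the integral analytically.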

Closing Notes

a) It is not always possible to obtain an analytical solution for the posterior predictive distribution.

b) In many practical cases where an analytical solution does exist, the denominator is a normalizing constant that either equals 1 or does not affect the family of the posterior predictive distribution, so only the numerator is retained for further analysis. This shortcut is not universally valid and should be checked case by case.
