Probability
Definitions
For a discrete random variable \(x\) taking values in \(X= \{ x_1, x_2, \cdots, x_n\}\), the probability mass function (PMF) is
\[f: X \rightarrow [0,1] : \sum_{i=1}^n f(x_i)= 1 = \sum_{x \in X} f(x)\]The cumulative distribution function (CDF) is then:
\[F_X(x)= \mathrm{P}(X \le x)= \sum_{i:x_i \le x} f(x_i)\]A random variable is continuous if its CDF can be written as \(F_X(x)= \mathrm{P}(X \le x)= \int_{-\infty}^x f(u)\, du \quad \texttt{and} \quad f:\mathrm{R} \rightarrow [0, \infty) \quad \texttt{and} \quad \int_{-\infty}^{\infty} f(x)\, dx= 1\)
where \(f(x)\) is the probability density function (PDF) of \(x\). Then
\[\mathrm{P}(a \le X \le b)= \int_a^b f(x)\, dx\]Note that for a continuous variable \(X\), \(\mathrm{P}(X= x)= 0 \quad \forall \: x \in \mathrm{R}\); in that case
\[\mathrm{P}(x \le X \le x+dx)= f(x)\, dx\]gives a sense of the point probability of the continuous variable.
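The definitions above can be checked numerically. A minimal Python sketch, using a fair six-sided die and the standard normal density as assumed examples (neither appears in the text):

```python
import math

# Fair six-sided die: the PMF assigns 1/6 to each outcome.
pmf = {x: 1 / 6 for x in range(1, 7)}

# The PMF sums to 1 over the whole domain, as the definition requires.
total = sum(pmf.values())

# CDF: F(x) = P(X <= x), summing the PMF over outcomes x_i <= x.
def cdf(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

# Continuous case: standard normal density; P(x <= X <= x + dx) ~ f(x) dx.
def normal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

dx = 1e-4
point_mass = normal_pdf(0.0) * dx  # tiny, and -> 0 as dx -> 0
```

Note that `cdf(6)` returns (up to rounding) 1, matching the normalization condition on the PMF.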
Expectation
Discrete case:
\[\mathrm{E} \{X\} = \sum_{x\in X}x \mathrm{P}(x)\]Continuous case:
\[\mathrm{E} \{X\}= \int_{-\infty}^{\infty} x f(x)\, dx\]In the continuous case, the expectation of \(g(x)\), with \(x\) distributed according to \(f(x)\), is (replace \(x\) with \(g(x)\) in the previous eqn)
\[\mathrm{E} \{ g(x)\}= \int_{-\infty}^{\infty} g(x) f(x) dx\]Joint probability
Probability of all the events across all the variables occurring together (imagine the independent case for intuition)
Discrete case:
\[\mathrm{P}(x,y)= \mathrm{P}(x|y)\mathrm{P}(y)= \mathrm{P}(y|x)\mathrm{P}(x) \quad \texttt{(Chain rule; equating the two gives Bayes' rule)}\]When they are independent, \(\mathrm{P}(x,y)= \mathrm{P}(x) \mathrm{P}(y)\)
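The chain rule and the independence test can be verified on a small discrete joint table. The values below are assumed for illustration; `p_x`, `p_y`, and `p_x_given_y` are hypothetical helper names:

```python
# Assumed joint distribution P(x, y) over binary x, y (illustrative values).
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

def p_x(x):  # marginal P(x) = sum over y of P(x, y)
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_y(y):  # marginal P(y) = sum over x of P(x, y)
    return sum(p for (_, yi), p in joint.items() if yi == y)

def p_x_given_y(x, y):  # conditional P(x|y) = P(x, y) / P(y)
    return joint[(x, y)] / p_y(y)

# Chain rule: P(x|y) P(y) recovers every joint entry.
for (x, y), p in joint.items():
    assert abs(p_x_given_y(x, y) * p_y(y) - p) < 1e-12

# This joint is NOT independent: P(x=0, y=0) = 0.1, but P(x=0) P(y=0) = 0.12.
not_independent = abs(joint[(0, 0)] - p_x(0) * p_y(0)) > 1e-6
```

The chain rule holds for any joint table; the product form \(\mathrm{P}(x)\mathrm{P}(y)\) only matches the table when the variables are independent.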
Continuous case: When the variables are dependent, the joint pdf \(f_{XY}(x,y)\) cannot be expressed in separable terms; when they are independent, the joint pdf simplifies to \(f_{XY}(x,y)= f_X(x) f_Y(y)\), i.e., the product of two separate pdfs of the individual variables.
Marginal probability
Probability of event \(x\), accumulated over all outcomes of \(Y\).
Discrete case:
\[\mathrm{P}(x)= \sum_{y \in Y} \mathrm{P}(x, y)= \sum_{y \in Y} \mathrm{P}(x|y)\mathrm{P}(y)\]OR
\[\mathrm{P}(x)= \sum_{i} \mathrm{P}(x, y_i)= \sum_{i} \mathrm{P}(x|y_i)\mathrm{P}(y_i)\]Continuous case:
\[f_X(x)= \int_{-\infty}^{\infty} f_{XY}(x,y)\, dy\]Conditional probability
Probability of \(x\) happening given that \(y\) happens.
Discrete case:
\[\mathrm{P}(x|y)= \frac{\mathrm{P}(x,y)}{\mathrm{P}(y)}\]Continuous case:
\[f_{X|Y}(x|y)= \frac{f_{XY}(x,y)}{f_Y(y)}\]Notation
\(X,Y\) are the domains; \(x,y\) are specific points in those domains, the variables we work with. \(\mathrm{P}(x)\) is shorthand for \(\mathrm{P}(X= x)\).
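The continuous marginal and conditional formulas above can also be checked numerically. A sketch assuming an independent bivariate standard normal joint pdf (an assumed example; `marginal_x` uses a simple midpoint rule, not a library routine):

```python
import math

# Assumed joint pdf: independent standard normals, so f_XY(x, y) = f_X(x) f_Y(y).
def f_xy(x, y):
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

def f_x(x):  # standard normal pdf, the known marginal in this case
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Marginal: f_X(x) = integral over y of f_XY(x, y), midpoint rule on a wide grid.
def marginal_x(x, lo=-8.0, hi=8.0, n=4000):
    dy = (hi - lo) / n
    return sum(f_xy(x, lo + (i + 0.5) * dy) for i in range(n)) * dy

# Conditional: f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y); by independence it equals f_X(x).
def conditional_x_given_y(x, y):
    return f_xy(x, y) / f_x(y)
```

Integrating the joint pdf over \(y\) recovers the standard normal marginal, and dividing by \(f_Y(y)\) shows that conditioning on an independent variable changes nothing.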
The CDF is the sum of the probabilities of all the points up to and including the current point. Its curve is monotonically non-decreasing.
Discrete case:
\[F_X(x)= \mathrm{P}(X \le x)= \sum_{i:x_i \le x} f(x_i)\]Continuous case:
\[F_X(x)= \int_{-\infty}^x f(u)\, du \quad \texttt{OR} \quad f(x)= \frac{d}{dx} F_X(x)\]
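The PDF-as-derivative-of-CDF relation can be verified numerically. A sketch using the standard normal as an assumed example, with its CDF expressed through the error function `math.erf`:

```python
import math

def pdf(x):  # standard normal PDF f(x)
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(x):  # standard normal CDF F(x), written via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# f(x) = dF/dx: a central difference of the CDF recovers the PDF.
h = 1e-5
x = 0.7
derivative = (cdf(x + h) - cdf(x - h)) / (2 * h)  # ~ pdf(0.7)

# F is monotonically non-decreasing, as any CDF must be.
monotone = all(cdf(a) <= cdf(a + 0.1) for a in [-2.0, -1.0, 0.0, 1.0])
```

The central difference agrees with the closed-form density to high accuracy, illustrating that the CDF and PDF carry the same information in different forms.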