1.3.6.1. What is a Probability Distribution


Discrete Distributions

Mathematically, a discrete probability function, p(x), is a function
that satisfies the following properties.
1. The probability that x takes a specific value is p(x). That is,
\[ P[X = x] = p(x) = p_{x} \]
2. p(x) is nonnegative for all real x.
3. The sum of p(x) over all possible values of x is 1, that is,
\[ \sum_{j}p_{j} = 1 \]
where j indexes the possible values of x and p_{j} is the
probability at x_{j}.
One consequence of properties 2 and 3 is that
0 <= p(x) <= 1: every term of the sum is nonnegative, so no single
term can exceed the total of 1.
What does this actually mean? A discrete probability function is
defined on a discrete set of values (not necessarily finite). This
set is most often the nonnegative integers or some subset of the
nonnegative integers. There is no mathematical restriction that
discrete probability functions be defined only at integers, but
in practice this is usually what makes sense. For example, if
you toss a coin 6 times, you can get 2 heads or 3 heads but not
2 1/2 heads. Each of the discrete values has a probability of
occurrence between zero and one. That is, a discrete function that
takes negative values or values greater than one is not a
probability function. The condition that the probabilities sum to
one means that one of the possible values must occur.
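
The coin example can be checked numerically. The sketch below is only an illustration: it assumes a fair coin and independent tosses (a binomial model, which the text above does not spell out), and the helper name coin_toss_pmf is hypothetical.

```python
from math import comb


def coin_toss_pmf(n, p_heads=0.5):
    """Return {k: P[X = k]} for the number of heads X in n tosses.

    Assumes independent tosses with the same heads probability
    (binomial model); this is an assumption made for illustration.
    """
    return {
        k: comb(n, k) * p_heads**k * (1 - p_heads) ** (n - k)
        for k in range(n + 1)
    }


pmf = coin_toss_pmf(6)

# Property 2: every probability is nonnegative.
all_nonnegative = all(p >= 0 for p in pmf.values())

# Property 3: the probabilities sum to one (up to floating-point rounding).
total = sum(pmf.values())

for k, p in pmf.items():
    print(f"P[X = {k}] = {p:.6f}")
print("all nonnegative:", all_nonnegative)
print("sum of probabilities:", total)
```

The function is defined only at the integers 0 through 6, so a value such as 2 1/2 heads has no entry and therefore no probability.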

Continuous Distributions

Mathematically, a continuous probability function, f(x), is a
function that satisfies the following properties.
1. The probability that x is between two points a and b is
\[ P[a \le X \le b] = \int_{a}^{b} {f(x)dx} \]
2. f(x) is nonnegative for all real x.
3. The integral of the probability function over the whole real line
is one, that is,
\[ \int_{-\infty}^{\infty} {f(x)dx} = 1 \]
What does this actually mean? Since continuous probability
functions are defined for an infinite number of points over a
continuous interval, the probability at a single point is always
zero. Probabilities are measured over intervals, not single points.
That is, the area under the curve between two distinct points
defines the probability for that interval. Because probability is
an area rather than a height, the value of the probability function
itself can in fact be greater than one. The property that the
integral must equal one is the continuous counterpart of the
property for discrete distributions that the sum of all the
probabilities must equal one.
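
A small numerical sketch may make these points concrete. It uses a uniform density on the interval [0, 0.5], chosen here purely as an illustration (it is not an example from the text), together with a simple midpoint-rule integrator. The names uniform_pdf and integrate are hypothetical helpers.

```python
def uniform_pdf(x, low=0.0, high=0.5):
    """Density of a uniform distribution on [low, high].

    The height is 1 / (high - low), which is 2 for the default interval.
    """
    return 1.0 / (high - low) if low <= x <= high else 0.0


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# The density value at a point is a height, not a probability.
height = uniform_pdf(0.25)

# The density is zero outside [0, 0.5], so integrating over [-1, 1]
# captures the whole area under the curve.
total_area = integrate(uniform_pdf, -1.0, 1.0)

# Probability of an interval is the area under the curve over it.
prob_interval = integrate(uniform_pdf, 0.1, 0.3)

# A single point is an interval of zero width, so its area is zero.
prob_point = integrate(uniform_pdf, 0.25, 0.25)

print("height at 0.25:", height)
print("total area:", total_area)
print("P[0.1 <= X <= 0.3]:", prob_interval)
print("P[X = 0.25]:", prob_point)
```

Here the density has height 2 everywhere on its support, yet the total area is still one and every interval probability stays between zero and one.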

Probability Mass Functions Versus Probability Density Functions

Discrete probability functions are referred to as probability mass
functions and continuous probability functions are referred to as
probability density functions. The term probability functions
covers both discrete and continuous distributions. When we
are referring to probability functions in generic terms, we may
use the term probability density functions to mean both discrete
and continuous probability functions.
