CDF stands for Cumulative Distribution Function. It is used in probability and statistics to describe how probability accumulates over the values of a random variable. A full and detailed definition is available on Wikipedia.
In probability and statistics, given a real-valued random variable X and a value x at which we want to evaluate it, the CDF of X is the probability that X is less than or equal to x.
The key word in CDF is "cumulative": to evaluate the CDF of X at the point x, we add up the probability of every value of X up to and including x. Let’s have a look at the formula of CDF.
Formula of CDF
For any real-valued random variable X, the CDF is given by:
$F_X(x) = P(X \le x)$
where $F_X(x)$ is the CDF of X evaluated at x, and $P(X \le x)$ is the probability that X takes a value less than or equal to x. As shown below, the red point is x and the blue shaded area covers the values of X that are less than or equal to x.
When to Use CDF?
It is often used to calculate the probability that the value of X falls within a certain interval. For example, when $P(a < X \le b)$ needs to be calculated, we can use the CDF:
$P(a < X \le b) = F_X(b) - F_X(a)$
Note that if you need the fully closed interval, $P(a \le X \le b)$, you have to add the probability of $X = a$ to the above equation, i.e.
$P(a \le X \le b) = F_X(b) - F_X(a) + P(X = a)$
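As a minimal sketch of the two interval formulas, the snippet below uses an assumed example that is not part of the text above: X is the number of heads in two fair coin flips. The names `pmf`, `cdf`, `prob_half_open` and `prob_closed` are illustrative only.

```python
from fractions import Fraction

# Assumed example distribution: number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F_X(x) = P(X <= x): add up the probability of every outcome <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

def prob_half_open(a, b):
    """P(a < X <= b) = F_X(b) - F_X(a)."""
    return cdf(b) - cdf(a)

def prob_closed(a, b):
    """P(a <= X <= b) = F_X(b) - F_X(a) + P(X = a)."""
    return cdf(b) - cdf(a) + pmf.get(a, Fraction(0))

print(prob_half_open(0, 2))  # 3/4, the outcome X = 0 is excluded
print(prob_closed(0, 2))     # 1, the outcome X = 0 is included
```

The two results differ by exactly P(X = 0) = 1/4, which is the correction term in the closed-interval formula.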
CDF of Discrete Random Variables
Let’s play a dice game. Suppose the die has 6 sides ($X \in \{1, 2, 3, 4, 5, 6\}$). The probability of rolling each side is:
$P(X=1) = P(X=2) = P(X=3) = P(X=4) = P(X=5) = P(X=6) = \frac{1}{6}$
We can calculate the value of the CDF, $F_X(x)$, for
$x \in \{0, \dots, 1, \dots, 2, \dots, 3, \dots, 4, \dots, 5, \dots, 6, \dots (>6)\}$
as:
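| x | $F_X(x)$ |
| --- | --- |
| 0 | $F_X(0) = P(X \le 0) = 0$ |
| … | $F_X(\dots) = P(X \le \dots) = 0$ |
| 1 | $F_X(1) = P(X \le 1) = 1/6$ |
| … | $F_X(\dots) = P(X \le \dots) = 1/6$ |
| 2 | $F_X(2) = P(X \le 2) = 2/6$ |
| … | $F_X(\dots) = P(X \le \dots) = 2/6$ |
| 3 | $F_X(3) = P(X \le 3) = 3/6$ |
| … | $F_X(\dots) = P(X \le \dots) = 3/6$ |
| 4 | $F_X(4) = P(X \le 4) = 4/6$ |
| … | $F_X(\dots) = P(X \le \dots) = 4/6$ |
| 5 | $F_X(5) = P(X \le 5) = 5/6$ |
| … | $F_X(\dots) = P(X \le \dots) = 5/6$ |
| 6 | $F_X(6) = P(X \le 6) = 1$ |
| … | $F_X(\dots) = P(X \le \dots) = 1$ |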
Plot the CDF according to the above table:
Based on this graph, we can see that the CDF of X is a step function: it jumps at each possible value of X and stays constant in between.
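The values of the dice CDF can be reproduced with a few lines of Python. This is only a sketch for the fair six-sided die described in this section; the function name `die_cdf` and the list of evaluation points are chosen for illustration.

```python
from fractions import Fraction

SIDES = 6  # fair six-sided die, each face has probability 1/6

def die_cdf(x):
    """F_X(x) = P(X <= x) for a fair six-sided die."""
    # Count the faces k in 1..6 that satisfy k <= x, then weight each by 1/6.
    favourable = sum(1 for k in range(1, SIDES + 1) if k <= x)
    return Fraction(favourable, SIDES)

# Evaluate at the faces and at a few points in between.
for x in [0, 0.5, 1, 1.5, 2, 3, 4, 5, 5.5, 6, 7]:
    print(f"F_X({x}) = {die_cdf(x)}")
```

Note that `Fraction` prints values in lowest terms, so 2/6 appears as 1/3 and 3/6 as 1/2. Points between two faces, such as 1.5, return the same value as the face just below them, which is why the CDF stays flat between jumps.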
CDF of Continuous Random Variables
Let’s use the game of Wheel of Fortune to explain the CDF of Continuous Random Variables.
Assuming that the number the pointer points to when the wheel stops is X ($X \in [0, 10)$), we can also use the CDF to calculate probabilities. Let’s see what the CDF of X is at x = -1, 0, 1, 5, 10 and 15.
With this graph, we can clearly see the continuity of its CDF values.
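The CDF of the wheel can be sketched in code as well. The snippet assumes that the pointer is equally likely to stop anywhere on the wheel, i.e. that X is uniformly distributed on [0, 10); that assumption, the function name `wheel_cdf` and its parameters `low` and `high` are illustrative rather than taken from the text.

```python
def wheel_cdf(x, low=0.0, high=10.0):
    """CDF of a variable assumed to be uniform on [low, high)."""
    if x < low:
        return 0.0   # no probability has accumulated yet
    if x >= high:
        return 1.0   # all probability has accumulated
    return (x - low) / (high - low)  # grows linearly inside the interval

for x in [-1, 0, 1, 5, 10, 15]:
    print(f"F_X({x}) = {wheel_cdf(x)}")
```

Under this assumption the CDF is 0 up to x = 0, rises in a straight line to 1 at x = 10, and stays at 1 afterwards, with no jumps anywhere.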
Differences in CDF Between Discrete and Continuous Random Variables
For discrete random variables, the CDF is not continuous: its value just before a point where X has positive probability differs markedly from its value at that point.
For example, let’s go back to the dice game and compare x = 5 with a point just below it. Here $F_X(5) = P(X \le 5) = \frac{5}{6}$, while $F_X(4.9999999) = P(X \le 4.9999999) = \frac{4}{6}$, so
$F_X(5) \ne F_X(4.9999999)$
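For continuous random variables, on the other hand, the CDF is continuous: its value at each point is very close to its value at nearby points.
For example, going back to the wheel of fortune and assuming the pointer is equally likely to stop anywhere on the wheel, $F_X(5) = P(X \le 5) = \frac{5}{10}$ and $F_X(4.9999999) = P(X \le 4.9999999) = \frac{4.9999999}{10}$, which are almost identical.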
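The contrast can be checked numerically by comparing each CDF at x = 5 with its value a tiny step below. This is a rough sketch: the step size `eps`, the function names, and the assumption that the wheel is uniform on [0, 10) are all illustrative choices.

```python
import math

def die_cdf(x):
    """Fair six-sided die: number of faces <= x, divided by 6."""
    return min(max(math.floor(x), 0), 6) / 6

def wheel_cdf(x):
    """Wheel assumed uniform on [0, 10): clamp x to the interval, divide by 10."""
    return min(max(x, 0.0), 10.0) / 10.0

eps = 1e-7  # small step to the left of the point of interest
x = 5

die_jump = die_cdf(x) - die_cdf(x - eps)
wheel_jump = wheel_cdf(x) - wheel_cdf(x - eps)

print(f"die:   F_X(5) - F_X(5 - eps) = {die_jump}")
print(f"wheel: F_X(5) - F_X(5 - eps) = {wheel_jump}")
```

For the die the difference stays at about 1/6 no matter how small `eps` is made, whereas for the wheel it shrinks in proportion to `eps`.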