Lecture from: 25.03.2025 | Video: Homelab

Independence of Events

Recalling Conditional Probability

Before we talk about independence, let's quickly remember what conditional probability is. If we have two events, say $A$ and $B$, the probability of $A$ happening given that we already know $B$ has happened is written as $\Pr[A \mid B]$. We calculate this as: $$\Pr[A \mid B] = \frac{\Pr[A \cap B]}{\Pr[B]}$$ (This formula only makes sense if $\Pr[B]$ is not zero, of course!)

Think of it like this: $\Pr[A \mid B]$ measures the probability of event $A$ when the "universe" of possibilities has shrunk down to only those outcomes where $B$ occurred.

The Idea of Independence

Now, what if knowing that event $B$ happened gives us absolutely no new information about whether event $A$ will happen? In such a case, the probability of $A$ remains the same, regardless of $B$. So, intuitively, $A$ is independent of $B$ if: $$\Pr[A \mid B] = \Pr[A]$$

If we plug this into our formula for conditional probability: $$\Pr[A] = \frac{\Pr[A \cap B]}{\Pr[B]}$$

Multiplying both sides by $\Pr[B]$ gives us the formal definition of independence.

Definition of Independence for Two Events

Two events $A$ and $B$ are said to be independent if the probability that both $A$ and $B$ occur is simply the product of their individual probabilities: $$\Pr[A \cap B] = \Pr[A] \cdot \Pr[B]$$

This definition is beautifully symmetric. If $A$ is independent of $B$, then $B$ is also independent of $A$. And it works even if one of the probabilities is zero.

A crucial point: stochastic independence (this mathematical definition) doesn’t always mean the events are “physically” or “causally” unrelated in the real world. They are independent if their probabilities behave this way.

Example: Two Dice Rolls (Revisited)

Remember our example with a red die and a black die?

Let $A$ be the event "the red die shows an even number." We found $\Pr[A] = \frac{1}{2}$. Let $B$ be the event "the sum of the two dice is 7." We found $\Pr[B] = \frac{6}{36} = \frac{1}{6}$. The event $A \cap B$ is "the red die is even AND the sum is 7." The outcomes are $(2,5), (4,3), (6,1)$, so $\Pr[A \cap B] = \frac{3}{36} = \frac{1}{12}$.

Now, let's check: $\Pr[A] \cdot \Pr[B] = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12}$. Since $\Pr[A \cap B] = \frac{1}{12}$ is indeed equal to $\Pr[A] \cdot \Pr[B]$, the events "red die is even" and "sum is 7" are independent. Knowing the red die is even doesn't change the likelihood of the sum being 7, and vice-versa.
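Since the sample space is tiny, we can also verify this mechanically. A minimal Python sketch that enumerates all 36 outcomes:

```python
from itertools import product

# All 36 equally likely outcomes (red die, black die).
omega = list(product(range(1, 7), repeat=2))

A = {w for w in omega if w[0] % 2 == 0}      # red die shows an even number
B = {w for w in omega if w[0] + w[1] == 7}   # sum of the two dice is 7

p = lambda ev: len(ev) / len(omega)          # uniform probability

print(p(A), p(B), p(A & B))                  # 0.5 0.1666... 0.0833...
print(p(A & B) == p(A) * p(B))               # True: A and B are independent
```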

Independence of Multiple Events

What if we have more than two events, say $A$, $B$, and $C$? When do we call them independent?

Intuition: If $A$, $B$, and $C$ are independent, then learning about any combination of them shouldn't give us any information about the others. For example, knowing $B$ and $C$ happened shouldn't change the probability of $A$: $\Pr[A \mid B \cap C] = \Pr[A]$. And similarly for other combinations like $\Pr[B \mid A \cap C] = \Pr[B]$, $\Pr[C \mid A \cap B] = \Pr[C]$, etc.

Systematic Definition:

For three events $A$, $B$, and $C$ to be (mutually) independent, we need all of the following conditions to hold:

  1. $\Pr[A \cap B] = \Pr[A] \cdot \Pr[B]$ (A and B are pairwise independent)
  2. $\Pr[A \cap C] = \Pr[A] \cdot \Pr[C]$ (A and C are pairwise independent)
  3. $\Pr[B \cap C] = \Pr[B] \cdot \Pr[C]$ (B and C are pairwise independent)
  4. $\Pr[A \cap B \cap C] = \Pr[A] \cdot \Pr[B] \cdot \Pr[C]$ (The crucial one for mutual independence)

If these hold, then indeed, for example, $\Pr[A \mid B \cap C] = \frac{\Pr[A \cap B \cap C]}{\Pr[B \cap C]} = \frac{\Pr[A] \cdot \Pr[B] \cdot \Pr[C]}{\Pr[B] \cdot \Pr[C]} = \Pr[A]$.

It’s important to note that pairwise independence (conditions 1, 2, 3) does not automatically imply mutual independence (condition 4 is also needed).

Example 1: Numbers from 1 to 8 (where the triple condition holds, but pairwise independence fails)

Let's pick a number uniformly at random from $\{1, 2, \dots, 8\}$. Let $A$ = "the number is in $\{1, 2, 3, 4\}$". $\Pr[A] = \frac{1}{2}$. Let $B$ = "the number is in $\{4, 5, 6, 7\}$". $\Pr[B] = \frac{1}{2}$. Let $C$ = "the number is in $\{4, 5, 6, 7\}$". (So C is the same event as B). $\Pr[C] = \frac{1}{2}$.

Then: $A \cap B \cap C = \{4\}$. So, $\Pr[A \cap B \cap C] = \frac{1}{8}$. And $\Pr[A] \cdot \Pr[B] \cdot \Pr[C] = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$. So, the condition $\Pr[A \cap B \cap C] = \Pr[A] \cdot \Pr[B] \cdot \Pr[C]$ holds.

However, $B$ and $C$ are clearly not independent, because they are the exact same event! $\Pr[B \cap C] = \Pr[B] = \frac{1}{2}$. But $\Pr[B] \cdot \Pr[C] = \frac{1}{4}$. Since $\frac{1}{2} \neq \frac{1}{4}$, $B$ and $C$ are not independent. Thus, $A, B, C$ are not mutually independent, even though the "triple intersection" formula worked out. This example highlights why all pairwise conditions (and the $n$-wise condition) are needed.

Example 2: Two Coin Flips (where pairwise holds, but not mutual)

Let's flip two fair coins. Outcomes: HH, HT, TH, TT (each with probability 1/4). Let $A$ = "first coin is Heads" $= \{HH, HT\}$. $\Pr[A] = \frac{1}{2}$. Let $B$ = "second coin is Heads" $= \{HH, TH\}$. $\Pr[B] = \frac{1}{2}$. Let $C$ = "the two coins show different results" $= \{HT, TH\}$. $\Pr[C] = \frac{1}{2}$.

Let’s check pairwise independence:

  • $A \cap B = \{HH\}$. $\Pr[A \cap B] = \frac{1}{4}$. $\Pr[A] \cdot \Pr[B] = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$. So, $A, B$ are independent.
  • $A \cap C = \{HT\}$. $\Pr[A \cap C] = \frac{1}{4}$. $\Pr[A] \cdot \Pr[C] = \frac{1}{4}$. So, $A, C$ are independent.
  • $B \cap C = \{TH\}$. $\Pr[B \cap C] = \frac{1}{4}$. $\Pr[B] \cdot \Pr[C] = \frac{1}{4}$. So, $B, C$ are independent.

So, all pairs are independent!

Now, what about $A \cap B \cap C$? This means "first is H, second is H, AND results are different". This is impossible! So $A \cap B \cap C = \emptyset$, and $\Pr[A \cap B \cap C] = 0$. But $\Pr[A] \cdot \Pr[B] \cdot \Pr[C] = \frac{1}{8}$. Since $0 \neq \frac{1}{8}$, the events are not mutually independent, even though they are pairwise independent. This is a classic example!
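The same enumeration trick makes this concrete. A minimal Python sketch of the two-coin example:

```python
from itertools import product

omega = list(product("HT", repeat=2))    # HH, HT, TH, TT

A = {w for w in omega if w[0] == "H"}    # first coin is Heads
B = {w for w in omega if w[1] == "H"}    # second coin is Heads
C = {w for w in omega if w[0] != w[1]}   # the two coins differ

p = lambda ev: len(ev) / len(omega)

# All three pairwise conditions hold ...
print(p(A & B) == p(A) * p(B))           # True
print(p(A & C) == p(A) * p(C))           # True
print(p(B & C) == p(B) * p(C))           # True

# ... but the triple condition fails: A ∩ B ∩ C is empty.
print(p(A & B & C), p(A) * p(B) * p(C))  # 0.0 0.125
```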

General Definition of Independence for Many Events

A collection of events $A_1, A_2, \dots, A_n$ are (mutually) independent if for every subcollection of these events, say $A_i$ for $i \in I$ (where $I$ is any subset of $\{1, \dots, n\}$), the probability of their intersection is the product of their individual probabilities: $$\Pr\left[\bigcap_{i \in I} A_i\right] = \prod_{i \in I} \Pr[A_i]$$

This must hold for every subset of indices of size 2 or more, including the full index set. This is a strong condition!

An infinite family of events $A_i$ (for $i \in \mathbb{N}$) is independent if this condition holds for every finite subcollection.

Some Useful Properties (Lemmas about Independence)

If events $A_1, A_2, \dots, A_n$ are independent, then:

  1. $\bar{A}_1, A_2, \dots, A_n$ are also independent (where $\bar{A}_1$ is the complement of $A_1$).
    • Proof sketch: $\Pr[\bar{A}_1 \cap B] = \Pr[B] - \Pr[A_1 \cap B] = \Pr[B] - \Pr[A_1] \cdot \Pr[B] = (1 - \Pr[A_1]) \cdot \Pr[B] = \Pr[\bar{A}_1] \cdot \Pr[B]$ (where $B$ represents the intersection of any subcollection of the $A_i$'s with $i \ge 2$).
  2. If $A, B, C$ are independent, then $A \cap B$ and $C$ are independent.
  3. If $A, B, C$ are independent, then $A \cup B$ and $C$ are independent.
    • Proof sketch for union: $\Pr[(A \cup B) \cap C] = \Pr[(A \cap C) \cup (B \cap C)]$. Using inclusion-exclusion and then independence for intersections: $\Pr[A \cap C] + \Pr[B \cap C] - \Pr[A \cap B \cap C] = \Pr[A] \cdot \Pr[C] + \Pr[B] \cdot \Pr[C] - \Pr[A] \cdot \Pr[B] \cdot \Pr[C]$ (since $A, C$ and $B, C$ are also independent as subcollections) $= (\Pr[A] + \Pr[B] - \Pr[A \cap B]) \cdot \Pr[C] = \Pr[A \cup B] \cdot \Pr[C]$.

An equivalent way to state that events $A_1, \dots, A_n$ are independent is that for any choice of $s_1, \dots, s_n$ (where $s_i \in \{0, 1\}$, and $A_i^1 := A_i$, $A_i^0 := \bar{A}_i$), the following holds: $$\Pr[A_1^{s_1} \cap \dots \cap A_n^{s_n}] = \Pr[A_1^{s_1}] \cdots \Pr[A_n^{s_n}]$$ This means you can "mix and match" events and their complements, and the product rule for intersections still applies.

Visual Cryptography: An Application of Independence (and Dependence)

Visual cryptography is a neat trick. Imagine you have a secret image. You can split it into two (or more) “shares” or transparencies. Each share by itself looks like random noise – you can’t see the image. But if you overlay the correct shares, the secret image magically appears!

How does this relate to independence? Consider a single pixel in the original secret image (either black or white). This pixel is encoded into corresponding sub-pixels on each share.

  • Within a single share: The sub-pixels making up the encoding of one original pixel might be chosen independently (or in a structured but random-looking way). Horizontally adjacent sub-pixels on a share might be independent.
  • Between shares (for a non-secret region): If you look at the sub-pixels in Share 1 corresponding to an original white pixel, and the sub-pixels in Share 2 for that same original white pixel, their combination when overlaid should still look random (or white).
  • Between shares (for a secret region): If you look at the sub-pixels in Share 1 corresponding to an original black pixel, and the sub-pixels in Share 2 for that same original black pixel, their combination when overlaid must result in black. This means the choice of sub-pixels on Share 1 depends on the choice on Share 2 (or they are chosen in a correlated way) to reveal the secret.

The “secret” lies in the dependencies created between the shares in the regions that form the image, while maintaining an appearance of randomness (independence) elsewhere and in each share individually.
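To make this concrete, here is a minimal 1-D sketch in the style of the classic Naor-Shamir 2-out-of-2 scheme (the function names and the pixel encoding are illustrative, not taken from the lecture):

```python
import random

def make_shares(secret):
    """Each secret pixel (0 = white, 1 = black) becomes a pair of
    sub-pixels on each share; 1 means a black sub-pixel."""
    share1, share2 = [], []
    for pixel in secret:
        pattern = random.choice([(0, 1), (1, 0)])  # uniform, independent of the pixel
        share1.append(pattern)
        if pixel == 0:
            share2.append(pattern)                  # white: shares agree
        else:
            share2.append((1 - pattern[0], 1 - pattern[1]))  # black: complement
    return share1, share2

def overlay(s1, s2):
    # Stacking transparencies acts like a logical OR per sub-pixel.
    return [(a | c, b | d) for (a, b), (c, d) in zip(s1, s2)]

secret = [1, 0, 1, 1, 0]  # a tiny 1-D "image"
s1, s2 = make_shares(secret)
print(overlay(s1, s2))    # black pixels give (1, 1); white pixels stay half-black
```

Each share on its own is a uniformly random pattern per pixel, independent of the secret; only the correlation between the shares encodes the image.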

Random Variables

Often, we are not interested in the elementary outcome of an experiment itself, but rather in some numerical quantity associated with that outcome. This is where random variables come in.

What is a Random Variable?

A random variable $X$ is a function $X: \Omega \to \mathbb{R}$ that maps each outcome $\omega$ in the sample space $\Omega$ to a real number.

It's "random" because the outcome $\omega$ is random, and therefore the value $X(\omega)$ is also random. It's a "variable" because it can take on different numerical values.

Examples of Random Variables

  1. Die Roll: Sample space $\Omega = \{1, 2, 3, 4, 5, 6\}$ (the face showing on a die).

    • Let $X(\omega) = \omega$ (the value of the random variable is just the number rolled).
    • Let $Y(\omega) = 1$ if $\omega$ is prime, and $Y(\omega) = 0$ otherwise. Here, $Y$ tells us if the outcome was prime or not.
    • Let $Z(\omega) = \omega^2$, for instance (a more complex function of the outcome).
  2. Coin Flips: Flip a coin 3 times. $\Omega = \{H, T\}^3 = \{HHH, HHT, \dots, TTT\}$.

    • Let $X$ be the "number of Heads". $X(HHH) = 3$, $X(HHT) = 2$, etc.

Notation for Events involving Random Variables

We often write expressions like "$X \le 5$". This is shorthand for the event consisting of all outcomes $\omega$ such that the random variable $X$ takes on a value less than or equal to 5 for that outcome. Formally: "$X \le 5$" denotes the event $\{\omega \in \Omega : X(\omega) \le 5\}$. Similarly, "$X = x$" denotes the event $\{\omega \in \Omega : X(\omega) = x\}$.

Indicator Random Variables

A very important type of random variable is an indicator random variable (or characteristic function).

For any event $A \subseteq \Omega$, its indicator random variable, often denoted $I_A$ or $X_A$, is defined as: $$I_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$$

Indicator variables are incredibly useful for translating properties of events into numerical values, especially when calculating expected values.

Probability Mass Function (PMF) / Density Function

For a discrete random variable $X$ (one that takes on a countable number of values), its probability mass function (PMF), often denoted $f_X$ or $p_X$, gives the probability that $X$ takes on a specific value $x$: $$f_X(x) = \Pr[X = x]$$

The PMF is defined for all real $x$, but it will be non-zero only for the values that $X$ can actually take. The sum of $f_X(x)$ over all possible values of $X$ must be 1.

Cumulative Distribution Function (CDF)

The cumulative distribution function (CDF) of a random variable $X$, denoted $F_X$, gives the probability that $X$ takes on a value less than or equal to $x$: $$F_X(x) = \Pr[X \le x]$$

The CDF is defined for all real $x$. It is a non-decreasing function, starting at 0 (as $x \to -\infty$) and ending at 1 (as $x \to \infty$).

Example: Three Coin Flips

Let $X$ be the number of Heads in three fair coin flips.

$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$, each outcome with probability $\frac{1}{8}$. Possible values for $X$: 0, 1, 2, 3.

PMF $f_X$: $f_X(0) = \frac{1}{8}$, $f_X(1) = \frac{3}{8}$, $f_X(2) = \frac{3}{8}$, $f_X(3) = \frac{1}{8}$. $f_X(x) = 0$ for other $x$.

CDF $F_X$:

  • $F_X(x) = 0$ for $x < 0$
  • $F_X(x) = \frac{1}{8}$ for $0 \le x < 1$
  • $F_X(x) = \frac{4}{8} = \frac{1}{2}$ for $1 \le x < 2$
  • $F_X(x) = \frac{7}{8}$ for $2 \le x < 3$
  • $F_X(x) = 1$ for $x \ge 3$
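These values are easy to reproduce by enumerating the eight outcomes. A small Python sketch, using exact arithmetic via fractions:

```python
from itertools import product
from fractions import Fraction

omega = list(product("HT", repeat=3))  # 8 equally likely outcomes
X = lambda w: w.count("H")             # number of Heads

pmf = {k: Fraction(sum(1 for w in omega if X(w) == k), len(omega)) for k in range(4)}
cdf = {k: sum(pmf[j] for j in range(k + 1)) for k in range(4)}

print(pmf)  # {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
print(cdf)  # {0: Fraction(1, 8), 1: Fraction(1, 2), 2: Fraction(7, 8), 3: Fraction(1, 1)}
```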

Example: Three Dice Rolls - Number of Odd Outcomes

Roll a fair die three times. Let $Y$ be the number of times an odd number (1, 3, or 5) appears. The probability of rolling an odd number in one roll is $\frac{3}{6} = \frac{1}{2}$. The probability of rolling an even number in one roll is $\frac{1}{2}$. This is identical in structure to the three coin flips example (Heads = Odd, Tails = Even). So, $Y$ has the same distribution as $X$ above. The PMF and CDF plots will look the same as for the coin flip example.

Random Sequences: Which is truly random?

Consider two sequences of K (Kopf/Heads) and Z (Zahl/Tails), each of length 200. One is from 200 actual coin flips, the other is artificially generated. Which one is which?

Sequence 1: (K/Z string of 200 flips, shown on the slide)

Number of Z (Zahl) = 137. Number of K (Kopf) = 63. (Ratio Z:K is 137:63 ≈ 2.17:1)

Sequence 2: (K/Z string of 200 flips, shown on the slide)

Number of Z (Zahl) = 97. Number of K (Kopf) = 103. (Ratio Z:K is 97:103 ≈ 0.94:1)

Typically, humans trying to “fake” a random sequence tend to make it “too balanced” or switch between heads and tails “too often.” True random sequences can have longer runs of the same outcome and can deviate more from a perfect 50/50 split than intuition might suggest for shorter sequences.

We re-analyze slightly different sequences later on:

Sequence 1: (shown on the slide)

Number of Z = 101 (out of 200). Number of K = 99. Ratio Z:K ≈ 1.02:1. Number of changes (K to Z or Z to K) = 112.

Sequence 2: (shown on the slide)

Number of Z = 100. Number of K = 100. Ratio Z:K = 1:1. Number of changes = 133.

The sequence with 112 changes looks more like a real random sequence. For a fair coin, each of the 199 adjacent pairs differs with probability $\frac{1}{2}$, so we expect about $\frac{199}{2} \approx 100$ changes. Sequence 2 has 133 changes, which is significantly more than expected, suggesting it might be the one where someone tried to make it look "random" by switching too often. A sequence with a Z:K ratio far from 1:1 is less likely to be from a fair coin over 200 flips, although not impossible. The top sequence on the slide, with counts 101:99 and 112 changes, is a good candidate for the truly random sequence.
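These summary statistics are simple to compute. A short Python sketch of the counting logic, with simulated flips standing in for the slide's sequences:

```python
import random

def count_changes(seq):
    # Number of positions where an outcome differs from its predecessor.
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

# A few genuinely random sequences of 200 fair-coin flips.
for _ in range(5):
    seq = [random.choice("KZ") for _ in range(200)]
    print(seq.count("Z"), count_changes(seq))
# Typical output: Z-counts near 100 and change-counts near 199/2 ≈ 100,
# often with noticeable deviation in the counts.
```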

Expected Value

The expected value (or expectation or mean) of a random variable $X$, denoted $E[X]$, is the average value we would expect $X$ to take if we repeated the experiment many times.

Definition of Expected Value

For a discrete random variable $X$ that takes values in a countable set $W_X$, its expected value is: $$E[X] = \sum_{x \in W_X} x \cdot \Pr[X = x]$$

This is a weighted average of the possible values of $X$, where each value is weighted by its probability.

We only consider random variables for which the expected value exists; this means the sum must converge absolutely.

An alternative, often more fundamental, definition using the sample space $\Omega$: $$E[X] = \sum_{\omega \in \Omega} X(\omega) \cdot \Pr[\omega]$$

This definition always works, even for continuous random variables (where the sum becomes an integral). The two definitions are equivalent for discrete random variables.

Special Case for Non-Negative Integer Valued Random Variables

If a random variable $X$ takes values in $\mathbb{N}_0 = \{0, 1, 2, \dots\}$, then its expected value can also be calculated as: $$E[X] = \sum_{i=1}^{\infty} \Pr[X \ge i]$$

This formula is sometimes called the “tail sum formula” for expectation.

Proof: Swapping the order of summation, $$\sum_{i=1}^{\infty} \Pr[X \ge i] = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} \Pr[X = j] = \sum_{j=1}^{\infty} \sum_{i=1}^{j} \Pr[X = j] = \sum_{j=1}^{\infty} j \cdot \Pr[X = j] = E[X].$$
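As a sanity check, both formulas can be evaluated for the three-coin-flip variable $X$ from earlier; a quick sketch:

```python
from fractions import Fraction

# PMF of X = number of Heads in 3 fair coin flips.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

direct = sum(x * p for x, p in pmf.items())                         # definition of E[X]
tails = sum(sum(p for x, p in pmf.items() if x >= i) for i in range(1, 4))

print(direct, tails)  # 3/2 3/2; the two formulas agree
```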

Linearity of Expectation

This is one of the most powerful properties of expected values. It states that the expectation of a sum of random variables is the sum of their expectations.

Crucially, this holds even if the random variables are dependent!

If $X$ and $Y$ are two random variables defined on the same sample space $\Omega$: Let $Z = X + Y$. This means $Z(\omega) = X(\omega) + Y(\omega)$ for all $\omega \in \Omega$. Then, $E[Z] = E[X] + E[Y]$.

Proof: Using the definition over the sample space, $$E[Z] = \sum_{\omega \in \Omega} (X(\omega) + Y(\omega)) \cdot \Pr[\omega] = \sum_{\omega \in \Omega} X(\omega) \cdot \Pr[\omega] + \sum_{\omega \in \Omega} Y(\omega) \cdot \Pr[\omega] = E[X] + E[Y].$$

General Linearity of Expectation

For random variables $X_1, \dots, X_n$ and constants $a_1, \dots, a_n \in \mathbb{R}$: Let $X = a_1 X_1 + \dots + a_n X_n$. Then, $E[X] = a_1 E[X_1] + \dots + a_n E[X_n]$.

Expected Value of an Indicator Variable

Let $X_A$ be the indicator variable for an event $A$. $X_A(\omega) = 1$ if $\omega \in A$, and $X_A(\omega) = 0$ if $\omega \notin A$.

$E[X_A] = 1 \cdot \Pr[X_A = 1] + 0 \cdot \Pr[X_A = 0] = \Pr[A]$.

The expected value of an indicator variable for an event is simply the probability of that event. This is a very handy fact!

Example: Expected Number of Heads in 100 Coin Flips

Flip a fair coin 100 times. Let $X$ be the total number of Heads. We want to find $E[X]$.

Let $X_i$ be the indicator variable for the event "the $i$-th flip is Heads" (for $i = 1, \dots, 100$). $X_i = 1$ if the $i$-th flip is H, $X_i = 0$ if the $i$-th flip is T. $E[X_i] = \frac{1}{2}$.

The total number of heads can be written as the sum of these indicator variables: $X = X_1 + X_2 + \dots + X_{100}$. By linearity of expectation: $E[X] = E[X_1] + E[X_2] + \dots + E[X_{100}] = \frac{1}{2} + \frac{1}{2} + \dots + \frac{1}{2}$ (100 times) $= 50$. So, the expected number of heads is 50. Notice how easy this was using linearity, without needing to calculate $\Pr[X = k]$ for every $k$.
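A quick simulation confirms this; not needed for the argument, just a sketch to build intuition:

```python
import random

# X = number of Heads in 100 fair flips; linearity predicts E[X] = 50.
trials = 10_000
total = sum(sum(random.randint(0, 1) for _ in range(100)) for _ in range(trials))
print(total / trials)  # close to 50.0
```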

Application: Expected Size of a Randomly Generated Stable Set

A stable set (or independent set) in a graph is a set of vertices where no two vertices are adjacent.

Goal: Find a “large” stable set. This is a hard problem (NP-hard).

Consider a randomized algorithm:

  1. First Round (Node Selection): Each vertex decides to "survive" (join a candidate set $S$) independently with some probability $p$. Let $X$ be the number of vertices in $S$.

  2. Second Round (Edge Removal): For every edge $\{u, v\}$ where both $u$ and $v$ survived into $S$, randomly remove one of $u$ or $v$ from $S$ to form the final stable set $S^*$.

Algorithm

  1. Each vertex is kept with probability $p$ (independently). Let $S$ be the set of kept vertices.
  2. Create an initial set $S^* = S$.
  3. For each edge $\{u, v\}$ where both $u, v \in S$: remove both $u$ and $v$ from $S^*$. (The slide's analysis is a bit different here: it bounds $|S^*| \ge X - Y$, where $Y$ is the number of "bad" edges, since removing one endpoint per bad edge suffices.)

Expectation

Let's analyze the expectation of the size of the stable set $S^*$:

Let $G = (V, E)$ with $|V| = n$ and $|E| = m$.

Define $p$, the probability that a vertex survives the first round (not explicitly stated, but implied in $E[X] = np$).

Let $X_v$ be an indicator variable: $X_v = 1$ if vertex $v$ "survives" the first round, $X_v = 0$ otherwise. $E[X_v] = p$. Let $X = \sum_{v \in V} X_v$ be the number of vertices surviving the first round. By linearity, $E[X] = np$.

Let $Y_e$ be an indicator variable: $Y_e = 1$ if edge $e = \{u, v\}$ "survives" the first round (meaning both $u$ and $v$ survived). Since vertices survive independently, $\Pr[Y_e = 1] = p \cdot p = p^2$. So, $E[Y_e] = p^2$.

Let $Y = \sum_{e \in E} Y_e$ be the number of edges surviving the first round. By linearity, $E[Y] = mp^2$.

The algorithm then constructs a stable set $S^*$. The size of $S^*$ is at least the number of initially selected vertices ($X$) minus the number of "bad" edges ($Y$) for which we have to remove one endpoint. So, $|S^*| \ge X - Y$.

Then, $E[|S^*|] \ge E[X - Y] = E[X] - E[Y] = np - mp^2$ (by linearity). Choosing $p = \frac{n}{2m}$ to maximize this expression (valid when $m \ge n/2$) gives $E[|S^*|] \ge \frac{n^2}{2m} - \frac{n^2}{4m} = \frac{n^2}{4m}$.
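A minimal Python sketch of the two-round algorithm, removing one endpoint per bad edge as in the analysis (the function and variable names are mine):

```python
import random

def random_stable_set(n, edges, p):
    """One run: keep each vertex with probability p, then delete one
    endpoint of every edge whose endpoints are both still present."""
    kept = {v for v in range(n) if random.random() < p}
    for u, v in edges:
        if u in kept and v in kept:
            kept.discard(random.choice((u, v)))  # fix one bad edge
    return kept  # no edge keeps both endpoints, so this is a stable set

# Example: a cycle on n = 10 vertices has m = 10 edges, so p = n/(2m) = 1/2
# and the bound promises expected size at least n^2/(4m) = 2.5.
n = 10
edges = [(i, (i + 1) % n) for i in range(n)]
sizes = [len(random_stable_set(n, edges, 0.5)) for _ in range(10_000)]
print(sum(sizes) / len(sizes))  # empirically around 2.5 or above
```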

Continue here: 12 Randomized QuickSort, Indicator Variables, Common Probability Distributions, Coupon Collector