Lecture from: 15.04.2025 | Video: Homelab
We’ve seen how Bloom filters provide a space-efficient way to check for set membership with a controllable false positive rate. Let’s compare them more directly with hash tables and then move on to another classic algorithm and a very important application of randomized algorithms: primality testing.
Comparison: Hash Table vs. Bloom Filter
Structure and Information Stored
Observation: When a Bloom filter uses only one hash function ($k = 1$), its structure becomes somewhat similar to a hash table where we only mark whether a hash bucket is occupied or not, without storing the actual element or resolving collisions by chaining/probing.
Key Difference (for $k = 1$ and beyond):
- Bloom Filter: Stores only a “yes/no” indication for each bit in its array $B$. If $B[i] = 1$, it means some element hashing to this position (or to one of its $k$ positions if $k > 1$) has been seen. It doesn’t store which specific element or how many.
- Hash Table (for set membership): Typically stores the actual elements (or pointers to them, or their original indices if we are checking for duplicates from a specific input sequence). This allows it to resolve collisions definitively and avoid false positives. When used for finding duplicates, it stores the original index of the element that hashed to a bucket.
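To make the contrast concrete, here is a minimal sketch (not from the lecture) of a Bloom filter with a single hash function next to a plain Python set; the bit-array size `m` and the use of Python’s built-in `hash` are illustrative choices.

```python
class TinyBloomFilter:
    """Bloom filter with k = 1 hash function: one bit per bucket, nothing else."""
    def __init__(self, m):
        self.m = m
        self.bits = [0] * m          # only "something hashed here" flags

    def add(self, x):
        self.bits[hash(x) % self.m] = 1

    def might_contain(self, x):
        # True can be a false positive (another element may have set this bit).
        return self.bits[hash(x) % self.m] == 1

# Hash-table-based membership: stores the elements themselves, so no false positives.
seen = set()
bloom = TinyBloomFilter(m=64)

for word in ["alpha", "beta", "gamma"]:
    seen.add(word)
    bloom.add(word)

print("delta" in seen)               # always correct: False
print(bloom.might_contain("delta"))  # may print True even though "delta" was never added
```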
Collision Probabilities and Space
Let’s consider the scenario where we want to bound the expected number of “undesired events” (hash collisions for the table, false positives for the Bloom filter).
Floyd’s Cycle-Finding Algorithm (“Tortoise and Hare”)
This algorithm is used to find a duplicate number in an array where each value is treated as a pointer to the next index, forming a functional graph (linked list structure). The duplicate corresponds to a cycle in this structure.
Check out the video below which explains this in a much better way imo…
Diagram
You can visualize this with the classic rho (ρ) structure:
```
[Start] → → → [Cycle Start] → → → [Meeting Point]
 \___________/ \__________________/
    stem (k)        cycle (ℓ)
```
Assumptions
You are given an array `a` of length $n$ with values in the range $\{1, \dots, n-1\}$. By the pigeonhole principle, there is at least one duplicate.

We define a function:

$$f(x) = a[x]$$

This defines a sequence:

$$x_0 = n, \qquad x_{k+1} = f(x_k) = a[x_k]$$

Eventually, this sequence enters a cycle. Our goal is to find two distinct indices $i \neq j$ such that:

$$a[i] = a[j]$$
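As a quick illustration (a toy example of my own, not from the lecture), here is how the sequence behaves for a small array; the concrete values are arbitrary.

```python
# Toy example: 1-indexed array of length n = 5 with values in {1, ..., 4}.
# a[0] is a dummy entry so that a[i] matches the 1-based indexing used above.
a = [None, 3, 1, 4, 2, 4]
n = 5

x = n
for step in range(10):
    print(x, end=" -> ")
    x = a[x]
print(x)
# Output: 5 -> 4 -> 2 -> 1 -> 3 -> 4 -> 2 -> 1 -> 3 -> 4 -> 2
# The sequence enters the cycle 4 -> 2 -> 1 -> 3 -> 4; the cycle start 4
# is exactly the duplicated value (a[3] = a[5] = 4).
```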
Step-by-Step Algorithm
Phase 1: Detect Cycle (Find Meeting Point)
Use two pointers:

- `tortoise`: moves 1 step at a time: `tortoise = a[tortoise]`
- `hare`: moves 2 steps at a time: `hare = a[a[hare]]`
Initialize:
```python
tortoise = a[n]
hare = a[a[n]]
```
Loop until they meet:
```python
while tortoise != hare:
    tortoise = a[tortoise]
    hare = a[a[hare]]
```
At this point, both pointers are in the cycle and at the same position.
Phase 2: Find Start of Cycle (Duplicate Value)
Reset one pointer (`tortoise`) to the beginning:

```python
tortoise = n
```
Move both pointers one step at a time:
```python
while tortoise != hare:
    tortoise = a[tortoise]
    hare = a[hare]
```
They now meet at the start of the cycle, which corresponds to the duplicate value:
```python
duplicate_value = hare  # or tortoise
```
Phase 3: Find Indices $i \neq j$ such that $a[i] = a[j]$
Find the first index `i` (the predecessor of the duplicate value on the path from the start):

```python
i = n
while a[i] != duplicate_value:
    i = a[i]
```
Find the second index `j` (a predecessor of the duplicate value inside the cycle):

```python
j = a[duplicate_value]  # move one step into the cycle
while a[j] != duplicate_value:
    j = a[j]
```
Now you have $a[i] = a[j]$ with $i \neq j$.

Return the pair $(i, j)$, or sort it if needed:

```python
return (i, j)
```
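Putting the three phases together, here is a self-contained sketch of the whole procedure (my own consolidation of the steps above, not code shown in the lecture); it assumes the same 1-indexed array `a[1..n]` with values in $\{1, \dots, n-1\}$, so the Python list carries a dummy entry at index 0.

```python
def find_duplicate_indices(a, n):
    """Floyd's tortoise-and-hare on the sequence x_0 = n, x_{k+1} = a[x_k].

    a is 1-indexed (a[0] is a dummy), len(a) == n + 1, values in {1, ..., n-1}.
    Returns indices (i, j) with i != j and a[i] == a[j].
    """
    # Phase 1: detect the cycle (find a meeting point inside it).
    tortoise, hare = a[n], a[a[n]]
    while tortoise != hare:
        tortoise = a[tortoise]
        hare = a[a[hare]]

    # Phase 2: find the start of the cycle = the duplicate value.
    tortoise = n
    while tortoise != hare:
        tortoise = a[tortoise]
        hare = a[hare]
    duplicate_value = hare

    # Phase 3: recover two distinct indices holding the duplicate value.
    i = n
    while a[i] != duplicate_value:
        i = a[i]
    j = a[duplicate_value]  # one step into the cycle
    while a[j] != duplicate_value:
        j = a[j]
    return (i, j)


# Example: n = 5, values in {1, ..., 4}; the value 4 appears at indices 3 and 5.
a = [None, 3, 1, 4, 2, 4]
print(find_duplicate_indices(a, 5))  # (5, 3): a[5] == a[3] == 4
```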
Why It Works
I didn’t like the way it was proven in the lecture, so I’ve attached a more intuitive proof below…
Time and Space Complexity
- Time: $O(n)$ - each pointer traverses $O(n)$ steps
- Space: $O(1)$ - only a constant number of pointers is used
Primality Testing
Problem
Given an integer $n \geq 2$, determine whether $n$ is a prime number (i.e., has no divisors other than 1 and itself).
Application:
Efficiently generating large random prime numbers is critical in cryptography, especially in RSA encryption. For example, RSA often uses primes with around 1000 bits.
The standard approach:
- Generate a random large number $n$.
- Test if $n$ is prime.
- If it is, use it. Otherwise, repeat.
By the Prime Number Theorem, the density of primes around a large number $n$ is approximately:

$$\frac{1}{\ln n}$$

For a 1000-bit number $n$, we estimate $\ln n \approx 1000 \cdot \ln 2 \approx 693$. So, roughly 1 in every 693 such numbers is prime. Thus, the primality test must be very efficient, as we’ll need to test hundreds of candidates on average.
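A rough sketch of that generate-and-test loop (illustrative only): `is_probably_prime` is a stand-in implemented with naive trial division so the snippet runs on its own; in practice it would be a fast randomized test such as Miller-Rabin, and the bit length would be far larger.

```python
import random

def is_probably_prime(n):
    # Placeholder: naive trial division, only fine for small demo sizes.
    # In practice this would be a fast randomized test such as Miller-Rabin.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def random_prime(bits):
    """Keep drawing random odd candidates of the given bit length until one passes."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # force top bit and oddness
        if is_probably_prime(candidate):
            return candidate

print(random_prime(24))  # small bit length so trial division stays fast
```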
A Brief History of Primality Testing
- ~200 BC (Eratosthenes): Sieve of Eratosthenes: Finds all primes $\leq n$. Efficient for small $n$, but not polynomial in $\log n$ (i.e., the input size in bits).
- 1976 (Miller, Rabin): Introduced a Monte Carlo primality test with one-sided error:
  - If $n$ is prime, it always says “prime”.
  - If $n$ is composite, it says “composite” with high probability (≥ 3/4). (False positives can occur, but their probability can be made arbitrarily small through repetition.)
- 1977 (Solovay, Strassen): Another Monte Carlo test, based on quadratic residues.
- 1987 (Adleman, Huang): A different probabilistic approach that never falsely accepts composites, but might wrongly reject a prime.
- 2002 (Agrawal, Kayal, Saxena - AKS): Developed the first deterministic, polynomial-time primality test. A major theoretical milestone, but still too slow in practice for cryptographic sizes compared to randomized methods like Miller-Rabin.
Basic Primality Tests (Flawed or Inefficient)
1. Naive Trial Division:

Check if any $d \in \{2, \dots, \lfloor \sqrt{n} \rfloor\}$ divides $n$.

- Problem: Too slow - takes up to $\sqrt{n}$ divisions.
- For $n \approx 2^{1000}$, that is about $2^{500}$ divisions - computationally infeasible.
- We need algorithms that run in time polynomial in $\log n$ (the number of bits of $n$).
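For reference, a direct sketch of trial division that also counts how many divisions it performs (my own illustration of the $\sqrt{n}$ cost):

```python
def trial_division(n):
    """Returns (is_prime, number_of_divisions_performed)."""
    divisions = 0
    d = 2
    while d * d <= n:       # about sqrt(n) iterations in the worst case
        divisions += 1
        if n % d == 0:
            return False, divisions   # found a nontrivial divisor
        d += 1
    return True, divisions

# For a prime, the loop runs all the way up to sqrt(n):
print(trial_division(999983))                 # (True, 998) - about sqrt(n) ≈ 1000 divisions
# For a 1000-bit prime the same count would be about 2^500 divisions:
print(f"2^500 ≈ {2.0**500:.3e} divisions")
```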
2. Randomized Trial Division:

- Choose $d \in \{2, \dots, \lfloor \sqrt{n} \rfloor\}$ uniformly at random.
- If $d \mid n$, return “composite”; else return “prime”.
- Problem: If $n$ is composite but has only large prime factors, the test is unlikely to find a divisor, leading to a high error probability.
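A quick experiment (my own, with arbitrarily chosen small primes) makes this failure mode visible: for $n = pq$ with two roughly equal prime factors, only a single value of $d$ in the whole range reveals that $n$ is composite.

```python
import random

def randomized_trial_division(n):
    """One random divisor candidate; says 'prime' unless it happens to divide n."""
    d = random.randint(2, int(n**0.5))
    return "composite" if n % d == 0 else "prime"

# n = p * q with two ~5-digit prime factors: only d = 10007 reveals compositeness.
n = 10007 * 10009
trials = 100_000
wrong = sum(randomized_trial_division(n) == "prime" for _ in range(trials))
print(f"{wrong}/{trials} runs wrongly report 'prime'")  # almost all of them
```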
Euclid-Based Primality Test
Key Insight:

If $n$ is prime, then for all $a \in \{1, \dots, n-1\}$:

$$\gcd(a, n) = 1$$

These $a$’s form the multiplicative group $\mathbb{Z}_n^*$, which has $\varphi(n)$ elements.

- If $n$ is prime, then $\varphi(n) = n - 1$.
EUCLID-PRIMZAHLTEST Algorithm:

- Pick $a \in \{2, \dots, n-1\}$ uniformly at random.
- Compute $\gcd(a, n)$ (via the Euclidean algorithm).
- If $\gcd(a, n) > 1$: return “composite”.
- If $\gcd(a, n) = 1$: return “prime”.
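A direct Python rendering of this test (the algorithm is from the lecture, the code itself is just a sketch):

```python
import math
import random

def euclid_primality_test(n):
    """One round of the Euclid-based test: a random a and a single gcd computation."""
    a = random.randint(2, n - 1)      # pick a uniformly from {2, ..., n-1}
    if math.gcd(a, n) > 1:
        return "composite"            # gcd reveals a nontrivial common factor
    return "prime"                    # possibly a false positive if n is composite

# For an RSA-type composite, almost every a is coprime to n, so the test
# almost always answers "prime" even though n is composite:
n = 10007 * 10009
print([euclid_primality_test(n) for _ in range(10)])
```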
Analysis:
- If $n$ is prime: $\gcd(a, n) = 1$ for all $a \in \{2, \dots, n-1\}$. So the test always correctly returns “prime” - no false negatives.
- If $n$ is composite:
  - The test returns “prime” (a false positive) whenever the chosen $a$ is coprime to $n$.
  - The error probability is:

$$\Pr[\text{error}] = \frac{\left|\{a \in \{2, \dots, n-1\} : \gcd(a, n) = 1\}\right|}{n - 2} = \frac{\varphi(n) - 1}{n - 2} \approx \frac{\varphi(n)}{n}$$
- Example: RSA-type numbers

  Let $n = p \cdot q$, where $p$, $q$ are large distinct primes:

  - Then $\varphi(n) = (p - 1)(q - 1)$.
  - So

  $$\frac{\varphi(n)}{n} = \frac{(p-1)(q-1)}{pq} = \left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right),$$

  which is very close to 1. That means most random $a$’s will not reveal that $n$ is composite.
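Even with small, arbitrarily chosen primes (far below cryptographic sizes), this ratio is already very close to 1:

```python
n, p, q = 10007 * 10009, 10007, 10009
phi = (p - 1) * (q - 1)
print(phi / n)   # ≈ 0.9998 - already very close to 1 even for 5-digit primes
```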
Example: $p$ and $q$ each around 1000 bits (as in RSA)

- Then $\frac{1}{p}, \frac{1}{q} \approx 2^{-1000}$,
- So the error rate is $\approx 1 - 2^{-999}$, again close to 1.
Conclusion: This test is not reliable for composite numbers - especially those used in cryptography.
We need tests with guaranteed low error probabilities (e.g., ≤ 1/4), regardless of $n$’s factorization.
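As a side note, once a one-sided test with error probability at most 1/4 per round exists, independent repetitions push the error down geometrically; here is a sketch of that amplification wrapper (the single-round test is only a placeholder here):

```python
def amplified_test(n, single_round_test, rounds=20):
    """If each round has one-sided error <= 1/4 for composites, then answering
    'prime' after k independent rounds is wrong with probability <= (1/4)**k."""
    for _ in range(rounds):
        if single_round_test(n) == "composite":
            return "composite"   # one-sided error: a "composite" answer is always correct
    return "prime"               # error probability at most (1/4)**rounds

# Usage sketch with a hypothetical single-round test `miller_rabin_round`
# (to be defined in the next lecture):
# print(amplified_test(n, miller_rabin_round, rounds=50))
```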
Looking Ahead
In the next lecture, we’ll explore probabilistic primality tests like Miller-Rabin, which are far more robust. These tests are based on Fermat’s Little Theorem and deeper properties of modular arithmetic, offering significantly lower error rates for composite numbers - even those used in cryptography.
Continue here: 18 Fermat and Miller-Rabin Primality Tests