Lecture from: 25.11.2024 | Video: Videos ETHZ

Overview

Let’s recap our journey through abstract algebra so far.

We began with monoids, which are algebraic structures with an associative binary operation and an identity element. Then we explored groups, a special type of monoid where every element has an inverse. Building upon groups, we defined rings, which have two operations, addition (forming an abelian group) and multiplication (forming a monoid), connected by the distributive property.

From rings, we explored two main paths:

  1. Gaussian Integers Modulo a Gaussian Prime: We touched upon the idea of taking the Gaussian integers $\mathbb{Z}[i]$ modulo an irreducible Gaussian prime, creating a finite field. This is analogous to constructing $\mathbb{Z}_p$ from $\mathbb{Z}$ by modding out a prime $p$. However, we didn’t delve deeply into the details of this construction.

  2. Fields and Polynomial Rings: The other path focused on fields, which are commutative rings where every non-zero element has a multiplicative inverse. We then explored polynomial rings $F[x]$ over a field $F$. This allowed us to construct Galois fields (finite fields) by taking the quotient ring $F[x]/(m(x))$, where $m(x)$ is an irreducible polynomial. We saw that for $F = \mathbb{Z}_p$ this quotient has $p^{\deg(m)}$ elements.

Within polynomial rings, we also briefly discussed polynomial interpolation. A key fact is that a non-zero polynomial of degree at most $d$ can have at most $d$ roots in a field.

Today’s lecture will explore the intersection of these two paths, addressing the question: Given two different polynomials of degree at most $d$, how many common roots can they have? Or can they have no common roots at all?

Example: Finding Inverses in $\mathbb{R}[x]/(x^2 + 1)$

Consider the field $\mathbb{R}[x]/(x^2 + 1)$, which is the ring of polynomials with real coefficients modulo the irreducible polynomial $x^2 + 1$.

  1. Irreducibility: $x^2 + 1$ is irreducible in $\mathbb{R}[x]$ because its discriminant ($-4$) is negative, meaning it has no real roots. The Fundamental Theorem of Algebra tells us that a polynomial over the reals is irreducible if and only if it is linear or a quadratic with a negative discriminant. Since no linear factor $x - a$, where $a \in \mathbb{R}$, divides $x^2 + 1$, it is irreducible. Thus all non-zero elements of the quotient have inverses.

  2. Finding the Inverse of $x + 1$: Let’s find the multiplicative inverse of $x + 1$ in $\mathbb{R}[x]/(x^2 + 1)$. We are looking for a polynomial $ax + b$ such that $(x + 1)(ax + b) \equiv 1 \pmod{x^2 + 1}$.

  3. Setting up the Equation: This congruence translates to the equation $(x + 1)(ax + b) = 1 + q(x)(x^2 + 1)$ for some polynomial $q(x)$. Since we’re working modulo $x^2 + 1$, the degree of the product can be reduced to less than 2 using the relation $x^2 \equiv -1$.

    Expanding the left side gives $ax^2 + (a + b)x + b$. Substituting $x^2 = -1$, we get $(a + b)x + (b - a)$.

    Thus we have $(a + b)x + (b - a) = 1$.

  4. Equating Coefficients: Comparing coefficients of the powers of $x$ on both sides, we get the following system of equations:

    • $a + b = 0$ (coefficient of $x$)
    • $b - a = 1$ (constant term)
  5. Solving the System: We can solve this system using various methods (substitution, elimination, matrix inversion). Using matrix notation, we have:

    $\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$

    The inverse of the matrix is $\frac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$.

    Multiplying both sides by the inverse matrix gives $\begin{pmatrix} a \\ b \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1/2 \\ 1/2 \end{pmatrix}$

    So, $a = -\frac{1}{2}$ and $b = \frac{1}{2}$.

  6. The Inverse: Therefore, the multiplicative inverse of $x + 1$ in $\mathbb{R}[x]/(x^2 + 1)$ is $-\frac{1}{2}x + \frac{1}{2} = \frac{1 - x}{2}$ (verified computationally in the sketch below).
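A minimal Python sketch of this computation: it solves the same $2 \times 2$ system by Cramer’s rule, for a general element $a_1x + a_0$ (the helper name is our own):

```python
# Invert a1*x + a0 in R[x] modulo x^2 + 1.
# (a1*x + a0)(b1*x + b0) = a1*b1*x^2 + (a1*b0 + a0*b1)*x + a0*b0
#                        = (a1*b0 + a0*b1)*x + (a0*b0 - a1*b1)   using x^2 = -1
# Setting this equal to 1 gives the 2x2 system
#   a1*b0 + a0*b1 = 0   (coefficient of x)
#   a0*b0 - a1*b1 = 1   (constant term)
def inverse_mod_x2_plus_1(a1, a0):
    det = a0 * a0 + a1 * a1        # determinant of the coefficient matrix
    if det == 0:
        raise ZeroDivisionError("the zero polynomial has no inverse")
    return (-a1 / det, a0 / det)   # (b1, b0) by Cramer's rule

print(inverse_mod_x2_plus_1(1, 1))  # (-0.5, 0.5): the inverse of x + 1 is (1 - x)/2
```

Note how the result mirrors $\frac{1}{1+i} = \frac{1-i}{2}$ in $\mathbb{C}$, which is no coincidence: $\mathbb{R}[x]/(x^2 + 1) \cong \mathbb{C}$.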

Example: $GF(8) = \mathbb{Z}_2[x]/(x^3 + x + 1)$

Let’s illustrate finding inverses in the field $GF(8)$, constructed by taking the polynomial ring $\mathbb{Z}_2[x]$ modulo the irreducible polynomial $x^3 + x + 1$.

Suppose we want to find the inverse of a non-zero element $a(x)$. We seek a polynomial $b(x) = b_2 x^2 + b_1 x + b_0$, where $b_i \in \mathbb{Z}_2$, such that $a(x) \cdot b(x) \equiv 1 \pmod{x^3 + x + 1}$.

While we won’t solve this completely here, the setup would involve expanding the product, reducing modulo $x^3 + x + 1$ (using the fact that $x^3 \equiv x + 1$), and equating coefficients to form a system of equations in $b_2$, $b_1$, and $b_0$. This system can then be solved over $\mathbb{Z}_2$ to find the inverse. This process is analogous to the previous example with $\mathbb{R}[x]/(x^2 + 1)$.
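For a field this small, the inverse can also be found by brute force. A Python sketch (the 3-bit integer representation and helper names are our own):

```python
# Elements of GF(8) = Z_2[x]/(x^3 + x + 1) as 3-bit integers,
# where bit i is the coefficient of x^i; 0b1011 encodes x^3 + x + 1.
MOD = 0b1011

def gf8_mul(a, b):
    prod = 0
    for i in range(3):            # carry-less ("XOR") multiplication
        if (b >> i) & 1:
            prod ^= a << i
    for i in range(4, 2, -1):     # reduce degrees 4 and 3 via x^3 = x + 1
        if (prod >> i) & 1:
            prod ^= MOD << (i - 3)
    return prod

def gf8_inverse(a):
    # try all 7 non-zero candidates; fine for a field this small
    return next(b for b in range(1, 8) if gf8_mul(a, b) == 1)

print(bin(gf8_inverse(0b010)))    # 0b101: the inverse of x is x^2 + 1
```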

Powers and Generators in Finite Fields

Now, let’s explore powers and generators within the same field, $GF(8) = \mathbb{Z}_2[x]/(x^3 + x + 1)$.

  1. Powers of $x$: Consider the powers of $x$: $x^1, x^2, x^3, \dots$ Since we are computing with polynomial residues, any power of degree 3 or higher must be reduced modulo $x^3 + x + 1$.

  2. Reduction Modulo $x^3 + x + 1$:

    • $x^1$ and $x^2$ are already in reduced form (degree less than 3).
    • Higher powers reduce via $x^3 \equiv x + 1$: for example $x^4 \equiv x^2 + x$, $x^5 \equiv x^2 + x + 1$, $x^6 \equiv x^2 + 1$, and $x^7 \equiv 1$.
  3. Multiplicative Group: The non-zero elements of $GF(8)$ form a multiplicative group. The number of these non-zero elements is $2^3 - 1 = 7$. Here’s the complete list in their standard reduced form: $1$, $x$, $x + 1$, $x^2$, $x^2 + 1$, $x^2 + x$, $x^2 + x + 1$.

  4. Lagrange’s Theorem and Generators: Lagrange’s theorem states that the order of any element in a finite group must divide the order of the group. In our case, the multiplicative group has order 7, which is prime. This means every element except the identity (1) has order 7. This has a powerful consequence.

Generators of the Multiplicative Group

An element $g$ is a generator of a group if its powers produce all the elements of the group. Since every element except 1 has order 7, which is equal to the order of the multiplicative group, every non-identity element in the multiplicative group of $GF(8)$ is a generator. In the example we saw that the powers of $x$ run through all 7 non-zero elements, so $x$ happens to be one of these generators; in total there are 6 of them.
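This claim is easy to check computationally, reusing the 3-bit representation from the sketch above:

```python
# The multiplicative group of GF(8) has prime order 7, so every g != 1
# must have order 7 and hence generate the whole group.
MOD = 0b1011                           # x^3 + x + 1 over Z_2

def gf8_mul(a, b):                     # same helper as in the inverse sketch
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i
    for i in range(4, 2, -1):
        if (prod >> i) & 1:
            prod ^= MOD << (i - 3)
    return prod

for g in range(2, 8):                  # every non-zero element except 1
    p, powers = 1, set()
    for _ in range(7):
        p = gf8_mul(p, g)
        powers.add(p)
    assert powers == set(range(1, 8))  # g reaches all 7 non-zero elements
print("all 6 non-identity elements generate GF(8)*")
```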

Not part of the actual exam stuff, interesting regardless…

Linear Feedback Shift Registers (LFSRs) and Their Properties

Linear Feedback Shift Registers (LFSRs) provide a way to generate sequences of elements in a finite field, particularly $GF(2)$. They are simple hardware circuits with fascinating mathematical properties.

1. Construction and Operation:

An LFSR is defined by a polynomial over $GF(2)$, such as $p(x) = x^3 + x + 1$. The degree of the polynomial determines the length of the LFSR (in this case, length 3). The non-zero coefficients (excluding the leading coefficient) specify the “taps” for feedback.

The LFSR operates by shifting its contents (bits) by one position each step. The bit shifted in is calculated as the XOR sum of the bits at the tap positions; in our example, we XOR the values at the $x^1$ and $x^0$ taps (the last two positions). This feedback mechanism creates a recurring sequence of states.

In the provided example with the initial state 010, representing the polynomial $x$, the LFSR evolves as follows, with each state corresponding to a power of $x$ modulo $x^3 + x + 1$:

  • $x^1$: 010
  • $x^2$: 100
  • $x^3$: 011 ($x^3 \equiv x + 1$)
  • $x^4$: 110
  • $x^5$: 111
  • $x^6$: 101
  • $x^7$: 001 ($x^7 \equiv 1$, which also corresponds to $x^0$)
  • $x^8$: 010, back to the initial state

The LFSR cycles through all $2^3 - 1 = 7$ non-zero states before returning to the initial state.
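A short Python sketch of this cycle, in the variant where each step is exactly multiplication of the state polynomial by $x$ followed by reduction modulo $x^3 + x + 1$ (this reproduces the powers-of-$x$ table above):

```python
# Length-3 LFSR for p(x) = x^3 + x + 1: states are 3-bit polynomials,
# and one step multiplies by x, reducing with x^3 = x + 1 when needed.
def lfsr_states(start=0b010, steps=8):   # 0b010 encodes the polynomial x
    state, out = start, []
    for _ in range(steps):
        out.append(f"{state:03b}")
        state <<= 1                      # multiply by x
        if state & 0b1000:               # a degree-3 term appeared:
            state ^= 0b1011              # subtract (XOR) x^3 + x + 1
    return out

print(lfsr_states())
# ['010', '100', '011', '110', '111', '101', '001', '010']
```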

More info here: Wikipedia: LFSR

2. Statistical Properties:

For an LFSR of length $n$ with a “primitive polynomial” (a polynomial for which the powers of $x$ generate the entire multiplicative group of the corresponding field $GF(2^n)$):

  • Period: The LFSR cycles through $2^n - 1$ distinct non-zero states. The all-zero state is excluded as it does not change during the shift operation.
  • Balance: Over one period, there will be $2^{n-1}$ ones and $2^{n-1} - 1$ zeros. The sequence is almost perfectly balanced between 0s and 1s.

3. Applications and Limitations:

  • Pseudorandom Number Generation (PRNG): In the past, LFSRs were considered for PRNGs. However, they are now deemed cryptographically insecure. Observing a sufficiently long segment of the sequence allows one to reconstruct the feedback taps and predict the entire sequence using linear algebra.
  • Distance Measurement: LFSR sequences have been used for distance measurement, for example, in radar systems and interplanetary ranging. A modulated signal based on an LFSR sequence is transmitted. The time delay in the received (and weakened) signal corresponds to the signal’s travel time. By correlating the received sequence with shifted versions of the original sequence, we can determine the time delay and thus calculate the distance. This technique is closely related to the autocorrelation property discussed in the next point.
  • Spread Spectrum Communication: LFSR sequences are used in spread spectrum communication. The transmitted signal is spread across a wide frequency band, making it resistant to jamming and interference. This resistance comes from a key property of LFSR sequences related to autocorrelation.

4. Autocorrelation and Spread Spectrum:

The autocorrelation function of a sequence measures the similarity between a sequence and shifted versions of itself. For an LFSR sequence with a primitive polynomial, the autocorrelation function is almost flat, except for a large peak at zero shift (where the sequence perfectly overlaps with itself). The Fourier transform of the autocorrelation function (the power spectral density) is also relatively flat, indicating the signal’s energy is spread across a wide range of frequencies. This flat spectrum is precisely why LFSR sequences are useful in spread spectrum communication. A narrowband jammer would only affect a small portion of the spread spectrum signal, leaving most of the signal intact.

5. Example of Correlation in Distance Measurement:

Let’s consider a simple example with the initial state 001 and the polynomial $x^3 + x + 1$, giving us the periodic sequence 0011101. Suppose we transmit a signal based on this sequence and receive a shifted version like 1110100. By comparing the received sequence with all possible cyclic shifts of the original sequence, we find the greatest correlation when the original is rotated by two positions (which yields exactly 1110100). This shift corresponds to the time delay, which is related to the distance of the round trip of the transmission.
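A small Python sketch of this correlation search, mapping bits to $\pm 1$ so that agreements and disagreements cancel:

```python
# Compare the received sequence against every cyclic shift of the
# reference sequence; bits are mapped 0 -> +1, 1 -> -1.
ref = [int(c) for c in "0011101"]    # one period of the LFSR output
rx  = [int(c) for c in "1110100"]    # received, delayed version

def corr(a, b):
    return sum((1 - 2 * x) * (1 - 2 * y) for x, y in zip(a, b))

for s in range(len(ref)):
    shifted = ref[s:] + ref[:s]      # ref rotated by s positions
    print(s, corr(rx, shifted))
# For an m-sequence the correlation is 7 at the aligning shift and -1
# at every other shift, so the single peak identifies the delay.
```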

Properties of Finite Fields

Finite fields, also known as Galois fields, possess some remarkable properties:

  1. Existence and Uniqueness: For every prime $p$ and positive integer $n$, there exists a unique (up to isomorphism) finite field with $p^n$ elements, denoted $GF(p^n)$. This means any two finite fields with the same number of elements are essentially the same, just with potentially different representations of their elements.

    Construction of Finite Fields

    Recall: Finite fields are constructed as quotient rings of polynomial rings. $GF(p^n)$ is constructed as $\mathbb{Z}_p[x]/(m(x))$, where $m(x)$ is an irreducible polynomial of degree $n$ over $\mathbb{Z}_p$. $GF(p)$ itself is isomorphic to $\mathbb{Z}_p$.

  2. No Other Finite Fields: There are no other finite fields besides those of the form $GF(p^n)$. Every finite field must have a prime power number of elements.

  3. Cyclic Multiplicative Group: The multiplicative group $GF(q)^*$ of any finite field (where $q = p^n$) is cyclic. This means there exists an element $g$, called a generator or primitive element, such that every non-zero element of the field can be expressed as a power of $g$. (A small generator search is sketched below.)
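As a small illustration of the cyclic property (our own construction, not from the lecture): a brute-force search for a generator of $GF(9)^*$, with $GF(9)$ built as $\mathbb{Z}_3[x]/(x^2 + 1)$:

```python
# Elements of GF(9) = Z_3[x]/(x^2 + 1) as pairs (a0, a1) meaning a0 + a1*x;
# arithmetic is mod 3 and uses x^2 = -1.
from itertools import product

def mul(a, b):
    a0, a1 = a
    b0, b1 = b
    return ((a0 * b0 - a1 * b1) % 3, (a0 * b1 + a1 * b0) % 3)

nonzero = [e for e in product(range(3), repeat=2) if e != (0, 0)]
for g in nonzero:
    p, powers = (1, 0), set()
    for _ in range(8):                 # |GF(9)*| = 8
        p = mul(p, g)
        powers.add(p)
    if len(powers) == 8:               # the powers of g exhaust GF(9)*
        print("generator:", g)         # first hit: (1, 1), i.e. 1 + x
        break
```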

Error Correcting Codes

Error correcting codes are crucial for ensuring data integrity in digital communication and storage systems. They protect against data corruption caused by noise, interference, and other sources of error.

The Need for Error Correction: Storage media and communication channels are inherently imperfect. Bits can be flipped or corrupted during transmission or storage, leading to data loss or misinterpretation. Error correcting codes add redundancy to the data, enabling the detection and correction of these errors.

Repetition Codes: A Simple Approach

A basic error correcting code is the repetition code. The idea is simple: repeat each bit multiple times to create redundancy.

  • Encoding: A 3-bit repetition code encodes 0 as 000 and 1 as 111.
  • Decoding (Majority Voting): The received sequence is decoded by majority voting. For example, 101 is decoded as 1.

Effectiveness and the Binary Symmetric Channel:

The binary symmetric channel (BSC) is a simple model for a noisy communication channel.

  • $p$: Probability of a bit flip (error).
  • $1 - p$: Probability of correct transmission.

Error Probability Calculation

  • Without error correction: $\Pr[\text{error}] = p$.
  • With 3-bit repetition code: $\Pr[\text{error}] = 3p^2(1 - p) + p^3 = 3p^2 - 2p^3 \approx 3p^2$ for small $p$.
  • With 5-bit repetition code: $\Pr[\text{error}] = \binom{5}{3}p^3(1 - p)^2 + \binom{5}{4}p^4(1 - p) + p^5 \approx 10p^3$ for small $p$.

Increasing redundancy (more repetitions) improves error correction but at the cost of increased bandwidth or storage.
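A quick Monte Carlo sketch of the 3-bit repetition code over a BSC, comparing the simulated decoding-error rate with the $3p^2$ approximation (the parameter values are arbitrary):

```python
import random

def bsc(bits, p):
    # flip each bit independently with probability p
    return [b ^ (random.random() < p) for b in bits]

def repetition_error_rate(p, trials=100_000):
    errors = 0
    for _ in range(trials):
        word = bsc([0, 0, 0], p)          # transmit 0 as 000
        decoded = int(sum(word) >= 2)     # majority vote
        errors += (decoded != 0)
    return errors / trials

p = 0.01
print(repetition_error_rate(p), 3 * p**2)  # both should be close to 3e-4
```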

Hamming Codes: A More Sophisticated Approach

Hamming codes offer a more efficient way to achieve error correction. They rely on the concept of Hamming distance.

Hamming Distance

The Hamming distance between two bitstrings of equal length is the number of positions at which the corresponding bits differ.

Encoding for Larger Hamming Distance: The core principle is to encode messages into codewords such that the minimum Hamming distance ($d_{\min}$) between any two codewords is maximized. A larger $d_{\min}$ allows for the correction of more errors.

Error Correction Capability

A code with minimum Hamming distance $d$ can correct up to $t = \lfloor (d - 1)/2 \rfloor$ errors.

Decoding and Error Correction: Decoding involves finding the closest valid codeword (in terms of Hamming distance) to the received word.

Constructing Hamming Codes: Hamming codes use parity bits (linear combinations of message bits) to achieve a larger Hamming distance.

(7, 4) Hamming Code

A (7, 4) Hamming code encodes 4 message bits into 7-bit codewords. The encoding can be represented as a multiplication of the message vector by a generator matrix over $\mathbb{Z}_2$. Decoding involves checking parity bits. Failed parity checks indicate errors, and the pattern of failures pinpoints the error location.
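One common layout, sketched in Python: data bits at positions 3, 5, 6, 7 and parity bits at positions 1, 2, 4, so that the syndrome of failed parity checks is exactly the position of a single-bit error (other presentations permute the bits):

```python
def encode(d):                            # d = [d1, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4                     # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4                     # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4                     # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]   # entries are positions 1..7

def decode(r):                            # r = received 7-bit word
    s = 0
    for check in (1, 2, 4):               # each failed check adds its index
        parity = 0
        for pos in range(1, 8):
            if pos & check:
                parity ^= r[pos - 1]
        if parity:
            s += check
    if s:                                 # nonzero syndrome: flip that bit
        r[s - 1] ^= 1
    return [r[2], r[4], r[5], r[6]]       # extract the data bits

c = encode([1, 0, 1, 1])
c[4] ^= 1                                 # inject a single-bit error
print(decode(c))                          # recovers [1, 0, 1, 1]
```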

Formalizing Encoding and Decoding

  • Encoding Function: $E \colon \{0,1\}^k \to \{0,1\}^n$ maps a $k$-bit message to an $n$-bit codeword.
  • Decoding Function: $D \colon \{0,1\}^n \to \{0,1\}^k$ maps an $n$-bit received word to a $k$-bit message.

The decoding function should recover the original message if the number of errors is within the code’s correction capability: if $d_H(r, E(m)) \leq t$, then $D(r) = m$, where $r$ is the received word, $m$ is the original message, and $t$ is the error correction capability.

Reed-Solomon Codes and Finite Fields

You might need to skip to the final few chapters…

Reed-Solomon codes, a powerful class of error-correcting codes, utilize finite fields and polynomial evaluation.

Reed-Solomon Encoding

Encoding in a Reed-Solomon code over a finite field $GF(q)$ involves:

  1. Message as a Polynomial: Represent the message $(a_0, a_1, \dots, a_{k-1})$ as a polynomial $a(x) = a_0 + a_1 x + \dots + a_{k-1} x^{k-1}$ with coefficients in $GF(q)$.
  2. Evaluating the Polynomial: Choose $n$ distinct elements $\alpha_1, \dots, \alpha_n$ from $GF(q)$. The codeword is then $(a(\alpha_1), \dots, a(\alpha_n))$. Note that $n \leq q$. This is because there are at most $q$ distinct elements in $GF(q)$. (A small encoding sketch follows below.)
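A minimal encoding sketch over the toy field $GF(5)$ (real systems use larger fields such as $GF(256)$; names and parameters here are our own):

```python
q = 5                                   # GF(5): arithmetic is mod 5

def rs_encode(msg, points):
    # msg = (a_0, ..., a_{k-1}) is read as a(x) = sum_i a_i x^i;
    # the codeword is (a(alpha_1), ..., a(alpha_n)).
    return [sum(a * pow(alpha, i, q) for i, a in enumerate(msg)) % q
            for alpha in points]

print(rs_encode([1, 1], [0, 1, 2, 3]))  # a(x) = 1 + x  ->  [1, 2, 3, 4]
```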

Minimum Distance of Reed-Solomon Codes

A Reed-Solomon code with parameters $(n, k)$ has a minimum distance $d = n - k + 1$.

Proof

1. Core Idea: Polynomials and Roots

A non-zero polynomial of degree $d$ can have at most $d$ roots in any field.

2. Reed-Solomon Codewords as Polynomial Evaluations

Recall how Reed-Solomon codes work:

  • Message Polynomial: A message is represented as a polynomial $a(x)$ of degree at most $k - 1$.
  • Codeword: The codeword is obtained by evaluating $a(x)$ at $n$ distinct points $\alpha_1, \dots, \alpha_n$ in the finite field $GF(q)$: $c = (a(\alpha_1), \dots, a(\alpha_n))$.

3. Considering the Difference of Two Polynomials

Now, let’s consider two different messages with message polynomials $a(x)$ and $b(x)$, both of degree at most $k - 1$. Their corresponding codewords are $c_a = (a(\alpha_1), \dots, a(\alpha_n))$ and $c_b = (b(\alpha_1), \dots, b(\alpha_n))$. We want to find the minimum Hamming distance between $c_a$ and $c_b$, which is the number of positions where they differ.

Let $d(x) = a(x) - b(x)$. This is a non-zero polynomial of degree at most $k - 1$. The codewords $c_a$ and $c_b$ agree at a position $i$ if and only if $a(\alpha_i) = b(\alpha_i)$, which is equivalent to $d(\alpha_i) = 0$. In other words, the codewords agree at position $i$ exactly when $\alpha_i$ is a root of $d(x)$.

4. Connecting to Hamming Distance

The Hamming distance between $c_a$ and $c_b$ is the number of positions where they differ, which is $n$ minus the number of positions where they agree. Agreement at position $i$ means $d(\alpha_i) = 0$. From step 1 we know that $d(x)$, being a non-zero polynomial of degree at most $k - 1$, has at most $k - 1$ distinct roots. Since the $n$ evaluation points $\alpha_1, \dots, \alpha_n$ are distinct, at most $k - 1$ of them can be roots of $d(x)$, so the codewords agree in at most $k - 1$ positions. They must therefore disagree in at least $n - (k - 1) = n - k + 1$ positions. This gives $d_{\min} \geq n - k + 1$.

Example

Let’s consider a simple example over $GF(5)$ with $n = 4$ and $k = 2$. This means our message polynomials will have degree at most $1$.

  • Evaluation Points: Let’s choose $\alpha_1 = 0$, $\alpha_2 = 1$, $\alpha_3 = 2$, and $\alpha_4 = 3$.
  • Message 1: $a(x) = 1 + x$. The codeword is $c_a = (1, 2, 3, 4)$.
  • Message 2: $b(x) = 2x$. The codeword is $c_b = (0, 2, 4, 1)$.
  • Difference Polynomial: $d(x) = a(x) - b(x) = 1 - x$.
  • Roots of d(x): $d(x)$ has one root in $GF(5)$, namely $x = 1$. This corresponds to the second position, where the codewords agree ($a(1) = b(1) = 2$).
  • Hamming Distance: The Hamming distance between $c_a$ and $c_b$ is 3 (they differ in the first, third, and fourth positions). This is consistent with $d_{\min} = n - k + 1 = 3$.

This example demonstrates how the number of agreement points between two codewords relates to the roots of the difference polynomial and ultimately determines the minimum distance of the Reed-Solomon code.
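The example can be re-checked mechanically with the same evaluation encoding as in the sketch earlier:

```python
q, points = 5, [0, 1, 2, 3]

def evaluate(coeffs, alpha):            # coeffs = (a_0, a_1, ...)
    return sum(c * pow(alpha, i, q) for i, c in enumerate(coeffs)) % q

ca = [evaluate([1, 1], a) for a in points]   # a(x) = 1 + x
cb = [evaluate([0, 2], a) for a in points]   # b(x) = 2x
dist = sum(x != y for x, y in zip(ca, cb))
print(ca, cb, dist)                          # [1, 2, 3, 4] [0, 2, 4, 1] 3
```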

Practical Application of Reed-Solomon Codes

Reed-Solomon codes are highly effective for correcting burst errors, making them suitable for applications like CDs, DVDs, barcodes, and data transmission.

Let’s consider a practical example:

1. (256, 224) Reed-Solomon Code:

A (256, 224) Reed-Solomon code encodes 224 information symbols into 256 codeword symbols. Typically, symbols are bytes (8 bits), so this code takes 224 bytes of data and adds 32 bytes of redundancy.

  • Finite Field: In practice, $GF(2^8) = GF(256)$ is often used. This field allows for representing each symbol as a byte. We could theoretically choose $n$ as large as $q = 256$; here we require exactly $n = 256$, so $GF(256)$ suffices, and each symbol is a byte, with $n = 256$ and $k = 224$. This leads to $d = n - k + 1 = 33$, meaning we can correct up to $\lfloor (d - 1)/2 \rfloor = 16$ errors.

2. Interleaving for Burst Error Correction:

On CDs, errors often occur in bursts due to scratches. A single scratch could corrupt multiple consecutive bytes, rendering straightforward Reed-Solomon decoding ineffective. To address this, interleaving is used.

  • Spreading the Data: The data is spread out across the CD in a specific pattern before encoding. This ensures that a localized burst error affects multiple codewords rather than a long continuous segment within a single codeword. The bytes of each codeword are separated on the CD by $s$ positions, where $s$ is large enough that no realistic scratch can destroy more symbols of one codeword than the code can correct. A burst error spanning several consecutive symbols is thereby distributed over many codewords, each of which can be decoded using the same principle (see the sketch below).
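A toy Python sketch of block interleaving (rows are codewords, transmission is column by column; the layout is our own simplification, not the exact CD scheme):

```python
rows, cols = 4, 7                       # 4 interleaved codewords of length 7
codewords = [[f"c{r}{c}" for c in range(cols)] for r in range(rows)]

# transmit column by column: consecutive symbols come from different codewords
transmitted = [codewords[r][c] for c in range(cols) for r in range(rows)]

burst = set(transmitted[9:13])          # a burst of 4 consecutive symbols
for r in range(rows):
    hits = sum(1 for sym in codewords[r] if sym in burst)
    print(f"codeword {r}: {hits} corrupted symbol(s)")   # at most 1 each
```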

3. Example: Double Interleaving on CDs:

CDs often employ double interleaving:

  • Inner Code: A (28, 24) Reed-Solomon code is used. Thus $d = 28 - 24 + 1 = 5$ and $t = \lfloor (5 - 1)/2 \rfloor = 2$. This means we can correct two errors per codeword with this interleaving strategy.
  • Outer Code: Another (32, 28) Reed-Solomon code is applied to the output of the inner code. Thus $d = 32 - 28 + 1 = 5$ and $t = 2$. Again we can correct two errors per codeword.

This double interleaving scheme enhances the ability to correct burst errors, ensuring data integrity even in the presence of scratches or other localized damage. This illustrates how theoretical concepts from coding theory, like Reed-Solomon codes and Hamming distance, are combined with practical techniques like interleaving to create robust and reliable data storage and transmission systems.

Continue here: 21 Logic, Proof Systems, Logical Consequence, Syntactic Derivation