01 What is Theoretical Computer Science?

The Big Questions

Welcome to Theoretical Computer Science. Before we dive into the technical details, let’s take a moment to appreciate the landscape. What is this field really about? At its heart, this course revolves around two fundamental questions about computation:

What are the absolute limits of what computers can do? Are there problems that are fundamentally impossible to solve with an algorithm, no matter how clever we are or how powerful our computers become? This is the realm of Computability.
Of the problems we can solve, how efficiently can we solve them? Just because a problem is solvable in principle doesn’t mean we can solve it in practice. Some problems seem to require an astronomical amount of time or resources. Can we find clever shortcuts, or is this difficulty inherent to the problem itself? This is the realm of Complexity.

To even begin answering these questions, we need to get a firm grip on the central concept that underpins everything: the algorithm.

What is an “Algorithm”?

We all have an intuitive feel for what an algorithm is. Wikipedia calls it “an unambiguous specification of how to solve a class of problems.” It consists of a finite number of well-defined steps.

This sounds reasonable. A recipe for baking a cake, a set of directions to a friend’s house, or a Python script to test for primality all seem to fit this description. The key idea is that the steps should be so clear that they can be executed without any creativity or intelligence. A machine could do it.

But there’s a subtle problem here. Words like “unambiguous” and “well-defined” are a bit… fuzzy. They aren’t mathematically precise. If we want to prove that something is impossible for any algorithm to do, we can’t rely on a fuzzy, informal definition. We need to build our house on a rock-solid mathematical foundation.

This is one of the first great triumphs of theoretical computer science: formalizing the intuitive notion of an algorithm.

The Dawn of Computation: Alan Turing

This brings us to one of the giants of the 20th century, Alan Turing. In 1936, at the age of 24, he published a paper titled “On Computable Numbers, with an Application to the Entscheidungsproblem.”

This paper was truly visionary. In it, Turing did two remarkable things:

He proposed a formal, mathematical model of a computer - what we now call a Turing Machine - to precisely define what an “algorithm” is.
He immediately used this new definition to prove that there are problems that no algorithm can solve.

The core of his argument is surprisingly simple and elegant, and you’ve already seen the key ingredient in Discrete Mathematics. It boils down to this: you can list out all possible algorithms (they are countably infinite), but the number of possible problems is vastly larger (uncountably infinite). Therefore, there must be problems for which no algorithm exists.

We’re going to explore this from a computer scientist’s perspective, focusing on a concrete, famous example.

The Halting Problem: A Wall We Can’t Climb

One of the most famous uncomputable problems is the Halting Problem. It sounds deceptively simple:

Can we write a single program, let’s call it doesItHalt(program_code), that takes the source code of any other program as input and determines whether that program will eventually stop (halt) or run forever in an infinite loop?

At first glance, this doesn’t seem so hard.

For this simple Python snippet, it’s obvious it halts:

# Halts
print("Hello, World!")

And for this one, it’s equally obvious it runs forever:

# Doesn't halt
while True:
    pass
print("This will never be printed.")

So far, so good. But what about this one?

def is_prime(k):
    # A simple primality test
    if k <= 1: return False
    for i in range(2, int(k**0.5) + 1):
        if k % i == 0:
            return False
    return True
 
n = 4
while True:
    found_pair = False
    # Check all pairs of primes (p, q) that sum to n
    for p in range(2, n // 2 + 1):
        if is_prime(p):
            q = n - p
            if is_prime(q):
                found_pair = True
                break # Found a pair, break inner loop
    
    if not found_pair:
        # If we checked all pairs and found none, we have a counterexample!
        break # Halt the program
    
    # Move to the next even number
    n += 2
 
print(f"Found an even number that is not the sum of two primes: {n}")

Does this program halt?

Take a moment to understand what it’s doing. It starts with n=4 and checks if n can be written as the sum of two prime numbers (e.g., $4 = 2 + 2$ , $6 = 3 + 3$ , $8 = 5 + 3$ , etc.). If it finds such a pair, it moves to the next even number and repeats. The program only halts if it finds an even number greater than 2 that cannot be written as the sum of two primes.

Whether this program halts is equivalent to asking whether the Goldbach Conjecture is false. This is one of the oldest and most famous unsolved problems in all of mathematics. Nobody on Earth knows the answer.

This example gives us a powerful intuition: determining if a program halts isn’t just about finding while True loops. It can be as difficult as solving profound, open mathematical questions. Turing proved, with mathematical certainty, that no general algorithm for the Halting Problem can possibly exist.

Efficiency: The P vs. NP Problem

So, we’ve established that there are things computers fundamentally cannot do. But what about the things they can do?

Many practically important problems are, thankfully, computable. A classic example is the Traveling Salesperson Problem (TSP): given a set of cities and the distances between them, find the shortest possible tour that visits each city exactly once and returns to the origin.

How would you write an algorithm for this? The straightforward approach is brute force:

List every possible ordering of the cities (every possible tour).
Calculate the total distance for each tour.
Pick the tour with the smallest distance.

This algorithm is guaranteed to be correct. It’s also computable. But it’s horrifically inefficient. The number of possible tours grows factorially with the number of cities. For just 20 cities, the number of tours is astronomical. This “try everything” approach quickly becomes impossible in practice.

This leads us to the second great question: Must we really try all possibilities? Or is there a clever shortcut, an algorithm that can find the best tour without this exponential explosion of work?

For TSP, and thousands of other important problems in a class called NP, we don’t know the answer. This is the essence of the famous P vs. NP problem, one of the biggest open questions in computer science and mathematics.

P (Polynomial Time): Problems that can be solved efficiently (e.g., finding the shortest path between two points).
NP (Nondeterministic Polynomial Time): Problems where a proposed solution can be checked efficiently (e.g., if someone gives you a tour for TSP, you can easily calculate its length).

The question is: if you can check a solution quickly, can you also find it quickly? Is P equal to NP? We suspect not, but nobody has been able to prove it.

Course Outline & Organization

This course will guide you through these foundational ideas.

Course Structure

Formalizing Computation: We’ll start with simple models like Finite Automata to build intuition, then move to the all-powerful Turing Machine to formally define “algorithm.”
Computability Theory: We’ll follow in Turing’s footsteps to understand the limits of computation, exploring the Halting Problem and other undecidable problems.
Complexity Theory: We’ll shift our focus to efficiency, classifying problems by how many resources (time, memory) they require. This will lead us to the heart of the P vs. NP problem and the theory of NP-Completeness.
Grammars and Formal Languages: We’ll explore an alternative perspective on computation, looking at how rules can generate complex structures.

The “Didactic Experiment”

This course is structured to encourage continuous engagement. Here’s how it works:

Weekly Exercises: You’ll work on problem sets (in groups of up to 3) to deepen your understanding. Submitting these is done via Moodle.
Qualification: To qualify for the mid-terms, you must achieve at least 50% of the total points on the exercises.
Two Voluntary Mid-Term Exams: During the semester, we will offer two mid-term exams.
The Deal: If you qualify for and pass both mid-term exams, you have the option to accept the resulting grade as your final grade for the course. You would not have to take the final session exam.
The Choice: If you’re not happy with your mid-term grade, or if you don’t pass them, you can still take the final session exam. However, the grade from the final exam will be your definitive grade, for better or worse.

Why do we do this? Experience has shown that this model transforms learning. It encourages you to work consistently throughout the semester, which is the most effective way to master this material. In the past, this approach has dramatically reduced failure rates from as high as 40-60% to consistently below 10% - not because the course is easier, but because students are better prepared.

All materials, announcements, and submissions will be handled through Moodle.

The Nature of Mathematics and Computer Science

To close our introduction, let’s zoom out and think about what we’re really doing here. What is mathematics? What is computer science?

Mathematics as a Language and a Tool

Mathematics is often defined as the study of abstract structures. But it’s more useful to think of it as two things:

A language designed for absolute precision and objectivity.
A research instrument for describing and investigating the world.

The ancient Pythagoreans had the ambitious goal of describing the entire universe with mathematics. Their motto was “All is number.” They believed everything could be explained with natural numbers and their ratios (rational numbers). This worldview was shattered when they discovered that the diagonal of a unit square, $2$ , could not be expressed as a ratio of two integers. It was an “irrational” number.

But here is a beautiful connection to our field: while you can’t write $2$ as a finite fraction, you can write a finite algorithm that generates its digits to any precision you desire. The notion of computation gives us a new, powerful way to describe numbers that the ancients struggled with.

Leibniz and the Dream of Automation

The philosopher and mathematician Gottfried Wilhelm Leibniz saw mathematics as the “science of automation.” His vision was a three-step process:

Translate a real-world problem into the formal language of mathematics.
Solve the problem within mathematics using formal rules.
Translate the solution back into a real-world conclusion.

He saw step 2 as a process that could, in principle, be automated. He realized that arithmetic is just a set of rules for manipulating symbols. The correctness of a calculation can be checked mechanically, without understanding. His brilliant idea was to extend this to all of human reasoning. He dreamed of a formal logic - a set of rules for correct argumentation - so powerful that any dispute could be settled by saying “Let us calculate.”

This required formalizing the very building blocks of the language. This is the role of axioms.

Axioms are not unproven beliefs. They are the definitions of our fundamental concepts.

They are the starting points, the initial words with clear meaning, from which we build the entire linguistic structure of mathematics. Mathematics grows by adding new axiomatic concepts: from arithmetic and geometry to logic, infinity, probability, and finally, computation.

The Birth of Computer Science

The dream of a complete mathematics, where every question could be answered, was pursued by David Hilbert. But in 1931, Kurt Gödel proved his famous Incompleteness Theorems, showing that any sufficiently powerful mathematical system will contain true statements that it cannot prove.

This was a seismic shock. It led directly to the questions Turing tackled: Are there algorithmic problems that are unsolvable? As we’ve seen, the answer is yes.

This is where Computer Science was born as a formal discipline. It is the science of the automatic processing of information. Its core components are ancient:

Data/Information: The need to store information outside the brain led to counting (70,000 years ago) and writing (the solution to Mesopotamia’s “first big data crisis”).
Algorithms: The idea of a step-by-step procedure is as old as teaching someone how to do something. We find formal algorithms on 4,000-year-old Babylonian tablets.
Technology: The drive to automate computation with machines is also centuries old, from Leibniz’s mechanical calculators to modern electronics.

Our Didactic Approach

Finally, a word on how we will teach this course. We will not simply present you with a finished product - a list of definitions, theorems, and proofs to memorize. Science is not a collection of facts; it is a process of discovery.

Our goal is to take you on that journey. We will explore how these ideas came to be: the questions that motivated them, the dead ends, the brilliant insights, and the new questions that arose from each discovery. The goal is not just for you to know theoretical computer science, but to learn how to think like a theoretical computer scientist.

Continue here: 02 The Language of Computation - Alphabets, Words, and Languages

CS Notes

Explorer