Lecture from: 26.03.2024 | Video: Video ETHZ

Shared Memory Concurrency, Locks, and Data Races

Continuation…

Canonical Example: Bank Account

Let’s consider a simple BankAccount class. In a single-threaded world, this code works perfectly fine:

class BankAccount {
    private int balance = 0;
 
    int getBalance() {
        return balance;
    }
 
    void setBalance(int x) {
        balance = x;
    }
 
    void withdraw(int amount) {
        int b = getBalance(); // Read balance
        if (amount > b) {
            throw new WithdrawTooLargeException();
        }
        setBalance(b - amount); // Write new balance
    }
    // ... other operations like deposit, etc.
}

The withdraw method reads the balance, checks if the withdrawal is possible, and then updates the balance.

A Bad Interleaving

Now, imagine two threads executing withdraw(100) on the same BankAccount object concurrently, with an initial balance of 150. Due to thread interleaving, the following scenario (a “bad interleaving”) can occur:

  1. Thread 1: Reads balance (b = 150).
  2. Thread 2: Reads balance (b = 150). (Thread 1 gets preempted before writing).
  3. Thread 2: Checks amount > b (100 > 150 is false).
  4. Thread 2: Calculates b - amount (150 - 100 = 50).
  5. Thread 2: Calls setBalance(50). The balance is now 50.
  6. Thread 1: Resumes execution. It already read b as 150.
  7. Thread 1: Checks amount > b (100 > 150 is false).
  8. Thread 1: Calculates b - amount (150 - 100 = 50).
  9. Thread 1: Calls setBalance(50). The balance remains 50.

Result: The final balance is 50, yet both withdrawals handed out money. In a correct sequential execution, the first withdrawal would leave a balance of 50 and the second would throw WithdrawTooLargeException. Here only one subtraction took effect: one withdrawal was effectively “lost”!
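The lost update above can be reproduced deterministically. The following sketch (my own demo, not lecture code) inserts a sleep between the read and the write so both threads reliably read the same balance before either writes:

```java
import java.util.concurrent.CountDownLatch;

// Demo: widen the race window on purpose so the bad interleaving
// happens essentially every run.
class LostWithdrawDemo {
    static class RacyAccount {
        private int balance = 150;
        int getBalance() { return balance; }
        void setBalance(int x) { balance = x; }

        void withdraw(int amount) throws InterruptedException {
            int b = getBalance();          // read
            Thread.sleep(200);             // deliberately widen the race window
            if (amount > b) throw new IllegalStateException("too large");
            setBalance(b - amount);        // write: clobbers the other thread's update
        }
    }

    static int run() throws InterruptedException {
        RacyAccount acc = new RacyAccount();
        CountDownLatch start = new CountDownLatch(1);
        Runnable w = () -> {
            try { start.await(); acc.withdraw(100); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };
        Thread t1 = new Thread(w), t2 = new Thread(w);
        t1.start(); t2.start();
        start.countDown();                 // release both threads together
        t1.join(); t2.join();
        return acc.getBalance();           // both threads read 150, both write 50
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("final balance = " + run()); // prints 50
    }
}
```

Both withdrawals complete without an exception, yet the balance drops by only 100.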

Interleaving (Recap)

  • Interleaving: When operations from multiple threads execute in an alternating, potentially unpredictable sequence because their execution overlaps in time.
  • Preemption: Interleaving can happen even on a single processor due to time-slicing, where the operating system preempts (interrupts) one thread to let another run.
  • No Problem if Different Resources: If threads access completely different, independent resources (e.g., different BankAccount objects), interleaving is usually not an issue.
  • Problem with Aliasing: If multiple threads access the same resource (the objects are aliases), interleaving can cause problems like the “lost withdraw”.

Incorrect “Fixes”

It’s tempting to try fixing bad interleavings by simply rearranging code or repeating operations. This is almost always wrong.

Consider this attempted fix:

void withdraw(int amount) {
    if (amount > getBalance()) // Check balance again right before setting
        throw new WithdrawTooLargeException();
    // Maybe balance changed between the check and here?
    setBalance(getBalance() - amount); // Read balance *again*
}

Why this is wrong:

  1. Still Not Atomic: There’s still a small window between reading the balance in the if condition and reading it again inside setBalance. Another thread could modify the balance in that window.
  2. Compiler Reordering: The compiler might reorder operations (especially if it doesn’t know about concurrency), potentially turning this back into the original, flawed version.
  3. New Bugs: This specific “fix” can lead to a negative balance! Imagine:
    • Balance = 150.
    • Thread 1: Checks amount (100) > getBalance() (150) false.
    • Thread 2: Executes withdraw(100) completely. Balance becomes 50.
    • Thread 1: Resumes. Calls setBalance(getBalance() (50) - amount (100)). Balance becomes -50!

The fundamental problem is the lack of atomicity. The sequence of “read-check-write” must happen as a single, indivisible operation.

Mutual Exclusion: The Sane Fix

The correct solution is to enforce mutual exclusion.

  • Allow at most one thread to execute the withdraw operation (and other related operations like deposit) on a specific account A at any given time.
  • Mutual Exclusion: If one thread is currently using a shared resource (like the BankAccount), any other thread trying to use the same resource must wait.
  • Critical Sections: The sections of code that access the shared resource (e.g., the body of the withdraw method) are called critical sections.

Programmer’s Responsibility:

  • The programmer must identify critical sections.
  • The programmer must use language primitives (like locks) to ensure mutual exclusion for these critical sections. The compiler generally cannot automatically determine which interleavings are safe or unsafe based on program semantics.

Critical Sections and Mutual Exclusion Defined

  • Critical Section: A piece of code that accesses a shared resource and must not be executed by more than one thread simultaneously.
    // Critical Section for withdraw:
    int b = getBalance();
    if (amount > b) throw new WithdrawTooLargeException();
    setBalance(b - amount);
  • Mutual Exclusion: The property that ensures only one thread can execute a critical section at a time. An algorithm that enforces this typically involves:
    • acquire_mutex(): Code executed before entering the critical section. If another thread is already in, this call blocks until the critical section is free.
    • ... (critical section code)
    • release_mutex(): Code executed after leaving the critical section, allowing another waiting thread (if any) to enter.

Wrong! Implementing Our Own Mutex

Why can’t we just use a simple boolean flag to implement mutual exclusion?

class BankAccount {
    private int balance = 0;
    private boolean busy = false; // Our "lock" flag
 
    void withdraw(int amount) {
        while (busy) { /* "spin-wait" */ } // Wait if busy
        busy = true; // Mark as busy
        // --- Critical Section Start ---
        int b = getBalance();
        if (amount > b)
            throw new WithdrawTooLargeException();
        setBalance(b - amount);
        // --- Critical Section End ---
        busy = false; // Mark as not busy
    }
    // deposit would spin on the same boolean
}

This doesn’t work! We’ve just moved the race condition. Two threads could simultaneously:

  1. Thread 1: Checks while(busy) (false).
  2. Thread 2: Checks while(busy) (false). (Thread 1 preempted before setting busy = true).
  3. Thread 2: Sets busy = true.
  4. Thread 2: Enters the critical section.
  5. Thread 1: Resumes. Sets busy = true (it’s already true, but no matter).
  6. Thread 1: Also enters the critical section!

We need the check (while(busy)) and the set (busy = true) to happen atomically.
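For illustration only (not lecture code, and exactly the kind of thing the next section says we should leave to the language/runtime): java.util.concurrent.atomic.AtomicBoolean provides an atomic compareAndSet, the fused check-and-set that the plain boolean flag was missing:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a busy-waiting lock: the check and the set are fused into
// one atomic step by compareAndSet.
class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // Atomically: "if not held, mark as held"; retry (spin) until it succeeds.
        while (!held.compareAndSet(false, true)) { /* spin-wait */ }
    }

    void unlock() {
        held.set(false);
    }
}
```

Real lock implementations block the waiting thread instead of burning CPU in a spin loop; the point here is only that the check and the set happen as one indivisible operation.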

What We Need: Locks

To solve this, we need help from the programming language or runtime system in the form of locks (also known as mutexes).

A lock is a basic synchronization primitive with atomic operations:

  • new: Creates a new lock, initially in the “not held” state.
  • acquire:
    • Checks if the lock is “held”.
    • If held, the calling thread blocks (waits) until the lock is released.
    • If not held, the lock is atomically marked as “held” by the calling thread, and the thread proceeds.
  • release:
    • Marks the lock as “not held”.
    • If one or more threads were blocked waiting to acquire this lock, exactly one of them is unblocked and allowed to acquire the lock.

Why Locks Work:

  • The acquire and release operations are atomic. The lock implementation (using special hardware instructions and/or operating system support) guarantees that even with simultaneous attempts to acquire/release, the lock’s state remains consistent, and only one thread holds the lock at a time.
  • We treat the lock mechanism itself as a primitive; we don’t implement it ourselves in standard application code.

The Lock Object Interface

Conceptually, a lock can be thought of as an object satisfying this interface:

public interface Lock {
    public void lock();   // Acquire the lock (blocks if necessary) - Entering CS
    public void unlock(); // Release the lock - Leaving CS
}

The semantics are:

  • new Lock(): Creates a lock, initially “not held”.
  • acquire (lock()): Blocks if held. Atomically makes the lock “held” when acquired.
  • release (unlock()): Makes the lock “not held”. Unblocks one waiting thread, if any.

Required Properties of Mutual Exclusion

Any correct mutual exclusion implementation (like a lock) must satisfy:

  • Safety Property: At most one thread executes the critical section code at any given time. (Prevents interference).
  • Liveness Property: A thread attempting to acquire the lock (acquire_mutex) must eventually succeed if no other thread is holding the lock indefinitely. (Prevents indefinite waiting/starvation).

Using Locks (Almost Correctly)

Using our conceptual Lock interface, we can protect the BankAccount:

class BankAccount {
    private int balance = 0;
    private Lock lk = new Lock(); // One lock *per account*
 
    void withdraw(int amount) {
        lk.lock(); // Acquire lock before accessing shared state
        // --- Critical Section Start ---
        int b = getBalance();
        if (amount > b) {
            throw new WithdrawTooLargeException(); // Uh oh, what if exception happens?
        }
        setBalance(b - amount);
        // --- Critical Section End ---
        lk.unlock(); // Release lock after accessing shared state
    }
 
    // deposit would also acquire/release lk
}

Important: Each BankAccount instance should have its own lock (lk) to allow concurrent operations on different accounts.

Exception Handling: The code above has a flaw! If WithdrawTooLargeException (or any other exception) occurs within the critical section before lk.unlock(), the lock will never be released, potentially blocking other threads forever. The correct pattern uses try...finally:

lk.lock();
try {
    // Critical Section
} finally {
    lk.unlock(); // Ensures unlock happens even if exceptions occur
}

Possible Mistakes with Locks

  • Incorrect: Using Different Locks: If withdraw uses lockA and deposit uses lockB, they won’t provide mutual exclusion for the shared balance. Both methods must acquire the same lock (lk in our example) to protect the shared balance.
  • Poor Performance: Using the Same Lock for Everything: If we used one single static lock for all bank accounts, only one thread could perform any operation on any account at a time, killing concurrency.
  • Incorrect: Forgetting to Release: As noted, failing to release a lock (especially due to exceptions) is a serious bug that can cause deadlock. Always use try...finally.

Locking and Other Operations

If withdraw and deposit use the same lock for a given account, they are correctly synchronized with each other. But what about helper methods like getBalance and setBalance?

  • Assume they are public (accessible from outside).
  • If getBalance and setBalance do not acquire the lock: a thread inside withdraw (holding the lock) can still race with another thread that calls setBalance directly, since that direct call reads or writes balance without any lock.
  • If getBalance and setBalance do acquire the same lock: a problem arises! With an ordinary (non-re-entrant) lock, withdraw acquires lk and then calls setBalance, which tries to acquire lk again; the thread blocks forever waiting for a lock it already holds. This is a self-deadlock.

Re-acquiring Locks: Re-entrant Locks

How can a thread acquire a lock it already holds?

  • Approach 1 (Restrictive): Disallow withdraw from calling a synchronized setBalance. This often leads to code duplication or awkward structuring.
  • Approach 2 (Better): Use re-entrant locks (also called recursive locks).

Re-entrant Lock:

  • “Remembers” which thread currently holds the lock.
  • Maintains a count of how many times the holding thread has acquired the lock.
  • Acquire:
    • If the lock is not held, acquire it and set count to 1.
    • If the lock is held by the current thread, simply increment the count. The thread does not block.
    • If the lock is held by a different thread, block as usual.
  • Release:
    • Decrement the count.
    • If the count reaches 0, the lock becomes “not held”.
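The owner/count bookkeeping above can be sketched as a toy implementation (an illustration only; Java's intrinsic locks and ReentrantLock already provide this, built on lower-level primitives):

```java
// Toy re-entrant lock: remembers the owning thread and an acquisition count.
// Uses Java's built-in monitor (synchronized/wait/notify) for the blocking.
class ToyReentrantLock {
    private Thread owner = null; // which thread currently holds the lock
    private int count = 0;       // how many times the owner has acquired it

    synchronized void acquire() throws InterruptedException {
        Thread me = Thread.currentThread();
        if (owner == me) { count++; return; } // already ours: just count up, no blocking
        while (owner != null) wait();          // held by another thread: block
        owner = me;
        count = 1;
    }

    synchronized void release() {
        if (owner != Thread.currentThread())
            throw new IllegalMonitorStateException();
        if (--count == 0) {
            owner = null;
            notify(); // wake one waiting thread, if any
        }
    }
}
```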

With re-entrant locks, the following code works correctly:

// Assumes lk is a re-entrant lock
void setBalance(int x) {
    lk.lock();
    try {
        balance = x;
    } finally {
        lk.unlock();
    }
}
 
void withdraw(int amount) {
    lk.lock();
    try {
        int b = getBalance(); // Assume getBalance also uses lk
        if (amount > b) { /*...*/ }
        setBalance(b - amount); // Okay! Thread already holds lk
    } finally {
        lk.unlock();
    }
}

Locks in Java: The synchronized Statement

Java has built-in support for re-entrant locks via the synchronized keyword.

synchronized (expression) { // expression evaluates to an object
    // statements
}

  1. Evaluates expression to an object: Every Java object has an implicit “intrinsic lock” (or “monitor lock”) associated with it (primitive values are not objects and have no lock).
  2. Acquires the lock: The thread attempts to acquire the intrinsic lock of the object produced by expression. It blocks if necessary. Since it’s re-entrant, acquiring a lock already held by the current thread succeeds immediately.
  3. Executes statements: The code inside the block runs only after the lock is acquired.
  4. Releases the lock: The lock is automatically released when control leaves the synchronized block, whether normally (reaching the }) or due to an exception or return. This avoids the need for manual try...finally.

You can also synchronize entire methods:

synchronized void myMethod() {
    // Body of method
}

This is equivalent to:

void myMethod() {
    synchronized(this) { // Synchronizes on the current object instance
        // Body of method
    }
}
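As a minimal end-to-end illustration of synchronized methods (a plain counter of my own, not from the lecture): increment is a read-modify-write, so without the lock concurrent increments could be lost just like the withdrawals were.

```java
// Thread-safe counter: both methods lock on `this`, so increments
// from different threads cannot interleave mid-update.
class Counter {
    private int value = 0;

    synchronized void increment() { value++; } // read-modify-write under the lock

    synchronized int get() { return value; }
}
```

With four threads each incrementing 10,000 times, the final count is exactly 40,000; removing synchronized would typically lose updates.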

External Locks in Java

Besides intrinsic locks (synchronized), Java also provides explicit lock implementations in the java.util.concurrent.locks package, such as ReentrantLock.

  • External Locks (e.g., ReentrantLock):

    • More flexible than intrinsic locks.
    • Offer additional features like timed waits, interruptible waits, and different fairness policies.
    • Support more sophisticated locking idioms (e.g., reader-writer locks, condition variables - discussed in later lectures).
    • Require manual lock() and unlock() calls, typically within a try...finally block.
  • Intrinsic Locks (synchronized):

    • Simpler syntax.
    • Automatic lock release.
  • The class java.util.concurrent.locks.ReentrantLock behaves much like our pseudocode lock. Always pair lock() with try { ... } finally { lock.unlock(); } for safety.
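Putting the pieces together, the account might look like this with an explicit ReentrantLock (a sketch following the lecture's structure; the class is renamed and IllegalStateException stands in for WithdrawTooLargeException):

```java
import java.util.concurrent.locks.ReentrantLock;

// The bank account with an explicit lock and the try/finally idiom.
class LockedAccount {
    private int balance = 0;
    private final ReentrantLock lk = new ReentrantLock(); // one lock per account

    int getBalance() {
        lk.lock();
        try { return balance; } finally { lk.unlock(); }
    }

    void setBalance(int x) {
        lk.lock();
        try { balance = x; } finally { lk.unlock(); }
    }

    void deposit(int amount) {
        lk.lock();
        try { setBalance(getBalance() + amount); } finally { lk.unlock(); }
    }

    void withdraw(int amount) {
        lk.lock(); // re-entrant: the nested getBalance/setBalance calls succeed
        try {
            int b = getBalance();
            if (amount > b) throw new IllegalStateException("withdraw too large");
            setBalance(b - amount);
        } finally {
            lk.unlock(); // released even when the exception is thrown
        }
    }
}
```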

synchronized vs java.util.concurrent.locks.Lock API:

| Feature | synchronized | Lock API (e.g., ReentrantLock) |
| --- | --- | --- |
| Release | Automatic (handled by JVM) | Manual (unlock()) |
| Scope | Block-scoped | Can span methods |
| Waiting | Blocks indefinitely | tryLock(), interruptible lock |
| Flexibility | Basic | More features (conditions, etc.) |
| Ease of use | Cleaner code, easier to maintain | More verbose, slightly bug-prone |

Recommendation: Use synchronized whenever possible due to its simplicity and safety (automatic release). Use the Lock API when you need its advanced features.
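One such advanced feature, sketched below with illustrative names (my own, not from the lecture): a timed tryLock attempts the acquisition and gives up instead of blocking forever, something synchronized cannot express.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// A withdrawal that refuses to wait more than 100 ms for the lock.
class TimedWithdraw {
    private final ReentrantLock lk = new ReentrantLock();
    private int balance = 150;

    // Returns true only if the withdrawal actually happened.
    boolean tryWithdraw(int amount) throws InterruptedException {
        if (!lk.tryLock(100, TimeUnit.MILLISECONDS))
            return false; // could not get the lock in time: give up, do not block
        try {
            if (amount > balance) return false; // insufficient funds
            balance -= amount;
            return true;
        } finally {
            lk.unlock();
        }
    }

    int getBalance() {
        lk.lock();
        try { return balance; } finally { lk.unlock(); }
    }
}
```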

Race Conditions Revisited

A Race Condition occurs when the correctness of a concurrent program depends on the specific, unpredictable timing or interleaving of operations executed by multiple threads. The problem usually involves threads observing inconsistent intermediate states.

Important Distinction: This lecture (and good practice) distinguishes between two types of race-condition bugs:

  1. Data Races
  2. Bad Interleavings
  • Data Race (Low-Level Race Condition):
    • Occurs when there are insufficiently synchronized accesses to the same shared memory location by multiple threads, and at least one access is a write.
    • Specifically: Simultaneous read/write or write/write to the same location without proper locking.
    • Always an error (in languages like Java/C/C++ with defined memory models) because they can lead to bizarre, non-intuitive behaviors due to compiler and hardware optimizations/reorderings. Avoid them at all costs.
  • Bad Interleaving (High-Level Race Condition):
    • Occurs when the program behaves incorrectly due to an unfavorable execution order of operations, even if the individual accesses to shared resources are properly synchronized (i.e., no data races).
    • Whether an interleaving is “bad” depends on the program’s specification or intended invariants.

Example:

// Assume getBalance and setBalance are synchronized
public /*NOT synchronized*/ void withdraw(int amount) {
    // ...
    b = getBalance(); // Reads balance (under lock inside getBalance)
    // ... calculation ...
    setBalance(b - amount); // Writes balance (under lock inside setBalance)
    // ...
}
  • No Data Race: withdraw itself doesn’t directly access balance. It calls getBalance and setBalance, which are synchronized. There are no concurrent read/write or write/write operations on the balance field itself without a lock.
  • Bad Interleaving (High-Level Race): As we saw before, two withdraw calls can still interleave between the getBalance and setBalance calls within withdraw, leading to the “lost withdraw” bug. This is a high-level race condition because the individual low-level accesses are safe, but the sequence of operations leads to an incorrect outcome based on our expectation of how withdraw should work.

Example: Bounded Stack

Consider a simple bounded stack implementation using an array.

public class Stack <E> {
    E[] array;
    int index; // Points to the next available slot
 
    public Stack(int entries) {
        // (Hack to create generic array)
        array = (E[]) new Object[entries];
        index = 0;
    }
    // ... methods ...
}

Let’s add synchronized methods for basic stack operations:

public class Stack <E> {
    // ... fields and constructor ...
 
    synchronized boolean isEmpty() {
        return index == 0;
    }
 
    synchronized void push(E val) throws StackFullException {
        if (index == array.length)
            throw new StackFullException();
        array[index++] = val;
    }
 
    synchronized E pop() throws StackEmptyException {
        if (index == 0)
            throw new StackEmptyException();
        return array[--index]; // Note: the array slot still references the popped object (potential memory leak)
    }
}

These methods (isEmpty, push, pop) are individually thread-safe because they are synchronized, preventing data races on array and index.

The Problem with peek

Now, let’s try to add a peek method (view the top element without removing it) using the existing push and pop:

// Inside Stack<E> class...
E peek() { // NOT synchronized initially
    E ans = pop();
    push(ans); // Put it back
    return ans;
} // WRONG!

Sequentially, this code is correct (though inefficient). If we only had the Stack<E> interface (isEmpty, push, pop), this pop-then-push approach is the only way to implement peek.

Concurrently, this peek is broken.

  • Overall Effect: peek is intended as a “reader” – it shouldn’t change the stack’s state permanently.
  • Intermediate State: However, the implementation temporarily modifies the stack (by popping) before restoring it (by pushing). This creates an inconsistent intermediate state where the stack is briefly missing its top element.
  • No Data Races: Calls to push and pop are synchronized, so there are no low-level data races on the underlying array or index.
  • Bad Interleavings: The intermediate state can be observed by other threads, leading to several bad interleavings.

Bad Interleaving 1: peek and isEmpty

Property we want: If the stack is not empty, isEmpty should return false.

Scenario:

  1. Stack has one element “A”.
  2. Thread 1 (peek): Calls pop(). ans = “A”. Stack is now empty.
  3. Thread 2: Calls isEmpty(). Returns true (because the stack is temporarily empty). VIOLATION!
  4. Thread 1 (peek): Calls push(ans). Stack now contains “A” again.
  5. Thread 1 (peek): Returns “A”.

Bad Interleaving 2: peek and LIFO Order

Property we want: Stack follows Last-In, First-Out (LIFO) order.

Scenario:

  1. Stack is empty.
  2. Thread 2: Calls push(x). Stack: [x]
  3. Thread 2: Calls push(y). Stack: [x, y]
  4. Thread 1 (peek): Calls pop(). ans = “y”. Stack: [x]
  5. Thread 2: Calls pop(). Returns “x”. VIOLATION! Should have returned “y”.
  6. Thread 1 (peek): Calls push(ans). Stack: [x, y]
  7. Thread 1 (peek): Returns “y”.

The Fix: Larger Critical Section

The peek operation needs to be atomic. The sequence pop-then-push must appear indivisible to other threads. We achieve this by making the entire peek method a critical section, synchronized on the same lock used by push and pop.

// Inside Stack<E> class...
synchronized E peek() { // Now synchronized!
    E ans = pop(); // pop is synchronized, okay due to re-entrant lock
    push(ans); // push is synchronized, okay due to re-entrant lock
    return ans;
}
 
// OR, if implementing peek outside the class:
class C {
    <E> E myPeek(Stack<E> s) {
        synchronized (s) { // Lock on the stack object itself
            E ans = s.pop();
            s.push(ans);
            return ans;
        }
    }
}

Because Java’s intrinsic locks are re-entrant, the synchronized peek method can safely call the synchronized pop and push methods without causing deadlock.
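A self-contained version of the stack from this section (renamed BoundedStack, with unchecked stand-in exceptions instead of the custom ones) shows the synchronized peek calling pop and push without deadlocking:

```java
// Bounded stack sketch; peek re-acquires the intrinsic lock it already
// holds, which is fine because Java's intrinsic locks are re-entrant.
class BoundedStack<E> {
    private final E[] array;
    private int index = 0; // next free slot

    @SuppressWarnings("unchecked")
    BoundedStack(int entries) {
        array = (E[]) new Object[entries]; // same generic-array hack as above
    }

    synchronized boolean isEmpty() { return index == 0; }

    synchronized void push(E val) {
        if (index == array.length) throw new IllegalStateException("full");
        array[index++] = val;
    }

    synchronized E pop() {
        if (index == 0) throw new IllegalStateException("empty");
        E ans = array[--index];
        array[index] = null; // clear the slot to avoid the leak noted above
        return ans;
    }

    synchronized E peek() {
        E ans = pop();  // re-entrant: this thread already holds the lock
        push(ans);      // restore the element before releasing the lock
        return ans;
    }
}
```

Because the whole pop-then-push sequence runs while holding the stack's lock, no other thread can observe the temporarily missing top element.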

The Wrong “Fix”: Skipping Synchronization for Readers

It’s tempting to think: “If peek or isEmpty don’t write anything, maybe they don’t need synchronization?”

// WRONG - DON'T DO THIS
boolean isEmpty() {
    return index == 0; // Read index without lock
}

This is wrong because it introduces a data race. Even if isEmpty only reads index, another thread might be executing push or pop concurrently, writing to index without holding the lock relative to the isEmpty read. Data races lead to undefined behavior.

Recap of the Distinction:

  • Original peek (pop-then-push, unsynchronized method): Had no data races (because pop and push were synchronized internally), but had bad interleavings due to the inconsistent intermediate state.
  • isEmpty (read index directly, unsynchronized): Has a data race with push/pop (concurrent read/write).

Continue here: 13 Virtual Threads