Lecture from 30.04.2025 | Video: Video ETHZ

Last Week Recap

Let’s quickly summarize the key takeaways from our previous lectures:

  • Locks & Locks with Atomics: We explored how locks ensure mutual exclusion. While software-only locks (Peterson, Filter, Bakery) are possible using atomic/safe registers (requiring O(N) space), practical locks rely on atomic Read-Modify-Write (RMW) hardware operations (TAS, CAS, LL/SC) to achieve constant O(1) space and better efficiency (e.g., TASLock, TATAS, TATAS with backoff).
  • Intro to (Dead)locks: We defined deadlocks formally as cycles in resource allocation graphs and discussed basic avoidance strategies like ordered locking.
  • Rendezvous & Barrier: We examined these patterns for large-scale synchronization and encountered implementation pitfalls, particularly potential deadlocks when using semaphores incorrectly.

Producer / Consumer with Explicit Lock and Condition

While Java’s intrinsic monitors (synchronized, wait, notify) work, the java.util.concurrent.locks package offers more flexibility. We can use an explicit Lock (like ReentrantLock) and create separate Condition objects for different wait sets.

Setup:

import java.util.concurrent.locks.*;
 
class Queue {
    int in = 0, out = 0, size;
    long[] buf;
    final Lock lock = new ReentrantLock(); // Use an explicit re-entrant lock
 
    // Separate condition queue for producers waiting because the buffer is full
    final Condition notFull = lock.newCondition();
 
    // Separate condition queue for consumers waiting because the buffer is empty
    final Condition notEmpty = lock.newCondition();
 
    Queue(int s) {
        size = s;
        buf = new long[size];
        // lock, notFull, notEmpty are initialized above
    }
    // ... methods ...
}
  • We instantiate a ReentrantLock.
  • Crucially, we create two Condition objects, both associated with the same lock. notFull will be used by producers to wait when the queue is full, and notEmpty will be used by consumers to wait when the queue is empty.
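
The methods below also rely on isFull(), isEmpty(), doEnqueue(), and doDequeue(), which the slides omit. A minimal sketch of these helpers, assuming a circular buffer with the classic “one empty slot” convention (so a queue of size s holds at most s-1 elements):

// Helper sketch (an assumption, not slide code): circular buffer that keeps
// one slot empty so that "full" and "empty" remain distinguishable.
boolean isFull()  { return (in + 1) % size == out; }
boolean isEmpty() { return in == out; }

void doEnqueue(long x) {
    buf[in] = x;
    in = (in + 1) % size;
}

long doDequeue() {
    long x = buf[out];
    out = (out + 1) % size;
    return x;
}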

Implementation:

// Inside Queue class...
 
void enqueue(long x) {
    lock.lock(); // Acquire the explicit lock
    try {
        // MUST loop to re-check condition after waking
        while (isFull()) {
            try {
                // Release lock and wait specifically on the 'notFull' condition
                notFull.await();
            } catch (InterruptedException e) { /* Handle */ }
        }
        // Lock is held again here
        doEnqueue(x); // Perform the enqueue operation
        // Signal ONE thread waiting specifically on the 'notEmpty' condition
        notEmpty.signal();
    } finally {
        lock.unlock(); // Ensure lock is released
    }
}
 
long dequeue() {
    long x;
    lock.lock(); // Acquire the explicit lock
    try {
        // MUST loop to re-check condition after waking
        while (isEmpty()) {
            try {
                // Release lock and wait specifically on the 'notEmpty' condition
                notEmpty.await();
            } catch (InterruptedException e) { /* Handle */ }
        }
        // Lock is held again here
        x = doDequeue(); // Perform the dequeue operation
        // Signal ONE thread waiting specifically on the 'notFull' condition
        notFull.signal();
    } finally {
        lock.unlock(); // Ensure lock is released
    }
    return x;
}

Explanation:

  1. Explicit Locking: We use lock.lock() and lock.unlock(), making sure unlock is in a finally block.
  2. Condition Waiting: We use notFull.await() for producers and notEmpty.await() for consumers. await() atomically releases the associated lock and waits on that specific condition’s queue. Upon waking, it re-acquires the lock before returning. The while loop is still essential.
  3. Targeted Signaling: When enqueue adds an item, it calls notEmpty.signal(). This wakes up only one thread (if any) that is waiting specifically because the queue was empty (i.e., waiting on notEmpty). Similarly, dequeue calls notFull.signal() to wake one waiting producer. This is potentially more efficient than notifyAll(), which wakes all threads regardless of why they were waiting.
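
As a quick usage sketch (illustrative, not from the slides): one producer and one consumer sharing a bounded queue. enqueue blocks while the buffer is full; dequeue blocks while it is empty.

public static void main(String[] args) throws InterruptedException {
    Queue q = new Queue(8);

    Thread producer = new Thread(() -> {
        for (long i = 0; i < 100; i++) q.enqueue(i);     // blocks while full
    });
    Thread consumer = new Thread(() -> {
        for (int i = 0; i < 100; i++) {
            System.out.println(q.dequeue());             // blocks while empty
        }
    });

    producer.start(); consumer.start();
    producer.join(); consumer.join();
}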

The Sleeping Barber Variant (E. Dijkstra)

Critique of Previous Solution: The enqueue/dequeue implementation using explicit Locks/Conditions (and also the monitor version using notifyAll) has a minor inefficiency: signal() (or notifyAll()) is called every single time an item is enqueued or dequeued, even if no other threads are currently waiting. Sending a signal when nobody is waiting incurs some unnecessary overhead.

Analogy: The “Sleeping Barber” problem provides a classic analogy.

  1. A barber shop has one barber, one barber chair, and several chairs for waiting customers.
  2. If no customers are present, the barber sleeps in the barber chair.
  3. When a customer arrives:
    • If the barber is sleeping, the customer wakes the barber and sits in the barber chair.
    • If the barber is busy and there are free waiting chairs, the customer sits in a waiting chair.
    • If the barber is busy and all chairs are full, the customer leaves.
  4. When the barber finishes a haircut, they check the waiting room. If customers are waiting, the barber calls one; otherwise, the barber goes to sleep.

The Problem (Lost Wakeup / Lost Signal): What happens if the timing is just wrong?

  • Barber finishes haircut, checks waiting room (empty), decides to go to sleep.
  • Just before the barber actually falls asleep, a customer arrives.
  • The customer finds the barber still awake but already committed to sleeping, and tries to wake the barber (e.g., calls signal).
  • The barber didn’t execute the wait/await call yet, so the signal is lost.
  • The barber proceeds to wait/await and falls asleep.
  • The customer, having already sent the signal, now waits in the chair (or leaves if full), potentially forever if no other customers arrive. The barber never wakes up.

Solving Sleeping Barber: To avoid lost signals, the barber/consumer needs to know if customers/producers are actually waiting before going to sleep, and customers/producers need to know if the barber/consumer is actually sleeping before trying to wake them. This typically requires additional shared counters to track the number of waiting producers and consumers.

  • Let m track available slots (or waiting producers if negative).
  • Let n track available items (or waiting consumers if negative).

Producer/Consumer Queue - Sleeping Barber Setup:

import java.util.concurrent.locks.*;
 
class Queue {
    int in = 0, out = 0, size;
    long[] buf;
    final Lock lock = new ReentrantLock();
 
    // Counter for items/waiting consumers (negative -> consumers waiting)
    int n = 0;
    final Condition notEmpty = lock.newCondition();
 
    // Counter for empty slots/waiting producers (negative -> producers waiting)
    // Initialized based on size, e.g., m = size for slots count
    int m; // initialized in constructor, e.g., m = size;
    final Condition notFull = lock.newCondition();
 
    Queue(int s) {
        size = s;
        m = size; // Initially 'size' empty slots
        buf = new long[size];
        // ... other initializations ...
    }
    // ... methods ...
}

Note: The slide initializes m = size-1, which implies the “one empty slot” trick to distinguish full from empty, adjusting the logic slightly.

Sleeping Barber Implementation Logic:

// Inside Queue class (Sleeping Barber logic)...
 
void enqueue(long x) {
    lock.lock();
    try {
        m--; // Decrement empty slot count *before* checking
             // If m becomes < 0, it means producers might need to wait
             // or were already waiting.
        if (m < 0) {
            // Since m < 0 implies the buffer *was* full just before our m--,
            // we must wait. The while loop is still needed for robustness.
            while (isFull()) { // isFull() might check based on m or in/out
                try { notFull.await(); }
                catch (InterruptedException e) {}
            }
        }
        doEnqueue(x); // Add the item
        n++; // Increment item count *after* adding
             // If n is still <= 0 after the increment (it was negative
             // before), at least one consumer is waiting.
        if (n <= 0) {
             notEmpty.signal(); // Signal *only* when a consumer is waiting
        }
    } finally {
        lock.unlock();
    }
}
 
long dequeue() {
    long x;
    lock.lock();
    try {
        n--; // Decrement item count *before* checking
             // If n becomes < 0, consumers must wait.
        if (n < 0) {
            // Since n < 0 implies the buffer *was* empty just before our n--,
            // we must wait.
             while (isEmpty()) { // isEmpty() might check based on n or in/out
                try { notEmpty.await(); }
                catch (InterruptedException e) {}
            }
        }
        x = doDequeue(); // Get the item
        m++; // Increment empty slot count *after* removing
             // If m is still <= 0 after the increment (it was negative
             // before), at least one producer is waiting.
        if (m <= 0) {
            notFull.signal(); // Signal *only* when a producer is waiting
        }
    } finally {
        lock.unlock();
    }
    return x;
}

Explanation:

  • The counters n and m are decremented before the wait-condition check. A negative value after the decrement means the calling thread must wait: the buffer was full (for producers) or empty (for consumers).
  • The counters are incremented after the operation that changes the buffer state.
  • Crucially, signal is only called if the counter is still less than or equal to zero after the increment (equivalently, it was negative before the increment), meaning a thread of the other kind was definitely blocked or about to block. This avoids signaling when no thread is waiting.
  • The while(isFull/isEmpty) loop around await is still necessary to guard against spurious wakeups and potential race conditions between signaling and waking, even with the counter checks.

Guidelines for Using Condition Waits (wait/await)

Regardless of whether you use intrinsic monitors or explicit conditions, these guidelines are essential for correctness:

  1. Always have a condition predicate: Clearly define the boolean condition your thread is waiting for (e.g., !isFull()).
  2. Always test the predicate in a while loop:
    • Check the condition before calling wait/await. If it’s already true, don’t wait.
    • Re-check the condition after returning from wait/await. Don’t assume it’s true just because you woke up.
    • Therefore: Always call wait/await inside a while loop (while (!condition) { condition.await(); }).
  3. Ensure state is protected: The variables used in the condition predicate must be protected by the same lock associated with the wait/await/signal/notify operations.
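
These rules condense into one canonical pattern. A minimal sketch with explicit locks; conditionPredicate() and doProtectedWork() are placeholders:

import java.util.concurrent.locks.*;

final Lock lock = new ReentrantLock();
final Condition condition = lock.newCondition();

void awaitAndAct() throws InterruptedException {
    lock.lock();
    try {
        // Guidelines 1 & 2: a named predicate, re-tested in a while loop
        while (!conditionPredicate()) {
            condition.await(); // releases the lock while waiting, re-acquires on wakeup
        }
        // Here the lock is held AND the predicate holds
        doProtectedWork();     // Guideline 3: predicate state guarded by the same lock
    } finally {
        lock.unlock();
    }
}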

java.util.concurrent

Fortunately, Java programmers rarely need to implement complex primitives like bounded queues, semaphores, or barriers from scratch. The java.util.concurrent package provides high-quality, efficient, and well-tested implementations of:

  • Semaphore
  • CyclicBarrier
  • Various producer/consumer queues (ArrayBlockingQueue, LinkedBlockingQueue, SynchronousQueue)
  • Other utilities (CountDownLatch, Future, etc.)

Recommendation: Understand the concepts, but use the library implementations in practice whenever possible.
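
For instance, the hand-written bounded buffer above shrinks to a few lines with ArrayBlockingQueue. A sketch:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class LibraryQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(16); // bounded, thread-safe

        Thread producer = new Thread(() -> {
            try {
                for (long i = 0; i < 100; i++) queue.put(i);  // blocks while full
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) queue.take();   // blocks while empty
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        producer.start(); consumer.start();
        producer.join(); consumer.join();
    }
}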

Reader / Writer Locks

We established that standard locks enforce mutual exclusion, meaning only one thread can hold the lock at a time. This is necessary if threads are writing to shared data.

However, if multiple threads are only reading shared data (and the data is not being written concurrently), allowing them to proceed simultaneously is safe and can improve performance. Writes still require exclusive access.

Use Case: Situations where reads are far more common than writes (e.g., configuration data, caches, read-mostly data structures).

A hashtable accessed mostly via lookup (read) and only rarely via insert (write) benefits from allowing concurrent lookups. Wikipedia is a classic example of a highly read-mostly dataset.

Reader/Writer Lock: A specialized lock abstract data type designed for this scenario. It permits:

  • Multiple threads to hold the lock simultaneously for reading.
  • OR, exactly one thread to hold the lock for writing.

States:

  1. Not Held
  2. Held for Writing (by 1 thread)
  3. Held for Reading (by >= 1 threads)

Invariant: (writers == 0 || readers == 0) && writers <= 1, i.e., readers and writers never hold the lock at the same time, and at most one writer holds it.

Conceptual Operations:

  • new(): Create lock (not held).
  • acquire_write(): Block if any readers or a writer holds the lock. Else, acquire write lock.
  • release_write(): Release write lock.
  • acquire_read(): Block if a writer holds the lock. Else, acquire read lock (incrementing reader count).
  • release_read(): Decrement reader count. If count is now 0, release the read lock fully.

Example Usage (Hashtable):

class Hashtable<K,V> {
    RWLock lk = new RWLock();
 
    V lookup(K key) {
        // ...
        lk.acquire_read(); // Acquire for shared reading
        try { /* read data */ } finally { lk.release_read(); }
        // ...
    }
 
    void insert(K key, V val) {
        // ...
        lk.acquire_write(); // Acquire for exclusive writing
        try { /* write data */ } finally { lk.release_write(); }
        // ...
    }
}

This allows multiple lookup operations to run concurrently, significantly improving throughput if lookups are frequent, while insert still guarantees exclusive access.

Simple Monitor-Based RW Lock Implementation

A basic RW lock can be implemented using Java monitors.

class RWLock {
    int writers = 0; // Count of active writers (0 or 1)
    int readers = 0; // Count of active readers
 
    synchronized void acquire_read() {
        while (writers > 0) { // Wait if writer active
            try { wait(); } catch (InterruptedException e) {}
        }
        readers++;
    }
 
    synchronized void release_read() {
        readers--;
        if (readers == 0) notifyAll(); // Wake waiting writers if last reader
    }
 
    synchronized void acquire_write() {
        while (writers > 0 || readers > 0) { // Wait if writers OR readers active
            try { wait(); } catch (InterruptedException e) {}
        }
        writers++;
    }
 
    synchronized void release_write() {
        writers--;
        notifyAll(); // Wake waiting readers/writers
    }
}

Fairness: This simple implementation has a reader preference. If the lock is held for reading, new readers can acquire it immediately, even if a writer is waiting. This can lead to writer starvation if reads are continuous.

Writer Priority RW Lock Implementation

To prevent writer starvation, we can give writers priority.

class RWLock {
    int writers = 0;
    int readers = 0;
    int writersWaiting = 0; // Count waiting writers
 
    synchronized void acquire_read() {
        // Wait if writer active OR if writer waiting
        while (writers > 0 || writersWaiting > 0) {
            try { wait(); } catch (InterruptedException e) {}
        }
        readers++;
    }
    // release_read is the same (notifyAll if readers == 0)
 
    synchronized void acquire_write() {
        writersWaiting++; // Indicate intent to write
        while (writers > 0 || readers > 0) { // Wait if active reader/writer
            try { wait(); } catch (InterruptedException e) {}
        }
        writersWaiting--; // No longer waiting
        writers++;        // Become active writer
    }
    // release_write is the same (writers--, notifyAll)
}

Now, acquire_read blocks if any writers are waiting (writersWaiting > 0), giving writers priority.

Fairness: Is this fair now? It prevents writer starvation but might now starve readers if writers arrive continuously. True fairness is complex.

The slides discuss more nuanced fairness policies, such as allowing a batch of k waiting readers to proceed after a writer finishes before the next writer gets priority. Implementing these requires more state variables and careful signaling with explicit Condition objects. The version above orders readers and writers only as groups; it gives no ordering guarantee among the readers themselves.

Reader/Writer Lock Details

  • Priority: Implementations often favor writers to prevent starvation, meaning arriving readers block if a writer is waiting.
  • Re-entrance/Upgrade: Can a thread re-acquire a read lock? Can a reader upgrade to a write lock without releasing the read lock? Can a writer downgrade? These are complex issues. Upgrade is particularly prone to deadlock if multiple readers attempt it concurrently.

In Java:

  • synchronized does not provide RW lock semantics.
  • Use java.util.concurrent.locks.ReentrantReadWriteLock.
  • It provides readLock() and writeLock() methods, returning separate Lock objects. You acquire/release these specific locks.
  • The default implementation has specific fairness policies (check docs) and does not support upgrading from a read lock to a write lock.
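
The hashtable example from above, rewritten as a sketch on top of the library lock (class and field names are illustrative):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RWHashtable<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    V lookup(K key) {
        rw.readLock().lock();              // shared: many readers at once
        try { return map.get(key); }
        finally { rw.readLock().unlock(); }
    }

    void insert(K key, V val) {
        rw.writeLock().lock();             // exclusive: blocks readers and writers
        try { map.put(key, val); }
        finally { rw.writeLock().unlock(); }
    }
}

Beware of upgrading: calling writeLock().lock() while holding the read lock deadlocks, because the writer waits for all readers, including itself. Downgrading (acquiring the read lock while holding the write lock, then releasing the write lock) is supported.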

Lock Granularity

We’ve discussed coarse vs. fine-grained locking. This fits into a broader spectrum of approaches to concurrency control:

  1. Coarse-Grained Locking: One lock for the entire structure. Simple, but limits concurrency.
  2. Fine-Grained Locking: Separate locks for components (e.g., list nodes, hash buckets). Increases potential concurrency but adds complexity.
  3. Optimistic Synchronization: Assume conflicts are rare. Perform operations without locking, then detect if a conflict occurred (e.g., using version numbers or CAS). If conflict, retry or handle otherwise.
  4. Lazy Synchronization: Delay or simplify cleanup work. E.g., removing an item might just mark it as “deleted” logically; actual removal happens later. Can improve performance of some operations but complicates others (like getting the size).
  5. Lock-Free Synchronization: Use atomic hardware primitives (like CAS) exclusively to manage concurrent access, without any mutual exclusion locks. Very complex, but offers benefits such as immunity to deadlock and better performance under certain conditions (a small taste follows below).
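
As a small taste of item 5 (the topic of the next lecture), a lock-free counter built on a CAS retry loop; a sketch:

import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Lock-free increment: read the current value, compute the new one,
    // and publish it with CAS; if another thread won the race, retry.
    int increment() {
        while (true) {
            int old = value.get();
            if (value.compareAndSet(old, old + 1)) return old + 1;
        }
    }
}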

Fine-Grained Locking Example: List-Based Set

Let’s apply fine-grained locking to our sorted linked list set. The idea is to put a lock on each Node.

Coarse-grained locking would just synchronize add, remove, contains.

Fine-grained aims to split the object (list) into pieces (nodes) with separate locks, allowing concurrent operations on disjoint parts.

Naive Remove (Wrong): Trying to remove c by just locking b (the predecessor) fails because other threads might concurrently modify c or the node after c.

Concurrent Remove Failure: If Thread A (removing c) locks b and Thread B (removing b) locks a concurrently, Thread B might remove b from the list (a.next = c) before Thread A modifies b.next. Thread A’s subsequent update (b.next = d) modifies an orphaned node, and c is never removed.

The Problem: Modifying the list structure requires coordinated, exclusive access to multiple adjacent nodes simultaneously. Removing c requires locking both b (to change b.next) and c (to read c.next).

Hand-Over-Hand Locking (Lock Coupling): The standard solution for fine-grained list locking.

  • To traverse, always hold the lock on the current node you are examining.
  • To move from curr to next:
    1. Lock next (curr.next.lock()).
    2. Unlock curr (curr.unlock()).
  • To perform an operation (like delete) involving pred and curr:
    1. Traverse until pred and curr are the nodes you need.
    2. You will arrive holding the lock on pred.
    3. Lock curr.
    4. Now you hold locks on both pred and curr. Perform the modification (e.g., pred.next = curr.next).
    5. Unlock curr.
    6. Unlock pred.

This ensures that the necessary nodes are always locked during modification, preventing race conditions while still allowing concurrency in different parts of the list. A sketch of the traversal and deletion logic within this pattern follows below.
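
The slide's snippet is not reproduced here; the following is a minimal sketch of hand-over-hand removal under the stated assumptions (a per-node ReentrantLock and head/tail sentinel nodes, so curr is never null during traversal; names are illustrative):

import java.util.concurrent.locks.ReentrantLock;

class FineList {
    private static class Node {
        final long key;
        Node next;
        final ReentrantLock lock = new ReentrantLock();
        Node(long key) { this.key = key; }
    }

    private final Node head;

    FineList() {
        head = new Node(Long.MIN_VALUE);      // sentinel: below every real key
        head.next = new Node(Long.MAX_VALUE); // sentinel: above every real key
    }

    boolean remove(long key) {
        head.lock.lock();
        Node pred = head;
        try {
            Node curr = pred.next;
            curr.lock.lock();
            try {
                // Hand-over-hand: lock the next node before releasing the previous one.
                while (curr.key < key) {
                    pred.lock.unlock();
                    pred = curr;
                    curr = curr.next;
                    curr.lock.lock();
                }
                if (curr.key == key) {        // both pred and curr are locked here
                    pred.next = curr.next;    // safe unlink
                    return true;
                }
                return false;                 // sorted list: key is not present
            } finally {
                curr.lock.unlock();
            }
        } finally {
            pred.lock.unlock();
        }
    }
}

Note that the lock order (always predecessor before successor, moving forward through the list) is a form of ordered locking, which also rules out the deadlock cycles discussed earlier.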

Continue here: 21 Optimistic Sync, Lazy Sync, Lazy Skip Lists, Lock-Free Programming