Lecture from 30.04.2025 | Video: Video ETHZ
Last Week Recap
Let’s quickly summarize the key takeaways from our previous lectures:
- Locks & Locks with Atomics: We explored how locks ensure mutual exclusion. While software-only locks (Peterson, Filter, Bakery) are possible using atomic/safe registers (requiring O(N) space), practical locks rely on atomic Read-Modify-Write (RMW) hardware operations (TAS, CAS, LL/SC) to achieve constant O(1) space and better efficiency (e.g., TASLock, TATAS, TATAS with backoff; a minimal TAS sketch follows this recap).
- Intro to (Dead)locks: We defined deadlocks formally as cycles in resource allocation graphs and discussed basic avoidance strategies like ordered locking.
- Rendezvous & Barrier: We examined these patterns for large-scale synchronization and encountered implementation pitfalls, particularly potential deadlocks when using semaphores incorrectly.
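Not on this week's slides, but as a concrete reminder of what those RMW-based locks look like, here is a minimal TAS spinlock sketch built on `AtomicBoolean` (the class name and structure are illustrative):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Recap sketch: a test-and-set spinlock using a single atomic flag,
// i.e., O(1) space in contrast to the O(N) software-only locks.
class TASLock {
    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        while (state.getAndSet(true)) { } // spin until we flip false -> true
    }

    void unlock() {
        state.set(false);
    }
}
```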
Producer / Consumer with Explicit Lock and Condition
While Java's intrinsic monitors (`synchronized`, `wait`, `notify`) work, the `java.util.concurrent.locks` package offers more flexibility. We can use an explicit `Lock` (like `ReentrantLock`) and create separate `Condition` objects for different wait sets.
Setup:
```java
import java.util.concurrent.locks.*;

class Queue {
    int in = 0, out = 0, size;
    long[] buf;
    final Lock lock = new ReentrantLock(); // Use an explicit re-entrant lock
    // Separate condition queue for producers waiting because the buffer is full
    final Condition notFull = lock.newCondition();
    // Separate condition queue for consumers waiting because the buffer is empty
    final Condition notEmpty = lock.newCondition();

    Queue(int s) {
        size = s;
        buf = new long[size];
        // lock, notFull, notEmpty are initialized above
    }

    // ... methods ...
}
```
- We instantiate a `ReentrantLock`.
- Crucially, we create two `Condition` objects, both associated with the same `lock`: `notFull` will be used by producers to wait when the queue is full, and `notEmpty` will be used by consumers to wait when the queue is empty.
Implementation:
```java
// Inside Queue class...
void enqueue(long x) {
    lock.lock(); // Acquire the explicit lock
    try {
        // MUST loop to re-check condition after waking
        while (isFull()) {
            try {
                // Release lock and wait specifically on the 'notFull' condition
                notFull.await();
            } catch (InterruptedException e) { /* Handle */ }
        }
        // Lock is held again here
        doEnqueue(x); // Perform the enqueue operation
        // Signal ONE thread waiting specifically on the 'notEmpty' condition
        notEmpty.signal();
    } finally {
        lock.unlock(); // Ensure lock is released
    }
}

long dequeue() {
    long x;
    lock.lock(); // Acquire the explicit lock
    try {
        // MUST loop to re-check condition after waking
        while (isEmpty()) {
            try {
                // Release lock and wait specifically on the 'notEmpty' condition
                notEmpty.await();
            } catch (InterruptedException e) { /* Handle */ }
        }
        // Lock is held again here
        x = doDequeue(); // Perform the dequeue operation
        // Signal ONE thread waiting specifically on the 'notFull' condition
        notFull.signal();
    } finally {
        lock.unlock(); // Ensure lock is released
    }
    return x;
}
```
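The methods above rely on helpers (`isFull`, `isEmpty`, `doEnqueue`, `doDequeue`) that are not shown on the slides. A minimal sketch, assuming a circular buffer that leaves one slot empty to distinguish full from empty (the same trick the Sleeping Barber note below alludes to):

```java
// Inside Queue class... (hypothetical helpers; call only with 'lock' held)
boolean isFull()  { return (in + 1) % size == out; } // one slot deliberately unused
boolean isEmpty() { return in == out; }

void doEnqueue(long x) {
    buf[in] = x;
    in = (in + 1) % size; // advance write index circularly
}

long doDequeue() {
    long x = buf[out];
    out = (out + 1) % size; // advance read index circularly
    return x;
}
```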
Explanation:
- Explicit Locking: We use `lock.lock()` and `lock.unlock()`, making sure `unlock` is in a `finally` block.
- Condition Waiting: We use `notFull.await()` for producers and `notEmpty.await()` for consumers. `await()` atomically releases the associated `lock` and waits on that specific condition's queue; upon waking, it re-acquires the `lock` before returning. The `while` loop is still essential.
- Targeted Signaling: When `enqueue` adds an item, it calls `notEmpty.signal()`. This wakes at most one thread that is waiting specifically because the queue was empty (i.e., waiting on `notEmpty`). Similarly, `dequeue` calls `notFull.signal()` to wake one waiting producer. This is potentially more efficient than `notifyAll()`, which wakes all threads regardless of why they were waiting (a small driver sketch follows).
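To see the queue in action, here is a hypothetical driver; the class `QueueDemo`, the capacity, and the iteration counts are assumptions, and it relies on the helper methods sketched earlier:

```java
// Hypothetical driver for the Queue above; not part of the lecture material.
public class QueueDemo {
    public static void main(String[] args) throws InterruptedException {
        Queue q = new Queue(4); // small capacity so both sides actually block

        Thread producer = new Thread(() -> {
            for (long i = 0; i < 20; i++) q.enqueue(i); // blocks when full
        });
        Thread consumer = new Thread(() -> {
            for (int i = 0; i < 20; i++) System.out.println(q.dequeue()); // blocks when empty
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```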
The Sleeping Barber Variant (E. Dijkstra)
Critique of Previous Solution: The `enqueue`/`dequeue` implementation using explicit Locks/Conditions (and also the monitor version using `notifyAll`) has a minor inefficiency: `signal()` (or `notifyAll()`) is called every single time an item is enqueued or dequeued, even if no other threads are currently waiting. Sending a signal when nobody is waiting incurs unnecessary overhead.
Analogy: The “Sleeping Barber” problem provides a classic analogy.
- A barber shop has one barber, one barber chair, and several chairs for waiting customers.
- If no customers are present, the barber sleeps in the barber chair.
- When a customer arrives:
- If the barber is sleeping, the customer wakes the barber and sits in the chair.
- If the barber is busy and there are free waiting chairs, the customer sits in a waiting chair.
- If the barber is busy and all chairs are full, the customer leaves.
- When the barber finishes a haircut, they check the waiting room. If customers are waiting, the barber calls one; otherwise, the barber goes to sleep.
The Problem (Lost Wakeup / Lost Signal): What happens if the timing is just wrong?
- The barber finishes a haircut, checks the waiting room (empty), and decides to go to sleep.
- Just before the barber actually falls asleep, a customer arrives.
- The customer sees the barber "about to sleep" (technically still awake but having decided to sleep) and tries to wake the barber (e.g., calls `signal`).
- The barber has not executed the `wait`/`await` call yet, so the signal is lost.
- The barber proceeds to `wait`/`await` and falls asleep.
- The customer, having already sent the signal, now waits in the chair (or leaves if full), potentially forever if no other customers arrive. The barber never wakes up.
Solving Sleeping Barber: To avoid lost signals, the barber/consumer needs to know if customers/producers are actually waiting before going to sleep, and customers/producers need to know if the barber/consumer is actually sleeping before trying to wake them. This typically requires additional shared counters to track the number of waiting producers and consumers.
- Let `m` track available slots (or waiting producers if negative).
- Let `n` track available items (or waiting consumers if negative).
Producer/Consumer Queue - Sleeping Barber Setup:
```java
import java.util.concurrent.locks.*;

class Queue {
    int in = 0, out = 0, size;
    long[] buf;
    final Lock lock = new ReentrantLock();

    // Counter for items/waiting consumers (negative -> consumers waiting)
    int n = 0;
    final Condition notEmpty = lock.newCondition();

    // Counter for empty slots/waiting producers (negative -> producers waiting)
    int m; // initialized in constructor, e.g., m = size
    final Condition notFull = lock.newCondition();

    Queue(int s) {
        size = s;
        m = size; // Initially 'size' empty slots
        buf = new long[size];
        // ... other initializations ...
    }

    // ... methods ...
}
```
Note: The slide initializes `m = size - 1`, which implies using the "one empty slot" trick to distinguish full from empty, adjusting the logic slightly.
Sleeping Barber Implementation Logic:
```java
// Inside Queue class (Sleeping Barber logic)...
void enqueue(long x) {
    lock.lock();
    try {
        m--; // Decrement empty-slot count *before* checking
        // If m is now negative, the buffer was full when we arrived,
        // so we must wait until a consumer frees a slot.
        if (m < 0) {
            // The while loop is still needed for robustness (spurious wakeups).
            while (isFull()) { // isFull() might check based on m or in/out
                try { notFull.await(); }
                catch (InterruptedException e) {}
            }
        }
        doEnqueue(x); // Add the item
        n++; // Increment item count *after* adding
        // If n is still <= 0, it was negative before the increment,
        // meaning at least one consumer is waiting.
        if (n <= 0) {
            notEmpty.signal(); // Signal *only if* a consumer is likely waiting
        }
    } finally {
        lock.unlock();
    }
}

long dequeue() {
    long x;
    lock.lock();
    try {
        n--; // Decrement item count *before* checking
        // If n is now negative, the buffer was empty when we arrived,
        // so we must wait until a producer adds an item.
        if (n < 0) {
            while (isEmpty()) { // isEmpty() might check based on n or in/out
                try { notEmpty.await(); }
                catch (InterruptedException e) {}
            }
        }
        x = doDequeue(); // Get the item
        m++; // Increment empty-slot count *after* removing
        // If m is still <= 0, it was negative before the increment,
        // meaning at least one producer is waiting.
        if (m <= 0) {
            notFull.signal(); // Signal *only if* a producer is likely waiting
        }
    } finally {
        lock.unlock();
    }
    return x;
}
```
Explanation:
- The counters `n` and `m` are decremented before the wait-condition check; a negative result tells the current thread it must wait.
- The counters are incremented after the operation that changes the buffer state.
- Crucially, `signal` is only called if the corresponding counter is still less than or equal to zero after the increment (i.e., it was negative before it, so a thread of the other kind was definitely blocked or about to block). This avoids signaling when no threads are waiting.
- The `while (isFull()/isEmpty())` loop around `await` is still necessary to guard against spurious wakeups and potential races between signaling and waking, even with the counter checks.
Guidelines for Using Condition Waits (`wait`/`await`)
Regardless of whether you use intrinsic monitors or explicit conditions, these guidelines are essential for correctness:
- Always have a condition predicate: Clearly define the boolean condition your thread is waiting for (e.g., `!isFull()`).
- Always test the predicate in a `while` loop:
  - Check the condition before calling `wait`/`await`; if it is already true, don't wait.
  - Re-check the condition after returning from `wait`/`await`; don't assume it is true just because you woke up.
  - Therefore: always call `wait`/`await` inside a `while` loop, i.e. `while (!condition) { condition.await(); }` (see the sketch below).
- Ensure state is protected: The variables used in the condition predicate must be protected by the same lock associated with the `wait`/`await`/`signal`/`notify` operations.
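A minimal sketch of the resulting pattern, using a boolean "gate" as the condition predicate (the class `Gate` and its method names are illustrative, not from the lecture):

```java
import java.util.concurrent.locks.*;

// Illustrative example of the canonical condition-wait pattern.
class Gate {
    private final Lock lock = new ReentrantLock();
    private final Condition opened = lock.newCondition();
    private boolean isOpen = false; // the condition predicate's state

    void awaitOpen() throws InterruptedException {
        lock.lock();
        try {
            while (!isOpen)      // always test the predicate in a loop
                opened.await();  // releases lock; re-acquires before returning
        } finally {
            lock.unlock();
        }
    }

    void open() {
        lock.lock();
        try {
            isOpen = true;       // change predicate state under the same lock...
            opened.signalAll();  // ...then signal waiting threads
        } finally {
            lock.unlock();
        }
    }
}
```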
java.util.concurrent
Fortunately, Java programmers rarely need to implement complex primitives like bounded queues, semaphores, or barriers from scratch. The `java.util.concurrent` package provides high-quality, efficient, and well-tested implementations of:
- `Semaphore`
- `CyclicBarrier`
- Various producer/consumer queues (`ArrayBlockingQueue`, `LinkedBlockingQueue`, `SynchronousQueue`)
- Other utilities (`CountDownLatch`, `Future`, etc.)
Recommendation: Understand the concepts, but use the library implementations in practice whenever possible.
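For instance, the bounded producer/consumer queue built by hand above shrinks to a few lines with `ArrayBlockingQueue`; the capacity and counts in this sketch are arbitrary:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: the library's bounded blocking queue replaces our hand-rolled Queue.
// put() blocks when the queue is full, take() blocks when it is empty.
public class LibraryQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(8); // capacity 8 (arbitrary)

        Thread producer = new Thread(() -> {
            try {
                for (long i = 0; i < 100; i++) queue.put(i); // blocks if full
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) queue.take(); // blocks if empty
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        producer.start(); consumer.start();
        producer.join(); consumer.join();
    }
}
```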
Reader / Writer Locks
We established that standard locks enforce mutual exclusion, meaning only one thread can hold the lock at a time. This is necessary if threads are writing to shared data.
However, if multiple threads are only reading shared data (and the data is not being written concurrently), allowing them to proceed simultaneously is safe and can improve performance. Writes still require exclusive access.
Use Case: Situations where reads are far more common than writes (e.g., configuration data, caches, read-mostly datasets like Wikipedia).
A hashtable accessed mostly via `lookup` (read) but rarely via `insert` (write) would benefit from allowing concurrent lookups; Wikipedia's data is an example of a highly read-mostly dataset.
Reader/Writer Lock: A specialized lock abstract data type designed for this scenario. It permits:
- Multiple threads to hold the lock simultaneously for reading.
- OR, exactly one thread to hold the lock for writing.
States:
- Not Held
- Held for Writing (by 1 thread)
- Held for Reading (by >= 1 threads)
Invariant: `writers == 0 || readers == 0` (readers and writers never hold the lock concurrently) AND `writers <= 1`.
Conceptual Operations:
- `new()`: Create lock (not held).
- `acquire_write()`: Block if any readers or a writer holds the lock. Else, acquire the write lock.
- `release_write()`: Release the write lock.
- `acquire_read()`: Block if a writer holds the lock. Else, acquire the read lock (incrementing the reader count).
- `release_read()`: Decrement the reader count. If the count is now 0, release the read lock fully.
Example Usage (Hashtable):
```java
class Hashtable<K, V> {
    RWLock lk = new RWLock();

    V lookup(K key) {
        // ...
        lk.acquire_read(); // Acquire for shared reading
        try { /* read data */ } finally { lk.release_read(); }
        // ...
    }

    void insert(K key, V val) {
        // ...
        lk.acquire_write(); // Acquire for exclusive writing
        try { /* write data */ } finally { lk.release_write(); }
        // ...
    }
}
```
This allows multiple `lookup` operations to run concurrently, significantly improving throughput when lookups are frequent, while `insert` still guarantees exclusive access.
Simple Monitor-Based RW Lock Implementation
A basic RW lock can be implemented using Java monitors.
```java
class RWLock {
    int writers = 0; // Count of active writers (0 or 1)
    int readers = 0; // Count of active readers

    synchronized void acquire_read() {
        while (writers > 0) { // Wait if a writer is active
            try { wait(); } catch (InterruptedException e) {}
        }
        readers++;
    }

    synchronized void release_read() {
        readers--;
        if (readers == 0) notifyAll(); // Wake waiting writers if last reader
    }

    synchronized void acquire_write() {
        while (writers > 0 || readers > 0) { // Wait if writers OR readers active
            try { wait(); } catch (InterruptedException e) {}
        }
        writers++;
    }

    synchronized void release_write() {
        writers--;
        notifyAll(); // Wake waiting readers/writers
    }
}
```
Fairness: This simple implementation has a reader preference. If the lock is held for reading, new readers can acquire it immediately, even if a writer is waiting. This can lead to writer starvation if reads are continuous.
Writer Priority RW Lock Implementation
To prevent writer starvation, we can give writers priority.
```java
class RWLock {
    int writers = 0;
    int readers = 0;
    int writersWaiting = 0; // Count of waiting writers

    synchronized void acquire_read() {
        // Wait if a writer is active OR a writer is waiting
        while (writers > 0 || writersWaiting > 0) {
            try { wait(); } catch (InterruptedException e) {}
        }
        readers++;
    }

    // release_read is the same as before (notifyAll if readers == 0)

    synchronized void acquire_write() {
        writersWaiting++; // Announce intent to write
        while (writers > 0 || readers > 0) { // Wait if active reader/writer
            try { wait(); } catch (InterruptedException e) {}
        }
        writersWaiting--; // No longer waiting
        writers++;        // Become the active writer
    }

    // release_write is the same as before (writers--, notifyAll)
}
```
Now, `acquire_read` blocks if any writers are waiting (`writersWaiting > 0`), giving writers priority.
Fairness: Is this fair now? It prevents writer starvation but may now starve readers if writers arrive continuously. True fairness is complex. The slides discuss more nuanced fairness policies, such as allowing a batch of `k` waiting readers to proceed after a writer finishes, before the next writer gets priority; implementing these requires more state variables and careful signaling with explicit `Condition` variables. The example above gives an in-order queuing guarantee only between readers and writers as groups, not among the readers themselves.
Reader/Writer Lock Details
- Priority: Implementations often favor writers to prevent starvation, meaning arriving readers block if a writer is waiting.
- Re-entrance/Upgrade: Can a thread re-acquire a read lock? Can a reader upgrade to a write lock without releasing the read lock? Can a writer downgrade? These are complex issues. Upgrade is particularly prone to deadlock if multiple readers attempt it concurrently.
In Java:
- `synchronized` does not provide RW lock semantics.
- Use `java.util.concurrent.locks.ReentrantReadWriteLock`.
- It provides `readLock()` and `writeLock()` methods, returning separate `Lock` objects; you acquire/release these specific locks (see the sketch after this list).
- The default implementation has specific fairness policies (check the docs) and does not support upgrading from a read lock to a write lock.
Lock Granularity
We’ve discussed coarse vs. fine-grained locking. This fits into a broader spectrum of approaches to concurrency control:
- Coarse-Grained Locking: One lock for the entire structure. Simple, but limits concurrency.
- Fine-Grained Locking: Separate locks for components (e.g., list nodes, hash buckets). Increases potential concurrency but adds complexity.
- Optimistic Synchronization: Assume conflicts are rare. Perform operations without locking, then detect whether a conflict occurred (e.g., using version numbers or CAS). If one did, retry or otherwise recover.
- Lazy Synchronization: Delay or simplify cleanup work. E.g., removing an item might just mark it as “deleted” logically; actual removal happens later. Can improve performance of some operations but complicates others (like getting the size).
- Lock-Free Synchronization: Use atomic hardware primitives (like CAS) exclusively to manage concurrent access without using any mutual exclusion locks. Very complex, but can offer benefits like immunity to deadlock and better performance under certain conditions (a minimal sketch follows this list).
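As a taste of the lock-free style, here is a counter whose increment is a pure CAS retry loop. `AtomicInteger` already offers `incrementAndGet()`, so this sketch (the class name `LockFreeCounter` is an assumption) exists only to make the retry pattern visible:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of lock-free synchronization via compare-and-set (CAS).
class LockFreeCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    int increment() {
        while (true) {
            int old = value.get();                 // read current value
            if (value.compareAndSet(old, old + 1)) // succeed only if unchanged
                return old + 1;
            // CAS failed: another thread interfered; retry (no lock, no deadlock)
        }
    }
}
```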
Fine-Grained Locking Example: List-Based Set
Let's apply fine-grained locking to our sorted linked-list set. The idea is to put a lock on each `Node`.
Coarse-grained locking would simply synchronize `add`, `remove`, and `contains`.
Fine-grained aims to split the object (list) into pieces (nodes) with separate locks, allowing concurrent operations on disjoint parts.
Naive Remove (Wrong): Trying to remove `c` by just locking `b` (the predecessor) fails because other threads might concurrently modify `c` or the node after `c`.
Concurrent Remove Failure: If Thread A (removing `c`) locks `b` and Thread B (removing `b`) locks `a` concurrently, Thread B might remove `b` from the list (`a.next = c`) before Thread A modifies `b.next`. Thread A's subsequent update (`b.next = d`) then modifies an orphaned node, and `c` is never removed.
The Problem: Modifying the list structure requires coordinated, exclusive access to multiple adjacent nodes simultaneously. Removing `c` requires locking both `b` (to change `b.next`) and `c` (to read `c.next`).
Hand-Over-Hand Locking (Lock Coupling): The standard solution for fine-grained list locking.
- To traverse, always hold the lock on the current node you are examining.
- To move from `curr` to `next`:
  - Lock `next` (`curr.next.lock()`).
  - Unlock `curr` (`curr.unlock()`).
- To perform an operation (like delete) involving `pred` and `curr`:
  - Traverse until `pred` and `curr` are the nodes you need; you will arrive holding the lock on `pred`.
  - Lock `curr`.
  - Now you hold locks on both `pred` and `curr`; perform the modification (e.g., `pred.next = curr.next`).
  - Unlock `curr`.
  - Unlock `pred`.
This ensures that the necessary nodes are always locked during modification, preventing race conditions while still allowing concurrency in different parts of the list. The code snippet on the slides shows the traversal and deletion logic within this pattern; a sketch follows below.
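Since that snippet is not reproduced in these notes, here is a minimal sketch of hand-over-hand removal, assuming per-node `ReentrantLock`s and two sentinel nodes so every search terminates with a valid `pred`/`curr` pair (the `Node` layout and sentinels are illustrative, not the lecture's exact code):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of hand-over-hand (lock-coupling) removal on a sorted set of ints.
class FineList {
    private static class Node {
        final int key;
        Node next;
        final ReentrantLock lock = new ReentrantLock();
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Sentinels guarantee the traversal never runs off the end of the list.
    private final Node tail = new Node(Integer.MAX_VALUE, null);
    private final Node head = new Node(Integer.MIN_VALUE, tail);

    boolean remove(int key) {
        head.lock.lock();                  // always hold the lock on the node examined
        Node pred = head;
        try {
            Node curr = pred.next;
            curr.lock.lock();
            try {
                while (curr.key < key) {   // hand over hand: lock next, then unlock pred
                    pred.lock.unlock();
                    pred = curr;
                    curr = curr.next;
                    curr.lock.lock();
                }
                if (curr.key == key) {
                    pred.next = curr.next; // both pred and curr locked: safe unlink
                    return true;
                }
                return false;              // sorted list: key not present
            } finally {
                curr.lock.unlock();
            }
        } finally {
            pred.lock.unlock();
        }
    }
}
```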
Continue here: 21 Optimistic Sync, Lazy Sync, Lazy Skip Lists, Lock-Free Programming