Lecture from: 26.03.2024 | Video: Video ETHZ
Shared Memory Concurrency, Locks, and Data Races
Continuation…
Canonical Example: Bank Account
Let’s consider a simple `BankAccount` class. In a single-threaded world, this code works perfectly fine:
class BankAccount {
private int balance = 0;
int getBalance() {
return balance;
}
void setBalance(int x) {
balance = x;
}
void withdraw(int amount) {
int b = getBalance(); // Read balance
if (amount > b) {
throw new WithdrawTooLargeException();
}
setBalance(b - amount); // Write new balance
}
// ... other operations like deposit, etc.
}
The `withdraw` method reads the balance, checks whether the withdrawal is possible, and then updates the balance.
A Bad Interleaving
Now, imagine two threads executing `withdraw(100)` on the same `BankAccount` object concurrently, with an initial balance of 150. Due to thread interleaving, the following scenario (a “bad interleaving”) can occur:
- Thread 1: Reads balance (b = 150).
- Thread 2: Reads balance (b = 150). (Thread 1 was preempted before writing.)
- Thread 2: Checks `amount > b` (100 > 150 is false).
- Thread 2: Calculates `b - amount` (150 - 100 = 50).
- Thread 2: Calls `setBalance(50)`. The balance is now 50.
- Thread 1: Resumes execution. It already read `b` as 150.
- Thread 1: Checks `amount > b` (100 > 150 is false).
- Thread 1: Calculates `b - amount` (150 - 100 = 50).
- Thread 1: Calls `setBalance(50)`. The balance remains 50.
Result: The final balance is 50. We expected it to be -50 (or ideally, the second withdrawal should have failed if the first completed fully). One withdrawal was effectively “lost”!
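The lost-withdraw interleaving above can be replayed deterministically in a single thread by performing the reads and writes in exactly that order (class and variable names here are illustrative):

```java
public class LostWithdraw {
    // Replays the bad interleaving step by step: both "threads"
    // read the same balance before either one writes.
    static int replay() {
        int balance = 150;
        int b1 = balance;   // Thread 1 reads 150, then is preempted
        int b2 = balance;   // Thread 2 reads 150
        balance = b2 - 100; // Thread 2 writes 50
        balance = b1 - 100; // Thread 1 resumes and also writes 50
        return balance;     // One withdrawal is lost
    }

    public static void main(String[] args) {
        System.out.println("final balance: " + replay()); // 50, not -50
    }
}
```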
Interleaving (Recap)
- Interleaving: When operations from multiple threads execute in an alternating, potentially unpredictable sequence because their execution overlaps in time.
- Preemption: Interleaving can happen even on a single processor due to time-slicing, where the operating system preempts (interrupts) one thread to let another run.
- No Problem with Different Resources: If threads access completely different, independent resources (e.g., different `BankAccount` objects), interleaving is usually not an issue.
- Problem with Aliasing: If multiple threads access the same resource (the references are aliases of the same object), interleaving can cause problems like the “lost withdraw”.
Incorrect “Fixes”
It’s tempting to try fixing bad interleavings by simply rearranging code or repeating operations. This is almost always wrong.
Consider this attempted fix:
void withdraw(int amount) {
if (amount > getBalance()) // Check balance again right before setting
throw new WithdrawTooLargeException();
// Maybe balance changed between the check and here?
setBalance(getBalance() - amount); // Read balance *again*
}
Why this is wrong:
- Still Not Atomic: There is still a small window between reading the balance in the `if` condition and reading it again inside `setBalance`. Another thread can modify the balance in that window.
- Compiler Reordering: The compiler might reorder operations (especially if it doesn’t know about concurrency), potentially turning this back into the original, flawed version.
- New Bugs: This specific “fix” can even produce a negative balance! Imagine:
  - Balance = 150.
  - Thread 1: Checks `amount (100) > getBalance() (150)` → false.
  - Thread 2: Executes `withdraw(100)` completely. Balance becomes 50.
  - Thread 1: Resumes. Calls `setBalance(getBalance() (50) - amount (100))`. Balance becomes -50!
The fundamental problem is the lack of atomicity. The sequence of “read-check-write” must happen as a single, indivisible operation.
Mutual Exclusion: The Sane Fix
The correct solution is to enforce mutual exclusion.
- Allow at most one thread to execute the `withdraw` operation (and related operations like `deposit`) on a specific account `A` at any given time.
- Mutual Exclusion: If one thread is currently using a shared resource (like the `BankAccount`), any other thread trying to use the same resource must wait.
- Critical Sections: The sections of code that access the shared resource (e.g., the body of the `withdraw` method) are called critical sections.
Programmer’s Responsibility:
- The programmer must identify critical sections.
- The programmer must use language primitives (like locks) to ensure mutual exclusion for these critical sections. The compiler generally cannot automatically determine which interleavings are safe or unsafe based on program semantics.
Critical Sections and Mutual Exclusion Defined
- Critical Section: A piece of code that accesses a shared resource and must not be executed by more than one thread simultaneously.
// Critical section of withdraw:
int b = getBalance();
if (amount > b)
throw new WithdrawTooLargeException();
setBalance(b - amount);
- Mutual Exclusion: The property that ensures only one thread can execute a critical section at a time. An algorithm that enforces this typically involves:
  - `acquire_mutex()`: Code executed before entering the critical section. If another thread is already inside, this call blocks until the critical section is free.
  - ... (critical section code) ...
  - `release_mutex()`: Code executed after leaving the critical section, allowing a waiting thread (if any) to enter.
Wrong! Implementing Our Own Mutex
Why can’t we just use a simple boolean flag to implement mutual exclusion?
class BankAccount {
private int balance = 0;
private boolean busy = false; // Our "lock" flag
void withdraw(int amount) {
while (busy) { /* "spin-wait" */ } // Wait if busy
busy = true; // Mark as busy
// --- Critical Section Start ---
int b = getBalance();
if (amount > b)
throw new WithdrawTooLargeException();
setBalance(b - amount);
// --- Critical Section End ---
busy = false; // Mark as not busy
}
// deposit would spin on the same boolean
}
This doesn’t work! We’ve just moved the race condition. Two threads could interleave as follows:

- Thread 1: Checks `while (busy)` (false).
- Thread 2: Checks `while (busy)` (false). (Thread 1 was preempted before setting `busy = true`.)
- Thread 2: Sets `busy = true`.
- Thread 2: Enters the critical section.
- Thread 1: Resumes. Sets `busy = true` (it’s already true, but no matter).
- Thread 1: Also enters the critical section!

We need the check (`while (busy)`) and the set (`busy = true`) to happen atomically.
What We Need: Locks
To solve this, we need help from the programming language or runtime system in the form of locks (also known as mutexes).
A lock is a basic synchronization primitive with atomic operations:
- `new`: Creates a new lock, initially in the “not held” state.
- `acquire`:
  - Checks if the lock is “held”.
  - If held, the calling thread blocks (waits) until the lock is released.
  - If not held, the lock is atomically marked as “held” by the calling thread, and the thread proceeds.
- `release`:
  - Marks the lock as “not held”.
  - If one or more threads were blocked waiting to acquire this lock, exactly one of them is unblocked and allowed to acquire the lock.
Why Locks Work:
- The `acquire` and `release` operations are atomic. The lock implementation (using special hardware instructions and/or operating system support) guarantees that even with simultaneous attempts to acquire or release, the lock’s state remains consistent and only one thread holds the lock at a time.
- We treat the lock mechanism itself as a primitive; we don’t implement it ourselves in standard application code.
The Lock Object Interface
Conceptually, a lock can be thought of as an object satisfying this interface:
public interface Lock {
public void lock(); // Acquire the lock (blocks if necessary) - Entering CS
public void unlock(); // Release the lock - Leaving CS
}
The semantics are:
- `new Lock()`: Creates a lock, initially “not held”.
- `lock()` (acquire): Blocks if the lock is held. Atomically makes the lock “held” when acquired.
- `unlock()` (release): Makes the lock “not held”. Unblocks one waiting thread, if any.
Required Properties of Mutual Exclusion
Any correct mutual exclusion implementation (like a lock) must satisfy:
- Safety Property: At most one thread executes the critical section code at any given time. (Prevents interference).
- Liveness Property: A thread attempting to acquire the lock (`acquire_mutex`) must eventually succeed, provided no other thread holds the lock indefinitely. (Prevents indefinite waiting/starvation.)
Using Locks (Almost Correctly)
Using our conceptual `Lock` interface, we can protect the `BankAccount`:
class BankAccount {
private int balance = 0;
private Lock lk = new Lock(); // One lock *per account*
void withdraw(int amount) {
lk.lock(); // Acquire lock before accessing shared state
// --- Critical Section Start ---
int b = getBalance();
if (amount > b) {
throw new WithdrawTooLargeException(); // Uh oh, what if exception happens?
}
setBalance(b - amount);
// --- Critical Section End ---
lk.unlock(); // Release lock after accessing shared state
}
// deposit would also acquire/release lk
}
Important: Each `BankAccount` instance should have its own lock (`lk`) to allow concurrent operations on different accounts.

Exception Handling: The code above has a flaw! If `WithdrawTooLargeException` (or any other exception) occurs within the critical section before `lk.unlock()`, the lock will never be released, potentially blocking other threads forever. The correct pattern uses `try...finally`:
lk.lock();
try {
// Critical Section
} finally {
lk.unlock(); // Ensures unlock happens even if exceptions occur
}
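Putting the pattern together with Java’s real `java.util.concurrent.locks.ReentrantLock` (discussed later in this lecture), a sketch of an exception-safe account might look like this; the class and exception choice are illustrative:

```java
import java.util.concurrent.locks.ReentrantLock;

public class SafeAccount {
    private int balance;
    private final ReentrantLock lk = new ReentrantLock(); // one lock per account

    public SafeAccount(int initial) { balance = initial; }

    public void withdraw(int amount) {
        lk.lock();
        try {
            // --- Critical Section ---
            if (amount > balance)
                throw new IllegalStateException("withdraw too large");
            balance = balance - amount;
        } finally {
            lk.unlock(); // released even when the exception is thrown
        }
    }

    public int getBalance() {
        lk.lock();
        try { return balance; } finally { lk.unlock(); }
    }

    public static void main(String[] args) {
        SafeAccount a = new SafeAccount(150);
        a.withdraw(100);
        try { a.withdraw(100); } catch (IllegalStateException e) { /* expected */ }
        // The failed withdraw still released the lock, so this does not block:
        System.out.println(a.getBalance()); // 50
    }
}
```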
Possible Mistakes with Locks
- Incorrect: Using Different Locks: If `withdraw` uses `lockA` and `deposit` uses `lockB`, they won’t provide mutual exclusion for the shared `balance`. Both methods must acquire the same lock (`lk` in our example) to protect the shared `balance`.
- Poor Performance: Using the Same Lock for Everything: If we used one single static lock for all bank accounts, only one thread could perform any operation on any account at a time, killing concurrency.
- Incorrect: Forgetting to Release: As noted, failing to release a lock (especially due to exceptions) is a serious bug that can block other threads forever. Always use `try...finally`.
Locking and Other Operations
If `withdraw` and `deposit` use the same lock for a given account, they are correctly synchronized with each other. But what about helper methods like `getBalance` and `setBalance`?

- Assume they are `public` (accessible from outside).
- If `getBalance` and `setBalance` do not acquire the lock: a race can occur between a thread calling `withdraw` (which holds the lock) and another thread calling `setBalance` directly without it.
- If `getBalance` and `setBalance` do acquire the same lock: a problem arises! If `withdraw` acquires the lock `lk` and then calls `setBalance`, which also tries to acquire `lk`, the thread blocks forever waiting for a lock it already holds. This is a self-deadlock.
Re-acquiring Locks: Re-entrant Locks
How can a thread acquire a lock it already holds?
- Approach 1 (Restrictive): Disallow `withdraw` from calling a synchronized `setBalance`. This often leads to code duplication or awkward structuring.
- Approach 2 (Better): Use re-entrant locks (also called recursive locks).
Re-entrant Lock:
- “Remembers” which thread currently holds the lock.
- Maintains a count of how many times the holding thread has acquired the lock.
- Acquire:
- If the lock is not held, acquire it and set count to 1.
- If the lock is held by the current thread, simply increment the count. The thread does not block.
- If the lock is held by a different thread, block as usual.
- Release:
- Decrement the count.
- If the count reaches 0, the lock becomes “not held”.
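The owner-plus-count scheme above can be sketched as a toy re-entrant lock. This is an illustration of the bookkeeping only (it uses Java’s intrinsic monitor via `wait`/`notify` internally); real code should use `synchronized` or `ReentrantLock`:

```java
public class ToyReentrantLock {
    private Thread owner = null; // which thread currently holds the lock
    private int count = 0;       // how many times the owner has acquired it

    public synchronized void lock() throws InterruptedException {
        Thread me = Thread.currentThread();
        if (owner == me) { count++; return; } // re-entrant acquire: just count up
        while (owner != null) wait();          // held by another thread: block
        owner = me;
        count = 1;
    }

    public synchronized void unlock() {
        if (owner != Thread.currentThread())
            throw new IllegalMonitorStateException();
        if (--count == 0) { owner = null; notify(); } // fully released: wake one waiter
    }

    public synchronized int holdCount() { return count; }

    public static void main(String[] args) throws InterruptedException {
        ToyReentrantLock lk = new ToyReentrantLock();
        lk.lock();
        lk.lock();                          // same thread: does NOT block
        System.out.println(lk.holdCount()); // 2
        lk.unlock();
        lk.unlock();
        System.out.println(lk.holdCount()); // 0
    }
}
```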
With re-entrant locks, the following code works correctly:
// Assumes lk is a re-entrant lock
void setBalance(int x) {
lk.lock();
try {
balance = x;
} finally {
lk.unlock();
}
}
void withdraw(int amount) {
lk.lock();
try {
int b = getBalance(); // Assume getBalance also uses lk
if (amount > b) { /*...*/ }
setBalance(b - amount); // Okay! Thread already holds lk
} finally {
lk.unlock();
}
}
Locks in Java: The `synchronized` Statement

Java has built-in support for re-entrant locks via the `synchronized` keyword.
synchronized (expression) { // expression evaluates to an object
// statements
}
- Evaluates `expression` to an object: Every Java object (but not primitives) has an implicit “intrinsic lock” (or “monitor lock”) associated with it.
- Acquires the lock: The thread attempts to acquire the intrinsic lock of the object produced by `expression`, blocking if necessary. Since intrinsic locks are re-entrant, acquiring a lock already held by the current thread succeeds immediately.
- Executes `statements`: The code inside the block runs only after the lock is acquired.
- Releases the lock: The lock is automatically released when control leaves the `synchronized` block, whether normally (reaching the `}`) or via an exception or `return`. This avoids the need for a manual `try...finally`.
You can also synchronize entire methods:
synchronized void myMethod() {
// Body of method
}
This is equivalent to:
void myMethod() {
synchronized(this) { // Synchronizes on the current object instance
// Body of method
}
}
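A standard use of synchronized methods is a shared counter incremented from several threads; without `synchronized`, increments could be lost exactly like the withdrawals above. The class name is illustrative:

```java
public class SyncCounter {
    private int count = 0;

    // increment() is a critical section: the intrinsic lock of `this`
    // makes the read-modify-write of `count` atomic.
    public synchronized void increment() { count++; }
    public synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) c.increment();
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // always 200000 with synchronization
    }
}
```

Dropping `synchronized` from `increment` would reintroduce a data race and make the final count unpredictable.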
External Locks in Java
Besides intrinsic locks (`synchronized`), Java also provides explicit lock implementations in the `java.util.concurrent.locks` package, such as `ReentrantLock`.

- External Locks (e.g., `ReentrantLock`):
  - More flexible than intrinsic locks.
  - Offer additional features like timed waits, interruptible waits, and different fairness policies.
  - Support more sophisticated locking idioms (e.g., reader-writer locks, condition variables — discussed in later lectures).
  - Require manual `lock()` and `unlock()` calls, typically within a `try...finally` block.
- Intrinsic Locks (`synchronized`):
  - Simpler syntax.
  - Automatic lock release.

The class `java.util.concurrent.locks.ReentrantLock` behaves much like our pseudocode lock. Use `try { ... } finally { lock.unlock(); }` for safety.
`synchronized` vs the `java.util.concurrent.locks.Lock` API:
| Feature | `synchronized` | `Lock` API (e.g., `ReentrantLock`) |
|---|---|---|
| Release | Automatic (handled by JVM) | Manual (`unlock()`) |
| Scope | Block-scoped | Can span methods |
| Waiting | Blocks indefinitely | `tryLock()`, interruptible lock |
| Flexibility | Basic | More features (conditions, etc.) |
| Ease of Use | Cleaner code, easier to maintain | More verbose, slightly bug-prone |
Recommendation: Use `synchronized` whenever possible due to its simplicity and safety (automatic release). Use the `Lock` API when you need its advanced features.
Race Conditions Revisited
A Race Condition occurs when the correctness of a concurrent program depends on the specific, unpredictable timing or interleaving of operations executed by multiple threads. The problem usually involves threads observing inconsistent intermediate states.
Important Distinction: This lecture (and good practice) distinguishes between two types of race-condition bugs:
- Data Races
- Bad Interleavings
- Data Race (Low-Level Race Condition):
  - Occurs when there are insufficiently synchronized accesses to the same shared memory location by multiple threads, and at least one access is a write.
  - Specifically: simultaneous read/write or write/write to the same location without proper locking.
  - Always an error (in languages like Java/C/C++ with defined memory models), because it can lead to bizarre, non-intuitive behavior due to compiler and hardware optimizations and reorderings. Avoid data races at all costs.
- Bad Interleaving (High-Level Race Condition):
  - Occurs when the program behaves incorrectly due to an unfavorable execution order of operations, even if the individual accesses to shared resources are properly synchronized (i.e., no data races).
  - Whether an interleaving is “bad” depends on the program’s specification or intended invariants.
Example:
// Assume getBalance and setBalance are synchronized
public /*NOT synchronized*/ void withdraw(int amount) {
// ...
b = getBalance(); // Reads balance (under lock inside getBalance)
// ... calculation ...
setBalance(b - amount); // Writes balance (under lock inside setBalance)
// ...
}
- No Data Race: `withdraw` itself doesn’t directly access `balance`. It calls `getBalance` and `setBalance`, which are synchronized. There are no concurrent read/write or write/write operations on the `balance` field itself without a lock.
- Bad Interleaving (High-Level Race): As we saw before, two `withdraw` calls can still interleave between the `getBalance` and `setBalance` calls within `withdraw`, leading to the “lost withdraw” bug. This is a high-level race condition: the individual low-level accesses are safe, but the sequence of operations leads to an incorrect outcome given our expectation of how `withdraw` should behave.
Example: Bounded Stack
Consider a simple bounded stack implementation using an array.
public class Stack <E> {
E[] array;
int index; // Points to the next available slot
public Stack(int entries) {
// (Hack to create generic array)
array = (E[]) new Object[entries];
index = 0;
}
// ... methods ...
}
Let’s add synchronized methods for basic stack operations:
public class Stack <E> {
// ... fields and constructor ...
synchronized boolean isEmpty() {
return index == 0;
}
synchronized void push(E val) throws StackFullException {
if (index == array.length)
throw new StackFullException();
array[index++] = val;
}
synchronized E pop() throws StackEmptyException {
if (index == 0)
throw new StackEmptyException();
return array[--index]; // Note: Potential memory leak if E is not primitive
}
}
These methods (`isEmpty`, `push`, `pop`) are individually thread-safe because they are `synchronized`, preventing data races on `array` and `index`.
The Problem with `peek`

Now, let’s try to add a `peek` method (view the top element without removing it) using the existing `push` and `pop`:
// Inside Stack<E> class...
E peek() { // NOT synchronized initially
E ans = pop();
push(ans); // Put it back
return ans;
} // WRONG!
Sequentially, this code is correct (though inefficient). If we only had the `Stack<E>` interface (`isEmpty`, `push`, `pop`), this pop-then-push approach would be the only way to implement `peek`.

Concurrently, however, this `peek` is broken.
- Overall Effect: `peek` is intended as a “reader” — it shouldn’t change the stack’s state permanently.
- Intermediate State: However, the implementation temporarily modifies the stack (by popping) before restoring it (by pushing). This creates an inconsistent intermediate state in which the stack is briefly missing its top element.
- No Data Races: Calls to `push` and `pop` are synchronized, so there are no low-level data races on the underlying `array` or `index`.
- Bad Interleavings: The intermediate state can be observed by other threads, leading to several bad interleavings.
Bad Interleaving 1: `peek` and `isEmpty`

Property we want: If the stack is not empty, `isEmpty` should return `false`.
Scenario:
- Stack has one element “A”.
- Thread 1 (peek): Calls `pop()`. `ans` = “A”. Stack is now empty.
- Thread 2: Calls `isEmpty()`. Returns `true` (because the stack is temporarily empty). VIOLATION!
- Thread 1 (peek): Calls `push(ans)`. Stack now contains “A” again.
- Thread 1 (peek): Returns “A”.
Bad Interleaving 2: `peek` and LIFO Order

Property we want: The stack follows Last-In, First-Out (LIFO) order.
Scenario:
- Stack is empty.
- Thread 2: Calls `push(x)`. Stack: [x]
- Thread 2: Calls `push(y)`. Stack: [x, y]
- Thread 1 (peek): Calls `pop()`. `ans` = “y”. Stack: [x]
- Thread 2: Calls `pop()`. Returns “x”. VIOLATION! Should have returned “y”.
- Thread 1 (peek): Calls `push(ans)`. Stack: [x, y]
- Thread 1 (peek): Returns “y”.
The Fix: Larger Critical Section
The `peek` operation needs to be atomic: the sequence `pop`-then-`push` must appear indivisible to other threads. We achieve this by making the entire `peek` method a critical section, synchronized on the same lock used by `push` and `pop`.
// Inside Stack<E> class...
synchronized E peek() { // Now synchronized!
E ans = pop(); // pop is synchronized, okay due to re-entrant lock
push(ans); // push is synchronized, okay due to re-entrant lock
return ans;
}
// OR, if implementing peek outside the class:
class C {
<E> E myPeek(Stack<E> s) {
synchronized (s) { // Lock on the stack object itself
E ans = s.pop();
s.push(ans);
return ans;
}
}
}
Because Java’s intrinsic locks are re-entrant, the synchronized `peek` method can safely call the synchronized `pop` and `push` methods without deadlocking on itself.
The Wrong “Fix”: Skipping Synchronization for Readers
It’s tempting to think: “If `peek` or `isEmpty` don’t write anything, maybe they don’t need synchronization?”
// WRONG - DON'T DO THIS
boolean isEmpty() {
return index == 0; // Read index without lock
}
This is wrong because it introduces a data race. Even though `isEmpty` only reads `index`, another thread might concurrently execute `push` or `pop` and write to `index`; since the read and the write are not ordered by a common lock, this is a data race, and data races lead to undefined behavior.
Recap of the Distinction:
- Original `peek` (pop-then-push, unsynchronized method): Had no data races (because `pop` and `push` were synchronized internally), but had bad interleavings due to the inconsistent intermediate state.
- `isEmpty` (reading `index` directly, unsynchronized): Has a data race with `push`/`pop` (concurrent read/write).
Continue here: 13 Virtual Threads