Searching algorithms are designed to find a specific element (the key) within a dataset.
Linear Search (Unsorted)
Algorithm
Linear search is a straightforward algorithm that iterates through each element of an unsorted list sequentially until the key is found or the end of the list is reached.
Python Implementation
Time Complexity
- Worst-case: - When the key is at the end of the list or not present. We have to scan all n elements.
- Best-case: - When the key is at the beginning of the list. We find it in the first comparison.
- Average-case: - On average, we expect to search half of the list.
Correctness Proof (Loop Invariant)
Invariant: At the start of each iteration of the for loop, the key is not present in the sublist data[0...i-1]
.
-
Initialization: Before the loop starts (i=0), the sublist
data[0...-1]
is empty, so the invariant holds trivially. -
Maintenance: Suppose the invariant holds at the beginning of iteration i. That means
key
is not indata[0...i-1]
. In the i-th iteration, we check ifdata[i]
is equal to thekey
. If it is, we return i and the algorithm terminates (successfully). If not, the invariant still holds for the next iteration sincekey
is notdata[i]
, so not indata[0..i]
. -
Termination: The loop terminates either when the key is found (and the function returns), or when
i
reaches the length of the array (len(data)
). If the loop terminates becausei
reachedlen(data)
, it means that for all i from 0 tolen(data)-1
,data[i]
is not equal tokey
. Therefore, the key is not in the list, and the function returns -1, maintaining the invariant.
Binary Search
Algorithm
Binary search is an efficient algorithm for finding a key within a sorted list. It works by repeatedly dividing the search interval in half. If the middle element is the key, the search is successful. If the key is smaller than the middle element, the search continues in the lower half; otherwise, it continues in the upper half.
Python Implementation
Time Complexity
- Worst-case: - When the key is not present or at the very end of search after halving multiple times.
- Best-case: - When the key is at the middle element initially.
- Average-case: - On average, the search space is halved in each step.
Correctness Proof (Loop Invariant)
Invariant: At the start of each iteration of the while loop, if the key is present in the original sorted list, then it must be within the sublist data[low...high]
.
- Initialization: Before the loop starts,
low
is 0 andhigh
islen(data) - 1
, so the sublist is the entire list. If the key is in the list, it’s within this sublist so the invariant holds. - Maintenance: Suppose the invariant holds at the beginning of some iteration. That is, if the
key
is indata
, thenkey
is indata[low...high]
. We calculatemid
.- If
data[mid] == key
, we found the key and terminate successfully. - If
data[mid] < key
, since the array is sorted, all elements indata[low...mid]
are less than the key. Hence if the key is present, it must be indata[mid+1...high]
. We setlow = mid + 1
ensuring the invariant holds for the next iteration. - If
data[mid] > key
, similarly, if the key is present it must be indata[low...mid-1]
. We sethigh = mid - 1
ensuring the invariant holds.
- If
- Termination: The loop terminates when
low > high
. This means the sublistdata[low...high]
is empty. If the loop terminates this way, it implies the key was never found in any sublist inspected and hence, is not present in the entire list because of the invariant. So returning -1 maintains correctness. If the loop terminates becausedata[mid] == key
, we return the indexmid
(correct result).
NOTE
Binary Search requires the input list to be sorted.