Searching algorithms are designed to find a specific element (the key) within a dataset.

Linear Search (Unsorted)

Algorithm

Linear search is a straightforward algorithm that iterates through each element of an unsorted list sequentially until the key is found or the end of the list is reached.

Python Implementation

def linear_search(data, key):
    for i in range(len(data)):
        if data[i] == key:
            return i  # Return index if key found
    return -1  # Return -1 if key not found

Time Complexity

  • Worst-case: - When the key is at the end of the list or not present. We have to scan all n elements.
  • Best-case: - When the key is at the beginning of the list. We find it in the first comparison.
  • Average-case: - On average, we expect to search half of the list.

Correctness Proof (Loop Invariant)

Invariant: At the start of each iteration of the for loop, the key is not present in the sublist data[0...i-1].

  • Initialization: Before the loop starts (i=0), the sublist data[0...-1] is empty, so the invariant holds trivially.

  • Maintenance: Suppose the invariant holds at the beginning of iteration i. That means key is not in data[0...i-1]. In the i-th iteration, we check if data[i] is equal to the key. If it is, we return i and the algorithm terminates (successfully). If not, the invariant still holds for the next iteration since key is not data[i], so not in data[0..i].

  • Termination: The loop terminates either when the key is found (and the function returns), or when i reaches the length of the array (len(data)). If the loop terminates because i reached len(data), it means that for all i from 0 to len(data)-1, data[i] is not equal to key. Therefore, the key is not in the list, and the function returns -1, maintaining the invariant.

Algorithm

Binary search is an efficient algorithm for finding a key within a sorted list. It works by repeatedly dividing the search interval in half. If the middle element is the key, the search is successful. If the key is smaller than the middle element, the search continues in the lower half; otherwise, it continues in the upper half.

Python Implementation

def binary_search(data, key):
    low = 0
    high = len(data) - 1
    while low <= high:
        mid = (low + high) // 2  # Floor division
        if data[mid] == key:
            return mid  # Key found
        elif data[mid] < key:
            low = mid + 1  # Search in the upper half
        else:
            high = mid - 1  # Search in the lower half
    return -1  # Key not found

Time Complexity

  • Worst-case: - When the key is not present or at the very end of search after halving multiple times.
  • Best-case: - When the key is at the middle element initially.
  • Average-case: - On average, the search space is halved in each step.

Correctness Proof (Loop Invariant)

Invariant: At the start of each iteration of the while loop, if the key is present in the original sorted list, then it must be within the sublist data[low...high].

  • Initialization: Before the loop starts, low is 0 and high is len(data) - 1, so the sublist is the entire list. If the key is in the list, it’s within this sublist so the invariant holds.
  • Maintenance: Suppose the invariant holds at the beginning of some iteration. That is, if the key is in data, then key is in data[low...high]. We calculate mid.
    • If data[mid] == key, we found the key and terminate successfully.
    • If data[mid] < key, since the array is sorted, all elements in data[low...mid] are less than the key. Hence if the key is present, it must be in data[mid+1...high]. We set low = mid + 1 ensuring the invariant holds for the next iteration.
    • If data[mid] > key, similarly, if the key is present it must be in data[low...mid-1]. We set high = mid - 1 ensuring the invariant holds.
  • Termination: The loop terminates when low > high. This means the sublist data[low...high] is empty. If the loop terminates this way, it implies the key was never found in any sublist inspected and hence, is not present in the entire list because of the invariant. So returning -1 maintains correctness. If the loop terminates because data[mid] == key, we return the index mid (correct result).

NOTE

Binary Search requires the input list to be sorted.