Merge Sort - The Computer Science Handbook

Introduction

Prerequisites: Recursion

Merge sort works by breaking down the sorting into smaller pieces. If we want to sort N elements, we can sort the first half of the elements, sort the second half and then merge the results together. To sort the first half, we can do the exact same thing of sorting the first quarter and the second quarter and merging the results.

Implementation

Merge sort work be breaking down the problem into smaller and smaller parts and then combining those parts to solve the larger problem.

We can keep splitting the list into half until each piece has zero or one elements. We can then combine the results of each piece repeatedly until the entire list is sorted.

Example:

Formalization

Let merge(arr1,arr2) that combines two sorted arrays into one sorted array.

Example:
merge([1,5,7,9], [2,4,8])
= [1, 2, 4, 5, 7, 8, 9]

Let sort(arr) sort an array
Let middle be arr.length / 2

sort([x]) = [x]
sort([x,y]) = [x,y] if x<y 
sort([x,y]) = [y,x] otherwise
sort(arr) = merge(sort(arr[0..middle]), sort(arr[middle..arr.length])

Example:
sort([3, 7, 1, 9, 8, 4, 5])
= merge(sort([3, 7, 1]), sort([9, 8, 4, 5]))
= merge(merge(sort([3, 7]), sort([1]), merge(sort([9, 8]), sort([4, 5])))
= merge(merge([3, 7], [1]), merge([8, 9], [4, 5])
= merge([1, 3, 7], [4, 5, 8, 9])
= [1, 3, 4, 5, 7, 8, 9]

Code

We will implement our solution in Java that reflects our formalization in a way that is easier to understand but more inefficient. In our code, we create new arrays to store merged results but we can actually do this in place without extra memory (left as an exercise).

// Merges two sorted arrays v1 and v2 into a new sorted array
public static Vector<Integer> merge(Vector<Integer> v1, Vector<Integer> v2) {
  Vector<Integer> merged = new Vector<Integer>();
  int i = 0, j = 0;
  // Always take the smaller element of the two vectors
  while(i < v1.size() && j < v2.size()){
    if(v1.get(i) < v2.get(j)){
      merged.add(v1.get(i));
      i++;
    } else {
      merged.add(v2.get(j));
      j++;
    }
  }
  if (i >= v1.size()){
    // Add the rest of v2
    while(j < v2.size()){
      merged.add(v2.get(j));
      j++;
    }
  } else {
    // Add the rest of v1.
    while(i < v1.size()){
      merged.add(v1.get(i));
      i++;
    }
  }
  return merged;
}

// Merge sorts an array
public static Vector<Integer> mergeSort(Vector<Integer> v) {
  // Base case if 1 or 0 elements.
  if (v.size() <= 1) {
    return v;
  }
  // Get middle of array.
  int middle = v.size()/2;
  // Split vector into two halves.
  Vector<Integer> firstHalf = new Vector<Integer>(v.subList(0, middle));
  Vector<Integer> secondHalf = new Vector<Integer>(v.subList(middle, v.size()));
  // Return merged halves.
  return merge(mergeSort(firstHalf), mergeSort(secondHalf));
}

// Example usage:
Vector<Integer> v = new Vector<Integer>();
Collections.addAll(v, 5,4,10,8,9,2);
System.out.println(mergeSort(v));

Exercises

Write the formalization for merge(arr1, arr2).
Rewrite merge sort to not use any additional memory (i.e. merging and splitting arrays is done in place and no new vectors or arrays are created). You should rewrite the function to sort in place: void mergeSort(int[] arr, int start, int end) and void merge(int arr[]. int start, int end, int start2, int end2).