# Merge sort

**Merge sort** is a divide-and-conquer approach to sorting that generates a new list. Like quicksort, it works by splitting the list in two and sorting each part recursively. Unlike quicksort, the idea is quite simple:

1. Split the list in two arbitrarily.
2. Recursively sort each of the split lists.
3. Merge the shorter, sorted lists into a single list.

Here’s the pseudocode:

```
-- merge - merge two sorted lists into a single list
merge(l1, l2):
    if l1 is empty: return l2
    if l2 is empty: return l1
    if l1[0] < l2[0]:
        return l1[0] followed by merge(l1[1:], l2)
    else:
        return l2[0] followed by merge(l1, l2[1:])

-- mergesort - sort a list l into a new list
mergesort(l):
    -- base case: a list of zero or one elements is already sorted
    if l has at most one element: return l
    split l arbitrarily into two lists l1 and l2 of nearly equal length
    l1_sorted = mergesort(l1)
    l2_sorted = mergesort(l2)
    return merge(l1_sorted, l2_sorted)
```

For the split, we can do something simple: copy every other element to one of two lists. Let’s try it in Python:
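What follows is a minimal sketch of the pseudocode above rather than a definitive implementation. It assumes the function names `split`, `merge`, and `mergesort`, and it assumes that `split` deals elements out alternately using stepped slices. The base case in `mergesort` (a list of at most one element is already sorted) is what stops the recursion.

```python
def split(l):
    """Deal the items of l alternately into two lists of nearly equal length."""
    # Assumption: stepped slices; even-indexed items go left, odd-indexed go right.
    return l[0::2], l[1::2]


def merge(l1, l2):
    """Merge two sorted lists into a single sorted list."""
    if not l1:
        return l2
    if not l2:
        return l1
    if l1[0] < l2[0]:
        return [l1[0]] + merge(l1[1:], l2)
    else:
        return [l2[0]] + merge(l1, l2[1:])


def mergesort(l):
    """Sort a list l into a new list; l itself is not modified."""
    if len(l) <= 1:
        return l[:]  # base case: nothing to sort, return a copy
    l1, l2 = split(l)
    l1_sorted = mergesort(l1)
    l2_sorted = mergesort(l2)
    return merge(l1_sorted, l2_sorted)


print(mergesort([5, 2, 9, 1, 5, 6]))
```

Because this `merge` recurses once per output element, long inputs (on the order of a thousand items) would run into Python's default recursion limit; a loop-based merge avoids that.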

## Correctness

The correctness of merge sort is substantially easier to see than that of quicksort. The `split` procedure is more or less arbitrary; all that matters is that it produces two lists, each approximately half the length of the original. The `merge` procedure will clearly produce a sorted list from two sorted lists: at each step it picks off the smaller of the two front elements, which is the smallest element remaining. The `mergesort` procedure itself is quite simple: it splits the list, sorts the two pieces (guaranteed to be correct by an inductive argument on the length of the list), and then merges them.

## Performance

Merge sort produces a balanced recursion tree: each recursive call splits the list in half, so the tree has about `log2(n)` levels, and the merges on each level make at most `n` comparisons in total. We don’t have to worry about pivot elements the way that quicksort does. We’ll get on the order of `n * log2(n)` comparisons every time.
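To make the comparison count concrete, here is a small experiment, offered as a sketch under stated assumptions. The helper names `merge_counting`, `mergesort_counting`, and `count_comparisons` are hypothetical, and the merge is written as a loop (a swapped-in technique, not the recursive merge from the pseudocode) so that larger inputs do not hit Python's recursion limit. It prints the measured number of comparisons next to `n * log2(n)` for a few power-of-two sizes.

```python
import math
import random


def merge_counting(l1, l2, counter):
    """Merge two sorted lists with a loop, tallying comparisons in counter[0]."""
    out = []
    i = j = 0
    while i < len(l1) and j < len(l2):
        counter[0] += 1  # one element comparison per loop iteration
        if l1[i] < l2[j]:
            out.append(l1[i])
            i += 1
        else:
            out.append(l2[j])
            j += 1
    out.extend(l1[i:])  # at most one of these two tails is non-empty
    out.extend(l2[j:])
    return out


def mergesort_counting(l, counter):
    """Merge sort that records every comparison made while merging."""
    if len(l) <= 1:
        return l[:]
    l1, l2 = l[0::2], l[1::2]  # alternate items between the two halves
    return merge_counting(
        mergesort_counting(l1, counter),
        mergesort_counting(l2, counter),
        counter,
    )


def count_comparisons(n, seed=0):
    """Sort n pseudo-random floats and return the number of comparisons used."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    counter = [0]
    result = mergesort_counting(data, counter)
    assert result == sorted(data)  # sanity check against the built-in sort
    return counter[0]


for n in (16, 256, 4096):
    print(n, count_comparisons(n), int(n * math.log2(n)))
```

For these sizes the measured count should land below `n * log2(n)` but within a factor of two of it, whatever the input order.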

One big difference from quicksort is that merge sort produces new lists, and by default it produces quite a few of them! It’s possible to use less space than our implementation: in general, to merge sort a list of `n` elements, you need to allocate space for another `n` elements. Our code allocates new lists at every split, and the merge allocates again.

Our `split` routine sends every other item to each of the two lists. In Python, it’d be easy enough to define a simpler split as, e.g., `(l[0:len(l)//2], l[len(l)//2:])`. Efficient implementations may use ranges of the original list and do some work in place.
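As one illustration of the range-based idea, the sketch below sorts a list in place using index ranges and a single auxiliary buffer of `n` elements. It is only one possible arrangement, not a tuned implementation, and the names `mergesort_inplace`, `_sort`, and `_merge` are hypothetical.

```python
def mergesort_inplace(a):
    """Sort the list a in place, using one auxiliary buffer of len(a) items."""
    buf = a[:]  # the only extra allocation: space for another n elements
    _sort(a, buf, 0, len(a))


def _sort(a, buf, lo, hi):
    """Sort the half-open range a[lo:hi] without creating new lists."""
    if hi - lo <= 1:
        return  # zero or one element: already sorted
    mid = (lo + hi) // 2
    _sort(a, buf, lo, mid)
    _sort(a, buf, mid, hi)
    _merge(a, buf, lo, mid, hi)


def _merge(a, buf, lo, mid, hi):
    """Merge the sorted ranges a[lo:mid] and a[mid:hi] back into a[lo:hi]."""
    for k in range(lo, hi):
        buf[k] = a[k]  # copy the range into the buffer, element by element
    i, j = lo, mid
    for k in range(lo, hi):
        # Take from the left range while it has items and its front is not larger.
        if i < mid and (j >= hi or buf[i] <= buf[j]):
            a[k] = buf[i]
            i += 1
        else:
            a[k] = buf[j]
            j += 1


data = [5, 2, 9, 1, 5, 6, 0]
mergesort_inplace(data)
print(data)
```

Here the split is just the midpoint calculation, so nothing is copied when dividing; all copying happens during the merge, through the shared buffer.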