# CMPSCI 251: Mathematics of Computation

### The Problem of Sorting

We've just seen that there are n! possible permutations of an n-element set. A sorting algorithm must determine which of these possible orders the input is in. A comparison-based sorting algorithm must decide this on the basis of comparisons between pairs of elements. We can represent a comparison-based sorting algorithm by a decision tree. I'll draw on the board a depth-1 decision tree that decides the order of two elements, and a depth-3 tree that decides the order of three elements. Each internal node of the tree represents a comparison between two of the elements, and each leaf node represents an inferred order of the elements.

A tree of depth d can have at most 2^d leaves, and a correct tree sorting n elements must have at least n! leaves. Comparing these two numbers gives 2^d ≥ n!, which is the Sorting Lower Bound Theorem: any comparison-based sorting algorithm takes at least log2(n!) comparisons in the worst case.
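As a quick check of the bound, the sketch below finds the smallest depth d with 2^d ≥ n! for a few small n. The function name `comparison_lower_bound` is just an illustrative choice, not something from the lecture.

```python
import math

def comparison_lower_bound(n: int) -> int:
    """Smallest d with 2**d >= n!, that is, log2(n!) rounded up."""
    leaves_needed = math.factorial(n)  # one leaf per possible input order
    d = 0
    while 2 ** d < leaves_needed:      # a depth-d tree has at most 2**d leaves
        d += 1
    return d

for n in range(2, 7):
    print(n, math.factorial(n), comparison_lower_bound(n))
```

For n=4 and n=5 this gives 5 and 7, the values used in the writing exercises.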

Questions are in black, answers in blue.

• Writing Exercise 1: For n=4 we must use at least five comparisons because log2(24) is a real number between 4 and 5. Describe a decision tree of depth 5 that is correct for sorting four elements a, b, c, and d. (Note that there are six possible comparisons, so you will always reach your conclusion despite there being at least one pair that you have not compared directly.)

First compare a against b, then c against d, then the winners of the first two comparisons. (Here the "winner" of a comparison is the element that comes first in the sorted order.) In the case where the earlier letter wins each time, so that a beats b, c beats d, and a beats c, we know that we have reduced to the three cases abcd, acbd, and acdb. In this case we can first compare b against c (our fourth comparison) and output abcd if b wins. If b loses we compare b against d (our fifth comparison) to find which of the other two cases we are in.

In the seven other cases for the first three comparisons, we also have three cases left and we can find out which of the three we have by one or two more comparisons, using at most five in all. For example, if b wins over a in the first comparison, we can switch the roles of a and b thereafter.
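The tree just described can be written out as a short sketch and checked against all 24 input orders. This assumes "x wins" means x < y, and it handles the seven symmetric cases by relabeling the variables rather than by writing out each branch. The name `sort4` and the comparison counter are illustrative.

```python
from itertools import permutations

def sort4(values):
    """Sort four distinct values with at most five comparisons.

    Returns (sorted_list, number_of_comparisons_used).
    """
    count = 0

    def wins(x, y):
        # x "wins" when it comes earlier in sorted order.
        nonlocal count
        count += 1
        return x < y

    a, b, c, d = values
    if not wins(a, b):            # comparison 1
        a, b = b, a               # relabel so that a beats b
    if not wins(c, d):            # comparison 2
        c, d = d, c               # relabel so that c beats d
    if not wins(a, c):            # comparison 3: the two winners
        a, b, c, d = c, d, a, b   # relabel so that a beats c
    # Now a < b, c < d, and a < c, so the order is abcd, acbd, or acdb.
    if wins(b, c):                # comparison 4
        result = [a, b, c, d]
    elif wins(b, d):              # comparison 5
        result = [a, c, b, d]
    else:
        result = [a, c, d, b]
    return result, count

# Exhaustive check over every possible input order.
worst = 0
for perm in permutations(range(4)):
    result, used = sort4(perm)
    assert result == [0, 1, 2, 3]
    worst = max(worst, used)
print("worst case comparisons:", worst)
```

The pair b, d (after relabeling) is never compared when b beats c, which matches the note in the exercise that some pair always goes uncompared.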

• Writing Exercise 2: For n=5 the lower bound is seven comparisons because log2(120) is a real number a little smaller than 7. Describe a decision tree of depth seven that is correct for sorting five elements a, b, c, d, and e. (Hint: To each node you can associate the set of orders of the elements that reach that node. To get the shallowest tree you want the new comparison to divide this set as nearly in half as possible. I'll illustrate this on the sample trees for n=2 and n=3.)

Once again we begin with a versus b, c versus d, and the winners of the first two comparisons. We'll deal with the case where a beats b, c beats d, and a beats c -- the other seven cases are symmetrical. As above, the first four elements must be in order abcd, acbd, or acdb. Each of these three cases gives five orders of the five elements because e can be in five different places. So we have eabcd, aebcd, abecd, abced, abcde, eacbd, aecbd, acebd, acbed, acbde, eacdb, aecdb, acedb, acdeb, and acdbe.
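The fifteen orders can also be found by brute force, and the hint's "divide as nearly in half as possible" criterion can be checked by counting how each candidate comparison involving e would split them. This is a sketch: the helper names `consistent_orders` and `split` are made up for illustration.

```python
from itertools import permutations

def consistent_orders(elements, known):
    """All orders of `elements` in which x precedes y for every (x, y) in `known`."""
    orders = []
    for perm in permutations(elements):
        order = "".join(perm)
        if all(order.index(x) < order.index(y) for x, y in known):
            orders.append(order)
    return orders

def split(orders, x, y):
    """Partition `orders` by the outcome of comparing x with y."""
    x_first = [o for o in orders if o.index(x) < o.index(y)]
    y_first = [o for o in orders if o.index(x) > o.index(y)]
    return x_first, y_first

# Outcomes assumed so far: a beats b, c beats d, a beats c.
orders = consistent_orders("abcde", [("a", "b"), ("c", "d"), ("a", "c")])
print(len(orders), orders)

for other in "abcd":
    e_first, other_first = split(orders, "e", other)
    print(f"e vs {other}: {len(e_first)} / {len(other_first)}")
```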

We need a comparison to split these fifteen orders into a group of eight and a group of seven, since each group must then be resolved by at most three more comparisons (2^3 = 8). Comparing e versus c will do:

• e wins: We have eabcd, aebcd, abecd, eacbd, aecbd, eacdb, or aecdb. Compare e with a. If e wins we have eabcd, eacbd, or eacdb and we can finish in two more comparisons as in the four-element case. If a wins we have aebcd, abecd, aecbd, or aecdb. Compare b with c -- if b wins compare b with e, if c wins compare b with d.
• c wins: We have abced, abcde, acebd, acbed, acbde, acedb, acdeb, or acdbe. Compare d with e. If d wins we have abcde, acbde, acdeb, or acdbe. Compare b with d -- if b wins compare b with c, if d wins compare b with e. If e wins the comparison with d (the fifth comparison overall) we have abced, acebd, acbed, or acedb. Compare b with e -- if b wins compare b with c, if e wins compare b with d.
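The whole depth-seven tree can be sketched in code and tested on all 120 input orders. As in the description above, "x wins" is taken to mean x < y, and the seven symmetric cases for the first three comparisons are handled by relabeling. The name `sort5` and the comparison counter are illustrative.

```python
from itertools import permutations

def sort5(values):
    """Sort five distinct values with at most seven comparisons.

    Returns (sorted_list, number_of_comparisons_used).
    """
    count = 0

    def wins(x, y):
        # x "wins" when it comes earlier in sorted order.
        nonlocal count
        count += 1
        return x < y

    a, b, c, d, e = values
    if not wins(a, b):            # comparison 1
        a, b = b, a
    if not wins(c, d):            # comparison 2
        c, d = d, c
    if not wins(a, c):            # comparison 3
        a, b, c, d = c, d, a, b
    # Now a < b, c < d, a < c: fifteen orders remain.
    if wins(e, c):                # comparison 4: seven orders remain
        if wins(e, a):            # comparison 5: e is first
            if wins(b, c):        # finish as in the four-element case
                result = [e, a, b, c, d]
            elif wins(b, d):
                result = [e, a, c, b, d]
            else:
                result = [e, a, c, d, b]
        elif wins(b, c):          # aebcd or abecd
            result = [a, b, e, c, d] if wins(b, e) else [a, e, b, c, d]
        else:                     # aecbd or aecdb
            result = [a, e, c, b, d] if wins(b, d) else [a, e, c, d, b]
    elif wins(d, e):              # comparison 5: abcde, acbde, acdeb, acdbe
        if wins(b, d):
            result = [a, b, c, d, e] if wins(b, c) else [a, c, b, d, e]
        else:
            result = [a, c, d, b, e] if wins(b, e) else [a, c, d, e, b]
    else:                         # e beats d: abced, acebd, acbed, acedb
        if wins(b, e):
            result = [a, b, c, e, d] if wins(b, c) else [a, c, b, e, d]
        else:
            result = [a, c, e, b, d] if wins(b, d) else [a, c, e, d, b]
    return result, count

# Exhaustive check over every possible input order.
worst = 0
for perm in permutations(range(5)):
    result, used = sort5(perm)
    assert result == [0, 1, 2, 3, 4]
    worst = max(worst, used)
print("worst case comparisons:", worst)
```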