CMPSCI 311 Discussion #11: Bin Packing

David Mix Barrington

4/6 December 2006

Today we consider an optimization problem called BIN-PACKING which is similar to the LOAD-BALANCING problem from Section 11.1 of KT. In BIN-PACKING our input is a set of objects, each with a given size, and a bin size that is at least as big as any object size. Our goal is to partition the objects into as few subsets as possible, such that the total size of the objects in each subset is at most the bin size. The optimization problem is to find the minimum number of bins possible, and the decision problem takes a positive integer k as additional input and asks whether it can be done with k bins.

  1. Prove that the decision version of BIN-PACKING is NP-complete in the general case. (Hint: You can insist that k = 2, as long as the bin and object sizes can be large integers.)

    In Problem 8.26 on HW #5, you proved that the NUMBER-PARTITIONING problem is NP-complete: given a set of objects with positive integer sizes, decide whether they can be partitioned into two subsets of equal total size. This is essentially a special case of the decision version of BIN-PACKING. Given a set of objects of total size S, we set the bin size to S/2 and k to 2, and we have a BIN-PACKING instance where the packing is possible if and only if the partition is possible. (If S is odd, or some object is larger than S/2, both answers are trivially "no", so in that case the reduction can output any fixed "no" instance.) We have thus reduced a known NP-complete problem to BIN-PACKING. The only remaining task is to show that BIN-PACKING is in the class NP. This is easy because a division of the objects into bins can be specified by a string of polynomial length, and in polynomial time we can check that each object occurs in exactly one subset and that each subset has total size at most the bin size.
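
    To make the verification step concrete, here is a minimal Python sketch of such a certificate checker. The function name, and the convention that the certificate lists for each bin the indices of the objects placed in it, are my own illustrative choices.

        def valid_packing(sizes, bin_size, k, bins):
            # "bins" is the certificate: a list of bins, each a list of object indices
            packed = sorted(i for b in bins for i in b)
            return (len(bins) <= k                                    # at most k bins are used
                    and packed == list(range(len(sizes)))             # each object is in exactly one bin
                    and all(sum(sizes[i] for i in b) <= bin_size      # no bin exceeds the bin size
                            for b in bins))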

  2. Now consider the special case of BIN-PACKING where the bin size is 1 and the object sizes are all greater than 1/3. Describe a poly-time algorithm to find the exact optimal number of bins in this case. (Hint: Reduce to a problem from earlier in the course, which is solved by one of our standard methods.)

    First note that every object has size greater than 1/3, so no bin can hold more than two objects, and no two objects of size greater than 1/2 (call these large objects) can share a bin. If we have n objects, k of them large, the only question is how many of the n - k small objects we can put into bins that already contain a large object. Any two small objects fit together, since each has size at most 1/2, so the remaining small objects can be packed two to a bin. Thus if we match m small objects with large ones, we use k + ceiling((n - k - m)/2) bins, and our goal is to match as many small objects to large ones as possible.
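
    As a quick illustration of that count (the function name here is my own choice): with n = 10 objects, k = 4 of them large, and a matching of m = 3, we use 4 + ceiling(3/2) = 6 bins.

        import math

        def bins_used(n, k, m):
            # k bins hold the large objects (m of them also hold a matched small object);
            # the remaining n - k - m small objects are packed two per bin
            return k + math.ceil((n - k - m) / 2)

        bins_used(10, 4, 3)    # 4 + 2 = 6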

    The reduction I had in mind was to MAXIMUM BIPARTITE MATCHING, which we know is solvable in polynomial time by reduction to NETWORK-FLOW. We make a graph with a vertex for each large object on the left, a vertex for each small object on the right, and an edge between a large object and a small object whenever the two fit together in one bin. A maximum matching in this graph gives us the largest possible m and thus the optimal number of bins.
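
    Here is a Python sketch of the whole reduction. Rather than going through NETWORK-FLOW, it finds the maximum matching directly by the standard augmenting-path method; the function names are my own choices, and the sizes are assumed to be given as numbers in (1/3, 1] with bin size 1.

        import math

        def optimal_bins(sizes):
            large = [s for s in sizes if s > 0.5]
            small = [s for s in sizes if s <= 0.5]
            fits = [[j for j, s in enumerate(small) if x + s <= 1] for x in large]
            partner = [None] * len(small)      # partner[j] = large object matched to small object j

            def augment(i, seen):              # try to find a partner for large object i
                for j in fits[i]:
                    if j not in seen:
                        seen.add(j)
                        if partner[j] is None or augment(partner[j], seen):
                            partner[j] = i
                            return True
                return False

            m = sum(augment(i, set()) for i in range(len(large)))
            n, k = len(sizes), len(large)
            return k + math.ceil((n - k - m) / 2)   # k bins for large objects, two leftover small objects per bin

    For example, optimal_bins([0.6, 0.45, 0.45, 0.39]) returns 2: the 0.6 shares a bin with the 0.39, and the two 0.45's share the other bin.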

    Most of you in discussion applied a greedy algorithm to find the maximum matching, first sorting the objects by size and then finding matches by:

        objects.sort()                                 # sizes, ascending; pop() from the end gives the largest
        bins = []
        while objects:
            x = objects.pop()                          # largest object not yet placed
            if objects and x + objects[0] <= 1:        # the smallest remaining object fits with x
                bins.append([x, objects.pop(0)])       # put x in a bin with that smallest object
            else:
                bins.append([x])                       # nothing fits with x, so it gets a bin by itself
      

    The only problem with this solution is that we have to prove it correct, meaning that no strategy can match more pairs than this algorithm does. Suppose an optimal matching forms pairs (a,a'), (b,b'), ..., (z,z'), where we list the larger object first in each pair and put the pairs in order of descending size of the larger object. Note that we can rearrange the smaller elements so that they are in ascending order of size, using an exchange argument: if (b,b') and (c,c') are both valid pairs and b' > c', then (b,c') and (c,b') are both valid pairs, because each has total size at most that of (b,b'). Now replace a', b', ..., z' with the m smallest objects in the set; we still have m valid pairs. We want to show that our greedy algorithm finds at least as many matches as this optimal matching does. Each of the items a, b, ..., z will be matched by the greedy algorithm unless the algorithm has already found some other match for it. The greedy algorithm's matches use the m smallest items in order, and it cannot fail to find these matches: if the item d, for example, is matched with the fourth smallest item, then the greedy algorithm finds its fourth match at or before the time it reaches d.

    We could also change the rule in the code above to pair X with the largest remaining object that fits rather than the smallest, which arguably makes more sense as a greedy choice because we are filling the current bin as full as possible. This works too: whenever we make a match this way we leave a situation at least as good as the one left by making the smallest match, in the sense that for each i the i'th smallest item remaining is no bigger than the i'th smallest item remaining in the other run. This implies that both greedy algorithms find the same number of matches, though they need not find the same matches.
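
    As a quick sanity check of that claim (an experiment, not a proof), here is a small Python snippet of my own comparing the two rules on random instances; the helper name greedy_bins and the instance generator are illustrative choices.

        import random

        def greedy_bins(sizes, pair_with_largest):
            remaining = sorted(sizes)                  # ascending by size
            bins = 0
            while remaining:
                x = remaining.pop()                    # largest object not yet placed
                fits = [j for j, y in enumerate(remaining) if x + y <= 1]
                if fits:                               # pair x with the largest or the smallest object that fits
                    remaining.pop(fits[-1] if pair_with_largest else fits[0])
                bins += 1
            return bins

        for trial in range(1000):
            sizes = [random.uniform(0.34, 1.0) for _ in range(random.randint(1, 12))]
            assert greedy_bins(sizes, True) == greedy_bins(sizes, False)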

  3. (KT Exercise 11.1) Here is a greedy on-line algorithm to pack the objects in an approximately optimum way. Take the objects in the order they come and try to put each one in the next available bin. If it won't fit, declare that bin closed and open a new bin. Show that this algorithm uses at most twice the optimal number of bins. (Hint: Show that if this greedy algorithm uses an odd number 2k+1 of bins, then the sum of the object sizes is greater than k, and that if an even number 2k of bins are used, the sum is also at least k.)

    The key observation is that when we start a new bin, the contents of the old bin plus the new object must have total size greater than the bin size (which we'll call "1", as we can use whatever units we like). This means that the first two bins together contain more than 1 unit of size, that the third and fourth bins together contain more than a unit, and so on. If we use an odd number 2k+1 of bins, then the first k pairs of bins contain more than one unit each, so the total size is greater than k and the optimal algorithm must use at least k+1 bins, more than half the number we used. If we use an even number 2k of bins, the k pairs again contain more than a unit each (for example, we would not have started bin 2k unless the first item placed in it plus the contents of bin 2k-1 came to more than one unit), so again the total size is more than k and we need at least k+1 bins.
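
    For concreteness, here is a short Python sketch of this on-line greedy algorithm (often called "next fit"; the function name and the default bin size are my own choices).

        def next_fit(sizes, bin_size=1.0):
            bins, current = 0, bin_size + 1            # force a new bin for the first object
            for s in sizes:
                if current + s > bin_size:             # the object does not fit in the open bin
                    bins, current = bins + 1, 0.0      # close that bin and open a new one
                current += s
            return bins

    For example, next_fit([0.6, 0.5, 0.5, 0.4]) returns 3, while the optimal packing {0.6, 0.4}, {0.5, 0.5} uses only 2 bins.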

  4. Give an example where the algorithm from Question 3 does as badly as you can manage relative to the optimal. (There is an easy construction where it uses 2 - ε times the optimal number, for any positive number ε.)

    For any positive integer n, let the bin size be n and consider the sequence of 2n objects with sizes n, 1, n, 1, ..., n, 1. The greedy algorithm must put each item in a separate bin, since each new item overflows the bin holding the previous one (n + 1 > n), and thus uses 2n bins. The optimal algorithm could put all the size-1 items in one bin and each size-n item in its own bin, using only n+1 bins. For any positive ε, we can choose n large enough that 2n/(n+1) > 2 - ε.
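
    A tiny numerical check of that last inequality (plain arithmetic, added here only as an illustration):

        for n in (10, 100, 1000):
            print(n, 2 * n / (n + 1))                  # 1.818..., 1.980..., 1.998...: the ratio approaches 2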

Last modified 7 December 2006