The Complexity of Fairness through Equilibrium

Abraham Othman, Christos Papadimitriou, Aviad Rubinstein

Introduction

University classes have limited capacity, and some are more popular than others. This creates an interesting allocation problem. Imagine that each student has ordered all possible bundles of courses from most desirable to least desirable, and the capacities of classes are known. What is the best way to allocate class seats to students? There are several desiderata for a course allocation mechanism:

Are all possible seats in courses allocated?

Are students motivated to honestly report their preferences to the mechanism?

Can the allocation be computed from the data in polynomial time?

Competitive Equilibrium from Equal Incomes (CEEI) (Foley, 1967; Varian, 1974; Thomson and Varian, 1985) is a venerable mechanism with many attractive properties. In CEEI all agents are allocated the same amount of “funny money”; next they declare their preferences, and then a price equilibrium is found that clears the market. The market clearing guarantees efficiency and feasibility. The mechanism has a strong, albeit technical, ex post fairness guarantee that emerges from the notion that agents who miss out on a valuable, competitive item will have extra funny money to spend on other items at equilibrium. Truthfulness is problematic, as usual with market mechanisms, even though the problem is mitigated by the large number of agents. However, CEEI works only when the resources to be allocated are divisible and the utilities relatively benign. It is easy to construct examples in which a CEEI does not exist when preferences are complex or the resources being allocated are not divisible. Indeed, both issues arise in practice in a variety of allocation problems, including shifts to workers, landing slots to airplanes, and our favorite, courses to students (Varian, 1974; Budish, 2011).

It was recently shown in Budish (2011) that an approximation to a CEEI solution, called A-CEEI, exists even when the resources are indivisible and agent preferences are arbitrarily complex, as required by the course allocation problems one sees in practice. The approximate solution guaranteed to exist is approximately fair (in that the agents are given almost the same budget), and approximately efficient and feasible (in that all classes are filled close to capacity, with the possible exception of very unpopular classes). This result seems to be wonderful news for the class allocation problem. However, there is a catch: Budish’s proof is non-constructive as it relies on Kakutani’s fixed-point theorem.

A heuristic algorithm for solving A-CEEI was introduced in Othman et al. (2010). The algorithm is a modified search analogue to the traditional tâtonnement process, where the prices of courses that are oversubscribed are increased, and the prices of courses that are undersubscribed are decreased. This heuristic algorithm is currently used by the Wharton School (University of Pennsylvania) to assign their MBA students to courses. It has been documented that the heuristic algorithm often produces much tighter approximations than the theoretical bound; yet, on some instances it fails to find even the guaranteed approximation (Budish, 2011, Section 9).
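The price-update idea behind tâtonnement can be sketched in a few lines. This is only an illustrative toy and not the search heuristic of Othman et al. (2010): the function names, the fixed step size, and the representation of demand (one 0/1 bundle vector per student) are all assumptions made for this example.

```python
from typing import List, Sequence


def excess_demand(chosen: Sequence[Sequence[int]],
                  capacities: Sequence[int]) -> List[int]:
    """Seats demanded minus seats available, per course."""
    return [sum(bundle[j] for bundle in chosen) - capacities[j]
            for j in range(len(capacities))]


def tatonnement_step(prices: Sequence[float], excess: Sequence[int],
                     step: float = 0.1) -> List[float]:
    """Raise prices of oversubscribed courses, lower prices of
    undersubscribed ones, and never let a price drop below zero."""
    return [max(0.0, p + step * z) for p, z in zip(prices, excess)]


# Two courses with one seat each; both students currently demand course 0.
chosen = [[1, 0], [1, 0]]
z = excess_demand(chosen, capacities=[1, 1])
new_prices = tatonnement_step([0.5, 0.5], z)
```

In a full procedure the students' demands would be recomputed at the new prices and the step repeated; nothing in this sketch guarantees convergence.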

Thus A-CEEI is a problem where practical interest motivates theoretical inquiry. We have a theorem that guarantees the existence of an approximate equilibrium — the issue is finding it. Can the heuristic of Othman et al. (2010) be replaced by a fast and rigorous algorithm for finding an approximate CEEI? Or are there complexity obstacles to approximating CEEI?

In this paper, we show that finding the guaranteed approximation to CEEI is an intractable problem:

Theorem 2, informal statement. The problem of finding an A-CEEI as guaranteed by Budish (2011) is $\PPAD$ -complete.

We also show an essentially optimal $\NP$ -hardness result for determining whether a better approximation exists.

Theorem 3, informal statement. It is $\NP$ -hard to distinguish between an instance where an exact CEEI exists, and one in which there is no A-CEEI tighter than guaranteed in Budish (2011).

The Course Allocation Problem

Even though the A-CEEI and the existence theorem in Budish (2011) are applicable to a broad range of allocation problems, we shall describe our results in the language of the course allocation problem.

We are given a set of $M$ courses with integer capacities (the supply) $(q_{j})_{j=1}^{M}$ , and a set of $N$ students, where each student $i$ has a set $\Psi_{i}\subseteq 2^{M}$ of permissible course bundles, with each bundle containing at most $k\leq M$ courses. The set $\Psi_{i}$ encodes both scheduling constraints (e.g., courses that meet at the same time) and any constraints specific to student $i$ (e.g. prerequisites).

Each student $i$ has a strict ordering over her permissible schedules, denoted by $\preccurlyeq_{i}$ . We allow arbitrarily complex preferences — in particular, students may regard courses as substitutes or complements. More formally:

Course Allocation Problem The input to a course allocation problem consists of:

For each student $i$, a set of permissible course bundles, $\left(\Psi_{i}\right)_{i=1}^{N}$,

The students’ reported preferences, $\left(\preccurlyeq_{i}\right)_{i=1}^{N}$, and

The course capacities, $\left(q_{j}\right)_{j=1}^{M}$.

The output to a course allocation problem consists of:

Prices for each course $(p_{j}^{*})_{j=1}^{M}$ ,

Allocations for each student $(x_{i}^{*})_{i=1}^{N}$ , and

Budgets for each student $\left(b_{i}^{*}\right)_{i=1}^{N}$ .

How is an allocation evaluated? The clearing error of a solution to the allocation problem is the $\mathcal{L}_{2}$ norm of the length-$M$ vector whose $j$-th entry is the number of oversubscribed seats in course $j$, or the number of undersubscribed seats if course $j$ has a positive price.

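The clearing error $\alpha$ of an allocation is

$$\alpha\equiv\sqrt{\sum_{j=1}^{M}z_{j}^{2}},\qquad z_{j}=\begin{cases}\sum_{i}x_{ij}^{*}-q_{j}&\text{if }p_{j}^{*}>0,\\ \max\left(\sum_{i}x_{ij}^{*}-q_{j},\,0\right)&\text{if }p_{j}^{*}=0,\end{cases}$$

where $x_{ij}^{*}\in\{0,1\}$ indicates whether course $j$ belongs to the bundle $x_{i}^{*}$ allocated to student $i$.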

We can now define the notion of approximate CEEI. The quality of approximation is characterized by two parameters: $\alpha$ , the clearing error (how far is our solution from a true competitive equilibrium?) and $\beta$ , the bound on the difference in budgets (how far from equal are the budgets?). Informally, $\alpha$ can be thought of as the approximation loss on efficiency, and $\beta$ can be thought of as the approximation loss on fairness.

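An allocation is an $(\alpha,\beta)$-CEEI if:

Each student is allocated their most preferred affordable bundle. Formally, for every student $i$, $x_{i}^{*}$ is the $\preccurlyeq_{i}$-maximal bundle in $\left\{x\in\Psi_{i}:\sum_{j\in x}p_{j}^{*}\leq b_{i}^{*}\right\}$.

Total clearing error is at most $\alpha$.

Every budget satisfies $b_{i}^{*}\in\left[1,1+\beta\right]$.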
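The conditions above can be checked mechanically. The following is a minimal sketch under stated assumptions: bundles are sets of course indices, each student's permissible bundles are given as a list ordered from most to least preferred, a student who can afford nothing receives the empty bundle, and budgets are required to lie in $[1,1+\beta]$ (the range used in Appendix A). The function names are hypothetical.

```python
import math
from typing import FrozenSet, Sequence

Bundle = FrozenSet[int]  # a bundle is a set of course indices


def clearing_error(alloc: Sequence[Bundle], prices: Sequence[float],
                   capacities: Sequence[int]) -> float:
    """L2 norm of z, where z_j is the oversubscription of course j;
    undersubscription counts only when the course has a positive price."""
    total = 0.0
    for j, q in enumerate(capacities):
        z = sum(1 for x in alloc if j in x) - q
        if prices[j] == 0:
            z = max(z, 0)  # a free course may be left with empty seats
        total += z * z
    return math.sqrt(total)


def is_approx_ceei(alloc, prices, budgets, prefs, capacities,
                   alpha: float, beta: float) -> bool:
    """prefs[i] lists student i's permissible bundles, most preferred first."""
    for x, b, ranked in zip(alloc, budgets, prefs):
        if not (1 <= b <= 1 + beta):           # assumed budget condition
            return False
        affordable = [y for y in ranked if sum(prices[j] for j in y) <= b]
        best = affordable[0] if affordable else frozenset()
        if x != best:                           # must get the top affordable bundle
            return False
    return clearing_error(alloc, prices, capacities) <= alpha


c0, c1 = frozenset({0}), frozenset({1})
prefs = [[c0, c1], [c0, c1]]                    # both students prefer course 0
exact = is_approx_ceei([c0, c1], [1.02, 0.0], [1.05, 1.0], prefs, [1, 1],
                       alpha=0.0, beta=0.1)
```

In the usage example the slightly richer student outbids the other for course 0, so unequal budgets break the tie and the market clears exactly.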

In Budish (2011) it is proved that an $(\alpha,\beta)$ -approximate CEEI always exists, for some quite favorable (and as we shall see, essentially optimal) values of $\alpha$ and $\beta$ :

Theorem 1 (Budish, 2011). For any input preferences, there exists an $(\alpha,\beta)$-CEEI with $\alpha=\sqrt{kM/2}$ and any $\beta>0$.

Recall that $k$ is the maximum bundle size. Since $\alpha=\sqrt{kM/2}$ does not depend on the number of students or on the course capacities, the market-clearing error converges to zero quite fast as a fraction of the endowment when the number of students and the course capacities are large. It is also shown in Budish (2011) that the mechanism which allocates courses according to such an A-CEEI satisfies attractive criteria of approximate fairness, approximate truthfulness, and approximate Pareto efficiency. The reader may consult Budish (2011) for the precise definitions of the economic properties of the A-CEEI mechanism.
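To get a feel for the size of the bound, the sketch below evaluates $\sqrt{kM/2}$ and divides it by the total number of seats. The function names and the example numbers (50 courses, bundles of at most 5 courses) are illustrative assumptions, not data from any real allocation.

```python
import math
from typing import Sequence


def budish_bound(k: int, num_courses: int) -> float:
    """Clearing-error bound sqrt(k * M / 2) from the existence theorem."""
    return math.sqrt(k * num_courses / 2)


def bound_as_fraction_of_seats(k: int, capacities: Sequence[int]) -> float:
    """The bound divided by the total number of seats (the endowment)."""
    return budish_bound(k, len(capacities)) / sum(capacities)


small = bound_as_fraction_of_seats(5, [20] * 50)    # 50 courses, 20 seats each
large = bound_as_fraction_of_seats(5, [200] * 50)   # same courses, 10x the seats
```

Because the bound depends only on $k$ and $M$, multiplying every capacity by ten divides its relative size by ten.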

Theorem 1 is an example of a non-constructive existential result; such theorems are common in mathematics, and are quite often related to economics (recall Nash’s theorem, Arrow-Debreu theorem, etc.). It is often important to determine whether there is a polynomial algorithm for finding the solution guaranteed by such a theorem; computational problems of this nature are called total, because they correspond to total functions from inputs to solutions.

In exploring the difficulty of total problems, applying the methodology of $\NP$-completeness is problematic. The intuitive reason is that, for example, a reduction from 3SAT relies heavily on the fact that the starting 3SAT instance may be unsatisfiable. Therefore 3SAT cannot be reduced in any meaningful way to a total problem such as A-CEEI (see Chapter 2 of Nisan et al. (2007) for a discussion of this point). $\NP$-completeness does not seem to be an option. But there is an alternative: Total problems can often be proved complete for certain complexity classes between $\P$ and $\NP$. For example, during the past decade several game-theoretic problems have been proved complete for the complexity class $\PPAD$, which contains difficult problems related to fixed point theorems such as Brouwer’s, Nash’s, competitive equilibria, and so on (Papadimitriou (1994); Abbott et al. (2005); Codenotti et al. (2006); Huang and Teng (2007); Chen et al. (2009); Chen and Teng (2009); Daskalakis et al. (2009); Kintali et al. (2009); Palvolgyi (2009); Chen and Teng (2011); Vazirani and Yannakakis (2011); Chen et al. (2013)).

There are several interesting and subtle ways of defining $\PPAD$ , but for our purposes it is most convenient to define it as the class of all total problems that are reducible to the problem Gcircuit, the problem of finding the fixed point of a continuous function specified by a “generalized circuit”. Gcircuit is defined in the next section.

A-CEEI is $\PPAD$-Complete

Theorem 2. Computing a $\left(\sqrt{\frac{kM}{2}},\beta\right)$-CEEI is $\PPAD$-complete, for some polynomially small $\beta>0$.

The rest of this section is devoted to the proof of this theorem.

We first establish that the problem belongs to the class $\PPAD$; this proof is much harder than usual (see Appendix A). We follow the steps of the existence proof in Budish (2011), and show that each one can be carried out either in polynomial time, or through a fixed point. One difficulty is that certain steps of Budish’s proof are randomized, and these must be derandomized in polynomial time.

The problem Gcircuit

The reduction is from the PPAD-complete problem Gcircuit, alluded to in the previous section.

Generalized circuits are similar to the standard algebraic circuits, the main difference being that generalized circuits contain cycles, which allow them to verify fixed points of continuous functions. Formally,

A generalized circuit $S$ is a pair $(V,\cal{T})$, where $V$ is a set of nodes and $\cal{T}$ is a collection of gates. Every gate $T\in\cal{T}$ is a 4-tuple $T=(G,v_{1},v_{2},v)$, in which $G\in\{G_{/2},G_{\frac{1}{2}},G_{+},G_{-},G_{<},G_{\wedge},G_{\vee},G_{\neg}\}$ is the type of the gate; $v_{1},v_{2}\in V\cup\{nil\}$ are the first and second input nodes of the gate; and $v\in V$ is the output node. (Chen et al. (2009) define slightly different gates, whose $\epsilon$-approximation can be simulated by $O(\log 1/\epsilon)$ of our gates: $G_{\zeta}$ can be simulated using $G_{\frac{1}{2}}$ and $G_{+}$; and $G_{\times\zeta}$ and $G_{=}$ can be simulated using $G_{/2}$ and $G_{+}$.)

The collection $\cal{T}$ of gates must satisfy the following important property: For every two gates $T=(G,v_{1},v_{2},v)$ and $T^{\prime}=(G^{\prime},v^{\prime}_{1},v^{\prime}_{2},v^{\prime})$ in $\cal{T}$ , $v\neq v^{\prime}$ .

Given a generalized circuit, we are interested in the computational problem of finding an assignment that simultaneously satisfies all the constraints defined by the gates.

VALUE: $f_{G_{\frac{1}{2}}}\equiv\frac{1}{2}$

SUM: $f_{G_{+}}\left(x,y\right)=\min\left(x+y,1\right)$

DIFF: $f_{G_{-}}\left(x,y\right)=\max\left(x-y,0\right)$

LESS: $f_{G_{<}}\left(x,y\right)=\begin{cases}1&x>y+\beta\\ 0&y>x+\beta\end{cases}$

AND: $f_{G_{\wedge}}\left(x,y\right)=\begin{cases}1&\left(x>\frac{1}{2}+\beta\right)\wedge\left(y>\frac{1}{2}+\beta\right)\\ 0&\left(x<\frac{1}{2}-\beta\right)\vee\left(y<\frac{1}{2}-\beta\right)\end{cases}$

OR: $f_{G_{\vee}}\left(x,y\right)=\begin{cases}1&\left(x>\frac{1}{2}+\beta\right)\vee\left(y>\frac{1}{2}+\beta\right)\\ 0&\left(x<\frac{1}{2}-\beta\right)\wedge\left(y<\frac{1}{2}-\beta\right)\end{cases}$

Given a generalized circuit $\cal{S}=(V,\cal{T})$ , $\epsilon$ -Gcircuit is the problem of finding an assignment that $\epsilon$ -approximately satisfies it. It is shown in Chen et al. (2009) to be $\PPAD$ -complete for $\epsilon={1\over\poly(|V|)}$ .
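A small checker makes the role of cycles concrete. This is a sketch under explicit assumptions: it covers only the six gate constraints listed above, it takes "$\epsilon$-approximately satisfies" to mean that each output is within $\epsilon$ of the value the gate prescribes, and it treats the comparison and Boolean gates as unconstrained wherever neither case of their definition applies. The names `target` and `approx_satisfies` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Gate = Tuple[str, Optional[str], Optional[str], str]  # (type, v1, v2, v)


def target(kind: str, x: float, y: float, beta: float) -> Optional[float]:
    """Value required at the output node, or None when the gate
    places no constraint on it (inputs too close to the threshold)."""
    if kind == "VALUE":
        return 0.5
    if kind == "SUM":
        return min(x + y, 1.0)
    if kind == "DIFF":
        return max(x - y, 0.0)
    if kind == "LESS":  # cases exactly as listed in the text
        if x > y + beta:
            return 1.0
        if y > x + beta:
            return 0.0
        return None
    if kind == "AND":
        if x > 0.5 + beta and y > 0.5 + beta:
            return 1.0
        if x < 0.5 - beta or y < 0.5 - beta:
            return 0.0
        return None
    if kind == "OR":
        if x > 0.5 + beta or y > 0.5 + beta:
            return 1.0
        if x < 0.5 - beta and y < 0.5 - beta:
            return 0.0
        return None
    raise ValueError(f"unknown gate type: {kind}")


def approx_satisfies(values: Dict[str, float], gates: List[Gate],
                     eps: float, beta: float) -> bool:
    """True if every gate's output is within eps of its prescribed value."""
    for kind, v1, v2, v in gates:
        x = values[v1] if v1 is not None else 0.0
        y = values[v2] if v2 is not None else 0.0
        want = target(kind, x, y, beta)
        if want is not None and abs(values[v] - want) > eps:
            return False
    return True


# A circuit with a cycle: node b is both an input and the output of a SUM gate.
gates = [("VALUE", None, None, "a"), ("SUM", "a", "b", "b")]
good = approx_satisfies({"a": 0.5, "b": 1.0}, gates, eps=0.01, beta=0.01)
bad = approx_satisfies({"a": 0.5, "b": 0.3}, gates, eps=0.01, beta=0.01)
```

Checking a candidate assignment is easy; the hardness lies in finding one, since the cycle means the values cannot simply be computed gate by gate.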

Overview of the Reduction

We shall reduce the $\epsilon$ -Gcircuit problem to that of finding an $(\alpha,\beta)$ -CEEI, with approximation parameters $\alpha=\Theta(N/M)$ and $\epsilon=\beta/2$ . (Note that, by increasing $N$ , we can make $\alpha$ arbitrarily large as a function of $M$ ; in particular, $\alpha>\sqrt{kM/2}$ .)

We will construct gadgets (that is, small sets of courses, students, capacities and preferences) for the various types of gates in the generalized circuit. Each gadget that we construct has one or more dedicated “input course(s)”, a single “output course”, and possibly some “interior courses”. An output course of one gadget can (and will) be an input to another. The construction will guarantee that in any A-CEEI the price of the output course will be approximately equal to the gate applied to the prices of the input courses.

Gate gadgets

To illustrate what needs to be done, we proceed to construct a gate for the function $f_{G_{\neg}}\left(x\right)=1-x$ ; in particular, this implements a logic NOT.

Lemma 1 (NOT gadget). Let $n_{x}>4\alpha$ and suppose that the economy contains the following courses and students:

$c_{x}$ (the “input course”);

$c_{{1-x}}$ with capacity $q_{{1-x}}=n_{x}/2$ (the “output course”);

$n_{x}$ students interested only in the schedule $\left\{c_{x},c_{{1-x}}\right\}$;

and suppose further that at most $n_{{1-x}}=n_{x}/4$ other students are interested in course $c_{{1-x}}$.

Then in any $\left(\alpha,\beta\right)$-CEEI, $p_{{1-x}}^{*}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$.

Proof. If $p_{{1-x}}^{*}>1-p_{x}^{*}+\beta$, then none of the $n_{x}$ students will be able to afford the bundle $\left\{c_{x},c_{{1-x}}\right\}$, and therefore there will be at most $n_{{1-x}}=n_{x}/4$ students enrolled in $c_{{1-x}}$, far fewer than the capacity $n_{x}/2$. Therefore the course is undersubscribed by at least $n_{x}/4$, i.e., $\left|z_{{1-x}}\right|\geq n_{x}/4$.

On the other hand, if $p_{{1-x}}^{*}<1-p_{x}^{*}$, then all $n_{x}$ students can afford the bundle $\left\{c_{x},c_{{1-x}}\right\}$; therefore the class will be overbooked by at least $n_{x}/2$, i.e., $z_{{1-x}}\geq n_{x}/2$.

Therefore if $p_{{1-x}}^{*}\notin\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$, then $\left\|z\right\|_{2}\geq n_{x}/4>\alpha$, a contradiction to $\left(\alpha,\beta\right)$-CEEI. ∎
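The two cases of the argument can be replayed numerically. The sketch below is an illustration only: the function name, the choice $n_{x}=8$, and the specific prices and budgets are assumptions made for the example.

```python
from typing import Sequence


def not_gadget_excess(p_x: float, p_out: float, budgets: Sequence[float],
                      others_enrolled: int = 0) -> float:
    """Demand minus capacity for the output course c_{1-x}.

    budgets holds the budgets of the n_x gadget students, each of whom
    wants only the bundle {c_x, c_{1-x}}; the capacity is n_x / 2.
    """
    n_x = len(budgets)
    buyers = sum(1 for b in budgets if p_x + p_out <= b)
    return buyers + others_enrolled - n_x / 2


beta = 0.05
budgets = [1 + beta * i / 7 for i in range(8)]   # n_x = 8 budgets in [1, 1 + beta]
p_x = 0.3

# Output price above 1 - p_x + beta: nobody can afford the bundle.
too_high = not_gadget_excess(p_x, 0.80, budgets, others_enrolled=2)
# Output price below 1 - p_x: everybody can afford the bundle.
too_low = not_gadget_excess(p_x, 0.60, budgets)
```

Even with the maximum allowed $n_{x}/4=2$ outside students, the overpriced course is short by $n_{x}/4$ seats, and the underpriced one is overbooked by $n_{x}/2$.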

Similarly, we construct gadgets that simulate all the gates of the generalized circuit:

Lemma 2. Let $n_{x}\geq 2^{8}\cdot\alpha$ and suppose that the economy has courses $c_{x}$ and $c_{y}$. Then for any of the gate functions $f_{G}$ in the definition of $\epsilon$-Gcircuit, we can add a course $c_{z}$ and at most $n_{x}$ students interested in each of $c_{x}$ and $c_{y}$, such that in any $\left(\alpha,\beta\right)$-CEEI $p_{z}^{*}\in\left[f_{G}\left(p_{x}^{*},p_{y}^{*}\right)-2\beta,f_{G}\left(p_{x}^{*},p_{y}^{*}\right)+2\beta\right]$.

In particular, $p_{z}^{*}$ continues to satisfy the above inequalities in every $\left(\alpha,\beta\right)$-CEEI even if up to $n_{z}\leq n_{x}/2^{8}$ additional students (beyond the ones needed in the proof) are interested in course $c_{z}$.

We defer the proof of Lemma 2 to the appendix.

Course-size amplification

So far, we have constructed gadgets that compute all the gates necessary for the circuit in the reduction from $\epsilon$ -Gcircuit. What happens when we try to concatenate them to form a circuit? Recall the last sentence in the statement of Lemma 2: It says that the prices continue to behave like the gate that is simulated, as long as there are not too many additional students that try to take the output course. (If there are more students, they may raise the price of the course beyond what we expect.) In particular, the number of additional students that may want the output course is smaller than the number of students that want the input course.

If we concatenated the gadgets without change, we would need to have larger class sizes as we increase the depth of the simulated circuit. This increase in class size is exponential in the depth of the circuit. Things get even worse: since we reduce from generalized circuits, our gates form cycles. If the class size had to increase at every gate, it would have to be infinite!

To overcome this problem we construct a COPY gadget that preserves the price from the input course, but is robust to twice as many additional students:

Lemma 3 (COPY gadget). Let $n_{x}\geq 100\alpha$ and suppose that the economy contains the following courses and students:

$c_{x}$ (the “input course”);

for $i=1,\dots,10$, $c_{i}$ with capacities $q_{i}=0.5\cdot n_{x}$ (“interior courses”);

$c_{x^{\prime}}$ with capacity $q_{x^{\prime}}$, s.t. $q_{x}\leq q_{x^{\prime}}\leq 4n_{x}$ (“output course”);

$n_{x}$ students interested in schedules $\left(\left\{c_{x},c_{i}\right\}\right)_{i=1}^{10}$ (in this order);

for each $i$, $n_{i}=0.49\cdot n_{x}$ students interested in schedules $\left(\left\{c_{x^{\prime}},c_{i}\right\},\left\{c_{i}\right\},\left\{c_{i+1}\right\},\dots,\left\{c_{10}\right\}\right)$ (in this order);

and suppose further that at most $n_{x^{\prime}}=2n_{x}$ other students are interested in course $c_{x^{\prime}}$.

Then in any $\left(\alpha,\beta\right)$-CEEI, $p_{x^{\prime}}^{*}\in\left[p_{x}^{*}-\beta,p_{x}^{*}+\beta\right]$.

In particular, notice that the price of $c_{x^{\prime}}$ is guaranteed to approximate the price of $c_{x}$, even in the presence of an additional $n_{x^{\prime}}=2n_{x}$ students, twice as many students as we added to $c_{x}$.

We start by proving that all the $c_{i}$ ’s simulate NOT gadgets simultaneously, i.e. for every $i$ and every $\left(\alpha,\beta\right)$ -CEEI, $p_{i}^{*}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$ .

If $p_{i}^{*}>1-p_{x}^{*}+\beta$ , assume wlog that it is the first such $i$ , i.e. $p_{j}^{*}\leq 1-p_{x}^{*}+\beta<p_{i}^{*}$ for every $j<i$ .

None of the $n_{x}$ students can afford buying both $c_{x}$ and $c_{i}$ . Furthermore, for every $j<i$ , none of the $n_{j}$ students will prefer $c_{i}$ over $c_{j}$ . Therefore at most $n_{i}$ students will take this course: $z_{i}^{*}\geq 0.01n_{x}$ .

If, on the other hand, $p_{i}^{*}<1-p_{x}^{*}$, then all $n_{x}$ students will buy course $c_{i}$ or some previous course $c_{j}$ (for $j\leq i$); additionally, for every $j\leq i$, each of the $n_{j}$ corresponding students will buy some course $c_{k}$ for $j\leq k\leq i$. Therefore the total overbooking of classes $1,\dots,i$ will be at least $\sum_{j\leq i}z_{j}^{*}\geq n_{x}\cdot\left(1-0.01i\right)$, a contradiction to $\left(\alpha,\beta\right)$-CEEI.

Now that we established that $p_{i}^{*}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$ , we shall prove the main claim, i.e. that $p_{x^{\prime}}^{*}\in\left[p_{x}^{*}-\beta,p_{x}^{*}+\beta\right]$ .

If $p_{x^{\prime}}^{*}>p_{x}^{*}+\beta$, then none of the $n_{i}$ students, for any $i$, can afford buying both $c_{x^{\prime}}$ and $c_{i}$. Therefore, even in the presence of additional $n_{x^{\prime}}=2n_{x}$ students who want to take $c_{x^{\prime}}$, the class will be undersubscribed by $z_{x^{\prime}}^{*}\geq q_{x^{\prime}}-n_{x^{\prime}}=2n_{x}$.

If $p_{x^{\prime}}^{*}<p_{x}^{*}-\beta$, then all $n_{i}$ students, for each $i$, can afford to buy their top schedule $\left\{c_{x^{\prime}},c_{i}\right\}$. Therefore $c_{x^{\prime}}$ will be oversubscribed by at least $z_{x^{\prime}}^{*}\geq 0.9\cdot n_{x}$, a contradiction to $\left(\alpha,\beta\right)$-CEEI.
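The counting in the COPY gadget proof is easy to verify with exact arithmetic. The sketch below is only a sanity check of the numbers quoted in the proof; the concrete value of $n_{x}$ and the function names are assumptions, and the output course is given its largest allowed capacity $4n_{x}$.

```python
from fractions import Fraction

n_x = Fraction(1000)            # any positive value works; Fractions avoid rounding
n_i = Fraction(49, 100) * n_x   # students attached to each interior course
q_i = Fraction(1, 2) * n_x      # capacity of each interior course
q_out_max = 4 * n_x             # largest allowed capacity of the output course


def interior_shortfall() -> Fraction:
    """Empty seats in c_i when only its own n_i students enroll."""
    return q_i - n_i


def interior_overbooking(i: int) -> Fraction:
    """Students pushed into courses c_1..c_i minus their total capacity,
    in the case where c_i is priced too low."""
    return n_x + i * n_i - i * q_i


def output_overbooking() -> Fraction:
    """All 10 * n_i students enroll in the output course when it is
    priced too low, against a capacity of at most 4 * n_x."""
    return 10 * n_i - q_out_max
```

With ten interior courses the overbooking in the second case stays at or above $0.9\,n_{x}$ for every $i\leq 10$, which is why the constant $0.49$ leaves enough slack.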

Finally, given an instance of $\epsilon$ -Gcircuit, we can use the gadgets we constructed in Lemmata 1-3 to construct an instance of $(\alpha,\beta)$ -CEEI that simulates the generalized circuit.

$\NP$-hardness

Budish (2011) shows that his existence theorem is tight, that is, there exist economies in which it is impossible to achieve less than $\Omega\left(\sqrt{kM}\right)$ market clearing error. One may hope that on instances encountered in practice, a better approximation may be possible, and finding it may not be prohibitively hard. We next show that even in economies that admit an exact CEEI, it is $\NP$ -hard to find even a constant factor improvement over the $\Omega\left(\sqrt{kM}\right)$ bound.

Theorem 3. It is $\NP$-hard to distinguish between an economy that has an exact CEEI, and an economy that does not have a $\left(\Omega\left(\sqrt{N+M}\right),\beta\right)$-CEEI for any $0\leq\beta<1$.

In particular, since our reduction uses a constant $k$ , it means that it is $\NP$ -complete to find an $\left(\Omega\left(\sqrt{kM}\right),\beta\right)$ -CEEI — an approximation factor smaller only by a multiplicative constant than the approximation guaranteed by the existence theorem of Budish (2011).

Theorem 2 is in some sense stronger than Theorem 3 in that it applies to a larger market clearing error. In turn, Theorem 3 is stronger in two ways: (1) it gives $\NP$ -hardness, as opposed to $\PPAD$ -hardness; and (2) it applies to any $0\leq\beta<1$ , as opposed to a polynomially small $\beta$ .

Proof

We reduce from 3SAT-5, i.e., a SAT instance in which every clause contains exactly 3 variables, and each variable appears in exactly 5 clauses. Feige (1998) proved that it is $\NP$-hard to distinguish between a satisfiable 3SAT-5 instance and a 3SAT-5 instance where at most a $(1-\epsilon)$-fraction of the clauses can be satisfied, for some $\epsilon>0$. (In fact, an equivalent result for 3SAT-$B$ for any constant $B$ would suffice for our techniques. Hardness of approximation with perfect completeness for 3SAT-$B$ was proven by Papadimitriou and Yannakakis (1991); Arora and Safra (1998); Arora et al. (1998).)

Given a 3SAT-5 formula, we construct a gadget for each variable and each clause. The gadgets are constructed so that for any assignment that completely satisfies the formula there exists an exact CEEI in the economy.

Furthermore, given an approximate CEEI for the economy which exactly clears the courses in a subset of the gadgets, one can recover an assignment for the 3SAT-5 formula that satisfies all the clauses corresponding to the same subset. Informally, this means that for every clause that we are unable to satisfy in the 3SAT-5 formula, there must be a deviation from exact market clearing in the gadget corresponding to either that clause, or one of its variables.

Because we use a sparse 3SAT, each deviation from market clearing can affect at most 5 clauses. Each variable gadget uses $13$ courses, and each clause gadget uses only $1$ more. For an instance with $n$ clauses and $\frac{3}{5}n$ variables, we have exactly $M=\frac{44}{5}n$ courses. Finally, if $\epsilon n$ of the clauses are unsatisfied, then the market clearing error must be at least $\sqrt{\frac{1}{5}\cdot\epsilon n}=\sqrt{\frac{\epsilon}{44}\cdot M}$ . Since $N<M$ and $\epsilon>0$ is a constant, we get $\NP$ -hardness with $\alpha=\Omega\left(\sqrt{M+N}\right)$ .
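The counts in this paragraph can be double-checked with exact arithmetic. The sketch below is a sanity check only; the function names are hypothetical, and the student count (two per variable gadget, one per clause gadget) is read off the gadget descriptions in this section.

```python
import math
from fractions import Fraction


def gadget_counts(n_clauses: int):
    """Numbers of courses and students for a 3SAT-5 instance with n clauses."""
    n_vars = Fraction(3, 5) * n_clauses      # 3n literal slots, 5 per variable
    courses = 13 * n_vars + 1 * n_clauses    # 13 per variable gadget, 1 more per clause
    students = 2 * n_vars + 1 * n_clauses    # s_T and s_F per variable, 1 per clause
    return courses, students


def error_lower_bound(eps: Fraction, n_clauses: int) -> float:
    """sqrt(eps * n / 5): each deviation can account for at most 5 clauses."""
    return math.sqrt(eps * n_clauses / 5)


M, N = gadget_counts(100)
bound = error_lower_bound(Fraction(1, 10), 100)
```

For $n=100$ this gives $M=880=\frac{44}{5}n$ and $N=220<M$, and the bound agrees with $\sqrt{\frac{\epsilon}{44}M}$.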

For each variable $x_{i}$, we have a variable gadget that forces a consistent assignment to $x_{i}$. The gadget contains $5$ pairs of “output courses” $O_{T}^{j},O_{F}^{j}$; each of these pairs is also part of the “input courses” of a clause gadget. Additionally, the gadget has three inner courses: $D_{L},D_{C},D_{R}$. The gadget also has two students: $s_{T}$ has preference list $\left\{D_{L},D_{C}\right\}$, $\left\{D_{L},O_{T}^{1},\dots,O_{T}^{5}\right\}$, $\left\{D_{R}\right\}$; and $s_{F}$ has preference list $\left\{D_{R},D_{C}\right\}$, $\left\{D_{R},O_{F}^{1},\dots,O_{F}^{5}\right\}$, $\left\{D_{L}\right\}$.

Soundness: It is easy to see that, in any CEEI, $x_{i}$ cannot be assigned more than one value: otherwise neither student will be assigned $D_{C}$ ; yet if $D_{C}$ has price zero, then both students would prefer the respective bundles that contain it.

If, on the other hand, neither $O_{T}^{j}$ nor $O_{F}^{j}$ is assigned, we must again have a nonzero market clearing error for the courses in this gadget:

If all the inner courses have price zero, then $D_{C}$ will be over demanded;

If $D_{L}$ and $D_{R}$ have price zero, then under any assignment either one of the three will be over demanded, or $D_{C}$ will be under demanded;

If $p\left(D_{C}\right)=0$ , $p\left(D_{L}\right)>0$ , and, wlog, $p\left(D_{L}\right)\geq p\left(D_{R}\right)$ , then either $D_{C}$ will be over demanded, or $D_{L}$ will be under demanded;

Finally, since $\beta<1$ , if $D_{C}$ has nonzero price, then either it is under demanded, or one of the three inner courses must be over demanded.

Completeness: For an assignment with $x_{i}=\mbox{True}$, let the prices of $O_{T}^{j},O_{F}^{j},D_{L},D_{C},D_{R}$ be $\frac{1}{6},0,\frac{1}{6},1,0$, respectively. Recall that in the completeness argument we show the existence of an exact CEEI, so all the budgets are exactly $1$. Under these prices, student $s_{T}$ will choose bundle $\left\{D_{L},O_{T}^{1},\dots,O_{T}^{5}\right\}$, which costs exactly $1$ (her first choice $\left\{D_{L},D_{C}\right\}$ costs $\frac{7}{6}$ and is unaffordable), while student $s_{F}$ will choose bundle $\left\{D_{R},D_{C}\right\}$.
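The completeness claim for the variable gadget can be replayed directly. The sketch below encodes the prices and preference lists stated above and picks each student's first affordable bundle at budget $1$; the string names for the courses and the helper `best_affordable` are hypothetical.

```python
from fractions import Fraction

O_T = [f"O_T{j}" for j in range(1, 6)]
O_F = [f"O_F{j}" for j in range(1, 6)]

# Prices for the case x_i = True, as in the completeness argument.
price = {c: Fraction(1, 6) for c in O_T}
price.update({c: Fraction(0) for c in O_F})
price.update({"D_L": Fraction(1, 6), "D_C": Fraction(1), "D_R": Fraction(0)})

# Preference lists, most preferred bundle first.
prefs = {
    "s_T": [{"D_L", "D_C"}, {"D_L", *O_T}, {"D_R"}],
    "s_F": [{"D_R", "D_C"}, {"D_R", *O_F}, {"D_L"}],
}


def best_affordable(student: str, budget: Fraction = Fraction(1)) -> set:
    """First bundle on the student's list whose total price fits the budget."""
    for bundle in prefs[student]:
        if sum(price[c] for c in bundle) <= budget:
            return bundle
    return set()


choice_T = best_affordable("s_T")
choice_F = best_affordable("s_F")
```

The two chosen bundles are disjoint, and the only courses left unassigned are the $O_{F}^{j}$, which have price zero and therefore contribute no clearing error.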

Clause gadget

For each clause containing variables $\left\{X,Y,Z\right\}$ , consider seven courses: six input courses $X_{T},X_{F},Y_{T},Y_{F},Z_{T},Z_{F}$ (where each pair is the output courses of a variable gadget), and a single “budget diluting” course $D$ . We also have a single gadget student, who is interested in any of the seven bundles corresponding to a satisfying assignment.

For example, if the clause is $\left(X\vee\neg Y\vee Z\right)$, the gadget student would be interested in the bundles: $\left\{X_{F},Y_{F},Z_{F},D\right\}$, $\left\{X_{F},Y_{F},Z_{T},D\right\}$, $\left\{X_{F},Y_{T},Z_{F},D\right\}$, $\left\{X_{F},Y_{T},Z_{T},D\right\}$, $\left\{X_{T},Y_{F},Z_{F},D\right\}$, $\left\{X_{T},Y_{T},Z_{F},D\right\}$, $\left\{X_{T},Y_{T},Z_{T},D\right\}$. In particular, the student is not interested in the bundle $\left\{X_{T},Y_{F},Z_{T},D\right\}$, which corresponds to assigning $\left(X=\mbox{False},\,Y=\mbox{True},\,Z=\mbox{False}\right)$.
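The list of bundles can be generated rather than written by hand. The following sketch assumes the correspondence used in the example: a bundle containing $X_{a}$ leaves $X_{\neg a}$ to the variable gadget, so it stands for the assignment $X=\neg a$. The function name and the dictionary encoding of a clause are hypothetical.

```python
from itertools import product
from typing import Dict, List, Set


def clause_bundles(clause: Dict[str, bool]) -> List[Set[str]]:
    """Bundles wanted by the clause gadget student.

    clause maps each variable to the polarity of its literal, e.g.
    {"X": True, "Y": False, "Z": True} for (X or not Y or Z).
    """
    names = list(clause)
    bundles = []
    for picks in product([False, True], repeat=len(names)):
        # Taking course V_a corresponds to the assignment V = not a.
        assignment = {v: not p for v, p in zip(names, picks)}
        if any(assignment[v] == clause[v] for v in names):   # clause satisfied
            bundle = {f"{v}_{'T' if p else 'F'}" for v, p in zip(names, picks)}
            bundles.append(bundle | {"D"})
    return bundles


bundles = clause_bundles({"X": True, "Y": False, "Z": True})
```

For the clause $\left(X\vee\neg Y\vee Z\right)$ this yields seven bundles and omits exactly the one corresponding to the falsifying assignment.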

Soundness: Observe that if the variable gadget students are assigned courses $X_{a}$, $Y_{b}$, and $Z_{c}$, then in any exact CEEI the clause gadget student must be assigned the bundle $\left\{X_{\neg a},Y_{\neg b},Z_{\neg c},D\right\}$.

Completeness: Suppose that the variable gadget students are assigned courses $X_{a}$, $Y_{b}$, and $Z_{c}$, each with price at least $\frac{1}{6}$, while courses $X_{\neg a}$, $Y_{\neg b}$, and $Z_{\neg c}$ are all unassigned. Then if we set the price of $D$ to be $1$, the only affordable bundle for the clause gadget student is indeed $\left\{X_{\neg a},Y_{\neg b},Z_{\neg c},D\right\}$.

Discussion

In this work we classified the computational complexity of finding an approximate CEEI as a function of the precision parameter $\alpha$ of the approximation, the market clearing error. We showed that finding $(\alpha,\beta)$ -CEEI is $\PPAD$ -complete when $\alpha$ is large enough to guarantee existence, while finding a better approximation to CEEI is $\NP$ -complete.

One potential way around these intractability results could be to restrict the input language of preferences. This has been a fruitful line of research in combinatorial auctions (Nisan, 2006; Sandholm and Boutilier, 2006). However, in contrast to that space, we do not anticipate limiting language complexity in the course allocation problem to be fruitful either in theory or in practice. Recall that the student preferences used in the $\PPAD$ -hardness proof are already very simple. Furthermore, in practice there are significant inherent complexities in students’ preferences: for example, courses meeting at the same time and courses with multiple sections.

Despite the negative results shown in this paper, a heuristic search algorithm exists that finds practical solutions to A-CEEI. Interestingly, in both laboratory experiments as well as real course allocation problems, this heuristic often finds solutions that are an order of magnitude better than the theoretical $\sqrt{\frac{kM}{2}}$ guarantee on the clearing error (Othman et al., 2010) — a performance which we have shown NP-hard to guarantee. Once again we are faced with a familiar conundrum: What are the characteristics of the instances appearing in practice that enable this favorable performance? And how can one develop a rigorous fast algorithm for them?

References

Appendix A A-CEEI $\in\PPAD$

We show that computing a $\left(\frac{\sqrt{\sigma M}}{2},\beta\right)$ -CEEI is in $\PPAD$ , for $\sigma=\min\{2k,M\}$ .

We assume that the student preferences $\left(\succsim_{i}\right)$ are given in the form of an ordered list of all the bundles in $\Psi_{i}$ (i.e., all the bundles that student $i$ prefers over the empty bundle). In particular, we assume that the total number of permissible bundles is polynomial.

In fact, we prove that the following, slightly more general problem, is in $\PPAD$ : Given any $\beta,\epsilon>0$ and initial approximate-budgets vector $\mathbf{b}\in\left[1,1+\beta\right]^{N}$ , find a $\left(\frac{\sqrt{\sigma M}}{2},\beta\right)$ -CEEI with budgets $\mathbf{b^{*}}$ such that $|b_{i}-b^{*}_{i}|<\epsilon$ for every $i$ .

Our proof will follow the steps of the existence proof by Budish (2011). We will use the power of $\PPAD$ to solve the Kakutani problem, and derandomize the other nonconstructive ingredients.

Our algorithm receives as input an economy $\left(\left(q_{j}\right)_{j=1}^{M},\left(\Psi_{i}\right)_{i=1}^{N},\left(\succsim_{i}\right)_{i=1}^{N}\right)$ , parameters $\beta,\epsilon>0$ , and an initial approximate-budgets vector $\mathbf{b}\in\left[1,1+\beta\right]^{N}$ . We denote $\bar{\beta}=\min\{\beta,\epsilon\}/2$ .

Given the total demand of all the students, we can define the excess demand to be:

A.2 Deterministically finding a “general position” perturbation (step 1)

In this section, we show how to deterministically choose these taxes.

There exists a polynomial-time algorithm that finds a vector of taxes $\mathbf{\tau}=\left(\tau_{i,x}\right)_{i\in{\cal S},x\in\Psi_{i}}$ such that:

$-\epsilon<\tau_{i,x}<\epsilon$ (taxes are small)

$\tau_{i,x}>\tau_{i,x^{\prime}}$ if $x\succ_{i}x^{\prime}$ (taxes prefer more-preferred bundles)

$1\leq\min_{i,x}\left\{b_{i}+\tau_{i,x}\right\}\leq\max_{i,x}\left\{b_{i}+\tau_{i,x}\right\}\leq 1+\beta$ (inequality bound is preserved)

$b_{i}+\tau_{i,x}\neq b_{i^{\prime}}+\tau_{i^{\prime},x^{\prime}}$ for $\left(i,x\right)\neq\left(i^{\prime},x^{\prime}\right)$ (no two perturbed prices are equal)

Assume wlog that $\mathbf{b}$ is rounded to the nearest $\bar{\beta}M^{-M}$ : otherwise we can include this rounding in the taxes.

We proceed by induction on the pairs $\left(i,x\right)$ of students and bundles: at each step let $\tau_{i,x}$ be much smaller than all the taxes introduced so far. (Assume wlog that for each $i$ we consider the $\left(i,x\right)$’s in order reversed with respect to $\succsim_{i}$, so that property 2 is guaranteed.)

More precisely, if $\left(i,x\right)$ is the $\nu^{\mbox{th}}$ pair to be considered, then we set

where the sign is chosen such that condition 3 in the statement of the lemma is preserved.

Assume further, wlog, that this is the first such $k$ -tuple, with respect to the order of the induction. In particular, this means that $\left\{x_{1},\dots,x_{{}_{k-1}}\right\}$ are linearly independent. Now consider the system

Notice that it has rank $k-1$ . We can now take $k-1$ linearly independent rows $j_{1},\dots j_{k-1}$ such that the following system has the same unique solution $\alpha$ :

Since $X$ is a square matrix of full rank it is invertible, so we have that

where $X_{i,j}$ is the $\left(i,j\right)$ -cofactor of $X$ . Finally, since $X$ is a Boolean matrix, its determinant and all of its cofactors are integers of magnitude less than $\left(k-1\right)^{k-1}$ . The entries of $\alpha$ are therefore rational fractions with numerators and denominators of magnitude less than $\left(k-1\right)^{k-1}$ .

However, if $\left(i_{k},x_{k}\right)$ is the $\nu^{\mbox{th}}$ pair added by the induction, then the following is an integer:

but $\frac{M^{\left(2\nu-1\right)M}}{\bar{\beta}}\cdot\left(b_{i_{k}}+\tau_{i_{k},x_{k}}\right)$ is not an integer, a contradiction to Equation (1). ∎
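The cofactor step in the proof above can be illustrated with exact integer arithmetic. The following is a minimal sketch, not part of the original argument: the matrix `X` and the right-hand side `y` are hypothetical examples, and the determinant is computed by brute force, which is only reasonable for tiny matrices. It solves $X\alpha=y$ through the cofactors of a Boolean matrix and checks that the determinant and all cofactors stay below $n^{n}$, where $n=k-1$ is the dimension.

```python
from fractions import Fraction
from itertools import permutations


def det(m):
    """Exact integer determinant via the Leibniz formula (tiny matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # The sign of the permutation is the parity of its inversion count.
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def cofactor(m, i, j):
    """(i, j)-cofactor: signed determinant of m without row i and column j."""
    minor = [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]
    return (-1) ** (i + j) * det(minor)


def solve_by_cofactors(m, y):
    """Solve m * alpha = y using alpha_j = (1/det m) * sum_i C_{i,j} * y_i."""
    d = det(m)
    n = len(m)
    return [
        Fraction(sum(cofactor(m, i, j) * y[i] for i in range(n)), d)
        for j in range(n)
    ]


# Hypothetical 3x3 Boolean system (n = k - 1 = 3).
X = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
y = [1, 0, 1]
alpha = solve_by_cofactors(X, y)
bound = len(X) ** len(X)
```

Because every quantity is an integer or a `Fraction`, the numerators and denominators of `alpha` can be inspected directly, which is the property the proof relies on.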

A.3 Finding a fixed point (steps 2-4)

This subsection describes the price adjustment correspondence of Budish (2011); it is included here mostly for completeness.

We first define the price adjustment function:

Instead, we define an upper hemicontinuous, set-valued “convexification” of $f$ :

The correspondence $F$ is upper hemicontinuous, non-empty, and convex; therefore, by Kakutani’s fixed point theorem it has a fixed point.

Finally, by Papadimitriou, finding this fixed point of $F$ is in PPAD.

We round all price vectors to a $\left(\bar{\beta}M^{\frac{1}{2}-2\left(\nu_{\max}+1\right)M}\right)$ -grid (this precision suffices to implement the algorithm in lemma 4).

From the proof of Papadimitriou it follows that it suffices to compute just a single point in $F\left(\mathbf{p}\right)$ for every $\mathbf{p}$ (this is important because the number of points in $F\left(\mathbf{p}\right)$ on the grid may be exponential). At any point on the grid, the price of any bundle is an integer multiple of $\left(\bar{\beta}M^{\frac{1}{2}-2\left(\nu_{\max}+1\right)M}\right)$. In particular, any budget-constraint hyperplane which does not contain $\mathbf{p}$ must be at distance at least $\left(\bar{\beta}M^{\frac{1}{2}-2\left(\nu_{\max}+1\right)M}\right)$. Therefore, we can take any point $\mathbf{p^{\prime}}$ at distance $\frac{1}{2}\left(\bar{\beta}M^{\frac{1}{2}-2\left(\nu_{\max}+1\right)M}\right)$ from $\mathbf{p}$ which does not lie on any of the hyperplanes that contain $\mathbf{p}$. Because no budget-constraint hyperplanes lie between $\mathbf{p^{\prime}}$ and $\mathbf{p}$, it follows that $f\left(\mathbf{p^{\prime}}\right)\in F\left(\mathbf{p}\right)$.
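One simple way to pick such a point $\mathbf{p^{\prime}}$ is sketched below. It rests on assumptions that are not in the text: every bundle is a nonzero 0/1 vector, the perturbation raises all prices by the same amount, and any upper bound on prices is ignored. Raising every price changes the price of every nonempty bundle, so the new point lies on none of the hyperplanes through $\mathbf{p}$. The move has $\ell_{1}$ length $\delta/2$, hence Euclidean length at most $\delta/2$, where `delta` stands for the grid step. The function name and arguments are hypothetical.

```python
from fractions import Fraction


def perturb_off_hyperplanes(p, bundles, delta):
    """Return p' close to p that lies on no hyperplane {q : q.x = p.x}.

    p       -- list of exact prices (Fractions), one per course
    bundles -- nonzero 0/1 vectors x whose hyperplanes contain p
    delta   -- grid step; the move has l1 length delta / 2
    """
    m = len(p)
    step = Fraction(delta) / (2 * m)  # m coordinates, each moved by `step`
    p_prime = [pj + step for pj in p]
    for x in bundles:
        if not any(x):
            raise ValueError("empty bundle: its price cannot be perturbed")
        old_price = sum(pj * xj for pj, xj in zip(p, x))
        new_price = sum(pj * xj for pj, xj in zip(p_prime, x))
        # The bundle price strictly increases, so p' is off this hyperplane.
        assert new_price > old_price
    return p_prime


# Hypothetical example with three courses and two tight bundles.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
bundles = [(1, 1, 0), (1, 0, 1)]
delta = Fraction(1, 100)
p_prime = perturb_off_hyperplanes(p, bundles, delta)
```

Exact rationals are used so that membership in a hyperplane is decided without rounding error.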

A.4 From a fixed point to approximate CEEI (steps 5-9)

Given a fixed point $\mathbf{p^{*}}$ of $F$, we can find in polynomial time a vector of prices $\mathbf{p^{\phi^{\prime}}}$ such that $\left\|\mathbf{z}\left(\mathbf{p^{\phi^{\prime}}},\mathbf{b},\mathbf{\tau}\right)\right\|_{2}\leq\frac{\sqrt{\sigma M}}{2}$.

We use the method of conditional expectation to derandomize Step 8 of Budish (2011).

Recall that by remark 3, there exists a neighborhood around $\mathbf{p^{*}}$ which does not intersect any budget-constraint hyperplanes (beyond those that contain $\mathbf{p^{*}}$). Let $1,\dots,L^{\prime}$ be the indices of students whose budget-constraint hyperplanes intersect at $\mathbf{p^{*}}$. For student $i\in\left[L^{\prime}\right]$, let $w_{i}$ be the number of corresponding hyperplanes $H\left(i,x_{i}^{1},\tau_{i,x_{i}^{1}}\right),\dots,H\left(i,x_{i}^{w_{i}},\tau_{i,x_{i}^{w_{i}}}\right)$ intersecting at $\mathbf{p^{*}}$, and assume wlog that the superscripts of $x_{i}^{1},\dots,x_{i}^{w_{i}}$ are ordered according to $\succsim_{i}$.

Let $d_{i}^{0}$ be agent $i$’s demand when prices are slightly perturbed from $\mathbf{p^{*}}$ such that all $x_{i}^{j}$’s are affordable. Such a perturbation exists and is easily computable because the hyperplanes are linearly independent. (This appears to be a slight inaccuracy in the proof in Budish (2011).) Similarly, let $d_{i}^{1}$ denote agent $i$’s demand when $x_{i}^{2},\dots,x_{i}^{w_{i}}$ are affordable, but $x_{i}^{1}$ is not, and so on. Finally, let $z_{S\setminus\left[L^{\prime}\right]}\left(\mathbf{p^{*}},\mathbf{b},\mathbf{\tau}\right)=d_{S\setminus\left[L^{\prime}\right]}\left(\mathbf{p^{*}},\mathbf{b},\mathbf{\tau}\right)-\mathbf{q}$ be the market clearing error when considering the rest of the students. (The demand of $S\setminus\left[L^{\prime}\right]$ is constant in the small neighborhood of $\mathbf{p^{*}}$ that does not intersect any additional hyperplanes.)

By Lemma 3 of Budish (2011), there exist distributions $a_{i}^{f}$ over $d_{i}^{f}$:

such that the clearing error of the expected demand is:

We first find such $a_{i}^{f}$ in polynomial time using linear programming.

The existence proof then considers, for each $i$, a random vector $\Theta_{i}=\left(\Theta_{i}^{0},\dots,\Theta_{i}^{w_{i}}\right)$: the vectors are independent and in any realization $\theta_{i}$ satisfy $\sum_{f=0}^{w_{i}}\theta_{i}^{f}=1$, while the variables each have support $\mbox{supp}\left(\Theta_{i}^{f}\right)=\left\{0,1\right\}$ and expectation $\mbox{E}\left[\Theta_{i}^{f}\right]=a_{i}^{f}$.

By Lemma 4 of Budish (2011), the expected clearing error is bounded by:

We now proceed by induction on the students. For each $i$ , if the conditional expectation on $\left(\hat{\theta}_{j}\right)_{j<i}$ satisfies

then at least one $\hat{\theta}_{i}$ must also satisfy the above bound. We can find such $\hat{\theta}_{i}$ in polynomial time by computing the conditional expectation for every feasible $\hat{\theta}_{i}^{\prime}$:

The chosen $\left(\hat{\theta}_{i}\right)_{i=1}^{L^{\prime}}$ define an allocation $\mathbf{x^{*}}$ with bounded clearing error. We now follow Step 9 of Budish (2011) in order to define budgets $\mathbf{b^{*}}$ such that $\mathbf{x^{*}}$ is the preferred consumption by all the students at price $\mathbf{p^{*}}$.

We define, for every $i$ , $b_{i}^{*}=b_{i}+\tau_{i,x_{i}^{*}}$ . For $i>L^{\prime}$ we have $x_{i}^{*}=d_{i}\left(\mathbf{p^{*}},b_{i},\tau_{i}\right)$ . By requirement 2 of lemma 4, every bundle that student $i$ prefers over $x_{i}^{*}$ had a greater tax and was still unaffordable at $\mathbf{p^{*}}$ ; it now costs more than $b_{i}+\tau_{i,x_{i}^{*}}$ .

For $i\leq L^{\prime}$, notice that every bundle $x_{i}^{\perp}$ that $i$ prefers over $x_{i}^{*}$ and that was exactly affordable at $\mathbf{p^{*}}$ with taxes $\mathbf{\tau}$ and budget $\mathbf{b}$ must cost strictly more than $i$’s new budget $b_{i}^{*}$. Therefore, $\left(\mathbf{x^{*}},\mathbf{b^{*}},\mathbf{p^{*}}\right)$ is a $\left(\frac{\sqrt{\sigma M}}{2},\beta\right)$-CEEI.
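The derandomization by conditional expectation used in the proof above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the exact procedure of the proof: the quantity tracked is the expected squared Euclidean clearing error, and the names `base` (the clearing error of the remaining students), `options` (the candidate demands $d_{i}^{f}$), and `probs` (the weights $a_{i}^{f}$) are hypothetical. For independent choices, the expectation splits into the squared norm of the mean plus a sum of per-student variances, so each conditional expectation can be computed exactly.

```python
from fractions import Fraction


def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(c * c for c in v)


def expected_sq_error(base, options, probs):
    """E||base + sum_i d_i^{F_i}||^2 for independent F_i drawn from probs[i]."""
    mean = list(base)
    variance = Fraction(0)
    for opts, pr in zip(options, probs):
        m_i = [sum(a * d[c] for a, d in zip(pr, opts)) for c in range(len(base))]
        # Var_i = E||d_i||^2 - ||E d_i||^2
        variance += sum(a * sq_norm(d) for a, d in zip(pr, opts)) - sq_norm(m_i)
        mean = [u + v for u, v in zip(mean, m_i)]
    return sq_norm(mean) + variance


def derandomize(base, options, probs):
    """Fix one student at a time, never increasing the conditional expectation."""
    fixed = list(base)
    choices = []
    for i in range(len(options)):
        rest_opts, rest_probs = options[i + 1:], probs[i + 1:]
        candidates = [f for f, a in enumerate(probs[i]) if a > 0]
        best = min(
            candidates,
            key=lambda f: expected_sq_error(
                [u + v for u, v in zip(fixed, options[i][f])],
                rest_opts,
                rest_probs,
            ),
        )
        choices.append(best)
        fixed = [u + v for u, v in zip(fixed, options[i][best])]
    return choices, fixed


# Hypothetical instance: two courses, two undecided students.
base = [-1, -1]
options = [[(1, 0), (0, 1)], [(1, 0), (0, 1)]]
half = Fraction(1, 2)
probs = [[half, half], [half, half]]
choices, final_error = derandomize(base, options, probs)
```

The averaging argument guarantees that the minimizing choice for each student keeps the conditional expectation at or below its previous value, so the final deterministic error is no larger than the initial expectation.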

Appendix B Additional gadgets for Theorem 2

Lemma 2. Let $n_{x}\geq 2^{8}\cdot\alpha$ and suppose that the economy has courses $c_{x}$ and $c_{y}$ . Then for any of the functions $f$ listed below, we can add: a course $c_{z}$ , and at most $n_{x}$ students interested in each of $c_{x}$ and $c_{y}$ , such that in any $\left(\alpha,\beta\right)$ -CEEI $p_{z}^{*}\in\left[f\left(p_{x}^{*},p_{y}^{*}\right)-2\beta,f\left(p_{x}^{*},p_{y}^{*}\right)+2\beta\right]$

VALUE: $f_{G_{\frac{1}{2}}}\equiv\frac{1}{2}$

SUM: $f_{G_{+}}\left(x,y\right)=\min\left(x+y,1\right)$

DIFF: $f_{G_{-}}\left(x,y\right)=\max\left(x-y,0\right)$

LESS: $f_{G_{<}}\left(x,y\right)=\begin{cases}1&x>y+\beta\\ 0&y>x+\beta\end{cases}$

In particular, $p_{z}^{*}\in\left[f\left(p_{x}^{*},p_{y}^{*}\right)-2\beta,f\left(p_{x}^{*},p_{y}^{*}\right)+2\beta\right]$ in every $\left(\alpha,\beta\right)$ -CEEI even if up to $n_{z}\leq n_{x}/2^{8}$ additional students (beyond the ones specified in the proofs below) are interested in course $c_{z}$ .

Notice that, as in similar gadget reductions from PPAD-complete problems, LESS, AND, and OR are brittle comparators (see the discussion in Daskalakis et al. for more details).
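The target functions of the gadgets listed above, together with the $2\beta$ tolerance of the lemma, can be written out as a short sketch. The function names are hypothetical, and returning `None` in the region where LESS is unconstrained is a modelling choice of this sketch that reflects the brittleness just mentioned.

```python
from fractions import Fraction


def f_value():
    """VALUE: the constant 1/2."""
    return Fraction(1, 2)


def f_sum(x, y):
    """SUM: x + y, capped at 1."""
    return min(x + y, 1)


def f_diff(x, y):
    """DIFF: x - y, floored at 0."""
    return max(x - y, 0)


def f_less(x, y, beta):
    """LESS: defined only when x and y differ by more than beta."""
    if x > y + beta:
        return 1
    if y > x + beta:
        return 0
    return None  # brittle region: the gadget gives no guarantee here


def within_tolerance(p_z, target, beta):
    """Check p_z in [target - 2*beta, target + 2*beta]."""
    return target - 2 * beta <= p_z <= target + 2 * beta


beta = Fraction(1, 100)
target = f_diff(Fraction(3, 4), Fraction(1, 4))
```

For example, with `beta = 1/100` a price of `51/100` for $c_{z}$ is consistent with a DIFF target of `1/2`, while `53/100` is not.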

Let $c_{z}$ have capacity $q_{z}=n_{x}/8$, let $n_{z}=q_{z}/2$, and consider three auxiliary courses $c_{1}$, $c_{2}$, and $c_{\overline{x}}$ of capacities $q_{1}=q_{2}=q_{z}$ and $q_{\overline{x}}=n_{x}/2$. Using lemma 1, add $n_{x}$ students that will guarantee $p_{\overline{x}}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$. Additionally, consider $n_{\overline{x}}=n_{x}/4$ students with preference list: $\left(\left\{c_{z},c_{1},c_{\overline{x}}\right\},\left\{c_{z},c_{2},c_{\overline{x}}\right\},\left\{c_{1},c_{2},c_{\overline{x}}\right\}\right)$ (in this order), then:

If the total price $p_{i}^{*}+p_{j}^{*}$ of any pair $i,j\in\left\{1,2,z\right\}$ is less than $p_{x}^{*}-\beta$, then all $n_{\overline{x}}$ students will be able to afford some subset in their preference list, leaving a total overbooking of at least $z_{z}^{*}+z_{1}^{*}+z_{2}^{*}\geq 2n_{\overline{x}}-3q_{z}=n_{x}/8$, which violates the $\left(\alpha,\beta\right)$-CEEI conditions.

If the total price of any of the pairs above (wlog, $p_{1}^{*}+p_{2}^{*}$ ) is greater than $p_{x}^{*}+\beta$ , then none of the $n_{\overline{x}}$ students will be able to afford the subset $\left\{c_{1},c_{2},c_{\overline{x}}\right\}$ . Therefore the number of students taking $c_{z}$ will be at least the sum of students taking $c_{1}$ or $c_{2}$ . Therefore, even after taking into account $n_{z}$ additional students, we have that $z_{z}^{*}+z_{1}^{*}+z_{2}^{*}\geq q_{z}-n_{z}=n_{x}/16$ .
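As a quick check of the counts used in the two cases above, substitute $q_{z}=n_{x}/8$, $n_{z}=q_{z}/2=n_{x}/16$, and $n_{\overline{x}}=n_{x}/4$:

$2n_{\overline{x}}-3q_{z}=\frac{n_{x}}{2}-\frac{3n_{x}}{8}=\frac{n_{x}}{8},\qquad q_{z}-n_{z}=\frac{n_{x}}{8}-\frac{n_{x}}{16}=\frac{n_{x}}{16}.$

Here $2n_{\overline{x}}$ counts the two seats among $c_{z}$, $c_{1}$, $c_{2}$ that each of the $n_{\overline{x}}$ students takes in any bundle on the preference list, and $3q_{z}$ is the combined capacity of those three courses.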

Similarly to the HALF gadget, consider two auxiliary courses $c_{1}$ and $c_{2}$ , and let $n_{x}$ students have preferences: $\left(\left\{c_{z},c_{1}\right\},\left\{c_{z},c_{2}\right\},\left\{c_{1},c_{2}\right\}\right)$ . Then, following the argument for the HALF gadget, it is easy to see that $p_{z}^{*}\in\left[\frac{1}{2},\frac{1}{2}+\beta\right]$ in any $\left(\alpha,\beta\right)$ -CEEI, with $n_{z}=n_{x}/8$ .

Let $c_{\overline{x}}$ be a course with price $p_{\overline{x}}^{*}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$ , $q_{\overline{x}}=n_{x}/2$ , and consider $n_{\overline{x}}=n_{x}/4$ students willing to take $\left\{c_{\overline{x}},c_{y},c_{z}\right\}$ . Then it is easy to see that

Concatenating NOT and DIFF gadgets, we have:

Let $c_{\overline{x}}$ be a course with price $p_{\overline{x}}^{*}\in\left[1-p_{x}^{*},1-p_{x}^{*}+\beta\right]$, $q_{\overline{x}}=n_{x}/2$; let $q_{z}=n_{x}/8$ and $n_{z}=n_{x}/16$. Consider $n_{x}/4$ students wishing to take $\left(\left\{c_{\overline{x}},c_{y}\right\},\left\{c_{z}\right\}\right)$, in this order:

If $p_{y}^{*}>p_{x}^{*}+\beta$, then, since $p_{\overline{x}}^{*}\geq 1-p_{x}^{*}$, we have $p_{\overline{x}}^{*}+p_{y}^{*}>1+\beta$, and therefore none of the $n_{x}/4$ students will be able to afford the first pair; they will all try to sign up for $c_{z}$, which will be overbooked unless $p_{z}^{*}>1$.

If $p_{x}^{*}>p_{y}^{*}+\beta$ , then all $n_{x}/4$ students will sign up for the first pair, forcing $p_{z}^{*}=0$ in any $\left(\alpha,\beta\right)$ -CEEI.

Let $c_{\frac{1}{2}}$ be a course with price $p_{\frac{1}{2}}^{*}\in\left[\frac{1}{2},\frac{1}{2}+\beta\right]$ and $n_{\frac{1}{2}}=n_{x}/8$ , as guaranteed by gadget VALUE; let $q_{z}=n_{x}/32$ and $n_{z}=n_{x}/64$ . Consider $n_{x}/16$ students wishing to take $\left(\left\{c_{x},c_{\frac{1}{2}}\right\},\left\{c_{y},c_{\frac{1}{2}}\right\},\left\{c_{z}\right\}\right)$ , in this order.

If $\left(p_{x}^{*}>\frac{1}{2}+\beta\right)\wedge\left(p_{y}^{*}>\frac{1}{2}+\beta\right)$ , then the $n_{x}/16$ students can afford neither pair. They will all try to sign up for $c_{z}$ , forcing $p_{z}^{*}>1$ , in any $\left(\alpha,\beta\right)$ -CEEI.

If $\left(p_{x}^{*}<\frac{1}{2}-\beta\right)\vee\left(p_{y}^{*}<\frac{1}{2}-\beta\right)$, then the $n_{x}/16$ students can afford at least one of the pairs and will register for those courses. Thus $p_{z}^{*}=0$.

Similar to the AND gadget; students will want $\left(\left\{c_{x},c_{y},c_{\frac{1}{2}}\right\},\left\{c_{z}\right\}\right)$ , in this order.
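The case analyses for the gadgets above all rest on the same rule: a student receives the first bundle on the preference list that fits within the budget. The sketch below illustrates that rule on the AND gadget. The course labels, the concrete prices, and the function name `demand` are hypothetical; budgets are taken from $\left[1,1+\beta\right]$, as in the inequality bound on perturbed budgets.

```python
from fractions import Fraction


def demand(preference_list, prices, budget):
    """Return the first bundle (most preferred first) whose price fits the budget."""
    for bundle in preference_list:
        if sum(prices[course] for course in bundle) <= budget:
            return bundle
    return ()  # no bundle on the list is affordable


beta = Fraction(1, 20)
and_list = [("x", "half"), ("y", "half"), ("z",)]

# Case 1: both inputs exceed 1/2 + beta, so even the largest budget
# 1 + beta cannot pay for either pair and the student falls back to c_z.
high_prices = {
    "x": Fraction(7, 10),
    "y": Fraction(4, 5),
    "half": Fraction(1, 2),
    "z": Fraction(0),
}
case_high = demand(and_list, high_prices, 1 + beta)

# Case 2: p_x is below 1/2 - beta, so even the smallest budget 1 and the
# largest price 1/2 + beta for c_half leave the first pair affordable.
low_prices = {
    "x": Fraction(3, 10),
    "y": Fraction(4, 5),
    "half": Fraction(1, 2) + beta,
    "z": Fraction(0),
}
case_low = demand(and_list, low_prices, Fraction(1))
```

In the first case every student demands $c_{z}$, and in the second none does, which matches the two cases of the AND argument.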