ebook img

Counting Houses of Pareto Optimal Matchings in the House Allocation Problem PDF

0.48 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Counting Houses of Pareto Optimal Matchings in the House Allocation Problem

Counting Houses of Pareto Optimal Matchings in the House Allocation Problem Bala´zs Keszegh Andrei Asinowski Tillmann Miltzow Alfr´ed R´enyi Institute of Freie Universita¨t Berlin Freie Universita¨t Berlin Mathematics 5 Berlin, Germany Budapest, Hungary Berlin, Germany 1 [email protected] [email protected] [email protected] 0 2 t c O 7 Abstract In an instance of the house allocation problem two sets A and B are ] O given. The set A is referred to as applicants and the set B is referred to as houses. We denote by m and n the size of A and B respectively. In C the house allocation problem, we assume that every applicant a∈A has . h apreferencelistoverthesetofhousesB. Wecallaninjectivemappingτ t from A to B a matching. A blocking coalition of τ is a non-empty subset a m A(cid:48) of A such that there exists a matching τ(cid:48) that differs from τ only on elements of A(cid:48), and every element of A(cid:48) improves in τ(cid:48), compared to τ, [ according to its preference list. If there exists no blocking coalition, we 3 call the matching τ a Pareto optimal matching (POM). v A house b ∈ B is reachable if there exists a Pareto optimal matching 4 using b. The set of all reachable houses is denoted by E∗. We show 5 3 |E∗|≤ (cid:88) (cid:106)m(cid:107)=Θ(mlogm). 5 i i=1,...,m . 1 This is asymptotically tight. A set E ⊆ B is reachable (respectively ex- 0 actly reachable) if there exists a Pareto optimal matching τ whose image 4 contains E as a subset (respectively equals E). We give bounds for the 1 : number of exactly reachable sets. We find that our results hold in the v more general setting of multi-matchings, when each applicant a of A is i X matchedwith(cid:96)a elementsofB insteadofjustone. Further,wegivecom- plexity results and algorithms for corresponding algorithmic questions. r a Finally, we characterize unavoidable houses, i.e., houses that are used by all POM’s. This yields efficient algorithms to determine all unavoidable elements. 1 Introduction 1.1 Definitions The house allocation problem is motivated by the following setup: a set of people is interested to be allocated to a certain set of houses. Each person has a ranking over the set of houses and wants to be assigned to the house with her highest preference. As soon as two people have the same favorite house this is not possible. Motivated by this picture we abstract the set up and start with some definitions. In an instance of the house allocation problem two sets A and B are given. The set A represents applicants and the set B represents houses. We denote by m and n the size of A and B respectively. In the house allocation problem, we assume that every a ∈ A has a preference list over the set B. A preference list can be formally defined as a total order of B. We call an injective mapping τ from A to B a matching. A blocking coalition of τ is a non-empty subset A(cid:48) of A such that there exists a matching τ(cid:48) that differs from τ only on elements of A(cid:48), and every element of A(cid:48) improves in τ(cid:48), compared to τ according to its preference list. If there exists no blocking coalition, τ is called a Pareto optimal matching (POM). We represent the preference lists by an m×n matrix. Every row represents the preference list of one of the applicants in A, i.e., in a given row r corre- sponding to some applicant a∈A, the leftmost house is the one that a prefers most, etc., that is house b is left to b in r if and only if a prefers b over b . 1 2 1 2 Note that no row contains an element from B twice. We usually denote this matrixbyM andfollowingthisinterpretationweusuallydenotethemembersof A by r ,r ,...r and the members of B by 1,2,...,n. Because of this matrix 1 2 m representation, we usually refer to members of A only as rows and to members of B as elements (of the matrix). To illustrate the notion consider the following matrix and observe that the matching indicated by circles is indeed Pareto optimal. 1 5 3 2 4 3 1 4 5 2   1 3 5 4 2     We denote tuples p = (a,k) as positions of the matrix, if a is some row and k is some natural number. A matching corresponds to a set of positions P τ if there is exactly one position for each row and no two positions contain the same house. The image set of τ corresponds to the set of houses of B in these positions. Thus,wesaythatτ selects somepositionpofM (resp.someelement b of B), if p is in P (resp. b is in the image set of τ). Similarly, given some τ matching τ we say that a row a selects a position p in row a (resp. element b) if p = (a,k) ∈ P for some k (resp. b is in the image of τ). We denote by s(τ) τ the image set of τ. InaPOMthepositionsafterthem-thcolumnwillneverbeassigned,because atleastoneofthepreviousmelementsinthatrowispreferredandnotassigned 1 to any other element on A. Therefore it is sufficient to consider only m×m square matrices. (In other words, only the first m elements of the preference lists matter.) If some POM τ selects p (resp. b), then it is a reachable position (resp. reachable element). More generally, a set E ⊆ B is (exactly) reachable if there existsaPOMτ withE ⊆s(τ)(E =s(τ)). Inthiscasewealsosaythatτ reaches E. An element b is unavoidable if it belongs to the set s(τ) for every Pareto optimalmatchingτ ofM andbiscalledavoidableifthereexistsaParetooptimal matching τ with b ∈/ s(τ). A set E is avoidable if there exists a POM τ with s(τ)∩E =∅. Note that for a set E such that |E|=m it is exactly reachable if andonlyifB\E isavoidable. Thesenotionscanbegeneralizednaturallytothe case of multi-matchings, when each element a has to be paired up with (cid:96) ≥ 1 a elements of B. For the precise definitions see Section 5. These Pareto Optimal Multi-matchings (POMMs) can also appear naturally in practical applications and we will see that our results about POMs generalize naturally to POMMs. We will also study matrices with fewer than m columns, precise definitions will be given in Subsection 1.4. In this case preference lists are incomplete, i.e., it canhappenthatsomeelementsofAarenotassigned. (Inthiscase, itmightbe interesting to compute a POM of maximum size.) Example 1. To illustrate the notions and to avoid confusion, we give here a detailed example. The reader can use it to verify that she understood all important notions. The example can be skipped, as we will not refer to it again. Consider the following matrix. 1 5 3 2 3 1 5 4 M =  1 6 5 4  3 6 2 4      The elements 1 and 3 are unavoidable, as they are both in the first column of M and thus picked by every POM. A quick check reveals that every other element of M can be avoided. Each set E with 1∈E or 3∈E is also unavoid- able. A simple argument reveals that every set E ⊆ {2,4,5,6} with |E| ≥ 3 is unavoidable, because every POM picks 4 elements of M. In order to determine all unavoidable sets, there remain only six sets E ⊂ {2,4,5,6} of size 2 to be examined. It turns out that only {5,6} and {2,5} are unavoidable. It is easy to see that all elements of M are reachable. In order to specify the reachable sets note that if some set E is reachable then also all its subsets. Thus it is sufficient to specify all reachable sets of size 4. Also note that if E ⊆ {1,2,3,4,5,6} of size four is reachable then D = {1,2,3,4,5,6}\E is avoidable. Thus, by the discussion above, the four reachable sets of size four are exactly {1,3,4,5},{1,3,2,5},{1,3,2,6},{1,3,5,6}. 2 1.2 Results 1.2.1 Enumerating reachable elements and sets In Section 2 we deal with enumerative problems related to reachable elements. Our main result here is the following. Theorem 2. Let M be an m×m matrix and E∗ be the set of all reachable elements. Then m |E∗|≤ (cid:98)m/i(cid:99)≤m(lnm+1). i=1 (cid:88) This improves the trivial upper bound of m2 which appears in [11]. In [11] the authors also showed a lower bound construction which has asymptotically asmanyreachableelementsasisimpliedbyourupperbound. ThusTheorem2 is asymptotically tight. Denote by E(M) the family of all (exactly) reachable m-element sets of M. For example, if all the elements in the first column of M are distinct, then |E(M)|=1. With Theorem 2 we can bound E(M). Corollary 3. For any matrix M, we have |E(M)|≤ m(lnm+1) . m This is the only non-trivial upper bound that we (cid:0)found, imp(cid:1)roving m2 of m [11]. As an important consequence, our upper bound also improves the upper (cid:0) (cid:1) bound on the pattern matching problem regarded in [11]. The best known lower bound is m [11]. The construction in that paper is a matrix where (cid:100)m/2(cid:101) in the first (cid:98)m/2(cid:99) columns the i-th column contains only element i and in the (cid:0) (cid:1) ((cid:98)m/2(cid:99)+1)-stcolumntherearemdifferentelementswhicharealsoalldifferent from 1,2,...(cid:98)m/2(cid:99). 1.2.2 Characterization of avoidable elements and sets Section3concentratesonthenotionofavoidableelements. Letxbeanelement suspect to be avoidable. Given some set of rows R we denote by E (R) the set x of elements left of x in the rows R (i.e., y is in E (R) if and only if there exists x a row r ∈R in which y appears to the left of x; if x does not appear in r then all elements in r are regarded to be left of x). Theorem 4. An element x of a matrix M is avoidable if and only if for every set R of rows of M, we have: |E (R)|≥|R| x Extremalresultsandalgorithmicresultsinconnectiontoavoidableelements areincludedinSection3,astheyfollowfromtheproofofTheorem4. Weprove that: Corollary 5. Deciding whether an element x is avoidable can be done in √ O(m2 m+n) and also in O(m3) time. Listing all unavoidable elements can √ be done in O(m2n m+n) and also in O(m5) time. Bothresultsfollowfromaneasyreductiontomatchingsinabipartitegraph. 3 1.2.3 Complexity of reachability Computational questions about reachable elements are considered in Section 4. Among the computational questions connected to the notions defined in this paper, we selected those that we found natural and canonical. The problems are defined as follows: Problem 1. (Deciding Reachability) Input: A matrix M and a set D ⊆B. Question: Is D reachable? Problem 2. (Deciding Exact Reachability) Input: A matrix M and a set E ⊆B Question: Is E exactly reachable? Problem 3. (Counting Reachable Sets) Input: A matrix M. Question: How many sets D ⊆B are reachable? Problem 4. (Counting Exactly Reachable Sets) Input: A matrix M. Question: How many sets E are exactly reachable? Problem 5. (Counting Exactly Reachable Supersets) Input: A matrix M and some set D ⊆B. Question: How many sets E with D ⊆E ⊆B are exactly reachable? The next table summarizes our findings about algorithmic questions. The gen- eralcaseisalwaysthesameaswith3columnmatrices. Thehardnessresultsfor Problems 1 and 5 already hold if D contains exactly 1 element. It was already observed by Henze, Jaume and Keszegh that Problem 1 is NP-complete [11]. Our contribution among others is to show NP-completeness also for matrices withonly3columns. TherunningtimeofProblem2followsfromthediscussion in Section 3 and Corollary 5. Problem 2 columns proof 3 columns proof 1) polynomial (Thm. 13) NP-complete (Thm. 10) 2) polynomial (Cor. 5) polynomial (Cor. 5) 3) explicit formula (Thm. 14) ? 4) #P-complete (Thm. 15) #P-complete (Thm. 15) 5) #P-complete (Thm. 15) #P-complete (Thm. 10) ItremainsanopenquestionwhetherProblem3ishardforgeneralmatrices. We conjecture it is already #P-complete for 3 column matrices. Problem 4 is a special case of Problem 5 for the case that D =∅. 1.3 Motivation and related work Paretooptimalmatchingsareoneofmanyconceptstodefinea’good’matching in a graph with preference lists. The graph properties and the preference lists 4 depend on the precise situation they are supposed to model. Most research in thisareaisdrivenbyeconomicapplications. Arecentbookonmatchingsunder preferences,byDavidManlove[13]containsabroadoverviewoverthefieldand usually puts a focus on algorithmic rather than economic questions. In this paper we tried, whenever applicable, to follow the notation therein. A field that evidently seems to be related to our topic is that of stable matchings. This field is very broad and belongs to economic game theory. The seminal work of Gale and Shapley is the starting point for this field [9]. Some work in this field and different variations of the problem can be found in the Ph.D. thesis of Sandy Scott [17], recent papers can be found in the proceedings of the Second International Workshop on Matching Under Preferences called MATCHUP[6]. Intheseworkstherearemanydifferentconceptsofpreferences and stability and they ask for efficiently computable solutions that maximize the outcome for the participants in one way or the other. Readers interested more broadly in the topic of algorithmic game theory are referred to the book edited by Nisan, Roughgarden, Tardos and Vazirani [14]. Incontrasttomostresearchdoneintheseareas,ourquestionismorecombi- natorial in nature. The underlying algorithmic question of computing a Pareto optimal matching can be solved by the greedy algorithm. Thus, instead of ex- istence questions, rather the enumerative questions become interesting. (Note that it is non-trivial to compute POMs with extra features.) Nevertheless, for the early definition of stability many authors have tried to establish upper and lowerboundsonthenumberofstablematchingsandsomecombinatorialstruc- tures have been unfolded. See [13, Section 2.2.2] for an overview of results in this direction. Some complexity results similar to ours have been found earlier. The first dates back to 2005 [1]. Their main result is an efficient algorithm to compute a POMwithmaximumcardinality. Herethepreferencelistsareincomplete. (See Section 1.4 for precise definitions of incomplete preference lists.) Additionally, theyshowhardnesstocomputeaminimumPOM.Thisresulthasalreadysome ideas of the proof of Theorem 4. Further the proof of Theorem 15 implies hardness of computation about minimum matchings. Although they show an easy 2-approximation, it is open whether there exists a PTAS for a minimum POM. We are aware of four recent papers that considered similar results to our complexity results [16, 3, 4, 7]. Their main motivation is to study the behavior oftherandomizedserialdictatorshipalsocalledrandomizedpriorityallocation. The randomized serial dictatorship picks a permutation at random and there- after computes the corresponding greedy matching. ThefirstpaperisbySab´anandSethuraman[16]. Theirresults,reformulated in our context, is NP-hardness of Problem 1, for arbitrary matrices. Aziz, Brandt and Brill [3] show #P hardness for a variant of Problem 5 for arbitrary matrices. We improve these results, as we can show this holds also for matrices withonly3columns. AzizandMestreshowthatconstraintversionsaresolvable in polynomial time [4]. Cechl´arov´a et al. [7] consider a generalized setting: they show NP-hardness of computing a minimum maximal matching even for 5 matrices with 2 columns by an elegant reduction from vertex-cover very similar in spirit to the proof in Theorem 15. This work is originally motivated by a work that was presented at the Eu- roCG 2012 in Braunschweig [11]. See also the follow up work [5]. The authors considered a generalisation of Voronoi diagrams under the assumption that not just one point, but many points are matched in a way that minimizes the sum of the squares of distances between matched points. From the definitions in their paper, the Pareto optimality comes as a natural property. They asked explicitly for the number of exactly reachable sets, as it gives an upper bound on the number of Voronoi cells in the above setting. Motivated by this, they gave lower and upper bounds on the number of exactly reachable stable sets. To do this, first they gave lower and upper bounds for the number of reachable elements. In this paper we improve their upper bound for the number of reach- able elements and by that we prove that their lower bound is asymptotically correct. Thisalsoyieldsasignificantimprovementonthepreviousupperbound on the number of exactly reachable stable sets, although in this case our new upperboundstilldoesnotmeetthelowerboundtheyhad. Theirworkisbased on a work by Rote presented at the EuroCG 2010 in Dortmund [15]. 1.4 Preliminaries As we also want to study matrices with fewer than m columns, we need to define what we mean by a matching under these assumptions. There are two equivalent ways. First, we could say that every row, for which all elements are already picked by other rows, just does not get assigned to anything. A second andnicerwayistoaddcolumns,withallelementsinonecolumnbeingthesame and not appearing before. If we want to know whether some set E is exactly reachable in the first way, we construct E(cid:48) from E by adding the elements from the first m−|E| additional columns (and vice versa). The following is an example of a 2 column matrix. 1 4 1 4 c c 1 2 2 1 2 1 c c  ∼ 1 2  2 5 2 5 c c 1 2  4 3   4 3 c c     1 2      We use the first approach. However, using the second approach, some hard- ness results will carry over from 2 or 3 column matrices to k column matrices (2 ≤ k ≤ m). In such a case, we will point this out again at the appropriate places. To see the correspondence between matchings in a graph-theoretical sense and in our context we define the bipartite row element graph G as follows. The vertices are defined as the set of rows and elements; an element e is adjacent to some row r if and only if e appears in r. See an example for the special case of a matrix with only 2 columns. 6 1 2 r 1 1 3 1 5  1 4  r2 2 r5 6  3 4  r3 3 r6 7  5 6  r4 4  7 6      The circled POM corresponds to the thick matching on the right side. If there is no blocking coalition of size ≤i, we call the matching an i-Pareto optimal matching (i-POM). In particular this implies that every POM is an i-POM. We call a matching 1-POM if there is no blocking coalition of size one. The next matching is 1-Pareto optimal but not Pareto optimal. 1 5 3 5 1 4   5 1 2     Observation 6. If τ is a POM and τ selects position p in row a, then τ selects every element that appears in row a left of p. This observation also holds for 1-POMs. A matching τ is greedy if there exists a permutation π of A such that the matching can be generated in the following manner: we process the rows of M in the order determined by π, and in each row we pick the leftmost element that was not picked earlier. Given some permutation π we call the corresponding greedy matching τ . π In the literature, the mechanism of determining a matching from a permu- tation is called Serial Dictatorship Mechanism. Lemma 7 brings all the introduced notions together, showing that POM, 1-POM and greedy matchings select exactly the same sets. Actually,POMandgreedymatchingsareequivalentnotions,asprovenin[1, 11, 18], and probably also by many others. For the proof of the next lemma we need an equivalent definition of POM. Assume τ (cid:54)=τ . We say τ is better than τ if every element a∈A is matched 1 2 1 2 inτ toabetterorequallygoodelementthaninτ ,accordingtothepreference 1 2 listofa. POMareexactlythemaximalelementswithrespecttothisbetterthan relation. Lemma 7. Let E ⊆ {1,...,n} with |E| = m. The following statements are equivalent. 1. E is (exactly) reachable, i.e. there exists a POM τ with s(τ)=E. 2. There exists a permutation π such that for the greedy matching τ we have π s(τ )=E. π 3. There exists a 1-Pareto optimal matching (1-POM) τ with s(τ)=E. 7 Proof. [1 ⇒ 2] Let τ be a POM matching such that s(τ) = E. We construct a permutation π inductively. If possible take as the next row, in the order of our permutation, the one that has a position of τ in its first entry. Delete the element a at this position from all other rows and continue. We show that at each stage there must be such a row. For the purpose of contradiction assume such a row does not exist. Take any row, denoted by q and let e be some 1 1 elementwhichislefttotheelementselectedbyτ inrowq . Becauseτ isPareto 1 optimal, there exists some row r selecting e . Let e be any element left to 2 1 2 e in row r . In this way we can define a sequence (e ) and (r ). As we have 1 2 i i only finitely many elements, at some point we get a first e that appears earlier j in the sequence e = e , i < j. This implies that in the rows r ,...r we can i j i j improvesimultaneously(i.e., itisablockingcoalition), whichisacontradiction to the assumption that τ is Pareto optimal. [2⇒3]Aseveryrowpicksthebestelement, notyetselected, itisclearthat no single row can improve. [3⇒1] Let τ be some 1-Pareto optimal matching and E =s(τ ). Observe 0 0 that all the elements left to the elements picked by τ are in E. The set of 0 matchings that are better or equal to τ is non-empty as it contains τ and the 0 0 setisofcoursefinite,sothereexistsabestmatchingτ amongthem,i.e. onefor 1 which there is no better matching. This must be a POM and by Observation 6 we have s(τ ) ⊆ s(τ ) = E, and the size of s(τ ) is also m. This implies 1 0 1 s(τ )=E. 1 Notethatthislemmaimpliesthatalsoforanyi,i-POMsselectthesamesets as POMs/1-POMs. Note also that the proof of Lemma 7 implies that actually every POM matching is greedy. The inverse is also true and left as an exercise, as we will not use it later. 2 Counting reachable elements and sets We start with a discussion of reachable positions. For every row r, there exists a reachable position p furthest to the right in that row, we call such a position r last reachable. However not all positions left of the last reachable position must be reachable. Consider for example the following matrix together with the matching τ indicated by circles. 5 4 3 2 5 1 6 7   1 2 8 9    2 1 5 4      The matching τ is Pareto optimal and thus the circled position in the bottom row with element 4 is the last reachable position in that row. However, it is easy to verify, that the two positions left to this circled position (with elements 1 and 5) are not reachable. 8 Theorem 2. Let M be an m×m matrix and E∗ be the set of all reachable elements. Then m |E∗|≤ (cid:98)m/i(cid:99)≤m(lnm+1). i=1 (cid:88) Proof. Let T be some set of k POMs of M. We denote by E(T) the set of elements reached by at least one POM of T. (formally: E(T) = s(τ).) τ∈T We claim: k (cid:83) |E(T)|≤ (cid:98)m/i(cid:99). i=1 (cid:88) Note k m (cid:98)m/i(cid:99)≤ (cid:98)m/i(cid:99)≤m(lnm+1). i=1 i=1 (cid:88) (cid:88) Because (cid:98)m/i(cid:99) = 0 for i > m and the known bound on the harmonic series. Thus the theorem follows from the claim. The proof goes by induction on k. The base case k =1 is true as one POM selects exactly m different elements. Consider now a set T of k ≥2 POMs and the set of positions reached by T. Among these positions we denote by p the position furthest to the right in row i i and we denote F = {p ,...,p }. We say that an element e (resp. position 1 m p) is uniquely reachable by some τ if τ is the only POM in T that reaches e (resp. selects p). Consider the set G⊆F of those rightmost reachable positions thatarereachablebyexactlyonePOMofT. Bythepigeon-holeprinciplethere exists a POM τ in T that reaches at most 1/k portion of G. Denote the set of elements in these positions by H (|H|≤(cid:98)m/k(cid:99)). By the definition of H all elements s(τ)\H are not selected uniquely by τ, i.e. some other matching of T also selects it. Let us explain this in more detail. Consider an element e ∈ s(τ)\H and denote by p the position of e. As p (cid:54)∈ G there exists another matching τ(cid:48) (cid:54)= τ that either also selects p or τ(cid:48) selects a position further to the right of p in the same row. In both cases holds: τ(cid:48) must select e by Observation 6. Thus the rest of the reached elements are also reachable by T \ τ. By induction we get k−1 k E(T)≤E(T \τ) +(cid:98)m/k(cid:99)≤ (cid:98)m/i(cid:99) + (cid:98)m/k(cid:99)= (cid:98)m/i(cid:99). (cid:32) (cid:33) i=1 i=1 (cid:88) (cid:88) A natural question is, if we could asymptotically improve upon the bound given in the previous Theorem 2. The following construction by Henze, Jaume and Keszegh [11] shows asymptotic tightness. The following example has more rows than columns, however the rows can befilledwitharbitraryelementsasnoneoftheseelementsarereachableanyway. 9

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.