Tighter Bounds for the Discrepancy of Boxes and 7 Polytopes 1 0 2 Aleksandar Nikolov n Department of Computer Science a J University of Toronto 9 [email protected] 1 ] O Abstract C Combinatorial discrepancy is a complexity measure of a collection of . h sets which quantifies how well the sets in the collection can be simulta- t neously balanced. More precisely, we are given an n-point set P, and a m a collection F = {F1,...,Fm} of subsets of P, and our goal is color P with two colors, red and blue, so that the largest absolute difference be- [ tween the number of red elements and the number of blue elements (i.e. 1 the discrepancy) in any Fi is minimized. Combinatorial discrepancy has v many applications in mathematics and computer science, including con- 2 structions of uniformly distributed point sets, and lower bounds for data 3 structures and private dataanalysis algorithms. 5 Weinvestigatethecombinatorial discrepancyofgeometrically defined 5 set systems, in which P is an n-point set in d-dimensional space, and F 0 isthecollection ofsubsetsofP inducedbydilationsandtranslationsofa . 1 fixed convex polytope B. Such set systems include sets induced by axis- 0 alignedboxes,whosediscrepancyisthesubjectofthewellknownTusna´dy 7 problem. Weprovenewdiscrepancyupperboundsforsuchsetsystemsby 1 extending the approach based on factorization norms previously used by v: the author and Matouˇsek. We improve the best known upper bound for i theTusna´dyproblembyalogarithmicfactor,usingaresultofBanaszczyk X on signed series of vectors. Weextendthisimprovementtoanyarbitrary r convex polytope B by using a decomposition duetoMatouˇsek. a 1 Introduction As usual, we define the combinatorial discrepancy of a system of subsets of F some finite set P as: disc := min max χ(p). F χ:P→{−1,1}F∈F p∈F X We refer the reader to the book of Matouˇsek [Ms10] for background on com- binatorial discrepancy. For a reference to the large number of applications of 1 combinatorial discrepancy to geometric discrepancy, numerical methods, and computer science, see the book of Chazelle [Cha00]. A recent application to private data analysis can be found in the thesis of the author [Nik14]. Let be the family of anchored axis-aligned boxes in Rd: := A(x) : d d x Rd A, where A(x) := y Rd : y x i [d] and [d] := 1A,...,d{. This i i ∈ } { ∈ ≤ ∀ ∈ } { } is a slight abuse of terminology: A(x) is not a box, but rather a polyhedral cone. Usually A(x) is defined to be anchored at 0, i.e. it is defined as the set y Rd : 0 y x i [d] . However, we prefer to anchor A(x) at i i { ∈ ≤ ≤ ∀ ∈ } ( , ). Thisconventiondoesnotaffectanyofthe resultsinthepaper,and −∞ −∞ allows us to avoid some minor technicalities. For an n-point set P Rd, we denote by (P) := A(x) P : x Rd d ⊂ A { ∩ ∈ } the set system induced by anchored boxes on P. (Note that this set system is finite, and, in fact, can have at most nd sets.) Tusn´ady’s problem asks for tightboundsonthelargestpossiblecombinatorialdiscrepancyof (P)overall d A n-point sets P. Here we prove the following theorem, which gives a new upper bound on the discrepancy. Theorem 1. For any d 2, there exists a constant C s.t. for any n-point set d P in Rd ≥ disc (P) C (1+logn)d−1/2. d d A ≤ It was shown in [MN15, MNT14] that for infinitely many n there exists an n-point set P such that disc (P) = Ω (lognd−1), so the upper bound above d d A is within a O (√logn) factor from the lower bound. (Here, and in the rest of d this paper, we use the notation O (),Ω (), Θ () to denote the fact that the p p p · · · implicitconstantinthe asymptotic notationdepends onthe parameterp.) The best previously known upper bound for the Tusn´ady problem was O (logdn), d and was recently provedby Bansaland Garg [BG16]. Their result improvedon the prior work of Larsen [Lar14] (see also the proof in [MNT14]), who showed an upper bound of O (logd+1/2n). d More generally, let be a collection of subsets of Rd, and, for an n-point set P Rd, let (P)Fbe the set system induced by on P, i.e. (P) := ⊂ F F F F P : F . We are interested in how the worst-case combinatorial { ∩ ∈ F} discrepancy of such set systems grows with n. This is captured by the function disc(n, ) := sup disc (P) : P Rd, P = n . In this notation, Theorem 1 F { F ⊂ | | } shows that disc(n, )=O (logd−1/2n). d d Let K Rd, aAnd let’s define to be the family of images of K under K translations⊆and homothetic transfTormations: := tK +x : t R ,x K + Rd , where R is the set of positive reals. IfTwe take{Q := [0,1]d∈, then it’∈s + } well-known that disc(n, ) 2ddisc(n, ) and, therefore, Theorem 1 implies Q d T ≤ A disc(n, ) = O (logd−1/2n). Our next result shows that this bound holds for Q d any polTytope B in Rd. Theorem 2. For any d 2, and any closed convex polytope B Rd, ≥ ⊂ disc(n, )=O (logd−1/2n). B d,B T 2 We note that with the same proof we can establish the stronger fact that disc(n,POL( ))=O (logd−1/2n), where is a family ofhyperplanesin Rd, d,H andPOL( )HisthesetofallpolytopeswhichcHanbewrittenas m γ ,witheach H i=1 i γ a halfspace whose boundary is parallel to some h . The best previously i ∈ H T known upper bound in this setting is also due to Bansal and Garg [BG16], and is equal to O (logdn). d,B Our approachbuilds onthe connectionbetweenthe γ normandhereditary 2 discrepancy shown in [NT15, MNT14]. The new idea which enables the tighter bounds is to use a result of Banaszczyk, proved in the context of improving the constants in the Steinitz lemma, in order to get one dimension “for free”. The proof of Theorem 2 combines the ideas in the proof of Theorem 1 with a decomposition due to Matouˇsek [Mat99]. 1.1 Connection to Geometric Discrepancy Geometric discrepancy measures the irregularity of a distribution of n points in [0,1]d with respect to a family of distinguishing sets. In particular, for an n-point set P [0,1]d and a family of measurable subsets of Rd, we define ⊆ F the discrepancy D(P, ):= sup P F λ (F [0,1]d) , d F | ∩ |− ∩ F∈F (cid:12) (cid:12) where λ is the Lebesgue measur(cid:12)e on Rd. The smallest(cid:12)achievable discrep- d ancy over n point sets with respect to is denoted D(n, ) := inf D(P, ) : P Rd, P = n . A famous result of SFchmidt [Sch72] shoFws that D{(n, F) = 2 ⊂ | | } A Θ(logn). The picture is much less clear in higher dimensions. In a seminal paper [Rot54], Roth showed that D(n, ) = Ω (log(d−1)/2n) for any d 2; d d A ≥ the best known lower bound in d 3 us due to Bilyk, Lacey, and Vaghar- shakyan [BLV08] and is D(n, )=≥Ω (log(d−1)/2+ηdn), where η is a positive d d d A constantdependingondandgoingto0asdgoestoinfinity. Ontheotherhand, thebestknownupperboundisD(n, )=O (logd−1n)andcanbeachievedin d d A manydifferentways,oneofthesimplestbeingtheHalton-Hammersleyconstruc- tion [Hal60, Ham60]. The book by Beck and Chen [BC08] calls the problem of closingthissignificantgap“theGreatOpenProblem”(ingeometricdiscrepancy theory). See the book of Matouˇsek[Ms10] for further backgroundon geometric discrepancy. There is a known connection between combinatorial and geometric discrep- ancy. Roughly speaking, combinatorial discrepancy is an upper bound on geo- metric discrepancy. More precisely, we have the following transference lemma, which goes back to the work of Beck on Tusn´ady’s problem [Bec81] (see [Ms10] for a proof). Lemma 3. Let be a family of measurable sets in Rd such that there is some F F whichcontains[0,1]d. Assumethat D(n,F) goes to0asngoestoinfinity, ∈F n andthatdisc(n, ) f(n)for afunctionf thatsatisfiesf(2n) (2 δ)f(n)for F ≤ ≤ − all n and some fixed δ > 0. Then there exists a constant C that only depends δ on δ, for which D(n, ) C f(n). δ F ≤ 3 Lemma 3 and Theorem 2 imply that D(n, ) = O (logd−1/2n) for any B d,B convex polytope B in Rd. This bound gets withTin an O (√logn) factor from d,B the best bound knownfor boxes in d dimensions. See the remarksafter Section 4.6. of [Ms10] for references to prior work on bounding D(n, ) for a convex B T polytope B. 2 Preliminaries In what follows we use , for the standard inner product on Rn. We use h· ·i capitalletters to denote matrices,and lowercase letters with indexes to denote matrix entries, e.g. a to denote the entry in row i column j of the matrix A. ij We also use µ for the standard Gaussian measure on Rn. n 2.1 The γ factorization norm 2 Theγ normwasintroducedinfunctionalanalysistostudyoperatorsthatfactor 2 through Hilbert space. We say that an operator u: X Y between Banach → spaces X and Y factors through a Hilbert space if there exists a Hilbert space H and bounded operators u : X H and u : H Y such that u = u u . 1 2 2 1 → → Then the γ norm of u is 2 γ (u):=inf u u , 2 1 2 k kk k where the infimum is taken over all Hilbert spaces H, and all operators u 1 and u as above. Here u and u are the operator norms of u and u , 2 1 2 1 2 k k k k respectively. The book ofTomczak-Jaegermann[TJ89]is anexcellentreference on factorization norms and their applications in Banach space theory. Inthisworkwewillusetheγ normofanm nmatrixA,whichisdefinedas 2 × the γ norm of the linear operator u: ℓn ℓm with matrix A (in the standard bases2of Rm and Rn). In the language o1f→mat∞rices, this means that γ (A):=inf U V :A=UV , 2 2→∞ 1→2 {k k k k } wheretheinfimumistakenovermatricesU andV, V equalsthelargestℓ 1→2 2 k k normofacolumnofV,and U equalsthelargestℓ normofarowofU. By 2→∞ 2 k k a standard compactness argument, the infimum is achieved; moreover, we can takeU Rm×r andV Rr×n,wherer istherankofA. Yetanotherequivalent ∈ ∈ formulation, which will be convenient for us, is that γ (A) is the smallest non- 2 negative real t for which there exist vectors u ,...,u ,v ,...,v Rr such 1 m 1 n ∈ that for any i [m],j [n], a = u ,v and u t, v 1. ij i j i 2 j 2 ∈ ∈ h i k k ≤ k k ≤ Let us further overload the meaning of γ by defining γ ( ) = γ (A) for a 2 2 2 F set system with incidence matrix A. We recall that the incidence matrix of F a system =F ,...,F of subsets of a set P =p ,...,p is defined as 1 m 1 n F 1 p F j i a := ∈ . ij (0 pj Fi 6∈ 4 In other worse, γ ( ) is the smallest non-negative real t such that there exist 2 F vectors u ,...,u ,v ,...,v satisfying 1 m 1 n 1 p F j i u ,v = ∈ , i j h i (0 pj Fi 6∈ and u t, v 1 for all i [m],j [n]. i 2 j 2 k k ≤ k k ≤ ∈ ∈ In [MN15] it was shown that γ ( ) is, up to logarithmic factors, equivalent 2 F to a hereditary version of combinatorial discrepancy. We do not directly use this connection here. In [MN15] it was also shown that γ satisfies a number 2 of nice properties which help in estimating the norm of specific matrices or set systems. Here we only need the following inequality, which holds for a set system = ... , where ,..., are set systems over the same set P: 1 k 1 k F F ∪ F F F k γ ( ) γ ( )2. (1) 2 F ≤v 2 Fi ui=1 uX t Wealsoneedthefollowingsimplelemma,whichfollows,e.g.fromtheresults in [MN15, MNT14], but also appears in a similar form in [Lar14], and follows from standard dyadic decomposition techniques. Lemma 4. For any d 1 there exists a constant C such that the following d holds. For any n-point s≥et P Rd, γ ( (P)) C (1+logn)d. 2 d d ⊂ A ≤ 2.2 Signed Series of Vectors We will use the following result of Banaszczyk: Lemma 5 ([Ban12]). Let v ,...,v Rm, i : v 1, and let K Rm be 1 n i 2 ∈ ∀ k k ≤ ⊂ a convex body symmetric around the origin. If µ (K) 1 1/(2n), then there m ≥ − exists an assignment of signs χ: [n] 1,1 so that →{− } j j 1,...,n : χ(i)v 5K. i ∀ ∈{ } ∈ i=1 X This lemma was proved in the context of the well-known Steinitz problem: givenvectorsv ,...,v , eachof Euclideannormat most1,suchthat v +...+ 1 n 1 v = 0, find a permutation π on [n] such that for all integers i, 1 i n, n ≤ ≤ v +...+v C√m, where C is an absolute constant independent of π(1) π(i) 2 k k ≤ m or n. Lemma 5 gives the best partial result in this direction: it can be used to show a bound of C(√m+√logn) in place of C√m. Lemma 5 follows relatively easily from another powerful result Banaszczyk proved in [Ban98]. Unfortunately, the proof of the latter does not suggest any efficient algorithm to find the signs χ(i), and no such algorithm is yet known. 5 3 Proof of Theorem 1 In this section we give the proof of Theorem 1. Let us fix the n-point set P Rd once and for all. Without loss of generality, assume that each p P ⊂ ∈ has a distinct last coordinate, and order the points in P in increasing order of theirlastcoordinateasp ,...,p . Writeeachp asp =(q ,r ),whereq Rd−1 1 n i i i i i and r R. With this notation, and the ordering we assumed, we hav∈e that i ∈ r <r whenever i<j. i j LetQ:= q :1 i n . Noticethatthisisann-pointsetinRd−1. Denote i { ≤ ≤ } the setsin (Q)asA ,...,A (innoparticularorder). ByLemma 4,there d−1 1 m A exist vectors u ,...,u and v ,...,v such that 1 m 1 n 1 q A j i u ,v = ∈ , (2) i j h i (0 qj Ai 6∈ and u C (1+logn)(d−1), v 1 for all i and j. Define the symmetric i 2 d j 2 k k ≤ k k ≤ polytope K := x Rm : u ,x C′(1+logn)d−1/2 i 1,...,m , { ∈ |h i i|≤ d ∀ ∈{ }} where C′ >C is a constant large enough that µ (K) 1 1/(2n). The fact d d m ≥ − that such a constant exists follows from standard concentration of measure re- sults in Gaussianspace. Indeed, using a Bernstein-type inequality for Gaussian measure, we can show that, for C′ big enough, µ (S ) 1 1/(2nd+1) for all d m i ≥ − i [m], where S := x: u ,x C′(1+logn)d−1/2) . By the union bound, si∈nce m nd, thiis im{plies|hµi (i|m≤ Sd) 1 1/2n. } ≤ m i=1 i ≥ − The body K and the vectors v ,...,v then satisfy the assumptions of 1 n T Lemma 5, and, therefore,there exists and assignmentof signs χ: [n] 1,1 →{− } such that, for any k, 1 k n, ≤ ≤ k χ(j)v 5K. j ∈ j=1 X Expanding the definition of K, we see this is equivalent to k i 1,...,m : χ(j) u ,v 5C′(1+logn)d−1/2. (3) ∀ ∈{ } (cid:12) h i ji(cid:12)≤ d (cid:12)j=1 (cid:12) (cid:12)X (cid:12) (cid:12) (cid:12) forFaonryexachRi,d,1w≤e ica≤n wmr,itl(cid:12)(cid:12)eetAu(sx)defiPneaAs′iA(cid:12)(cid:12)=′ {ppj :,.q.j.,∈pAi}fo.rWsoemcelaiimantdhakt. ∈ ∩ i ∩{ 1 k} (Herewe assumethatA(x) P isnon-empty: the othercaseis irrelevantto the proof.) To see this, let x=∩(y,x ), where y Rd−1 and x R. Let i be such d d ∈ ∈ that A(y) Q=A , and let k be the largest integer such that r x . Then: i k d ∩ ≤ A(x) P = p :q A(y),j k =A′ p ,...,p . ∩ { j j ∈ ≤ } i∩{ 1 k} 6 It follows that χ(j) = χ(j) = χ(j) u ,v 5C′(1+logn)d−1/2, (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) h i ji(cid:12)≤ d (cid:12)(cid:12)j:pjX∈A(x) (cid:12)(cid:12) (cid:12)(cid:12)j:qj∈XAi,j≤k (cid:12)(cid:12) (cid:12)(cid:12)Xj≤k (cid:12)(cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) w(cid:12)here the penu(cid:12)ltim(cid:12)ate equality fol(cid:12)lows(cid:12)from (2), and t(cid:12)he final inequality is (3). Since xwasarbitrary,wehaveshownthatdisc (P) 5C′(1+logn)d−1/2, as Ad ≤ d was required. 4 Proof of Theorem 2 ThemainingredientinextendingTheorem1toarbitrarypolytopesisageomet- ric decomposition due to Matouˇsek. To describe the decomposition we define the k-composition Q ( ) of sets from a (finite) set system as follows. For k F F k =0, Q ( )= ; for an integer k >0, we have k F ∅ Q ( ):= F F :F Q ( ),F Q ( ),F F = ,k +k =k k F { 1∪ 2 1 ∈ k1 F 2 ∈ k2 F 1∩ 2 ∅ 1 2 } ∪ F F :F Q ( ),F Q ( ),F F ,k +k =k . { 1\ 2 1 ∈ k1 F 2 ∈ k2 F 2 ⊆ 1 1 2 } By an easy induction on k, we see that discQ ( ) kdisc . (4) k F ≤ F We also extend the notion of anchored boxes to “corners” whose bounding hyperplanes are not necessarily orthogonal. Let W = w ,...,w be a basis 1 d of Rd. Then we define := A (x) : x Rd , whe{re A (x) =} y Rd : W W W A { ∈ } { ∈ w ,y w ,x i [d] . i i h i≤h i ∀ ∈ } The following lemma gives the decomposition result we need. Lemma 6 ([Mat99]). Let B Rd bea convex polytope. There exists a constant k depending on dand B, and⊂k bases W ,...,W of Rd suchthat every B′ 1 k B ∈T belongs to Q ( ... ). Moreover, e W W ... W , where e is the first stkanAdWar1d∪basis∪vAecWtokr of Rd. 1 ∈ 1∩ 2∩ ∩ k 1 Matouˇsek does not state the condition after “moreover”; nevertheless, it is easy to verify this condition holds for the recursive decomposition in his proof of Lemma 6. We will also need a bound on γ ( (P)) for a basis W and an n-point set 2 W A P. Lemma 7. For any d 1 there exists a constant C such that the following d holds. For any basis W≥of Rd, and any n-point set P, γ ( (P)) C (1+ 2 W d A ≤ logn)d. Proof. Let W = w ,...,w , and let u be the linear map that sends the i-th 1 d standardbasisvectore tow foreachi [d]. LetQ=u(P):= u∗(p):p P , i i ∈ { ∈ } 7 where u∗ is the adjoint of u. It is easy to verify that (P) = (Q), and, W d A A therefore, γ ( (P))=γ ( (Q)) C (1+logn)d, 2 W 2 d d A A ≤ where the final inequality follows from Lemma 4. We are now ready to finish the proof of Theorem 2. As in the proof of Theorem 1, we fix the n-point set P Rd once and for all, and we order P ⊂ as p ,...,p in increasing order of the last coordinate. We write each p as 1 n i p =(q ,r ) for q Rd and r R, and define Q:= q :1 i n . i i i i i i ∈ ∈ { ≤ ≤ } Let W ,...,W be as in Lemma 6, and let W′ = W e for each i, 1 i k.1Observekthat W′ is a basis of Rd−1. By (i1) andiL\em{m1a}7, ≤ ≤ i γ2(AW1′(Q)∪...∪AWk′(Q))≤√kCd(1+logn)d−1. By an argument using Lemma 5 analogous to the one used in the proof of Theorem 1, we can then show that there exists a constant C′ depending on B,d B and danda coloringχ: [n] 1,1 ,such thatfor any integeri, 1 i k, and any x Rd, →{− } ≤ ≤ ∈ χ(j) C′ (1+logn)d−1/2. (cid:12) (cid:12)≤ B,d (cid:12)(cid:12)j:pj∈XAWi(x) (cid:12)(cid:12) (cid:12) (cid:12) Here CB′ ,d is impli(cid:12)(cid:12)citly assumed t(cid:12)(cid:12)o depend on k as well, which depends on B and d. This establishes that disc C′ (1 + logn)d−1/2, where = F ≤ B,d F (P) ... (P). Because, by Lemma 6, (P) C ( ), (4) implies AW1 ∪ ∪AWk TB ⊂ k F disc (P) discC ( ) kdisc kC′ (1+logn)d−1/2. TB ≤ k F ≤ F ≤ B,d This finishes the proof of the theorem. The same asymptotic bound with B T replaced by POL( ) for a family of hyperplanes can be proved by replacing H H Lemma 6 with an analogous decomposition lemma for POL( ), also proved H in [Mat99]. References [Ban98] W. Banaszczyk. Balancing vectors and Gaussian measures of n- dimensional convex bodies. Random Structures and Algorithms, 12(4):351–360,1998. [Ban12] WojciechBanaszczyk.Onseriesofsignedvectorsandtheirrearrange- ments. Random Structures Algorithms, 40(3):301–316,2012. [BC08] J´ozsef Beck and William W. L. Chen. Irregularities of distribu- tion, volume 89 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 2008. Reprint of the 1987 original [MR0903025]. 8 [Bec81] J. Beck. Balanced two-colorings of finite sets in the square. I. Com- binatorica, 1:327–335,1981. [BG16] Nikhil Bansal and Shashwat Garg. Algorithmic discrepancy beyond partial coloring. CoRR, abs/1611.01805,2016. [BLV08] DmitriyBilyk,MichaelT.Lacey,andArmenVagharshakyan. Onthe small ball inequality in all dimensions. J. Funct. Anal., 254(9):2470– 2502, 2008. [Cha00] Bernard Chazelle. The discrepancy method. Cambridge University Press, Cambridge, 2000. Randomness and complexity. [Hal60] J. H. Halton. On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer. Math., 2:84–90,1960. [Ham60] J. M. Hammersley. Monte Carlo methods for solving multivariable problems. Ann. New York Acad. Sci., 86:844–874(1960), 1960. [Lar14] K. G. Larsen. On range searching in the group model and combi- natorial discrepancy. SIAM Journal on Computing, 43(2):673–686, 2014. [Mat99] Jir´ıMatouˇsek. Onthediscrepancyforboxesandpolytopes. Monatsh. Math., 127(4):325–336,1999. [MN15] Jir´ıMatousekandAleksandarNikolov.Combinatorialdiscrepancyfor boxes via the gamma 2 norm. In Lars Arge and J´anos Pach, editors, 31st International Symposium on Computational Geometry, SoCG 2015, June 22-25, 2015, Eindhoven, The Netherlands, volume 34 of LIPIcs, pages 1–15. Schloss Dagstuhl - Leibniz-Zentrum fuer Infor- matik, 2015. [MNT14] Jiˇr´ıMatouˇsek,AleksandarNikolov,and Kunal Talwar. Factorization norms and hereditary discrepancy. CoRR, abs/1408.1376v2,2014. [Ms10] Jiˇ r´ıMatouˇ sek. Geometric discrepancy, volume 18 of Algorithms and Combinatorics. Springer-Verlag, Berlin, 2010. An illustrated guide, Revised paperback reprint of the 1999 original. [Nik14] AleksandarNikolov.NewComputationalAspectsofDiscrepancyThe- ory. PhD thesis, Rutgers, The State University of New Jersey, 2014. [NT15] Aleksandar Nikolov and Kunal Talwar. Approximating hereditary discrepancy via small width ellipsoids. In Piotr Indyk, editor, Pro- ceedings of theTwenty-Sixth AnnualACM-SIAMSymposium on Dis- crete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 324–336.SIAM, 2015. 9 [Rot54] K. F. Roth. On irregularities of distribution. Mathematika, 1:73–79, 1954. [Sch72] WolfgangM.Schmidt. Irregularitiesofdistribution.VII. ActaArith., 21:45–50,1972. [TJ89] N. Tomczak-Jaegermann. Banach-Mazur Distances and Finite- Dimensional Operator Ideals. Pitman Monographs and Surveys in Pure and Applied Mathematics 38. J. Wiley, New York, 1989. 10