ebook img

Speed and concentration of the covering time for structured coupon collectors PDF

0.34 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Speed and concentration of the covering time for structured coupon collectors

Speed and concentration of the covering time for structured coupon collectors Victor Falgas-Ravry 1, Joel Larsson 2, and Klas Markstro¨m 3 ∗ † ‡ 6 1 1Department of Mathematics, Vanderbilt University 0 2Institutionen f¨or Matematik och Matematisk Statistik, Ume˚a Universitet 2 3Institutionen f¨or Matematik och Matematisk Statistik, Ume˚a Universitet n a J 8 January 19, 2016 1 ] R P Abstract . h t a Let V be an n-set, and let X be a random variable taking values in the powerset of V. m Suppose we are given a sequence of randomcoupons X1,X2,..., where the Xi are independent [ random variables with distribution given by X. The covering time T is the smallest integer t 1 t ≥ 0 such that i=1Xi = V. The distribution of T is important in many applications in combinatorialprobability,andhasbeenextensivelystudied. Howevertheliteraturehasfocussed v S 5 almost exclusively on the case where X is assumed to be symmetric and/or uniform in some 5 way. 4 InthispaperwestudythecoveringtimeformuchmoregeneralrandomvariablesX;wegive 4 generalcriteriaforT beingsharplyconcentratedarounditsmean,precisetoolstoestimatethat 0 mean, as well as examples where T fails to be concentrated and when structural properties in . 1 thedistributionofX allowforaverydifferentbehaviourofT relativetothesymmetric/uniform 0 case. 6 1 : Keywords— Coupon collector; concentration inequalities; combinatorialprobability v i X 1 Introduction r a In this paper, we study the random covering problem in a general setting: we are interested in the distribution of the covering time T for general distributions of the random covering variable X. With the exception of a result of Aldous [4], discussed later, this is as far as we are aware the first timethecovering problemisstudiedinthisgenerality. However thequestion isanaturalone: there are many applications where the covering variable is non-uniform’ in a way which puts it outside ∗Electronic address: [email protected]. Research partially supported by a grant from the Kempefoundation. †Electronic address: [email protected] ‡Electronic address: [email protected]. Research supported by a grant from the Swedish Research Council (Vetenskapsr˚adet) 1 the current literature on covering problems. Also, a common drawback of many of the existing exact results aboutcovering processes is that the expressions obtained often involve a large number of summands and are hard to evaluate directly; this was pointed out for example by Sellke [55] and Adler and Ross [2]. Our own focus is on simple, easy-to-use concentration inequalities for the covering time which can be applied in a straightforward way. The the basic question we seek to answer: how does the distribution of X affect the covering time? Can we exploit ‘structure’ in the choice of X to ‘speed up’ or ‘slow down’ the covering? And when can we guarantee that T is sharply concentrated? Our paper is structure as follows. In Section 2, we gather together elementary bounds for the covering time, and identify the range of possible speeds of the covering process, giving examples going over the entire spectrum. We follow on in Section 3 with the main results of this paper, namely general structure theorems giving sufficient conditions for the covering time of an arbitrary random covering variable to besharply concentrated. Theseare stated in Section 3.1 and proved in Sections 3.3–3.6. In Section 4 we discuss ‘fast’ coverage by structured random variables. Finally in Section 5 we give some applications of our results to the connectivity of random graphs, continuum percolation, random graph colourings, the unsatisfiability threshold for k-SAT and the appearance of perfect matchings in random graphs. We end with some questions and remarks. 1.1 Definitions Let V be a finite set; usually we shall take V = [n] := 1,2,...,n . Let X be a random variable { } taking values in the power-set of V. A random variable X taking values in the power-set of V is referred to as a random covering variable, or random coupon. We say that X is exchangeable if the law of X is invariant under every permutation of V. We call X transitive if the law of X is invariant under the action of a transitive subgroup of Sym(V). We say X is balanced if for every v,v V we have P(v X) = P(v X). Finally, X is k-uniform if X = k with probability 1. ′ ′ ∈ ∈ ∈ | | We consider an infinite sequence X = X ,X ,... of i.i.d. random covering variables X X. 1 2 i { } ∼ We view this as a sequence of random coupons received by a coupon collector; we refer to X as i the ith coupon, and to the collector as the X-coupon collector. We set C = C (X) = X to t t i t i be the collection of elements of V covered by the union of the first t coupons X ,X ,...,≤X , and 1 2 t S define the covering time T =T(X) to be T = inf t : C = V . t { } This quantity T is sometimes referred to as the waiting time in the literature. Note that T could be infinite if, for example, X almost surely does not cover (contain) some element v V. We ∈ also define 1 T = T (X) = inf t : P(C = V) , 1 1 t 2 2 ≥ 2 (cid:26) (cid:27) to be the earliest time by which we have at least a fifty percent chance of having covered V, and for a subset A V we let τ = τ (X) be the least t such that A C if it exists, and infinite A A t ⊆ ⊆ otherwise. For v V, we let d (t), the degree of v at time t, denote the number of sets X with v i ∈ i t containing v. ≤ Our aim in this paper is to prove concentration results for the covering time T in a general setting, i.e. for arbitrary random covering variables X. We shall also consider applications where V Rd is a compact set and X takes values among the compact subsets of V, and define V , T t ⊆ 2 and T analogously to the discrete case. In this continuous setting, we shall use A to denote the 1 2 | | Lebesgue measure of a set A. For a sequence of events ( n)n N, we say that n holds with high A ∈ A probability (whp) if lim P( ) = 1. n n A →∞ Also, we say that a sequence of random variables (Yn)n N is sharply concentrated around f(n) if (Y /f(n)) converges to 1 in probability, i.e. ε > 0∈, lim P(Y /f(n) 1 > ε) = 0. We recnall herent∈hNe standard Landau notation for asym∀ptotic behanv→io∞ur. F|ornfunctio−ns|f,g : N R , 0 → ≥ we say that f = O(g) if there exists C > 0 such that f(n) Cg(n) for all but finitely many n. We ≤ writef = o(g) to denotethat lim f(n)/g(n) = 0. Finally, we usef = Ω(g) to denoteg = O(f), n →∞ we write f = θ(g) if both f = O(g) and f = Ω(g) hold, and use f = ω(g) or f g to denote ≫ g = o(f). 1.2 Some examples We give below some examples of random covering variables X, illustrating the definitions of ex- changeable, transitive and balanced above. Our first example is that of the quintessential ‘nice’ random covering variable: the k-uniform, exchangeable randomcoupon variable, which was thefocus of most of the previous work on coupon collecting. Example 1.1: Let X be a k-set of V = [n] selected uniformly at random, for some k : 1 k n; ≤ ≤ X is k-uniform and exchangeable. Next we give three examples of ‘structured’ coupon collectors, of the kind that motivate our work in this paper. Example 1.2: Let G be a graph on n vertices. Let X be the random coupon obtained by selecting a vertex x of V = V(G) uniformly at random and taking as the coupon the closed neighbourhood of x in G, Γ¯(x) := y V(G) : xy E(G) x . Here X is balanced if and only if the graph G is { ∈ ∈ }∪{ } regular, and transitive if and only if the graph G has a transitive automorphism group. Example 1.3: Let V = Q be the discrete d-dimensional hypercube 0,1 d, and let X be a k- d { } dimensional subcube of Q chosen uniformly at random, for some k : 0 k d. This random d ≤ ≤ covering variable X is transitive and 2k-uniform but not exchangeable, and, as described in Section 5, underlies the random SAT problem. Example 1.4: Let V be the square of area n, [0,√n]2 R2, and let X be the intersection of V with ⊂ the disc of radius r about a uniformly chosen random point x V. This random covering variable ∈ X is neither uniform nor balanced, due to boundary effects; it is relevant to problems of coverage in random geometric graph theory (see Section 5). 1.3 Motivation for coupon collecting The problem of determining the covering time of a set by a union of random subsets is of funda- mental importance in several areas of mathematics, most notably in probability theory, discrete mathematics and mathematical statistics. This importance is illustrated both by the age of the problem — in its simplest form, the covering problem can be traced back to de Moivre [44] in 3 1711 — and by the many appellations it has amassed through the years. It has been studied by a large number of mathematicians from a variety of backgrounds and under a variety of names: ma- trix occupancy [15], allocation of particles in complexes [59], committee problem [40], chromosome problem [56], urn-sampling [55] or urn-occupancy problem [16], the Dixie cup problem [46], and, perhaps most famously, the coupon collector problem [6]. The ubiquitous nature of the covering problem is due to its wide range of applications. It is linked to the study of random walks [3], colouring [10] and degree sequences [41] in graph the- ory. In Section 5 we also give applications of the coupon collector problem to the connectivity of random graphs. The performance analysis of many exploration or optimisation algorithms in theoretical computer science involves a solution to a covering problem [47, 58], while the unsatisfia- bility threshold for SAT correspondsto the cover time of a hypercubeby random subcubes[35, 18]. The ‘reverse’ coupon collector problem — estimating the size of V given C — is important to IP t traceback algorithms [54] and the study of biological diversity [48, 45] amongst other applicationss, while the study of the degrees d (t), v V, is central to hashing and load balancing [52]. There are v ∈ furtherapplications inpopulation genetics [37,51], evolutionary algorithms forfitness selection [49] and disordered system physics [29]. 1.4 Previous work on coupon collecting Most of the previous work on covering problems in the spirit of the present paper focussed on the case where X is an exchangeable, k-uniform random covering variable. The case k = 1, known as the coupon collector’s problem has received by far the most attention. It can be traced back to de Moivre [44], who computed the probability that P(V = V) exactly. Laplace [12] later generalised t de Moivre’s result to the k-uniform case for k 1. ≥ The second half of the twentieth century saw great activity on the problem, with many results replicated independently by researchers. Po´lya [50] gave an expression for the expected covering time T in the k-uniform exchangeable case. Feller’s textbook [21] included a computation of ET in the special case k = 1. Still in the case k = 1, Newman and Shepp [46] computed the expected time necessary for m-coverage of V (covering every point at least m times). Erd˝os and R´enyi [16] computed the asymptotic distribution of the m-coverage time for m 1; as their result is of ≥ particular relevance to this paper, we state it below: Theorem 1.5 (Erd˝os–R´enyi): Let V = [n], and let X be the random coupon obtained by selecting a singleton from V uniformly at random. Denote by Tm be the time at which every point of V has been covered by at least m of the coupons X ,...,X . Then for every x R, 1 Tm ∈ lim P(Tm < nlogn+(m 1)nloglogn+xn) = e e−x. − n − →∞ In particular Tm is sharply concentrated around nlogn+(m 1)nloglogn. − Continuing work on the 1-uniform exchangeable case, Baum and Billingsley [6] proved results on the asymptotic distribution of the size of C (the number of coupons collected by time t)as a t function of t; Holst [27] later generalised their result to unbalanced 1-uniform random covering variables X. A number of researchers worked on the distribution of the degrees (d (t)) , such v v V ∈ as Eicker, Siddiqui and Mielke [15], Mikhailov [43], Barbour and Holst [5] and Khakimullin and Enatskaya [36], all of whom dealt with the k-uniform case with k > 1 as well. A number of the 4 papers cited above also deal with the unbalanced, 1-uniform case; let us mention in addition the work of Papanicolaou, Kokolakis and Boneh [47], who gave an expression for the expected covering time when X is a randomly chosen 1-uniform random covering variable. In the exchangeable k-uniform case with k > 1, several researchers [40, 25, 57] computed, like Laplace, the expected covering time, giving closed-form formulae. Vatutin and Mikhailov [59] determined the asymptotic distribution of the number of degree zero (i.e. uncovered) vertices, which in turn gives results on the distribution of the covering time. Recently Ferrante and Frigo [22] gave an expression for the expected covering time when X is a k-uniform covering random variable with different v V receiving different weights. In a different ∈ direction, improving results of Sellke, Ivchenko [31] computed the asymptotic distribution of the covering time when n and X is a fixed (i.e. not varying on n) non-uniform exchangeable → ∞ random variable; similar results were also obtained by Johnson and Sellke [33], while a closed-form expression for the expectation of T appeared in Adler and Ross [2]. Finally, Aldous [4] proved a general abstract result about covering times, in connection with randomwalks ongraphs. Tostate hisresult, weneedonemoredefinition. Given arandomcovering variable X and an X-coupon collector, we let B = B(X) denote the set of “holdouts”, which is to say the last subset of V to be covered: B = C C (if the coupon collector does not cover V, T T 1 \ − we set B to be the collection of never-covered elements of V). Theorem1.6 (Aldous): Suppose ET = ω(1). ThenT issharply concentrated around itsexpectation if and only if E (Eτ ) B B = o(1). ET ThepowerofAldous’stheoremisitsgeneralityandthenecessityandsufficiencyofitshypothesis for the concentration of the covering time. However as Aldous observed “[w]ithout any structure being imposed [...] it is not clear how to estimate [E (Eτ )] in order to use these results”. Indeed, B B computing E (Eτ ) = P(A= B)Eτ B B A A V X⊆ requires us to estimate both the probability that a given set A is the “holdout” and to compute its expected covering time, both of which may be non-trivial tasks. Several surveys have been written on coupon collectors, random allocation, urn occupancy problems, etc. Amongst others, let us mention the book of Johnson and Kotz [34] and Kolchin, Sevast’yanov and Chistyakov [39], the surveys of Ivanov, Ivchenko and Medvedev [30] and Kobza, Jacobson and Vaughan [38], and the papers of Holst [28], Stadje [57], Flajolet, Gardy and Thi- monier [23] and McKay and Skerman [41]. 2 Preliminaries: thresholds and elementary bounds 2.1 Coarse threshold ItfollowsfromasimpleapplicationoftheBollob´as–Thomason thresholdtheorem[7]thatacovering process as we have defined it will always have a coarse threshold: 5 Proposition 2.1 (Coarse threshold): Let X be a covering random variable for a set V. Then o(1) if t T P(Ct = V) = 1 o(1) if t ≪ T12. (cid:4) ( 1 − ≫ 2 Thus the covering time T is whp of the same order as T . In the present work, however, we are 1 2 interested in a much sharper form of concentration than the one guaranteed by Proposition 2.1: we want thecovering timeT to besharply concentrated, i.e. wewant thatT/T 1in probability. As 1 2 → weshallseeinthenextsubsection, wecannotingeneral guarantee this kindof sharpconcentration. A question of crucial interest is then what conditions are necessary or sufficient to have sharp concentration for T — and how the value of ET may be computed in such cases. 2.2 Elementary bounds Let X be a covering random variable for an n-set set V. For each v V, let q = P(v X), and v ∈ ∈ set q = min q . We have the following elementary bounds on the location of T and probable ⋆ v v 1 2 location of T. Proposition 2.2: log(2) log(2n) T . 1 log(1 q⋆) ≤ 2 ≤ log(1 q⋆) − − − − What is more, for any fixed ε> 0 logn P T (1+ε) 1 n ε. − ≤ log(1 q ) ≥ − (cid:18) − − ⋆ (cid:19) Proof. For t T we have 1 ≥ 2 1 P(C = V) inf P(v C ) = 1 (1 q )t, t t ⋆ 2 ≤ ≤v V ∈ − − ∈ from which the claimed lower bound on T follows. For the upper bound, t T implies 1 1 2 ≤ 2 1 P(C = V) P(v / C )= (1 q )t n(1 q )t. t t v ⋆ 2 ≤ 6 ≤ ∈ − ≤ − v V v V X∈ X∈ Finally, for the ‘what is more’ statement, note that for t (1+ε) logn , the expected number ≥ log(1 q⋆) of vertices not yet collected is − − E V C = n(1 q )t n ε, t ⋆ − | \ | − ≤ whence by Markov’s inequality with probability at least 1 n ε we have C = V and T t. (cid:4) − t Note that if X is balanced then q = µ for all v V, wh−ere µ := E X . In particular if≤µ = o(n) v n ∈ | | then the bounds above can be rewritten as nlog2 log2 log(2n) nlogn (1+o(1)) = T = (1+o(1)) . µ log 1 µ ≤ 12 ≤ log 1 µ µ − − n − − n (cid:0) (cid:1) (cid:0) (cid:1) 6 Perhaps surprisingly, these elementary bounds are essentially sharp. As we shall show in the next section, the covering time T for the exchangeable k-uniform coupon collector (Example 1.1) is sharply concentrated around the value logn ; in particular if k = o(n), T = (1+o(1))nlogn. −log(1−nk) 12 k We think of this as ‘slow coverage’. On the other hand, there are instances of ‘fast coverage’, discussed in greater detail in Section 4. We give here a simple example. Example 2.3 (Couponcollector with lottery): Set V = [n], and p = c/n for some c =c(n) [0,n]. ∈ Let X be with probability 1 p a singleton from V chosen uniformly at random, and with probability − p the entire set V. Note that X is an exchangeable random covering variable, with expected size c E X = (1 p)+pn = 1+c . | | − − n Proposition 2.4: Let X and V be as in Example 2.3. Assume c = o(n) and c is bounded away from 0. Then: (i) T = (1+o(1))nlog2; 1 c 2 (ii) ET = (1+o(1))n; c (iii) lim P T > xn = e x for any fixed x 0. n→∞ c − ≥ Proof. By Th(cid:0)eorem 1.(cid:1)5, whp the 1-uniform exchangeable coupon collector does not cover [n] in time less than 1nlogn. Thus for time t < 1nlogn, whp T t if and only if we have ‘won the 2 2 ≤ lottery’ bytime T,thatis, if X = [n]forsomei t. Thisevent occurswithprobability 1 (1 p)t. i ≤ − − To obtain part (i) of the proposition, we observe that if t T then 1 ≥ 2 1 +o(1) 1 (1 p)t, 2 ≤ − − yielding t (1 + o(1)) log2 = (1 + o(1))nlog2, and we show similarly that if t T then ≥ log(1−p) c ≤ 21 t (1+o(1))nlog2 to conclude. ≤ c For part (ii), let T be the time at which we first ‘win the lottery’ by receiving all of V as our ′ coupon. We have 1 n ET = tp(1 p)(t 1) = = . ′ − − p c t X Since T T , we have that ET ET . Now from our estimates for the probability of winning the ′ ′ ≤ ≤ lottery by time t above, whp we have T = o(nlogn). Thus by Theorem 1.5, whp T = T , and ′ ′ ET tP(T = t T =T)P(T = T) t P(T = t) o(1) = (1+o(1))E(T ), ′ ′ ′ ′ ′ ≥ | ≥ − t t X X (cid:0) (cid:1) whence we are done. Finally for part (iii), we simply note that whp T = T , and that ′ xn P(T′ > ) = (1 p)xcn = e−x(1+O(n−1)) e−x. c − → 7 (cid:4) Proposition 2.4 shows two things. First of all, the lower bound on T in Proposition 2.2 is 1 2 essentially sharp; indeed taking c = c(n) tending to infinity slowly, we have in Example 2.3 that µ = E X = c(1 + o(1)), and T = (1 + o(1))nlog2. Further, by varying the value of c = c(n) | | 21 µ from Ω( 1 ) to o(n), we can get T to take asympotically any value between the bounds from logn 1 2 Proposition 2.2. Secondly, wecannotingeneralexpectT tobesharplyconcentrated: part(iii)of Proposition2.4 shows that we do not get sharper concentration than the Bollob´as–Thomason-type concentration guaranteed by Proposition 2.1. With this in mind, we next focus on conditions on X which guarantees sharp concentration of the covering time T and/or ‘slow coverage’. 3 General concentration results Let V = [n] and X be a random coupon variable for V. In this section we prove general results establishing (simple, easily checkable) sufficient conditions for sharp concentration of the covering time T(X). We also include some results in the special case where the random coupon variable X is balanced, transitive or exchangeable. Before stating our results, we need to introduce some notation. Our proof strategy involves approximating the discrete-time process of collecting coupons by a continuous-time process. Instead of the coupon collector drawing a random coupon X at integer timepoints,shedrawsarandomcouponX (fromthesamedistribution)attimesgivenbyaPoisson process with parameter 1. The times at which any given coupon is drawn will then be a thinned Poisson process, and the Poisson processes associated with different coupons will be independent. The times at which any particular element x V is drawn will also be a thinned Poisson process, though the Poisson ∈ processes associated with different elements x,y V will not in general be independent. Working ∈ in the continuous rather than in the discrete setting will greatly simplify calculations. For S [n], set h(S) = P(X = S). For every S with h(S) > 0, start a Poisson process with S ⊆ P intensity h(S). Each time an event occurs in , the coupon collector draws the coupon S. Let S P q := h(S) be the total intensity of all coupons covering x, and let q := h(S) be the x S x xy S x,y total inten∋sity of all coupons covering x and y simultaneously. Equivalently, q =∋P(x X) and x P P ∈ q = P(x,y X). Note that q = S h(S) = EX =: µ. xy ∈ x V x S| |· Let Z be the indicator even∈t of the element x not being covered at time t, and let Z := x,t t P P Z . An element x has not been covered by time t if its associated Poisson process with x V x,t inte∈nsity q has had no events in the time interval [0,t]. The probability of this occurring is e qxt, x − P and so the first two moments of Z are: t EZ = e qxt and EZ2 = e (qx+qy qxy)t. t − t − − x V x,y V X∈ X∈ Many of our proofs will use the second moment method to show concentration of Z , which implies t concentration of T(X). 8 3.1 Results Let V = V(n) be an n-set and X = X(n) a covering random variables for V (formally we consider sequences(V(n))n N and(X(n))n N). WeletT = T(X)denotethecoveringtimefortheX-coupon ∈ ∈ collector. Our first result gives us sufficient conditions for EZ (the number of uncollected elements) to t have a sharp transition from ω(1) to o(1). This holds trivially for balanced coupon collectors, but may fail if X is far from balanced — for instance, if some elements of V occur very rarely. For any α R, let q be the α-H¨older mean of the vector of intensities q = (q ), i.e. α x ∈ k k 1 q := 1 qα α (with the usual convention that for α = 0, q is the geometric mean k kα n x x k k0 ( q )1/n). x x (cid:0) P (cid:1) TQheorem 3.1: Set q := min q . If any of the following conditions is satisfied, then there exists ⋆ x x T (n) and T+(n) such that T = (1+o(1))T+, EZ and EZ 0. − − T− T+ → ∞ → (i) There exists t q 1 such that EZ 1 ≫ ⋆− t ≫ (ii) There exists α= o(logn) such that (α+2) q logn q α ⋆ k k− ≤ · (iii) There exists A = exp(ω(r)), which does not depend on n, such that for any r > 0 and all r sufficiently large n, the number of y V satisfying q < rq is at least A . y ⋆ r ∈ If qx 1 logn for all x,y, condition (ii) is met trivially with α = 0; in fact it can be shown qy ≤ 2 that the factor 1 can be replaced by any positive number. So in particular Theorem 3.1 applies to 2 ‘almost balanced’ random coupon variables X. Our second result gives whp bounds on T when correlations are bounded. Theorem 3.2: If there exist C = C(n), T = T (n) and T+ = T+(n) such that all of the following − − are satisfied: (i) EZ and EZ 0, T− T+ → ∞ → (ii) the coupons have C-bounded correlation, i.e. q Cq q for all x = y, xy x y ≤ 6 (iii) Cq¯= o( 1 ), where q¯= E q θ not covered at time T is the expected size of q for θ drawn logn θ − θ uniformly at random from among the uncovered vertices at time T , (cid:2) (cid:12) (cid:3) − (cid:12) then T T(X) T+ whp. − ≤ ≤ The parameter q¯should be viewed as the ‘speed’ of covering at time T , and it is usually hard − to compute exactly. However in order to apply Theorem 3.2 it is enough to give an upper bound on q¯. The simplest such bound, namely q¯ max q , can easily be improved. For instance, it is x x ≤ straight-forward to show that q¯ q for any finite α and all n sufficiently large. This makes α ≤ k k− condition (iii) easy to check in many situations. We can obtain further results when X is assumed to be balanced. The next theorem tells us that if either the coupons are ‘small’ (size o(n)) or the pairwise correlations between the elements of V are ‘not too strong’ then we have sharp concentration for T. Theorem 3.3: Let X be a balanced random coupon variable with µ := E X . | | 9 (i) If there exists t such that (eqxyt 1) = o(n2) and e qxt =ω(1), then T(X) t whp. x,y − x − ≥ (ii) If there exists 1 < β(n) <Pn tending to infinity and q =Po µ with q q for all but at nlogβ xy ≤ most 1n2 ‘bad’ pairs (x,y), then T(X) n(logβ ω(1))(cid:16)whp. (cid:17) β ≥ µ − In particular, if q = o µ and there are at most n1+o(1) such ‘bad’ pairs, then nlogn T(X) = (1 o(1))nlog(cid:16)n whp.(cid:17) ± µ (iii) If all coupons have size at most M, then T(X) n(logn logM ω(1)) whp, for ω(1) tend- ≥ µ − − ing to infinity arbitrarily slowly. In particular, if M = no(1), then T(X) = (1+o(1))nlogn whp. µ (iv) If q = q for all x = y,x = y , and all coupons have size at most M, and T is such that xy x′y′ ′ ′ − 6 6 T = n min o( n ),logn ω(1) , then T T(X) whp. − µ · M − − ≤ In particular,(cid:0)if M = o( n ), the(cid:1)n T(X) = (1+o(1))nlogn whp. logn µ There are examples where the whp lower bounds given by this theorem are sharp, while the upper boundsgiven by the firstmoment method are not; see for instance Example 4.1 in Section 4. Note also that unlike Theorem 1.6, Theorem 3.3 also locates the threshold. For balanced random covering variables we also have good control for both concentration and the covering time when X satisfies an ‘almost negative correlation’ condition. Here below we say that a function m = m(n) is sub-polynomial in n if m = no(1). Theorem 3.4: Let δ > 0 be fixed. Let X be a balanced covering random variable for an n-set V, with P(x X) = c for some c (0,1 δ). Suppose further that we have almost negative ∈ ∈ − correlations, namely that there exist η = o(1/logn) and b = b(n) subpolynomial in n such that for any x V ∈ P(x,y / X) (1 c)2(1+η). ∈ ≤ − holds for all but at most b elements y. Then whp T(X) =(1+o(1)) logn . log(1 c) − − Note that if η = 0 then the correlation condition is the same as the commonly used pairwise negative correlation condition. Recently a substantial theory for negatively correlated random variables has been developed and numerous common examples have been shown to have this and even stronger correlation properties, see [8]. Corollary 3.5: Suppose that X is balanced, has pairwise negative correlation, and P(x X) = ∈ c 1 δ for a fixed δ > 0. Then whp T(X) = (1+o(1)) logn . (cid:4) ≤ − log(1 c) − − If c = o(1) then the equality above may be rewritten as T(X) = (1+o(1))nlogn. EX We next give conditions implying sharp concentration for the covering time o|f a|n exchangeable random variable X around the same value as a uniform exchangeable random variable with the same mean coupon size. Theorem 3.6: Let X be an exchangeable random coupon variable, for V = [n], with maximum coupon size M, average coupon size µ and mean square coupon size χ. If any of the four conditions below holds, then whp T(X) = (1+o(1)) logn (which in the case µ = o(n) can be rewritten as log(1 µ) − −n T = (1+o(1))nlogn). µ 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.