ebook img

Limit theorems for Betti numbers of random simplicial complexes PDF

0.49 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Limit theorems for Betti numbers of random simplicial complexes

LIMIT THEOREMS FOR BETTI NUMBERS OF RANDOM SIMPLICIAL COMPLEXES MATTHEWKAHLEANDELIZABETHMECKES 1 1 0 2 Abstract. Therehavebeenseveralrecentarticlesstudyinghomologyofvar- n ious types of random simplicial complexes. Several theorems have concerned a thresholds for vanishing of homology, and in some cases expectations of the J Bettinumbers. Howeverlittleseemsknownsofaraboutlimitingdistributions 8 ofrandomBettinumbers. 1 In this article we establish Poisson and normal approximation theorems for Betti numbers of different kinds of random simplicial complex: Erd˝os- ] R´enyirandomcliquecomplexes,randomVietoris-Ripscomplexes,andrandom R Cˇechcomplexes. Theseresultsmaybeofpracticalinterestintopologicaldata P analysis. . h t 1. Introduction a m Several papers have recently appeared concerning the topology of random sim- [ plicial complexes [11, 2, 10, 12, 13, 16, 9]. The results so far identify thresholds for vanishing of homology, or compute the expectation of the Betti numbers E[βk] 3 (i.e. the expected rank of these groups). In this article we prove Poisson and nor- v malapproximationtheoremsforβ forthreemodelsofrandomsimplicialcomplex. 0 k 3 Thecomplexesthemselvesaredefinedpreciselyandgivenfurthermotivationinthe 1 following sections but we first outline our results. 4 The first model considered is that of the Erd˝os-R´enyi random clique complex . 9 X(n,p), a higher dimensional analogue of the Erd˝os-R´enyi random graph G(n,p). 0 Itwasshownin[11]thatforeachk andacertainrangeofp=p(n), β (cid:54)=0asymp- k 0 ˙ totically almost surely (a.a.s), and in this regime, a formula for the asymptotic 1 size of E[β ]in terms of p is given. (Outside of this regime it is conjectured that : k v β =0 a.a.s. and some evidence for the conjecture is given in [11].) Here we prove k i X a Central Limit Theorem for β . That is, we show that k r β −E[β ] a k k ⇒N(0,1), (cid:112) Var[β ] k as n→∞, where N(0,1) is the normal distribution with mean 0 and variance 1. The second model considered is the random Cˇech complex. This model is a higher-dimensional analog of the random geometric graph; the underlying graph is a random geometric graph and the presence of (k−1)-dimensional faces is deter- mined by k-fold intersections of balls centered about the vertices. Cˇech complexes Date:January19,2011. M. Kahle’s research was supported by the NSF Research Training Group grant in geometry andtopologyofStanfordUniversity. E. Meckes’s research was supported by an American Institute of Mathematics Five-year Fel- lowshipandNSFgrantDMS-0852898. 1 2 MATTHEWKAHLEANDELIZABETHMECKES Figure1. TheBettinumbersofX(n,p)plottedverticallyagainst edge probability p; in this example n = 100. Computation and graphic courtesy of Afra Zomorodian. are homotopy equivalent to Edelsbrunner and Mu¨cke’s alpha shapes, widely ap- plied in computational geometry and topology [6]. The analysis needed to obtain limit theorems for the Betti numbers of random Cˇech complexes is more subtle that what is needed for the Erd¨os-R´enyi model; to prove the normal and Poisson approximation theorems we must first establish limit theorems for certain hyper- graph counts, extending some of Mathew Penrose’s results for subgraph counts for geometric random graphs [15]. The final type of complex considered is the random Vietoris-Rips complex, de- noted VR(n,r). This is similar to the random Cˇech complex; the construction is to take the clique complex of a random geometric graph. (A useful reference for geometricrandomgraphsis[15].) Thetopologyisverydifferentthanfortheclique complex of the Erd˝os-R´enyi random graph; for the contrast between X(n,p) and VR(n,r)seeFigures1and2. Theanalysisneededtoobtainlimittheoremsforthe BettinumbersofVR(n,r)isneverthelessessentiallyidenticaltothatneededforthe random Cˇech complex. A minor example of this fact is that in both cases, since β 0 countsthenumberofconnectedcomponentsfortheCˇechandRipscomplexes,β is 0 actually the same in each of these cases and is equal to the number of components oftherandomgeometricgraph. ThishasalreadybeentreatedindetailbyPenrose [15], and so when convenient we will restrict attention to β for k ≥1. k The techniques throughout the paper are a combination of inequalities derived from combinatorial and topological considerations with Stein’s method. (For an introduction to topological combinatorics see [4]; for a survey of Stein’s method in proving Poisson approximation theorems see [5], and for an introduction to Stein’s method for normal approximation, see [17].) 1.1. Notation and conventions. Throughout this article, we use Bachmann- Landaubig-O, little-O, andrelatednotations. Inparticular, fornon-negativefunc- tions g and h, we write the following. LIMIT THEOREMS FOR BETTI NUMBERS 3 • g(n)=O(h(n)) means that there exists n and k such that for n>n , we 0 0 have that g(n) ≤ k·h(n). (i.e. g is asymptotically bounded above by h, up to a constant factor.) • g(n)=Ω(h(n)) means that there exists n and k such that for n>n , we 0 0 have that g(n) ≥ k·h(n). (i.e. g is asymptotically bounded below by h, up to a constant factor.) • g(n)=Θ(h(n)) means that g(n)=O(h(n)) and g(n)=Ω(h(n)). (i.e. g is asymptotically bounded above and below by h, up to constant factors.) • g(n) = o(h(n)) means that for every (cid:15) > 0, there exists n such that 0 for n > n , we have that g(n) ≤ (cid:15) · h(n). (i.e. g is dominated by h 0 asymptotically.) • g(n) = ω(h(n)) means that for every k > 0, there exists n such that for 0 n>n , we have that g(n)≥k·h(n). (i.e. g dominates h asymptotically.) 0 We may also write An (cid:39)Bn if limn→∞ BAnn =1, and An (cid:46)Bn if there is a constant c such that A ≤cB for all n. n n A sequence {X }∞ of random variables is said to converge weakly to a limit- n n=1 ing random variable X (written X ⇒ X) if lim E[f(X )] = E[f(X)] for all n n→∞ n bounded continuous functions f (there are several other equivalent definitions). The total variation distance between random variables X and Y is defined by (cid:12) (cid:12) dTV(X,Y):=sup(cid:12)E[f(X)]−E[f(Y)](cid:12), f with the supremum taken over all continuous functions bounded by one. Clearly, if d (X ,X) → 0 as n → ∞, then X ⇒ X; however, the topology induced by TV n n the total variation distance is stronger than the topology of weak convergence. TheL -WassersteindistanceorKantorovich-RubensteindistancebetweenX and 1 Y is defined by (cid:12) (cid:12) d1(X,Y):=sup(cid:12)E[f(X)]−E[f(Y)](cid:12), f where the supremum is over all functions f with sup |f(x)−f(y)| ≤ 1. This dis- x(cid:54)=y |x−y| tance also induces a topology stronger than the topology of weak convergence. Finally,thenormaldistributionwithmeanµandvarianceσ2isdenotedN(µ,σ2), and the distribution function of the standard normal distribution is denoted Φ(t). 2. Erdo˝s-Re´nyi random clique complexes Perhapsthefirsttypeofrandomsimplicialcomplexstudiedwasthe1-dimensional version studied by Erd˝os and R´enyi [7]. Definition 2.1. TheErd˝os-R´enyi random graphG(n,p)istheprobabilityspaceof all graphs on vertex set [n] = {1,2,...,n} with each edge included independently with probability p. The“cliquecomplex”isusedtogeneralizeG(n,p)fromgraphstohigherdimen- sional simplicial complexes. Definition 2.2. The clique complex X(H) of a graph H is a the simplicial com- plex with vertex set V(H) and a face for each set of vertices spanning a complete subgraph of H. 4 MATTHEWKAHLEANDELIZABETHMECKES Inotherwords,thecliquecomplexX(H)ofagraphH isthemaximalsimplicial complex with 1-skeleton H. This section concerns the clique complex of the Erd˝os-R´enyi random graph, i.e. X(G(n,p)). For simplicity in notation, this is denoted X(n,p). There are several motivations for using X(n,p) as a model of a random simpli- cialcomplex. OnemotivationisthatX(n,p)providesanaturalhigher-dimensional generalization of G(n,p), which has proved extremely useful in graph theory as well as in applications. (Other higher-dimensional generalizations are studied in [2, 12, 13].) Another motivation comes from the fact that every simplicial com- plex is homeomorphic to the clique complex of some graph (e.g. by barycentric subdivision) [8]. One interesting feature of X(n,p) is that it provides homological analogues of the Erd˝os-R´enyi theorem, but in a non-monotone setting: If edges are added at random to an empty graph, the Erd˝os-R´enyi theorem characterizes the number of edges needed before the graph becomes connected. Connectivity is a monotone graph property – if one adds edges to a connected graph, it is still connected. Topologically, connectivity is equivalent to a statement about zeroth homology H (G(n,p)) but if one asks about H (X(n,p)), k >0, there is a problem – adding 0 k edges generates higher k-dimensional faces and (k +1)-dimensional faces at the same time. Since generators and relations are both being added, there is no reason that things have to behave in a monotone way. In fact, it is not just that things might not be monotone; they are non-monotone in an essential way. In particular, there seem to be two thresholds for higher homology – one where H passes from k vanishing to non-vanishing, and another where it passes back to vanishing. Thefollowingtheoremwasprovedin[11]. Foranyfixedk >0,letβ denotethe k dimension of kth homology, i.e. β =dim[H (∆,Q)]. k k Theorem 2.3. If p=ω(n−1/k) and p=o(n−1/(k+1)) then E[β (X(n,p))] 1 lim k = . n→∞ nkp(k+21) (k+1)! (In [11] explicit nontrivial homology classes are exhibited, and several partial conversesofTheorem2.3areproved;inparticularitisshownthatifp=O(n−1/k−(cid:15)) or p=Ω(n−1/(2k+1)+(cid:15)) for some constant (cid:15)>0, then a.a.s. β =0.) k The remainder of this section is devoted to showing that in the same regime, β k obeys a central limit theorem. Theorem 2.4. If p=ω(n−1/k) and p=o(n−1/(k+1)) then β (X(n,p))−E[β (X(n,p))] k k ⇒N(0,1). (cid:112) Var[β ] k Proof. For a finite simplicial complex ∆, let f (∆) (or simply f if context is clear) i i denote the number of i-dimensional faces of ∆. A useful fact when proving Theo- rems 2.3 and 2.4 is that β satisfies the following “Morse” inequalities: k (1) −f +f −f ≤β ≤f , k−1 k k+1 k k for all k. These inequalities follow from the definition of simplicial homology and the rank-nullity law [8]. The next observation to make is that X(n,p) is a clique complex, so f counts k the number of (k+1)-cliques. Since there are (cid:0) n (cid:1) possible (k+1)-cliques and k+1 LIMIT THEOREMS FOR BETTI NUMBERS 5 (k+1) each appears with probability p 2 , E[f ] 1 lim k = . n→∞nk+1p(k+21) (k+1)! If p=ω(n−1/k) then E[fk−1] = nkp(k2) = 1 =o(1), E[fk] nk+1p(k+21) npk and the same argument shows that if p=o(n−1/(k+1)) then E[f ] k+1 =o(1). E[f ] k That is, in the regime of Theorems 2.3 and 2.4, E[f ] lim k =1, n→∞E[−fk−1+fk−fk+1] which, in light of (1), reproves Theorem 2.3. Let f˜ := −f +f −f . The following claim together with (1) is used to k k−1 k k+1 show that β satisfies a central limit theorem. k Claim 2.5. (i) Var(f ) lim k =1. n→∞Var(f˜) k (ii) f −E[f ] k k ⇒N(0,1) asn→∞. (cid:112) Var(f ) k (iii) f˜ −E[f˜] k k ⇒N(0,1) asn→∞. (cid:113) Var(f˜) k For t∈R, it follows from (1) that (cid:34) (cid:35) (cid:34) (cid:35) (cid:34) (cid:35) f −E[f ] β −E[f ] f˜ −E[f ] P k k ≤t ≤P k k ≤t ≤P k k ≤t . (cid:112) (cid:112) (cid:112) Var(f ) Var(f ) Var(f ) k k k The left-hand side tends to Φ(t) as n → ∞ by part (ii) of the claim. For the right-hand side, let (cid:15)>0 and observe that (2) P(cid:34)f(cid:112)˜k−E[fk] ≤t(cid:35)≤Pf(cid:113)˜k−E[f˜k] ≤t−(cid:15)+P(cid:12)(cid:12)(cid:12)(cid:12)f(cid:113)˜k−E[f˜k] − f(cid:112)˜k−E[fk](cid:12)(cid:12)(cid:12)(cid:12)>(cid:15) Var(fk) Var(f˜k) (cid:12)(cid:12) Var(f˜k) Var(fk)(cid:12)(cid:12)  (cid:12) (cid:12)  +Pf(cid:112)˜k−E[fk] ≤t,(cid:12)(cid:12)(cid:12)f(cid:113)˜k−E[f˜k] −t(cid:12)(cid:12)(cid:12)≤(cid:15). Var(fk) (cid:12)(cid:12) Var(f˜k) (cid:12)(cid:12) 6 MATTHEWKAHLEANDELIZABETHMECKES Now, it follows from part (iii) of the claim that the first term of the right-hand side of (2) tends to Φ(t−(cid:15)) and that the last is asymptotically bounded above by Φ(t+(cid:15))−Φ(t−(cid:15)). For the second term, first require n to be large enough that (cid:12) (cid:12) (cid:12)(cid:12)(cid:12)(cid:112)E[fk] − (cid:113)E[f˜k] (cid:12)(cid:12)(cid:12)< (cid:15). (cid:12)(cid:12) Var(fk) Var(f˜k)(cid:12)(cid:12) 2 This condition together with Chebychev’s inequality implies that (cid:12) (cid:12)   (cid:12) (cid:12)  P(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)f(cid:113)˜kV−arE([ff˜˜kk)] − f(cid:112)˜kV−arE([ffkk)](cid:12)(cid:12)(cid:12)(cid:12)(cid:12)>(cid:15)≤Pf˜k(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:112)Va1r(fk) − (cid:113)Va1r(f˜k)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)> 2(cid:15) (cid:113) 2 Var(f˜) k ≤4(cid:15)−2(cid:112) −1 , Var(f ) k which tends to zero for fixed (cid:15)>0 by part (i) of the claim. It thus follows that the right-hand side of (2) is asymptotically bounded above by Φ(t+(cid:15)) as n → ∞; as (cid:15) is arbitrary, this completes the proof of the central limit theorem for β , modulo k proof of the claim. To prove part (i) of the claim, first write (cid:88) f = ξ , k A A⊆{1,...,n} |A|=k+1 where ξ is the indicator that A spans a face in X(n,p); that is, that A spans A a complete graph in G(n,p). Then, enumerating pairs of subsets of size k+1 of {1,...,n} by the size r of their interesection, (cid:88) (cid:20)(cid:18) n (cid:19) (k+1)(cid:21)2 Var(fk)= E[ξAξB]− k+1 p 2 A,B =(cid:18) n (cid:19)k(cid:88)+1(cid:18)k+1(cid:19)(cid:18)n−k−1(cid:19)p2(k+21)−(r2)−(cid:20)(cid:18) n (cid:19)p(k+21)(cid:21)2. k+1 r k+1−r k+1 r=0 Now,itisnothardtoseethatintherangeofpconsideredhere,onlyther =0,1,2 terms contribute in the limit; there is cancellation of the terms of order nk+1 and nk, so that the main contribution is in fact from the r =2 term and (3) lim n−2kp(−2(k+21)+1)Var(fk)=ck, n→∞ for some constant c depending only on k. From this it follows immediately that Var(f ) Var(f ) k−1 =o(1) and k+1 =o(1), Var(f ) Var(f ) k k for p in the range specified in the statement of the theorem. Expanding the same way as above, it is clear that (cid:18) (cid:19) (cid:34)k+1(cid:18) (cid:19)(cid:18) (cid:19) (cid:18) (cid:19)(cid:35) Cov(fk,fk+1)= k+n 1 p(k+21)+(k+22) (cid:88) k+r 1 nk−+k2−−r1 p−(r2)− k+n 2 ; r=0 LIMIT THEOREMS FOR BETTI NUMBERS 7 again there is cancellation of the terms of order nk+2 and nk+1 so that the leading contribution is from the r =2 term and lim n−2k−1p−((k+21)+(k+22)−1)Cov(fk,fk+1)=ck n→∞ for a (different) constant c depending only on k. Thus in the range of p being k considered, Cov(f ,f ) k k+1 =o(1). Var(f ) k In exactly the same way, one can show that Cov(f ,f ) Cov(f ,f ) k k−1 =o(1) and k−1 k+1 =o(1), Var(f ) Var(f ) k k completing the proof of part (i) of the claim. Theproofsofthesecondandthirdpartsbothfollowfromanabstractnormalap- proximation theorem for dissociated random variables proved (via Stein’s method) in [3]. Part (ii) is in fact proved there; the following is a a straightforward mod- ification of their proof which obtains a central limit theorem for the lower bound f˜. One can also recover the proof of part (ii) from what is given below, simply by k ignoring the extra terms present in f˜ beyond those coming from f . k k A set {X :j=(j ,...,j )∈J} for J a set of r-tuples is dissociated if two sub- j 1 r collections of the random variables {X :j∈K} and {X :j∈L} are independent j j (cid:80) whenever (∪ {j ,...,j })∩(∪ {j ,...,j }) = ∅. Let W := X , and for j∈K 1 r j∈L 1 r j∈J j each j ∈ J, let L := {k ∈ J : {k ,...,k }∩{j ,...,j } =(cid:54) ∅}. That is, L is a j 1 r 1 r j dependency neighborhood for j. If EX = 0 and EW2 = 1, then it is shown in [3] j that (cid:88) (cid:88) (cid:104) (cid:105) (4) d (W,Z)≤K E|X X X |+E|X X |E|X | , 1 j k l j k l j∈Jk,l∈Lj where Z is a standard normal random variable. To show that f˜ satisfies a central limit theorem, let the index set J be the k potential edge sets for complete graphs on k+e (e ∈ {0,1,2}) vertices in G(n,p); that is, an element of J is a (cid:0)k+e(cid:1)-tuple of edges spanning a given set of k +e 2 vertices. Each j∈J can thus be associated with its spanning set A of vertices. If j the random variables X are defined by j X :=σ−1(ξ −E[ξ ]), j Aj Aj where σ2 =Var(f ), then {X } are evidently dissociated. k j The second half of the sum from (4) is fairly straightforward to bound in this context. For each j, partition L into the sets Le of indices whose spanning sets j j have size k+e. Observe that for each j, if e =|L |−k, then j j (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) (cid:19) n n−k−e n−k−e |Le|= − j −(k+e ) j =O(nk+e−2). j k+e k+e j k+e−1 8 MATTHEWKAHLEANDELIZABETHMECKES Decomposing as in the variance estimate by the size r of the intersection of A and j A and using the bound above for |Lf| yields k j (cid:88) (cid:88) (cid:88) E|X X |E|X | j k l j∈Jk∈Lej l∈Lfj ≤σ−3cknk+f−2p(k+2f)(cid:18)k+ne (cid:19)k+(cid:88)(ej∧e)(cid:18)k+rej(cid:19)(cid:18)nk−+ke−−erj(cid:19)p(k+2e)+(k+2ej)−(r2) j r=2 ≤σ−3ckn3k+ej+e+f−4p(k+2ej)+(k+2e)+(k+2f)−1, since the r =2 term yields the top-order contribution in the range of p considered here. Moreover, it is easy to check that this expression is maximized for e = e = j f = 1. Combining this estimate with (3) shows that the contribution to the error from the second sum is bounded above by √ c p σ−3ckn3k−1p3(k+21)−1 ≤ kn , which tends to zero as n tends to infinity. The first half of the sum is bounded similarly, although it requires that the intersections of three spanning sets of vertices be considered. Let r denote the numberofpointscommontoA andA . Letp :=|A ∩A ∩Ac|,p :=|A ∩A ∩A | j k 1 j l k 2 j l k and p :=|Ac∩A ∩A |. Then 3 j l k E|XjXkXl|≤cσ−3p(k+2ej)+(k+2ek)+(k+2el)−(p1+2p2)−(p2+2p3)−(r2)+(p22), where the constant c simply accounts for the fact that the X have been centered. j The number of ways to choose j, k and l is (cid:18) (cid:19)(cid:18) (cid:19)(cid:18) (cid:19)(cid:18) (cid:19) n k+e n−k−e k+e −r j j j k+e r k+e −r p j k 1 (cid:18) (cid:19)(cid:18) (cid:19)(cid:18) (cid:19) r k+e −r n−2k−e −e +r × k j k . p p k+e −p −p −p 2 3 l 1 2 3 Combiningthesetwofacts,itisperhapsslightlyunpleasantbutnottoohardtosee that the main contribution to the error arises from the case that r =2, p +p =2 1 2 (in fact only when p (cid:54)=0), and e =e =e =1. It follows that 1 j k l (cid:88) (cid:88) E|XjXkXl|≤σ−3ckn3k−1p3(k+21)−2 ≤ nc√kp, j∈Jk,l∈Lj which also tends to zero as n tends to infinity. This completes the proof of part (iii) of the claim, finishing the proof of Theorem 2.4. (cid:3) 3. Random Cˇech complexes The second model of random simplicial complex considered is the random Cˇech complex. This is a higher-dimensional analog of a geometric random graph, con- structed explicitly below. In order to analyze this model, we use the same tech- niques used by Penrose [15] in his study of subgraph counts of random geometric graph. The additional spacial dependence that is inherent in the random variables we consider presents an additional technical challenge, and means that Penrose’s results cannot be applied directly to the problem. LIMIT THEOREMS FOR BETTI NUMBERS 9 Supposethat{X }∞ isani.i.d.sequenceofrandomvectorsinRd,withbounded i i=1 densityf. Let{r }∞ ⊆R ,suchthatnrd −n−→−−∞→0(theso-called“sparse”regime n n=1 + n ofgeometricrandomgraphs),andconstructarandomCˇechcomplexC(X ,...,X ) 1 n on {X }n asfollows. If|X −X |≤2r , putan edgebetween X and X ; thatis, i i=1 i j n i j the 1-skeleton of the complex is a random geometric graph. More generally, make theconvexhullof{X ...,X }afaceofthecomplexiftheballsofradiusr about i1 ik n the points {X ...,X } have non-trivial intersection. i1 ik Definition 3.1. Thepoints{x ,...,x }⊆Rd formanempty (k−1)-simplexwith 1 k (cid:92) respect to r if for each j ∈{1,...,k}, the intersection B (x ) is non-empty, o r j 1≤j≤k j(cid:54)=jo (cid:92) but the intersection B (x )=∅. r j 1≤j≤k Let h (x ,...,x ) be the indicator that {x ,...,x } form an empty (k −1)- r 1 k 1 k simplex with respect to r, and for a multiindex i=(i ,...,i ) with 1≤i <···< 1 k 1 i ≤n, let ξ =h (X ,...,X ). Let k i rn i1 ik (cid:88) S := ξ ; n,k i i=(i1,...,ik) 1≤i1<···<ik≤n that is, S is the number of empty (k−1)-simplices in C(X ,...,X ). Another n,k 1 n objectofequalimportanceinwhatfollowsisS(cid:101)n,k, thenumberofisolatedemptyk- simples. Thatis,ifζ istheindicatorthat{X ,...,X }formanempty(k− (i1,...,ik) i1 ik 1)-simplex with respect to r and that there are no edges between {X } n j j∈{i1,...,ik} and {X } , then j j∈/{i1,...,ik} (cid:88) S(cid:101)n,k = ζi. i=(i1,...,ik) 1≤i1<···<ik≤n The random variables Sn,k and S(cid:101)n,k are related to βk−1 as follows. Firstly, β is bounded below by the number of isolated empty k-simplices; that is, k−1 βk−1(C(X1,...,Xn)) ≥ S(cid:101)n,k. Furthermore, any contribution to βk−1 not coming from an isolated empty (k−1)-simplex comes from a component in C(X ,...,X ) 1 n on at least k+1 vertices. In order for such a component to contribute to β , k−1 (k−2)-dimensional faces. Such faces are necessarily triangulated (by the construc- tion of C(X ,...,X )), and so any further contribution to β contains at least 1 n k−1 onesimplexonk−1vertices,witheitheranextraedgeattachedtoeachoftwodif- ferent vertices (terminating in different places), or else an extra path of length two attached to one vertex. Let Y denote the number of simplices in C(X ,...,X ) n,k 1 n on k−1 vertices with two extra edges attached, counted once for each simplex on k−1 vertices which occurs and for each distinct pair of simplex vertices with an extra edge. Similarly, let Z denote the number of simplices in C(X ,...,X ) on n,k 1 n k−1 vertices with at least one extra path of length 2 attached, counted once for each simplex which occurs and for each vertex with a path of length two attached. The argument above shows that (5) S(cid:101)n,k ≤βk−2(C(X1,...,Xn))≤Sn,k+Yn,k+Zn,k, where the trivial bound S(cid:101)n,k ≤Sn,k has also been used. 10 MATTHEWKAHLEANDELIZABETHMECKES Thelimitingdistributionofβ willfollowasintheprevioussectionbyproving k−1 thesamelimittheoremsfortheupperandlowerboundsof (5). Thetheoremisthe following. Theorem 3.2. (i) If nkrd(k−1) →0 as n→∞, then n β (C(X ,...,X ))→0 a.a.s. as n→∞. k 1 n (ii) If nkrd(k−1) →α∈(0,∞) as n→∞, then n d (β (C(X ,...,X )),Y)≤cnrd,, TV k 1 n n whereY isaPoissonrandomvariablewithE[Y]=E[β ]andcisaconstant k depending only on d, k, and f. (iii) If nkrd(k−1) →∞ as n→∞ and nrd →0 as n→∞, then n n β(C(X ,...,X ))−E[β(C(X ,...,X ))] 1 n 1 n ⇒N(0,1). (cid:112)Var(β(C(X ,...,X ))) 1 n The first step in proving Theorem 3.2 is to determine the order in n and r of n E[S(cid:101)n,k] and E[Sn,k +Yn,k +Zn,k]. In fact, slightly more is needed. Let A be an open subset of Rd such that vol(∂A) = 0. Let X be a finite subset of Rd, and call x ∈ X the “left-most” point of X (denoted LMP(X)) if x is the first element of X when X is ordered lexicographically. Now, define S to be the number of n,k,A empty (k−1)-simplices formed from X ,...,X , such that the left-most point of 1 n the k-simplex is in A. Define S(cid:101)n,k,A in the analogous way. Lemma 3.3. For k >1, let (cid:18)(cid:90) (cid:19)(cid:90) µ := f(x)kdx h (0,y ,...,y )d(y ,...,y ). A 1 2 k 2 k A (Rd)k−1 Then µ nl→im∞n−krn−d(k−1)E[Sn,k,A]=nl→im∞n−krn−d(k−1)E[S(cid:101)n,k,A]= kA! . Observe that µ depends only on f and A and can be trivially bounded by A (cid:107)f(cid:107)k−1(2dθ )k−1, where θ is the volume of the unit ball in Rd. ∞ d d Lemma 3.4. Let (cid:18)(cid:90) (cid:19)(cid:90) µ(cid:48) := f(x)k+1dx g1,2(0,y ,...,y )dy ···dy , 1 1 k 1 k Rd (Rd)k where g1,2(x ,...,x ) is the indicator that {x ,...,x } form a simplex (where 1 0 k 0 k−2 a complex is built as described on x ,...,x with threshhold radius 1) and that 0 k {x ,x } and {x ,x } are edges. Let 0 k−1 1 k (cid:18)(cid:90) (cid:19)(cid:90) µ(cid:48)(cid:48) := f(x)k+1dx k1(0,y ,...,y )dy ···dy . 1 1 k 1 k Rd (Rd)k Let k1(x ,...,x ) be the indicator that {x ,...,x } form a simplex and that 1 0 k 0 k−2 {x ,x } and {x ,x } are edges. Then 0 k−1 k−1 k µ(cid:48) lim n−(k+1)r−dkE[Y ]= , n→∞ n n,k 2(k−3)!

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.