The area of a self-similar fragmentation Jean Bertoin 1 ∗ 1 0 2 n a Abstract J 0 2 We consider the area A = ∞( ∞ X (t))dt of a self-similar fragmentation process 0 i=1 i X = (X(t),t 0) with negative index. We characterize the law of A by an integro- R] ≥ R P differential equation. The latter may be viewed as the infinitesimal version of a recursive P . distribution equation that arises naturally in this setting. In the case of binary splitting, h t thisyieldsarecursiveformulafortheentiremomentsofAwhichgeneralizes knownresults a m for the area of the Brownian excursion. [ 1 Key words: Self-similar fragmentation, area, recursive distributional equation. v 5 6 9 1 Introduction 3 . 1 0 1 1 The distribution of the area AExc = 0 esds of a standard Brownian excursion (es,0 ≤ s ≤ 1) 1 appears in a variety of settings, including random graphs [15, 16], random trees and branching : R v processes [17, 19], order statistics [18], hashing with linear probing [8], ..., not to mention of i X course the study of Brownian motion for its own interest [12, 13] . The entire moments E(Ak ) r Exc a have a special importance, as they are related, for instance, to asymptotics as n for the → ∞ number of connected graphs with n labelled vertices and n+k 1 edges, see [15] and the survey − [11]. We refer to Perman and Wellner [14], Janson [11] and references therein for a detailed presentation and reviews of known results on this topic. The starting point of this work lies in the observation that one can express the area in the form ∞ A = θ(t) dt Exc | | Z0 where θ(t) denotes the Lebesgue measure of the random open set θ(t) = s [0,1] : e > t . s | | { ∈ } The point is that the process θ = (θ(t),t 0) is a self-similar interval-fragmentation in the ≥ ∗ Laboratoire de Probabilit´es et Mod`eles Al´eatoires,UPMC, 4 Place Jussieu, 75252 Paris Cedex 05; France. Email: [email protected] 1 sense of [3, 5]. One can derive an integro-differential equation for the distribution of A from Exc the branching and self-similarity properties of θ. In particular this yields recursive formulas for the entire moments of A that have been obtained in the literature by analytic techniques Exc based on the Feynman-Kac formula or the analysis of continued fractions and of singularities of the generating functions of discrete approximations. The same approach applies more generally to self-similar fragmentation processes, a class of Feller processes with values in the space of mass-partitions ∞ = x = (x ,x ,...) : x x ... 0 and x 1 . m 1 2 1 2 i P { ≥ ≥ ≥ ≤ } i=1 X Specifically, aself-similar fragmentationX = (X(t),t 0)fulfillsthefollowingtwo fundamental ≥ properties. For x [0,1], let us denote by P the law of the version of X started from a x ∈ single fragment of mass x, i.e. X(0) = (x,0,...). First, the self-similarity means that there exists an index α R such that for every x (0,1] the distribution of the rescaled process ∈ ∈ (xX(xαt),t 0) under P is P . Second, the process X satisfies the branching property, 1 x ≥ in the sense that for every mass-partition x = (x ,x ,...), if X(1),X(2),... are independent 1 2 fragmentations with respective laws P ,P ,..., then the process resulting from the decreasing x1 x2 rearrangement of all the fragments of X(1)(t),X(2)(t),... is a version of X(t) started from X(0) = x. We assume that the index of self-similarity α is negative, which implies that small fragments split faster than the large ones. A well-known consequence is that the process of the total mass t ∞ X (t) decreases and reaches 0 in finite time a.s.; in other words the entire 7→ i=1 i mass is eventually ground down to dust. This has been observed first by Filippov [7], see also P Proposition 2(i) in [4]. We may thus define the area ∞ ∞ A = X (t) dt i Z0 i=1 ! X which is the main object of interest in this work. The denomination area is better understood if we remember that a fragmentation admits an interval representation; cf. Section 3.2 in [3]. There exists a nested right-continuous family (θ(t),t 0) of open subsets of the unit interval ≥ such that for every t 0, the sequence of the lengths of the interval components of θ(t) listed ≥ in the decreasing order is precisely X(t) = (X (t),...). If we define a lower semi-continuous 1 path F : [0,1] R by + → F(u) = sup t 0 : u θ(t) , u [0,1], { ≥ ∈ } ∈ 2 then θ(t) = u : F(u) > t , and since ∞ X (t) = θ(t) , we can express A in the form { } i=1 i | | 1 A = F(u)du. Of course F = e is the Brownian excursion in the situation discussed at the 0 P beginning of this introduction. R The area A has another natural interpretation in terms of continuous random trees. Indeed, Haas and Miermont [9] obtained a representation of self-similar fragmentations in terms of some rooted continuous tree T which enjoys a self-similarity and branching properties. More precisely, for every t 0, X(t) = (X (t),...) can be viewed as the ranked sequence of the 1 ≥ masses of connected components of T(t), the subset of the points in T at distance at least t from the root. In this setting, ∞ A = T(t) dt, | | Z0 where T(t) denotes the mass of T(t). Hence A represents the average height in T, i.e. the | | average distance of points to the root, where averaging is taken with respect to the mass measure of T. We also refer to [19] for results on this quantity in the framework of certain discrete random trees. In the next section, we will present our main result which determines the law of A as the unique solution to an intro-differential equation expressed in terms of the characteristics of X. In the case of binary dislocations, this enables us to derive explicit recursive formulas for the moments of A. We recover in particular identities due to Tak´acs [16] for the moments of the area of the Brownian excursion. Section 3 is devoted to the proof of this integro-differential equation. Weshall startby establishing apriori boundsforthemoments ofA. Then we proceed with the simpler case when the dislocation measure is finite, and derive the equation from a recursive distribution equation which is naturally induced by the dynamics of fragmentation. The general case when the dislocation measure is infinite is then deduced by approximation. This relies on a weak limit theorem for the area of fragmentation processes. The proof of the latter is somewhat technical and will be postponed to the final subsection. 2 Main results We denote by ν the dislocation measure of X, so ν is a sigma-finite measure on with no m P atom at the trivial mass-partition (1,0,...) and fulfills the integral condition (1 x )ν(dx) < . (1) 1 − ∞ ZPm We implicitely exclude the degenerate case when ν 0 and further assume absence of erosion. ≡ Roughly speaking, this means that X is a purely discontinuous process that only evolves by 3 sudden dislocations whose rates are determined by ν and the index of self-similarity α. We refer to Chapter 3 of [5] or [3] for background. For an arbitrary mass partition x, we denote by ηx(da) = Px(A da), a 0, ∈ ≥ the law of the area under the probability measure Px for which the fragmentation X starts from X(0) = x. For the sake of simplicity, we will work from now on under the law P = P , 1 i.e. when the fragmentation starts from a single fragment of mass 1, and write then η = η . 1 This induces no loss of generality since, combining self-similarity and the branching property, we get that for every mass-partition x = (x ,x ,...), there is the identity 1 2 ∞ ηx(da) = P x1i−αAi ∈ da (2) ! i=1 X where (Ai)i∈N is a family of i.i.d. copies of A. Note that when x has only finitely many non-zero terms, say x1,...,xn, then ηx can be expressed as a convolution product ηx = ηx1 ∗ ···∗ ηxn where η stands for the image of η by the dilation a y1−αa. Finally, we let µ,f = fdµ y 7→ h i denote the integral of some function f with respect to a measure µ when the integral makes R sense. We are now able to state our main result Theorem 1 Let X be a self-similar fragmentation with index α < 0, dislocation measure ν and without erosion. Then for every 1 function f : R R such that f′(y) = O(yp) as y + C → → ∞ for some p > 0, the law η of A solves η,f′ = ν(dx)( η,f ηx,f ) , (3) h i h i−h i ZPm where the quantities above are finite. Further (3) characterizes η. If we introduce the concave increasing function Φ : R R by + + → ∞ Φ(q) = 1 x1+q ν(dx), (4) − j ZPm j=1 ! X then we immediately see from Theorem 1 that the first moment of A is given simply by E(A) = η,Id = 1/Φ( α). h i − This identity can also be established directly; see the forthcoming Lemma 1. 4 More generally, we shall now derive from Theorem 1 a recursive formula for the entire momentsofA. Forthesakeofsimplicity, weshallfocusonthespecialcaseofbinarydislocations, although more general situations could be dealt with at the price of heavier notation. This means that we assume that the dislocation measure ν has support in the subset of binary mass-partitions x = (x,1 x,0,...),x [1/2,1) . By a slight abuse, we shall then identify { − ∈ } the dislocation measure ν with its image by the map x x , i.e. we view ν as a measure 1 → on [1/2,1). Specializing Theorem 1 to f(x) = xk and applying the binomial formula, we immediately obtain: Corollary 1 Let X be a self-similar fragmentation with index α < 0, binary dislocation mea- sure ν and without erosion. For every integer k 0, let M = E(Ak) denote the k-th moment k ≥ of the area. Then there is the identity k−1 a M = kM + a M M , k 1, k k k−1 j,k j k−j ≥ j=1 X with a = (1 xk(1−α) (1 x)k(1−α))ν(dx) = Φ(k(1 α) 1), k − − − − − Z[1/2,1) and k a = xj(1−α)(1 x)(k−j)(1−α)ν(dx). j,k j!Z[1/2,1) − We stress that (1) ensures the finiteness of a and a . k j,k We now discuss some examples, starting with the case of the Brownian fragmentation A = A which has motivated this work. The Brownian fragmentation has self-similarity index Exc α = 1/2, no erosion, and its dislocation measure is binary and specified by − 2 ν(dx) = dx, 1/2 x < 1; 2πx3(1 x)3 ≤ − p see [3] on its pages 339-340. One gets by symmetry 1 1 x3k/2 (1 x)3k/2 Γ((3k 1)/2) a = − − − dx = 23/2 − k 2πx3(1 x)3 Γ(3k/2 1) Z0 − − p 5 and k 1 x3j/2(1 x)3(k−j)/2 a = − dx j,k j!Z0 2πx3(1−x)3 k!Γ((3j 1)/2)Γ((3(k j) 1)/2) p = − − − √2πj!(k j)!Γ(3k/2 1) − − Following Tak´acs [16], if we set 4√π2−k/2k! M = K , k k Γ((3k 1)/2) − then after some cancellations, Corollary 1 reduces to k−1 K = (3k/4 1)K + K K k k−1 j k−j − j=1 X with K = 1/2. This is the recursive equation found by Tak´acs, which in turn yields a Riccati 0 − type ODE by considering the exponential generating function of the K ; see Flajolet et al. [8]. k We mention the existence of other recursive formulas for the moments of A , see in particular Exc [13] and the discussion in [11]. Similar calculations apply when more generally the dislocation measure is of beta-type, i.e. is binary with ν(dx) = cxβ(1 x)βdx, 1/2 < x < 1 − for some 2 < β < 1. These beta-splitting measures have appeared in works of Aldous [1] − − on cladograms; see also Section 5.1 in [10]. One obtains 2 1 a = 1 xk(1−α) (1 x)k(1−α) xβ(1 x)βdx k c − − − − Z0 = B(β(cid:0)+1,β +1) 2B(β +1+k(cid:1)(1 α),β +1) − − 2(2β +3) 2β +2+k(1 α) = B(β +2,β +2) 2 − B(β +1+k(1 α),β +2) β +1 − β +1 − where B(a,b) = Γ(a)Γ(b)/Γ(a+b) is the beta function, and 2 k 1 a = xβ+j(1−α)(1 x)β+(k−j)(1−α)dx j,k c j!Z0 − k = B(β +j(1 α)+1,β +(k j)(1 α)+1). j! − − − 6 Finally, note that we can also deal with linear combinations of the beta dislocation measures, which covers for instance the case of Ford’s alpha model; see Section 5.2 in [10]. 3 Proof of Theorem 1 This section is devoted to the proof of Theorem 1; it relies in four main steps. In the first sub- section, we establish a priori bounds for the moments of the area, relying on known properties of the so-called tagged fragment. In the second sub-section, we prove the equation (3) in the special case when the dislocation measure is finite. In the third sub-section, we provide the proof of Theorem 1 by approximation, taking for granted a weak convergence result for the area that will be established in the final sub-section. 3.1 Bounds for the moments of the area Thepurpose ofthissubsection istoestablish some aprioribounds onthemoments M = E(Ak) k of the area. Recall the notation (4). Lemma 1 We have M = 1/Φ( α) 1 − and for k 1 ≥ k! M k . k ≤ Φ( α) Φ( kα) − ··· − As a consequence E(exp(cA)) < whenever c < Φ( ), and in particular the law η of A is ∞ ∞ determined by its entire moments. Proof: It is convenient to work in the setting of interval-fragmentation, i.e. when the frag- mentation X describes the ranked sequence of the lengths of the interval components of nested open subsets (θ(t),t 0). Recall from the introduction that θ(t) can be expressed in the form ≥ θ(t) = u [0,1] : F(u) > t and note that for every integer k 1, there is the identity { ∈ } ≥ 1 1 Ak = du ... du F(u ) F(u ). 1 k 1 k ··· Z0 Z0 In other words, we have E(Ak) = E(F(U ) F(U )) 1 k ··· where U ,...,U are i.i.d. uniform variables on [0,1]. This yields 1 k M = E(F(U)) and M kE(F(U)k) 1 k ≤ 7 where U has the uniform distribution on [0,1]. The variable F(U) should be viewed as the lifetime of the tagged-fragment, i.e. it is the first instant t when the size χ(t) of the interval component of θ(t) that contains the randomly tagged point U reaches the absorbing state 0. This variable has the distribution of an exponential functional, ∞ F(U) (l=aw) I = exp(αξ )dt t Z0 where ξ = (ξ ,t 0) is a subordinator with Laplace exponent Φ; see Corollary 2 in [3]. Since t ≥ it is well-known that k! E(Ik) = Φ( α) Φ( kα) − ··· − (cf. for instance Theorem 2 in [6]), the first two claims are proved, and the last ones follow (cid:3) immediately as the function Φ increases. 3.2 The case with finite dislocation rates In this subsection, we assume that the fragmentation process has a finite dislocation measure, i.e. ν( ) (0, ). This means that under P, the process X stays in state (1,0,...) during m P ∈ ∞ an exponential time T with parameter ν( ), and then, independently of the waiting time m P T, jumps at some random mass partition X(T) whose distribution is given by the normalized dislocation measure ν/ν( ). We stress that the jump times of X may nonetheless accumulate m P right after T; in particular then X is not a continuous-time Markov chain. Indeed, the first dislocation may produce fragments of arbitrarily small sizes, which then split again almost instantaneously by self-similarity. The proof of the following weaker version of Theorem 1 in this setting is straightforward. Proposition 1 Assume that X has no erosion and finite dislocation measure ν. Then for every 1 function f : R R such that the derivative f′ has a finite limit at , we have + C → ∞ η,f′ = ν(dx)( η,f ηx,f ) . h i h i−h i ZPm Proof: An application of the strong Markov property at the first dislocation time T and (2) yields the recursive distributional equation (see the survey [2] for much more this topic) ∞ A = T + X (T)1−αA i i i=1 X where (Ai)i∈N is a sequence of i.i.d. copies of A which is further independent of X(T). This 8 entails ∞ 1 ℓ(q) = ν(dx) ℓ(x1−αq), q 0 ν( )+q i ≥ Pm ZPm i=1 Y where ℓ(q) = E(exp( qA)) is Laplace transform of the area. By rearrangement, we arrive at − ∞ qℓ(q) = ν(dx) ℓ(q) ℓ(x1−αq) , − − i ZPm i=1 ! Y which is the equation in the statement specified for f(a) = e−qa. This establishes our claim by a standard application of the Stone-Weierstrass theorem (recall that that A has finite moments). (cid:3) Let us briefly discuss the elementary example when ν = δ . We thus start with a (1/2,1/2,0,...) single fragment of unit size which splits in two fragments each of size 1/2 after an exponential time with parameter 1, and so on. It should be plain that the area can then be expressed in the form A = e +2α−1(e +e )+22(α−1)(e +e +e +e )+ , 0,1 1,1 1,2 2,1 2,2 2,3 2,4 ··· where the e for j = 1,...,2i and i = 0,1,... are i.i.d. standard exponential variables. The i,j Laplace transform of A is thus given by ∞ 2n 1 ℓ(q) = , 1+2n(α−1)q n=0(cid:18) (cid:19) Y and the equation qℓ(q) = ℓ(q) ℓ(2α−1q)2 − − provided by Proposition 1 can be checked directly. 3.3 Proof of Theorem 1 by approximation In this subsection, we shall derive Theorem 1 by approximation from the case when the dislo- cation measure is finite, taking for granted the weak convergence of the corresponding areas. Specifically, we introduce the finite measures ν(n)(dx) = 1 ν(dx), x , {1−x1>1/n} ∈ Pm where n isa sufficiently largeinteger so that ν(n) 0. Wewrite A(n) for thearea of a self-similar 6≡ fragmentation process with index α, dislocation measure ν(n) and without erosion, and denote 9 by η(n) the distribution of A(n). Recall (2) and set for a generic mass-partition x ∞ η(n)(da) = P x1−αA(n) da x i i ∈ ! i=1 X where (A(n) : i N) is a sequence of i.i.d. copies of A(n). The following crucial lemma will be i ∈ established in the next sub-section. Lemma 2 The sequence (η(n),n N) converges weakly to η as n . ∈ → ∞ The next step to the proof of Theorem 1 is the following technical result. Lemma 3 (i) Let f : R R be continuous and bounded. Then for every x , + m → ∈ P lim ηx(n),f = ηx,f . n→∞h i h i (ii) Let f : R R be a 1 function with f′(y) = O(yp) as y for some p > 0, and + → C → ∞ set f′ = sup f′(x) /(1 + x)p : x 0 . There is a constant c depending only on p and the k k {| | ≥ } characteristics of the fragmentation such that for every mass-partition x and every n η(n),f η(n),f c f′ (1 x ). x 1 |h i−h i| ≤ k k − Proof: (i) Denote by κ the cumulant of A, i.e. E(exp( qA)) = exp( κ(q)) for q 0, and by − − ≥ κ(n) that of A(n). If (A(n),i N) is a sequence of i.i.d. copies of A(n), then we have i ∈ ∞ ∞ E exp q x1−αA(n) = exp κ(n)(x1−αq) . − i i − i !! ! i=1 i=1 X X We know from Lemma 2 that lim κ(n)(x1−αq) = κ(x1−αq) for every q 0 and i N. n→∞ i i ≥ ∈ Further, κ(n) is a concave increasing function with κ(n)(0) = 0, and its derivative at 0 is given by E(A(n)). Recall also from Lemma 1 that E(A(n)) = 1/Φ(n)( α) where − ∞ ∞ Φ(n)(q) = 1 x1+q ν(n)(dx) = 1 1 x1+q ν(dx). − i {1−x1>1/n} − i ZPm i=1 ! ZPm i=1 ! X X Plainly the sequence Φ(n)( α) increases and there are thus the bounds − κ(n)(x1−αq) x1−αq/Φ(n)( α) cx1−αq. i ≤ i − ≤ i 10