On the exponential generating function for non-backtracking walks Arrigo, Francesca and Grindrod, Peter and Higham, Desmond J. and Noferini, Vanni 2017 MIMS EPrint: 2017.34 Manchester Institute for Mathematical Sciences School of Mathematics The University of Manchester Reports available from: http://eprints.maths.manchester.ac.uk/ And by contacting: The MIMS Secretary School of Mathematics The University of Manchester Manchester, M13 9PL, UK ISSN 1749-9097 ON THE EXPONENTIAL GENERATING FUNCTION FOR NON-BACKTRACKING WALKS FRANCESCA ARRIGO†, PETER GRINDROD‡, DESMOND J. HIGHAM§, AND VANNI NOFERINI¶ Abstract. We derive an explicit formula for the exponential generating function associated with non-backtracking walks around a graph. We study both undirected and directed graphs. Our resultsallowustoderivecomputableexpressionsfornon-backtrackingversionsofnetworkcentrality measures based on the matrix exponential. We find that eliminating backtracking walks in this contextdoesnotsignificantlyincreasethecomputationalexpense. Weshowhowthenewmeasures maybeinterpretedintermsofstandardexponentialcentralitycomputationonacertainmultilayer network. Insights from this block matrix interpretation also allow us to characterize centrality measures arising from general matrix functions. Rigorous analysis on the star graph illustrates the effect of non-backtracking and shows that unwantedlocalization effects can be eliminated when we restricttonon-backtrackingwalks. Wealsoinvestigatethelocalizationissueonsyntheticnetworks. Key words. generating function, inverse participation ratio, Katz, localization, loops, multi- layer,graphspectrum,backtrackingwalks. AMS subject classifications. 05C90,05C50,15A15,65F50 1. Introduction. 1.1. Motivation. Theconceptofawalkonagraphisverynatural—onarriving at a node, the walker may continue by traversing any edge pointing out of that node. If, at any stage, the edge along which the walker continues is the reverse of the edge on which they arrived, then the walk is said to be backtracking. Non-backtracking walks have been analyzed in a number of fields. They play a key role in the study of zeta functions on graphs [24], with applications in spectral graph theory [4, 23], numbertheory[44],discretemathematics[12,41],quantumchaos[39],randommatrix theory [40], stochastic analysis [3], applied linear algebra [42] and computer science [38, 45]. Within network science, constraining walks to be non-backtracking has been shown to offer benefits in community detection [26, 28], centrality measurement [6, 20, 29, 36], network comparison [30] and in the study of related issues concerning optimal percolation [31, 32]. A natural task across all these fields is to count the number of distinct non- backtracking walks of a given length between pairs of nodes, and to form a compact expression for the associated generating function. For network centrality, in the case of standard walk counts it has been argued that an exponential-style power series gives a useful alternative to the standard resolvent-style version [15]. For example, basedonananalogywithaphysicaloscillator,itmaybearguedthatmovingfromthe resolvent to the exponential takes us from classical to quantum physics [16]. Further, thereareeffectiveandreliabletoolsforcomputingtheactionofthematrixexponential [2, 21, 22]. This provides the initial motivation for our article, where we study the exponential generating function associated with non-backtracking walks. We also note that more general matrix functions have been proposed in this walk-counting †Department of Mathematics and Statistics, University of Strathclyde, Glasgow, UK ([email protected]) ‡MathematicalInstitute,UniversityofOxford,Oxford,UK([email protected]) §Department of Mathematics and Statistics, University of Strathclyde, Glasgow, UK ([email protected]) ¶DepartmentofMathematicalSciences,UniversityofEssex,Colchester,UK([email protected]) 1 context [17]. Hence, we then extend the analysis to cover generating functions based on arbitrary power series. 1.2. Background. WeletA=(a ) Rn n denotetheadjacencymatrixforan ij × ∈ unweightedgraphwithnnodes;thatis,a =1ifthereisanedgefromnodeitonode ij j, and a =0 otherwise. We will assume the graph to be loopless, so that a =0 for ij ii all i, and to have no multiple edges. If we further assume the graph represented by A to be undirected, i.e., if A=AT, then the graph will be said to be simple. We use D to denote the diagonal matrix whose diagonal entries are D =(A2) . So D counts ii ii ii the number of reciprocal neighbours of node i, that is, the number of nodes j such that a =a =1. If A=AT, so that all edges are reciprocated, then D reduces to ij ji ii the vector of degrees and any node i for which D =1 will be called a leaf. ii If A = AT the network is said to be directed and we will denote by S = (s ) ij Rn n the6 matrix associated with the graph obtained by removing all edges that ar∈e × not reciprocated, so that s =a a for all i,j =1,2,...,n. ij ij ji A walk of length r from node i to node j is characterized by a sequence of edges i = i i ,i i ,...,i i = j. Note that edges and nodes may be revisited 1 2 2 3 r r+1 7→ 7→ 7→ inthecourseofawalk. ItfollowsfromthedefinitionofmatrixmultiplicationthatAr has (i,j)th element that counts the number of distinct walks of length r from i to j; see, for example, [13, Theorem 2.2.1]. Given a parameter t, the ordinary generating function [46] associated with this sequence of walk counts may be represented by a resolvent function: ∞ trAr =(I tA)−1, − Xr=0 where I is the identity matrix. If t is interpreted as a formal algebraic variable, this expression holds within the algebra of formal power series. Analytically, if t is interpreted as a real (or complex) valued parameter, it is valid for t < 1/ρ(A), | | whereρ()denotesthespectralradius. Similarly,theassociatedexponentialgenerating · function [46] takes the form of a matrix exponential: ∞ trAr =exp(tA), r! Xr=0 analytically valid for all t R (or in C). ∈ Power series expansions that combine walk counts of different lengths have be- come widely used in the field of network science [17]. Here, the idea is to assign a weighttoeachnodethatsummarizesits“importance”inthenetwork. Ifweinterpret importance as the ability of a node to initiate messages along the edges in the graph, then we see that (I tA) 11, for 0 < t < 1/ρ(A), and exp(tA)1, for t > 0, are − candidates for measu−ring and comparing nodes, where 1 Rn is the vector of ones. ∈ Here, the ith component arises by summing the number of distinct walks from node i toallnodesinthegraph,withwalksoflengthr discountedbytr intheresolventcase and by tr/r! in the exponential case. The resolvent version dates back to the work of Katz [25] and the alternative based on the exponential function, called the total (node) communicability, was introduced and studied in [9]. We also note that both versionsarerelatedtoeigenvectorcentrality[10,11],whichusesthePerron-Frobenius eigenvector of A when the underlying graph is strongly connected [8]. In recent work [29], Martin et al. argued that traditional centrality measures may suffer from unwanted localization effects when the network under study has a 2 scale-free degree distribution (see, e.g., [7, 14]). In this type of network, there are a few nodes with a high number of neighbours, that are thus assigned most of the centrality,andmanynodeswithaverysmallnumberofconnections,thathenceshare theremainderofthemassofthecentralityvector. Thisunbalancedpartitioningmakes it difficult to distinguish between the importance of the nodes with few connections. Theauthorsthereforeproposedavariationofeigenvectorcentralitymotivatedbythe concept of non-backtracking walks. This idea was pursued in [6, 20] in the context of resolvent-based centrality. Our aim here is to develop and study the analogous exponential version of non-backtracking walk centrality, and also to consider general matrix functions. Formally, wesaythatawalkisbacktracking ifitcontainsatleastonepairofsuc- cessive edges of the form i j, j i; that is, having left some node i it immediately 7→ 7→ returns to that node. We say the walk is non-backtracking otherwise. We use the ab- breviation NBTW to mean non-backtracking walk. We will denote by p (A) Rn n r × ∈ the matrix whose (i,j)th entry counts the number of NBTWs of length r from node i to node j in the graph represented by the matrix A. Our first task is therefore to obtain useful expressions for the corresponding exponential generating function ∞ trpr(A) (1.1) F(t):= r! Xr=0 and centrality measure (1.2) b:=F(t)1. We do this in Sections 2 and 3 for the case of undirected and directed graphs, re- spectively. Although, of course, an undirected graph is a special case of a directed graph, we treat the two cases separately for two reasons. First, undirected graphs are common in network science, and hence it is useful to have results tailored to this case. Second, in our context it seems more straightforward to study undirected graphs directly than to simplify the more general expression for directed graphs. In Section 4, we briefly describe how to compute the proposed centrality measures and discuss computational costs. In Section 5, we interpret the new expressions from the viewpoint of traversals around multilayer networks with possibly negative interlayer weights. This block matrix framework also allows us to derive expressions for the case where the exponential is replaced by a general matrix function. In Section 6 we give a concrete example of the benefit of non-backtracking, and further illustrations on realistic model networks appear in Section 7. We end with a short summary in Section 8. The key analytical steps used here are to derive and solve an ODE—this is a widely used approach in the study of exponential generating functions [46]. To this end, we note from (1.1) that ∞ trpr+1(A) ∞ trpr+2(A) (1.3) F (t)= and F (t)= . ′ ′′ r! r! Xr=0 Xr=0 We also require a lemma on matrix-valued ODEs. The following result arises from columnwise application of a standard result on vector-valued linear systems; see, for example, [21, Section 2.1]. The statement of the result makes use of the shifted exponentialfunctionψ (z)=(ez 1)/z,whichmaybedefinedformallyviathepower 1 − 3 series expansion ∞ zr (1.4) ψ (z)= . 1 (r+1)! Xr=0 Lemma 1.1. Let C Rm m and P,X Rm ℓ. Then × 0 × ∈ ∈ (1.5) X(t)=tψ (tC)(P +CX )+X 1 0 0 is the unique solution to the linear, constant coefficient, inhomogeneous, autonomous, initial value ODE system X (t)=CX(t)+P, X(0)=X . ′ 0 Moreover, in the homogeneous case P =0, the expression (1.5) simplifies to (1.6) X(t)=exp(tC)X . 0 Remark 1.2. We note that Lemma 1.1 does not require the Jacobian matrix C to be nonsingular. 2. Exponential generating function for undirected graphs. We assume throughoutthissectionthatthegraphisundirected. Wealsoassumeforconvenience that the graph is connected—in the disconnected case, each component may be stud- ied separately. We note that each node is therefore assumed to have at least one neighbour. As we mentioned in Section 1.2, the matrix Ar counts walks of length r. Hence thetrivialrelationshipAr+1 =AAr givesarecurrencebetweenwalkcountsforlength r and r +1. In the case of NBTWs on undirected graphs, the following two term recurrence relates walk counts of successive lengths. We have p (A) = A, p (A) = 1 2 A2 D, and − (2.1) p (A)=Ap (A) (D I)p (A), for r 1. r+2 r+1 r − − ≥ Thisrecursionwasprovedin[12]forgeneraldirectedgraphs. Fortheundirectedcase considered here, the recurrence was re-derived independently in [41]. We now give the main statement of this section: an explicit formula for the generating function (1.1) and the corresponding centrality measure (1.2). Theorem 2.1. For a simple graph with associated adjacency matrix A, the expo- nential generating function F(t) in (1.1) and total communicability vector b in (1.2) satisfy A F(t)= I 0 tψ (tY) +I, 1 (cid:20)A2 D(cid:21) (cid:2) (cid:3) − A1 b= I 0 tψ (tY) +1, 1 (cid:20)(A2 D)1(cid:21) (cid:2) (cid:3) − where 0 I (2.2) Y := R2n 2n. (cid:20) I D A (cid:21)∈ × − 4 In the case where the graph has no leaves, these expressions may be written as I +(D I) 1 F(t)= I 0 exp(tY)(cid:20) A− − (cid:21)−(D−I)−1, (cid:2) (cid:3) 1+(D I) 11 b= I 0 exp(tY)(cid:20) A−1 − (cid:21)−(D−I)−11. (cid:2) (cid:3) Proof. To obtain the exponential generating function F(t) in (1.1), we multiply by tr/r! in (2.1) and then sum from r =1 to , to obtain ∞ ∞ trpr+2(A) ∞ trpr+1(A) ∞ trpr(A) =A (D I) . r! r! − − r! Xr=1 Xr=1 Xr=1 From (1.3), this shows F (t) p (A)=A(F (t) p (A)) (D I)(F(t) p (A)), ′′ 2 ′ 1 0 − − − − − which simplifies to (2.3) F (t) AF (t)+(D I)F(t)= I. ′′ ′ − − − To convert to a first order system, we introduce G (t) F(t) G(t)= 1 := R2n n, (cid:20) G2(t) (cid:21) (cid:20) F′(t) (cid:21)∈ × so that (2.3) becomes 0 I G(t)=YG(t)+ , where G(0)= , ′ (cid:20) I (cid:21) (cid:20) A (cid:21) − where Y is defined in (2.2). It then follows from Lemma 1.1 that 0 I I (2.4) G(t)=tψ (tY) +Y + . 1 (cid:18)(cid:20) I (cid:21) (cid:20) A (cid:21)(cid:19) (cid:20) A (cid:21) − Finally, we note that the Jacobian matrix Y in (2.2) is invertible if and only if (D I) 1 exists, that is, if the graph does not contain leaves, i.e., nodes of degree − − one. In this case (D I) 1A (D I) 1 Y−1 =(cid:20) −I − − −0 − (cid:21) and (2.4) simplifies to I+(D I) 1 (D I) 1 (2.5) G(t)=exp(tY) − − − − . (cid:20) A (cid:21)−(cid:20) 0 (cid:21) NotingthatF(t)= I 0 G(t),weseethat(2.4)and(2.5)yieldthestatement. (cid:2) (cid:3) We discuss these results further in Sections 4 and 5. 5 3. Exponential generating function for directed graphs. For directed graphs the matrices p (A) that count NBTWs satisfy a three term recurrence. This r was shown in [12], with a more linear algebra oriented derivation given in [42]. With p (A)=I, p (A)=A, p (A)=A2 D, the recurrence takes the form 0 1 2 − (3.1) p (A)=Ap (A) (D I)p (A) (A S)p (A), for r 0. r+3 r+2 r+1 r − − − − ≥ Recall that S Rn n is the adjacency matrix of the graph obtained by only consid- × ∈ ering reciprocated links in the original network; so s = a a . We emphasize that ij ij ji unlike the undirected case (2.1), this recurrence is valid from r =0. Here is the analogue for directed graphs of Theorem 2.1. Theorem 3.1. Let A be the adjacency matrix of an unweighted directed graph with no self-loops nor multiple edges. The exponential generating function F(t) in (1.1) and total communicability vector b in (1.2) satisfy I F(t)= I 0 0 exp(tZ) A , (cid:2) (cid:3) A2 D − 1 b= I 0 0 exp(tZ) A1 , (cid:2) (cid:3) (A2 D)1 − where 0 I 0 (3.2) Z := 0 0 I R3n×3n. ∈ (A S) (D I) A − − − − Proof. Multiplying (3.1) by tr/r! and summing from r =0 to we arrive at ∞ (3.3) F (t)=AF (t) (D I)F (t) (A S)F(t). ′′′ ′′ ′ − − − − We note that, in contrast to (2.3), this ODE does not have an inhomogeneous term. To convert to a first order system, introduce H (t) F(t) 1 H(t)= H2(t) = F′(t) , H (t) F (t) 3 ′′ so that I H′(t)=ZH(t), where H(0)= A , A2 D − where Z is defined in (3.2). Using (1.6) in Lemma 1.1 and noting that F(t)= I 0 0 H(t), we arrive at the statement. (cid:2) (cid:3) Itisofinteresttonotethattheremovalofbacktrackingwalksmaybeinterpreted as the introduction of damping terms (D I)F (t) and (A S)F(t) in (3.3). ′ − − − − Further, Theorem 3.1 shows that, in general, as t increases the total communicability vector b aligns with the first n components of the dominant eigenvector of Z. 6 4. Computing the centrality vectors. In this section we consider computa- tional issues. First, we note that the standard expression for the total (node) com- municability, etA1, requires the action of the exponential of a sparse n by n matrix on a vector in Rn. It is reasonable to assume that a suitable approximation may be obtained iteratively, using a small number of sparse matrix-vector multiplications [2]. AssumingthatAhasO(n)nonzeros(edges),theneachsparsematrix-vectormultipli- cation will cost O(n). Hence, using k to denote the number of iterations, the overall cost will be O(nk). In practice, it is reasonable to regard k as finite and independent from n, so that this expression becomes O(n). In Theorem 2.1, we see that for an undirected graph with no leaves, the cost of computing the non-backtracking total communicability vector b is dominated by the action of etY on a vector in R2n, where the 2n by 2n sparse matrix Y is defined in (2.2). We note that Y has only 2n more nonzeros than the underlying adjacency matrix A, and hence, under the assumptions above, if the same number of iterations are required, then the same O(n) cost applies. In the directed case, the expression for b in Theorem 3.1 requires the action of etZ on a vector in R3n, where the 3n by 3n sparse matrix Z is defined in (3.2). Since Z also has O(n) nonzeros, we again recover the O(n) cost under our assumptions. Itremainstoconsiderthecaseofanundirectedgraphwithleaves. InTheorem2.1, theexpressionforbinvolvestheshiftedexponential, ψ , from(1.4). Wemayproceed 1 by reducing the key computation to the approximation of the action of a matrix exponential in a slightly higher dimension. Let tY v Y := R(2n+1) (2n+1), (cid:20)0T 0(cid:21)∈ × b where v R2n and 0 R2n is a vector of all zeros. It was shown in [37, Section 2.3] ∈ ∈ that etY ψ (tY)v exp(Y)= 1 . (cid:20)0T 1 (cid:21) b So, if we let A1 v= (cid:20)(A2 D)1(cid:21) − thenbinTheorem2.1maybecomputedbyfirstapproximatingthevectorexp(Y)e , 2n+1 where e is the (2n+1)th vector of the standard basis of R2n+1. We then extract 2n+1 b the top n entries of the resulting vector, scale by t and shift by 1. This again will have an O(n) cost under our assumptions. Remark 4.1. It is worth emphasizing that the final steps of scaling and shifting can be avoided if we are only interested, as is customary in network science, in the rankings of the nodes rather than their actual centrality scores. 5. Block matrix interpretations. Theorems 2.1 and 3.1 show that, loosely, for exponential-based centrality the operation of eliminating backtracking walks is equivalent to replacing the original adjacency matrix A with a suitable block matrix. Intheundirectedcase,the2by2versionY in(2.2)isused,whereasthedirectedcase has the 3 by 3 version Z in (3.2). These block matrices were constructed indirectly, as a consequence of the generating function approach, [46]. In this section, we add insight by showing how these block matrices may be interpreted directly. We also 7 show that they remain relevant when the exponential is replaced by a general matrix function. We begin with an algebraic interpretation. The matrices Z and Y are the first companion matrix linearization [18] of the matrix polynomials (A S)+(D I)λ − − − Aλ2+Iλ3 and, respectively, (D I) Aλ+Iλ2. These are in turn the reversals of, − − respectively, the directed and undirected version of the deformed graph Laplacian, a matrix polynomial that plays a crucial role in the theory of NBTW combinatorics, see [6, 20]. Algebraically, the companion matrix is the multiplication-by-λI operator in the module obtained by forming quotients in the ring of matrix polynomials using the left ideal generated by the linearized matrix polynomial [34]. Linear-algebraic functionsofthecompanionmatrixalsohaveananalogousinterpretation;forexample, exp(tZ)representsmultiplicationbyexp(tλ)I modulo(A S)+(D I)λ Aλ2+Iλ3. − − − Formoredetailsonthisalgebraicviewofcompanionmatricessee,e.g.,[33,Sec. 2],[34, Sec. 5], [35, Sec. 9], and the references therein. There is also a graph-theoretical interpretation that we now describe in detail. We focus first on the directed case, involving Z in (3.2). A useful connection is that Z represents the adjacency matrix for a three layer weighted network. This network is node-aligned [27], so that the general ith nodes from each layer may be regarded as copies of the same node. Looking at the diagonal blocks of Z, we see that the within-layer connections correspond to the empty graph for layers one and two, and the original graph for layer three. Next, we consider the off-diagonal blocks. The identity matrix in the (1,2) block and zero matrix in the (1,3) block show that the only edge out of a node in layer one points to the corresponding node in layer two. Similarly, we may only leave layer two by moving to the corresponding node in layer three. Within layer three, in addition to moving within the layer using edges from A, traversalsarepossibletolayerstwoandone. The(3,2)blockshowsthatforD >1a ii traversalmaymovefromnodeiinlayerthreetonodeiinlayertwo,withthenegative weight 1 D assigned to that edge. The (3,1) block shows that a move from node ii − i in layer three to node j in layer one is allowed if and only if the original graph has a nonreciprocal edge from node i to node j, in which case the inter-layer edge carries weight 1. − The expression for b in Theorem 3.1 shows that exponential NBTW centrality arises from a post-processed version of the “standard” exponential centrality on this multilayer network, where negative inter-layer weights have the effect of negating the contributions of backtracking walks in the underlying graph. The two lemmas and the theorem below show that Z is also relevant for more general matrix functions. Lemma 5.1. For all r 0 we have ≥ 0 p (A) r ZrZ0=pr+1(A). I p (A) e r+2 Proof. For r < 2, the identity may be verified directly. For r 2 the result ≥ follows by straightforward induction using the three-term recurrence (3.1). Theorem 5.2 (Directed graph). Let α be a sequence such that f (Z) = { r}∞r=0 0 ∞r=0αrZr converges. Then ∞r=0αrpr(A) converges, and P P I ∞ αrpr(A)=[I 00]f0(Z) A . Xr=0 A2 D − 8 Moreover, ∞r=0αrpr(A) is equal to the (3,3) block in f0(Z)−f2(Z), where P ∞ f (Z)= α Zr. k r+k Xr=0 Proof. By Lemma 5.1 it follows that 0 ∞ ∞ αrpr(A)=[I 00] αrZrZ0 Xr=0 Xr=0 I e and, since 0 I Z0= A I A2 D e − the first part of the statement follows. For the second part, note first that ∞r=0αr+2Zr also converges under the as- sumptions in the statement. Then P ∞ ∞ α p (A)=α I+α A+ α p (A) r r 0 1 r r Xr=0 Xr=2 0 ∞ =α0I+α1A+[00I] αrZr−2Z0 Xr=2 I e 0 0 ∞ ∞ =α0I+α1A+[00I] αrZr0 [00I] αrZr−20 − Xr=2 I Xr=2 I 0 ∞ =[00I] (αr αr+2)Zr0, − Xr=0 I as required. Theorem 5.2 shows that we can accumulate combinations of NBTW counts in the original network by looking at the (3,3) block of an appropriate function of the matrix Z. From a multilayer perspective, the (3,3) block of Z is associated with the layer containing the adjacency matrix A of the original network. Theorem 5.2 therefore concerns walks of any length that start and end at that layer. This includes not only walks that take place within layer three, but also those that make one or more intermediate visits to the other layers. Thecontributioncomingfromf (Z)countswalksinthemultilayernetworkusing 0 the original weights. The correction term involving f (Z) removes walks from the 2 count by weighting walks of length r 2 as if they were of length r. − In more detail, the term f (Z) is used to correct the appearance of an identity 2 matrix in Z2 that is then propagated at higher powers of Z. This identity matrix at level r = 2 originates from the one appearing in the (3,2) block. This latter is used in the recursion (3.1) defining p (A) when r >2 to correct the penalization of walks r of the type i k j k j of length r → ··· → → → → 9
Description: