The logarithmic least squares optimality of the geometric mean of weight vectors calculated from all spanning trees for (in)complete pairwise comparison matrices Sa´ndor Bozo´ki1,2, Vitaliy Tsyganok3,4 Abstract Pairwise comparison matrices, a method for preference modelling and quantification in 7 multi-attributedecisionmakingandrankingproblems,arenaturallyextendedtotheincom- 1 pletecase,offeringawiderrangeofapplicability. Theweightingproblemistofindaweight 0 vectorthatreflectsthedecisionmaker’spreferencesaswellaspossible. Thelogarithmicleast 2 squaresproblemhasauniqueandsimplycomputablesolution. Thespanningtreeapproach r doesnotassumeanymetricinadvance,insteaditgoesthroughallminimalsufficientsubsets a (spanningtrees)ofthesetofpairwisecomparisons,thentheweightvectorsareaggregated. M It is shown that the geometric mean of weight vectors, calculated from all spanning trees, is the optimal solution of the well known logarithmic least squares problem, not only for 9 complete,asitwasrecentlyprovedbyLundy,SirajandGreco, butforincompletepairwise comparison matrices as well. ] C O 1 Incomplete pairwise comparison matrices . h t Cardinalpreferencesofdecisionmakersareoftenmodelledandcalculatedbypairwisecomparison a m matrices [16]. Questions ’How many times a criterion is more important than another one?’ or ’Howmanytimesagivenalternativeisbetterthananotheronewithrespecttoafixedcriterion?’ [ are typical in multi-attribute decision problems. The numerical answers are collected into a 2 pairwise comparison matrix A = [a ] having reciprocity, i.e., a = 1/a . A pairwise ij i,j=1...n ij ji v comparison matrix can be complete, as in a classical AHP model [16], or incomplete [11, 13]. In 5 the paper incomplete means ’not necessarily complete’, in other words, the number of missing 6 2 elements is allowed to be zero. 4 Incompletepairwisecomparisonmatricesareappliednotonlyinthesamedecisionsituations 0 in which the complete matrices arise, but also larger decision and ranking problems. Boz´oki, . 1 Csat´oandTemesi[1]proposearankingmethodfortoptennisplayersbasedontheirpairwisere- 0 sults,whereincompletenessoccursinanaturalway. Csat´o[6]constructeda149×149incomplete 7 pairwise comparison matrix to rank the teams of the 39th Chess Olympiad 2010. 1 : v 1LaboratoryonEngineeringandManagementIntelligence,ResearchGroupofOperationsResearchandDeci- i X sionSystems,InstituteforComputerScienceandControl,HungarianAcademyofSciences;Mail: 1518Budapest, P.O.Box63,Hungary. E-mail: [email protected] r 2DepartmentofOperationsResearchandActuarialSciences,CorvinusUniversityofBudapest,Hungary a 3Laboratory for Decision Support Systems, Institute for Information Recording of National Academy of Sci- encesofUkraine;Mail: 2,Shpakstr.,Kyiv,03113,Ukraine. E-mail: [email protected] 4DepartmentofSystemAnalysis,StateUniversityofTelecommunications,Ukraine 1 Example 1.1. Let A be a 6×6 incomplete pairwise comparison matrix as follows: 1 a a a a 12 14 15 16 a21 1 a23 A= a32 1 a34 a41 a43 1 a45 a51 a54 1 a 1 61 2 The logarithmic least squares (LLS) problem The basic problem of finding the best weight vectors usually includes an additional information on how closeness is defined or specified. The classical approaches apply metrics based on least squares[4],weightedleastsquares[4],logarithmicleastsquares[5,12],justtonameafew. Alot of further weighting methods are discussed by Golany and Kress [10] and by Choo and Wedley [3]. Even the well known eigenvector method [16] is proved to be a distance minimizing method [7, 8], although its metric seems to be rather artificial. The Logarithmic Least Squares (LLS) problem [13] is defined as follows: min (cid:88) (cid:20)loga −log(cid:18)wi(cid:19)(cid:21)2 (1) ij w j i,j: aij isknown w >0, i=1,2,...,n. (2) i Originally, the LLS problem was defined for complete pairwise comparison matrices, i.e., the sum in the objective function is taken for all i,j. In this special case, the LLS optimal solution isuniqueanditcanbeexplicitlycomputedbytakingthegeometricmeanofrows’elements[5,12]. n n (cid:80) (cid:81) Themostcommonnormalizationsare w =1and w =1.Normalizationw =1,called i i 1 i=1 i=1 ideal-mode in Lundy, Siraj and Greco [14]), can also be interpreted: the first object (criterion, alternative) is considered a reference point and all the others are expressed according to it. Given an (in)complete pairwise comparison matrix A of size n × n, an undirected graph G(V,E) is defined as follows: G has n nodes and the edge between nodes i and j is drawn if and only if the matrix element a is known. ij The incomplete LLS problem can be solved by the following theorem: Theorem2.1. (Boz´oki,Fu¨l¨op,R´onyai[2])LetAbeanincompleteorcompletepairwisecompar- ison matrix such that its associated graph G is connected. Then the optimal solution w=expy of the logarithmic least squares problem is the unique solution of the following system of linear equations: (cid:88) (Ly) = loga for all i=1,2,...,n−1,n (3) i ik k:e(i,k)∈E(G) y =0 (4) 1 where L denotes the Laplacian matrix of G ((cid:96) is the degree of node i and (cid:96) = −1 if nodes i ii ij and j are adjacent). 2 L has rank n−1. Normalization (4), being equivalent to w =1, plays a technical role only. 1 Itcanbereplacedby,e.g.,thecommonlyused(cid:81)n w =1 (⇔(cid:80)n y =0).Thecomputational i=1 i i=1 i complexity of solving the system of n linear equations (3)-(4) is at most O(n3) [22, Chapter 1]. Example 2.1. Let incomplete pairwise comparison matrix A be the same as in Example 1.1. Equations (3) for i=1,2,...,6 form the following system of linear equations: 4 −1 0 −1 −1 −1 y loga +loga +loga +loga 1 12 14 15 16 −1 2 −1 0 0 0 y2 loga21+loga23 0 −1 2 −1 0 0 y3= loga32+loga34 −1 0 −1 3 −1 0 y4 loga41+loga43+loga45 −1 0 0 −1 2 0 y5 loga51+loga54 −1 0 0 0 0 1 y loga 6 61 3 Aggregations of weight vectors calculated from all span- ning trees The spanning tree approach by Tsyganok [19, 20] does not assume any distance function or measure of closeness. The basic idea is that the set of pairwise comparisons is considered as the unionofminimal,connectedsubsets,or,ingraphtheoreticalterms,spanningtrees. LetS denote the number of all spanning trees of graph G. Every spanning tree determines a unique weight vector, fitting on the corresponding subset of matrix elements perfectly. Given a spanning tree, the calculation of its associated weight vector requires O(n) steps. The number of spanning trees can be very large. In the special case of complete pairwise comparisonmatrices,thenumberofallspanningtreesisS =nn−2byCayley’stheorem. Another extremal case is when the graph of the incomplete pairwise comparison matrix is itself a tree (S =1). TheenumerationofallspanningtreesrequiresO(n+m+nS)steps,withthealgorithm of Gabow and Myers [9], where m denotes the number of edges in G. The computational complexity of calculating all weight vectors, associated to the spanning trees, is max{O(nS),O(n+m+nS)} steps, where S, the number of spanning trees, is between 1 and nn−2. Themostnaturalcandidatesfortheaggregationofweightvectorscalculatedfromallspanning trees are the arithmetic [17, 18, 19, 20] and the geometric means [14, 21]. The following theorem connects two weighting methods. Theorem 3.1. (Lundy, Siraj and Greco [14]) The geometric mean of weight vectors calculated from all spanning trees is logarithmic least squares optimal in case of complete pairwise compar- ison matrices. The rest of the paper provides the generalization of this result, it is shown that it holds for incomplete matrices as well. 4 Main result: the geometric mean of weight vectors calcu- lated from all spanning trees is logarithmic least squares optimal Theorem 4.1. Let A be an incomplete or complete pairwise comparison matrix such that its associated graph is connected. Then the optimal solution of the logarithmic least squares problem 3 is equal, up to a scalar multiplier, to the geometric mean of weight vectors calculated from all spanning trees. Proof. LetGbetheconnectedgraphassociatedtothe(in)completepairwisecomparisonmatrix AandletE(G)denotethesetofedges. Theedgebetweennodesiandjisdenotedbye(i,j). The Laplacian matrix of graph G is denoted by L. Let T1,T2,...,Ts,...,TS denote the spanning trees of G, where S denotes the number of spanning trees. E(Ts) denotes the set of edges in Ts. Hereafter,upperindexsisalsousedforindexingaweightvectororapairwisecomparisonmatrix, associated to spanning tree Ts. Let ws,s = 1,2,...,S, denote the weight vector calculated from spanning tree Ts. Weight vector ws is unique up to a scalar multiplication. For sake of simplicity we can assume that ws = 1, but other ways of normalization, e.g., (cid:81)w = 1 can 1 i also be chosen. Let ys := logws, s = 1,2,...,S, where the logarithm is taken element-wise. Let wLLS denote the optimal solution to the incomplete Logarithmic Least Squares problem (normalized by wLLS =1) and yLLS :=logwLLS, then by Theorem 2.1, 1 (cid:0)LyLLS(cid:1) = (cid:88) b for all i=1,2,...,n, i ik k:e(i,k)∈E(G) where b =loga for all (i,k)∈E(G). ik ik In order to prove the theorem, it is sufficient to show that (cid:32) S (cid:33) 1 (cid:88) (cid:88) L ys = b for all i=1,2,...,n. (5) S ik s=1 i k:e(i,k)∈E(G) Consideranarbitraryspanning treeTs. Then wwiss =aij forall e(i,j)∈E(Ts). Introducethe j incomplete pairwise comparison matrix As by asij := aij for all e(i,j) ∈ E(Ts) and asij := wwiss j for all e(i,j)∈E(G)\E(Ts). Again, bs :=logas (=ys−ys). Note that the Laplacian matrices ij ij i j of A and As are the same (L). Since weight vector ws is generated by the matrix elements belonging to spanning tree Ts, it is the optimal solution of the LLS problem regarding As, too. Equivalently, the following system of linear equations holds. (cid:88) (cid:88) (Lys) = b + bs for all i=1,2,...,n. (6) i ik ik k:e(i,k)∈E(Ts) k:e(i,k)∈E(G)\E(Ts) Lemma 4.1. S (cid:88) (cid:88) (cid:88) (cid:88) bik+ bsik=S bik. (7) s=1 k:e(i,k)∈E(Ts) k:e(i,k)∈E(G)\E(Ts) k:e(i,k)∈E(G) Proof. Let i be fixed arbitrarily and consider node i in all spanning trees. There is nothing to do with edges e(i,k)∈E(Ts). Since Ts is a spanning tree, for every edge e(i,k)∈E(G)\E(Ts) there exists a unique path P ={e(i,k ),e(k ,k ),...,e(k ,k)}⊆E(Ts). P ∪e(i,k) is a cycle and 1 1 2 (cid:96) bs =b +b +...+b . (8) ik ik1 k1k2 k(cid:96)k Consider the following spanning tree: Ts(cid:48)i,k,k1 :=(Ts\e(i,k1))∪e(i,k) as in Figure 1. 4 Ts Ts(cid:48)i,k,k1 Figure 1. The replacement of edge e(i,k ) in spanning tree Ts by edge e(i,k) 1 results in spanning tree Ts(cid:48)i,k,k1. Spanning trees Ts and Ts(cid:48)i,k,k1 differ in one edge only and we can write again that s(cid:48) b i,k,k1 =b +b +...+b . (9) ik1 ik kk(cid:96) k2k1 The sum of equations (8) and (9) results in bs +bs(cid:48)i,k,k1 =b +b (10) ik ik1 ik ik1 because all intermediate terms vanish due to the reciprocal property of pairwise comparison matrices. Now let us continue this process and go through all edges e(i,k) ∈ E(G)\E(Ts) for all k and s. The remarkable symmetry of the set of all spanning trees implies that every edge occurs in exactly one pair. Summing all these equations like (10), the statement of Lemma 4.1 follows. An illustrative example is given below in Example 4.1. Finally, to complete the proof of Theorem 4.1, take the sum of equations (6) for all s = S 1,2,...,S and apply Lemma 4.1 to conclude that yLLS = 1 (cid:80) ys. S s=1 Remark. Complete pairwise comparison matrices (S = nn−2) are included in Theorem 4.1 asaspecialcase. TheproofofTheorem4.1canalsobeconsideredasasecond,andshorterproof of Theorem 3.1. 5 Example 4.1. (An illustration of Lemma 4.1) Let incomplete pairwise comparison matrix A be the same as in Example 1.1. The associated graph G and its spanning trees T1,T2,...,T11 are drawn in Figure 2. Let us focus on node i=1. Edges departing from node 1 are missing 12 times (and they are not missing 32 times) in the whole set of spanning trees, hence can identify 6 pairs. They induce 6 pairs of equations, that are labelled in Figure 2. In tree T1, b1 =b +b +b +b (11) 12 15 54 43 32 Note that equation (11), as well as the forthcoming ones, are labelled on the corresponding edges in Figure 2. Now s = 1,k = 2,k = 5 and s(cid:48) = 4, because the replacement of edge e(1,5) in 1 1,2,5 tree T1 by edge e(1,2) results in tree T4. Here b4 =b +b +b +b (12) 15 12 23 34 45 The sum of equations (11) and (12) confirms (10). Let us continue by edge e(1,4) in tree T1. b1 =b +b (13) 14 15 54 b2 =b +b (14) 15 14 45 The remaining four pairs of edges and their equations are listed below. b2 =b +b +b (15) 12 14 43 32 b4 =b +b +b (16) 14 12 23 34 b3 =b +b +b (17) 12 14 43 32 b7 =b +b +b (18) 14 12 23 34 b5 =b +b (19) 14 15 54 b8 =b +b (20) 15 14 45 b6 =b +b (21) 14 15 54 b9 =b +b (22) 15 14 45 Lemma 4.1 is now confirmed for i=1: 11 (cid:88) (cid:88) (cid:88) (cid:88) b1k+ bs1k=11 b1k =11(b12+b14+b15+b16). s=1 k:e(1,k)∈E(Ts) k:e(1,k)∈E(G)\E(Ts) k:e(1,k)∈E(G) 6 Figure 2. Graph G of Example 4.1 and its spanning trees T1,T2,...,T11 7 Let us move to node 2, three pairs of equations are written: b1 =b +b +b +b (23) 21 23 34 45 51 b5 =b +b +b +b (24) 23 21 15 54 43 b2 =b +b +b (25) 21 23 34 41 b8 =b +b +b (26) 23 21 14 43 b3 =b +b +b (27) 21 23 34 41 b10 =b +b +b (28) 23 21 14 43 Lemma 4.1 is now confirmed for i=2: 11 (cid:88) (cid:88) (cid:88) (cid:88) b2k+ bs2k=11 b2k =11(b21+b23). s=1 k:e(2,k)∈E(Ts) k:e(2,k)∈E(G)\E(Ts) k:e(2,k)∈E(G) The remaining nodes are left to the reader. 5 Conclusions It has been shown in the paper that two weighting methods, based on rather different principles and approaches, are in fact equivalent not only for complete pairwise comparison matrices, as Lundy, Siraj and Greco [14] recently proved, but also for incomplete ones. The geometric mean of weight vectors calculated from all spanning trees has been proved to be logarithmic least squares optimal. The advantages rooted in the definition of the two methods, namely the clear interpretation of taking all spanning trees into account, and the optimality by a well understood objective function (LLS), have now been united. There is a significant difference in computational complexity. The logarithmic least squares problemcanbesolvedfromasinglesystemoflinearequations(thecoefficientmatrixistheLapla- cian), requiring at most O(n3) steps. The spanning tree approach requires max{O(nS),O(n+ m+nS)} steps, where S, the number of spanning trees is between 1 and nn−2. As soon as S exceeds O(n2), the logarithmic least squares problem is faster to solve. An important consequence of the paper is that future analyses of weighting methods should not distinguish between the incomplete LLS and the geometric mean of weight vectors from all spanning trees. Certainapplicationsapplythespanningtrees’enumeration,butnotnecessarilytogetherwith the aggregation by the geometric mean. The approach of spanning trees enumeration is used in determining the consistency to build the distribution of expert estimates based on the matrix [15]. Such problems offer further research possibilities. The possible equivalence of the arithmetic (or other but not geometric) mean of weight vectors,calculatedfromallspanningtrees,andotherweightingmethods,isstillanopenproblem. Acknowledgements S.Boz´okiacknowledgesthesupportoftheJ´anosBolyaiResearchFellowshipno.BO/00154/16/3 and the Hungarian Scientific Research Fund (OTKA), grant no. K111797. 8 References [1] Boz´oki, S., Csat´o, L., Temesi, J. (2016): An application of incomplete pairwise comparison matrices for ranking top tennis players, European Journal of Operational Research, 248(1) 211–218 [2] Boz´oki, S., Fu¨l¨op, J., R´onyai, L. (2010): On optimal completion of incomplete pairwise comparison matrices, Mathematical and Computer Modelling, 52(1-2):318–333 [3] Choo,E.U.,Wedley,W.C.(2004): Acommonframeworkforderivingpreferencevaluesfrom pairwise comparison matrices, Computers & Operations Research, 31(6) 893–908 [4] Chu, A.T.W., Kalaba, R.E., Spingarn, K. (1979): A comparison of two methods for deter- mining the weights of belonging to fuzzy sets, Journal of Optimization Theory and Appli- cations, 27(4) 531–538 [5] Crawford, G., Williams, C. (1985): A note on the analysis of subjective judgment matrices, Journal of Mathematical Psychology 29(4) 387–405 [6] Csat´o, L. (2013): Ranking by pairwise comparisons for Swiss-system tournaments, Central European Journal of Operations Research 21(4) 783–803 [7] Fichtner, J. (1984). Some thoughts about the Mathematics of the Analytic Hierarchy Pro- cess. Report 8403, Universit¨at der Bundeswehr Mu¨nchen, Fakult¨at fu¨r Informatik, Institut fu¨r Angewandte Systemforschung und Operations Research, Werner-Heisenberg-Weg 39, D-8014 Neubiberg, F.R.G. 1984. [8] Fichtner, J. (1986). On deriving priority vectors from matrices of pairwise comparisons. Socio-Economic Planning Sciences, 20(6) 341–345 [9] Gabow, H.N., Myers, E.W. (1978): Finding all spanning trees of directed and undirected graphs, SIAM Journal on Computing, 7(3) 280–287 [10] Golany, B., Kress, M. (1993): A multicriteria evaluation of methods for obtaining weights from ratio-scale matrices, European Journal of Operational Research, 69(2) 210–220 [11] Harker, P.T. (1987): Incomplete pairwise comparisons in the analytic hierarchy process, Mathematical Modelling 9(11) 837–848 [12] de Jong, P. (1984): A statistical approach to Saaty’s scaling methods for priorities, Journal of Mathematical Psychology 28(4) 467–478 [13] Kwiesielewicz, M. (1996): The logarithmic least squares and the generalised pseudoinverse in estimating ratios, European Journal of Operational Research 93(3) 611–619 [14] Lundy, M., Siraj, S., Greco, S. (2017): The mathematical equivalence of the “spanning tree”androwgeometricmeanpreferencevectorsanditsimplicationsforpreferenceanalysis, European Journal of Operational Research 257(1) 197–208 [15] Olenko, A., Tsyganok, V. (2016): Double Entropy Inter-Rater Agreement Indices, Applied Psychological Measurement 40(1) 37–55 [16] Saaty, T.L. (1977): A scaling method for priorities in hierarchical structures, Journal of Mathematical Psychology 15(3) 234–281 9 [17] Siraj, S., Mikhailov, L., Keane, J.A. (2012): Enumerating all spanning trees for pairwise comparisons, Computers & Operations Research, 39(2) 191–199 [18] Siraj, S., Mikhailov, L., Keane, J.A. (2012): Corrigendum to “Enumerating all spanning trees for pairwise comparisons [Comput. Oper. Res. 39(2012) 191-199]”, Computers & Op- erations Research, 39(9) page 2265 [19] Tsyganok, V. (2000): Combinatorial method of pairwise comparisons with feedback, Data Recording, Storage & Processing 2:92–102 (in Ukrainian). [20] Tsyganok, V. (2010): Investigation of the aggregation effectiveness of expert estimates obtainedbythepairwisecomparisonmethod,MathematicalandComputerModelling,52(3- 4) 538–54 [21] Tsyganok, V.V., Kadenko, S.V., Andriichuk, O.V. (2015): Using different pair-wise com- parisonscalesfordevelopingindustrialstrategies,InternationalJournalofManagementand Decision Making, 14(3) 224–250 [22] Watkins, D.S. (2002): Fundamentals of Matrix Computations, Second Edition. John Wiley & Sons, Inc., New York 10