A Coloring Algorithm for Triangle-Free Graphs Mohammad Shoaib Jamall ⋆ Department of Mathematics, UC San Diego [email protected] Abstract. Wegivearandomizedalgorithmthatproperlycolorstheverticesofatriangle-freegraphG 1 onnverticesusingO(∆(G)/log∆(G))colors,where∆(G)isthemaximumdegreeofG.Thealgorithm 1 takes O(n∆2(G)log∆(G)) time and succeeds with high probability, provided ∆(G) is greater than 0 log1+ǫn for a positive constant ǫ. The number of colors is best possible up to a constant factor for 2 triangle-freegraphs.Asaresultthisgivesanalgorithmicproofforasharpupperboundofthechromatic n numberofatriangle-freegraph,theexistenceofwhichwaspreviouslyestablishedbyKimandJohansson a respectively. J 9 2 1 Introduction ] O Aproper vertex coloring ofa graphis anassignmentofcolorstoallverticessuchthatadjacentverticeshave C distinct colors. The chromatic number χ(G) of a graph G is the minimum number of colors required for . a proper vertex coloring. Finding the chromatic number of a graph is NP-Hard [10]. Approximating it to h within a polynomial ratio is also hard [15]. For general graphs, ∆(G)+1 is a trivial upper bound. Brooks’ t a Theorem[7]showsthatχ(G)canbe∆(G)+1onlyifGhasacomponentwhichiseitheracompletesubgraph m or an odd cycle. [ A natural question is: can this bound be improved for graphs without large complete subgraphs? In 1968,Vizing [22] had asked what the best possible upper bound for the chromatic number of a triangle-free 1 v graph was. Borodin and Kostochka [6], Catalin [8], and Lawrence [19] independently made progress in this 1 direction; they showedthat for a K -free graph, χ(G)≤3(∆(G)+2)/4. On the other hand, Kostochka and 4 2 Masurova [18], and Bollob´as [5] separately showed that there are graphs of arbitrarily large girth(length of 7 a shortest cycle) with χ(G) of order ∆(G)/log∆(G). 5 Further progresswasmade using the semi-random method to showthat the chromaticnumber ofgraphs . 1 with large girth is O(∆(G)/log∆(G)). This technique, also known as the pseudo-random method, or the 0 Ro¨dl nibble, appeared first in Ajtai, Koml´os and Szemer´edi [2] and was applied to problems in hypergraph 1 packings, Ramsey theory, colorings, and list colorings [9,13,14,17,21]. In general, given a set S , the goal 1 1 : is to show that there is an object in S1 with a desired property P. This is done by locating a sequence of v non-empty subsets S ⊇ ··· ⊇ S with S having property P. A randomized algorithm is applied to S , i 1 τ τ t X which guaranteesthat S will be obtained with some non-zero(oftensmall) probability.For upper bounds t+1 r on chromatic number, the semi-random method is used to prove the existence of a proper coloring with a a limited number of colors. It does not give an efficient probabilistic algorithm. In 1995, Kim [16] proved that ∆(G) χ(G)≤(1+o(1)) log∆(G) when G has girth greater than 4. Later on, Johansson [12] showed that ∆(G) χ(G)≤O( ) log∆(G) when G is a triangle-free graph(girth greater than 3). Both Kim and Johansson used the semi-random method. Alon,KrivelevichandSudakov[3], andVu[23] extendedthe methodofJohanssonto provebounds ⋆ Research supported in part by a ReubenH. Fleet Foundation Fellowship, and an ARCSFoundation Fellowship. on the chromatic number for graphs in which no subgraph on the set of all neighbors of a vertex has too many edges. Grable and Panconesi [11] gave an algorithm for properly coloring a ∆-regular graph with girth greater than4and∆≥log1+ǫnforapositiveconstantǫ.TheprocedureusesO(∆/log∆)colorsandhaspolynomial runningtime.ThisisaconstructiveversionoftheexistentialproofofKimwithextraconditionsonthedegree of the input graph. Section 5 of [11] asserts that the algorithm can be extended to triangle-free graphs. In Section 2.1 we will see a counter-example that shows that their analysis does not work on all triangle-free graphs. In particular, as will be seen, the analysis fails on a complete bipartite graph. In this paper we give a method for coloring the larger class of triangle-free graphs. Our randomized algorithmproperly colorsa triangle-free graphG onn vertices with ∆(G)≥log1+ǫn for a positive constant ǫ, using O(∆(G)/log∆(G)) colors in O(n∆2(G)log∆(G)) time. The probability of failure goes to 0 as n becomes large. To analyze our iterative algorithm we identify a collection of random variables. The expected changes to these random variables after a round of the algorithm are written in terms of the values of the random variablesbefore the round.We thusobtainasetofrecurrencerelationsandprovethatourrandomvariables are concentrated around the solutions to the recurrence relations. Our method of analysis resembles that of Kim, and Grable et al. However our algorithm and collection of randomvariables are different. Also related to our analysis technique is Achlioptas and Molloy’s study of the performance of a coloring algorithm on random graphs [1]. They setup recurrence relations as we do, and then use the so-called differential equation method for random graph processes [24]. We describe our algorithm in Section 2. Section 2.1 contains motivation, which is followed by a formal descriptionofthealgorithminSection2.2.WegiveanoutlineoftheanalysisinSection3.Section4contains some useful lemmas which we are used in Section 5 to give details of the analysis. 2 An Iterative Algorithm for Coloring a Graph Our algorithmtakes as input a triangle-free graphG on n vertices,its maximum degree ∆, and the number ofcolorstouse∆/k wherek isapositivenumber.Itgoesthroughroundsandassignscolorstomorevertices eachround.Initiallyallverticesareuncolored(nocolorassigned),attheendwehaveapropervertexcoloring of G with some probability. Definition 1. Let t be a natural number. We define the following: G The graph induced on G by the vertices that are uncolored at the beginning of round t. t N (u) The set of vertices adjacent to vertex u in G . That is, the set of uncolored neighbors of u t t at the beginning of round t. S (u) The list of colors that may be assigned to vertex u in round t, also called the palette of u. t For all u in V(G), S (u)={1,...,∆/k}. 0 D (u,c) The set of vertices adjacent to u that may be assigned color c in round t. That is, t D (u,c):={v ∈N (u)|c∈S (v)}. t t t 2 It will be useful to define variables for the sizes of the sets N (u), S (u), and D (u,c). t t t Definition 2. η (u)=|N (u)| t t s (u)=|S (u)| t t d (u,c)=|D (u,c)| t t Observe that for every round t, vertex u in V(G), and color c in {1,...,∆/k}, d (u,c),η (u)≤∆, s (u)=∆/k, s (u)≤∆/k. t t 0 t 2.1 A Sketch of the Algorithm and the Ideas behind its Analysis We startbyconsideringanalgorithmthatworksfor∆-regulargraphswithgirthgreaterthan4.The rounds of this algorithm will go through two stages. The number of colors in the palette of each uncolored vertex will be greater than the number of uncolored adjacent vertices at the end of the second stage with some probability. Then the partial coloring of the graph can be completed easily to give a proper coloring. Let (η ), (d ), and (s ) be sequences defined recursively as t t t η :=∆ η :=η (1−c st) 0 t+1 t 1dt d :=∆ d :=d (1−c st)c 0 t+1 t 1dt 2 s :=∆/k s :=s c 0 t+1 t 2 where c and c are constants between 0 and 1, which are determined by the analysis of the algorithm. 1 2 First Stage Repeat at every round t, until d /s <1 t t Phase I - Coloring Attempt For each vertex u in G : t Awake vertex u with probabilitity s /d . t t If awake, assign to u a color chosen from S (u) uniformly at random. t Phase II - Conflict Resolution For each vertex u in G : t If u is awake, uncolor u if an adjacent vertex is assigned the same color in the coloring attempt phase. Remove from S (u), all colors assigned to adjacent vertices. t S (u)=S (u). t+1 t end repeat Observe that d /s =k and 0 0 d d t+1 t = −c . 1 s s t+1 t In O(k) rounds d /s will be less that 1, and this marks the end of the first stage. t t Second Stage We change the recursive equations for η , d , and s . t t t ηt+1 :=ηt(1−c1ec2dt/st) dt+1 :=dt(1−c1ec2dt/st)e−c2dt/st st+1 :=ste−c2dt/st 3 All uncolored vertices are woken up at every round. In this stage, η decreases much faster than s and t t the repeat-until loop is repeated until η <s . t t Otherwise the second stage is the same as the first. ThetwostagealgorithmisderivedfromGrableetal.TheiranalysistellsusthatifgraphGhasgirthgreater than 4 and ∆ ≥ log1+ǫn for some positive constant ǫ, then there are constants c and c less than 1 such 1 2 that at each round t, ∀u∈V(G ),∀c∈S (u) t t η (u)=η (1+o(1)), s (u)=s (1+o(1)), d (u,c)=d (1+o(1)) (1) t t t t t t with probability 1−o(1). The equations above imply that at the end of the second stage s (u) > η (u) for t t all uncolored vertices u with probability approaching 1 as n approaching ∞. The change we have made to Grable et al. so far, is that we remove all colors temporarily assigned to neighborsofavertexfromits palette,insteadofremovingonlythose assignedpermanently.This simple but powerful idea, adapted from Kim [16], will waste a few colors from the palettes, and simplify our algorithm and its analysis significantly. A counter-example to demonstrate that Grable et al. does not work on graphs with girth greater than 3. The analysis for the above algorithm is probabilistic and proves the property in equation (1) by induction, showing concentration of the variables around their expectations. It fails for graphs with 4-cycles. An example illustrates why: Consider a vertex u whose 2-neighborhood, the graph induced by vertices within distance 2 of u, is the complete bipartite graph K with partitions X and Y. Suppose ∆,∆ that u and another vertex v are in X. If v is colored with c in round 0 while u remains uncolored, then the set D (u,c) = ∅; this violates equation (1) since d ≥ 1 if for example k ≥ 2 and ∆ ≥ 2/c . So, when the 1 1 2 graph has 4-cycles, d (u,c) is not necessarily concentrated around its expectation given the state of the t+1 algorithmatthe beginning of roundt. Maintaining equation(1) is crucialfor the proofin Grable et al.,and this violation is the error we mentioned in the introduction; their algorithm and analysis do not work for triangle-free graphs in general! We must modify the algorithm in two more ways. First Modification: A technique for coloring graphs with 4-cycles. Whiled (u,c)isnotconcentrated t enough when the graph has 4-cycles, our analysis will show that the averageof d (u,c) over all colors t+1 inthepaletteofavertexuisconcentratedenough.Howdoesthisbenefitus?Markov’sfamousinequality may be interpreted as: a list of s positive number which averaged has at most s/q numbers larger than qdforanypositivenumberq.Wemodify thealgorithmsothatattheendofeachroundt,everyvertexu removesfromitspaletteeverycolorcwithd (u,c)largerthanqd forsomeconstantq largerthan1. t+1 t+1 Look at what happens in round t = 1. By a straightforward application of Markov’s inequality, instead of equation (1) we will have the less stringent property: ∀u∈V(G ),∀c∈S (u) t t q−1 η (u)≤η (1+o(1)), s (u)≥ s (1−o(1)), d (u,c)≤qd (1+o(1)). (2) t t t t t t q with probability 1−o(1). In fact, using a generalization of Markov’s inequality, the analysis will show that with a few more modifications our algorithm maintains a slightly stronger property(still weaker than equation (1)). Equation (1) implies that the η (u), s (u), and d (u) at all uncoloredvertices u are about the same. t t t It is a strong statement and helps in the proofs, but is too much to maintain on graphs with 4-cycles. Equation(2)isweakerandisguaranteedbyouralgorithmandwhatis more,itissufficienttoguarantee thatalluncoloredverticesattheendofthesecondstagehavepaletteswithmorecolorsthanthenumber of uncolored adjacent vertices. This is a key idea in our algorithm. 4 Second Modification: Using independent random variables for easier analysis. Insteadofwaking up a vertex with some probability, and then choosing a color from its palette uniformly at random; for each uncolored vertex u and color c in its palette, we will assign c to u independently with some prob- ability. In case multiple colors remain assigned to the vertex after the conflict resolution phase, we will arbitrarily choose one of them to permanently color the vertex. This modification, adapted from Johansson [12], will make concentration of our random variables simpler. Next we provide a formal description of the algorithm we have just motivated. 2.2 A Formal Description of the Algorithm Ineachroundofthealgorithm,someverticesarecolored.Thedetailsofthecoloringprocedurevarydepending on which of three stages the algorithm is in. First Stage Letq be a constantgreaterthan1,andlet(η ),(d ), and(s ) be sequencesdefined recursively t t t as q−1 s η :=∆ η :=η (1− e−1/q t) 0 t+1 t 2q3 d t q−1 s d :=∆ d :=d (1− e−1/q t)e−1/q 0 t+1 t 2q3 d t s :=∆/k s :=s e−1/q. 0 t+1 t (3) For round t, vertex u, and color c, F (u,c):={c is not assigned to any vertex adjacent to u in round t} (4) t is an event in the probability space generated by the random choices of the algorithm in round t, given the state of all data structures at the beginning of the round. Let Desired F :=e−1/q. t 5 Repeat at every round t, until d /s <1/q2 t t Phase I - Coloring Attempt For each vertex u in G , and color c in S (u): t t Assign c to u with probability 1 1. q2dt Phase II - Conflict Resolution For each vertex u in G : t Phase II.1 Remove from S (u), all colors assigned to adjacent vertices. t Phase II.2 For each color c in S (u), remove c from S (u) with probability t t Desired F (u,c) t 1−min(1, ). Pr(F (u,c) t If S (u) has at least one color which is assigned to u, t then arbirarily pick an assigned color from S (u) to permanently color u. t Phase III - Cleanup(discard all colors c with d (u,c)&qd from palette) t+1 t+1 For each vertex u in G : t S (u)=S (u). t+1 t Let α=1−|S (u)|/s . t+1 t+1 If α<0, then α=0, otherwise if α>1/q, then α=1/q. Let γ be the smallest number in [1,∞) so that 1−qα Average d (u,c)≤ γd . c∈St+1(u) t+1 1−α t+1 Remove all colors c with d (u,c)≥qγd from S (u). t+1 t+1 t+1 end repeat Second Stage We change the recursive equations defining the constants η , d , and s . t t t ηt+1 :=ηt(1− q2−q31e−1qdstt) dt+1 :=dt(1− q2−q31e−q1dstt)e−q1dstt st+1 :=ste−q1dstt (5) Also Desired Ft :=e−q1dstt. All other details of the repeat-until loop of the first stage are the same for the secondstage, except that an uncolored vertex u is assigned a color c from its palette with probability 1 1 , q2s t and we repeat until q−1 η < s . t t q Third Stage(Greedy Coloring) Color each uncolored vertex u, with an arbitrary color from its palette which has not been used to color an adjacent vertex. 6 3 The Main Theorem We say that a sequence x(n) is O(f(n)) if there is a positive number M such that |x(n)| ≤ M|f(n)|. All sequencesinthebig-ohareindexedbyn,thenumberofverticesingraphG.Rememberthateachoccurrence of the big-oh comes with a distinct constant M which may depend on the constant ǫ. Theorem 1 (Main Theorem). Given a triangle-free graph G on n vertices with maximum degree ∆ ≥ log1+ǫn for a positive constant ǫ, and q =7, thereis a positive constant c such that startingwith c ∆/log∆ ǫ ǫ colors, our algorithm finds a proper coloring of the graph in O(n∆2log∆) time with probability 1−O(1/n). We need some lemmas to prove the Main Theorem and before that we need the following definition. Definition 3. d (v)=Average d (v,c) t c∈St(v) t Lemma 1 (MainLemma).Givenatriangle-freegraphGonnverticeswithmaximumdegree∆≥log1+ǫn for a positive constant ǫ,q ≥2, and ψ >1, thereare positive constants α and α such that for thesequences 1 2 (e ) defined by t ψ ψ e =0, e =α (e + + ) for t>0, (6) 0 t+1 1 t d s r t r t and (f ) defined by t f =1−α tn2e−ψ, (7) t 2 if s ≫ψ and e ≪1 at round t, then t t ∀u∈V(G ),∃α∈[0,1/q],∀c∈S (u), t t s (u)≥(1−α)s (1−e ) t t t 1−qα d (u)≤ d (1+e ) t t t 1−α d (u,c)≤qd (1+e ) t t t η (u)≤η (1+e ) t t t with probability greater than f . t We will prove the Main Lemma in Section 5 and assume it in this section. Using it we can immediately conclude the following. Corollary 1. Given the setup of the Main Lemma(Lemma 1), if s ≫ ψ and e ≪ 1 at round t, then t t ∀u∈V(G ), t q−1 s (u)≥ s (1−e ) t t t q d (u)≤d (1+e ) t t t with probability greater than f . t Lemma 2. The first stage finishes in O(k) rounds. Proof. By the definition of sequences (d ) and (s ) in equation (3), we have t t d d q−1s t+1 = t(1− te−1/q) s s 2q3 d t+1 t t d q−1 = t − e−1/q. s 2q3 t Since d0 =k we get dt1 ≤ 1 for some round t =O(k). ⊓⊔ s0 st1 q2 1 7 Let t =O(k) 1 be the last round of the first stage. Then the following lemma is a straightforward application of equation (3). Lemma 3. ∆ s = e−O(k) and ∃c>0 such that if k ≤clog∆, then s ≫1. t1 k t1 3.1 The Second Stage: Controlling the ratio of available colors to uncolored neighbors. We must show that η decreases significantly faster than s in the second stage. Let t t q−2 ρ=1− . 2q3 Lemma 4. η <η ρt t1+t t1 Proof. By the definition of sequence (η ) in equation (5), we have t ηt+1 =ηt(1− q2−q31e−q1dstt) q−1 1d <η (1− (1− t)) hex ≥1+xi t 2q3 q s t q−1 1 d t <η (1− (1− )) h <1i t 2q3 q s t q−1 1 <η (1− + ) t 2q3 2q3 q−2 =η (1− ) t 2q3 =η ρ. t Thus η <η ρt. ⊓⊔ t1+t t1 We now study the ratio d /s , which appears in the recursive equation (5) for the sequences (η ), (d ), and t t t t (s ). t Lemma 5. d 1 t1+t < ρt s q2 t1+t Proof. By the definition of sequences (d ) and (s ) in equation (5), we have t t dt+1 = dt(1− q−1e−1qdstt). s s 2q3 t+1 t Since the ratioof d /s to d /s is the same as the ratioof η to η , we use Lemma 4 to conclude that t+1 t+1 t t t+1 t d d 1 t1+t < t1ρt < ρt. s s q2 t1+t t1 ⊓⊔ 8 Lemma 6. If q is greater than 2, then 4 s ≥s (1− ). t1+t t1 q−2 Proof. By the definition of sequence (s ) in equation (5), we have t st+1 =ste−1qdstt ≥ste−1qdstt 1d t ≥s (1− ). t qs t Thus t 1d t s ≥s (1− ) t1+t t1 qs t i=0 Y t 1 ≥s (1− ρi) hLemma 5i t1 q3 i=0 Y t 2 ≥s exp(− ρi) hsince q >2i t1 q3 i=0 X 2 1 ≥s exp(− ) hsum of geometric seriesi t1 q31−ρ 4q3 =s exp(− ) t1 q3(q−2) 4 =s exp(− ) t1 q−2 4 ≥s (1− ). t1 q−2 ⊓⊔ Lemma 7. For any δ > 0, there is a c > 0 such that if q > 6, k ≤ c log∆, and ∆ ≫ 1, then the second δ δ stage takes at most δlog∆ rounds. Proof. By Lemma 3, we have ∆ s = e−O(k). t1 k Since q >6, the above equation and Lemma 6 imply that ∆ s =Ω( e−O(k)) t1+t k for all t. Also, by Lemma 4, we have η <∆ρt. t1+t Since ∆≫1,givena δ it is straightforwardto find a c >0 sothat ifk ≤c log∆, then afterδlog∆ rounds δ δ η < q−1s . ⊓⊔ t q t 9 3.2 Bounding the Error Estimate in all Concentration Inequalities Now we look at d , which is used to bound the error term e . Let t t 1 µ=1− . 2q2 Lemma 8. 4 d ≥d (1− )µt t1+t t1 q−2 Proof. By the definition of sequence (d ) in equation (5), we have t dt+1 =dt(1− q2−q31e−1qdstt)e−1qdstt ≥dt(1− 21q2)e−1qdstt =dtµe−1qdstt. Combining the inequality above with the definition of sequence (s ) in equation (5), we get t d s t+1 t+1 ≥µ . d s t t We now use Lemma 6 to conclude that d s 4 t1+t ≥µt t1+t ≥µt(1− ). d s q−2 t1 t1 ⊓⊔ Let t be the number of rounds spent in the second stage. 2 Lemma 9. There is a positive constant α such that in the first two stages of our algorithm, keO(k)ψ e ≤αtO( ). t s ∆µt2 Proof. By Lemma 3, we have ∆ s = e−O(k). t1 k Note that in equation (6), the recurrence for e , the largest term is O( ψ/s ) in the first stage, while t t O( ψ/d ) is larger in the second. At round t , the algorithm moves to the second stage and d = Θ(s ). t 1 p t1 t1 The greedy stage starts at round t +t . Since both sequences (d ) and (s ) are decreasing, we use Lemma p 1 2 t t 8 to conclude that ψ ψ k eO(k)ψ O( )=O( )=O( ) sdt1+t2 sst1µt2 s ∆µt2 is the maximum this error term can be. Thus we can simplify the recurrence for e to t k eO(k)ψ e =O(e + ). t+1 t s ∆µt2 Since e =0, a simple upper bound for e is given by 0 t k eO(k)ψ e ≤αtO( ) t s ∆µt2 where α is some positive constant. ⊓⊔ 10