ebook img

Counting Lyndon factors PDF

0.12 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Counting Lyndon factors

COUNTING LYNDON FACTORS AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH ABSTRACT. In this paper, we determine the maximum number of distinct Lyndon factors that a word of lengthncancontain.WealsoderiveformulasfortheexpectedtotalnumberofLyndonfactorsinawordof lengthnonanalphabetofsizeσ,aswellastheexpectednumberofdistinctLyndonfactorsinsuchaword. 7 TheminimumnumberofdistinctLyndonfactorsinawordoflengthnis1andtheminimumtotalnumberis 1 n,withbothboundsbeingachievedbyxnwherexisaletter.Amoreinterestingquestiontoaskiswhatisthe 0 minimumnumberofdistinctLyndonfactorsinaLyndonwordoflengthn?Inthisdirection,itisknown[13]thatan 2 optimallowerboundforthenumberofdistinctLyndonfactorsinaLyndonwordoflengthnis log (n)+1 , n ⌈ φ ⌉ whereφdenotesthegoldenratio(1+√5)/2.Moreover,thislowerboundisattainedbytheso-calledfinite a J Fibonacci Lyndon words, which are preciselythe Lyndon factors of the well-known infiniteFibonacci word f —aspecialexampleofainfiniteSturmianword.Saari[13]conjecturedthatifwisLyndonwordoflengthn, 4 n=6,containingtheleastnumberofdistinctLyndonfactorsoverallLyndonwordsofthesamelength,then ] w6isaChristoffelword(i.e.,aLyndonfactorofaninfiniteSturmianword).Wegiveacounterexampletothis O conjecture.Furthermore,wegeneraliseSaari’sresultonthenumberofdistinctLyndonfactorsofaFibonacci C Lyndon word by determining the number of distinct Lyndonfactors of a givenChristoffel word.We end withtwoopenproblems. . h t a m [ 1. INTRODUCTION 1 v 8 ThispaperisconcernedwithcountingLyndonwordsoccurringinagivenwordoflengthn. 2 9 First, let us recall some terminology and notation from combinatorics on words (see, e.g., [10, 11]). A 0 wordisa(possiblyempty)finiteorinfinitesequenceofsymbols,calledletters, drawnfromagivenfinite 0 set Σ, called an alphabet, ofsize σ = Σ . A finite word w := x x x with each x Σ is said tohave 1. length n, written w = n. The empty| w|ord is the unique word1 o2f·l·e·ngnth 0, denotedi ∈by ε. The set of all 0 | | finite wordsover Σ (including theemptyword)is denotedby Σ ,and for each integer n 2, thesetof 7 ∗ ≥ 1 allwordsoflengthnoverΣisdenotedbyΣn. : v A finitewordz is saidtobeafactor ofagiven finiteword w ifthereexistwordsu, v suchthat w = uzv. i X If u = ε, then z is said to be a prefix of w, and if v = ε, then z is said to be a suffix of w. If both u and v r arenon-empty,wesaythatzisaproper factorofw.Aprefix(respectively,suffix)ofwthatisnotequalto a w itselfissaidtobeaproper prefix (respectively,proper suffix)ofw.Afactorofan infinitewordisafinite wordthatoccurswithinit. A non-empty word x that is both a proper prefix and a proper suffix of a finite word w is said to be a border of w. We say that a word which has only an empty border is borderless. If, for some word x, w = xx x (k times for some integer k 1), we write w = xk, and w is called the k-th power of x. A ··· ≥ non-empty finite word is said to be primitive if it is not a power of a shorter word. Two finite words u, v are said to be conjugate if there exist words x,y such that u = xy and v = yx. Accordingly, conjugate words are cyclic shifts of one another, and thus conjugacy is an equivalence relation. A primitive word of length n has exactly n distinct conjugates.For example, theprimitive word abacaba of length7 has 7 distinctconjugates;namely,itselfandthesixwordsbacabaa, acabaab, cabaaba, abaabac, baabaca, aabacab. Thesetofallconjugatesofafinitewordwiscalledtheconjugacyclassofw. Date:January4,2017. 2000MathematicsSubjectClassification. 68R15. Keywordsandphrases. Lyndonword;Sturmianword;Fibonacciword;Christoffelword. 1 2 AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH InthispaperweconsideronlywordsonanorderedalphabetΣ = a ,a ,...,a wherea < a < < 1 2 σ 1 2 { } ··· a .ThistotalorderonΣnaturallyinducesalexicographical order(i.e.,analphabeticalorder)onthesetof σ allfinitewordsoverΣ.ALyndonwordoverΣisanon-emptyprimitivewordthatisthelexicographically least word in its conjugacy class, i.e., w Σ or w < vu for all non-empty words u,v such that w = uv (e.g.,see[10]). Equivalently, anon-empty∈finiteword w over Σ is Lyndonifand onlyif w Σ or w < v ∈ forallpropersuffixesvofw[7].Note,inparticular,thatthereisauniqueLyndonwordintheconjugacy classofanygivenprimitiveword.Forexample,aabacab istheuniqueLyndonconjugateoftheprimitive wordabacaba. LyndonwordsarenamedafterR.C.Lyndon[12],whointroducedthemin1954underthe nameof“standardlexicographicsequences”.Suchwordsarewellknowntobeborderless[7]. WebegininSection2bycomputingD(σ,n),themaximumnumberofdistinctLyndonfactorsinaword of length n on an alphabet Σ of size σ. In Section 3 we compute ET(σ,n), the expected total number of Lyndon factors (that is, counted according to their multiplicity) in a word of length n over Σ, while Section4computesED(σ,n),theexpectednumberofdistinctLyndonfactorsinwordoflengthnoverΣ. Section 5 considersdistinctLyndonfactors in a Lyndonword oflength n; in particular, we generalisea resultofSaari[13]onthenumberofdistinctLyndonfactorsofaFibonacciLyndonwordbydeterminingthe numberofdistinctLyndonfactorsofagivenChristoffelword(i.e.,aLyndonfactorofaninfiniteSturmian word—tobedefinedlater).Lastly,inSection6,westatesomeopenproblems. 2. THE MAXIMUM NUMBER OF DISTINCT LYNDON FACTORS IN A WORD Let D(σ,n) be the maximum number of distinct Lyndon factors in a word of length n on the alphabet Σ = a ,a ,...,a . We want to find a word that achieves D(σ,n), given σ and n. It is clear that a 1 2 σ { } necessaryconditionforattainingthemaximumisthatwtakestheformak1ak2 ...akσ.Thiswordcontains 1 2 σ (n+1) factors of lengths 1,2,...,n, of which each is a Lyndon word except those of the form ak, k > 1. 2 i Thenumberofpowersofeach a is(ki+1),including a itself.ThetotalnumberofLyndonfactorsin w is i 2 i therefore n+1 σ k +1 (1) ∑ i +σ, 2 − 2 (cid:18) (cid:19) i=1(cid:18) (cid:19) where the final σ counts the single letters a . We claim that the summation is minimised when the k i i differbyatmostone.Supposetothecontrarythatk = k +sforsomei, jands 2.Itiseasilychecked j i ≥ that k +1+s k +1 k +s k +2 i + i > i + i 2 2 2 2 (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) (cid:19) for s 2. Thus the summation term will be minimised when each k equals either n/σ or n/σ . If i n = m≥σ+ p, where0 < p < σ, then n/σ = m and n/σ = m+1.If p = 0thenea⌊ch k ⌋equa⌈ls m.⌉We i ⌊ ⌋ ⌈ ⌉ thereforehavethefollowingresult. Theorem1. Ifn = mσ+p,where0 p < σ,then ≤ n+1 m+1 m+2 (2) D(σ,n) = (σ p) p +σ 2 − − 2 − 2 (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) (cid:19) andthemaximumisattainedusing w = am...am am+1 ...am+1. 1 n p n p+1 n − − Corollary2. Ifn = mσthen σ D(σ,n) = m2+σ. 2 (cid:18) (cid:19) COUNTINGLYNDONFACTORS 3 Proof. Ifn = mσ,Theorem1gives n+1 m+1 D(σ,n) = σ +σ 2 − 2 (cid:18) (cid:19) (cid:18) (cid:19) σ = (m(mσ+1) (m+1)m)+σ 2 − σ = m2+σ asrequired. 2 (cid:18) (cid:19) (cid:3) Thefollowingtableshowsvaluesof D(σ,n)forlowvaluesofn. n D(2,n) D(5,n) D(10,n) 1 2 5 10 2 3 6 11 3 4 8 13 4 6 11 16 5 8 15 20 6 11 19 25 7 14 24 31 8 18 30 38 9 22 37 46 10 27 45 55 15 58 95 110 20 102 165 190 25 158 255 290 30 227 365 415 TABLE 1. ThemaximumnumberofdistinctLyndonfactorsthatcanappearinwordsoflengthn. 3. THE EXPECTED TOTAL NUMBER OF LYNDON FACTORS IN A WORD We now wish to calculate the total number M(σ,n) of Lyndon factors (that is, counted according to multiplicity) appearing in all words in Σn. Consider a Lyndon word L of length m n and a position ≤ i, 1 i n m+1, in wordsof length n. Wordscontaining L starting at position i have the form xLy ≤ ≤ − where xy is any word on Σ with length n m. Thustherewill be σn m wordsin Σn which contain L in − − thisposition.Thiswillbethesameforanyofthen m+1possiblevaluesofisointheσn wordsin Σn − therewillbe(n m+1)σn m appearancesofL.ThisisthesameforallLyndonwordsofthislength.The − − numberofsuchLyndonwordsis1/mofthenumberofprimitivewordsofthislength,sinceexactlyone conjugateofeach primitive wordisLyndon.Thenumberofprimitive wordsoflengthn ([10],equation (1.3.7))is m ∑µ σd d d|m (cid:16) (cid:17) whereµistheMöbiusfunction.TogetthetotalnumberofLyndonfactorsappearinginΣn,wesumover possiblevaluesofm: n n m+1 m (3) M(σ,n) = ∑ − σn m ∑µ σd. − m d m=1 d|m (cid:16) (cid:17) Dividing by σn givestheexpectedtotalnumber ET(σ,n) := M(σ,n)/σn ofLyndonfactorsin awordof lengthnonthealphabetΣ.Table2belowshowsvaluesforσ = 2,5andlowvaluesofn. 4 AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH n M(2,n) ET(2,n) M(5,n) ET(5,n) 1 2 1.00 5 1.00 2 9 2.25 60 2.40 3 30 3.75 515 4.12 4 87 5.43 3800 6.08 5 234 7.31 25749 8.24 6 597 9.32 165070 10.56 7 1470 11.48 1018135 13.03 8 3522 13.76 6103350 15.62 9 8264 16.14 35797125 18.33 10 19067 18.62 206363748 21.13 TABLE 2. Valuesofthetotalnumber M(σ,n)ofLyndonfactorsappearinginallwordsoflength nandtheexpectedtotalnumberET(σ,n)ofLyndonfactorsinawordoflengthnonanalphabet ofsizeσforσ =2,5andn =1,2,...,10. 4. THE EXPECTED NUMBER OF DISTINCT LYNDON FACTORS IN A WORD Weusethenotationfromabove,with[n]beingtheset 1,2,...,n .Mostofthefollowinganalysiscounts { } thenumberofwordsinΣn thatcontainatleastonefactorequaltoaspecificLyndonword L.Attheend we sum over all possible L. Let S be a non-empty set of positions in a word w and let P(L,S,w) = 1 if w contains factors equal to L at each positionin w beginningat apositionin theset S,and 0 otherwise. Notethatwmaycontainotherfactorsequalto L.Weclaimthat n 1ifwcontainsatleastonefactorequalto L, (4) ∑( 1)s+1 ∑ P(L,S,w) = s=1 − S [n],S=s (0otherwise. ⊆ | | If w contains no factor equal to L then P(L,S,w) equals 0 for all S so the “otherwise” part of the claim holds. Suppose w contains copies of L beginning at positions in T = i ,i ,...,i and nowhere else. 1 2 t { } Then P(L,S,w) equals 1 if and only if S is any non-empty subset of T, so the left hand side of (4) be- comes t ∑( 1)s+1 S T : S = s − |{ ⊆ | | }| s=1 t t = ∑( 1)s+1 − s s=1 (cid:18) (cid:19) t t = ∑( 1)s+1 +1. − s s=0 (cid:18) (cid:19) Thisequals1sincethefinalsumisthebinomialexpansionof(1 1)t.ThenumberofwordsinΣn which − containatleastonefactorequalto Listherefore n ∑ ∑( 1)s+1 ∑ P(L,S,w) − w Σns=1 S [n],S=s ∈ ⊆ | | n (5) = ∑( 1)s+1 ∑ ∑ P(L,S,w). − s=1 S [n],S=sw An ⊆ | | ∈ We now evaluate ∑ P(L,S,w).This is counting the words in Σn which have factors L beginning at w Σn positions i S. It cle∈arly equals 0 if s L > n since then there is no room in w for s factors L (recalling ∈ | | COUNTINGLYNDONFACTORS 5 that L is Lyndon,thereforeborderless,and therefore cannot intersecta copy of itself). We also needthe membersofStobeseparatedbyatleast L .ThenumberofsuchsetsSis | | n s L +s − | | . s (cid:18) (cid:19) OnceSischosenthereareσn s L waysofchoosingthelettersinwwhicharenotinthespecifiedfactors − | | L.Thus n s L +s ∑ P(L,S,w) = − | | σn s L . − | | s w Σn (cid:18) (cid:19) ∈ Substituting in (5) we see that the number of words in Σn which contain at least one occurrence of L is ⌊n∑/|L|⌋( 1)s+1 n−s|L|+s σn−s|L|. − s s=1 (cid:18) (cid:19) To get the expected number ED(σ,n) of distinct Lyndon factors in a word of length n, we sum this over all L withlengthat most n,usingthesame techniqueas in theprevioussection,and divideby σn. Replacing L withmwegetthefollowing: | | (6) ED(σ,n) = ∑n 1 ∑µ m σd⌊n∑/m⌋( 1)s+1 n−sm+s σ sm. − m d − s m=1 d|m (cid:16) (cid:17) s=1 (cid:18) (cid:19) ThefollowingtableshowsvaluesofED(σ,n)forlowvaluesofnandseveralvaluesofσ. n σ =2 5 10 20 1 1.00 1.00 1.00 1.00 2 1.75 2.20 2.35 2.42 3 2.50 3.56 3.94 4.14 4 3.25 5.02 5.69 6.05 5 4.06 6.55 7.57 8.12 6 4.91 8.16 9.54 10.31 7 5.81 9.82 11.59 12.61 8 6.77 11.54 13.70 14.99 9 7.77 13.31 15.88 17.45 10 8.83 15.13 18.11 19.97 15 14.77 24.93 29.90 33.36 20 21.67 35.76 42.58 47.70 25 29.35 47.43 56.02 62.73 30 37.70 59.82 70.11 78.33 TABLE 3. The expected number ED(σ,n) of distinct Lyndon factors in a word of length n for alphabetsofsizeσ =2,5,10,20. 5. DISTINCT LYNDON FACTORS IN A LYNDON WORD Minimising thenumber ofLyndonfactors overwordsof length n is notvery interesting:theminimum number of distinct Lyndon factors is 1 and the minimum total number is n. Both bounds are achieved by xn wherexisaletter.AmoreinterestingquestionhasbeenstudiedbySaari[13]:whatistheminimum numberofdistinctLyndonfactorsinaLyndonwordoflength n?Heprovedthatanoptimallowerboundfor thenumberofdistinctLyndonfactorsinaLyndonwordoflengthnis log (n)+1 ⌈ φ ⌉ 6 AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH where φ denotes the golden ratio (1+√5)/2. Moreover, this lower bound is attained by the so-called finiteFibonacciLyndonwords, whicharepreciselytheLyndonfactorsofthewell-knowninfiniteFibonacci word f —aspecialexampleofacharacteristic Sturmianword. Following the notation and terminology in [11, Ch. 2], an infinite word s over a,b is Sturmian if and { } only if there exists an irrational α (0,1), and a real number ρ, such that s is one of the following two ∈ infinitewords: s , s : N a,b α,ρ ′α,ρ −→ { } definedby aif (n+1)α+ρ nα+ρ = 0, s [n] = ⌊ ⌋−⌊ ⌋ α,ρ (botherwise; (n 0) ≥ aif (n+1)α+ρ nα+ρ = 0, s [n] = ⌈ ⌉−⌈ ⌉ ′α,ρ (botherwise. Theirrationalαiscalledtheslopeofsandρistheintercept.Ifρ = 0,wehave s = ac and s = bc α,0 α ′α,0 α where c is called the characteristic Sturmian word of slope α. Sturmian words of the same slope have α the same set of factors [11, Prop. 2.1.18], so when studying the factors of Sturmian words, it suffices to consideronlythecharacteristicones. The infiniteFibonacci word f is the characteristic Sturmian word ofslope α = (3 √5)/2. Itcan be con- − structedasthelimitofaninfinitesequenceofso-calledfiniteFibonacciwords f ,definedby: n n 1 { } ≥ f = b, f = a, f = f f for n 1. 1 0 n n 1 n 2 − − − ≥ That is, f = ab, f = aba, f = abaab, f = abaababa, f = abaababaabaab, etc. (where f is a prefix of 1 2 3 4 5 n f foreachn 1),andwehave n+1 ≥ f = lim f = abaababaabaab n n ∞ ··· → Note. The length of the n-th finite Fibonacci word f is the n-th Fibonacci number F , defined by: F = n n 1 1,F = 1,F = F +F forn 1. − 0 n n 1 n 2 − − ≥ Moregenerally,anycharacteristicSturmianwordcanbeconstructedasthelimitofaninfinitesequence of finite words. To this end, we recall that every irrational α (0,1) has a unique simple continued ∈ fractionexpansion: 1 α = [0;a ,a ,a ,...] = 1 2 3 1 a + 1 1 a + 2 a + 3 ··· whereeach a isapositiveinteger.Then-thconvergentofαisdefinedby i p n = [0;a ,a ,...,a ] foralln 1, 1 2 n q ≥ n wherethesequences p and q aregivenby n n 0 n n 0 { } ≥ { } ≥ p = 0, p = 1, p = a p +p , n 2 0 1 n n n 1 n 2 q = 1, q = a , q = a q − +q − , n ≥ 2 0 1 1 n n n 1 n 2 − − ≥ Supposeα = [0;1+d ,d ,d ,...],withd 0andallotherd > 0.Tothedirectivesequence(d ,d ,d ,...), 1 2 3 1 n 1 2 3 ≥ weassociateasequence s ofwordsdefinedby n n 1 { } ≥− s = b, s = a, s = sdn s for n 1. −1 0 n n−1 n−2 ≥ COUNTINGLYNDONFACTORS 7 Suchasequenceofwordsiscalledastandardsequence,andwehave s = q foralln 0. n n | | ≥ Notethatabisasuffixofs andbaisasuffixofs foralln 1. 2n 1 2n − ≥ Standardsequencesare relatedtocharacteristicSturmianwordsinthefollowingway.Observethat,for any n 0, sn is a prefix of sn+1, which gives obvious meaning to limn ∞sn as an infinite word.In fact, ≥ → onecanprove[8,3]thateachs isaprefixofc ,andwehave n α c = lims . α n n ∞ → The following lemma collects togethersomepropertiesofthestandard words s .Notethatfrom nowon n < whenreferringtoLyndonwordsoverthealphabet a,b weassumethenaturalordera b. { } Lemma3. Letc bethecharacteristic Sturmianwordofslopeα = [0;1+d ,d ,d ,...]withc = lim s where α 1 2 3 α n n ∞ thewordss aredefinedasabove. → n ◮ Foralln 1,s isaprimitive word[6]. n ≥ ◮ Foralln 1,thereexistuniquelydeterminedpalindromes u ,v , p suchthat n n n ≥ p ab ifnisodd, n s = u v = n n n (pnba ifniseven, where u = q 2and v = q q +2.[6] n n 1 n n n 1 | | − − | | − − ◮ Forall n 1, the reversal of s is the (q 2)-ndconjugate of s , andhence the conjugacy class of s is n n n n ≥ − closedunderreversal. [9,Prop.2.9(4)] ◮ TheLyndonfactorsofc oflengthatleast2arepreciselytheLyndonconjugatesofthe(primitive)standard α wordss foralln 1.[2,5] n ≥ Thefollowinglemmaisageneralisationof[13,Lemma8]. Lemma 4. Let cα be the characteristic Sturmian word of slope α = [0;1+d1,d2,d3,...] with cα = limn ∞sn → wheres = p xywithxy ab,ba . TheLyndonconjugateofs isthewordap bforalln 1.Moreover,every n n n n ∈ { } ≥ Lyndonfactorofc thatisshorterthan ap biseitheraprefixorasuffixofap b. α n n Proof. First we show that, for all n 1, the Lyndon conjugate of s is the word ap b. If s = p ba, n n n n ≥ then ap b is clearly a conjugate of s and it is Lyndon[2, 5]. On the other hand, if s = p ab, then bp a n n n n n is a clearly a conjugate of s , and since the conjugacy class of s is closed under reversal and p is a n n n palindrome(byLemma3),itfollowsthatap bisaconjugateofs anditisLyndon[2,5]. n n < Toprovethesecondclaim, itsufficestoshowthatifk n,thentheLyndonconjugateofs isaprefixor k suffix of ap b (since, by Lemma 3, theLyndonfactors of c of lengthat least 2 are precisely theLyndon n α conjugatesofthe(primitive)standardwordsinc ).Theclaimistruefork = 1andk = 0sinces = b α 1 and s0 = a. It is also true for k = 1 because s1 = ad1b is the Lyndon conjug−ate of itself, and is a−prefix of ap b if d 1 and a suffix of ap b if d = 0. Now suppose that k 2. Then k < n implies that s n 1 n 1 k ≥ ≥ is a prefix of p . Furthermore, since p is a palindrome, the reversal of s is a suffix of p . Therefore if n n k n s = p ba, then its Lyndon conjugate ap b is a prefix of ap b; otherwise, if s = p ab, then its Lyndon k k k n k k (cid:3) conjugateap bisasuffixofap b. k n TheLyndonfactorsof(characteristic)Sturmianwordsoflengthatleast2(i.e.,theLyndonconjugatesof standard words)over a,b are precisely the so-called Christoffel words beginning with the letter a (see, { } e.g., the nice survey [1]). Christoffel words take the form aPal(v)b and bPal(v)a where v a,b and ∗ ∈ { } Pal isiteratedpalindromic closure, definedby: Pal(ε) = ε and Pal(wx) = (Pal(w)x)+ foranyfinitewordwandletterx, 8 AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH where u+ denotestheshortestpalindromebeginningwith u (called thepalindromic closure ofu). Forex- ample, Pal(aba) = abaabawheretheunderlinedlettersindicatethepointsatwhichpalindromicclosure isapplied. < < Let p,qbeco-primeintegerswith0 p q.Therational p/qhastwodistinctsimplecontinuedfraction expansions: p/q = [0;1+d ,d ,...,d ,1] = [0;1+d ,d ,...,d +1] 1 2 n 1 2 n whered 0andallotherd 1.Theso-calledChristoffelwordofslope p/qbeginningwiththeletterais 1 i ≥ ≥ theuniqueSturmianLyndonwordover a,b oflengthqcontaining p occurrencesoftheletterb,given { } by: a ifnisodd, aPal(v)b with v = ad1bd2ad3 xdn where x = ··· (b ifniseven. For example, the Christoffel word of slope 2/5 = [0;2,1,1] = [0;2,2] beginning with the letter a is aPal(ab)b = aabab. Remark 5. Note,inparticular, thattheChristoffel wordofslope p/q = [0;1+d ,d ,...,d ,1] beginningwith 1 2 n the letter a is precisely the Lyndon conjugate of the standard word s = s s where s = b, s = a, and n+1 n n 1 1 0 s = sdi s for1 i n(see,e.g.,[1]). − − i i−1 i−2 ≤ ≤ Example 6. For all n 1, the Fibonacci Lyndon word of length F (i.e., the Lyndon conjugate of the finite n ≥ Fibonacciword f )istheChristoffelwordofslope F /F = [0;2,1,1,...,1]beginningwith a. n n 2 n − (n 1)1s − | {z } Saari [13, Thm. 1] proved that if w is a Lyndonword with w F for some n 1, then w contains at n | | ≥ ≥ leastn+2distinctLyndonfactors,withequalityifandonlyifwistheFibonacciLyndonwordoflength F . For example, aPal(ab)a = aabab is the Fibonacci Lyndon word of length F = 5 and contains the n 3 minimumnumber(3+2 = 5)ofdistinctLyndonfactorsoverallLyndonwordsofthesamelength.Saari alsomadethefollowingconjecture. Conjecture 7. [13] If w is a Lyndon word of length n, n = 6, containing the least number of distinct Lyndon 6 factorsoverallLyndonwordsofthesamelength,thenwisaChristoffel word. The number 6 is excluded because the following words all contain 7 distinct Lyndon factors, which is theminimumforlength6words,andonlythefirstandlastareChristoffel: aaaaab, aaabab, aabbab, ababbb, ababac, abacac, acbacc, abbbbb. However the conjecture is not true. The following Lyndon word has length 28 and contains 10 distinct Lyndon factors — the minimum number of distinct Lyndon factors in a Lyndonword of this length — butitisnotChristoffel(comparetheprefixoflength5tothesuffixoflength5): aabaababaabaabababaabaababab Theminimum of10 distinctLyndonfactorsfor aLyndonwordoflength28is also attainedby theLyn- don word aabaababaababaababaababaabab which is the Christoffel word aPal(abab4)b of slope 11/28 = [0;2,1,1,4,1]. We now generalise Saari’s result on the number ofdistinct Lyndonfactors in a Fibonacci Lyndonword by determiningthenumberofdistinctLyndonfactorsin agiven Christoffelword.Let (w) denotethe L numberofdistinctLyndonfactorsinawordw. Theorem 8. If w is a Sturmian Lyndon word on a,b with a < b, i.e., a Christoffel word (beginning with the letter a) of slope p/q = [0;1+d ,d ,...,d ,{1] fo}r some co-prime integers p, q with 0 < p < q, then 1 2 n (w) = d +d + +d +3. 1 2 n L ··· COUNTINGLYNDONFACTORS 9 Proof. The word w is the Lyndon conjugate of the standard word s = s s with s = b, s = a, n+1 n n 1 1 0 and s = sdi s for 1 i n (see Remark 5). By Lemma4, the Lyndonconju−gatesof s− for 1 i n areeitiherpi−re1fii−xe2sorsu≤ffixe≤sofw.Moreover,foreachi with1 i n,thestandardwordi s con≤tain≤s d i i ≤ ≤ distinct Lyndonfactors oflengths sm s for m = 1,2,...,d .ByLemma3, thesearetheonlyLyndon factors oftheLyndonword w besid| ei−s1itis−e2lf|and thetwolettersi a and b. Hence (w) = (d +d + + 1 2 d )+3. L ···(cid:3) n The above result is a generalisation of [13, Lemma 9], which reworded (with the indexing of Fibonacci words and numbers shifted back by 2) states that if w is the Fibonacci Lyndon word of length F for n 2 − somen 3,i.e.,theChristoffelwordofslope ≥ F /F = [0;2, 1,1,...,1] n 4 n 2 − − (n 3)1s − beginningwiththelettera,then (w) = (n 3)+3 = n|. {z } L − Examples: ◮ The Christoffel word of slope 2/5 = [0;2,1,1] beginning with the letter a, namely aPal(ab)b = aabab,istheFibonacciLyndonwordoflengthF = 5thatcontainstheminimumnumber(5 = 2+3) 3 ofdistinctLyndonfactorsforitslength. ◮ The Christoffel word of slope 1/6 = [0;5,1] beginning with the letter a, namely aPal(a4)b = aaaaab, is a Sturmian Lyndon word of length 6 containing the minimum number (7 = 4+3) of distinctLyndonfactorsforitslength. ◮ TheChristoffelwordofslope3/8 = [0;2,1,1,1]beginningwiththelettera,namelyaPal(aba)b = aabaabab, is theFibonacci Lyndonwordof length F = 8 containing the minimum number (6 = 4 3+3)ofdistinctLyndonfactorsforitslength. 6. OPEN PROBLEMS OpenProblem1: OnemightsuspectthatanygivenChristoffelwordcontainstheminimumnum- berofdistinctLyndonfactorsoverallLyndonwordsofthesamelength.However,thisisnottrue. For instance, the Christoffel word of slope 5/11 = [0;2,4,1] beginning with the letter a, namely aPal(ab4)b = aababababab,contains8(= 5+3)distinctLyndonfactors,buttheminimumnumber ofdistinctLyndonfactorsofaLyndonwordoflength11isactually7.Thisminimumisattained bytheChristoffelwordaPal(a2ba)b = aaabaaabaab ofslope3/11 = [0;3,1,1,1]. IsittruethattheminimumnumberofdistinctLyndonfactorsoverallLyndonwordsofthesame lengthisattainedbyatleastoneChristoffelwordofthatlength? OpenProblem2: Tables 2 and 3, showing values for ET(σ,n) and ED(σ,n), raise the question of whethertheremay exist asymptoticformulas for thesequantities,simpler than the exact values displayedinequations(2)and(6),respectively. REFERENCES [1] J.Berstel,Sturmianandepisturmianwords(asurveyofsomerecentresults),CAI2007,LectureNotesinComputerScience, vol.4728,2007,pp.23–47. [2] J.Berstel,A.deLuca,Sturmianwords,Lyndonwordsandtrees,Theoret.Comput.Sci.178(1997)171–203. [3] T.C.Brown,Descriptionsofthecharacteristicsequenceofanirrational,Canad.Math.Bull.36(1993)15–21. [4] K.T.Chen,R.H.Fox,R.C.Lyndon,Freedifferentialcalculus,IV.Thequotientgroupsofthelowercentralseries,Ann.Math. 68(1958)81–95. [5] A.deLuca,Sturmianwords:structure,combinatoricsandtheirarithmetics,Theoret.Comput.Sci.183(1997)45–82. [6] A.deLuca,F.Mignosi,SomecombinatorialpropertiesofSturmianwords,Theoret.Comput.Sci.136(1994)361–385. [7] J.P.Duval,Factorizingwordsoveranorderedalphabet,J.Algs.4(1983)363–381. 10 AMYGLEN,JAMIESIMPSON,ANDW.F.SMYTH [8] A.S. Fraenkel, M. Mushkin, U. Tassa, Determination of [nθ] by its sequence of differences,Canad. Math. Bull.21 (1978) 441–446. [9] A.Glen,OnSturmianandepisturmianwords,andrelatedtopics,PhDthesis,TheUniversityofAdelaide,Australia,April 2006,http://hdl.handle.net/2440/37765. [10] M.Lothaire,CombinatoricsonWords,EncyclopediaofMathematicsanditsApplications,vol.17,CambridgeUniversityPress,UK, 1997. [11] M.Lothaire,AlgebraicCombinatoricsonWords,EncyclopediaofMathematicsanditsApplications,vol.90,CambridgeUniversity Press,UK,2002. [12] R.C.Lyndon,OnBurnside’sproblem, Trans.Amer.Math.Soc.77(1954)202–215. [13] K.Saari,LyndonwordsandFibonaccinumbers,J.Combin.TheorySer.A121(2014)34–44. AMYGLEN SCHOOLOFENGINEERING&INFORMATIONTECHNOLOGY MURDOCHUNIVERSITY 90SOUTHSTREET MURDOCH,WA6150AUSTRALIA E-mailaddress:[email protected] JAMIESIMPSON DEPARTMENTOFMATHEMATICSANDSTATISTICS CURTINUNIVERSITY BENTLEY,WA6102AUSTRALIA E-mailaddress:[email protected] W.F.SMYTH DEPARTMENTOFCOMPUTINGANDSOFTWARE MCMASTERUNIVERSITY HAMILTON,ONTARIOL8S4K1CANADA E-mailaddress:[email protected]

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.