Mathematical Modeling of Evolution of Horizontally Transferred Genes

Artem S. Novozhilov, Georgy P. Karev, and Eugene V. Koonin

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health

We describe a stochastic birth-and-death model of evolution of horizontally transferred genes in microbial populations. The model is a generalization of the stochastic model described by Berg and Kurland and includes five parameters: the rate of mutational inactivation, selection coefficient, invasion rate (i.e., rate of arrival of a novel sequence from outside of the recipient population), within-population horizontal transmission ("infection") rate, and population size. The model of Berg and Kurland included four parameters, namely, mutational inactivation, selection coefficient, population size, and "infection." However, the effect of "infection" was disregarded in the interpretation of the results, and the overall conclusion was that horizontally acquired sequences can be fixed in a population only when they confer a substantial selective advantage onto the recipient and therefore are subject to strong positive selection. Analysis of the present model in different domains of parameter values shows that, as long as the rate of within-population horizontal transmission is comparable to the mutational inactivation rate and there is even a low rate of invasion, horizontally acquired sequences can be fixed in the population or at least persist for a long time in a substantial fraction of individuals in the population even when they are neutral or slightly deleterious. The available biological data strongly suggest that intense within-population and even between-populations gene flows are realistic for at least some prokaryotic species and environments. Therefore, our modeling results are compatible with the notion of a pivotal role of horizontal gene transfer in the evolution of prokaryotes.

Key words: horizontal gene transfer, mathematical modeling, evolution of prokaryotes.

E-mail: [email protected].
Mol. Biol. Evol. 22(8):1721–1732. 2005
doi:10.1093/molbev/msi167
Advance Access publication May 18, 2005
Published by Oxford University Press 2005.

Introduction

Sequencing of multiple, complete genomes of diverse life forms and the ensuing advent of comparative genomics have dramatically changed the prevailing picture of evolution, at least for the prokaryotic world. It became clear that the evolutionary process is much more flexible and dynamic than previously imagined. In addition to the vertical inheritance of genes along a tree-like evolutionary trajectory, lineage-specific gene loss and horizontal (lateral) gene transfer (HGT) have emerged as major evolutionary forces, leading to the ideas of "uprooting the tree of life" and the concept of "horizontal genomics" (Pennisi 1998; Doolittle 1999a, 1999b, 2000; Pennisi 1999; Koonin, Aravind, and Kondrashov 2000; Koonin, Makarova, and Aravind 2001). Under a somewhat extreme view of the prevalence of HGT in the evolution of prokaryotes, even coherent tree topologies observed in multigene analyses might be due to gradients of HGT propensity permeating the prokaryotic world: stable clusters in such trees are thought to comprise groups of microbes which exchange genes frequently (Gogarten, Doolittle, and Lawrence 2002). However, the extent of HGT in prokaryotic evolution remains a matter of contention (Kurland 2000; Kurland, Canback, and Berg 2003). Like other events that happened in the evolutionary past, each individual case of HGT is hard to prove beyond reasonable doubt. Relatively recent cases of probable HGT are usually demonstrated through anomalous nucleotide composition or codon usage of the genes thought to have been transferred (Tsirigos and Rigoutsos 2005). These methods are not applicable to putative ancient transfers which are detected on the basis of unexpected phyletic patterns of genes (i.e., patterns of presence-absence in genomes from different taxa) and/or discrepancies in the topologies of phylogenetic trees (Ochman, Lawrence, and Groisman 2000; Koonin, Makarova, and Aravind 2001; Ragan 2001; Lawrence and Hendrickson 2003). In many studies, explicit phylogenetic analysis is complemented or replaced by simpler analyses of sequence similarity rankings, usually, based on Blast search results (Koonin et al. 1997; Aravind et al. 1998). None of these approaches is free of substantial caveats (Kurland, Canback, and Berg 2003). Any phyletic pattern, however unusual, in principle, can be explained solely through multiple, lineage-specific gene losses (Koonin 2003). Similarly, phylogenetic tree topologies and sequence similarity rankings are strongly affected both by gene loss and by unequal rates of evolution in different lineages (Koski and Golding 2001). As a result, it has been posited that the notion that HGT dominates prokaryotic evolution is a misconception stemming from noncritical data analysis which suggests vastly exaggerated rates of HGT (Kurland 2000; Kurland, Canback, and Berg 2003).

As noticed in a recent review by Lawrence and Hendrickson, the study of HGT remains a research field in its adolescence (Lawrence and Hendrickson 2003). This does not seem surprising given that the appreciation of the potential major significance of HGT as an evolutionary factor dates only from the last 3–4 years of the 20th century when systematic comparison of multiple sequenced genomes became possible (Koonin et al. 1997; Doolittle 1999b; Koonin, Aravind, and Kondrashov 2000). The lack of certainty regarding the true extent of HGT is one of the crucial aspects of this immaturity of "horizontal genomics." The other distinct but not unrelated aspect is the need to integrate HGT into the framework of the existing evolutionary theory which, in its current form, is based solely on the notion of vertical inheritance of genetic characters (Kimura 1983). Specifically, it is necessary to identify the neutral and/or selective evolutionary factors that affect the fate of a horizontally transferred gene leading to its fixation in or elimination from a recipient population. It seems likely that a robust evolutionary theory of HGT will provide feedback for assessing evidence of individual HGT cases and ultimately for more reliable estimates of the role of this phenomenon in evolution. The first evolutionary-theoretical analysis of HGT in microbes has been reported by Berg and Kurland (2002). They concluded that, at least in populations of large effective size that are typical of most prokaryotes, horizontally acquired genes can persist and get fixed only when they provide a strong selective advantage to the recipient. It seems that cases when horizontally acquired genes are strongly beneficial to the recipient should be exceedingly rare. Indeed, when HGT occurs, a gene moves from one cellular background to another, especially when the donor and acceptor are taxonomically and biologically distant. Because cellular components are fine-tuned by natural selection for coordinate functioning, chances that an alien protein confers a selective advantage onto the recipient will be low. This would apply to different functional systems of the cell to a different degree, with those that are based on large, multiprotein complexes being affected most strongly as captured in the complexity hypothesis of Lake and coworkers (Jain, Rivera, and Lake 1999). Nevertheless, the biologically reasonable expectation seems to be that cases of unequivocally beneficial HGT should be (extremely) rare. Exceptions are known, e.g., situations when a horizontally acquired gene makes the recipient resistant to an antibiotic(s), allows it to occupy a new nutritional niche or makes it a pathogen (Brown, Zhang, and Hodgson 1998; Mazel and Davies 1999; Rowe-Magnus, Davies, and Mazel 2002). Nevertheless, combination of this biological reasoning and the theoretical conclusions of Berg and Kurland fuels the contention that the extent of HGT in prokaryotes might have been seriously overestimated by the proponents of "horizontal genomics" and that HGT is, perhaps, an important but not a decisive factor of prokaryotic evolution (Kurland, Canback, and Berg 2003).

Here we develop more general theoretical models of HGT between microbial populations and identify the conditions under which fixation of neutral or even slightly deleterious horizontally transferred genes is possible.

Results and Discussion

Deterministic Model

We consider a haploid asexual population with overlapping generations (continuous time). In this section, we assume that the effect of stochastic evolutionary factors, such as genetic drift, is negligible. It is assumed that the population under consideration consists of two types of individuals: those that carry a particular novel sequence and those that do not. This means that the state of a population with regard to the novel sequence in question at a given moment $t$ is completely determined by two numbers $n_1(t)$ and $n_2(t)$ (the numbers, respectively, of the first- and second-type individuals at the moment $t$) or just one number $p(t) = n_1(t)/(n_1(t) + n_2(t))$, where $p(t)$ is the fraction of the first-type individuals in the population.

The assumptions of the simplest model are as follows. A novel sequence can be transmitted vertically through cell division or acquired via HGT. The novel sequence is also subject to mutational inactivation (point mutations, insertions, and deletions) with a constant rate $u$. We assume that Malthusian parameters (intrinsic growth rates) are $m_1$ and $m_2$ (Nagylaki 1992). In addition, we assume that there is invasion of the first-type individuals with the rate $\gamma N$ where $N$ is the total size of the population, $N = n_1 + n_2$. Specifically, with regard to HGT, invasion means that acquisition of a particular alien gene is not a unique event, with a negligibly small probability of repeated occurrence, but rather that there is continuous influx of the alien gene, even if occurring at a low rate.

Taking into account only processes of mutation, selection, and invasion, the textbook system (Nagylaki 1992) is obtained for the number of individuals of different types,

$$\dot{n}_1 = m_1 n_1 - u n_1 + \gamma N,$$
$$\dot{n}_2 = m_2 n_2 + u n_1,$$

or for frequency $p$,

$$\dot{p} = (m_1 - \bar{m})p - up + \gamma q,$$

where $q = 1 - p$ and $\bar{m}$ is the mean continuous fitness, $\bar{m} = m_1 p + m_2 q$. This is the simplest mathematical model describing changes in the genetic content of a population. Note that, in this model, invasion is a special case of recurrent mutation with the rate $\gamma$.

In terms of frequencies of different types of individuals in the population, the assumption that invasion occurs with the rate $\gamma N$ is equivalent to replacing the second-type individuals with first-type individuals (those that carry the novel sequence) with the rate $\gamma$. To be more precise, if we consider the following system of differential equations

$$\dot{n}_1 = m_1 n_1 - u n_1 + \gamma n_2,$$
$$\dot{n}_2 = m_2 n_2 + u n_1 - \gamma n_2,$$

the equation for the frequency $p$ of first-type individuals remains unchanged.

There are three principal mechanisms for transfer of genes between microbes: (1) transduction, bacteriophage-mediated gene transfer, (2) transformation, transfer of DNA (e.g., released from dead microbes) from the environment to a recipient cell, and (3) conjugation, direct transfer of DNA from one cell to another mediated by a plasmid (Bushman 2001). Each of these processes can mediate HGT both within a microbial population (we will call this "infection" for short) and between populations (invasion). Generally, it is expected that infection rates are substantially higher than invasion rates.

We can define the infection rate as the probability per unit of time for an individual that does not have the novel sequence to acquire it (to become "infected"). As an analog of the law of mass action in chemical kinetics, we assume that the infection rate is proportional to $n_1$, with the proportionality constant $\theta$. The intensity of contact between infected and uninfected organisms is proportional to $n_1 n_2/N = n_1 n_2/(n_1 + n_2)$. This directly applies to conjugation, which involves physical contact between two cells, but generally holds true for transduction and transformation as well inasmuch as the release of the novel sequence within a transducing agent or transforming DNA is proportional to the number of cells which harbor that sequence.
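The equivalence stated above between invasion at the rate $\gamma N$ and replacement of second-type individuals at the rate $\gamma$ can be checked numerically. The following is only a minimal sketch: the function name `frequency_trajectory`, the forward-Euler scheme, and all parameter values are assumptions made for illustration and are not taken from the model analysis itself.

```python
def frequency_trajectory(m1, m2, u, gamma, n1, n2, replacement,
                         dt=0.01, steps=40000):
    """Forward-Euler integration of the two-type system (illustrative only).

    replacement=False: invasion adds gamma * N first-type individuals.
    replacement=True:  second-type individuals are converted at rate gamma.
    Returns the list of frequencies p = n1 / (n1 + n2) after each step.
    """
    freqs = []
    for _ in range(steps):
        total = n1 + n2
        if replacement:
            d1 = m1 * n1 - u * n1 + gamma * n2
            d2 = m2 * n2 + u * n1 - gamma * n2
        else:
            d1 = m1 * n1 - u * n1 + gamma * total
            d2 = m2 * n2 + u * n1
        n1 += dt * d1
        n2 += dt * d2
        freqs.append(n1 / (n1 + n2))
    return freqs


# Hypothetical parameter values chosen only to make the check fast.
params = dict(m1=0.05, m2=0.0, u=0.1, gamma=0.02, n1=1.0, n2=999.0)
p_invasion = frequency_trajectory(replacement=False, **params)
p_replacement = frequency_trajectory(replacement=True, **params)

# Largest difference between the two frequency trajectories.
max_gap = max(abs(a - b) for a, b in zip(p_invasion, p_replacement))
print(max_gap, p_invasion[-1], p_replacement[-1])
```

The population sizes in the two systems grow at different rates, so only the frequencies are compared. The small residual gap between the trajectories comes from the time discretization, not from the model.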
Combining all the above assumptions, we obtain the following deterministic HGT-selection-mutation-invasion model

$$\dot{n}_1 = m_1 n_1 - u n_1 + \gamma N + \theta\,\frac{n_1 n_2}{n_1 + n_2},$$
$$\dot{n}_2 = m_2 n_2 + u n_1 - \theta\,\frac{n_1 n_2}{n_1 + n_2}, \tag{1}$$

or

$$\dot{p} = (m_1 - \bar{m})p - up + \gamma q + \theta pq \tag{2}$$

for the frequency of the infected individuals.

Assuming $m_1 - m_2 = s$, where $s$ is the selection coefficient of the infected individuals, we obtain a logistic-type equation with immigration

$$\dot{p} = (s + \theta)p(1 - p) - up + \gamma(1 - p) \tag{3}$$

that can be easily analyzed (e.g., Matis and Kiffe 1999). Note that the rate $\gamma$ is the rate at which uninfected individuals become infected which, as outlined above, may involve different biological mechanisms. If all the parameters have nonzero values and $0 \le p \le 1$, equation (3) has only one equilibrium solution satisfying the condition $0 < p^* < 1$ and this equilibrium is asymptotically stable. Thus, under this model, a gene acquired via HGT will persist in the population as long as there is continuous (even if low-rate) invasion, i.e., influx of the novel sequence. However, even if $\gamma = 0$, i.e., there is no invasion and $s < 0$, i.e., the novel sequence is deleterious, but $\theta > -s$, there is a stable interior equilibrium that corresponds to the HGT-selection-mutation balance. This describes another scenario (distinct from the scenario with a selectively advantageous gene acquired) for the persistence of an acquired gene even without continuous influx: the novel gene will persist if there is a sufficiently effective within-population mode of transmission (infection).

Equations (1–3) are classical models of population genetics of asexual organisms. The inclusion of the processes of infection and invasion and application of the law of mass action allows us to incorporate HGT within the bounds of known mathematical models.

Stochastic Model

The simple, deterministic model described in the previous section includes only systematic factors of evolution whose rates are assumed to be constant (Wright 1949). To model genetic drift in the population, we formulate a stochastic counterpart of the model (1–3). To make the model mathematically manageable, we assume that the total size of the population is constant and equals $N$. We use a Markov birth and death process $\{X(t), t \ge 0\}$ with the finite state space $\{0, 1, \ldots, N\}$. The number, $n$, of individuals that carry a particular novel sequence (a gene acquired via HGT) determines the state of the population. Transitions in the process are only allowed to neighbor states. The rate of transition from state $n$ to state $n + 1$ (the "birth rate" of infected individuals) is denoted $\lambda_n$, and the rate of transition from the state $n$ to the state $n - 1$ is denoted by $\mu_n$ (the "death rate" of infected individuals, i.e., the rate at which the novel sequence is lost, which includes actual cell death among other processes).

The Kolmogorov forward equations for the state probabilities $p_n(t) = \Pr\{X(t) = n\}$ can be written as

$$\dot{p}_n = \mu_{n+1} p_{n+1} - (\lambda_n + \mu_n) p_n + \lambda_{n-1} p_{n-1}, \qquad n = 0, 1, \ldots, N.$$

In order to make sense of this equation, we put formally $\lambda_{-1} = \mu_{N+1} = p_{-1}(t) = p_{N+1}(t) = 0$. The state probabilities depend on the initial distribution $p_n(0)$.

This scheme corresponds to a Moran model (Moran 1958) which, in the case of a haploid population with overlapping generations, is more realistic for microbial populations than the more widely used Wright-Fisher model; in particular, a Moran model was adopted in the work of Berg and Kurland (2002).

Statement of the Model

In order to analyze the model, we need to identify all the rates of the birth and death process. Let the individuals without the novel sequence divide with rate $m$; we will define the time unit in the model such that $m = 1$. Then, the individuals with the novel sequence divide with the rate $(1 + s)$, where $s$ is the selection coefficient. When one of the individuals divides, we remove a random individual from the population to keep its size constant. As in the deterministic model (1–3), we take into account processes of inactivating mutation with the constant rate $u$, invasion with the constant rate $\gamma N$, and within-population transmission of the novel sequence (infection) with the rate $\theta$.

The system can change its state from $n$ to $n + 1$ if an infected cell divides and there is no inactivating mutation or if a new infected cell immigrates. In this case, the total number of cells is $N + 1$, and we need to remove one of the $N - n$ cells (the cells that do not carry the novel sequence); thus, the probability of this event is $(N - n)/(N + 1)$. In addition, any cell that does not carry the novel sequence can become infected with the rate $\theta$.

The jump from state $n$ to state $n - 1$ occurs when a cell without the novel sequence divides or a cell with the novel sequence divides and an inactivating mutation afflicts one of the daughter cells. In this case, we need to remove an infected cell from the population, an event with the probability $n/(N + 1)$. Accordingly, for the birth and death rates of infected cells, we obtain

$$\lambda_n = \left[(1 + s)(1 - u)n + \gamma N\right]\frac{N - n}{N + 1} + \theta\,\frac{n(N - n)}{N},$$
$$\mu_n = \left[N - n + u(1 + s)n\right]\frac{n}{N + 1}. \tag{4}$$

According to the rate equations (4), we deal with a birth and death process with a finite state space and reflecting boundaries (i.e., $\lambda_0 \ne 0$, $\mu_N \ne 0$).
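The rates (4) translate directly into code, which makes the boundary behavior easy to inspect. This is a minimal sketch: the function names `birth_rate` and `death_rate` and the parameter values in the usage lines are hypothetical, introduced here only for illustration.

```python
def birth_rate(n, N, s, u, gamma, theta):
    """Rate of the transition n -> n + 1, first line of equations (4)."""
    division_and_invasion = ((1 + s) * (1 - u) * n + gamma * N) * (N - n) / (N + 1)
    infection = theta * n * (N - n) / N
    return division_and_invasion + infection


def death_rate(n, N, s, u):
    """Rate of the transition n -> n - 1, second line of equations (4)."""
    return (N - n + u * (1 + s) * n) * n / (N + 1)


# Hypothetical parameter values, chosen only to show the boundary behavior.
N, s, u, gamma, theta = 100, 0.01, 1e-3, 1e-3, 0.02
print(birth_rate(0, N, s, u, gamma, theta))  # positive: state 0 is reflecting
print(death_rate(N, N, s, u))                # positive: state N is reflecting
print(birth_rate(0, N, s, u, 0.0, theta))    # zero: without invasion, state 0 absorbs
```

Setting `gamma` to zero makes the birth rate at state 0 vanish, so the state in which the novel sequence is absent can no longer be left.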
As in the deterministic model (1–3), we can consider the rate $\gamma$ as the rate with which uninfected individuals become infected because we can rewrite the birth rate in the following form:

$$\lambda_n = \left[(1 + s)(1 - u)n + \gamma N\right]\frac{N - n}{N + 1} + \theta\,\frac{n(N - n)}{N} \approx \left[(1 + s)(1 - u)n\right]\frac{N - n}{N + 1} + \theta\,\frac{n(N - n)}{N} + \gamma(N - n).$$

Berg and Kurland (2002) considered a Markov process with rates (4) and $\gamma = 0$ in which there is no invasion and, accordingly, the state 0 is absorbing. Under this model, regardless of the initial distribution of the novel sequence, it will be ultimately lost in the population. In practice, the decision on whether or not to include invasion in the model rests on the comparison between the mean time to extinction from a particular initial distribution and the mean time before the appearance of a new individual carrying the novel sequence. As discussed in the last section, under conditions favoring HGT, e.g., when the donor and recipient organisms coexist in close proximity, the rate of invasion is likely to be nonnegligible compared to the rate of extinction of a novel sequence. Therefore, models with invasion ($\gamma \ne 0$) are, generally, more realistic than those without it. From a formal viewpoint, inclusion of the possibility of invasion changes the qualitative behavior of the stochastic model in a way that simplifies the analysis because the stochastic model with nonzero invasion has a stable stationary distribution (a direct analog of the stable equilibrium in the deterministic model).

Stationary Distribution

The birth and death process with a finite state space and reflecting boundaries has the unique, stable stationary distribution $p^*$ that can be easily obtained noting that the following relation must be satisfied at equilibrium:

$$\mu_n p^*_n = \lambda_{n-1} p^*_{n-1}.$$

Rearranging and iterating gives

$$p^*_n = p^*_k \prod_{j=k+1}^{n} \frac{\lambda_{j-1}}{\mu_j} = p^*_k\,\lambda_k \prod_{j=k+1}^{n-1} \frac{\lambda_j}{\mu_j}\,\frac{1}{\mu_n}. \tag{5}$$

The formula (5) is easy to evaluate numerically. If we assume that there are limits $(s + \theta)N \to \vartheta$, $\gamma N \to \sigma$, and $uN \to \rho$ when $N \to \infty$, then the following theorem holds.

Theorem 1. Suppose the parameters of the birth and death process with rates (4) are such that the following limits exist: $(s + \theta)N \to \vartheta$, $\gamma N \to \sigma$, $uN \to \rho$ when $N \to \infty$. Then, if $N \to \infty$ and $n/N \to x$, the stationary distribution (5) asymptotically tends to the distribution with the density

$$f(x) = C \exp(\vartheta x)\, x^{\sigma - 1} (1 - x)^{\rho - 1}, \tag{6}$$

where $x \in [0, 1]$ and $C$ is a constant chosen such that $\int_0^1 f(x)\,dx = 1$.

The proof of this theorem is given in the Appendix.

The stationary distribution (6) is a complete analog of the stationary distribution of the diffusion approximation of the Wright-Fisher selection-mutation model (e.g., Goel and Richter-Dyn 1974). The only difference is the presence of one additional parameter, the infection rate $\theta$, which, as in the deterministic case, is added to the selection coefficient $s$.

All the qualitatively different forms of the stationary distribution can be classified using the approximation (6). This classification only depends on the products $uN$, $\gamma N$, and $(s + \theta)N$.

If $\sigma, \rho < 1$ (i.e., $\gamma N, uN < 1$), which corresponds to a situation with low rates of inactivating mutations and invasion, the most probable states are near the boundaries. If $\vartheta > 0$ (i.e., $s + \theta > 0$), more probability is concentrated near $x = 1$ (fig. 1a), otherwise (if $\vartheta < 0$), more probability is concentrated near $x = 0$. The distribution has a single minimum. In substantive terms, if the combined values of the selective advantage and infection rate favor the survival of the novel sequence, it tends to sweep the population; otherwise, it is likely to go extinct.

If $\sigma < 1$, $\rho > 1$ (low invasion rate, high rate of inactivating mutations), then the most probable state is near $x = 0$, i.e., the novel sequence tends to perish (fig. 1b). However, for a particular range of values $s + \theta > 0$, the distribution can have a local maximum (fig. 1c).

In the case $\sigma > 1$, $\rho < 1$ (high invasion rate, low rate of inactivating mutations), the reverse is observed, with the most probable state being fixation of the novel sequence; however, the distribution can have a local maximum if $s + \theta < 0$ (fig. 1d and e).

Finally, if $\sigma > 1$, $\rho > 1$ (high rates of invasion and inactivating mutations), the distribution is unimodal and this is the only case when the most probable state is close to the deterministic steady-state value for equation (3) (fig. 1f).

To summarize, the stationary distribution under the stochastic model can have at most two extremes. The most probable states are determined by the joint effect of all five parameters of the model. The approximation (6) closely mimics exact distributions produced by numerical simulation, at least for the considered ranges of parameter values (fig. 1).

Approximation (6) can help decide which parameters make substantial contributions to the evolution of the population in each particular situation. Let us consider the quantity $E[\tilde{X}] = g(\vartheta, \sigma, \rho)$, the average population penetration of the novel sequence (i.e., the average fraction of type 1 individuals). This is, simply, the expected value of a random variable with the density function (6). In figure 2a, the level lines of the function $E[\tilde{X}]$ are shown for a fixed $\rho = uN$. In figure 2b, the implicit function $E[\tilde{X}] = 0.5$ is plotted for different values of $\rho$. These graphs show that if the rate of invasion $\gamma N$ is substantially lower than the rate of inactivating mutation $uN$, significant penetration (on average) can be reached only with high positive values of $(s + \theta)N$.
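The statement that formula (5) is easy to evaluate numerically can be illustrated with a short sketch. The function names, the choice to accumulate the product in logarithms (to avoid overflow), and the parameter values are assumptions made here for illustration; the values are chosen so that $(s + \theta)N = 4$ and $\gamma N = uN = 0.5$, i.e., the first regime of the classification above.

```python
import math


def rates(n, N, s, u, gamma, theta):
    """Birth and death rates (4) in state n."""
    birth = (((1 + s) * (1 - u) * n + gamma * N) * (N - n) / (N + 1)
             + theta * n * (N - n) / N)
    death = (N - n + u * (1 + s) * n) * n / (N + 1)
    return birth, death


def stationary_distribution(N, s, u, gamma, theta):
    """Stationary distribution from the balance relation behind formula (5).

    Requires gamma > 0 and u > 0 so that both boundaries are reflecting.
    """
    log_w = [0.0]  # log of the unnormalized weight of state 0
    for n in range(1, N + 1):
        birth_prev, _ = rates(n - 1, N, s, u, gamma, theta)
        _, death_n = rates(n, N, s, u, gamma, theta)
        log_w.append(log_w[-1] + math.log(birth_prev) - math.log(death_n))
    shift = max(log_w)  # subtract the maximum before exponentiating
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: (s + theta) N = 4, gamma N = 0.5, u N = 0.5.
N = 200
p_star = stationary_distribution(N, s=0.01, u=0.0025, gamma=0.0025, theta=0.01)
print(p_star[0], p_star[N])  # probability mass at the two boundaries
print(sum(p_star[N // 2 + 1:]), sum(p_star[:N // 2]))  # upper vs lower half
```

With these values, more probability sits near $n = N$ than near $n = 0$, in line with the description of the case in which both products are below 1 and $s + \theta > 0$.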
FIG. 1. Possible qualitatively different stationary distributions of the HGT-mutation-selection-invasion model with rates (4). In each panel, the approximation (6) and numerically evaluated exact stationary distribution are depicted. The stationary distribution is normalized such that the areas under the curves equal one. The parameters are (a) $\vartheta = 0.06$, $\sigma = 0.98$, $\rho = 0.97$; (b) $\vartheta = 0.6$, $\sigma = 0.8$, $\rho = 4$; (c) $\vartheta = 6$, $\sigma = 0.8$, $\rho = 4$; (d) $\vartheta = 0.06$, $\sigma = 2$, $\rho = 0.8$; (e) $\vartheta = -3.8$, $\sigma = 2$, $\rho = 0.8$; (f) $\vartheta = 0.06$, $\sigma = 6$, $\rho = 1.8$.

Probability of Fixation

We can formally let $\lambda_0 = 0$, $\mu_N = 0$ to make states 0 and $N$ absorbing. In this situation, the fate of a unique individual carrying the novel sequence can be examined. Through genetic drift, this novel sequence can be lost (the state 0 is reached) or can penetrate all the population (the state $N$ is reached). In the latter case, we speak of fixation of the novel sequence in the population although it has to be kept in mind that, in the case of reflecting boundaries, the system does not stay in state $N$. Letting $\lambda_0 = 0$, $\mu_N = 0$ is somewhat artificial but the reasoning behind examining this situation is as follows. Firstly, we can assume that, once the novel sequence is acquired by all the individuals (hence $\mu_N = 0$), it becomes essential and cannot be lost in a population. When the rate of invasion is small, the typical fate of a novel sequence appearing in the population is extinction. Under typical conditions (low invasion and inactivation mutation rates), most of the time, the population waits for a "lucky" sequence to be fixed; accordingly, letting $\lambda_0 = 0$ is not unrealistic. The time of fixation conditioned such that the fixation does occur is usually much lower than the mean time to reach state $N$ with a reflecting boundary at $n = 0$.

The probability that the system ends up in the state $N$ (the probability of fixation) if initially there is only one individual carrying this novel sequence is (e.g., Goel and Richter-Dyn 1974)

$$P_{fix} = \frac{1}{1 + \sum_{i=1}^{N-1} \prod_{n=1}^{i} \dfrac{\mu_n}{\lambda_n}}.$$

Using approximations given in the proof of Theorem 1 (see Appendix), it can be shown that the following integral approximation can be used to evaluate $P_{fix}$:

$$P_{fix} \approx \frac{1}{1 + N^{1 - \gamma N} \displaystyle\int_{1/N}^{1 - 1/N} \frac{\exp(-(s + \theta)Nx)}{x^{\gamma N}(1 - x)^{uN}}\,dx}. \tag{7}$$

First, let $\gamma = 0$. This case was examined by Berg and Kurland (2002) who showed that, if $s + \theta < u$, then the probability of fixation is virtually zero for large $N$ (more precisely, this, of course, implies the effective population size $N_e$; hereinafter, we use $N$ for simplicity). If $u < s + \theta < u(1 - \ln u)$, there is a plateau where the probability of fixation does not depend on $N$, and for large $N$, there is a sharp drop in this probability. Finally, if $s + \theta > u(1 - \ln u)$, the probability of fixation has a limit that does not depend on $N$. These three distinct behaviors are illustrated in figure 3a.

All the curves in figure 3a were obtained using the integral approximation (7). This integral approximation holds well in most ranges of parameters except in the region $uN \gg 1$; all analyses described in this work were well within the applicability range of the approximation. If $\gamma \ne 0$, qualitative changes in the behavior of $P_{fix}$ are observed. Obviously, $P_{fix}$ is expected to increase. However, if $\gamma < u$, i.e., the invasion rate is lower than the inactivation rate, and $s + \theta < u(1 - \ln u)$, the limiting behavior of $P_{fix}$ is the same as in the case without invasion, namely, $P_{fix} \to 0$ when $N \to \infty$ (fig. 3b and c). There is, however, a range of $N$ values in which $P_{fix}$ is greater than the same probability with $\gamma = 0$. In contrast, in the case of $s + \theta > u(1 - \ln u)$, inclusion of even low-rate invasion results in a notable change of the limiting behavior of $P_{fix}$: the plot of $P_{fix}$ versus the population size $N$ has a minimum at the point $N = N_{min}$, and for $N > N_{min}$, $P_{fix}$ increases with the increase of $N$ (fig. 3d).
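The exact expression for $P_{fix}$ given above can be evaluated by direct summation when $N$ is moderate. The sketch below is an illustration under stated assumptions: the function name, the log-space accumulation, the cutoff used to avoid floating-point overflow, and the parameter values are all introduced here and are not part of the analysis, which relies on the integral approximation (7).

```python
import math


def fixation_probability(N, s, u, gamma, theta):
    """P_fix for one initial carrier, with states 0 and N made absorbing.

    Uses the rates (4) for n = 1, ..., N - 1 and accumulates the products
    of mu_n / lambda_n in logarithms.
    """
    total = 0.0
    log_prod = 0.0
    for n in range(1, N):
        birth = (((1 + s) * (1 - u) * n + gamma * N) * (N - n) / (N + 1)
                 + theta * n * (N - n) / N)
        death = (N - n + u * (1 + s) * n) * n / (N + 1)
        log_prod += math.log(death) - math.log(birth)
        if log_prod > 700.0:
            # The sum would overflow a float; P_fix is numerically zero.
            return 0.0
        total += math.exp(log_prod)
    return 1.0 / (1.0 + total)


# Hypothetical values for illustration.
print(fixation_probability(100, 0.0, 0.0, 0.0, 0.0))    # pure drift
print(fixation_probability(100, 0.0, 1e-3, 0.0, 0.0))   # neutral, with inactivation
print(fixation_probability(100, 0.0, 1e-3, 0.0, 0.05))  # same, plus infection
```

In the absence of selection, mutation, invasion, and infection, all ratios $\mu_n/\lambda_n$ equal 1 and the sum reduces to the drift-only value $1/N$. Adding infection raises every birth rate and therefore the fixation probability.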
FIG. 2. (a) Contour plot of the average population penetration $E[\tilde{X}] = \int_0^1 x f(x)\,dx$ with the fixed value $uN = 0.5$. (b) Level lines for 50% population penetration of the novel sequence for different values of $uN$.

The probability of fixation in the neutral case ($s = 0$) without immigration is primarily determined by the ratio of the rate of inactivating mutations, which oppose fixation, and the rate of horizontal transmission of the novel sequence within the recipient population. If the transmission (infection) rate is high enough, neutral fixation of the novel sequence is an event with a nonzero probability. The inequality $\theta > u(1 - \ln u)$ is not very restrictive because it demands that the transmission rate is approximately 10–20 times greater than the inactivation rate. Moreover, even if the novel sequence is slightly deleterious, it can be fixed not only due to random drift in a finite population but also due to the possibility of infection (fig. 4). The fixation of slightly deleterious alleles in a finite population leading to a decline in the mean fitness of the population is known as Muller's ratchet (Muller 1964; Felsenstein 1974). The relatively high rate of the within-population transmission (infection) offers an alternative scenario for reducing the mean population fitness during evolution.

FIG. 3. Probability of fixation of the novel sequence as a function of the population size $N$. (a) The case of no invasion, $\gamma = 0$. The dashed line is the probability of fixation of a neutral sequence in the presence of no evolutionary forces except for genetic drift, $P_{fix} = 1/N$. The inactivating mutation rate $u = 10^{-7}$. The values of $s + \theta$ for the solid curves from bottom to top are $0$, $10^{-8}$, $5 \times 10^{-7}$, $8 \times 10^{-7}$, $5 \times 10^{-6}$, $10^{-5}$. (b–d) The case of invasion with different parameters. (b) Parameters are $u = 10^{-7}$, $s + \theta = 10^{-8}$. The values of $\gamma$ for the curves from bottom to top are $0$, $10^{-8}$, $2 \times 10^{-8}$, $5 \times 10^{-8}$. (c) Parameters are $u = 10^{-7}$, $s + \theta = 8 \times 10^{-7}$. The values of $\gamma$ for the curves from bottom to top are $0$, $10^{-9}$, $10^{-8}$, $2 \times 10^{-8}$. (d) Parameters are $u = 10^{-7}$, $s + \theta = 5 \times 10^{-6}$. The values of $\gamma$ for the curves from bottom to top are $0$, $10^{-10}$, $10^{-9}$, $5 \times 10^{-9}$.

FIG. 4. Probability of fixation of the novel sequence as a function of the selection coefficient $s$. Parameters are $N = 10^{8}$, $u = 5 \times 10^{-9}$, $\theta = 5 \times 10^{-7}$ (the upper curve), $\theta = 5 \times 10^{-8}$ (the lower curve). The dashed horizontal line is the probability of fixation of a neutral sequence $P_{fix} = 1/N$ when there are no evolutionary forces except for the random genetic drift.

Quasistationary Distributions

If there is no invasion in the model, the novel sequence is doomed to extinction. More precisely, let $\gamma = 0$. It is readily shown that the process $\{X(t)\}$ has a degenerate stationary distribution $p^* = (1, 0, \ldots, 0)$. The distribution of $X(t)$ approaches the stationary distribution as time $t$ approaches infinity. Thus, ultimate absorption (extinction of the novel sequence) is certain. To evaluate the mean time to extinction if initially only one individual carries the novel sequence, we can use the following formula (e.g., Goel and Richter-Dyn 1974):

$$E\{T_{lost}\} = \sum_{n=1}^{N} \frac{1}{\mu_n} \prod_{j=1}^{n-1} \frac{\lambda_j}{\mu_j}.$$

Using approximations given in the proof of Theorem 1 (see Appendix), an integral approximation for this quantity can be obtained (see also Berg and Kurland 2002):

$$E\{T_{lost}\} \approx \exp(-(s + \theta)) \int_{1/N}^{1} \frac{\exp(x(s + \theta)N)}{x(1 - x)^{1 - uN}}\,dx.$$

If $s + \theta < u$, then $E\{T_{lost}\}$ is virtually the same as for the neutral expectation (when $s + \theta = 0$). In contrast, if $s + \theta > u$, i.e., selection combined with horizontal gene transmission dominates over inactivation, then the time to absorption (extinction) goes through a minimum and increases sharply with increasing $N$ (fig. 5; this figure mimics fig. 2B of Berg and Kurland [2002]). It is therefore of interest to analyze the distribution of $X(t)$ prior to absorption. This is done using the concept of quasistationarity.

FIG. 5. Mean time to extinction as a function of the population size $N$. The inactivation rate $u = 10^{-7}$. The values of $s + \theta$ for the curves from bottom to top are $-10^{-8}$, $10^{-8}$, $10^{-7}$, $5 \times 10^{-7}$, $10^{-6}$.

There are many biological and ecological systems that eventually go extinct yet appear to be stable over any reasonable time scale. The notion of quasistationary distribution has proved to be a powerful tool for modeling the behavior of such systems (Pollet 1996). In particular, it allows one to predict the possible distribution of $X(t)$ on its way to extinction.

The state space of the birth and death process under consideration can be partitioned into two subsets, one containing the absorbing state 0 and the other comprised of the transient states $\{1, 2, \ldots, N\}$. Before absorption, the process assumes values in the set of transient states. If the process is conditioned on the event that absorption has not taken place at time $t$, then the conditional state probabilities $q_n(t)$ can be determined from the state probabilities $p_n(t)$ through the following relation:

$$q_n(t) = \Pr\{X(t) = n \mid X(t) > 0\} = \frac{p_n(t)}{1 - p_0(t)}.$$

By differentiating this relation and using the Kolmogorov forward equations for $p_n(t)$, we can obtain a system of differential equations for $q_n(t)$. The quasistationary distribution $q^*$ is the stationary solution of this system of equations. Because $\dot{p}_0 = \mu_1 p_1$ when $\gamma = 0$, the probabilities can be shown to satisfy the following system of difference equations:

$$\mu_{n+1} q^*_{n+1} - (\lambda_n + \mu_n) q^*_n + \lambda_{n-1} q^*_{n-1} = -\mu_1 q^*_1 q^*_n.$$

In the general case, these equations cannot be solved explicitly. They can, however, be used to derive the relations to which iteration methods for determining the quasistationary distribution can be applied (Nåsell 2001) (the algorithm for evaluating the quasistationary distribution is presented in the Appendix). The simplest way to obtain an approximation of the quasistationary distribution is to restrict consideration to transient states (the state space is made strictly positive). By excluding zero from the state space, one can establish a related process without an absorbing state. This method has been applied in several mathematical models (Kendall 1949; Pielou 1969) and is valid when the time to extinction is reasonably large (Nåsell 2001).

We consider an auxiliary process $\{X_0(t)\}$ which can be described as the original process with the origin removed. Formally, we put $\mu_1 = 0$, while all other transition rates are equal to the corresponding rates for the original process. The stationary distribution for the process $\{X_0(t)\}$ is easy to determine. A good approximation for this stationary distribution is given by (6) with $\sigma = 0$ and the normalization constant determined by $\int_{1/N}^{1} f(x)\,dx = 1$. This means that we can use equation (6) as an approximation for the quasistationary distribution. The main question is how close this approximation is to the actual quasistationary distribution.
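For a small population, the auxiliary process and the quasistationary distribution can be compared directly. The sketch below is not the algorithm of the Appendix: as a stand-in, it finds the quasistationary distribution by power iteration on the uniformized transition matrix restricted to the transient states, which is a standard substitute technique. The function names, the iteration count, and the parameter values are assumptions made for illustration.

```python
def rates(n, N, s, u, theta):
    """Birth and death rates (4) without invasion (gamma = 0)."""
    birth = (1 + s) * (1 - u) * n * (N - n) / (N + 1) + theta * n * (N - n) / N
    death = (N - n + u * (1 + s) * n) * n / (N + 1)
    return birth, death


def auxiliary_stationary(N, s, u, theta):
    """Stationary distribution on {1, ..., N} of the process with mu_1 = 0."""
    weights = [1.0]
    for n in range(2, N + 1):
        birth_prev, _ = rates(n - 1, N, s, u, theta)
        _, death_n = rates(n, N, s, u, theta)
        weights.append(weights[-1] * birth_prev / death_n)
    total = sum(weights)
    return [w / total for w in weights]


def quasistationary(N, s, u, theta, iterations=40000):
    """Quasistationary distribution by power iteration (substitute method)."""
    lam = [rates(n, N, s, u, theta)[0] for n in range(1, N + 1)]
    mu = [rates(n, N, s, u, theta)[1] for n in range(1, N + 1)]
    scale = 2.0 * max(l + m for l, m in zip(lam, mu))  # uniformization constant
    q = [1.0 / N] * N
    for _ in range(iterations):
        new = [0.0] * N
        for i in range(N):  # index i corresponds to state n = i + 1
            new[i] += q[i] * (1.0 - (lam[i] + mu[i]) / scale)
            if i + 1 < N:
                new[i + 1] += q[i] * lam[i] / scale
            if i > 0:
                new[i - 1] += q[i] * mu[i] / scale
            # For i == 0 the mass moving to the absorbing state 0 is dropped.
        total = sum(new)
        q = [v / total for v in new]  # condition on non-absorption
    return q, lam, mu


# Hypothetical values for illustration.
N, s, u, theta = 30, 0.0, 0.05, 0.5
q_star, lam, mu = quasistationary(N, s, u, theta)
aux = auxiliary_stationary(N, s, u, theta)
tv_distance = 0.5 * sum(abs(a - b) for a, b in zip(q_star, aux))
print(tv_distance)
```

The printed total variation distance gives one rough measure of how close the auxiliary-process distribution is to the quasistationary distribution for the chosen parameters. The converged vector can also be substituted into the difference equations for $q^*$ to check them.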
Figure 6 shows that the simple approximation (6) with $c = 0$ is good (except for the area near 0) if $s + h \gg u$, i.e., this approximation holds when the mean time to extinction is sufficiently long (compare to fig. 5). However, even with lower values of $s + h$, i.e., $s + h \sim u(1 - \ln u)$, the approximation (6) is close to the observed quasistationary distribution almost in the whole range of $x$ except for the area near 0 (data not shown).

Thus, even if invasion is not included in the model, the most probable states of the process can be near 1 depending on the values of the other parameters, i.e., for a certain part of the parameter space, the novel sequence, on its way to extinction, penetrates almost the entire population with a high probability.

FIG. 6. Quasistationary distributions of the HGT-selection-mutation model. The thin solid and dotted curves are, respectively, the exact and the approximated distributions of the auxiliary process with $\mu_1 = 0$, and the thick solid curves are quasistationary distributions obtained with iteration methods. The parameters are $(s + h)N = 8$, $uN = 9.5$ (a), $uN = 1.17$ (b), and $uN = 0.37$ (c).

General Discussion and Conclusions

The present model is a generalization of the stochastic model described by Berg and Kurland (2002). The model includes five parameters: inactivating mutation rate, selection coefficient, invasion rate, within-population horizontal transmission (infection) rate, and population size. Berg and Kurland come to the conclusion that horizontally acquired sequences can be fixed in a population only when these sequences confer a substantial selective advantage onto the recipient and therefore are subject to strong positive selection. However, although they formally consider the process of infection when formulating the model, its effect is excluded from their interpretation. Therefore, their conclusions are based on the model with only two processes: genetic drift and mutational inactivation. In this setting, it becomes self-evident that, in typical, large microbial populations, horizontally transferred sequences can survive for any appreciable duration of time only when they are strongly beneficial. Because such situations are reasonably expected to be rare (the exceptions, such as acquisition of antibiotic resistance or ability to utilize new nutrients, notwithstanding), the results of Berg and Kurland's modeling imply that HGT played less of a role in the evolution of prokaryotes than is posited by "horizontal genomics" concepts (Kurland, Canback, and Berg 2003).

In the present model, two additional processes are included and, in a certain domain of parameter values, significantly contribute to the outcome. Quasiformally, the logic can be as follows.

If the rate of within-population horizontal transmission of the novel sequence (infection) is comparable to the inactivating mutation rate, then the mean time to extinction of this novel sequence is quite long and, importantly, increases with the increase of the population size $N$ (fig. 5). In this case, the approximation (6) for quasistationary distributions is applicable. The analysis under this approximation shows that the novel sequence can penetrate a significant part of the recipient population (the most probable states are near 1 in fig. 6).

If the mean time to extinction is long, then the appearance of this particular novel sequence in the population may not be a unique event. Accordingly, invasion has to be taken into account, and the results obtained for the true stationary distribution are valid.

If invasion is included in the model, then, within a reasonable time span, the novel sequence can penetrate a significant part of the population (fig. 2) and, eventually, the horizontally transferred gene may be fixed. It is interesting to note that the mean time required for significant penetration is dramatically less than the mean time of fixation (fig. 7), which could have substantial consequences for the fate of the population.

FIG. 7. The mean times required to reach different levels of penetration of a novel sequence in a population. The $y$ axis shows the ratio of the mean time to the given level of penetration $E\{T_k\}$ to the mean time to fixation $E\{T_N\}$. Parameter values: $s + h = 0.02$, $u = 0.0015$, $c = 0.001$, $N = 3500$.

When a sequence persists in a population for a long time and, especially, when it gets fixed, there is a chance that the acquired gene becomes beneficial or even indispensable (essential).

The present analysis shows that taking into account the processes of within-population transmission (infection) and invasion leads to conclusions that are dramatically different from those of Berg and Kurland: if the rates of these processes are nonnegligible, horizontally transferred sequences do get fixed or at least persist in a significant part of the recipient population for a long time, even if they are neutral or slightly deleterious.

The principal question, then, is: just how likely is it that these additional processes occur at significant rates? Unfortunately, quantitative estimates are lacking, which precludes us from supplementing the mathematical analysis of the model with empirical estimates as Berg and Kurland have done with regard to the inactivation rate (Berg and Kurland 2002). Qualitatively, however, biological data suggest that both processes can be mediated by several mechanisms such that their rates vary within extremely broad ranges and the gene flow could be intense under favorable conditions. Bacteria are known to exchange genes via conjugative plasmids and integrative and conjugative elements (Osborn and Boltner 2002; Grohmann, Muth, and Espinosa 2003; Bennett 2004; Burrus and Waldor 2004). Furthermore, it has been extensively documented that many bacteria and archaea possess the molecular machinery for nonspecific DNA uptake and are highly competent for transformation which is, indeed, considered to be a major mechanism of gene transfer between microbes (Lorenz and Wackernagel 1994; Dubnau 1999; Redfield 2001; Claverys and Martin 2003; Chen and Dubnau 2004). All these processes are most intense within a microbial population, contributing to a potentially high rate of "infection" in our model. However, they are by no means limited to the same prokaryotic species and have been shown to occur even between phylogenetically distant prokaryotes (Davison 1999; Paul 1999). Such interspecies gene transfer which, again, may be intense under conditions of physical proximity between diverse prokaryotes, e.g., in microbial mats, can be a major contribution to the process of invasion in our model. That the amount of apparent HGT critically depends on ecological and physical closeness of the purported donors and recipients has been borne out by comparative genomics. In particular, hyperthermophilic bacteria carry a disproportionate number of genes thought to have been horizontally acquired from archaea (Aravind et al. 1998; Nelson et al. 1999; Worning et al. 2000; Nesbo et al. 2001). Conversely, mesophilic archaea, such as Halobacterium and, especially, Methanosarcina, contain numerous genes of apparent bacterial origin, many more than hyperthermophilic archaeal species (Koonin, Makarova, and Aravind 2001; Pennisi 2001; Deppenmeier et al. 2002; Koonin et al. 2002; Koonin 2003). Numerous studies in microbial ecology reveal a remarkable diversity of microbial communities (Kassen and Rainey 2004). Microbial communities show both dynamic behavior, which is linked to niche specialization (Kerr et al. 2002), and considerable temporal stability (Fernandez et al. 1999, 2000; Hashsham et al. 2000). Generally, these communities provide fertile ground for HGT, making the models with nonzero invasion relevant.

The modeling results presented here strongly suggest that the main precept of "horizontal genomics," the crucial role of HGT in prokaryotic evolution, does not depend on the unrealistic assumption that all horizontally transferred genes that are fixed in microbial populations confer a strong selective advantage onto the recipient. We believe that this theoretical support for a major evolutionary impact of HGT is particularly important given how hard it is to obtain a rigorous proof for most HGT events. To strengthen the argument even further, it will be necessary to develop quantitative estimates for gene fluxes within and between prokaryotic populations, which figure as "infection" and "invasion" in our model.

Acknowledgments

We thank Yuri Wolf for numerous helpful discussions and useful suggestions and Alex Kondrashov for critical reading of the manuscript.

Appendix

Proof of Theorem 1

First note that, if $n/N \to x$ when $N \to \infty$, then

$$\mu_n \to N x (1 - x). \qquad \text{(a)}$$
To handle the product in (5), we note that

$$\frac{\lambda_j}{\mu_j} = \frac{(N-j)\left[(1+s)(1-u)j + cN + hj\,\frac{N+1}{N}\right]}{j\,[N - j + u(1+s)j]} \approx \frac{(N-j)\,((1+s-u+h)j + cN)}{j\,(N - j + uj)} = \frac{1 + s - u + h + \frac{cN}{j}}{1 - u + \frac{uN}{N-j}} \approx 1 + s + h + \frac{cN}{j} - \frac{uN}{N-j},$$

where we used the fact that $(1+x)/(1-y) \approx 1 + x + y$ for small $x$ and $y$. Taking the logarithm of (5), we obtain

$$\ln\frac{p^*_n \mu_n}{p^*_k \lambda_k} \approx \sum_{j=k+1}^{n-1} \ln\left(1 + s + h + \frac{cN}{j} - \frac{uN}{N-j}\right) \approx \sum_{j=k+1}^{n-1} \left(s + h + \frac{cN}{j} - \frac{uN}{N-j}\right)$$

$$= (s+h)(n-k-1) + \sum_{j=k+1}^{n-1} \left(\frac{r}{j/N} - \frac{q}{1 - j/N}\right)\frac{1}{N} \approx (s+h)(n-k-1) + \int_{k/N}^{n/N} \left[\frac{r}{x} - \frac{q}{1-x}\right] dx$$

$$= (s+h)(n-k-1) + \ln\left(x^{r}(1-x)^{q}\right)\Big|_{x=k/N}^{x=n/N}.$$

Here $r = cN$ and $q = uN$, and we used the approximation $\ln(1+x) \approx x$ for small $x$. Taking $k = N/2$ as a reference point and noting that $\lambda_{N/2} \to N/4$ as $N \to \infty$, we obtain

$$p^*_n \approx \frac{N p^*_{N/2}}{\mu_n}\, 2^{\,r+q-2}\, e^{(s+h)N\,(n/N - 1/2)} \left(\frac{n}{N}\right)^{r} \left(1 - \frac{n}{N}\right)^{q}.$$

Using (a) we obtain the desired result. This completes the proof.

Algorithm for Calculating a Quasistationary Distribution

Here we follow Nåsell (2001). We examine a birth-death process $\{X(t), t \ge 0\}$ with the finite state space $\{0, 1, \ldots, N\}$ where the origin is an absorbing state. The birth rate is denoted $\lambda_n$ and the death rate is denoted $\mu_n$. The Kolmogorov forward equations for the state probabilities $p_n(t) = \Pr\{X(t) = n\}$ can be written as

$$\dot{p}_n = \mu_{n+1} p_{n+1} - (\lambda_n + \mu_n) p_n + \lambda_{n-1} p_{n-1}, \quad n = 0, 1, \ldots, N.$$

To interpret this equation, we formally put $\lambda_{-1} = \mu_{N+1} = p_{-1}(t) = p_{N+1}(t) = 0$. The state probabilities depend on the initial distribution $p_n(0)$.

Let us introduce two sequences $\rho_n$ and $\pi_n$ as follows:

$$\rho_1 = 1, \quad \rho_n = \frac{\lambda_1 \lambda_2 \cdots \lambda_{n-1}}{\mu_1 \mu_2 \cdots \mu_{n-1}}, \quad n = 2, 3, \ldots, N,$$

$$\pi_n = \frac{\mu_1}{\mu_n}\,\rho_n, \quad n = 1, 2, \ldots, N.$$

The state space of the birth and death process under consideration can be partitioned into the union of two subsets, one containing the absorbing state 0, and the other equal to the set of transient states $\{1, 2, \ldots, N\}$. Before absorption, the process assumes values in the set of transient states. If the process is conditioned on the event that absorption has not taken place at time $t$, then the conditional state probabilities $q_n(t)$ can be determined from the state probabilities $p_n(t)$ through the following relation:

$$q_n(t) = \Pr\{X(t) = n \mid X(t) > 0\} = \frac{p_n(t)}{1 - p_0(t)}.$$

By differentiating this relation and using the Kolmogorov forward equations for $p_n(t)$, we can obtain a system of differential equations for $q_n(t)$. The quasistationary distribution $q^*$ is the stationary solution of this system of equations. The probabilities can be shown to satisfy the following system of difference equations:

$$\mu_{n+1} q^*_{n+1} - (\lambda_n + \mu_n) q^*_n + \lambda_{n-1} q^*_{n-1} = -\mu_1 q^*_1 q^*_n.$$

It can be shown that

$$q_n = \pi_n \sum_{k=1}^{n} \frac{1 - \sum_{j=1}^{k-1} q_j}{\rho_k}\, q_1, \quad n = 1, 2, \ldots, N, \qquad \sum_{n=1}^{N} q_n = 1.$$

This is not an explicit solution because $q_1$ can only be determined when all $q_n$ are known. However, this relation can be used in an iterative algorithm in order to obtain a quasistationary distribution. This algorithm starts with an arbitrary initial quasistationary distribution, employs this distribution as an input in the numerators of the terms that are summed over $k$, and solves the equation under the requirement that $\sum_{n=1}^{N} q_n = 1$. The iteration can be formally described as follows:

$$q_n^{(i+1)} = \pi_n \sum_{k=1}^{n} \frac{1 - \sum_{j=1}^{k-1} q_j^{(i)}}{\rho_k}\, q_1^{(i+1)},$$

where the superscript $i$ denotes the iteration number. The process is repeated until the results of successive iterations are sufficiently close.

Literature Cited

Aravind, L., R. L. Tatusov, Y. I. Wolf, D. R. Walker, and E. V. Koonin. 1998. Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 14:442–444.
Bennett, P. M. 2004. Genome plasticity: insertion sequence elements, transposons and integrons, and DNA rearrangement. Methods Mol. Biol. 266:71–113.
Berg, O. G., and C. G. Kurland. 2002. Evolution of microbial genomes: sequence acquisition and loss. Mol. Biol. Evol. 19:2265–2276.
Description: