Molecular Phylogenies in Angiosperm Evolution¹

William Martin,* Derek Lydiate,†,² Henner Brinkmann,* Gert Forkmann,† Heinz Saedler,† and Rüdiger Cerff*

*Institut für Genetik, Technische Universität Braunschweig; and †Max-Planck-Institut für Züchtungsforschung

We have cloned and sequenced cDNAs for the glyceraldehyde-3-phosphate dehydrogenase of glycolysis, gapC, from a bryophyte, a gymnosperm, and three angiosperms. Phylogenetic analyses are presented for these data in the context of other gapC sequences and in parallel with published nucleotide sequences for the chloroplast-encoded gene for the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcL). Relative-rate tests were performed for these genes in order to assess variation in substitution rate for coding regions along the individual plant lineages studied. The results of both gene analyses suggest that the deepest dichotomy within the angiosperms separates not magnoliids from remaining angiosperms, but monocotyledons from dicotyledons, in sharp contrast to the prediction of the Euanthial theory for angiosperm evolution. Furthermore, these chloroplast and nuclear sequence data taken together suggest that the separation of monocotyledonous and dicotyledonous lineages took place in late Carboniferous times [~300 Myr before the present (Mybp)]. This date would exceed but be compatible with the late-Triassic (~220 Mybp) occurrence of fossil reproductive structures of the primitive angiosperm Sanmiguelia lewisii.

Introduction

Angiosperms dominate extant floras, yet the origins and early evolution of the group are incompletely understood. Efforts to understand the evolution of flowering plants rely to a large extent on the morphology of reproductive structures, yet these are rare in fossilized form (Stewart 1983, pp. 365-379; Thomas and Spicer 1987, pp. 215-231).
The intermediate forms necessary to reconstruct a robust picture of morphological changes surrounding angiospermy have yet to be brought forth from the fossil record, leaving the relationships (1) between angiosperms and other seed plants (Crane 1985; Meyen 1986; Cronquist 1988, pp. 129-157) and (2) within the flowering plants themselves (Burger 1977; Dahlgren and Bremer 1985; Krassilov 1991) a point of controversy. Cronquist (1988, p. 129) has noted that the “abominable mystery” of angiosperm origins “remains scarcely less so to modern students of evolution. It is clear that they [the angiosperms] are vascular plants, related to other vascular plants, . . . and that their immediate ancestors must have been, by definition, gymnosperms. Beyond that, much is debatable.” The accepted time scale of angiosperm evolution is integrally related to questions concerning identification of taxa ancestral to the group (Axelrod 1970; Krassilov 1977; Meeuse 1987, pp. 183-187; Cronquist 1988, pp. 129-157).

1. Key words: angiosperms, gymnosperms, molecular phylogeny, relative-rate test, glyceraldehyde-3-phosphate dehydrogenase, ribulose-1,5-bisphosphate carboxylase/oxygenase. Abbreviations: gapC = glycolytic glyceraldehyde-3-phosphate dehydrogenase, E.C.1.2.1.12; rbcL = large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase, E.C.4.1.1.39; gapA = Calvin cycle glyceraldehyde-3-phosphate dehydrogenase, E.C.1.2.1.13; Mybp = million years before present.

2. Present address: John Innes Centre for Plant Science Research, Colney Lane, Norwich, Norfolk, NR4 7UJ, England.

Address for correspondence and reprints: William Martin, Institut für Genetik, Technische Universität Braunschweig, Konstantin-Uhde-Str. 5, D-3300 Braunschweig, Germany.

Mol. Biol. Evol. 10(1):140-162. 1993. © 1993 by The University of Chicago. All rights reserved. 0737-4038/93/1001-0009$02.00
Generally recognized angiospermous forms appear in the fossil record in lower Cretaceous times, ~120 Myr before the present (Mybp) (Crane et al. 1986), and became the dominant group in fossil floras by ~90 Mybp (Lidgard and Crane 1988). Description of angiosperm evolution in pre-Cretaceous times is at best difficult in the absence of generally recognized angiospermous fossils of pre-Cretaceous age and is further impaired by difficulties in unambiguous definition of “an angiosperm,” particularly as applied to fossil forms (Stewart 1983, pp. 365-389; Thomas and Spicer 1987, pp. 213-232). Accordingly, the lower Cretaceous is widely viewed as the starting point of angiosperm evolution (Beck 1976, pp. 2-3; Cronquist 1988, pp. 136-138), though proponents of a long pre-Cretaceous history for angiosperms have argued their case (Axelrod 1959; Krassilov 1977; Meeuse 1990, pp. 11-21), albeit often without independent empirical data.

Here we report the cloning and phylogenetic analysis of full-size cDNAs for gapC, the nuclear-encoded gene for glycolytic glyceraldehyde-3-phosphate dehydrogenase (Cerff and Chambers 1979), from the bryophyte Physcomitrella patens, the gymnosperm Pinus sylvestris (Scots pine), and three angiosperms: Dianthus caryophyllus (carnation), Callistephus chinensis (aster), and Pisum sativum (pea). Phylogenetic analyses of rbcL sequences from related taxa are presented and compared with the nuclear gene phylogeny. Because the fossil record for the emergence of the first land plants (Gensel and Andrews 1987; Thomas and Spicer 1987, pp. 73-168) and first conifers (Scott and Chaloner 1983; Meyen 1984; Chaloner 1989; Galtier and Rowe 1989) is relatively well documented, gene sequences from a bryophyte and a gymnosperm provide points of calibration with which angiosperm age may be roughly estimated.
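As a rough illustration of how fossil calibration points can be turned into an age estimate, the following sketch assumes a strict molecular clock, which is a simplification and not necessarily the procedure followed in this study. The function names and every numerical value are hypothetical and are not data from this paper.

```python
def substitution_rate(k_cal, t_cal_myr):
    """Substitutions per site per Myr per lineage, from a calibrated split.

    k_cal     -- divergence (substitutions per site) between two calibration taxa
    t_cal_myr -- fossil-based age of their split, in Myr (hypothetical input)
    """
    # Both lineages accumulate substitutions after the split, hence the factor 2.
    return k_cal / (2.0 * t_cal_myr)


def divergence_time(k, rate):
    """Age (Myr) of a split with divergence k, under a constant rate."""
    return k / (2.0 * rate)


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
rate = substitution_rate(k_cal=0.2, t_cal_myr=400.0)
age = divergence_time(k=0.1, rate=rate)
```

Under these assumptions the estimate scales linearly with divergence: a pair showing half the divergence of the calibration pair is assigned half its age. Relative-rate tests matter precisely because this proportionality fails when lineages evolve at different rates.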
We also describe the comparative analysis of gapC and rbcL sequences from a bryophyte, a gymnosperm, and eight angiosperms (including Magnolia). These nuclear and chloroplast DNA genes provide congruent pictures of angiosperm evolution. The results support the view that angiosperms arose and diversified long before their extensive appearance in lower-Cretaceous strata. The lower estimates for the monocot-dicot divergence reported by Wolfe et al. (1989) may have arisen because their analysis rested solely on chloroplast DNA sequences from a single dicot species (tobacco), the chloroplast DNA of which has been shown (Wolfe et al. 1987) to evolve at a rate significantly lower than that of other chloroplast sequences. Our estimates of angiosperm age are congruent with the late-Triassic (~220 Mybp) occurrence of the primitive angiosperm Sanmiguelia lewisii (Cornet 1989b). The phylogenies derived from these nuclear and plastid nucleotide sequence data do not support the widely accepted Euanthial theory for the evolution of angiosperms from strictly magnolean antecedents. Our results are discussed in light of alternative theories concerning angiosperm origins and early evolution and are contrasted to other recent molecular analyses on flowering-plant phylogeny.

Material and Methods

cDNA Cloning

For the bryophyte Physcomitrella patens, cultures were grown (Cove and Ashton 1984) and polyA+ mRNA was isolated, according to a method described by Cerff and Kloppstech (1982), from 2-wk-old caulonema plate cultures. cDNA of P. patens was constructed by the method of Lapeyre and Amalric (1985), size fractionated on 1% agarose, and cloned into λNM1149 (Murray 1983), according to a method described by Schwarz-Sommer et al. (1987). Fifty thousand recombinants were screened by plaque hybridization at 60°C in 3× SSPE (Schwarz-Sommer et al.
1985) using a radiolabeled cDNA insert for glycolytic GAPDH from Magnolia liliiflora (Martin et al. 1989). Washings were performed at 60°C in 2× SSPE. Positive clones were subcloned into pUC19 (Vieira and Messing 1982), and terminal regions were sequenced by the dideoxy method (Sanger et al. 1977). Two full-size clones (pPPC18 and pPPC14) differing in length of their noncoding regions yet otherwise identical were identified, and both strands were sequenced by the chemical degradation method (Maxam and Gilbert 1980) and dideoxy method, for pPPC18 and pPPC14, respectively.

For Pinus sylvestris, 20 µg polyA+ mRNA was isolated from 700 µg total RNA provided by Stefan Jansson (Umeå). cDNA was constructed with the Pharmacia kit modified through size fractionation of linker-ligated cDNA on 1% agarose according to a method described by Martin et al. (1990) and cloned into λNM1149 (Murray 1983). Four hundred thousand recombinants were obtained from 20 ng of cDNA packaged. Screening was performed as for the P. patens library. Five hybridizing cDNAs were subcloned into SK+ plasmids (Stratagene), and terminal sequences were determined by the dideoxy method. Two of these, pPSC15 and pPSC9, were sequenced on both strands by the chemical degradation method (Maxam and Gilbert 1980). Two of the remaining three clones were identical, in terminal regions, to pPSC15, and the other was identical to pPSC9.

mRNA and cDNA for Dianthus caryophyllus and Callistephus chinensis were prepared from immature flowers by the method described for P. patens. One full-size gapC clone for each species (pDCC1 for Dianthus and pCCC1 for Callistephus) with insert >1.2 kb in length was sequenced by the chemical degradation method. The Pisum cDNA pPEA1 was isolated from a library described elsewhere (Brinkmann et al. 1989), and both strands were sequenced by the chain-termination method.
Computer Analysis

Sequence handling and alignment were performed with the WISGEN package (Devereux et al. 1984). Numbers of nonsynonymous substitutions per nonsynonymous (Ka) and nondegenerate (K0) site between sequences were measured with the weighted pathway method of Li et al. (1985). Trees from distance matrices were inferred with the neighbor-joining (NJ) method (Saitou and Nei 1987). Parsimony bootstrap analysis was performed with DNABOOT of the PHYLIP package (Felsenstein 1985). Relative-rate tests were performed for divergence at nondegenerate sites between gapC sequences, according to the method described by Li and Tanimura (1987). rbcL sequences were either retrieved from the data base or typed in by hand and were analyzed in the same manner as were gapC sequences.

Results

cDNA Sequences for gapC

Nucleotide sequences for full-size gapC cDNAs from Physcomitrella patens (a bryophyte), Pinus sylvestris (a gymnosperm), Callistephus chinensis (aster, Asteridae), Dianthus caryophyllus (carnation, Caryophyllidae), and Pisum sativum (pea, Rosidae) are shown in figure 1.

FIG. 1.-Nucleotide sequences for six glycolytic GAPDH cDNAs. Coding regions are shown in uppercase letters, and gaps are shown as dashes. Sequences are of cDNA clones from Physcomitrella patens (pPPC18), Pinus sylvestris (pPSC15 and pPSC9), Dianthus caryophyllus (pDCC1), Callistephus chinensis (pCCC1), and Pisum sativum (pPEA1) (see Material and Methods).

The cDNAs contain the entire coding regions, which are completely colinear, with the exception of a deletion following amino acid 145 in
the moss sequence; to date, this is the only indel found in coding regions of land plant gapC sequences. The two cDNA clones from Pinus show only three synonymous substitutions in the coding regions and only a few indels in the noncoding regions, suggesting that these are either allelic variants or very recently duplicated genes. For phylogenetic analyses presented here, only the sequence of pPSC9 was used. As judged by their frequency in the cDNA bank, these two clones would appear to represent the major gapC protein in pine seedlings. The systematic positions of those plants from which gapC sequences were analyzed in this study are given in table 1.

Table 1
Systematic Position of Species from Which gapC Sequences Were Analyzed

Species (gene a)                 Abbreviation b   Subclass         Division or Subdivision
Antirrhinum majus                ant              Asteridae        Angiospermae
Petunia hybrida                  pet              Asteridae        Angiospermae
Nicotiana tabacum                tob              Asteridae        Angiospermae
Callistephus chinensis           ast              Asteridae        Angiospermae
Sinapis alba                     sin              Dilleniidae      Angiospermae
Arabidopsis thaliana             ath              Dilleniidae      Angiospermae
Dianthus caryophyllus            nel              Caryophyllidae   Angiospermae
Mesembryanthemum crystallinum    mes              Caryophyllidae   Angiospermae
Petroselinum hortense            par              Rosidae          Angiospermae
Pisum sativum                    pea              Rosidae          Angiospermae
Magnolia liliiflora              mag              Magnoliidae      Angiospermae
Ranunculus acer                  ran              Magnoliidae      Angiospermae
Hordeum vulgare (gapC1)          bar              Liliidae         Angiospermae
H. vulgare (gapC3)               bar              Liliidae         Angiospermae
Zea mays (gapC1)                 zea              Liliidae         Angiospermae
Z. mays (gapC2)                  zc2              Liliidae         Angiospermae
Z. mays (gapC3)                  zea              Liliidae         Angiospermae
Pinus sylvestris                 pin              Pinidae          Gymnospermae
Physcomitrella patens            mos                               Bryophyta

a Unless otherwise specifically noted, all references to gapC sequences from Zea and Hordeum in the text indicate gapC1 sequences. Systematic position is based on the scheme of Ehrendorfer (1991).
b References and accession numbers for sequences are as follows: ant, X59517; pet, X60346; tob, M14419; ast, present paper; sin, X04302; ath, M64116; nel, present paper; mes, M29956; par, X60344; pea, present paper; mag, X60347; ran, X60345; bar gapC1, X60343; bar gapC3, Chojecki (1986); zea gapC1, X15596; zea gapC2 and gapC3, Russell and Sachs (1989); pin, present paper; and mos, present paper.

Nucleotide Composition Equilibrium

Some plant genes, such as that for the GAPDH enzyme of the Calvin cycle in chloroplasts, gapA (Brinkmann et al. 1987; Quigley et al. 1988), or that for chalcone synthase (Niesbach-Klösgen et al. 1987), show an extreme bias for G and C within the coding and flanking regions in graminaceous monocots. To show that coding regions of glycolytic glyceraldehyde-3-phosphate dehydrogenases, gapC, in plants are not subject to this type of bias, which could potentially influence the evolution of gapC genes at nonsynonymous sites, we separately plotted the base composition of the plant gapC sequences for first plus second codon positions and for third codon positions (fig. 2). Figure 2 shows that first and second positions of plant gapC coding regions are at compositional equilibrium (Prager and Wilson 1988) and that a mild preference for G and C within the coding region, as reflected at third positions of the barley and maize sequences, is not sufficient to have introduced bias at first and second codon positions, which contain most of the nondegenerate and nonsynonymous sites. The equilibrium found for the GAPDH of glycolysis (gapC) does not, however, hold for Calvin cycle GAPDH (gapA) genes, where third codon positions contain 97% G+C in maize (Brinkmann et al. 1987).

FIG. 2.-Base composition of plant gapC sequences demonstrating base-compositional equilibrium (Prager and Wilson 1988) at first and second codon positions (a) and at third codon positions (b). Bars indicate the fraction of bases in each sequence. Species-name abbreviations are as in table 1.

Relative-Rate Tests with gapC Sequences

The mRNA for gapC, a slowly evolving glycolytic protein (Fothergill-Gilmore 1986), satisfies several criteria required of a molecular marker able to address plant phylogeny. gapC contains ~1 kb of coding region; this is considerably longer than markers such as 5S RNA (Hori et al. 1985) or cytochrome c (Boulter et al. 1972) or shorter peptides (Boulter et al. 1979). As is true of the larger rRNA species, gapC evolves at a conservative pace, yet, in contrast to rRNA, gapC genes do not belong to the middle repetitive fraction of the genome and thus are not subject to the influences of concerted evolution for such DNA (Dover 1987, 1989). Nucleotide sequences for gapC are known for fungi and several metazoa whose fossil histories are well documented, allowing determination and comparison of substitution rates for gapC from the three eukaryotic kingdoms, for the purpose of linking the geologic time scale of animal evolution with that of its elusive angiosperm counterpart. To examine constancy of substitution rate for plant gapC cDNAs, we translated and aligned nucleotide sequences, leaving an average of 664 nondegenerate and 760 nonsynonymous nucleotide sites for comparison.
To determine whether significant variation in substitution rate among higher-plant gapC sequences could be detected, we measured numbers of substitutions per site at nondegenerate sites in the coding region, K0, and subjected them to the relative-rate test (Li and Tanimura 1987), using both yeast and bryophyte sequences as the outgroup. The results of these tests are summarized in tables 2 and 3. With yeast as the outgroup, no significant (at the 5% level) differences between plant and animal gapC sequences could be detected in substitution rate at nondegenerate sites (table 2), suggesting that the substitution rate of gapC has remained relatively constant since the divergence of these lineages. The larger number of negative values for K13-K23 (69 of 96 values; table 2) indicates that the plant gapC sequences may be evolving more slowly than their animal counterparts. Using the bryophyte gapC sequence as the outgroup, we tested the constancy of rate within seed plant gapC sequences (table 3). Table 3 reveals that only the gapC sequence from Petroselinum (parsley) is evolving at a rate significantly higher than that of other spermatophyte gapC sequences. gapC of Hordeum (barley), a graminaceous monocot, appears to be evolving at a slower rate than its counterparts from the dicotyledonous species surveyed, except pea. Thus, relative-rate tests reveal (1) that gapC sequences in plants do not evolve more rapidly than their animal counterparts and (2) that, with the exception of Petroselinum, gapC in the different plant lineages surveyed is not evolving at significantly different rates.
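The quantity tabulated in the relative-rate tests, K13-K23, can be sketched as follows. The divergence values are taken from table 2 (in units of 10^-4 substitutions per nondegenerate site); the function names are hypothetical, and the significance test of Li and Tanimura (1987), which additionally requires variance estimates, is not reproduced here.

```python
# Divergence from the yeast outgroup (species 3) at nondegenerate sites, x 10^4.
# Values are those listed in table 2 for a few of the sequences compared.
K13 = {"par": 2837, "sin": 2513, "ant": 2686}  # plants (species 1)
K23 = {"nem": 2694, "dro": 2899}               # animals (species 2)


def rate_difference(plant, animal):
    """K13 - K23; a negative value means the animal lineage changed more."""
    return K13[plant] - K23[animal]


def flagged_for_testing(plant, animal, threshold=200):
    """True if |K13 - K23| exceeds 0.0200 (200 in the units used here)."""
    return abs(rate_difference(plant, animal)) > threshold


parsley_vs_nematode = rate_difference("par", "nem")
sinapis_vs_drosophila = rate_difference("sin", "dro")
```

Because both ingroup sequences are compared against the same outgroup, the time elapsed since their common ancestor cancels out, so the difference reflects rate variation alone and needs no fossil date.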
This conservative mode of evolution for gapC is also observed in bacterial gapC sequences (Nelson et al. 1991) and renders these valuable markers in molecular phylogenetic studies.

rbcL Sequence Analyses

Parallel to the gapC analyses, we studied a gene of the chloroplast DNA, i.e., the gene for the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcL), from the same species or closely related taxa for which rbcL sequences are currently available: Pseudotsuga menziesii (Douglas fir), a second Pinaceae instead of Pinus; Marchantia polymorpha (liverwort), a second bryophyte instead of Physcomitrella; Spinacia oleracea (spinach) instead of D. caryophyllus as a second Caryophyllidae; and Oryza sativa (rice) as a second graminaceous monocot instead of Hordeum. gapC is nuclear encoded in plants such that, if gapC gene duplication events occurred prior to speciation of the taxa under study, a distorted picture of plant evolution potentially could be inferred from the cDNA sequences. Because there is only one copy of rbcL in the chloroplast DNA, rbcL gene duplication prior to separation of the higher-level taxa considered (Nei 1987, pp. 287-290) can be effectively ruled out. As for gapC, we first performed a relative-rate test for values of K0 derived from the aligned rbcL nucleotide sequences, using the Chlamydomonas outgroup to examine constancy of substitution rate for this gene. The rbcL gene from maize has accumulated a higher number of substitutions at nondegenerate sites than have the other angiosperm sequences under study, whereas the gene from rice has not, suggesting that the increase of rate observed in the maize rbcL gene may have occurred subsequent to the separation of these graminaceous monocots (table 4). The Marchantia rbcL gene is evolving at a slower rate than the other rbcL sequences considered.
The significant slowdown observed for the lineage of the Nicotiana (tobacco) chloroplast genome on the whole (Wolfe et al. 1987) is reflected neither in the rbcL gene of Nicotiana nor in that of the confamilial species Petunia (table 4).

Table 2
Relative-Rate Test between 16 Plant and Six Animal gapC Sequences at Nondegenerate Sites, Using Yeast as the Outgroup

                                                   Species 1
           Yeast   Ant   Pet   Tob   Ran   Par   Nel   Ast   Pea   Sin   Mag   Mes   Ath   Bar   Zea   Pin   Mos
Yeast            2,686 2,666 2,743 2,679 2,837 2,780 2,577 2,628 2,513 2,762 2,730 2,598 2,590 2,732 2,679 2,612
Species 2:
  Nem      2,694    -8   -28    49   -15   143    86  -117   -66  -181    68    36   -96  -104    38   -15   -82
  Dro2     2,887  -201  -221  -144  -208   -50  -107  -310  -259  -374  -125  -157  -289  -297  -155  -208  -275
  Dro      2,899  -213  -233  -156  -220   -62  -119  -322  -271  -386  -137  -169  -301  -309  -167  -220  -287
  Chk      2,682     4   -16    61    -3   155    98  -105   -54  -169    80    48   -84   -92    50    -3   -70
  Hum      2,733   -47   -67    10   -54   104    47  -156  -105  -220    29    -3  -135  -143    -1   -54  -121
  Rat      2,670    16    -4    73     9   167   110   -93   -42  -157    92    60   -72   -80    62     9   -58

NOTE.-Data in the top row are K13, i.e., the divergence between the respective plant (species 1) and yeast (species 3) at nondegenerate sites (i.e., K0) × 10^4. Data in the leftmost column are K23, i.e., the divergence between the respective animal (species 2) and yeast at nondegenerate sites × 10^4. All other data in the matrix are the difference K13-K23 at nondegenerate sites × 10^4; negative values reflect a higher rate for the animal gapC sequence tested in the comparison. Absolute values of K13-K23 greater than 0.0200 (i.e., greater than 200 in the units tabulated) were tested for significance. No differences were significant at the 5% level. Abbreviations are as in table 1, except for dro and dro2 (two gapC genes of Drosophila; Tso et al. 1985b), chk (chicken; Dugaiczyk et al. 1983), hum (human; Tso et al. 1985a), nem (nematode; Yarbrough et al. 1987), and rat (Tso et al. 1985a).

Table 3
Relative-Rate Test among Seed Plant gapC Sequences, Using Bryophyte as the Outgroup

                                             Species 1
          Ant    Pet    Tob    Ran    Par    Nel    Ast    Pea    Sin    Mag    Bar    Zea    Pin
Mos     1,461  1,607  1,520  1,430  1,727  1,564  1,535  1,381  1,397  1,406  1,384  1,482  1,347
Species 2:
  Ant
  Pet    -146
  Tob     -59     87
  Ran      31    177     90
  Par    -266   -120   -207  -297*
  Nel    -103     43    -44   -134    163
  Ast     -74     72    -15   -105    192     29
  Pea      80    226    139     49   346*    183    154
  Sin      64    210    123     33   330*    167    138    -16
  Mag      55    201    114     24   321*    158    129    -25     -9
  Bar      77    223    136     46   343*    180    151     -3     13     22
  Zea     -21    125     38    -52    245     82     53   -101    -85    -76    -98
  Pin     114    260    173     83  380**    217    188     34     50     59     37    135

NOTE.-Data in the top row are K13, i.e., the divergence between the respective species 1 and the bryophyte (Physcomitrella) outgroup (species 3) at nondegenerate sites (i.e., K0) × 10^4. Data in the matrix indicate the difference K13-K23 at nondegenerate sites × 10^4; negative values reflect a higher rate for species 2 in the comparison. Absolute values of K13-K23 greater than 0.0200 (i.e., greater than 200 in the units tabulated) were tested for significance. Species-name abbreviations are as in table 1.
* P < .05.
** P < .01.

Trees

For the construction of phylogenetic trees we chose the NJ method (Saitou and Nei 1987), since it has been shown in computer simulation to be more efficient in recovering the correct topology than are many other molecular sequence and distance matrix methods, under a variety of sequence parameters (including unequal rates of evolution in different lineages) (Saitou and Nei 1987; Saitou and Imanishi 1989; Jin and Nei 1990).
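For readers unfamiliar with the method, the following is a minimal sketch of neighbor joining in its commonly used Q-criterion formulation. It is not the software used for the analyses reported here; the function name, the internal node labels, and the five-taxon distance matrix are hypothetical illustrations and do not correspond to the K0 matrices of this study.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return (node, neighbor, branch_length) edges of an unrooted NJ tree.

    names -- list of taxon labels
    dist  -- dict mapping frozenset({a, b}) to the distance between a and b
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {a: sum(d[frozenset((a, b))] for b in nodes if b != a)
                 for a in nodes}
        # Join the pair that minimizes Q = (n - 2) * d(i, j) - total_i - total_j.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)]
                   - total[p[0]] - total[p[1]])
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node.
        li = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (d[frozenset((i, k))]
                                              + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical additive toy matrix for five taxa (not data from this paper).
taxa = ["a", "b", "c", "d", "e"]
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
toy = {frozenset(p): v for p, v in pairs.items()}
edges = neighbor_joining(taxa, toy)
```

Unlike clustering methods that assume a constant rate, the branch lengths assigned to the two joined taxa may differ, which is why the method tolerates unequal rates of evolution among lineages.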
Different evolutionary models deal differently with the contribution of substitutions at twofold-degenerate sites to the total proportion of nonsynonymous substitutions (Li et al. 1985; Nei and Gojobori 1986). Because the effect of an accurate distance measure has been shown in computer simulation to be critical for recovery of the correct topology (Jin and Nei 1990; Nei 1991), we chose divergence at nondegenerate sites as the distance measure for trees presented here. From the matrices of Ka and K0 values (Li et al. 1985) for gapC and rbcL for the nine higher-plant taxa under study, NJ trees were constructed in order to contrast the pictures of plant evolution reflected in the sequences of both a nuclear gene (gapC) and a chloroplast gene (rbcL). The NJ trees for values of K0 are shown in figure 3. The NJ tree for gapC values of Ka yielded a branching order identical to that shown for K0, whereas in the rbcL tree of Ka values, monocotyledons branched below the gymnosperm, perhaps because of the higher substitution rate in the maize rbcL gene (table 4), which is more pronounced in values of Ka than for divergence at nondegenerate sites (data not shown). With the exception of the position of Magnolia within the dicotyledons surveyed, NJ trees for K0 values of both gapC and rbcL yield the same topology, suggesting that no gapC sequences included in figure 3 arose through duplication prior to the separation of those monocotyledons and dicotyledons surveyed. A composite matrix of K0 values from concatenated gapC and rbcL sequences was also used to construct an NJ tree for the taxa shown (fig. 4), in order to increase the total number of sites for comparison. The branching order of the composite tree is identical to that obtained with the rbcL data alone (fig. 3). Magnolia is not borne on the deepest branch within the angiosperms surveyed, in either rbcL or gapC trees (see Discussion).

Table 4
Relative-Rate Test among Plant rbcL Sequences, Using Chlamydomonas as the Outgroup

                                     Species 1
                  Mpo    Fir    Spi    Pea    Mag    Tob    Pet    Ric    Zea
C. reinhardii     669    793    823    727    727    813    784    781    894
Species 2:
  Mpo
  Fir            -124
  Spi           -154*    -30
  Pea             -58     66     96
  Mag             -58     66     96      0
  Tob            -144    -20     10    -86    -86
  Pet            -115      9     39    -57    -57     29
  Ric            -112     12     42    -54    -54     32      3
  Zea          -225**   -101    -71  -167*  -167*    -81   -110   -113

NOTE.-Data in the top row are K13, i.e., the divergence between the respective species 1 and the C. reinhardii outgroup (species 3) at nondegenerate sites (i.e., K0) × 10^4. Data in the matrix indicate the difference K13-K23 at nondegenerate sites × 10^4; negative values reflect a higher rate for species 2 in the comparison. Absolute values of K13-K23 greater than 0.0100 (i.e., greater than 100 in the units tabulated) were tested for significance. Species-name abbreviations are as in table 1, except for fir (Pseudotsuga menziesii; Douglas fir), mpo (Marchantia polymorpha), mag (Magnolia macrophylla), spi (Spinacia oleracea; spinach), and ric (Oryza sativa; rice).
* P < .05.
** P < .01.

There is a single gene for gapC in Arabidopsis (Shih et al. 1991), yet in maize there are three separate genes for gapC, termed “gapC1,” “gapC2,” and “gapC3” (Russell and Sachs 1989). The relatively good correlation between the gapC (nuclear) and rbcL (chloroplast) trees (fig. 3) suggested that gapC comparisons in most cases were not paralogous. To examine the possibility that gapC comparisons between monocotyledons and dicotyledons may involve genes that arose through duplication prior to the separation of these taxa, we constructed a tree from the available sequence data for gapC2 and gapC3 from maize, in the context of all higher-plant gapC cDNA sequences currently available (fig. 5).
Indeed, a duplication that gave rise to gapC3 of maize occurred before the divergence of barley and maize. Yet, had the gapC family of maize arisen prior to the monocot-dicot separation, we would expect one of the maize gapC genes to share a common branch with a dicotyledonous sequence. This result is not found, indicating that the gapC sequences compared between monocotyledons and dicotyledons here (60 pairwise comparisons) did not arise through duplication prior to the separation of these taxa. Similar results were previously obtained in the analysis of gapC and chalcone synthase genes (Martin et al. 1989). These data and those derived from rRNA genes (Troitsky et al. 1991) lend strength to the molecular evidence for divergence of monocotyledons and dicotyledons prior to the divergence of magnoliids from other angiosperms.