Applications in Plant Sciences 2014 2(1): 1300063

PROTOCOL NOTE

A LONG PCR–BASED APPROACH FOR DNA ENRICHMENT PRIOR TO NEXT-GENERATION SEQUENCING FOR SYSTEMATIC STUDIES¹

SIMON URIBE-CONVERS²,³,⁵, JUSTIN R. DUKE³, MICHAEL J. MOORE⁴, AND DAVID C. TANK²,³

²Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, Idaho 83844-3051 USA; ³College of Natural Resources, University of Idaho, 875 Perimeter Drive MS 1133, Moscow, Idaho 83844-1133 USA; and ⁴Department of Biology, Oberlin College, Science Center K111, 119 Woodland St., Oberlin, Ohio 44074-1097 USA

• Premise of the study: We present an alternative approach for molecular systematic studies that combines long PCR and next-generation sequencing. Our approach can be used to generate templates from any DNA source for next-generation sequencing. Here we test our approach by amplifying complete chloroplast genomes, and we present a set of 58 potentially universal primers for angiosperms to do so. Additionally, this approach is likely to be particularly useful for nuclear and mitochondrial regions.

• Methods and Results: Chloroplast genomes of 30 species across angiosperms were amplified to test our approach. Amplification success varied depending on whether PCR conditions were optimized for a given taxon. To further test our approach, some amplicons were sequenced on an Illumina HiSeq 2000.

• Conclusions: Although here we tested this approach by sequencing plastomes, long PCR amplicons could be generated using DNA from any genome, expanding the possibilities of this approach for molecular systematic studies.

Key words: angiosperms; chloroplast enrichment; long PCR; next-generation sequencing; plastome; universal chloroplast PCR primers.
¹ Manuscript received 26 July 2013; revision accepted 3 December 2013.
The authors thank R. Cronn, T. C. Peterson, D. F. Morales-Briones, S. J. Jacobs, H. E. Marx, and four anonymous reviewers for helpful comments on the manuscript, and the University of Idaho Institute for Bioinformatics and Evolutionary Studies (NIH/NCRR P20RR16448 and P20RR016454) and the iPlant Collaborative (NSF DBI-0735191) for bioinformatic resources. Funding for this work was provided by the National Science Foundation (DEB-1210895 to D.C.T. for S.U.C., and DEB-1253463 to D.C.T.) and by the University of Idaho Student Grant Program and Seed Grant Program to S.U.C. and D.C.T., respectively.

Advancements in next-generation sequencing (NGS) technologies have permitted the assembly of large, genome-scale data sets that have shed light on the evolutionary history of many taxa (e.g., Parks et al., 2009; Moore et al., 2010; Xi et al., 2012; Eaton and Ree, 2013; Tennessen et al., 2013). For plant phylogenetics, there has been a major focus on methods for chloroplast phylogenomics (e.g., Parks et al., 2009; Moore et al., 2010), although methods for collecting phylogenomic data sets from the nuclear and mitochondrial genomes have also been developed (e.g., Straub et al., 2012; Eaton and Ree, 2013). Stull et al. (2013) developed a custom RNA probe set designed to capture angiosperm plastomes via solution-based hybridization. While their capture system was broadly successful, Stull et al. (2013) found that the most variable spacer regions were often captured at much-reduced coverage compared to more conserved regions, and were sometimes missed entirely if the target taxon was phylogenetically divergent from one of the 22 plastomes used in the bait design. Moreover, the current cost of the capture probes makes this method most efficient for projects dealing with hundreds of species.

Another commonly employed method for plant phylogenomic studies is genome skimming (Straub et al., 2012), which takes advantage of the fact that organellar DNA and nuclear ribosomal DNA are present at high copy numbers in genomic DNA. However, a significant limitation of this method for systematic studies is that only high-copy number regions are recovered consistently across all samples, whereas regions with lower representation are only recovered in some samples and missed completely in others (Straub et al., 2011). This can be problematic for molecular systematic studies where missing data may result in misleading phylogenetic results (Lemmon et al., 2009). Moreover, being limited to high-copy regions in the genome becomes restrictive for experimental design as it excludes putatively highly informative regions in the genome such as single-copy nuclear genes (e.g., the single-copy orthologous genes [COSII] and the pentatricopeptide repeat [PPR] gene family; Wu et al., 2006, and Yuan et al., 2009, respectively).

As an alternative, we present an NGS approach that combines long PCR and Illumina sequencing to strategically compile phylogenomic data sets for molecular systematic studies. Long PCR, or long-range PCR, uses a combination of two polymerases (a nonproofreading polymerase at high concentration and a proofreading polymerase at a lower concentration) to amplify DNA fragments that range between 3 and 15 kilobases (kb), although cases of extremely large fragments (22–42 kb) have been reported (e.g., Cheng et al., 1994). Long PCR has
been used extensively in human genome projects (e.g., Craig et al., 2008) and to sequence complete mitochondrial genomes (e.g., Knaus et al., 2011; Alexander et al., 2013), using both Sanger sequencing and NGS technologies. Here, we use long PCR to generate chloroplast DNA templates for systematic studies using NGS. While we focus on whole chloroplast amplification, this approach is directly translatable to targeted studies where only particular regions of the plastome are of interest (e.g., the inverted repeat or the small single-copy region). [...] To test this approach, we amplified the plastome sequences of 30 species (17 genera) across angiosperms using a set of 58 chloroplast PCR primers that were designed to potentially be universal in angiosperms and that may work in some gymnosperm lineages.

⁵ Author for correspondence: [email protected]
doi:10.3732/apps.1300063
Applications in Plant Sciences 2014 2(1): 1300063; http://www.bioone.org/loi/apps © 2014 Uribe-Convers et al. Published by the Botanical Society of America. This work is licensed under a Creative Commons Attribution License (CC-BY-NC-SA).

TABLE 1. List of species included in this study, with voucher information, tissue sources, and NGS assembly statistics when available.ᵃ

Species | Order/Family | Collection no. | Herbarium | Type of tissue | Collection date | No. of amplified regions | Region no. not amplifiedᵇ | Base pairs sequencedᶜ | No. of contigs | CAL bp (min–max) | Ave. assembly depth | No. of masked bpᵈ | % of masked bp | N50 | % called basesᵉ | No. of ambiguous bases | % of ambiguous bases
Bartsia inaequalis Benth. | Lamiales/Orobanchaceae | Uribe-Convers 2010-22 | ID | Silica gel–dried | 5 July 2010 | 16 | n/a | 125,283 | 25 | 5011 (204–28,257) | 656 | 2126 | 1.7 | 19,294 | 99.9729 | 34 | 0.02714
Castilleja covilleana L. F. Hend. | Lamiales/Orobanchaceae | Tank 1046 | ID | Silica gel–dried | 13 July 2009 | 16 | n/a | 133,595 | 10 | 13,360 (1222–48,767) | 641 | 101 | 0.08 | 37,107 | 99.9948 | 7 | 0.00524
Castilleja elmeri Fernald | Lamiales/Orobanchaceae | Olmstead 2001-78 | WTU | Silica gel–dried | 4 July 2001 | 16 | n/a | 122,614 | 11 | 11,147 (464–34,602) | 664 | 440 | 0.36 | 33,049 | 99.9976 | 3 | 0.00245
Castilleja linariifolia Benth. | Lamiales/Orobanchaceae | Tank 2001-49 | WTU | Silica gel–dried | 21 July 2001 | 16 | n/a | 122,046 | 8 | 15,256 (819–50,680) | 642 | 260 | 0.21 | 28,529 | 99.9984 | 2 | 0.00164
Castilleja miniata Douglas ex Hook. | Lamiales/Orobanchaceae | Tank 1048-b | ID | Silica gel–dried | 13 July 2009 | 16 | n/a | 134,704 | 4 | 33,676 (6157–75,123) | 844 | 35 | 0.03 | 75,123 | 99.9970 | 4 | 0.00297
Castilleja pallescens (A. Gray) Greenm. | Lamiales/Orobanchaceae | Tank 2009-8 | ID | Silica gel–dried | 6 June 2009 | 16 | n/a | 125,490 | 4 | 31,372 (3039–73,629) | 764 | 29 | 0.02 | 73,629 | 99.9984 | 2 | 0.00159
Bartsia stricta (Kunth) Benth. | Lamiales/Orobanchaceae | Uribe-Convers 2010-24 | ID | Silica gel–dried | 7 July 2010 | 15 | 13 | 119,828 | 14 | 8559 (425–67,195) | 707 | 1045 | 0.87 | 67,195 | 99.9967 | 4 | 0.00334
Castilleja applegatei Fernald | Lamiales/Orobanchaceae | Tank 2001-35 | WTU | Silica gel–dried | 24 June 2001 | 15 | 10 | 119,647 | 14 | 8546 (204–28,559) | 642 | 394 | 0.33 | 18,856 | 99.9983 | 2 | 0.00167
Castilleja virgata (Domb. ex Wedd.) Edwin | Lamiales/Orobanchaceae | Olmstead 2009-22 | WTU | Silica gel–dried | 5 Mar. 2009 | 15 | 7 | 113,650 | 21 | 5412 (178–39,914) | 698 | 1525 | 1.34 | 14,541 | 99.9938 | 7 | 0.00616
Castilleja ortegae Standl. | Lamiales/Orobanchaceae | Egger 1213 | WTU | Silica gel–dried | 22 Feb. 2002 | 15 | 13 | 108,071 | 3 | 36,024 (269–97,615) | 925 | 198 | 0.18 | 97,615 | 99.9991 | 1 | 0.00093
Castilleja lineariloba (Benth.) T. I. Chuang & Heckard | Lamiales/Orobanchaceae | Tank 2002-04 | WTU | Silica gel–dried | 27 Apr. 2004 | 14 | 9, 10 | 122,182 | 23 | 5312 (179–36,972) | 540 | 810 | 0.66 | 11,656 | 99.9844 | 19 | 0.01555
Castilleja victoriae Fairbarns & J. M. Egger | Lamiales/Orobanchaceae | Fairbarns s.n. | WTU | Silica gel–dried | 21 July 2005 | 14 | 10, 14 | 111,371 | 10 | 11,137 (616–44,011) | 688 | 547 | 0.49 | 18,398 | 99.9982 | 2 | 0.00180
Lamourouxia virgata Kunth | Lamiales/Orobanchaceae | Zak & Jaramillo 3387 | F | Herbarium | 16 Jan. 1988 | 14 | 9, 10 | 108,767 | 30 | 3626 (214–36,850) | 652 | 2255 | 2.07 | 11,012 | 99.9669 | 36 | 0.03310
Castilleja oresbia Greenm. | Lamiales/Orobanchaceae | Tank 2001-27 | WTU | Silica gel–dried | 19 June 2001 | 10 | 6, 9, 10, 13, 14, 16 | 83,384 | 20 | 4169 (222–36,830) | 717 | 1544 | 1.85 | 9986 | 99.9676 | 27 | 0.03238
Castilleja arvensis Cham. & Schltdl. | Lamiales/Orobanchaceae | Tank 2005-27 | WTU | Silica gel–dried | 16 Apr. 2005 | 6 | 4, 6, 7, 8, 9, 10, 13, 14, 15, 16 | 73,378 | 15 | 4892 (186–36,621) | 701 | 1187 | 1.62 | 9803 | 99.9877 | 9 | 0.01227
Penstemon montanus Greene var. idahoensis (D. D. Keck) Cronq. | Lamiales/Plantaginaceae | Brunsfeld 4159 | ID | Herbarium | 14 June 2001 | 16 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Balsamorhiza sagittata (Pursh) Nutt. | Asterales/Asteraceae | Willard 2013-42 | ID | Silica gel–dried | 3 July 2013 | 15 | 5 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Lomatium dissectum (Nutt.) Mathias & Constance | Apiales/Apiaceae | Poor 21 | ID | Herbarium | 27 May 2004 | 15 | 14 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Nuphar polysepala Engelm. | Nymphaeales/Nymphaeaceae | Morales-Briones 412 | ID | Silica gel–dried | 8 July 2013 | 15 | 5 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Salix scouleriana Barratt ex Hook. | Malpighiales/Salicaceae | Brunsfeld 7213 | ID | Herbarium | 11 June 2008 | 15 | 9 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Crataegus columbiana Howell | Rosales/Rosaceae | Hetrick 1005 | ID | Herbarium | 10 Apr. 1996 | 13 | 9, 14, 17 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Polygonum douglasii Greene | Caryophyllales/Polygonaceae | Smith 8040 | ID | Herbarium | 23 June 2005 | 12 | 5, 6, 9, 15 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Umbellularia californica (Hook. & Arn.) Nutt. | Laurales/Lauraceae | Halse 6901 | ID | Herbarium | 28 Mar. 2002 | 12 | 6, 8, 9, 10 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Bromus tectorum L. | Poales/Poaceae | Clippinger 2 | ID | Herbarium | 1 May 2004 | 11 | 5, 6, 9, 11, 17 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Alnus rhombifolia Nutt. | Fagales/Betulaceae | Gray 52 | ID | Herbarium | 7 Aug. 1989 | 10 | 5, 6, 8, 9, 10, 14 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Poa bulbosa L. | Poales/Poaceae | Willard 2013-26 | ID | Silica gel–dried | 3 July 2013 | 10 | 5, 6, 9, 12, 13, 14 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Senecio integerrimus Nutt. var. exaltatus (Nutt.) Cronq. | Asterales/Asteraceae | Willard 2013-21 | ID | Silica gel–dried | 3 July 2013 | 10 | 3, 5, 6, 8, 9, 11 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Abies amabilis Douglas ex J. Forbes | Pinales/Pinaceae | 1419-46 | WA Park Arb. | Silica gel–dried | 24 May 2009 | 9 | 4, 6, 7, 9, 10, 11, 12 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Capsella bursa-pastoris (L.) Medik. | Brassicales/Brassicaceae | Brunsfeld 6313 | ID | Herbarium | 1 June 2005 | 8 | 4, 6, 8, 9, 10, 13, 14, 17 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Lupinus leucophyllus Douglas ex Lindl. | Fabales/Fabaceae | Willard 2013-03 | ID | Silica gel–dried | 3 July 2013 | 8 | 1, 6, 8, 9, 10, 12, 13, 14 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Abies fraseri (Pursh) Poir. | Pinales/Pinaceae | 1005-47 | WA Park Arb. | Silica gel–dried | 24 May 2009 | 7 | 4, 5, 6, 7, 8, 9, 10, 11, 12 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Balsamorhiza hookeri Nutt. | Asterales/Asteraceae | Smith 9421 | ID | Herbarium | 4 June 2007 | 7 | 4, 5, 6, 7, 8, 9, 10, 11, 13 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Abies grandis (Douglas ex D. Don) Lindl. | Pinales/Pinaceae | 1084-49 | WA Park Arb. | Silica gel–dried | 24 May 2009 | 6 | 1, 3, 4, 6, 7, 8, 9, 10, 11, 12 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Average | | | | | | | | 114,934 | 14.13 | 13,166.60 | 698.73 | 833.07 | 0.79 | 35,052.87 | 99.99 | 10.6 | 0.01

Note: CAL = contig average length; F = Field Museum of Natural History Herbarium; ID = University of Idaho Stillinger Herbarium; WA Park Arb. = Washington Park Arboretum; WTU = University of Washington Herbarium.
ᵃ All data from the 16 chosen primer combinations.
ᵇ The number of the regions is the same as the order in Fig. 1.
ᶜ Base pairs (bp) sequenced is the sum of all contigs when including only one copy of the inverted repeat.
ᵈ Number of bases masked because the minimum sequencing depth of 5× was not achieved.
ᵉ Percentage of unambiguously called bases.
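The per-assembly summary columns of Table 1 are simple functions of the contig lengths and the masked-base count. The sketch below shows one way they could be recomputed; it is not the authors' pipeline. It assumes the common definition of N50 (the length of the contig at which the running total, longest contig first, reaches half of the assembly), and the function names `n50` and `assembly_summary` are hypothetical. The example uses the Castilleja miniata row, for which Table 1 gives four contigs, a minimum of 6157 bp, a maximum of 75,123 bp, and a total of 134,704 bp; the two middle contig lengths are invented placeholders chosen only so that the total matches.

```python
from typing import Dict, Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return N50: walking contigs from longest to shortest, the length of
    the contig at which the running total reaches half the assembly size."""
    if not contig_lengths:
        raise ValueError("at least one contig is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty list")


def assembly_summary(contig_lengths: Sequence[int], masked_bp: int) -> Dict[str, float]:
    """Recompute the Table 1 style summary columns for one assembly."""
    if not contig_lengths:
        raise ValueError("at least one contig is required")
    total = sum(contig_lengths)
    return {
        "bp_sequenced": total,
        "contigs": len(contig_lengths),
        "cal": total / len(contig_lengths),      # contig average length
        "min": min(contig_lengths),
        "max": max(contig_lengths),
        "n50": n50(contig_lengths),
        "pct_masked": 100.0 * masked_bp / total,  # bases below the depth cutoff
    }


if __name__ == "__main__":
    # Castilleja miniata: min, max, total, and masked bp are from Table 1;
    # the two middle lengths (30000, 23424) are hypothetical placeholders.
    summary = assembly_summary([75123, 30000, 23424, 6157], masked_bp=35)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

With these inputs the contig average length, N50, and percentage of masked bases agree with the corresponding Table 1 entries after rounding, which suggests, but does not prove, that the table uses the same definitions.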
METHODS AND RESULTS

[...]

TABLE 2. Universal angiosperm primers used for chloroplast genome amplifications. The 16 primer combinations chosen for this study are in bold with approximate amplicon sizes in kilobases (kb) indicated.ᵃ

Region no. | Approx. size (kb) | Primer (F/R) | Primer sequence (5′–3′)
1 | 8 | trnH.GUG.6R | CCTTRATCCACTTGGCTACAT
1 | | psbK.195R | ACTTACAGCAGCTTGCCAAAC
2/2a | 10.3/6.3 | trnQ.UUG.50R | GGACGGAAGGATTCGAACC
2a | | atpH.17F | CTGCYGCTTCYGTTATTGCT
2b | 4 | atpF.65R | CGGTATTAAACCCGAAACTCC
2/2b | | rpoC2.4805F | GYCGTATYGATTGGTTRAAAGG
3 | 7 | atpI.705R | CRGCTAAAGTTGCAAAAATAAGAGCT
3 | | rpoC1.1670F | GRGATCAAATGGCTGTTCAT
4 | 9 | rpoC2.520R | GTTCGTACAGCAGTATCYACAAC
4 | | petN.3R | GCCCAAGCRAGACTTACTATATCC
5 | 10.5 | trnC.GCA.47F | CCCAGTTCAAATCCGGGT
5 | | psaB.2170F | GCRGCTTTCTTGATTGCYTC
6 | 10 | trnfM.CAU.21R | GGTTATGAGCCTTGCGAGCTA
6 | | trnT.UGU.17F | GGTTAGAGCATCGCATTTGTAATG
7 | 10.3 | rps4.380R | GGTTTGCARCGATAACTTGGKATATC
7 | | rbcL.178R | GTCCATGTACCAGTAGARGATTC
8 | 9.2 | rbcL.2F | TGTCACCACAAACAGARACTAAAG
8 | | psbJ.3F | GGCYGATACTACTGGAAGRAT
9 | 9.8 | petA.920F | CTTCAAGAYCCATTACGTGTHCAAG
9 | | psbB.160R | TRCCYTGTCTCCACATTGGAT
10 | 10.9 | psbB.3F | GGGTTTRCCTTGGTATCGTGT
10 | | rps3.17F.new | ATCCACTTGGTTTYMGACTTGG
11 | 8.7 | rpl16.3R | AACCAACGAGTCACACACTAAGC
11/16 | | ycf2.5100R | CAGATCATGAATGTTTGGAATCCAT
12 | 10 | ycf2.2300F | TCGGGATCCTRATGCATATAGATAC
12 | | rps12.190F | GTTGCCAGAGTACGMTTAACCT
13 | 11 | rps12.360R | CCCTTGTTGACGATCCTTTACTC
13 | | ycf1.59R | CCGACCACAACGACCGAAT
14/15 | 11.2 | trnN.GUU.7R | CCGCTCTACCACTGAGCTAC
14 | | ndhA.535F | GCTGCTCAATCDATTAGTTATGAA
15 | 10.5 | ndhI.194R | CGAACRCATACTTCACAAGCAA
16 | 8.2 | psbA.640F | GCTATGCATGGTTCYTTGGTAAC
 | | rps16.50R | CGAACATCAATTGCAACGATTCGATA
 | | rps16.50F | TATCGAATCGTTGCAATTGATGTTCG
 | | psbK.200F | GGCAAGCTGCTGTAAGTTTTCGA
 | | atpF.70F | GGGTTTAATACCGATATTTTAGCAAC
 | | trnR.UCU.45F | GGTATAGGTTCAAATCCTATTGGAC
 | | trnQ.UUG.47F | CGGAGGTTCGAATCCTTCC
 | | trnK.UUU.3R | GAGATGGCAACTCAATCGTTG
 | | trnK.UUU.3F | CAACGATTGAGTTGCCATCTC
 | | atpA.430F | CGTTCYGTATATGARCCTCTTCAAAC
 | | atpA.820F | ATCGMCAAATGTCTCTTCTATTAMG
 | | ccsA.890R | TCCAAGTAATAAANGCCCAAGTTTC
 | | trnR.ACG.15F | GAGGATTAGAGCACGTGG
 | | ycf1.70F | GTGGTCGGACTCTATTATGGAT
 | | trnL.UAG.18F | GGTAGACACGCTGCTCTTAGG
 | | trnL.UAG.19F | GTAGACACGCTGCTCTTAGGAAG
 | | rps12.320R | GGGTTCCTCGAACAATGTGATATC
 | | rpl2.550F | GTGCTGTAGCGAAACTGATTG
 | | rpl2.640F | TCAGCAACAGTCGGACARGT
 | | psbT.3F | TGGAAGCATTGGTTTATACATTYCT
 | | atpB.1290R | ARGGTTGTGATAAGAAACGYTCAA
 | | trnT.UGU.42F | GATGGTCATCGGTTCGATTC
 | | psbC.3R | AGTTCCATTAAAGAGCGTTTCC
 | | psbD.860F | CYGGTTTATGGATGAGYGCT
 | | rpoB.900R | CGTCGACCAATCYTTCCTAATTC
 | | rpoB.470R | CCRGGRCTTTGCAATATTTGATTG
 | | rpoC2.430R | ATRGGTAAATCAATCATTTGYCCTTG

Overlap between regions in bpᵇ:
Regions 1 & 2 = 542
Regions 1 & 2a = 542
Regions 2a & 2b = 627
Regions 2b & 3 = 2059
Regions 2 & 3 = 2059
Regions 3 & 4 = 1274
Regions 4 & 5 = 860
Regions 5 & 6 = 618
Regions 6 & 7 = 764
Regions 7 & 8 = 153
Regions 8 & 9 = 1216
Regions 9 & 10 = 135
Regions 10 & 11 = 771
Regions 11 & 12 = 2781
Regions 12 & 13 = 142
Regions 13 & 14 = 392
Regions 14 & 15 = 1911
Regions 16 & 1 = 840

ᵃ All primers are shown in the 5′ to 3′ direction; the name of each primer consists of three parts: the gene in which the primer is anchored, the approximate position of the primer within that gene, and either an “F” or an “R.” It is important to note that the F and R designations do not indicate that the primer should be used as a forward or reverse primer; rather, they indicate the 5′ to 3′ orientation of the primer with respect to the gene. That is, a primer designated “F” has its 5′ to 3′ orientation in the same orientation as the gene (i.e., on the forward strand), whereas an “R” primer is oriented in the direction opposite to the 5′ to 3′ orientation of the gene (i.e., on the reverse strand).
ᵇ Overlap between regions is given in number of base pairs (bp), without taking the length of the primers into consideration.

For the three genera of Orobanchaceae in which PCR optimization was performed, amplification of the fragments was straightforward and had an average success rate of 89.7% (range = 73–100%). The most difficult regions to amplify were regions 2 (trnQ(UUG)-rpoC2), 9 (petA-psbB), 10 (psbB-rps3), and 14 (trnN(GUU)-ndhA), which are among the largest fragments (10.3 kb, 9.8 kb, 10.9 kb, and 11.2 kb, respectively; Table 2). It was possible to split
It was possible to split http://www.bioone.org/loi/apps 4 of 9 Applications in Plant Sciences 2014 2 ( 1 ): 1300063 Uribe-Convers et al.—Long PCR–based DNA enrichment doi:10.3732/apps.1300063 Fig. 1. The fi nal annotated chloroplast genome assembly of B artsia inaequalis with the 16 overlapping primer combinations indicated. Note that the primer combinations for regions 11, 12, 13, and 16 amplify both inverted repeat A and B in a single reaction. Photos by Simon Uribe-Convers. region 2 into two smaller fragments, 2a (t rnQ( UUG) - atpH : 6.3 kb) and 2b tissue for DNA extractions would improve success rates. Furthermore, if ( atpF - rpoC2 : 4 kb), which facilitated its amplifi cation in several taxa. This genomic rearrangements and/or primer mismatches are present in certain was not the case for regions 9, 10, and 14, for which multiple long PCR groups, primer combinations other than the 16 that were used here could be experiments using varying amounts of DNA template were necessary to ob- tested ( Table 2 ). Nevertheless, we successfully amplifi ed all 16 regions in tain successful amplifi cations. Amplifi cation outside of Orobanchaceae was seven species, whereas in the remaining 23 species it was only possible to highly variable, with an average success rate of 70.8% (range = 22–100%) amplify between six (1 sp.) and 15 (8 spp.) regions ( Table 1 ). These results with regions 5, 6, 9, 10, and 11 showing the lowest success. Importantly, translate to 21 species having at least 12 regions amplifi ed (114.7 kb based the results for these taxa were obtained after just two rounds of PCR where the on potential amplicon size), representing ca. 74% of the chloroplast genome annealing temperatures were changed to either 48° C or 55 ° C. Although we when considering only one copy of the inverted repeat. Even the species did not optimize the long PCRs for each group, we are confi dent that opti- with the smallest number of amplifi ed fragments (C astilleja arvensis Cham. 
mization on a per group basis (e.g., increasing template volume, altering & Schltdl.) was represented by ~73 kb of data, exemplifying the effective- annealing temperatures, and/or long PCR profi les) and/or the use of fresh ness of this approach. http://www.bioone.org/loi/apps 5 of 9 Applications in Plant Sciences 2014 2 ( 1 ): 1300063 Uribe-Convers et al.—Long PCR–based DNA enrichment doi:10.3732/apps.1300063 It is notable that many of the DNAs that were tested were extracted from CONCLUSIONS herbarium tissues that ranged from fi ve to 25 yr old when isolated. In addition, we tested these primers in several species of A bies Mill. (Pinaceae; Table 1 ) We present an alternative approach for systematic studies with surprising success, amplifying between six and nine regions without any that combines long PCR and NGS to strategically compile phy- PCR optimization. We caution that our long PCR protocol works best using logenomic data sets for molecular systematic studies. This ap- recent DNA extractions that have not been through multiple freeze-thaw cycles. Ideally, long PCR should be conducted using new DNA extractions that are proach is on par with genome skimming in terms of costs, but it stored at 4 ° C while performing experiments. Additionally, discrete PCR bands has the advantage of being a targeted approach and has the po- were only obtained using high-quality T aq polymerases. When conventional tential to produce data more uniformly across samples, i.e., polymerases were used (e.g., GoT aq [Promega Corporation, Madison, Wisconsin, minimizing missing data across taxa. Although this approach USA] or Top Taq [QIAGEN]), the resulting PCR products were smears rather was only tested with chloroplast data, we emphasize that the than discrete bands and were not used for sequencing. 
long PCR amplicons can be generated using DNA from any To confi rm that our long PCR approach was compatible with NGS and that our primers would yield complete chloroplast genomes, the amplicons from genome, expanding the possibilities of long PCR and NGS for each of the 15 Orobanchaceae taxa were purifi ed by precipitation in a 20% molecular systematic studies. This last point is important for polyethylene glycol 8000 (PEG)/2.5 M NaCl solution and washed in 70% etha- studies targeting the mitochondrion or low-copy regions of the nol. The amplicons were sheared by nebulization at 30 psi for 70 s, yielding an genome that otherwise might be missed or not shared across all average shear size of 500 bp as measured by a Bioanalyzer High-Sensitivity samples using genome skimming approaches. For example, this Chip (Agilent Technologies, Santa Clara, California, USA). DNA normaliza- approach may be particularly useful for the enrichment of nuclear tion is a critical step when pooling samples for multiplexing in NGS; however, due to the large number of plastomes per cell and the very few samples that regions, where intron sizes are large or unknown. were being sequenced in such a high-throughput sequencing platform, no DNA quantifi cation was made and the sheared amplicons were pooled by species at equal volume ratios. Sequencing libraries were constructed using the Illumina TruSeq library preparation kit and protocol (Illumina, San Diego, California, LITERATURE CITED USA) and were standardized at 2 nM prior to sequencing. Library concentra- tions were determined using the KAPA qPCR kit (KK4835; Kapa Biosystems, ALEXANDER , A . , D. S TEEL , B . S LIKAS , K . HOEKZEMA , C . C ARRAHER , M. PARKS , Woburn, Massachusetts, USA) on an ABI StepOnePlus Real-Time PCR System R. CRONN , AND C. S. BAKER . 2013 . Low diversity in the mitogenome (Life Technologies, Grand Island, New York, USA). The resulting libraries of sperm whales revealed by next-generation sequencing. 
G enome were multiplexed in one Illumina HiSeq 2000 lane (~187.5 million reads per Biology and Evolution 5 : 113 – 129 . lane [ Glenn, 2011] ) at the Vincent J. Coates Genomics Sequencing Laboratory ANGIOSPERM PHYLOGENY GROUP . 2009 . An update of the Angiosperm at the University of California, Berkeley, yielding ~12.5 million 100-bp single- Phylogeny Group classifi cation for the orders and families of fl ower- end reads for each taxon (GenBank Sequence Read Archive accessions: ing plants: APG III. B otanical Journal of the Linnean Society 161 : SRR1023085, SRR1023089, SRR1023095, SRR1023112, SRR1023113, 105 – 121 . SRR1023126, SRR1023128–SRR1023136). Average depth of coverage of our CHENG , S. , C. FOCKLER , W. M. BARNES , AND R. HIGUCHI . 1994 . Effective sequencing experiment was ~8333× (taking 150 kb as the average plastome amplifi cation of long targets from cloned inserts and human genomic size). The results obtained here clearly do not maximize the potential of the Il- DNA. Proceedings of the National Academy of Sciences, USA 91 : lumina HiSeq 2000 for plastome sequencing. To take full advantage of the large 5695 – 5699 . amount of data produced by a HiSeq 2000 for plastome sequencing, it would be CONANT , G . C. , AND K . H. WOLFE . 2008 . GenomeVx: Simple web-based theoretically possible to sequence ~4170 samples per lane and still reach the creation of editable circular chromosome maps. Bioinformatics 30 × minimum threshold generally regarded as ideal for plastome sequencing (Oxford, England) 24 : 861 – 862 . ( Straub et al., 2012 ). However, high-level multiplexing in NGS with this or any other high-throughput method requires careful normalization of DNA con- CRAIG , D . W. , J. V. PEARSON , S . SZELINGER , A . SEKAR , M. REDMAN , J. J. centrations across samples and suffi cient adapter barcodes; commonly used CORNEVEAUX , T. L. PAWLOWSKI , ET AL . 2008 . Identifi cation of genetic variants using bar-coded multiplexed sequencing. 
N ature Methods 5 : commercial kits currently offer either 96 (NEXTfl ex DNA Barcode kit; Bioo 887 – 893 . Scientifi c, Austin, Texas, USA) or 386 (Fluidigm, San Francisco, California, USA). Alternatively, one could choose to perform this type of experiment on CRONN , R. , A. LISTON , M. PARKS , D. S. GERNANDT , R. SHEN , AND T. MOCKLER . an NGS platform that yielded a lesser amount of data, e.g., 1 million 250-bp 2008 . Multiplex sequencing of plant chloroplast genomes using paired-end reads on an Illumina MiSeq Reagent Nano Kit version 2, which Solexa sequencing-by-synthesis technology. N ucleic Acids Research would allow a 30 × sequencing depth for 96 samples (or 50 × sequencing depth 36 : e122 . doi:10.1093/nar/gkn502 for 64 samples). CRONN , R. , B . J. KNAUS , A. LISTON , P . J. MAUGHAN , M . PARKS , J. V. SYRING , Because of the high depth of coverage of our sequencing experiment, reads AND J. UDALL . 2012 . Targeted enrichment strategies for next-generation were cleaned at high stringency (minimum quality = 30/40, maximum number plant biology. American Journal of Botany 99 : 291 – 311 . of low-quality bases per read = 5, maximum number of duplicate reads = 10, DOWNIE , S . R. , AND J. D. P ALMER . 1992 . Use of chloroplast DNA rear- minimum number of duplicate reads = 2) and assembled against a reference rangements in reconstructing plant phylogeny. I n P. S. Soltis, D. E. genome ( Sesamum indicum L., GenBank accession no. JN637766) using the Align- Soltis, and J. J. Doyle [eds.], Molecular systematics of plants, 14–35. reads pipeline version 2.25 ( Straub et al., 2011 ) with the following options: per- Chapman and Hall, New York, New York, USA. cent identity = medium, minimum coverage depth = 5, and single nucleotide DOYLE , J. J. , AND J. L. DOYLE . 1987 . A rapid DNA isolation procedure polymorphism (SNP) minimum coverage depth = 25 with 80% of those reads for small quantities of fresh leaf tissue. Phytochemical Bulletin 19 : supporting the SNP. 
The resulting assemblies had an average depth of ~700× , an 11 – 15 . average of 0.79% bases that were masked for not reaching the minimum sequenc- ing depth of 5× , and an average N50 of 35,053 bp ( Table 1 ; contigs and ACE fi les EATON , D. A. R. , AND R. H. R EE . 2013 . Inferring phylogeny and intro- gression using RADseq data: An example from fl owering plants deposited in the Dryad Digital Repository: http://doi.org/10.5061/dryad.kc75n ; (P edicularis : Orobanchaceae). S ystematic Biology 62 : 689 – 706 . Uribe-Convers et al., 2014 ). We noticed a small decrease in sequencing depth in regions immediately adjacent to some primer sites, which is a phenomenon that GLENN , T. C. 2011 . Field guide to next-generation DNA sequencers. Molecular Ecology Resources 11 : 759 – 769 . has been reported in the past ( Whittall et al., 2010 ; Knaus et al., 2011 ; reviewed in Cronn et al., 2012 ). Given that our shortest overlap between amplicons is GRAHAM , S. W. , AND R. G. OLMSTEAD . 2000 . Utility of 17 chloroplast 135 bp (between regions 9 and 10; Table 2 ), with the rest spanning hundreds of genes for inferring the phylogeny of the basal angiosperms. A merican base pairs ( Table 2 ), and that our experiment yielded a high sequencing depth, we Journal of Botany 87 : 1712 – 1730 . had no problems calling bases unambiguously (99.99% on average, Table 1 ). HARISMENDY , O. , AND K . FRAZER . 2009 . Method for improving sequence The B artsia inaequalis Benth. assembly ( Fig. 1 ; GenBank accession no. coverage uniformity of targeted genomic intervals amplifi ed by KF922718) was annotated using DOGMA (W yman et al., 2004) and visualized LR-PCR using Illumina GA sequencing-by-synthesis technology. in GenomeVx ( Conant and Wolfe, 2008) . BioTechniques 46 : 229 – 231 . http://www.bioone.org/loi/apps 6 of 9 Applications in Plant Sciences 2014 2 ( 1 ): 1300063 Uribe-Convers et al.—Long PCR–based DNA enrichment doi:10.3732/apps.1300063 KNAUS , B. J. , R. C RONN , A . LISTON , K. 
PILGRIM , AND M. K. SCHWARTZ . for massively parallel sequencing of angiosperm plastid genomes. 2011 . Mitochondrial genome sequences illuminate maternal lin- Applications in Plant Sciences 1 : 1200497. doi:10.3732/apps.1200497 eages of conservation concern in a rare carnivore. BMC Ecology TENNESSEN , J. A. , R . G OVINDARAJULU , A. L ISTON , AND T .-L. ASHMAN . 2013 . 11 : 10 . doi:10.1186/1472-6785-11-10 Targeted sequence capture provides insight into genome structure and LEMMON , A. R. , J. M. BROWN , K. STANGER-HALL , AND E. M. LEMMON . 2009 . genetics of male sterility in a gynodioecious diploid strawberry, F ragaria The effect of ambiguous data on phylogenetic estimates obtained by vesca ssp. b racteata (Rosaceae). G 3; Genes|Genomes|Genetics 3 : maximum likelihood and Bayesian inference. S ystematic Biology 58 : 1341 – 1351 . 130 – 145 . URIBE-CONVERS , S . , J. R. DUKE , M . J. MOORE , AND D . C. TANK . 2014 . MOORE , M. J. , C. D. BELL , P. S. SOLTIS , AND D. E. SOLTIS . 2007 . Using Data from: A long PCR–based approach for DNA enrichment prior plastid genome-scale data to resolve enigmatic relationships among to next-generation sequencing for systematic studies. Dryad Digital basal angiosperms. P roceedings of the National Academy of Sciences, Repository. http://doi.org/10.5061/dryad.kc75n . USA 104 : 19363 – 19368 . WHITTALL , J. B. , J. SYRING , M. PARKS , J. BUENROSTRO , C. DICK , A. LISTON , AND MOORE , M. J. , P. S. SOLTIS , C. D. BELL , J. G. BURLEIGH , AND D. E. SOLTIS . R. CRONN . 2010 . Finding a (pine) needle in a haystack: Chloroplast 2010 . Phylogenetic analysis of 83 plastid genes further resolves the genome sequence divergence in rare and widespread pines. M olecular early diversifi cation of eudicots. P roceedings of the National Academy Ecology 19 ( Suppl 1 ): 100 – 114 . of Sciences, USA 107 : 4623 – 4628 . WU , F. , L. A. MUELLER , D. CROUZILLAT , V. PETIARD , AND S. D. TANKSLEY . PARKS , M. , R. CRONN , AND A. LISTON . 2009 . 
Increasing phylogenetic reso- 2006 . Combining bioinformatics and phylogenetics to identify large lution at low taxonomic levels using massively parallel sequencing sets of single-copy orthologous genes (COSII) for comparative, evo- of chloroplast genomes. BMC Biology 7 : 84 . doi:10.1186/1741-7007- lutionary and systematic studies: A test case in the euasterid plant 7-84 clade. Genetics 174 : 1407 – 1420 . STRAUB , S. C. K. , M. FISHBEIN , T. LIVSHULTZ , Z. FOSTER , M . PARKS , K. WYMAN , S . K. , R. K. JANSEN , AND J. L. BOORE . 2004 . Automatic annota- WEITEMIER , R. C. CRONN , AND A. LISTON . 2011 . Building a model: tion of organellar genomes with DOGMA. B ioinformatics (Oxford, Developing genomic resources for common milkweed ( Asclepias England) 20 : 3252 – 3255 . syriaca ) with low coverage genome sequencing. B MC Genomics 12 : XI , Z. , B. R. RUHFEL , H. SCHAEFER , A . M. AMORIM , M. SUGUMARAN , K. J. 211 . doi:10.1186/1471-2164-12-211 WURDACK , P. K. ENDRESS , ET AL . 2012 . Phylogenomics and a pos- STRAUB , S. C. K. , M. PARKS , K. WEITEMIER , M. FISHBEIN , R. C. CRONN , AND teriori data partitioning resolve the Cretaceous angiosperm radiation A. LISTON . 2012 . Navigating the tip of the genomic iceberg: Next- Malpighiales. Proceedings of the National Academy of Sciences, USA generation sequencing for plant systematics. A merican Journal of 109 : 17519 – 17524 . Botany 99 : 349 – 364 . YUAN , Y. , C . L IU , H. M ARX , AND R. O LMSTEAD . 2009 . The pentatricopep- STULL , G. W. , M. J. MOORE , V. S. MANDALA , N. A. DOUGLAS , H.-R. KATES , tide repeat (PPR) gene family, a tremendous resource for plant phylo- X. QI , S. F. BROCKINGTON , ET AL . 2013 . A targeted enrichment strategy genetic studies. N ew Phytologist 182 : 272 – 283 . http://www.bioone.org/loi/apps 7 of 9 Applications in Plant Sciences 2014 2 ( 1 ): 1300063 Uribe-Convers et al.—Long PCR–based DNA enrichment doi:10.3732/apps.1300063 APPENDIX 1. 
Protocol for long PCR for amplification of 4–20-kb targets. Developed by the Tank Laboratory, University of Idaho; published January 2014.

Product                                       Contents                                              Catalog no.
QIAGEN Taq DNA Polymerase¹                    250 units Taq DNA Polymerase, 10× PCR Buffer,†        201205
                                              5× Q-Solution, 25 mM MgCl2
QIAGEN HotStar HiFidelity DNA Polymerase²     100 units HotStar HiFidelity DNA Polymerase,          202602
                                              10× HotStar PCR Buffer, 5× Q-Solution, 25 mM MgSO4

¹ Almost any high-quality Taq polymerase should work; however, cheap Taq polymerases (e.g., QIAGEN TopTaq or Promega GoTaq) do not work and result in large smears rather than discrete bands.
² QIAGEN HotStar HiFidelity DNA Polymerase was the only high-fidelity polymerase used in this study.
† Q-Solution appears to be an important additive, hence the use of QIAGEN Taq. However, the protocol also works when Q-Solution is combined with other high-quality Taq polymerases, such as Promega's or New England Biolabs' standard Taq (i.e., if you have a stock of Q-Solution but no QIAGEN Taq).

Genomic DNA must be high quality. Run a 0.8% or 1% gel to check. Standard CTAB extractions from silica gel–dried or herbarium material work well if they (1) are recent (extraction and tissue), and (2) contain high-molecular-weight DNA. Most important, we have found that recent DNA extractions that have not been through numerous freeze-thaw cycles work best. For best results, long PCR should be done using new DNA extractions stored at 4°C while performing long PCR experiments. All preparations should be done on ice.

1. Number tubes or prepare plate. Make sure to include appropriate negative controls.

2. Prepare QIAGEN HotStar HiFidelity DNA polymerase dilution:

   Reagents to prepare the            Volumes for 25 reactions   Volumes for 50 reactions   Volumes for 100 reactions
   HotStar Taq dilution               (total 12.5 μL)            (total 25 μL)              (total 50 μL)
   5× HotStar HiFidelity PCR buffer   2.5 μL                     5.0 μL                     10 μL
   H2O                                9.0 μL                     18 μL                      36 μL
   QIAGEN HotStar Taq                 1.0 μL                     2.0 μL                     4.0 μL

3. Prepare cocktail:

   Cocktail                                                                  ×1 (25 μL reaction)
   10× PCR buffer (QIAGEN CoralLoad PCR Buffer or colorless, 15 mM MgCl2)    2.5 μL
   MgCl2 (25 mM)                                                             1.0 μL (3 mM final conc.; adjustable)
   dNTP (10 mM each)                                                         0.75 μL (3 μL of 2.5 mM each)
   Q-Solution (5×)                                                           5.0 μL
   5′ primer (5 μM)                                                          2.5 μL (0.5 μM final conc.)
   3′ primer (5 μM)                                                          2.5 μL (0.5 μM final conc.)
   Taq DNA polymerase (QIAGEN)                                               0.25 μL (1.25 units)¹
   QIAGEN HotStar DNA polymerase (diluted)                                   0.50 μL
   H2O                                                                       to 25 μL (9 μL if using 1.0 μL DNA)

   ¹ The success rate was lower when a smaller quantity was used, but the best DNAs work with ≥0.125 μL.

4. Add 1–2 μL of template to each of the tubes.

5. While the tubes/plate with template are on ice, add 24 μL of cocktail to each tube, being careful not to cross contaminate. Spin down to bring all liquid to the bottom of the tube.

6. Run appropriate long PCR profile. Generic temperatures and times are:
   i. 93°C infinity (important to go directly from ice to hot block)
   ii. 93°C for 3 min (initial denaturation)
   iii. 93°C for 15 s
   iv. 48–68°C for 30 s (Ta should be ~5°C below Tm of primers)
   v. 68°C for 5–20 min (1 min/kb of target)
   vi. go to step iii, 34×
   vii. 4°C infinity

7. Check reactions by running 2 μL on 1% agarose gel with appropriate size standards.

Primer combinations for long PCR amplification of the chloroplast genome.¹,²

Region no.   Approx. size (kb)   Primers (F/R)³    Primer sequence (5′–3′)
1            8                   trnH.GUG.6R       CCTTRATCCACTTGGCTACAT
                                 psbK.195R         ACTTACAGCAGCTTGCCAAAC
2            10.3                trnQ.UUG.50R      GGACGGAAGGATTCGAACC
                                 rpoC2.4805F       GYCGTATYGATTGGTTRAAAGG
2a⁴          6.3                 trnQ.UUG.50R      GGACGGAAGGATTCGAACC
                                 atpH.17F          CTGCYGCTTCYGTTATTGCT
2b⁴          4                   atpF.65R          CGGTATTAAACCCGAAACTCC
                                 rpoC2.4805F       GYCGTATYGATTGGTTRAAAGG
3            7                   atpI.705R         CRGCTAAAGTTGCAAAAATAAGAGCT
                                 rpoC1.1670F       GRGATCAAATGGCTGTTCAT
4            9                   rpoC2.520R        GTTCGTACAGCAGTATCYACAAC
                                 petN.3R           GCCCAAGCRAGACTTACTATATCC
5            10.5                trnC.GCA.47F      CCCAGTTCAAATCCGGGT
                                 psaB.2170F        GCRGCTTTCTTGATTGCYTC
6            10                  trnfM.CAU.21R     GGTTATGAGCCTTGCGAGCTA
                                 trnT.UGU.17F      GGTTAGAGCATCGCATTTGTAATG
7            10.3                rps4.380R         GGTTTGCARCGATAACTTGGKATATC
                                 rbcL.178R         GTCCATGTACCAGTAGARGATTC
8            9.2                 rbcL.2F           TGTCACCACAAACAGARACTAAAG
                                 psbJ.3F           GGCYGATACTACTGGAAGRAT
9            9.8                 petA.920F         CTTCAAGAYCCATTACGTGTHCAAG
                                 psbB.160R         TRCCYTGTCTCCACATTGGAT
10           10.9                psbB.3F           GGGTTTRCCTTGGTATCGTGT
                                 rps3.17F.new      ATCCACTTGGTTTYMGACTTGG
11           8.7                 rpl16.3R          AACCAACGAGTCACACACTAAGC
                                 ycf2.5100R        CAGATCATGAATGTTTGGAATCCAT
12           10                  ycf2.2300F        TCGGGATCCTRATGCATATAGATAC
                                 rps12.190F        GTTGCCAGAGTACGMTTAACCT
13⁵          11                  rps12.360R        CCCTTGTTGACGATCCTTTACTC
                                 ycf1.59R          CCGACCACAACGACCGAAT
14           11.2                trnN.GUU.7R       CCGCTCTACCACTGAGCTAC
                                 ndhA.535F         GCTGCTCAATCDATTAGTTATGAA
14′⁶         7                   trnR.ACG.15F      GAGGATTAGAGCACGTGG
                                 ccsA.890R         TCCAAGTAATAAANGCCCAAGTTTC
15           10.5                ndhI.194R         CGAACRCATACTTCACAAGCAA
                                 trnN.GUU.7R       CCGCTCTACCACTGAGCTAC
16           8.2                 psbA.640F         GCTATGCATGGTTCYTTGGTAAC
                                 ycf2.5100R        CAGATCATGAATGTTTGGAATCCAT

¹ Universal primers designed by M.J.M.; compiled and tested by D.C.T. and S.U.C.
² Ta should be ~5°C below Tm of primers; however, temperatures of 55°C have worked for all primer combinations.
³ The name of each primer consists of three parts: (1) the gene in which the primer is anchored, (2) the approximate position of the primer within that gene (based on all-angiosperm alignment per Moore et al., 2007), and (3) either an "F" or an "R." The F and R designations do not indicate that the primer should be used as a forward or reverse primer; rather, they indicate the 5′ to 3′ orientation of the primer with respect to the gene. In other words, a primer that is designated as an F primer has its 5′ to 3′ orientation in the same orientation as the gene (i.e., on the forward strand, or from start to stop), whereas an R primer is oriented in the direction opposite to the 5′ to 3′ orientation of the gene (i.e., on the reverse strand).
⁴ Regions 2a and 2b can be used to amplify region 2 in two pieces.
⁵ Regions 11, 12, and 13 represent a large portion of the inverted repeat (IR); thus, one amplification covers both IRa and IRb.
⁶ Region 14′ amplifies ca. 2/3 of region 14.
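As a rough illustration of how the cocktail (step 3) and the cycling profile (step 6) of the protocol above scale with the number of reactions and with amplicon size, the following Python sketch may be helpful. The per-reaction volumes, the 1 min/kb extension rule, the 48–68°C annealing window, and the 55°C default are taken from the protocol and the primer-table footnotes. The function names (`master_mix`, `cycling_profile`), the dictionary layout, and the 10% pipetting overage are hypothetical choices made for illustration only and are not part of the published protocol.

```python
"""Sketch: scale the long PCR cocktail and derive a cycling profile.

Volumes and temperatures are copied from the protocol text; names, data
layout, and the 10% overage are illustrative assumptions.
"""

REACTION_UL = 25.0  # final reaction volume in microliters (step 3)

# Per-reaction volumes in microliters (step 3), excluding water and template.
COCKTAIL_UL = {
    "10x PCR buffer": 2.5,
    "MgCl2 (25 mM)": 1.0,
    "dNTP (10 mM each)": 0.75,
    "Q-Solution (5x)": 5.0,
    "5' primer (5 uM)": 2.5,
    "3' primer (5 uM)": 2.5,
    "Taq DNA polymerase": 0.25,
    "HotStar polymerase (diluted)": 0.5,
}


def master_mix(n_reactions, template_ul=1.0, overage=0.10):
    """Return total volumes (uL) for n reactions that share one primer pair.

    `overage` is an assumed extra fraction to cover pipetting loss.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    if not 1.0 <= template_ul <= 2.0:
        raise ValueError("the protocol uses 1-2 uL of template per reaction")
    scale = n_reactions * (1.0 + overage)
    # Water fills the reaction up to 25 uL (9 uL when 1 uL of DNA is used).
    water_ul = REACTION_UL - template_ul - sum(COCKTAIL_UL.values())
    mix = {name: round(vol * scale, 2) for name, vol in COCKTAIL_UL.items()}
    mix["H2O"] = round(water_ul * scale, 2)
    return mix


def cycling_profile(target_kb, primer_tm_c=None):
    """Return the variable parts of the generic profile in step 6.

    Annealing is ~5 C below the primer Tm, kept within the 48-68 C window;
    without a Tm, 55 C is used (reported to work for all primer pairs).
    Extension is 1 min per kb of target at 68 C.
    """
    if primer_tm_c is None:
        anneal_c = 55.0
    else:
        anneal_c = min(68.0, max(48.0, primer_tm_c - 5.0))
    return {
        "denature_c": 93.0,
        "anneal_c": anneal_c,
        "extension_c": 68.0,
        "extension_min": target_kb * 1.0,
        "repeats": 34,  # "go to step iii, 34x"
    }


if __name__ == "__main__":
    # Example: eight reactions for region 2 (approx. 10.3 kb in the table).
    for reagent, volume in master_mix(8).items():
        print(f"{reagent:30s} {volume:7.2f} uL")
    print(cycling_profile(10.3))
```

Because each region uses its own primer pair, a mix computed this way applies to reactions that share one primer combination (for example, the same region amplified from several samples). Note that for targets shorter than 5 kb the 1 min/kb rule gives a shorter extension than the 5–20 min range listed in step 6.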