Compendium of Plant Genomes

Maizura Ithnin, Ahmad Kushairi
Editors

The Oil Palm Genome

Series Editor
Chittaranjan Kole, Raja Ramanna Fellow, Government of India, ICAR-National Research Center on Plant Biotechnology, Pusa, New Delhi, India

Whole-genome sequencing is at the cutting edge of life sciences in the new millennium. Since the first genome sequencing of the model plant Arabidopsis thaliana in 2000, whole genomes of about 100 plant species have been sequenced and genome sequences of several other plants are in the pipeline. Research publications on these genome initiatives are scattered on dedicated web sites and in journals with all too brief descriptions. The individual volumes elucidate the background history of the national and international genome initiatives; public and private partners involved; strategies and genomic resources and tools utilized; enumeration on the sequences and their assembly; repetitive sequences; gene annotation and genome duplication. In addition, synteny with other sequences, comparison of gene families and most importantly potential of the genome sequence information for gene pool characterization and genetic improvement of crop plants are described.

Interested in editing a volume on a crop or model plant? Please contact Prof. C.
Kole, Series Editor, [email protected]

More information about this series at http://www.springer.com/series/11805

Editors
Maizura Ithnin, Advanced Biotechnology and Breeding Centre, Malaysian Palm Oil Board (MPOB), Kajang, Malaysia
Ahmad Kushairi, Malaysian Palm Oil Board (MPOB), Kajang, Malaysia

ISSN 2199-4781
ISSN 2199-479X (electronic)
Compendium of Plant Genomes
ISBN 978-3-030-22548-3
ISBN 978-3-030-22549-0 (eBook)
https://doi.org/10.1007/978-3-030-22549-0

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

This book series is dedicated to my wife Phullara and our children Sourav and Devleena
Chittaranjan Kole

Preface to the Series

Genome sequencing has emerged as the leading discipline in the plant sciences coinciding with the start of the new century. For much of the twentieth century, plant geneticists were only successful in delineating putative chromosomal location, function, and changes in genes indirectly through the use of a number of “markers” physically linked to them. These included visible or morphological, cytological, protein, and molecular or DNA markers. Among them, the first DNA marker, the RFLPs, introduced a revolutionary change in plant genetics and breeding in the mid-1980s, mainly because of their infinite number and thus potential to cover maximum chromosomal regions, phenotypic neutrality, absence of epistasis, and codominant nature. An array of other hybridization-based markers, PCR-based markers, and markers based on both, facilitated construction of genetic linkage maps, mapping of genes controlling simply inherited traits, and even gene clusters (QTLs) controlling polygenic traits in a large number of model and crop plants. During this period, a number of new mapping populations beyond F2 were utilized and a number of computer programmes were developed for map construction, mapping of genes, and mapping of polygenic clusters or QTLs. Molecular markers were also used in the studies of evolution and phylogenetic relationship, genetic diversity, DNA fingerprinting, and map-based cloning. Markers tightly linked to the genes were used in crop improvement employing the so-called marker-assisted selection. These strategies of molecular genetic mapping and molecular breeding made a spectacular impact during the last one and a half decades of the twentieth century.
But still they remained “indirect” approaches for elucidation and utilization of plant genomes since much of the chromosomes remained unknown and the complete chemical depiction of them was yet to be unravelled.

Physical mapping of genomes was the obvious consequence that facilitated the development of the “genomic resources” including BAC and YAC libraries to develop physical maps in some plant genomes. Subsequently, integrated genetic–physical maps were also developed in many plants. This led to the concept of structural genomics. Later on, emphasis was laid on EST and transcriptome analysis to decipher the function of the active gene sequences leading to another concept defined as functional genomics. The advent of techniques of bacteriophage gene and DNA sequencing in the 1970s was extended to facilitate sequencing of these genomic resources in the last decade of the twentieth century.

As expected, sequencing of chromosomal regions would have led to more data than the then-available computer software could store, characterize, and utilize. But the development of information technology made the life of biologists easier by leading to a swift and sweet marriage of biology and informatics, and a new subject was born: bioinformatics.

Thus, the evolution of the concepts, strategies, and tools of sequencing and bioinformatics reinforced the subject of genomics, both structural and functional. Today, genome sequencing has travelled much beyond biology and involves biophysics, biochemistry, and bioinformatics!

Thanks to the efforts of both public and private agencies, genome sequencing strategies are evolving very fast, leading to cheaper, quicker, and automated techniques right from clone-by-clone and whole-genome shotgun approaches to a succession of second-generation sequencing methods. The development of software of different generations facilitated this genome sequencing.
At the same time, newer concepts and strategies were emerging to handle sequencing of the complex genomes, particularly the polyploids.

It became a reality to chemically, and so directly, define plant genomes, popularly called whole-genome sequencing or simply genome sequencing.

The history of plant genome sequencing will always cite the sequencing of the genome of the model plant Arabidopsis thaliana in 2000 that was followed by sequencing the genome of the crop and model plant rice in 2002. Since then, the number of sequenced genomes of higher plants has been increasing exponentially, mainly due to the development of cheaper and quicker genomic techniques and, most importantly, the development of collaborative platforms such as national and international consortia involving partners from public and/or private agencies.

As I write this Preface for the first volume of the new series Compendium of Plant Genomes, a net search tells me that complete or nearly complete whole-genome sequencing of 45 crop plants, eight crop and model plants, eight model plants, 15 crop progenitors and relatives, and three basal plants is accomplished, the majority of which are in the public domain. This means that we nowadays know many of our model and crop plants chemically, i.e. directly, and we may depict them and utilize them more precisely than ever. Genome sequencing has covered all groups of crop plants. Hence, information on the precise depiction of plant genomes and the scope of their utilization is growing rapidly every day. However, the information is scattered in research articles and review papers in journals and dedicated Web pages of the consortia and databases. There is no compilation of plant genomes and of the opportunities for using the information in sequence-assisted breeding or further genomic studies. This is the underlying rationale for starting this book series, with each volume dedicated to a particular plant.
Plant genome science has emerged as an important subject in academia, and the present compendium of plant genomes will be highly useful to both students and teaching faculties. Most importantly, research scientists involved in genomics research will have access to systematic deliberations on the plant genomes of their interest. Elucidation of plant genomes is of interest not only for the geneticists and breeders, but also for practitioners of an array of plant science disciplines, such as taxonomy, evolution, cytology, physiology, pathology, entomology, nematology, crop production, biochemistry, and obviously bioinformatics. It must be mentioned that information regarding each plant genome is ever-growing. The contents of the volumes of this compendium are, therefore, focusing on the basic aspects of the genomes and their utility. They include information on the academic and/or economic importance of the plants, description of their genomes from a molecular genetic and cytogenetic point of view, and the genomic resources developed. Detailed deliberations focus on the background history of the national and international genome initiatives, public and private partners involved, strategies and genomic resources and tools utilized, enumeration on the sequences and their assembly, repetitive sequences, gene annotation, and genome duplication. In addition, synteny with other sequences, comparison of gene families, and, most importantly, the potential of the genome sequence information for gene pool characterization through genotyping by sequencing (GBS) and genetic improvement of crop plants have been described. As expected, there is a lot of variation of these topics in the volumes based on the information available on the crop, model, or reference plants.
I must confess that as the series editor, it has been a daunting task for me to work on such a huge and broad knowledge base that spans so many diverse plant species. However, pioneering scientists with lifetime experience and expertise on the particular crops did excellent jobs editing the respective volumes. I myself have been a small science worker on plant genomes since the mid-1980s, and that provided me the opportunity to personally know several stalwarts of plant genomics from all over the globe. Most, if not all, of the volume editors are my long-time friends and colleagues. It has been highly comfortable and enriching for me to work with them on this book series. To be honest, while working on this series I have been and will remain a student first, a science worker second, and a series editor last. And I must express my gratitude to the volume editors and the chapter authors for providing me the opportunity to work with them on this compendium.

I also wish to mention here my thanks and gratitude to the Springer staff, particularly Dr. Christina Eckey and Dr. Jutta Lindenborn for the earlier set of volumes and presently Ing. Zuzana Bernhart for all their timely help and support.

I always had to set aside additional hours to edit books beside my professional and personal commitments, hours I could and should have given to my wife, Phullara, and kids, Sourav and Devleena. I must mention that they not only allowed me the freedom to take away those hours from them but also offered their support in the editing job itself. I am really not sure whether my dedication of this compendium to them will suffice to do justice to their sacrifices for the interest of science and the science community.

New Delhi, India
Chittaranjan Kole

Preface

Genomics is a combined discipline of classical genetics and computational science.
Genomics aims at understanding the structure, function, evolution, genetic mapping, and genome editing of an organism using DNA sequence data. This book showcases significant breakthroughs and updates in oil palm genomic research and related fields. Meant for a wide spectrum of readers, a chapter on the history and economic importance of the crop is included. With the oil palm genome sequence, genes or markers associated with agronomic traits of interest were identified while new tools were developed. Certain tools were applied to estimate the genetic diversity of the oil palm. Such information is crucial to breeders in designing appropriate breeding schemes to enrich the narrow genetic base of current planting materials for sustainable development of the industry. The genetic control of economically important phenotypes, such as fruit forms, fruit colours, and tissue culture-related abnormalities, is among the principal outcomes of genomic research. Subsequent chapters highlight the introgression of wild germplasm into current genetic materials and the application of a modified reciprocal recurrent selection scheme in breeding programmes and in the production of improved planting materials. Genetic improvements were further enhanced by means of molecular cytogenetics tools and marker-assisted selection, developed from genome-wide association studies (GWAS) and genomic selection (GS). This approach allows palms carrying specific chromosomes and favourable QTLs to be selected more precisely for breeding programmes, leading to the development of elite planting materials. Such high-yielding materials with niche characteristics are tissue cultured to obtain a large number of uniform planting materials. An array of innovations was developed, including fine-tuning the propagation protocols to make them more user-friendly and efficient, with faster results. A genomics-derived quality control tool was developed and applied across tissue culture materials to minimize yield-affecting somaclonal variations.
Complementing the breeding techniques, genetic engineering is used to diversify palm oil applications vis-a-vis higher value-added products. Researchers are continuously working towards optimizing genetic transformation systems of the oil palm, and the challenges faced during the process are deliberated. The final chapter presents the state-of-the-art post-genomics tools such as transcriptomics, proteomics, and metabolomics, which are embraced as phenotyping tools to elucidate the mechanisms in fruit ripening and fatty acid synthesis, among others. On account of the indispensable need to unravel diseases in oil palm, post-genomic tools are exploited to advance knowledge in