
Algebraic and Discrete Mathematical Methods for Modern Biology PDF

365 Pages·2015·20.706 MB·English


Algebraic and Discrete Mathematical Methods for Modern Biology

Edited by
Raina S. Robeva
Department of Mathematical Sciences, Sweet Briar College, Sweet Briar, VA, USA

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Academic Press is an imprint of Elsevier

32 Jamestown Road, London NW1 7BY, UK
525 B Street, Suite 1800, San Diego, CA 92101-4495, USA
225 Wyman Street, Waltham, MA 02451, USA
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK

First edition 2015

Copyright © 2015 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

For information on all Academic Press publications visit our website at http://store.elsevier.com/

Printed and bound in the USA

ISBN: 978-0-12-801213-0

Contributors

Numbers in parentheses indicate the pages on which the authors’ contributions begin.

Réka Albert (65), Pennsylvania State University, University Park, PA, USA
Todd J. Barkman (261), Department of Biological Sciences, Western Michigan University, Kalamazoo, MI, USA
Mary Ann Blätke (141), Otto-von-Guericke University, Magdeburg, Germany
Hannah Callender (193, 217), University of Portland, Portland, OR, USA
Margaret (Midge) Cozzens (29), Rutgers University, Piscataway, NJ, USA
Kristina Crona (51), Department of Mathematics and Statistics, American University, 4400 Massachusetts Ave NW, Washington, DC 20016
Robin Davies (93, 321), Department of Biology, Sweet Briar College, Sweet Briar, VA, USA
Monika Heiner (141), Brandenburg University of Technology, Cottbus, Germany
Terrell L. Hodge (261), Department of Mathematics, Western Michigan University, Kalamazoo, MI, USA
Qijun He (93, 321), Department of Mathematical Sciences, Clemson University, Clemson, SC, USA
John R. Jungck (1), Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
Winfried Just (193, 217), Department of Mathematics, Ohio University, Athens, OH, USA
Bessie Kirkwood (237), Sweet Briar College, Sweet Briar, VA, USA
M. Drew LaMar (193, 217), The College of William and Mary, Williamsburg, VA, USA
Matthew Macauley (93, 321), Department of Mathematical Sciences, Clemson University, Clemson, SC, USA
Wolfgang Marwan (141), Otto-von-Guericke University, Magdeburg, Germany
David Murrugarra (121), Department of Mathematics, University of Kentucky, Lexington, KY, USA
Christian M. Reidys (347), Department of Mathematics and Computer Science, University of Southern Denmark, Odense M, Denmark
Raina Robeva (65), Sweet Briar College, Sweet Briar, VA, USA
Janet Steven (237), Christopher Newport University, Newport News, VA, USA
Blair R. Szymczyna (261), Department of Chemistry, Western Michigan University, Kalamazoo, MI, USA
Natalia Toporikova (193), Washington and Lee University, Lexington, VA, USA
Alan Veliz-Cuba (121), Department of Mathematics, University of Houston, and Department of BioSciences, Rice University, Houston, TX, USA
Rama Viswanathan (1), Beloit College, Beloit, WI, USA
Grady Weyenberg (293), Department of Statistics, University of Kentucky, Lexington, KY
Emilie Wiesner (51), Department of Mathematics, Ithaca College, 953 Danby Rd, Ithaca, NY 14850
Ruriko Yoshida (293), Department of Statistics, University of Kentucky, Lexington, KY

Preface

In the last 15 years, the field of modern biology has been transformed by the use of new mathematical methods, complementing and driving biological discoveries. Problems from gene regulatory networks and genomics, RNA folding, infectious disease and drug resistance modeling, phylogenetics, and ecological networks and food webs have increasingly benefited from the application of discrete mathematics and computational algebra. Modern algebra approaches have proved to be a natural fit for many problems where the use of traditional dynamical models built with differential equations is not appropriate or optimal.

While the use of modern algebra methods is now in the mainstream of mathematical biology research, this trend has been slow to influence the undergraduate mathematics and biology curricula, where difference and differential equation models still dominate. Several high-profile reports have been released in the past 5 years, including Refs.
[1–3], calling urgently for broadening the undergraduate exposure at the interface of mathematics and biology, and including methods from modern discrete mathematics and their biological applications. However, those reports have been slow to elicit the transformative change in the undergraduate curriculum that many of us had hoped for. The anemic response may be attributed to a relative lack of educational undergraduate resources that highlight the critical impact of algebraic and discrete mathematical methods on contemporary biology. It is this niche that our book seeks to fill.

The format of this volume follows that of our earlier book, Mathematical Concepts and Methods in Modern Biology: Using Modern Discrete Methods, Robeva and Hodge (Editors), published in 2013 by Academic Press. At the time of its planning, we considered the modular format of that text (with chapters largely independent from one another) experimental, but we felt reassured when the book was selected as 1 of 12 contenders for the 2013 Society of Biology Awards in its category. We have adopted the same format here, as we believe that it provides readers and instructors with the independence to choose biological topics and mathematical methods that are of greatest interest to them.

Due to the modular format, the order of the chapters in the volume does not necessarily imply an increased level of difficulty or the need for more prerequisites for the later chapters. When chapters are connected by a common biological thread, they are grouped together, but they can still be used independently. Each chapter begins with a question or a number of related questions from modern biology, followed by the description of certain mathematical methods and theory appropriate in the search of answers.
As in our earlier book, chapters can be viewed as fast-track pathways through the problem, which start by presenting the biological foundation, proceed by covering the relevant mathematical theory and presenting numerous examples, and end by highlighting connections with ongoing research and current publications. The level of presentation varies among chapters; some may be appropriate for introductory courses, while others may require more mathematical or biological background. Exercises are embedded within the text of each chapter, and their execution requires only material discussed up to that point. In addition, many chapters feature challenging open-ended questions (designated as projects) that provide starting points for explorations appropriate for undergraduate research, and supply references to relevant publications from the recent literature. In their most general form, some of the projects feature truly open questions in mathematical biology.

The book’s companion website (http://booksite.elsevier.com/9780128012130) contains solutions to the exercises, as well as all figures and relevant data files for the examples and exercises in the chapters. In addition, the site hosts software code, project guidelines, online supplements, appendices, and tutorials for selected chapters. The specialized software utilized throughout the book highlights the critical importance of computing applications for visualization, simulation, and analysis in modern biology. We have been careful to feature software that is in the mainstream of current mathematical biology research, while also being mindful of giving preference to freely available software.

We hope that the book will be a valuable resource to mathematics and biology programs, as it describes methods from discrete mathematics and modern algebra that can be presented, for the most part, at a level completely accessible to undergraduates.
Yet the book provides extensions and connections with research that would also be helpful to graduate students and researchers in the field. Some of the material would be appropriate for mathematics courses such as finite mathematics, discrete structures, linear algebra, abstract/modern algebra, graph theory, probability, bioinformatics, statistics, biostatistics, and modeling, as well as for biology courses such as genetics, cell and molecular biology, biochemistry, ecology, and evolution.

The selection of topics for the volume and the choice of contributors grew out of the workshop “Teaching Discrete and Algebraic Mathematical Biology to Undergraduates” organized by Raina Robeva, Matthew Macauley, and Terrell Hodge and funded and hosted by the Mathematical Biosciences Institute (MBI) on July 29-August 2, 2013 at The Ohio State University. The editor and contributors of this volume greatly appreciate the encouragement and assistance received from the MBI’s leadership and staff. Without their support, this volume would not have been possible. We also acknowledge with gratitude the support of the National Institute for Mathematical and Biological Synthesis (NIMBioS) in providing an opportunity to further test selected materials as part of the tutorial “Algebraic and Discrete Biological Models for Undergraduate Courses” offered on June 18-20, 2014 at NIMBioS.

I would like to express my personal thanks to all contributors who embraced the project early on and committed time and energy into producing the chapter modules for this unconventional textbook. Your enthusiasm for the project was remarkable, and you have my deep gratitude for the dedication and focus with which you carried it out. My special thanks also go to Daniel Hrozencik and Timothy Comar for providing feedback on a few of the chapter drafts. I am indebted to the editorial and production teams at Elsevier and particularly to the book’s editors, Paula Callaghan and Katey Birtcher, our editorial project managers, Sarah Watson and Amy Clark, and our production manager, Vijayaraj Purushothaman.
It has been a pleasure and a privilege to work with all of you. Finally, I would like to thank my husband, Boris Kovatchev, for his patience and support throughout.

Raina S. Robeva
October 20, 2014

REFERENCES

[1] Committee on a New Biology for the 21st Century: Ensuring the United States Leads the Coming Biology Revolution, Board on Life Sciences, Division on Earth and Life Studies, National Research Council. A new biology for the 21st century. Washington, DC: The National Academies Press; 2009.
[2] Brewer CA, Anderson CW (eds). Vision and change in undergraduate biology education: a call to action. Final report of a National Conference organized by the American Association for the Advancement of Science with support from the National Science Foundation, July 15-17, 2009, Washington, DC. The American Association for the Advancement of Science; 2011. http://visionandchange.org/files/2013/11/aaas-VISchange-web1113.pdf (accessed March 1, 2015).
[3] Committee on the Mathematical Sciences in 2025, Board on Mathematical Sciences and Their Applications, Division on Engineering and Physical Sciences, National Research Council. The mathematical sciences in 2025. Washington, DC: The National Academies Press; 2013.

Companion website: http://booksite.elsevier.com/9780128012130/

Supplementary Resources for Instructors

The website features the following additional resources available for download:
● All figures from the book
● Solutions to all exercises
● Computer code, data files, and links to software and materials carefully chosen to supplement the content of the textbook
● Appendices, tutorials, and additional projects for selected chapters

Chapter 1

Graph Theory for Systems Biology: Interval Graphs, Motifs, and Pattern Recognition

John R. Jungck¹ and Rama Viswanathan²
¹Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA, ²Beloit College, Beloit, WI, USA

1.1 INTRODUCTION

Systems thinking is perceived as an important contemporary challenge of education [1].
However, systems biology is an old and inclusive term that connotes many different subareas of biology. Historically two important threads were synchronic: (a) the systems ecology of the Odum school [2–4], which was developed in the context of engineering principles applied to ecosystems [5, 6], and (b) systems physiology that used mechanical principles [7] to understand organs as mechanical devices integrated into the circulatory system, digestive system, anatomical system, immune system, nervous system, etc. For example, the heart could be thought of as a pump, the kidney as a filter, the lung as a bellows, the brain as a wiring circuit (or later as a computer), elbow joints as hinges, and so on. It should be noted that both areas extensively employed ordinary and partial differential equations (ODEs and PDEs). Indeed, some systems physiologists argued that all mathematical biology should be based on the application of PDEs. On the other hand, evolutionary biologists argued that these diachronic systems approaches too often answered only “how” questions that investigated optimal design principles and did not address “why” questions focusing on the constraints of historical contingency.

Not surprisingly, one of the leading journals in the field, Frontiers in Systems Biology, announces in its mission statement [8], “Contrary to the reductionist paradigm commonly used in Molecular Biology, in Systems Biology the understanding of the behavior and evolution of complex biological systems need not necessarily be based on a detailed molecular description of the interactions between the system’s constituent parts.” Therefore, in this chapter we emphasize two major macroscopic and global aspects of contemporary systems biology: (i) the graph-theoretic relationships between components in networks and (ii) the relationship of these patterns to the historical contingencies of evolutionary constraints. Numerous articles and several books [9, 10] exist on graph theory and its application to systems biology, so the reader may ask what are we doing in this chapter that is different.
Our main purpose is to help biologists, mathematicians, students, and researchers recognize which graph-theoretic tools are appropriate for different kinds of questions, including quantitative analyses of interactions for mining large data sets, visualizing complex relationships, modeling the consequences of perturbation of networks, and testing hypotheses.

Every network construct in systems biology is a hypothesis. For example, Rios and Vendruscolo [11] describe the network hypothesis as the assumption “according to which it is possible to describe a cell through the set of interconnections between its component molecules.” They then conclude, “it becomes convenient to focus on these interactions rather than on the molecules themselves to describe the functioning of the cell.” In this chapter, we go a step further. We believe that a mathematical biology perspective also studies such questions as: Which molecules are involved? What do they do functionally? What is their three-dimensional structure? Where are they located in a cell? We stress that every network and pathway that we discuss is a useful construct from a biological perspective. They do not exist per se inside of cells. Imagine a series of biological macromolecules (proteins, nucleic acids, polysaccharides) that are crowded and colliding with one another in a suspension. The networks and pathways for the interactions between these molecules constructed by biologists may represent preferred associations defined by tighter bindings of specific macromolecules or the product of a reaction catalyzed by one macromolecule (an enzyme) as the starting material (substrate) of another enzyme. Thus, biologists have already drawn mathematical diagrams and graphs in the sense that they have abstracted, generalized, and symbolized a set of relationships.

Algebraic and Discrete Mathematical Methods for Modern Biology. http://dx.doi.org/10.1016/B978-0-12-801213-0.00001-0
Too often biologists produce networks as visualizations without further analysis. In this chapter, using Excel and Java-based software that we have developed, we show readers how to make mathematical measurements (average degree, diameter, clustering coefficient, etc.) and discern holistic properties (small world versus scale-free, see Hayes [12] for a complete overview) of the networks being studied and visualized, and obtain insights that are relevant and meaningful in the context of systems biology. We show how the network hypothesis can be investigated by complementary and supplementary mathematical and biological perspectives to yield key insights and help direct and inform additional research.

Palsson [10] suggests that twenty-first century biology will focus less on the reductionist study of components and more on the integration of systems analysis. He identifies four principles in his “systems biology paradigm”: “First, the list of biological components that participated in the process of interest is enumerated. Second, the interactions between these components are studied and the ‘wiring diagrams’ of genetic circuits are constructed…. Third, reconstructed network[s] are described mathematically and their properties analyzed…. Fourth, the models are used to analyze, interpret, and predict biological experimental outcomes.” Here, we assume that the first two steps exist in databases or published articles; this allows us to focus on the mathematics of the third step as a way that allows biologists to better direct their work on the fourth step. Thus, the goals for this chapter are as follows.

● Learn how graph theory can be used to help obtain meaningful insights into complex biological data sets.
● Analyze complex biological networks of diverse types (restriction maps, food webs, gene expression, disease etiology) to detect patterns of relationships.
● Visualize ordering of modules/motifs within complex biological networks by first testing the applicability of simple linear approaches (interval graphs).
● Demonstrate that even when strict mathematical assumptions do not apply fully to a given biological data set, there is still benefit in applying an analytical approach because of the power of the human mind to discern prominent patterns in data rearranged through the application of mathematical transformations.
● Show that the visualizations help biologists obtain insights into their data, examine the significance of outliers, mine databases for additional information about observed associations, and plan further experiments.

To accomplish this, we first emphasize how graph theory is a natural fit for biological investigations of relationships, patterns, and complexity. Second, graph theory lends itself easily to questions about what biologists should be looking for among representations of relationships. We introduce concepts of hubs, maximal cliques, motifs, clusters, interval graphs, complementary graphs, ordering, transitivity, Hamiltonian circuits, and consecutive ones in adjacency matrices. Finally, graph theory helps us interrogate why these relationships are occurring. Basically, we examine the triptych of form, function, and phylogeny to differentiate between evolutionary and engineering constraints.

The chapter is structured as follows. We begin by introducing some background concepts from graph theory that will be utilized later in the chapter. We then introduce interval graphs through two biological examples related to chromosome sequencing and food webs. The rest of the chapter is devoted to two extended examples of biological questions related to recently published studies on gene expression and disease etiology. The analyses for those examples demonstrate how graph theory can help illuminate concepts of biological importance. Each of the examples is followed by suggestions for open-ended projects in pursuit of similar analyses of related biological questions and data.

1.2 REVISUALIZING, RECOGNIZING, AND REASONING ABOUT RELATIONSHIPS

Graph theory has enormous applicability to biology.
It is particularly powerful in this era of terabytes of data because it allows a tremendous topological reduction in complexity and investigation of patterns. The applications of graph theory in mathematical biology, several of which are illustrated in this chapter, include subcellular localization of coordinated metabolic processes, identification of hubs central to such processes and the links between them, analysis of flux in a system, temporal organization of gene expression, the identification of drug targets, determination of the role of proteins or genes of unknown function, and coordination of sequences of signals. Medical applications include diagnosis, prognosis, and treatment. We will see below that by reducing a biological exploration to a relevant graph representation, we are able to examine, study, and measure various quantitative and meta-properties of the resulting graph and to obtain insights into why particular biological processes such as gene-gene, protein-protein, signal-detector-effector, and predator-prey interactions occur.

1.2.1 Basic Concepts from Graph Theory

A graph in mathematics is a collection of vertices connected by edges. Graphs are often used in biology to represent networks and, more generally, to represent relationships between objects. The objects of interest are the vertices of the network, usually depicted as geometrical shapes such as dots, circles, or squares, while the connections between them are represented by the edges. In an applied context, the vertices are generally labeled. Vertices u and v that are directly connected by an edge are called adjacent vertices or neighbors. A subgraph that consists of all vertices adjacent to a vertex u and all edges connecting any two such vertices forms the neighborhood of the vertex u. For each graph, one can construct its complementary graph: a graph that has the same vertices as the original graph, but such that vertices u and v are adjacent in the complementary graph if and only if u and v are not adjacent in the original graph.

If a vertex u is related to itself, the edge connecting u with itself is called a loop.
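The definitions of adjacency, neighborhood, and complementary graph given so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is undirected and has no loops, it is stored as a dictionary mapping each vertex to the set of its neighbors, and the function names and the four-vertex example are hypothetical, not taken from the chapter or its companion software.

```python
from itertools import combinations


def make_graph(vertices, edges):
    """Build an undirected, loop-free graph as {vertex: set of neighbors}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def neighborhood(adj, u):
    """Subgraph on the vertices adjacent to u, keeping only edges among them."""
    nbrs = adj[u]
    return {v: adj[v] & nbrs for v in nbrs}


def complement(adj):
    """Same vertices; u and v are adjacent iff they are NOT adjacent in adj."""
    comp = {v: set() for v in adj}
    for u, v in combinations(adj, 2):
        if v not in adj[u]:
            comp[u].add(v)
            comp[v].add(u)
    return comp


# Hypothetical four-vertex example (labels are placeholders, not real data).
g = make_graph("ABCD", [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
print(neighborhood(g, "A"))  # B and C, joined by the edge B-C
print(complement(g))         # only the edges A-D and B-D remain
```

The dictionary-of-sets layout is chosen here because adjacency tests and set intersections are direct; an adjacency matrix would serve equally well for small networks.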
A path is a sequence of edges connecting neighboring vertices and the length of a path is the number of edges it uses. Loops could be considered paths of length 1 that start and end in the same vertex. We say that a vertex u in a graph is connected to vertex v if there is a path from u to v. An undirected graph is a connected graph if a path exists from every vertex to every other vertex. Otherwise, the graph is disconnected. A Hamiltonian path is a path that goes through all vertices in the graph and visits each vertex exactly once. A graph in which any two vertices are connected by a unique path is called a tree.

If there is directional dependence (e.g., “u activates v”; “u is the parent of v” as opposed to “u and v are friends”), then the direction is represented by an arrow. Graphs with directional dependencies are called directed graphs. Paths in directed graphs must follow the direction of the edges. The number of edges connected to a vertex u represents the degree of the vertex (loops are usually counted twice). In a directed graph, a vertex is called a source when all of its edges are outgoing edges; it is called a sink when all of its edges are incoming edges. The in-degree of a vertex is the number of incoming edges to the vertex and the out-degree is defined by the number of outgoing edges. Thus, the degree of each vertex in a directed graph is the sum of the in-degree and out-degree. Vertices with degrees among the top 5% in a network are often characterized as hubs. As hubs have a large number of neighbors, they often perform important roles in many biological networks.

Additional graph-theoretical definitions and properties that we will use in a substantive way in the chapter are:

● Clique: a subgraph in which every vertex is connected by an edge to every other vertex in the subgraph; a maximal clique is a clique that cannot be extended by including an additional adjacent vertex; in other words, a maximal clique is a clique that is not a subset of a larger clique.
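As a rough illustration of the degree and clique definitions above, the following Python sketch computes in-degree and out-degree, finds sources and sinks, and tests whether a vertex set is a clique or a maximal clique. It rests on assumptions that are not from the chapter: directed edges are given as (tail, head) pairs, isolated vertices are counted as neither sources nor sinks, the undirected graph is loop-free, and all names and the example network are hypothetical.

```python
from itertools import combinations


def degrees(vertices, arcs):
    """Return (in_degree, out_degree) dictionaries for a directed graph."""
    in_deg = {v: 0 for v in vertices}
    out_deg = {v: 0 for v in vertices}
    for tail, head in arcs:
        out_deg[tail] += 1
        in_deg[head] += 1
    return in_deg, out_deg


def sources_and_sinks(vertices, arcs):
    """Sources have only outgoing edges; sinks have only incoming edges."""
    in_deg, out_deg = degrees(vertices, arcs)
    sources = {v for v in vertices if in_deg[v] == 0 and out_deg[v] > 0}
    sinks = {v for v in vertices if out_deg[v] == 0 and in_deg[v] > 0}
    return sources, sinks


def is_clique(adj, subset):
    """True if every pair of vertices in subset is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(subset, 2))


def is_maximal_clique(adj, subset):
    """True if subset is a clique and no outside vertex is adjacent to all of it."""
    subset = set(subset)
    if not is_clique(adj, subset):
        return False
    return not any(subset <= adj[w] for w in adj if w not in subset)


# Hypothetical regulatory example: an arc (u, v) reads "u activates v".
vertices = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
in_deg, out_deg = degrees(vertices, arcs)
total = {v: in_deg[v] + out_deg[v] for v in vertices}  # degree = in + out

# Undirected version of the same network, used for the clique tests.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

In this example, `{"a", "b"}` is a clique but not a maximal one, because `c` is adjacent to both of its vertices; `{"a", "b", "c"}` is maximal. Ranking vertices by `total` and keeping the top 5% would give a first approximation of the hubs described in the text.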
