
Language Processing with Perl and Prolog: Theories, Implementation, and Application

675 Pages·2014·11.581 MB·


Cognitive Technologies

Pierre M. Nugues
Language Processing with Perl and Prolog
Theories, Implementation, and Application
Second Edition

Cognitive Technologies
Managing Editors: D. M. Gabbay, J. Siekmann
Editorial Board: A. Bundy, J. G. Carbonell, M. Pinkal, H. Uszkoreit, M. Veloso, W. Wahlster, M. J. Wooldridge
For further volumes: http://www.springer.com/series/5216

Pierre M. Nugues
Department of Computer Science
Lund University
Lund, Sweden

Managing Editors
Dov M. Gabbay, Augustus De Morgan Professor of Logic, Department of Computer Science, King’s College London, London, UK
Jörg Siekmann, Forschungsbereich Deduktions- und Multiagentensysteme, DFKI, Saarbrücken, Germany

ISSN 1611-2482, ISSN 2197-6635 (electronic)
Cognitive Technologies
ISBN 978-3-642-41463-3, ISBN 978-3-642-41464-0 (eBook)
DOI 10.1007/978-3-642-41464-0
Springer Heidelberg New York Dordrecht London
Library of Congress Control Number: 2014945101
© Springer-Verlag Berlin Heidelberg 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

À mes parents,
À Madeleine

Preface to the Second Edition

Eight years, from 2006 to 2014, is a very long time in computer science. The trends I described in the preface of the first edition have not only been confirmed, but accelerated. I tried to reflect this with a complete revision of the techniques exposed in this book: I redesigned or updated all the chapters, I introduced two new ones, and, most notably, I considerably expanded the sections using machine-learning techniques. To make place for them, I removed a few algorithms of lesser interest. This enabled me to keep the size of the book to ca. 700 pages. The programs and companion slides are available from the book website at http://ilppp.cs.lth.se/.

This book corresponds to a course in natural language processing offered at Lund University. I am grateful to all the students who took it and helped me write this new edition through their comments and questions. Curious readers can visit the course site at http://cs.lth.se/EDAN20/ and see how we use this book in a teaching context.
I would like to thank the many readers of the first edition who gave me feedback or reported errors, the anonymous copy editor of the first and second editions, Richard Johansson and Michael Covington for their suggestions, as well as Peter Exner, the PhD candidate I supervised during this period, for his enthusiasm. Special thanks go to Ronan Nugent, my editor at Springer, for his thorough review and copyediting along with his advice on style and content.

This preface would not be complete without a word to those who passed away, my aunt, Madeleine, and my father, Pierre. There is never a day I do not think of you.

Lund, Sweden
Pierre Nugues
April 2014

Preface

In the past 20 years, natural language processing and computational linguistics have considerably matured. The move has mainly been driven by the massive increase of textual and spoken data and the need to process them automatically. This dramatic growth of available data spurred the design of new concepts and methods, or their improvement, so that they could scale up from a few laboratory prototypes to proven applications used by billions of people. Concurrently, the speed and capacity of machines became an order of magnitude larger, enabling us to process gigabytes of data and billions of words in a reasonable time, to train, test, retrain, and retest algorithms like never before. Although systems entirely dedicated to language processing remain scarce, there are now scores of applications that, to some extent, embed language processing techniques.

The industry trend, as well as the user’s wishes, toward information systems able to process textual data has made language processing a new requirement for many computer science students. This has shifted the focus of textbooks from readers being mostly researchers or graduate students to a larger public, from readings by specialists to pragmatism and applied programming. Natural language processing techniques are not completely stable, however. They consist of a mix that ranges from well-mastered and routine to rapidly changing. This makes the existence of a new book an opportunity as well as a challenge.
This book tries to take on this challenge and find the right balance. It adopts a hands-on approach. It is a basic observation that many students have difficulties going from an algorithm exposed using pseudocode to a runnable program. I did my best to bridge the gap and provide the students with programs and ready-made solutions. The book contains real code the reader can study, run, modify, and run again. I chose to write examples in two languages to make the algorithms easy to understand and encode: Perl and Prolog.

One of the major driving forces behind the recent improvements in natural language processing is the increase of text resources and annotated data. The huge amount of texts made available by the Internet and never-ending digitization led many practitioners to evolve from theory-oriented, armchair linguists to frantic empiricists. This book attempts as well as it can to pay attention to this trend and
