An Introduction to Pattern Recognition: A MATLAB Approach


Academic Press is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, UK

Copyright © 2010 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

MATLAB® is a trademark of The MathWorks, Inc., and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

Library of Congress Cataloging-in-Publication Data

Introduction to pattern recognition : a MATLAB® approach / Sergios Theodoridis ... [et al.].
p. cm.
"Complement of the book Pattern recognition, 4th edition, by S. Theodoridis and K. Koutroumbas, Academic Press, 2009."
Includes bibliographical references and index.
ISBN (alk. paper)
1. Pattern recognition systems. 2. Pattern recognition systems–Mathematics. 3. Numerical analysis. 4. MATLAB.
I. Theodoridis, Sergios, 1951–. II. Theodoridis, Sergios, Pattern recognition.
TK7882.P3 I65 2010
006.4–dc22
2010004354

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

For information on all Academic Press publications visit our website at www.elsevierdirect.com

Printed in the United States
10 11 12 13 14    10 9 8 7 6 5 4 3 2 1

Preface

The aim of this book is to serve pedagogic goals as a complement of the book Pattern Recognition, 4th Edition, by S. Theodoridis and K. Koutroumbas (Academic Press, 2009). It is the offspring of our experience in teaching pattern recognition for a number of years to different audiences, such as students with a good enough mathematical background, students who are more practice-oriented, professional engineers, and computer scientists attending short training courses. The book unravels along two directions.
The first is to develop a set of MATLAB-based examples so that students will be able to experiment with methods and algorithms met in the various stages of designing a pattern recognition system—that is, classifier design, feature selection and generation, and system evaluation. To this end, we have made an effort to "design" examples that will help the reader grasp the basics behind each method as well as the respective pros and cons. In pattern recognition, there are no magic recipes that dictate which method or technique to use for a specific problem. Very often, good old and conceptually simple techniques can compete, from an efficiency point of view, with more modern and elaborate techniques. To this end, that is, selecting the most appropriate technique, it is unfortunate that these days more and more people follow the so-called black-box approach: try different techniques, using a related software package to play with a set of parameters, even if the real meaning of these parameters is not really understood.

Such an "unscientific" approach, which really prevents thought and creation, also deprives the "user" of the ability to understand, explain, and interpret the obtained results. For this reason, most of the examples in this book use simulated data. Hence, one can experiment with different parameters and study the behavior of the respective method/algorithm. Having control of the data, readers will be able to "study," "investigate," and get familiar with the pros and cons of a technique. One can create data that push a technique to its "limits"—that is, where it fails. In addition, most real-life problems are solved in high-dimensional spaces, where visualization is impossible; yet visualizing geometry is one of the most powerful and effective pedagogic tools, so that a newcomer to the field will be able to "see" how various methods work. The three-dimensional space is one of the most primitive and deep-rooted experiences in the human brain, because everyone is acquainted with it and has a good feeling about and understanding of it.

The second direction is to provide a summary of the related theory, without mathematics. We have made an effort, as much as possible, to present the theory using arguments based on physical reasoning, as well as to point out the role of the various parameters and how they can influence the performance of a method/algorithm. Nevertheless, for a more thorough understanding, the mathematical formulation cannot be bypassed. It is "there" where the real worthy secrets of a method are, where the deep understanding has its undisputable roots and grounds, where science lies. Theory and practice are interrelated—one cannot be developed without the other. This is the reason that we consider this book a complement of the previously published one. We consider it another branch leaning toward the practical side, the other branch being the more theoretical one. Both branches are necessary to form the pattern-recognition tree, which has its roots in the work of hundreds of researchers who have tirelessly contributed, over a number of decades, both in theory and practice.

All the MATLAB functions used throughout this book can be downloaded from the companion website for this book at www.elsevierdirect.com/9780123744869. Note that, when running the MATLAB code in the book, the results may vary slightly among different versions of MATLAB. Moreover, we have made an effort to minimize dependencies on MATLAB toolboxes, as much as possible, and have developed our own code.

Also, in spite of the careful proofreading of the book, it is still possible that some typos may have escaped. The authors would appreciate readers notifying them of any that are found, as well as suggestions related to the MATLAB code.
CHAPTER 1

Classifiers Based on Bayes Decision Theory

1.1 INTRODUCTION

In this chapter, we discuss techniques inspired by Bayes decision theory. The theoretical developments of the associated algorithms were given in [Theo09, Chapter 2]. To the newcomer in the field of pattern recognition, the chapter's algorithms and exercises are very important for developing a basic understanding and familiarity with some fundamental notions associated with classification. Most of the algorithms are simple in both structure and physical reasoning.

In a classification task, we are given a pattern and the task is to classify it into one out of c classes. The number of classes, c, is assumed to be known a priori. Each pattern is represented by a set of feature values, x(i), i = 1, 2, ..., l, which make up the l-dimensional feature vector¹ x = [x(1), x(2), ..., x(l)]^T ∈ R^l. We assume that each pattern is represented uniquely by a single feature vector and that it can belong to only one class.

Given x ∈ R^l and a set of c classes, ω_i, i = 1, 2, ..., c, the Bayes theory states that

    P(ω_i|x) p(x) = p(x|ω_i) P(ω_i)    (1.1)

where

    p(x) = Σ_{i=1}^{c} p(x|ω_i) P(ω_i)

where P(ω_i) is the a priori probability of class ω_i, i = 1, 2, ..., c; P(ω_i|x) is the a posteriori probability of class ω_i given the value of x; p(x) is the probability density function (pdf) of x; and p(x|ω_i), i = 1, 2, ..., c, is the class conditional pdf of x given ω_i (sometimes called the likelihood of ω_i with respect to x).

¹ In contrast to [Theo09], vector quantities are not boldfaced here, in compliance with MATLAB notation.

1.2 BAYES DECISION THEORY

We are given a pattern whose class label is unknown and we let x ≡ [x(1), x(2), ..., x(l)]^T ∈ R^l be its corresponding feature vector, which results from some measurements. Also, we let the number of possible classes be equal to c, that is, ω_1, ..., ω_c.

According to the Bayes decision theory, x is assigned to the class ω_i if

    P(ω_i|x) > P(ω_j|x), ∀j ≠ i    (1.2)

or, taking into account Eq. (1.1) and given that p(x) is positive and the same for all classes, if

    p(x|ω_i) P(ω_i) > p(x|ω_j) P(ω_j), ∀j ≠ i    (1.3)

Remark

• The Bayesian classifier is optimal in the sense that it minimizes the probability of error [Theo09, Chapter 2].

1.3 THE GAUSSIAN PROBABILITY DENSITY FUNCTION

The Gaussian pdf [Theo09, Section 2.4.1] is extensively used in pattern recognition because of its mathematical tractability as well as because of the central limit theorem. The latter states that the pdf of the sum of a number of statistically independent random variables tends to the Gaussian one as the number of summands tends to infinity. In practice, this is approximately true for a large enough number of summands.

The multidimensional Gaussian pdf has the form

    p(x) = (1 / ((2π)^{l/2} |S|^{1/2})) exp( −(1/2) (x − m)^T S^{−1} (x − m) )    (1.4)

where m = E[x] is the mean vector, S is the covariance matrix defined as S = E[(x − m)(x − m)^T], and |S| is the determinant of S.

Often we refer to the Gaussian pdf as the normal pdf, and we use the notation N(m, S). For the 1-dimensional case, x ∈ R, the above becomes

    p(x) = (1 / (√(2π) σ)) exp( −(x − m)² / (2σ²) )    (1.5)

where σ² is the variance of the random variable x.

Example 1.3.1. Compute the value of a Gaussian pdf, N(m, S), at x_1 = [0.2, 1.3]^T and x_2 = [2.2, −1.3]^T, where

    m = [0, 1]^T,    S = [1 0; 0 1]

Solution. Use the function comp_gauss_dens_val to compute the value of the Gaussian pdf. Specifically, type

    m=[0 1]'; S=eye(2);
    x1=[0.2 1.3]'; x2=[2.2 -1.3]';
    pg1=comp_gauss_dens_val(m,S,x1);
    pg2=comp_gauss_dens_val(m,S,x2);

The resulting values for pg1 and pg2 are 0.1491 and 0.001, respectively.
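The listing of comp_gauss_dens_val belongs to the book's companion code and is not reproduced in this text. As a rough sketch, an implementation consistent with Eq. (1.4) and with the calling convention above could look as follows; the body is our assumption, not the companion listing:

    function pg = comp_gauss_dens_val(m, S, x)
    % Sketch (not the companion listing): evaluate the Gaussian pdf
    % N(m,S) of Eq. (1.4) at the point x.
    % m: l-by-1 mean vector, S: l-by-l covariance matrix, x: l-by-1 point.
    l = length(m);
    pg = (1/((2*pi)^(l/2)*sqrt(det(S)))) * exp(-0.5*(x-m)'*(S\(x-m)));
    end

Running this with the values of Example 1.3.1 should reproduce pg1 = 0.1491 and pg2 = 0.001 to the precision quoted above.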
Example 1.3.2. Consider a 2-class classification task in the 2-dimensional space, where the data in both classes, ω_1 and ω_2, are distributed according to the Gaussian distributions N(m_1, S_1) and N(m_2, S_2), respectively. Let

    m_1 = [1, 1]^T,    m_2 = [3, 3]^T,    S_1 = S_2 = [1 0; 0 1]

Assuming that P(ω_1) = P(ω_2) = 1/2, classify x = [1.8, 1.8]^T into ω_1 or ω_2.

Solution. Utilize the function comp_gauss_dens_val by typing

    P1=0.5; P2=0.5;
    m1=[1 1]'; m2=[3 3]';
    S=eye(2);
    x=[1.8 1.8]';
    p1=P1*comp_gauss_dens_val(m1,S,x);
    p2=P2*comp_gauss_dens_val(m2,S,x);

The resulting values for p1 and p2 are 0.042 and 0.0189, respectively, and x is classified to ω_1 according to the Bayesian classifier.

Exercise 1.3.1
Repeat Example 1.3.2 for P(ω_1) = 1/6 and P(ω_2) = 5/6, and for P(ω_1) = 5/6 and P(ω_2) = 1/6. Observe the dependence of the classification result on the a priori probabilities [Theo09, Section 2.4.2].

Example 1.3.3. Generate N = 500 2-dimensional data points that are distributed according to the Gaussian distribution N(m, S), with mean m = [0, 0]^T and covariance matrix

    S = [σ_1²  σ_12; σ_12  σ_2²]

for the following cases:

    σ_1² = σ_2² = 1, σ_12 = 0
    σ_1² = σ_2² = 0.2, σ_12 = 0
    σ_1² = σ_2² = 2, σ_12 = 0
    σ_1² = 0.2, σ_2² = 2, σ_12 = 0
    σ_1² = 2, σ_2² = 0.2, σ_12 = 0
    σ_1² = σ_2² = 1, σ_12 = 0.5
    σ_1² = 0.3, σ_2² = 2, σ_12 = 0.5
    σ_1² = 0.3, σ_2² = 2, σ_12 = −0.5

Plot each data set and comment on the shape of the clusters formed by the data points.

Solution. To generate the first data set, use the built-in MATLAB function mvnrnd by typing

    randn('seed',0) %Initialization of the randn function
    m=[0 0]';
    S=[1 0;0 1];
    N=500;
    X = mvnrnd(m,S,N)';

where X is the matrix that contains the data vectors in its columns.

To ensure reproducibility of the results, the randn MATLAB function, which generates random numbers following the Gaussian distribution with zero mean and unit variance, is initialized to a specific number via the first command (in the previous code, randn is called by the mvnrnd MATLAB function).

To plot the data set, type

    figure(1), plot(X(1,:),X(2,:),'.');
    figure(1), axis equal
    figure(1), axis([-7 7 -7 7])

Working similarly for the second data set, type

    m=[0 0]';
    S=[0.2 0;0 0.2];
    N=500;
    X = mvnrnd(m,S,N)';
    figure(2), plot(X(1,:),X(2,:),'.');
    figure(2), axis equal
    figure(2), axis([-7 7 -7 7])

The rest of the data sets are obtained similarly.
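Since the eight cases differ only in their covariance matrices, all of the data sets can also be produced with a single loop. The following is our own convenience sketch in the style of the book's code; the array Svals, which holds [σ_1², σ_2², σ_12] per row, is not part of the companion code:

    % Sketch (our own code): generate and plot all eight data sets of
    % Example 1.3.3 in one loop. Each row of Svals is one covariance case.
    randn('seed',0)   % same initialization as above, for reproducibility
    m=[0 0]'; N=500;
    Svals=[1 1 0; 0.2 0.2 0; 2 2 0; 0.2 2 0; 2 0.2 0; ...
           1 1 0.5; 0.3 2 0.5; 0.3 2 -0.5];
    for k=1:size(Svals,1)
        S=[Svals(k,1) Svals(k,3); Svals(k,3) Svals(k,2)];
        X = mvnrnd(m,S,N)';   % data vectors in the columns of X
        figure(k), plot(X(1,:),X(2,:),'.');
        figure(k), axis equal
        figure(k), axis([-7 7 -7 7])
    end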
All of the data sets are depicted in Figure 1.1, from which one can observe the following:

[FIGURE 1.1: Eight data sets of Example 1.3.3, panels (a)–(h), one per covariance case; each panel plots the 500 points on axes running from −7 to 7.]

• When the two coordinates of x are uncorrelated (σ_12 = 0) and their variances are equal, the data vectors form "spherically shaped" clusters (Figure 1.1(a–c)).

• When the two coordinates of x are uncorrelated (σ_12 = 0) and their variances are unequal, the data vectors form "ellipsoidally shaped" clusters. The coordinate with the highest variance corresponds to the "major axis" of the ellipsoidally shaped cluster, while the coordinate with the lowest variance corresponds to its "minor axis." In addition, the major and minor axes of the cluster are parallel to the axes (Figure 1.1(d, e)).

• When the two coordinates of x are correlated (σ_12 ≠ 0), the major and minor axes of the ellipsoidally shaped cluster are no longer parallel to the axes. The degree of rotation with respect to the axes depends on the value of σ_12 (Figure 1.1(f–h)). The effect of the sign of σ_12, whether positive or negative, is demonstrated in Figure 1.1(g, h). Finally, as can be seen by comparing Figure 1.1(a, f), when σ_12 ≠ 0, the data form ellipsoidally shaped clusters despite the fact that the variances of each coordinate are the same.

1.4 MINIMUM DISTANCE CLASSIFIERS

1.4.1 The Euclidean Distance Classifier

The optimal Bayesian classifier is significantly simplified under the following assumptions:

• The classes are equiprobable.
• The data in all classes follow Gaussian distributions.
• The covariance matrix is the same for all classes.
• The covariance matrix is diagonal and all elements across the diagonal are equal. That is, S = σ²I, where I is the identity matrix.

Under these assumptions, it turns out that the optimal Bayesian classifier is equivalent to the minimum Euclidean distance classifier. That is, given an unknown x, assign it to class ω_i if

    ||x − m_i|| ≡ √((x − m_i)^T (x − m_i)) < ||x − m_j||, ∀j ≠ i

It must be stated that the Euclidean classifier is often used, even if we know that the previously stated assumptions are not valid, because of its simplicity. It assigns a pattern to the class whose mean is closest to it with respect to the Euclidean norm.

1.4.2 The Mahalanobis Distance Classifier

If one relaxes the assumptions required by the Euclidean classifier and removes the last one, the one requiring the covariance matrix to be diagonal and with equal elements, the optimal Bayesian classifier becomes equivalent to the minimum Mahalanobis distance classifier. That is, given an unknown x, it is assigned to class ω_i if

    √((x − m_i)^T S^{−1} (x − m_i)) < √((x − m_j)^T S^{−1} (x − m_j)), ∀j ≠ i

where S is the common covariance matrix. The presence of the covariance matrix accounts for the shape of the Gaussians [Theo09, Section 2.4.2].

Example 1.4.1. Consider a 2-class classification task in the 3-dimensional space, where the two classes, ω_1 and ω_2, are modeled by Gaussian distributions with means m_1 = [0, 0, 0]^T and m_2 = [0.5, 0.5, 0.5]^T, respectively. Assume the two classes to be equiprobable. The covariance matrix for both distributions is

    S = [0.8 0.01 0.01; 0.01 0.2 0.01; 0.01 0.01 0.2]

Given the point x = [0.1, 0.5, 0.1]^T, classify x (1) according to the Euclidean distance classifier and (2) according to the Mahalanobis distance classifier. Comment on the results.
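The two decision rules of Sections 1.4.1 and 1.4.2 can be evaluated directly for these numbers. The sketch below is our own code, not the book's companion functions; each rule assigns x to the class whose mean yields the smaller distance:

    % Sketch (our own code): Euclidean and Mahalanobis distance rules
    % applied to the data of Example 1.4.1.
    m1=[0 0 0]'; m2=[0.5 0.5 0.5]';
    S=[0.8 0.01 0.01; 0.01 0.2 0.01; 0.01 0.01 0.2];
    x=[0.1 0.5 0.1]';
    dE1=norm(x-m1); dE2=norm(x-m2);        % Euclidean distances to m1, m2
    dM1=sqrt((x-m1)'*(S\(x-m1)));          % Mahalanobis distances to m1, m2
    dM2=sqrt((x-m2)'*(S\(x-m2)));
    % Each rule assigns x to the class with the smaller distance.

Because the Mahalanobis distance weights each direction by the common covariance matrix, whose variances here differ markedly across coordinates, the two rules need not assign x to the same class.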
