Anytime Online Novelty Detection for Vehicle Safeguarding BorisSofman J.AndrewBagnell AnthonyStentz CMU-RI-TR-09-17 April 2009 RoboticsInstitute CarnegieMellonUniversity Pittsburgh, Pennsylvania15213 (cid:13)c CarnegieMellonUniversity Abstract Novelty detection is often treated as a one-class classification problem: how to segment a data set of examples from everything else that would be considered novel or abnormal. Almost all existing novelty detection techniques, however, suffer from diminishedperformancewhenthenumberoflessrelevant,redundantornoisyfeatures increases,asoftenthecasewithhigh-dimensionalfeaturespaces. Additionally,many ofthesealgorithmsarenotsuitedforonlineuse,atraitthatishighlydesirableformany robotic applications. We present a novelty detection algorithm that is able to address thissensitivitytohighfeaturedimensionalitybyutilizingpriorclassinformationwithin thetrainingset. Additionally,ouranytimealgorithmiswellsuitedforonlineusewhen a constantly adjusting environmental model is beneficial. We apply this algorithm to onlinedetectionofnovelperceptionsysteminputonanoutdoormobilerobotandargue howsuchabilitiescouldbekeyinincreasingthereal-worldapplicationsandimpactof mobilerobotics1. 1Mostfiguresinthispaperarebestviewedincolor. I Contents 1 Introduction 1 2 NoveltyDetection 3 3 Approach 4 3.1 Formalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3.2 ImprovedDimensionalityReduction . . . . . . . . . . . . . . . . . . 5 3.3 FramingasInstanceofNORMA . . . . . . . . . . . . . . . . . . . . 7 3.4 QueryOptimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 4 ApplicationtoMobileRobotics 9 5 ExperimentalResults 11 6 Conclusion 16 III 1 Introduction Manyautonomousunmannedgroundvehicles(UGVs)haveadvancedtoalevelwhere they are competent and reliable a high percentage of the time in many environments [1,2,3]. Mostofthesesystems,however,areheavilyengineeredforthedomainsthey are intended to operate in. Any deviation from these domains often results in sub- optimal performance or even complete failure. Given the cost of such systems and theimportanceofsafetyandreliabilityinmanyofthetasksthattheyareintendedfor, even arelatively rarerateof failureisunacceptable. Inmany domains thatareprime candidates for mobile robotic applications such as space exploration, transportation, militaryreconnaissance,andagriculturaltasks,theriskofcatastrophicfailure,however small, is a primary reason why autonomous systems are still under-utilized despite alreadydemonstratingimpressiveabilities. One approach to addressing this limitation is for a UGV to be able to identify situations that it is likely untrained to handle before it experiences a major failure. This problem therefore becomes one of novelty detection: how a robot can identify whenperceptionsysteminputsdifferfrompriorinputsseenduringtrainingorprevious operation. With this ability, the system can either avoid novel locations to minimize riskorstopandenlisthumanhelpviasupervisorycontrolortele-operation(seeFigure 1). Two common limitations of novelty detection systems are particularly relevant to themobileroboticsdomain. Autonomoussystemsoftenneedtolearnfromtheirexpe- riences and continually adjust their models of what is normal and what is novel. For example,ifhumanfeedbackweretoconfirmthatacertaintypeofenvironmentselected asnovelisactuallysafetohandlewiththeexistingautonomysystemordemonstrateto thesystemtheproperwaytohandlethesituation(asin[4]),themodelnolongerneeds toidentifysuchinputsasnovel. Mostnoveltydetectionapproaches,however,builda modelofthenormalsetofexamplesaprioriinbatchinordertodetectnovelexamples inthefuturebutareunabletoupdatethatmodelonlinewithoutretraining. Furthermore, existing novelty detection techniques see diminished performance whenusinghigh-dimensionalfeaturespaces,particularlywhensomefeaturesareless relevant,redundant,ornoisy. Thesequalitiesareparticularlycommoninfeaturesfrom manyUGVperceptionsystemsduetothevarietyofsensorsanduncertaintyabouthow these features relate to novelty. For example, the relevance of camera-based features suchascolorandtextureofanareaoftheenvironmenttonovelty(orsimilaritymetrics in general) is difficult to understand as subsets of the features could contain redun- dantinformationorbemostlyirrelevant. Itisthereforeimportantfornoveltydetection techniquestoberesilienttosuchfeatureproperties. We present an online approach that addresses these common problems with nov- elty detection techniques. We approach the problem of novelty detection as one of onlinedensityestimationwhereseenexamplesgenerateaninfluenceoffamiliarityin featurespace. Whenpriorclassinformationisavailable,weshowhowusingMultiple Discriminant Analysis (MDA) for generating a reduced dimensional subspace to op- erateinratherthanothercommontechniquessuchasPrincipalComponentsAnalysis (PCA) can make the novelty detection system more robust to issues associated with high-dimensionalfeaturespaces. Ineffect, thiscreatesalowerdimensionalsubspace 1 Figure 1: Sample result from online novelty detection algorithm onboard Crusher, a large UGV. Chain-link fence was detected as novel (top and left, novelty shown in red)withrespecttothelargevarietyofterrainandvegetationpreviouslyencountered. Afteraninitialstretchbeingidentifiedasnovel,subsequentportionsofthefenceareno longerflagged(right)duetothealgorithm’sonlinetrainingability. Aswithallfuture similarimages,insetswithinthetopimageshowafirst-personview(leftinset)andthe classification of the environment by the perception system into road, vegetation, and solidobstacleinblue,greenandredrespectively(rightinset). thattrulycaptureswhatmakesthingsnovel. Additionally,ouralgorithmcanbeframed as a variant of the NORMA algorithm, an online kernelized Support Vector Machine (SVM) optimized through stochastic gradient descent, and therefore shares its favor- ablequalities[5]. Alongwithitsanytimeproperties,thisallowsouralgorithmtobetter dealwiththereal-timedemandsofonlinetasks. Whilethisworkwastargetedtowardmobileroboticsapplications,theapproaches herearemoregenerallyapplicabletoanydomainwhichcanbenefitfromonlinenovelty 2 detection. The next section presents background on novelty detection techniques and some example applications. Section 3 presents our novelty detection algorithm, followed by anexplanation of how this technique can be applied tomobile robotics inSection 4, results from field testing on a large UGV in Section 5 and concluding remarks in Section6. 2 Novelty Detection Novelty detection techniques (also referred to as anomaly or outlier detection) have beenappliedtoawiderangeofdomainssuchasdetectingstructuralfaults[6],abnor- maljetengineoperation[7],computersystemintrusiondetection[8],andidentifying masses in mammograms [9]. In the robotics domain some have incorporated novelty detectionsystemswithininspectionrobots[10,11]. Noveltydetectionisoftentreatedasaone-classclassificationproblem. Intraining thesystemseesavarietyof“normal”examples(andcorrespondingfeatures)andlater the system tries to identify input that does not fit into the trained model in order to separatenovelfromnon-novelexamples.Instancesofabnormalitiesornovelsituations are often rare during the training phase so a traditional classifier approach cannot be usedtoidentifynoveltyinmostcases. Mostnoveltydetectionapproachesfallintooneofseveralcategories. Statisticalor densityestimationtechniquesmodelthe“normal”classinordertoidentifywhethera testsamplecomesfromthesamedistributionornot. SuchapproachesincludeParzen window density estimators, nearest neighbor-based estimators, and Gaussian mixture models[12].Thesetechniquesoftenusealower-dimensionalrepresentationofthedata generatedthroughtechniquessuchasPCA. Other approaches attempt to distinguish the class of instances in the training set fromallotherpossibleinstancesinthefeaturespace. Scho¨lkopfetal. [13]showhow an SVM can be used for specifically this purpose. A hyper-plane is constructed to separatethedatapointsfromtheorigininfeaturespacebythemaximummargin. One applicationofthistechniquewasdocumentclassification[14]. Anoticeabledrawback ofthisapproachisthatitmakesaninherentassumptionthattheoriginisasuitableprior for the novel class. This limitation was addressed by [15] by attracting the decision boundary toward the center of the data distribution rather than repelling it from the origin.Asimilarapproachenclosesthedatainasphereofminimalradius,usingkernel functionstodealwithnon-sphericaldistributeddata[16]. Thesetechniquesallrequire solutionstolinearorquadraticprogramswithslackvariablestohandleoutliers. Another class of techniques attempts to detect novelty by compressing the resp- resentation of the data and measuring reconstruction error of the input. The key idea hereisthatinstancesoftheoriginaldatadistributionareexpectedtobereconstructed accuratelywhilenovelinstancesarenot. Asimplethresholdcanthenbeusedtodetect novel examples. The simplest method of this type uses a subset of the eigenvectors generated by PCA to reconstruct the input. An obvious limitation here is that PCA willperformpoorlyifthedataisnon-linear. Thislimitationwasaddressedbyusinga kernelPCAbasednoveltydetector[17]. Benefitsofmoresophisticatedauto-encoders, 3 neuralnetworksthatattempttoreconstructtheirinputsthroughnarrowhiddenlayers, havebeenstudiedaswell[18]. Online novelty detection has received significantly less attention than its offline counterpart. Sinceitisoftenimportanttobeabletoadjustthemodelofwhatiscon- siderednovelinreal-time,manyoftheabovetechniquesarenotsuitableforonlineuse astheyrequiresignificantbatchtrainingpriortooperation. WhileNetoetal. [10]re- placedtheuseofPCAfornoveltydetectionwithanimplementationofiterativePCA, performancewasstilllargelyinfluencedbytheinitialdatasetusedfortraining. Mars- land proposed a unique approach that models the phenomenon of habituation where thebrainlearnstoignorerepeatedstimuli[11]. Thisisaccomplishedthroughaclus- tering network called a Grow When Required (GWR) network. This network keeps trackoffiringpatternsofnodesandallowstheinsertionofnewnodestoallowonline adaptation. Markou andSinghhave writtenapairofextensive survey articlesdetailingmany additionalnoveltydetectionapplicationsandtechniques[19,20]. The performance of the above-mentioned novelty detection approaches, however, quicklydeterioratesasthenumberoflessrelevantornoisyfeaturesgrows. Thedispro- portionatelyhighvarianceofmanyofthesefeaturesmakeitdifficultformanyofthese algorithmstocaptureanadequatemodelofthetrainingdataandtheireffectsquickly begin to dominate more relevant features in making predictions. Our algorithm ad- dressesthiscruciallimitationincaseswhereclassinformationisavailablewithinthe trainingsetwhilestillbeingsuitableforonlineuse. 3 Approach 3.1 Formalization The goal of novelty detection can be stated as follows: given a training set D = {x}1...N ∈ X where xi = {x1i,...,xki}, learn a function f : X → {novel,not- novel}.Intheonlinescenario,eachtimesteptprovidesanexamplex andaprediction t f (x )ismade. t t Weperformonlinenoveltydetectionusingtheonlinedensityestimationtechnique showninAlgorithm1. Allpossiblefunctionsf areelementsofareproducingkernel HilbertspaceH[21]. Allf ∈Harethereforelinearcombinationsofkernelfunctions: t−1 f (x )= α k(x ,x ) (1) t t i i t Xi=1 Wemaketheassumptionthatproximityinfeaturespaceisdirectlyrelatedtosim- ilarity. Observed examples deemed as novel are therefore remembered and have an influence of familiarity on future examples through the kernel function k(x ,x ). A i j noveltythreshold,γ,andalearningrate,η,areinitiallyselected. Foreachexamplex , t thealgorithmaccumulatestheinfluenceofallpreviouslyseennovelexamples(line5). If this sum does not exceed γ then the example is identified as novel and is remem- beredforfuturenoveltyprediction(line7). Non-novelexamplesarenotstoredasthey 4
Description: