Alexander Gammerman · Zhiyuan Luo · Jesús Vega · Vladimir Vovk (Eds.)

Conformal and Probabilistic Prediction with Applications

5th International Symposium, COPA 2016
Madrid, Spain, April 20–22, 2016
Proceedings

Lecture Notes in Artificial Intelligence 9653
Subseries of Lecture Notes in Computer Science

LNAI Series Editors
Randy Goebel: University of Alberta, Edmonton, Canada
Yuzuru Tanaka: Hokkaido University, Sapporo, Japan
Wolfgang Wahlster: DFKI and Saarland University, Saarbrücken, Germany

LNAI Founding Series Editor
Joerg Siekmann: DFKI and Saarland University, Saarbrücken, Germany

More information about this series at http://www.springer.com/series/1244

Editors
Alexander Gammerman: University of London, Egham, UK
Jesús Vega: CIEMAT, Madrid, Spain
Zhiyuan Luo: University of London, Egham, UK
Vladimir Vovk: University of London, Egham, UK

ISSN 0302-9743
ISSN 1611-3349 (electronic)
Lecture Notes in Artificial Intelligence
ISBN 978-3-319-33394-6
ISBN 978-3-319-33395-3 (eBook)
DOI 10.1007/978-3-319-33395-3
Library of Congress Control Number: 2016936643
LNCS Sublibrary: SL7 – Artificial Intelligence

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

This volume contains the proceedings of the 5th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2016), which was co-organized by Royal Holloway, University of London, UK, and Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain. The Symposium was held at CIEMAT during April 20–22, 2016.

Conformal prediction is a recently developed framework for complementing the predictions of machine learning algorithms with reliable measures of confidence. The framework produces well-calibrated confidence measures for individual examples without assuming anything more than that the data are generated independently from the same probability distribution.
Since its development the framework has been applied to many popular techniques, such as support vector machines, k-nearest neighbors, neural networks, ridge regression, etc., and has been successfully applied to many challenging real-world problems, such as the early detection of ovarian cancer, the classification of leukemia subtypes, the diagnosis of acute abdominal pain, the assessment of stroke risk, the recognition of hypoxia in electroencephalograms (EEGs), the prediction of plant promoters, the prediction of network traffic demand, the estimation of effort for software projects and the back-calculation of non-linear pavement layer moduli. The framework has also been extended to additional problem settings such as semi-supervised learning, anomaly detection, feature selection, outlier detection, change detection in streams, and active learning. Recent developments in collecting large volumes of data have also required its adjustment to handle “big data”.

The aim of this symposium is to serve as a forum for the presentation of new and ongoing work and the exchange of ideas between researchers on any aspect of conformal and probabilistic prediction and their applications.

While the previous four annual gatherings (COPA 2012 to COPA 2015) were devoted mainly to conformal predictors, they also included extensions of conformal predictors to Venn predictors. The title of this year's event reflects the expanded scope explicitly and covers all kinds of probabilistic prediction, not only Venn prediction.

The popularity of conformal prediction in the machine-learning community is growing. As evidence of this we can mention the following events that took place after COPA 2015. In June 2015, a special issue on “Conformal Prediction and its Applications” of the Annals of Mathematics and Artificial Intelligence (Volume 74, Issues 1–2) was published.
In July 2015, Henrik Boström, Alexander Gammerman, Ulf Johansson, Lars Carlsson, and Henrik Linusson presented the tutorial “Conformal Prediction: A Valid Approach to Confidence Predictions” at the 2015 International Joint Conference on Neural Networks (Killarney, Ireland). An EU Horizon 2020 project on drug design that started in September 2015 adopted conformal prediction as one of the main tools for selecting useful chemical compounds. In December 2015, an Indo-UK workshop on “Mathematical Foundations of Probabilistic Conformal Prediction and Its Applications in Machine Learning” was held at the Indian Institute of Technology in Hyderabad, India. In January 2016, there was a session on “Data-Intensive Methods and Conformal Predictions” at the International Conference on Pharmaceutical Bioinformatics (ICPB 2016) in Pattaya, Thailand.

Overall, 14 papers were accepted for presentation at the symposium, each after being reviewed by at least two independent academic referees. The authors of these papers come from 11 different countries, namely: Austria, Cyprus, Italy, the Netherlands, Russia, Spain, Sweden, Switzerland, Ukraine, the UK and the USA.

The volume is divided into four parts. The first part presents the invited paper “Learning with Intelligent Teacher” by Vladimir Vapnik and Rauf Izmailov, devoted to learning with privileged information and emphasizing the role of the teacher in the learning process.

The second part is devoted to the theory of conformal prediction. The two papers in this part investigate various criteria of efficiency used in conformal prediction (Vladimir Vovk, Valentina Fedorova, Ilia Nouretdinov, and Alexander Gammerman) and introduce a universal probability-free version of conformal predictors (Vladimir Vovk and Dusko Pavlovic).

The core of the book is formed by the third part, containing experimental papers describing various applications of conformal prediction. This part opens with “Conformal Predictors for Compound Activity Prediction” (Paolo Toccaceli, Ilia Nouretdinov and Alexander Gammerman), applying conformal prediction to big and imbalanced datasets in the field of drug discovery. The following paper, “Conformal Prediction of Disruptions from Scratch: Application to an ITER Scenario” by Raul Moreno, Jesús Vega, and Sebastian Dormido-Canto, demonstrates advantages of conformal prediction over the conventional methodology in the field of nuclear fusion. In “Evaluation of a Variance-Based Nonconformity Measure for Regression Forests” Henrik Boström, Henrik Linusson, Tuve Löfström and Ulf Johansson continue their empirical investigation of conformal prediction based on random forests; their new algorithms achieve impressive computational efficiency while retaining predictive efficiency. This part is concluded by four papers proposing valuable extensions of the framework of conformal prediction in various directions. First, Antonis Lambrou and Harris Papadopoulos (“Binary Relevance Multi-label Conformal Predictor”) extend the framework to multi-label classification. The second extension is proposed by Andrea Murari, Saeed Talebzadeh, Jesús Vega, Emmanuele Peluso, Michela Gelfusa, Michele Lungaroni, and Pasqualino Gaudio in “A Metric to Improve the Robustness of Conformal Predictors in the Presence of Error Bars”: here all data, including the attributes of the objects to be labelled, are no longer precise but are obtained using a noisy measurement procedure.
The third paper, by Shuang Zhou, Evgueni Smirnov, Ralf Peeters, and Gijs Schoenmakers (“Decision Trees for Instance Transfer”), applies the ideas of conformal prediction to the case where the test data are generated from a distribution different from that generating the training data. Finally, Giovanni Cherubin and Ilia Nouretdinov (“Hidden Markov Models with Confidence”) extend the methodology of conformal prediction to the popular setting of hidden Markov models.

The fourth part contains theoretical and experimental papers in general machine learning. It opens with two theoretical papers, “Variable Fidelity Regression Using Low Fidelity Function Blackbox, and Sparsification” by Alexey Zaytsev and “Effective Design for Sobol Indices Estimation Based on Polynomial Chaos Expansions” by Evgeny Burnaev, Ivan Panin, and Bruno Sudret. Apart from theoretical results, both papers provide convincing empirical validation. The next two papers are devoted to two different but very important applications: medicine (“Joint Prediction of Chronic Conditions Onset: Comparing Multivariate Probits with Multiclass Support Vector Machines” by Shima Ghassem Pour and Federico Girosi) and information security (“Method of Learning Malware Behavior Scripts by Sequential Pattern Mining” by Alexandra Moldavskaya, Victoria Ruvinskaya, and Evgeniy Berkovich). The final paper, “Extended Regression on Manifolds Estimation” by Alexander Kuleshov and Alexander Bernstein, solves several interrelated problems in the area of regression on manifolds.

We are very grateful to the Program and Organizing Committees; the success of the symposium would have been impossible without their hard work. We are also indebted to the sponsors: Royal Holloway, University of London, and CIEMAT. Our special thanks go to Yandex for their help and support in organizing the symposium and the special Alexey Chervonenkis Memorial Lecture.

March 2016
Alexander Gammerman
Zhiyuan Luo
Jesús Vega
Vladimir Vovk

Organization

General Chairs
Alexander Gammerman: Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, UK
Vladimir Vapnik: AI Research Facebook, Columbia University, USA and Royal Holloway, University of London, UK
Vladimir Vovk: Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, UK

Organizing Committee
Jesús Vega: Laboratorio Nacional de Fusion, CIEMAT, Madrid, Spain
Zhiyuan Luo: Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, UK
Harris Papadopoulos: Department of Computer Science and Engineering, Frederick University, Nicosia, Cyprus

Program Committee Chairs
Jesús Vega: Laboratorio Nacional de Fusion, CIEMAT, Madrid, Spain
Zhiyuan Luo: Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, UK

Program Committee Members
Ernst Ahlberg: AstraZeneca, Sweden
Vineeth Balasubramanian: Department of Computer Science and Engineering, Indian Institute of Technology, Hyderabad, India
Henrik Boström: Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
Lars Carlsson: AstraZeneca, Sweden
Vladimir Cherkassky: Department of Electrical and Computer Engineering, University of Minnesota, USA
Jesús Manuel de la Cruz: Universidad Complutense de Madrid, Madrid, Spain
Jose-Carlos Gonzalez-Cristobal: Universidad Politécnica de Madrid, Madrid, Spain
Anna Fukshansky: Mathematik – Training und Lösungen, Germany
Barbara Hammer: Bielefeld University, Bielefeld, Germany
Shenshyang Ho: School of Computer Science and Engineering, College of Engineering, Nanyang Technological University, Singapore
Carlo Lauro: University of Naples Federico II, Italy
David Lindsay: GKFX Financial Services, London, UK
Henrik Linusson: University of Borås, Borås, Sweden
Andrea Murari: Consorzio RFX-Associazione EURATOM ENEA per la Fusione, Italy
Fionn Murtagh: University of Derby and Goldsmiths University of London, UK
Ulf Norinder: Swedish Toxicology Sciences Research Center, Sweden
Ilia Nouretdinov: Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, UK
Augusto Pereira: Laboratorio Nacional de Fusion, CIEMAT, Madrid, Spain
Giuseppe Rattá: Laboratorio Nacional de Fusion, CIEMAT, Madrid, Spain
Matilde Santos: Universidad Complutense de Madrid, Madrid, Spain
Victor Solovyev: Softberry Inc., USA
Rosanna Verde: Second University of Naples, Italy