Machine learning & artificial intelligence in the quantum domain

Vedran Dunjko
Institute for Theoretical Physics, University of Innsbruck, Innsbruck 6020, Austria
Max Planck Institute of Quantum Optics, Garching 85748, Germany
Email: [email protected]

Hans J. Briegel
Institute for Theoretical Physics, University of Innsbruck, Innsbruck 6020, Austria
Department of Philosophy, University of Konstanz, Konstanz 78457, Germany
Email: [email protected]

arXiv:1709.02779v1 [quant-ph] 8 Sep 2017

Abstract. Quantum information technologies, on the one side, and intelligent learning systems, on the other, are both emergent technologies that will likely have a transforming impact on our society in the future. The respective underlying fields of basic research – quantum information (QI) versus machine learning and artificial intelligence (AI) – have their own specific questions and challenges, which have hitherto been investigated largely independently. However, in a growing body of recent work, researchers have been probing the question to what extent these fields can indeed learn and benefit from each other. Quantum machine learning (QML) explores the interaction between quantum computing and machine learning, investigating how results and techniques from one field can be used to solve the problems of the other. In recent times, we have witnessed significant breakthroughs in both directions of influence. For instance, quantum computing is finding a vital application in providing speed-ups for machine learning problems, critical in our “big data” world. Conversely, machine learning already permeates many cutting-edge technologies, and may become instrumental in advanced quantum technologies. Aside from quantum speed-up in data analysis, or classical machine learning optimization used in quantum experiments, quantum enhancements have also been (theoretically) demonstrated for interactive learning tasks, highlighting the potential of quantum-enhanced learning agents. Finally, works exploring the use of artificial intelligence for the very design of quantum experiments, and for performing parts of genuine research autonomously, have reported their first successes. Beyond the topics of mutual enhancement – exploring what ML/AI can do for quantum physics, and vice versa – researchers have also broached the fundamental issue of quantum generalizations of learning and AI concepts. This deals with questions of the very meaning of learning and intelligence in a world that is fully described by quantum mechanics. In this review, we describe the main ideas, recent developments, and progress in a broad spectrum of research investigating machine learning and artificial intelligence in the quantum domain.

CONTENTS

I. Introduction
   A. Quantum mechanics, computation and information processing
   B. Artificial intelligence and machine learning
      1. Learning from data: machine learning
      2. Learning from interaction: reinforcement learning
      3. Intermediary learning settings
      4. Putting it all together: the agent-environment paradigm
   C. Miscellanea
II. Classical background
   A. Methods of machine learning
      1. Artificial neural networks and deep learning
      2. Support Vector Machines
      3. Other models
   B. Mathematical theories of supervised and inductive learning
      1. Computational learning theory
      2. VC theory
   C. Basic methods and theory of reinforcement learning
III. Quantum mechanics, learning, and AI
IV. Machine learning applied to (quantum) physics
   A. Hamiltonian estimation and metrology
      1. Hamiltonian estimation
      2. Phase estimation settings
      3. Generalized Hamiltonian estimation settings
   B. Design of target evolutions
      1. Off-line design
      2. On-line design
   C. Controlling quantum experiments, and machine-assisted research
      1. Controlling complex processes
      2. Learning how to experiment
   D. Machine learning in condensed-matter and many-body physics
V. Quantum generalizations of machine learning concepts
   A. Quantum generalizations: machine learning of quantum data
      1. State discrimination, state classification, and machine learning of quantum data
      2. Computational learning perspectives: quantum states as concepts
   B. (Quantum) learning and quantum processes
VI. Quantum enhancements for machine learning
   A. Learning efficiency improvements: sample complexity
      1. Quantum PAC learning
      2. Learning from membership queries
   B. Improvements in learning capacity
      1. Capacity from amplitude encoding
      2. Capacity via quantized Hopfield networks
   C. Run-time improvements: computational complexity
      1. Speed-up via adiabatic optimization
      2. Speed-ups in circuit architectures
VII. Quantum learning agents, and elements of quantum AI
   A. Quantum learning via interaction
   B. Quantum agent-environment paradigm for reinforcement learning
      1. AE-based classification of quantum ML
   C. Towards quantum artificial intelligence
VIII. Outlook
Acknowledgements
References

I. INTRODUCTION

Quantum theory has influenced most branches of the physical sciences. This influence ranges from minor corrections to profound overhauls, particularly in fields dealing with sufficiently small scales. In the second half of the last century, it became apparent that genuine quantum effects can also be exploited in engineering-type tasks, where such effects enable features which are superior to those achievable using purely classical systems. The first wave of such engineering gave us, for example, the laser, transistors, and nuclear magnetic resonance devices. The second wave, which gained momentum in the ’80s, constitutes a broad-scale, albeit not fully systematic, investigation of the potential of utilizing quantum effects for various types of tasks which, at the base of it, deal with the processing of information. This includes the research areas of cryptography, computing, sensing and metrology, all of which now share the common language of quantum information science. Often, the research into such interdisciplinary programs was exceptionally fruitful. For instance, quantum computation, communication, cryptography and metrology are now mature, well-established and impactful research fields which have, arguably, revolutionized the way we think about information and its processing.

In recent years, it has become apparent that the exchange of ideas between quantum information processing and the fields of artificial intelligence and machine learning has its own genuine questions and promises. Although such lines of research are only now receiving broader recognition, the very first ideas were present already in the early days of QC, and we have made an effort to fairly acknowledge such visionary works. In this review we aim to capture research at the interplay between machine learning, artificial intelligence and quantum mechanics in its broad scope, with a reader with a physics background in mind. To this end, we dedicate a comparatively large amount of space to classical machine learning and artificial intelligence topics, which are often sacrificed in physics-oriented literature, while keeping the quantum information aspects concise. The structure of the paper is as follows.
In the remainder of this introductory section I, we give quick overviews of the relevant basic concepts of the fields of quantum information processing and of machine learning and artificial intelligence. We finish off the introduction with a glossary of useful terms, a list of abbreviations, and comments on notation. Subsequently, in section II we delve deeper into chosen methods, technical details, and the theoretical background of the classical theories. The selection of topics here is not necessarily balanced from a classical perspective. We place emphasis on elements which either appear in subsequent quantum proposals, which can sometimes be somewhat exotic, or on aspects which can help put the relevance of the quantum results into proper context. Section III briefly summarizes the topics covered in the quantum part of the review. Sections IV-VII cover the four main topics we survey, and constitute the central body of the paper. We finish with an outlook in section VIII.

Remark: The overall objective of this survey is to give a broad, “bird’s-eye” account of the topics which contribute to the development of various aspects of the interplay between the quantum information sciences, and machine learning and artificial intelligence. Consequently, this survey does not necessarily present all the developments in a fully balanced fashion. Certain topics, which are in their very early stages of investigation, yet important for the nascent research area, were given a perhaps disproportional level of attention, compared to more developed themes. This is, for instance, particularly evident in section VII, which aims to address the topics of quantum artificial intelligence, beyond mainstream data analysis applications of machine learning. This topic is relevant for a broad perspective on the emerging field; however, it has so far been broached in only a few works, including those of the authors of this review and collaborators. The more extensively explored topics of, e.g., quantum algorithms for machine learning and data mining, quantum computational learning theory, or quantum neural networks, have been addressed in more focused recent reviews (Wittek, 2014a; Schuld et al., 2014a; Biamonte et al., 2016; Arunachalam and de Wolf, 2017; Ciliberto et al., 2017).

A. Quantum mechanics, computation and information processing

Executive summary: Quantum theory leads to many counterintuitive and fascinating phenomena, including the results of the field of quantum information processing, and in particular, quantum computation. This field studies the intricacies of quantum information, its communication, processing and use. Quantum information admits a plethora of phenomena which do not occur in classical physics. For instance, quantum information cannot be cloned – this restricts the types of processing that are possible for general quantum information. Other aspects lead to advantages, as has been shown for various communication and computation tasks: for solving algebraic problems, reduction of sample complexity in black-box settings, sampling problems and optimization. Even restricted models of quantum computing, amenable to near-term implementation, can solve interesting tasks. Machine learning and artificial intelligence tasks can, as components, rely on the solving of such problems, leading to an advantage.
Quantum mechanics, as commonly presented in quantum information, is based on a few simple postulates: 1) the pure state of a quantum system is given by a unit vector $|\psi\rangle$ in a complex Hilbert space; 2) closed-system pure-state evolution is generated by a Hamiltonian $H$, specified by the linear Schrödinger equation $H|\psi\rangle = i\hbar \frac{\partial}{\partial t}|\psi\rangle$; 3) the structure of composite systems is given by the tensor product; and 4) projective measurements (observables) are specified by, ideally, non-degenerate Hermitian operators, and the measurement process changes the description of the observed system from the state $|\psi\rangle$ to an eigenstate $|\phi\rangle$, with probability given by the Born rule $p(\phi) = |\langle\psi|\phi\rangle|^2$ (Nielsen and Chuang, 2011). While the full theory still requires the handling of subsystems and classical ignorance¹, already the few mathematical axioms of pure-state closed-system theory give rise to many quintessentially quantum phenomena, like superpositions, no-cloning, entanglement, and others, most of which stem from just the linearity of the theory. Many of these properties re-define how researchers in quantum information perceive what information is, but also have a critical functional role in, say, quantum-enhanced cryptography, communication, sensing and other applications.

Some of the most fascinating consequences of quantum theory are, arguably, captured by the field of quantum information processing (QIP), and in particular quantum computing (QC), which is most relevant for our purposes. QC has revolutionized the theories and implementations of computation. This field originated from the observations by Manin (Manin, 1980) and Feynman (Feynman, 1982) that the calculation of certain properties of quantum systems, as they evolve in time, may be intractable, while the quantum systems themselves, in a manner of speaking, do perform that hard computation by merely evolving. Since these early ideas, QC has proliferated, and indeed the existence of quantum advantages offered by scalable universal quantum computers has been demonstrated in many settings. Perhaps most famously, quantum computers have been shown to have the capacity to efficiently solve algebraic computational problems which are believed to be intractable for classical computers. This includes the famous problems of factoring large integers and computing discrete logarithms (Shor, 1997), but also many others, such as solving Pell’s equation and some non-Abelian hidden subgroup problems; see e.g. (Childs and van Dam, 2010; Montanaro, 2016) for a review. Related to this, nowadays we also have access to a growing collection of quantum algorithms² for various linear algebra tasks, as given in e.g. (Harrow et al., 2009; Childs et al., 2015; Rebentrost et al., 2016a), which may offer speed-ups.

1 This requires the more general and richer formalism of density operators, and leads to generalized measurements, completely positive evolutions, etc.
2 In this review it makes sense to point out that the term “quantum algorithm” is a bit of a misnomer, as what we really mean is “an algorithm for a quantum computer”. An algorithm – an abstraction – cannot per se be “quantum”, and the term quantum algorithm could also have meant e.g. “algorithm for describing or simulating quantum processes”. Nonetheless, this term, in the sense of “algorithm for a quantum computer”, is commonplace in QIP, and we use it in this sense as well. The concept of “quantum machine learning” is, however, still ambiguous in this sense, and, depending on the authors, can easily mean “quantum algorithm for ML” or “ML applied to QIP”.
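The postulates above translate directly into small numerical experiments. The following is a minimal sketch, not taken from the review: it assumes NumPy and SciPy, sets ħ = 1, and picks an arbitrary single-qubit Hamiltonian. It stores a pure state as a unit vector, evolves it with the propagator exp(−iHt) obtained from the Schrödinger equation, and samples a projective measurement outcome according to the Born rule.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(42)

# Pauli matrices; hbar is set to 1 for convenience.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Postulate 1: a pure state is a unit vector in a complex Hilbert space.
psi = np.array([1.0, 0.0], dtype=complex)          # the state |0>

# Postulate 2: closed-system evolution, |psi(t)> = exp(-i H t) |psi(0)>.
H = sigma_x                                        # an illustrative Hamiltonian
t = np.pi / 4
U = expm(-1j * H * t)                              # unitary propagator
psi_t = U @ psi

# Postulate 4: projective measurement in the eigenbasis of an observable
# (here sigma_z); outcome probabilities follow the Born rule |<phi|psi>|^2.
eigvals, eigvecs = np.linalg.eigh(sigma_z)
probs = np.abs(eigvecs.conj().T @ psi_t) ** 2
probs = probs / probs.sum()                        # guard against rounding
outcome = rng.choice(len(eigvals), p=probs)

print("state after evolution:", np.round(psi_t, 3))
print("Born-rule probabilities:", np.round(probs, 3))
print("measured eigenvalue:", eigvals[outcome])
```

Postulate 3 would enter through the tensor (Kronecker) product when composing systems, e.g. np.kron(psi, psi) for a two-qubit product state.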
Quantum computers can also offer improvements in many optimization and simulation tasks, for instance in computing certain properties of partition functions (Poulin and Wocjan, 2009), simulated annealing (Crosson and Harrow, 2016), solving semidefinite programs (Brandao and Svore, 2016), performing approximate optimization (Farhi et al., 2014), and, naturally, in the tasks of simulating quantum systems (Georgescu et al., 2014).

Advantages can also be achieved in terms of the efficient use of sub-routines and databases. This is studied using oracular models of computation, where the quantity of interest is the number of calls to an oracle, a black-box object with a well-defined set of input-output relations which, abstractly, stands in for a database, sub-routine, or any other information-processing resource. The canonical example of a quantum advantage in this setting is Grover’s search algorithm (Grover, 1996), which achieves the, provably optimal, quadratic improvement in unordered search (where the oracle is the database). Similar results have been achieved in a plethora of other scenarios, such as spatial search (Childs and Goldstone, 2004), search over structures (including various quantum walk-based algorithms (Kempe, 2003; Childs et al., 2003; Reitzner et al., 2012)), NAND (Childs et al., 2009) and more general boolean tree evaluation problems (Zhan et al., 2012), as well as more recent “cheat sheet” technique results (Aaronson et al., 2016) leading to better-than-quadratic improvements. Taken a bit more broadly, oracular models of computation can also be used to model communication tasks, where the goal is to reduce the communication complexity (i.e. the number of communication rounds) of some information exchange protocols (de Wolf, 2002).

FIG. 1 Oracular computation and query complexity: a (quantum) algorithm solves a problem by intermittently calling a black-box subroutine (oracle), defined only via its input-output relations. The query complexity of an algorithm is the number of calls to the oracle the algorithm will perform.

Quantum computers can also be used for solving sampling problems. In sampling problems the task is to produce a sample according to an (implicitly) defined distribution; such problems are important for both optimization and (certain instances of) algebraic tasks³. For instance, Markov Chain Monte Carlo methods, arguably the most prolific set of computational methods in the natural sciences, are designed to solve sampling tasks, which, in turn, can often be used to solve other types of problems. For instance, in statistical physics, the capacity to sample from Gibbs distributions is often the key tool to compute properties of the partition function.

3 Optimization and computation tasks can be trivially regarded as special cases of sampling tasks, where the target distribution is (sufficiently) localized at the solution.
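To make the last point concrete, the following purely classical sketch implements a Metropolis-type Markov Chain Monte Carlo sampler for the Gibbs distribution of a small one-dimensional Ising ring; the chain length, temperature and coupling are illustrative choices, not taken from any of the referenced works.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(spins, J=1.0):
    """Energy of a 1D Ising ring: E = -J * sum_i s_i s_{i+1}."""
    return -J * np.sum(spins * np.roll(spins, 1))

def metropolis_gibbs(n_spins=20, beta=0.5, n_steps=20000):
    """Sample spin configurations from p(s) ~ exp(-beta * E(s))."""
    spins = rng.choice([-1, 1], size=n_spins)
    samples = []
    for step in range(n_steps):
        i = rng.integers(n_spins)               # propose a single spin flip
        # Energy change of flipping spin i on the ring (J = 1).
        delta_E = 2.0 * spins[i] * (spins[i - 1] + spins[(i + 1) % n_spins])
        # Accept with the Metropolis rule min(1, exp(-beta * delta_E)).
        if delta_E <= 0 or rng.random() < np.exp(-beta * delta_E):
            spins[i] *= -1
        if step % 10 == 0:                      # thin the chain a little
            samples.append(spins.copy())
    return np.array(samples)

samples = metropolis_gibbs()
# Estimate an observable (mean energy per spin) from the collected samples.
print("estimated <E>/N:", np.mean([energy(s) for s in samples]) / 20)
```

The quantum approaches to sampling cited in the next paragraph aim to enhance exactly this kind of Markov chain procedure.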
A broad class of quantum approaches to sampling problems focuses on quantum enhancements of such Markov Chain methods (Temme et al., 2011; Yung and Aspuru-Guzik, 2012). Sampling tasks have been receiving an ever increasing amount of attention in the QIP community, as we will comment on shortly.

Quantum computers are typically formalized in one of a few standard models of computation, many of which are, computationally speaking, equally powerful⁴. Even if the models are computationally equivalent, they are conceptually different. Consequently, some are better suited, or more natural, for a given class of applications. Historically, the first formal model, the quantum Turing machine (Deutsch, 1985), was preferred for theoretical and computability-related considerations. The quantum circuit model (Nielsen and Chuang, 2011) is standard for algebraic problems. The measurement-based quantum computing (MBQC) model (Raussendorf and Briegel, 2001; Briegel et al., 2009) is, arguably, best suited for graph-related problems (Zhao et al., 2016), multi-party tasks and distributed computation (Kashefi and Pappa, 2016), and blind quantum computation (Broadbent et al., 2009). Topological quantum computation (Freedman et al., 2002) was an inspiration for certain knot-theoretic algorithms (Aharonov et al., 2006), and is closely related to algorithms for topological error-correction and fault tolerance. The adiabatic quantum computation model (Farhi et al., 2000) is constructed with the task of ground-state preparation in mind, and is thus well suited for optimization problems (Heim et al., 2017).

Research into QIP has also produced examples of interesting restricted models of computation: models which are, in all likelihood, not universal for efficient QC, yet can still solve tasks which seem hard for classical machines. Recently, there has been an increasing interest in such models, specifically the linear optics model, the so-called low-depth random circuits model, and the commuting quantum circuits model⁵. In (Aaronson and Arkhipov, 2011) it was shown that the linear optics model can efficiently produce samples from a distribution specified by the permanents of certain matrices, and it was proven (barring certain plausible mathematical conjectures) that classical computers cannot reproduce samples from the same distribution in polynomial time. Similar claims have been made for low-depth random circuits (Boixo et al., 2016; Bravyi et al., 2017) and commuting quantum circuits, which comprise only commuting gates (Shepherd and Bremner, 2009; Bremner et al., 2017).

FIG. 2 Computational models. Universal (BQP-complete) models and typical (not exclusive) applications: quantum Turing machine (QTM) – theory; quantum circuits – algorithms; MBQC – distributed computing; topological – knot-theoretic problems; adiabatic – optimization problems. Restricted models: DQC1 – computing the trace of a unitary; linear optics – sampling; shallow random quantum circuits – sampling; commuting quantum circuits – sampling; restricted adiabatic – optimization tasks.

4 Various notions of “equally powerful” are usually expressed in terms of algorithmic reductions. In QIP, typically, the computational model B is said to be at least as powerful as the computational model A if any algorithm of complexity O(f(n)) (where f(n) is some scaling function, e.g. “polynomial” or “exponential”), defined for model A, can be efficiently (usually this means in polynomial time) translated to an algorithm for B which solves the same problem, and whose computational complexity is O(poly(f(n))). Two models are then equivalent if A is as powerful as B and B is as powerful as A. Which specific reduction complexity we care about (polynomial, linear, etc.) depends on the setting: e.g. for factoring, polynomial reductions suffice, since there seems to be an exponential separation between classical and quantum computation. In contrast, for search, the reductions need to be sub-quadratic to maintain a quantum speed-up, since only a quadratic improvement is achievable.
5 Other restricted models exist, such as the one clean qubit model (DQC1), where the input comprises only one qubit in a pure state, while the others are maximally mixed. This model can be used to compute a function – the normalized trace of a unitary specified by a quantum circuit – which seems to be hard for classical devices.
Critically, these restricted models can be realized at sufficient size to allow for a demonstration of computations which the most powerful currently available classical computers cannot achieve, using near-term technologies. This milestone, referred to as quantum supremacy (Preskill, 2012; Lund et al., 2017), has been getting a significant amount of attention in recent times. Another highly active field in QIP concentrates on (analogue) quantum simulations, with applications in quantum optics, condensed matter systems, and quantum many-body physics (Georgescu et al., 2014). Many, if not most, of the above-mentioned aspects of quantum computation are finding a role in quantum machine learning applications.

Next, we briefly review basic concepts from the classical theories of artificial intelligence and machine learning.

B. Artificial intelligence and machine learning

Executive summary: The field of artificial intelligence incorporates various methods, which are predominantly focused on solving problems which are hard for computers, yet seemingly easy for humans. Perhaps the most important class of such tasks pertains to learning problems. Various algorithmic aspects of learning problems are tackled by the field of machine learning, which evolved from the study of pattern recognition in the context of AI. Modern machine learning addresses a variety of learning scenarios, dealing with learning from data, e.g. supervised (data classification) and unsupervised (data clustering) learning, or from interaction, e.g. reinforcement learning. Modern AI states, as its ultimate goal, the design of an intelligent agent which learns and thrives in unknown environments. Artificial agents that are intelligent in a general, human sense must have the capacity to tackle all the individual problems addressed by machine learning and other, more specialized, branches of AI. They will consequently require a complex combination of techniques.

In its broadest scope, the modern field of artificial intelligence (AI) encompasses a wide variety of sub-fields. Most of these sub-fields deal with the understanding and abstracting of aspects of various human capacities which we would describe as intelligent, and attempt to realize the same capacities in machines. The term “AI” was coined at the Dartmouth College conferences in 1956 (Russell and Norvig, 2009), which were organized to develop ideas about machines that can think, and the conferences are often cited as the birthplace of the field. They aimed to “find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves”⁶. The history of the field has been turbulent, with strong opinions on how AI should be achieved.

6 Paraphrased from (McCarthy et al., 1955).
For instance, over the course of its first 30 years, the field crystallized into two main competing and opposite viewpoints (Eliasmith and Bechtel, 2006) on how AI may be realized: computationalism – holding that the mind functions by performing purely formal operations on symbols, in the manner of a Turing machine, see e.g. (Newell and Simon, 1976) – and connectionism – which models mental and behavioral phenomena as the emergent processes of interconnected networks of simple units, mimicking the biological brain, see e.g. (Medler, 1998). Aspects of these two viewpoints still influence approaches to AI.

Irrespective of the underlying philosophy, for the larger part of the history of AI, the realization of “genuine AI” was, purportedly, perpetually “a few years away” – a feature often attributed also to quantum computers by critics of the field. In the case of AI, such runaway optimism had a severe, calamitous effect on the field, in multiple instances, especially in the context of funding (leading to periods now dubbed “winters of AI”). By the late 90s, the reputation of the field was low, and, even in hindsight, there was no consensus on the reasons why AI had failed to produce human-level intelligence. Such factors played a vital role in the fragmentation of the field into various sub-fields which focused on specialized tasks, often appearing under different names.

A particularly influential perspective on AI, often called nouvelle or embodied AI, was advocated by Brooks, who posited that intelligence emerges from (simple) embodied systems which learn through interaction with their environments (Brooks, 1990). In contrast to standard approaches of the time, nouvelle AI insists on learning, rather than having properties pre-programmed, and on the embodiment of AI entities, as opposed to abstract entities like chess-playing programs. To a physicist, this perspective that intelligence is embodied is reminiscent of the viewpoint that information is physical, which had been “the rallying cry of quantum information theory” (Steane, 1998). Such embodied approaches are particularly relevant in robotics, where the key issues involve perception (the capacity of the machine to interpret the external world using its sensors, which includes computer vision, machine hearing and touch), and motion and navigation (critical in e.g. automated cars). Related to human-computer interfaces, AI also incorporates the field of natural language processing, which includes language understanding – the capacity of the machine to derive meaning from natural language – and language generation – the ability of the machine to convey information in a natural language.

Other general aspects of AI pertain to a few well-studied capacities of intelligent entities (Russell and Norvig, 2009). For instance, automated planning is related to decision theory⁷ and, broadly speaking, addresses the task of identifying strategies (i.e. sequences of actions) which need to be performed in order to achieve a goal, while minimizing (a specified) cost. Already the simple class of so-called off-line planning tasks, where the task, cost function, and the set of possible actions are known beforehand, contains genuinely hard problems; e.g. it includes, as a special case, the NP-complete⁸ travelling salesman problem (TSP); for an illustration see Fig. 3⁹.

FIG. 3 TSP example: finding the shortest route visiting the largest cities in Germany.

7 Not to be confused with decision problems, studied in algorithmic complexity.
8 Roughly speaking, NP is the class of decision (yes/no) problems whose solutions can be efficiently verified by a classical computer in polynomial time. NP-complete problems are the hardest problems in NP, in the sense that any other NP problem can be reduced to an NP-complete problem via polynomial-time reductions. Note that the exact solutions to NP-complete problems are believed to be intractable even for quantum computers.
9 Figure 3 has been modified from https://commons.wikimedia.org/wiki/File:TSP_Deutschland_3.png.
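As a toy version of such an off-line planning task, the following sketch solves a tiny TSP instance by brute force; the five “cities” and the distance matrix are made up for illustration, and the factorial cost of enumerating all tours is exactly what makes the general problem hard.

```python
from itertools import permutations

# Symmetric distance matrix for a toy 5-city instance (arbitrary numbers).
cities = ["A", "B", "C", "D", "E"]
dist = [
    [0, 12, 10, 19, 8],
    [12, 0, 3, 7, 6],
    [10, 3, 0, 2, 20],
    [19, 7, 2, 0, 4],
    [8, 6, 20, 4, 0],
]

def tour_length(order):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(dist[order[i]][order[(i + 1) % len(order)]]
               for i in range(len(order)))

# Brute force: fix city 0 as the start and try all (n-1)! orderings.
best_length, best_perm = min(
    (tour_length((0,) + perm), perm)
    for perm in permutations(range(1, len(cities)))
)

print("shortest tour length:", best_length)
print("tour:", " -> ".join(cities[i] for i in (0,) + best_perm) + " -> " + cities[0])
```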
In modern times, TSP itself would no longer be considered a genuine AI problem, but it serves to illustrate how even very specialized, simple sub-sub-tasks of AI may be hard. More general planning problems also include on-line variants, where not everything is known beforehand (e.g. TSP where the “map” may fail to include all the available roads, and one simply has to actually travel to find good strategies). On-line planning overlaps with reinforcement learning, discussed later in this section. Closely related to planning is the capacity of intelligent entities for problem solving. In the technical literature, problem solving is distinguished from planning by a lack of the additional structure in the problem which is usually assumed in planning – in other words, problem solving is more general and typically more broadly defined than planning. The lack of structure in general problem solving establishes a clear connection to (also unstructured) searching and optimization: in the setting of no additional information or structure, problem solving is the search for the solution to a precisely specified problem. While general problem solving can, theoretically, be achieved by a general search algorithm (which can still be subdivided into classes such as depth-first, breadth-first, depth-limited search, etc.), more often there is structure to the problem, in which case informed search strategies – often called heuristic search strategies – will be more efficient (Russell and Norvig, 2009); a minimal example of an uninformed search is sketched below.
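The following generic sketch (a made-up graph, unrelated to any system discussed in the review) implements one such uninformed strategy, breadth-first search; an informed (heuristic) variant would differ mainly in replacing the FIFO queue with a priority queue ordered by an estimated cost to the goal.

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """Return a shortest path (fewest edges) from start to goal, or None."""
    frontier = deque([start])
    parent = {start: None}                 # also serves as the visited set
    while frontier:
        node = frontier.popleft()          # FIFO: explore shallowest nodes first
        if node == goal:
            path = []
            while node is not None:        # reconstruct the path via parent links
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                frontier.append(neighbour)
    return None

# A small undirected graph given as an adjacency list (illustrative only).
graph = {
    "S": ["A", "B"], "A": ["S", "C"], "B": ["S", "C", "D"],
    "C": ["A", "B", "G"], "D": ["B"], "G": ["C"],
}
print(breadth_first_search(graph, "S", "G"))   # prints ['S', 'A', 'C', 'G']
```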
Human intelligence, to no small extent, relies on our knowledge. We can accumulate knowledge, reason over it, and use it to come to the best decisions, for instance in the context of problem solving and planning. An aspect of AI tries to formalize such logical reasoning, knowledge accumulation and knowledge representation, often relying on formal logic, most often first-order logic. A particularly important class of problems central to AI, and related to knowledge acquisition, involves the capacity of the machine to learn through experience. This feature was emphasized already in the early days of AI, and the derived field of machine learning (ML) now stands as arguably the most successful aspect (or spin-off) of AI, which we will address in more detail.

1. Learning from data: machine learning

FIG. 4 Supervised learning (in this case, the best linear classifier) and unsupervised learning (here, clustering into the two most likely groups, and outliers) illustrated.

Stemming from the traditions of pattern recognition, such as recognizing handwritten text, and statistical learning theory (which places ML ideas in a rigorous mathematical framework), ML, broadly speaking, explores the construction of algorithms that can learn from, and make predictions about, data. Traditionally, ML deals with two main learning settings: supervised and unsupervised learning, which are closely related to data analysis and data mining-type tasks (Shalev-Shwartz and Ben-David, 2014). A broader perspective (Alpaydin, 2010) on the field also includes reinforcement learning (Sutton and Barto, 1998), which is closely related to learning as it is realized by biological intelligent entities. We shall discuss reinforcement learning separately.

In broad terms, supervised learning deals with learning-by-example: given a certain number of labeled points (the so-called training set) $\{(x_i, y_i)\}_i$, where the $x_i$ denote data points, e.g. $N$-dimensional vectors, and the $y_i$ denote labels (e.g. binary variables, or real values), the task is to infer a “labeling rule” $x_i \mapsto y_i$ which allows us to guess the labels of previously unseen data, that is, beyond the training set. Formally speaking, we deal with the task of inferring the conditional probability distribution $P(Y=y|X=x)$ (more specifically, generating a labeling function which, perhaps probabilistically, assigns labels to points) based on a certain number of samples from the joint distribution $P(X,Y)$. For example, we could be inferring whether a particular DNA sequence belongs to an individual who is likely to develop diabetes. Such an inference can be based on datasets of patients whose DNA sequences had been recorded, along with the information on whether they actually developed diabetes. In this example, the variable $Y$ (diabetes status) is binary, and the assignment of labels is not deterministic, as diabetes also depends on environmental factors. Another example could include two real variables, where $x$ is the height from which an object is dropped, and $y$ the duration of the fall. In this example, both variables are real-valued, and (in vacuum) the labeling relation will be essentially deterministic.
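Here is a minimal sketch of the supervised setting just defined: a linear labeling rule is inferred from a synthetic training set $\{(x_i, y_i)\}$ and then used to guess the labels of unseen points. The data, the least-squares “learner” and all parameters are illustrative choices, not taken from the review.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set {(x_i, y_i)}: two Gaussian blobs with labels -1 / +1.
n = 100
x_train = np.vstack([rng.normal(-1.0, 0.7, size=(n, 2)),
                     rng.normal(+1.0, 0.7, size=(n, 2))])
y_train = np.concatenate([-np.ones(n), np.ones(n)])

# Learn a linear labeling rule: fit weights and bias by least squares on +/-1 labels.
X = np.hstack([x_train, np.ones((2 * n, 1))])     # append a bias column
w = np.linalg.lstsq(X, y_train, rcond=None)[0]

def predict(x):
    """Guess labels of previously unseen points: sign of the linear score."""
    return np.sign(np.hstack([x, np.ones((len(x), 1))]) @ w)

# Unseen test points drawn from the same distribution.
x_test = np.vstack([rng.normal(-1.0, 0.7, size=(20, 2)),
                    rng.normal(+1.0, 0.7, size=(20, 2))])
y_test = np.concatenate([-np.ones(20), np.ones(20)])
print("test accuracy:", np.mean(predict(x_test) == y_test))
```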
In unsupervised learning, the algorithm is provided just with the data points, without labels. Broadly speaking, the goal here is to identify the underlying distribution, or structure, and other informative features in the dataset. In other words, the task is to infer properties of the distribution $P(X=x)$, based on a certain number of samples, relative to a user-specified guideline or rule. Standard examples of unsupervised learning are clustering tasks, where data points are supposed to be grouped in a manner which minimizes the within-group mean distance, while maximizing the distance between the groups. Note that group membership can be thought of as a label, so this also corresponds to a labeling task, but it lacks “supervision”: examples of correct labelings. In basic examples of such tasks, the number of expected clusters is given by the user, but this too can be automatically optimized. Other types of unsupervised problems include feature extraction and dimensionality reduction, critical in combatting the so-called curse of dimensionality. The curse of dimensionality refers to problems which stem from the fact that the raw representations of real-life data often occupy very high-dimensional spaces. For instance, a standard-resolution one-second video clip at a standard refresh frequency, capturing events which are extended in time, maps to a vector in a $\sim 10^8$-dimensional space¹⁰, even though the relevant information it carries (say, the licence plate number of a speeding car being filmed) may be significantly smaller.

More generally, it is intuitively clear that, since geometric volume scales exponentially with the dimension of the space it is in, the number of points needed to capture (or learn) general features of an $n$-dimensional object will also scale exponentially. In other words, learning in high-dimensional spaces is exponentially difficult. Hence, a means of dimensionality reduction, from the raw representation space (e.g. moving-car clips) to the relevant feature space (e.g. licence plate numbers), is a necessity in any real-life scenario. These approaches map the data points to a space of significantly reduced dimension, while attempting to maintain the main features – the relevant information – of the structure of the data. A typical example of a dimensionality reduction technique is principal component analysis. In practice, such algorithms also constitute an important step in data pre-processing for other types of learning and analysis. Furthermore, this setting also includes generative models (related to density estimation), where new samples from an unknown distribution are generated, based on a few exact samples. As humanity is amassing data at an exponential rate (insideBIGDATA, 2017), it becomes ever more relevant to extract genuinely useful information in an automated fashion. In the modern world, ubiquitous big data analysis and data mining are the central applications of supervised and unsupervised learning.

10 Each frame is ca. $10^6$-dimensional, as each pixel constitutes one dimension, multiplied by the 30 frames required for the one-second clip.
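As a minimal illustration of the dimensionality reduction just discussed, the sketch below implements principal component analysis with plain NumPy, projecting synthetic high-dimensional points onto the few directions of largest variance; the data dimensions and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "high-dimensional" data: 500 points in 50 dimensions that really
# live near a 2-dimensional subspace, plus a little isotropic noise.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 50))
data = latent @ mixing + 0.05 * rng.normal(size=(500, 50))

def pca(X, k):
    """Project X onto its k principal components (directions of largest variance)."""
    Xc = X - X.mean(axis=0)                      # center the data
    cov = Xc.T @ Xc / (len(Xc) - 1)              # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]                # k leading eigenvectors
    explained = eigvals[::-1][:k] / eigvals.sum()
    return Xc @ top, explained

reduced, explained = pca(data, k=2)
print("reduced shape:", reduced.shape)                         # (500, 2)
print("variance explained by 2 components:", round(explained.sum(), 3))
```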
