Using Firing-Rate Dynamics to Train Recurrent Networks of Spiking Model Neurons

Brian DePasquale, Mark M. Churchland^1, L.F. Abbott^{1,2}

Department of Neuroscience
^1 Grossman Center for the Statistics of Mind
^2 Department of Physiology and Cellular Biophysics
Columbia University College of Physicians and Surgeons
New York, NY 10032-2695, USA

arXiv:1601.07620v1 [q-bio.NC] 28 Jan 2016

Abstract

Recurrent neural networks are powerful tools for understanding and modeling computation and representation by populations of neurons. Continuous-variable or "rate" model networks have been analyzed and applied extensively for these purposes. However, neurons fire action potentials, and the discrete nature of spiking is an important feature of neural circuit dynamics. Despite significant advances, training recurrently connected spiking neural networks remains a challenge. We present a procedure for training recurrently connected spiking networks to generate dynamical patterns autonomously, to produce complex temporal outputs based on integrating network input, and to model physiological data. Our procedure makes use of a continuous-variable network to identify targets for training the inputs to the spiking model neurons. Surprisingly, we are able to construct spiking networks that duplicate tasks performed by continuous-variable networks with only a relatively minor expansion in the number of neurons. Our approach provides a novel view of the significance and appropriate use of "firing rate" models, and it is a useful approach for building model spiking networks that can be used to address important questions about representation and computation in neural systems.

Introduction

A fundamental riddle of nervous system function is the disparity between our continuous and comparatively slow sensory percepts and motor actions and the neural representation of those percepts and actions by brief, discrete, and spatially distributed action potentials. A related puzzle is the reliability with which these signals are represented despite the variability of neural spiking across nominally identical performances of a behavior. A useful approach to addressing these issues is to build spiking model networks that perform relevant tasks, but this has proven difficult to do. Here we develop a method for constructing functioning networks of spiking model neurons that perform a variety of tasks while embodying the variable character of neuronal activity. In this context, "task" refers to a computation performed by a biological neural circuit.

There have been previous successes constructing spiking networks that perform specific tasks (see, for example, Seung et al. (2000); Wang (2002); Machens, Romo & Brody (2005); Hennequin et al. (2014)). In addition, more general procedures have been developed (reviewed in Abbott, DePasquale & Memmesheimer (2016)) that construct spiking networks that duplicate systems of linear (Eliasmith, 2005; Boerlin & Denève, 2011; Boerlin, Machens & Denève, 2013) and nonlinear (Eliasmith, 2005; Thalmeier et al., 2015) equations. However, most tasks of interest to neuroscientists, such as action choices based on presented stimuli, are not expressed in terms of systems of differential equations.

Our work uses continuous-variable network models (Sompolinsky, Crisanti & Sommers, 1988), typically called "rate" networks, as an intermediary between conventional task descriptions in terms of stimuli and responses, and spiking network construction. This results in a general procedure for constructing spiking networks that perform a wide variety of tasks of interest to neuroscience (see also Thalmeier et al. (2015); Abbott, DePasquale & Memmesheimer (2016)).
We apply this procedure to example tasks and show how constraints on the sparseness and sign (Dale's law) of network connectivity can be imposed. We also build a spiking network model that matches multiple features of data recorded from neurons in motor cortex and from arm muscles during a reaching task.

Results

The focus of our work is the development of a procedure for constructing recurrently connected networks of spiking model neurons. We begin by describing the model-building procedure and then present examples of its use.

Network architecture and network training

The general architecture we consider is a recurrently connected network of N leaky integrate-and-fire (LIF) model neurons that receives task-specific input F_in(t) and, following training, produces an approximation of a specified "target" output signal F_out(t) (Figure 1a). F_in(t) can be thought of as external sensory input or as input from another neural network, and F_out(t) as the input current into a downstream neuron or as a more abstractly defined network output (for example, a motor output signal or a decision variable).

Figure 1: Network architectures. (a) Spiking network. A network of N recurrently connected leaky integrate-and-fire neurons (green circles) receives an input F_in(t) (grey circle) through synapses U, and generates an output F_out(t) (red circle) through synapses W. Connections marked in red (recurrent connections J and output connections W) are modified by training; black connections are random and remain fixed, including a second set of recurrent connections with strengths J^f. (b) Continuous-variable network. A network of Ñ recurrently connected "rate" units (blue circles) receives inputs F_in(t) and F_out(t) through synapses Ũ and ũ, respectively. All connections are random and held fixed. The sum of ũF_out(t) and the recurrent input determined by J̃ defines the auxiliary targets F_J(t) for the spiking network.

The neurons in the network are connected to each other by synapses with strengths denoted by the N × N matrix J. Connections between the network and the output have strengths given by an N_out × N matrix W, where N_out is the number of outputs (either 1 or 2 in the examples we provide). During network training, both J and W are modified. In addition to the trained connections described by J, we also include random connections defined by another N × N matrix, J^f. The elements of this matrix are chosen randomly from a Gaussian distribution with mean µ/N and variance g_f^2/N and are not modified during training. The values of µ and g_f are given below for the different examples we present. This random connectivity produces chaotic spiking in the network (van Vreeswijk & Sompolinsky, 1998; Brunel, 2000), which we use as a source of spiking irregularity and trial-to-trial variability. We use the parameter g_f to control the level of this variability.

When a neuron in the network fires an action potential, it contributes both fast and slow synaptic currents to other network neurons. These currents are described by the two N-dimensional vectors s(t) and f(t). When neuron i in the network fires an action potential, component i of both s(t) and f(t) is incremented by 1; otherwise

\tau_s \frac{ds(t)}{dt} = -s(t) \quad \text{and} \quad \tau_f \frac{df(t)}{dt} = -f(t). \qquad (1)

The two time constants determine the decay times of these slow and fast synaptic currents, and we set τ_s = 100 ms and τ_f = 2 ms. The synapses that are modified by training are all of the slow type, while the random synapses are fast. For example, the output of the network is the product of s(t) and the output weight matrix W, i.e., Ws(t) (Figure 1a).

The membrane potentials of the model neurons, collected together into an N-component vector V, obey the equation

\tau_m \frac{dV}{dt} = V_{\rm rest} - V + g \left( J s(t) + J^f f(t) + U F_{\rm in}(t) \right) + I, \qquad (2)

with τ_m = 20 ms. For a case with N_in inputs, U is an N × N_in matrix (we consider N_in = 1 and 2) with elements drawn independently from a uniform distribution between -1 and 1. I is a bias current set equal to 10 mV. It is increased between trials in the examples of Figures 3 and 4, representing a "holding" input. Each neuron fires an action potential when its membrane potential reaches a threshold V_th = -55 mV and is then reset to V_reset = V_rest = -65 mV. Following an action potential, the membrane potential is held at the reset potential for a refractory period of 2 ms unless stated otherwise. The parameter g controls the strength of the inputs to each neuron, and we provide its value for the different examples below.
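Equations 1 and 2 fully specify the network dynamics between weight updates, so they can be integrated directly with the Euler method. Below is a minimal sketch of one integration step under the parameter values quoted above; the step size, function name, and array layout are our own choices, not the authors' code.

```python
import numpy as np

def lif_step(V, s, f, refrac, J, Jf, U, F_in, I=10.0, g=7.0,
             dt=0.1, tau_m=20.0, tau_s=100.0, tau_f=2.0,
             V_rest=-65.0, V_th=-55.0, t_ref=2.0):
    """One Euler step of the spiking network (equations 1 and 2).

    V, s, f, refrac are N-vectors; J, Jf are N x N; U is N x N_in.
    Times in ms, voltages in mV. A sketch, not the authors' code.
    """
    # Equation 1: slow and fast synaptic currents decay exponentially.
    s = s - dt / tau_s * s
    f = f - dt / tau_f * f

    # Equation 2: membrane dynamics; refractory neurons are held at reset.
    dV = (V_rest - V + g * (J @ s + Jf @ f + U @ F_in) + I) / tau_m
    V = np.where(refrac > 0, V_rest, V + dt * dV)
    refrac = np.maximum(refrac - dt, 0.0)

    # Threshold crossing: spike, reset, start the refractory clock,
    # and increment component i of s and f by 1 (as stated in the text).
    spiked = V >= V_th
    V[spiked] = V_rest            # V_reset = V_rest
    refrac[spiked] = t_ref
    s[spiked] += 1.0
    f[spiked] += 1.0
    return V, s, f, refrac, spiked
```

The network output at each step is then Ws(t), computed from the slow synaptic currents returned by this update.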
We can now specify the goal and associated challenges of network training. The goal is to modify the entries of J and W so that the network performs the task specified by F_in(t) and F_out(t), meaning that

W s(t) \approx F_{\rm out}(t) \qquad (3)

when the network responds to F_in(t) (with the approximation being as accurate as possible). Equation 3 stipulates that s(t) must provide a basis for the function F_out(t). If it does, it is straightforward to compute the optimal W by minimizing the squared difference between the two sides of equation 3, averaged over time. This can be done either recursively (Haykin, 2002) or using a standard batch least-squares approach.

Determining the optimal J is significantly more challenging because of the recurrent nature of the network. J must be chosen so that the input to the network neurons, Js(t), generates a pattern of spiking that produces s(t). The circularity of this constraint is what makes recurrent network learning difficult. The difference between the easy problem of computing W and the difficult problem of computing J is that, in the case of W, we have the target F_out(t) in equation 3 that specifies what signal W should produce. For J, it is not obvious what the input it generates should be.

Suppose that we did have targets analogous to F_out(t) but for computing J (we call them auxiliary target functions and denote them by the N-component vector F_J(t)). Then J, like W, could be determined by a least-squares procedure, that is, by minimizing the time-averaged squared differences between the two sides of

J s(t) \approx F_J(t). \qquad (4)

There are stability issues associated with this procedure, which we discuss below, but the main challenge in this approach is to determine the appropriate auxiliary target functions. Our solution to this problem is to obtain them from a continuous-variable model. More generally, if we can train or otherwise identify another model that implements a solution to a task, we can use signals generated from that model to train our spiking network.
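With s(t) recorded while the network runs, the fits in equations 3 and 4 reduce to the same linear least-squares problem: find the matrix that maps s(t) onto its target. A minimal batch sketch, assuming the synaptic currents have been sampled into the columns of a matrix S; the ridge constant alpha is an illustrative stabilizer we added, not a value from the paper.

```python
import numpy as np

def batch_least_squares(S, targets, alpha=1.0):
    """Solve targets ~= X @ S for X in the least-squares sense.

    S       : N x T matrix of slow synaptic currents s(t) at T sample times.
    targets : K x T matrix; F_out(t) (K = N_out, eq. 3) or F_J(t) (K = N, eq. 4).
    Returns the K x N weight matrix (W or J) minimizing the time-averaged
    squared error, with a small ridge penalty for numerical stability.
    """
    G = S @ S.T + alpha * np.eye(S.shape[0])   # N x N correlation matrix
    return targets @ S.T @ np.linalg.inv(G)

# Usage: W = batch_least_squares(S, F_out); J = batch_least_squares(S, F_J)
```

The recursive least-squares variant (Haykin, 2002) updates this same solution sample by sample while the network runs.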
Using continuous-variable models to determine auxiliary target functions

Equations 4 and 3, respectively, summarize two key features of the vector of functions F_J(t): 1) they should correspond to the inputs of a recurrently connected dynamical system, and 2) they should provide a basis for the network output F_out(t). To satisfy the first of these requirements, we identify F_J(t) with the inputs of a recurrently connected continuous-variable "rate" network. These networks have been studied intensely (Sompolinsky, Crisanti & Sommers, 1988; Rajan, Abbott & Sompolinsky, 2010) and have been trained to perform a variety of tasks (Jaeger & Haas, 2004; Sussillo & Abbott, 2009; Laje & Buonomano, 2013; Sussillo, 2014). To satisfy the second condition, we use the desired spiking network output, F_out(t), as an input to the rate network. This allows us to obtain the auxiliary target functions without having to train the continuous-variable network.

The continuous-variable model we use is a randomly connected network of Ñ firing-rate units (throughout, we use tildes to denote quantities associated with the continuous-variable network). Like the spiking networks, these units receive the input F_in(t) and, as mentioned above, they also receive F_out(t) as an input. The continuous-variable model is described by an Ñ-component vector x(t) that satisfies the equation

\tau_x \frac{dx(t)}{dt} = -x(t) + \tilde{g} \tilde{J} H(x(t)) + \tilde{u} F_{\rm out}(t) + \tilde{U} F_{\rm in}(t), \qquad (5)

where τ_x = 10 ms, H is a nonlinear function (we use H(·) = tanh(·)), and J̃, ũ, and Ũ are matrices of dimension Ñ × Ñ, Ñ × N_out and Ñ × N_in, respectively. The elements of these matrices are chosen independently from a Gaussian distribution of zero mean and variance 1/Ñ for J̃, and a uniform distribution between -1 and 1 for ũ and Ũ unless stated otherwise. We set g̃ = 1.2 except where identified otherwise.

To be sure that signals from this driven network are appropriate for training the spiking model, the continuous-variable network, driven by the target output, should be capable of producing a good approximation of F_out(t). To check this, we can test whether an N_out × Ñ matrix W̃ can be found (by least squares) that satisfies W̃H(x(t)) ≈ F_out(t) to a sufficient degree of accuracy. Provided J̃ and ũ are appropriately scaled, this can be done for a wide range of tasks (Sussillo, 2014).

The auxiliary target functions F_J(t) that we seek are generated from the inputs to the neurons in the continuous-variable network. There is often, however, a mismatch between the dimension of F_J(t), which is N, and that of the inputs to the continuous-variable model, which is Ñ. To deal with this, we introduce an N × Ñ matrix u, with elements drawn independently from a uniform distribution over the range ±√(3/Ñ), and write

F_J(t) = u \left( \tilde{g} \tilde{J} H(x(t)) + \tilde{u} F_{\rm out}(t) \right). \qquad (6)

We leave out the input term proportional to F_in(t) in this expression because the spiking network receives the input F_in(t) directly. This set of target functions satisfies both of the requirements listed at the beginning of this section and, as we show in the following examples, they allow functional spiking networks to be constructed by finding connections J that satisfy equation 4. We do this initially by a recursive least-squares algorithm (Haykin, 2002), but later we discuss solving this problem by batch least squares instead.
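Because the rate network is driven by the known signals F_out(t) and F_in(t) rather than trained, generating the auxiliary targets only requires integrating equation 5 and reading out equation 6. A minimal Euler sketch under the parameter choices above; the function and variable names, step size, and initial state are our own assumptions.

```python
import numpy as np

def make_aux_targets(F_out, F_in, N, N_tilde=1000, g_tilde=1.2,
                     dt=1.0, tau_x=10.0, seed=0):
    """Integrate the driven rate network (eq. 5) and return F_J(t) (eq. 6).

    F_out : N_out x T array of target outputs (also fed back as input).
    F_in  : N_in x T array of task inputs.
    Returns an N x T array of auxiliary targets for the spiking network.
    """
    rng = np.random.default_rng(seed)
    J_t = rng.normal(0.0, np.sqrt(1.0 / N_tilde), (N_tilde, N_tilde))
    u_t = rng.uniform(-1.0, 1.0, (N_tilde, F_out.shape[0]))
    U_t = rng.uniform(-1.0, 1.0, (N_tilde, F_in.shape[0]))
    u = rng.uniform(-np.sqrt(3.0 / N_tilde), np.sqrt(3.0 / N_tilde),
                    (N, N_tilde))

    x = rng.normal(0.0, 0.1, N_tilde)      # arbitrary small initial state
    T = F_out.shape[1]
    F_J = np.empty((N, T))
    for t in range(T):
        drive = g_tilde * J_t @ np.tanh(x) + u_t @ F_out[:, t]
        F_J[:, t] = u @ drive              # eq. 6: the F_in term is omitted
        x = x + dt * (-x + drive + U_t @ F_in[:, t]) / tau_x
    return F_J
```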
Examples of trained networks

The procedure described above can be used to construct networks that perform a variety of tasks. We present three examples that range from tasks inspired by problems of relevance to neuroscience to modeling experimental data.

Our first example is an autonomous oscillation task that requires the network to generate a self-sustained, temporally complex output (Figure 2). F_out(t) for this task is a periodic function created by summing sine functions with frequencies of 1, 2, 3, and 5 Hz. We require that the network generate this output autonomously, so F_in(t) = 0 for this example. Complex, autonomous oscillatory dynamics are a feature of neural circuits involved in repetitive motor acts such as locomotion (Marder, 2000).

Initially J = 0, so the activity of the network is determined by the random synaptic input provided by J^f, and the neurons exhibit irregular spiking (Figure 2a-c). In this initial configuration, the average Fano factor, computed using 100 ms bins, is 0.5, and the average firing rate across the network is 5 Hz. Following the training procedure, the learned postsynaptic currents Js(t) closely match their respective auxiliary target functions (Figure 2d), and the network output similarly matches the target F_out(t) (Figure 2e). Residual chaotic spiking due to J^f (Figure 2c) and the fact that we are approximating a continuous function by a sum of discontinuous functions cause unavoidable deviations. Nevertheless, a network of 3,000 LIF neurons firing at an average rate of 6.5 Hz with an average Fano factor of 0.25 performs this task with a normalized post-training error of 5% (this error is the variance of the difference between Ws(t) and F_out(t) divided by the variance of F_out(t)).

Because the output for this first task can be produced by a linear dynamical system, previous methods could also have been used to construct a functioning spiking network (Eliasmith, 2005; Boerlin, Machens & Denève, 2013). However, this is no longer true for the following examples. In addition, it is worth noting that the network we have constructed generates its output as an isolated periodic attractor of a nonlinear dynamical system. The other procedures, in particular that of Boerlin, Machens & Denève (2013), create networks that reproduce the linear dynamics that generate F_out(t). This results in a system that can produce not only Ws(t) ≈ F_out(t), but also Ws(t) ≈ αF_out(t) over a continuous range of α values, which often results in a slow drift in the amplitude of Ws(t) over time. The point here is that our procedure solves a different problem than previous procedures, despite the fact that it generates the same output. The previous procedures were designed to duplicate the linear dynamics that produce F_out(t), whereas our procedure duplicates F_out(t) uniquely.

Figure 2: Network activity before and after training for an autonomous oscillation task. (a) Membrane voltage and spiking activity of two example neurons. (b) Raster plot of 200 neurons. (c) Random recurrent input J^f f(t) for two example neurons. (d) Auxiliary target function F_J(t) (black) and learned recurrent input Js(t) (red) for two example neurons. (e) Target output F_out(t) (black) and the generated output Ws(t) (red) over one period. Parameters for this example: N = 3000, g = 7 mV, µ = -57, g_f = 17, and Ñ = 1000.
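For reference, the Fano factor quoted throughout is the across-trial variance of a spike count in a window divided by its mean. A minimal sketch of the computation with binned counts (e.g., the 100 ms bins used above); the helper is ours, not the authors' analysis code.

```python
import numpy as np

def fano_factor(spike_counts):
    """Average Fano factor from binned spike counts.

    spike_counts : trials x neurons x bins array of spike counts
                   (e.g., counted in 100 ms bins, as in the text).
    For each neuron and bin, the Fano factor is the across-trial
    variance of the count divided by its across-trial mean; the
    result is averaged over neurons and bins with nonzero mean.
    """
    mean = spike_counts.mean(axis=0)
    var = spike_counts.var(axis=0)
    valid = mean > 0                    # avoid dividing by zero
    return (var[valid] / mean[valid]).mean()
```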
The second task we present is a temporal XOR task that requires the network to categorize the input it receives on a given trial and report this decision through its output. Each trial for this task consists of a sequence of two pulses appearing as the network input F_in(t) (Figure 3). Each pulse has an amplitude of 0.3, and its duration can be either short (100 ms) or long (300 ms). The two pulses are separated by 300 ms, and after an additional 300 ms delay the network must respond with either a positive or a negative pulse (with a shape given by 1/2 cycle of a 1 Hz sine wave). The rule for choosing a positive or negative output is an exclusive-OR function of the input sequence (short-short → -, short-long → +, long-short → +, long-long → -). The time between trials and the input sequence on each trial are chosen randomly.

Figure 3: Temporal XOR task. The input F_in(t) (black) consists of two pulses that are either short or long in duration. The output Ws(t) (red) should report an XOR function of the combination of pulses. Membrane potentials of 10 example neurons (blue) are shown for the 4 different task conditions: {100 ms, 300 ms}, {300 ms, 300 ms}, {100 ms, 100 ms}, and {300 ms, 100 ms}. Parameters for this example: N = 3000, g = 10 mV, µ = -40, g_f = 12 and Ñ = 1000.

A network of 3,000 LIF neurons with an average firing rate of 7 Hz can perform this task correctly on 95% of trials. As in the autonomous oscillation task, individual neuron spiking activity varies from trial to trial due to the effect of J^f. The Fano factor computed across all neurons, all analysis times, and all task conditions is 0.26. This task requires integration of each input pulse, storage of a memory of the first pulse at least until the time of the second pulse, memory of the decision during the delay period before the output is produced, and classification according to the XOR rule.
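The trial structure is simple enough to write down explicitly. Below is a sketch of a generator for one trial's input and target under the timing and amplitude values above; the sampling step and function name are our own choices.

```python
import numpy as np

def xor_trial(dur1, dur2, dt=1.0):
    """Build F_in(t) and the target F_out(t) for one temporal XOR trial.

    dur1, dur2 : pulse durations in ms, each 100 (short) or 300 (long).
    Pulses have amplitude 0.3 and are separated by 300 ms; the response,
    a half cycle of a 1 Hz sine (500 ms), begins 300 ms after the second
    pulse ends. Sign follows XOR: equal durations -> -, unequal -> +.
    """
    gap, delay, resp = 300, 300, 500        # 1/2 cycle of 1 Hz = 500 ms
    T = int((dur1 + gap + dur2 + delay + resp) / dt)
    t = np.arange(T) * dt
    F_in = np.zeros(T)
    F_in[t < dur1] = 0.3
    F_in[(t >= dur1 + gap) & (t < dur1 + gap + dur2)] = 0.3

    sign = -1.0 if dur1 == dur2 else +1.0   # XOR of short/long
    t0 = dur1 + gap + dur2 + delay
    F_out = np.zeros(T)
    in_resp = (t >= t0) & (t < t0 + resp)
    F_out[in_resp] = sign * np.sin(np.pi * (t[in_resp] - t0) / resp)
    return F_in, F_out
```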
Generating EMG activity during reaching

We now turn to an example based on data from an experimental study, with the spiking network generating outputs that match electromyograms (EMGs) recorded in two arm muscles of a non-human primate performing a reaching task (Churchland, Lara & Cunningham, 2016). In this task, a trial begins with the appearance of a target cue at one of eight possible reach directions (task conditions). After a short delay, during which the arm position must be held fixed, a "go" cue appears, instructing a reach to the target. The time between trials and the sequence of reach directions are varied randomly.

To convey information about target location to the model, we use two network inputs, denoted by a two-component vector F_in(t), with amplitudes 2cos(θ) and 2sin(θ), where the angle θ specifies the reach direction (Figure 4a, left). The input is applied for 500 ms and, when it terminates, the network is required to generate two outputs (thus F_out(t) is also two-dimensional) that match trial-averaged and smoothed EMG recordings from the anterior and posterior deltoid muscles during reaches to the specified target (Figure 4a, right, & e). A network of 5,000 neurons with an average firing rate of 6 Hz performs this task with a normalized post-training error of 7% (Figure 4e), consistent with another modeling study that used a firing-rate network (Sussillo et al., 2015). The activity of the trained network exhibits several features consistent with recordings from neurons in motor cortex. Individual neurons show a large amount of spiking irregularity that is variable across trials and conditions (Figure 4b). The Fano factor computed across all neurons and all task conditions drops during task execution (Figure 4d), consistent with observations across a wide range of cortical areas (Churchland et al., 2010). This network shows that EMGs can be generated by a network with a realistic degree of spiking variability.

Another interesting feature of the network activity is the variety of different neural responses. Individual neurons display tuning during the input period, the output period, or across multiple epochs with different tunings (Figure 4f). As in recordings from motor cortex, consistently tuned neurons represent a minority of the network; most neurons change their tuning during the task (Figure 4g).

Figure 4: Producing EMG during reaching. (a) Task design. A two-dimensional input (left) is applied to the network for 500 ms to specify a reach direction, after which the network must produce output matching the corresponding EMG activity patterns recorded from two arm muscles (right). Each color represents the activity for a specific direction. (b) Raster plot showing the activity of a single neuron across all trials (each row) for all conditions (different colors). The Fano factor for this neuron is 1.2. (c) Firing rate of the neuron shown in (b). Each color represents the trial-averaged firing rate for a single condition. (d) The Fano factor as a function of time computed across all neurons and conditions. (e) Ws(t) for both outputs and all conditions (different colors) on a single trial. (f) Firing rates for four network neurons. Some neurons are tuned during input or output periods exclusively (bottom two plots), while most are tuned during both periods (top two plots). (g) Trial-averaged firing-rate autocorrelation, averaged across all neurons and all conditions. By the time of movement, the autocorrelation function is near zero, indicating that the tuning between input and movement periods is, on average, uncorrelated. The time bar in all panels represents 200 ms, and the dot denotes movement onset. Parameters for this example: N = 5000, µ = -112, g_f = 33, g = 7.2 mV, Ñ = 1000, g̃ = 1.4. Also, in this example, the elements of ũ were chosen randomly and uniformly over the range ±0.76, and the range for Ũ was ±0.25. The refractory period was set to zero in this example. EMGs were filtered with a 4 ms Gaussian window.

To examine the population dynamics of this model, we performed PCA on filtered (4 ms Gaussian filter) single-trial spike trains arranged in a T × NC data matrix, where T is the number of times sampled during a trial (2900), N is the number of neurons (5000), and C is the number of reach conditions (8). We computed the eigenvectors of the T × T covariance matrix obtained from these "data", generating temporal PCs (Figure 5a & b). Each temporal PC represents a temporal function that is strongly represented across the population and across all reach conditions, albeit in different ways across the population on a single trial for each condition. Two important features emerge from this analysis. First, more prominent single-trial PCs (Figure 5a) are relatively consistent from trial to trial, while less prominent PCs (Figure 5b) vary more. Second, the prominent PCs fluctuate on a slower timescale than the less prominent PCs. These two features indicate that the more prominent PCs form the basis of the network output on a single trial, while the less prominent PCs reflect network noise. This can be verified by reconstructing the network output using increasing numbers of temporal PCs and calculating the fraction of the output variance captured (Figure 5c).
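As a sketch of this analysis: stack the filtered single-trial responses into the T × NC matrix, diagonalize the T × T covariance to obtain temporal PCs, and measure how much of the output a projection onto the leading PCs captures. The reconstruction-by-projection step and all names below are our rendering of the description above, not the authors' code.

```python
import numpy as np

def temporal_pcs(X, output, k):
    """Temporal PCA of single-trial activity, as described in the text.

    X      : T x (N*C) matrix of filtered single-trial spike trains
             (T time samples; N neurons x C conditions stacked column-wise).
    output : T-vector, the network output Ws(t) on the same trial.
    k      : number of leading temporal PCs to use.
    Returns (pcs, frac): the T x k most prominent temporal PCs and the
    fraction of output variance captured by projecting onto them.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]          # T x T covariance matrix
    evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    pcs = evecs[:, ::-1][:, :k]            # k most prominent temporal PCs

    out_c = output - output.mean()
    recon = pcs @ (pcs.T @ out_c)          # projection onto the PC subspace
    frac = 1.0 - np.var(out_c - recon) / np.var(out_c)
    return pcs, frac
```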
