Malte J. Rasch, Arthur Gretton, Yusuke Murayama, Wolfgang Maass and Nikos K. Logothetis J Neurophysiol 99:1461-1476, 2008. First published Dec 26, 2007; doi:10.1152/jn.00919.2007 You might find this additional information useful... This article cites 47 articles, 16 of which you can access free at: http://jn.physiology.org/cgi/content/full/99/3/1461#BIBL Updated information and services including high-resolution figures, can be found at: http://jn.physiology.org/cgi/content/full/99/3/1461 Additional material and information about Journal of Neurophysiology can be found at: http://www.the-aps.org/publications/jn This information is current as of April 14, 2008 . D o w n lo a d e d fro m jn .p h y s io lo g y .o rg o n A p ril 1 4 , 2 0 0 8 Journal of Neurophysiology publishes original articles on the function of the nervous system. It is published 12 times a year (monthly) by the American Physiological Society, 9650 Rockville Pike, Bethesda MD 20814-3991. Copyright © 2005 by the American Physiological Society. ISSN: 0022-3077, ESSN: 1522-1598. Visit our website at http://www.the-aps.org/. JNeurophysiol99:1461–1476,2008. FirstpublishedDecember26,2007;doi:10.1152/jn.00919.2007. Inferring Spike Trains From Local Field Potentials Malte J. Rasch,1 Arthur Gretton,2 Yusuke Murayama,2 Wolfgang Maass,1 and Nikos K. Logothetis2,3 1InstituteforTheoreticalComputerScience,GrazUniversityofTechnology,Graz,Austria;2MaxPlanckInstituteforBiological Cybernetics,Tu¨bingen,Germany;and3ImagingScienceandBiomedicalEngineering,UniversityofManchester,Manchester, UnitedKingdom Submitted16August2007;acceptedinfinalform24December2007 RaschMJ,GrettonA,MurayamaY,MaassW,LogothetisNK. is used in most recordings to obtain multiple-unit spiking Inferring spike trains from local field potentials. J Neurophysiol activity(MUA),andalow-passfiltercutoffofabout300Hzto 99: 1461–1476, 2008. First published December 26, 2007; obtain the so-called local field potentials (LFPs). Numerous doi:10.1152/jn.00919.2007.Weinvestigatedwhetheritispossibleto experiments have presented data indicating that such a band infer spike trains solely on the basis of the underlying local field separation does indeed underlie different neural events (for potentials (LFPs). Using support vector machines and linear regres- references see, e.g., Logothetis 2003). sion models, we found that in the primary visual cortex (V1) of In summary, depending on the recording site and the elec- monkeys, spikes can indeed be inferred from LFPs, at least with moderatesuccess.Althoughthereisaconsiderabledegreeofvariation trode properties, the MUA most likely represents a weighted across electrodes, the low-frequency structure in spike trains (in the sumoftheextracellularactionpotentialsofallneuronswithin D 100-ms range) can be inferred with reasonable accuracy, whereas a sphere whose radius is about 140 to 300 (cid:2)m, with the ow exactspikepositionsarenotreliablypredicted.Twokindsoffeatures electrodeatitscenter(Henzeetal.2000).Spikesproducedby n lo oftheLFPareexploitedforprediction:thefrequencypowerofbands the synchronous firings of many cells can, in principle, be a in the high (cid:1)-range (40–90 Hz) and information contained in low- enhanced by summation and thus detected over a larger dis- de furleaqtiuoennscyaroesciinllfaotrimonasti(v(cid:1)e.10InHfozr)m,wathioenreabnoatlhyspihsasreevaenadlepdowtheartmboodth- tance (Arezzo et al. 1979; Huang and Buchwald 1977). In d fro general, experiments have shown that large-amplitude signal m featurescode(mainly)independentaspectsofthespike-to-LFPrela- variations in the MUA range reflect large-amplitude extracel- jn tcilounsstehriepd, wspitihkitnhge alocwtiv-fitrye.quAenltchyouLgFhPbpohthasefecaotudriensgafnodr tepmrepdoicrtailolyn lular potentials and that small-amplitude fast activity is corre- .ph y qualityaresimilarduringseminaturalmoviestimuliandspontaneous latedwithsmallones(BuchwaldandGrover1970;Gasserand sio activity,predictionperformanceduringspontaneousactivitydegrades Grundfest 1939; Grover and Buchwald 1970; Hunt 1951; lo much more slowly with increasing electrode distance. The general Nelson 1966). gy trend of data obtained with anesthetized animals is qualitatively The low-frequency range (i.e., the LFPs) of the mEFP .o rg mirrored in that of a more limited data set recorded in V1 of signal, on the other hand, represents mostly slow events re- o non-anesthetizedmonkeys.Incontrasttothecorticalfieldpotentials, flecting cooperative activity in neural populations. Initially n A thalamicLFPs(e.g.,LFPsderivedfromrecordingsinthedorsallateral these signals were thought to represent exclusively synaptic p gacetniivciutyla.tenucleus)holdnousefulinformationforpredictingspiking eavnedntBso(nAdjm19o6n4e-,M1a9r6s7a)n. 1E9v6id5e;nBcuechfowratlhdeeirt aolr.ig1i9n65w;aFsroomftemn ril 14, 2 gatheredfromcurrent-sourcedensity(CSD)analysisandcom- 0 0 bined field potential and intracellular recordings (Mitzdorf 8 INTRODUCTION 1985; Nadasdy et al. 1998). Mitzdorf suggested that LFPs In a typical electrophysiology experiment, the signal mea- actually reflect a weighted average of synchronized dendroso- sured by an electrode placed at a neural site represents the matic components of the synaptic signals of a neural popula- mean extracellular field potential (mEFP) from the weighted tionwithin0.5–3mmoftheelectrodetip(Juergensetal.1999; sum of all current sinks and sources along multiple cells. If a Mitzdorf 1987). Later studies, however, provided evidence of microelectrode with a small tip is placed close to the soma or the existence of other types of slow activity unrelated to axonofaneuron,thenthemeasuredmEFPdirectlyreportsthe synapticevents,includingvoltage-dependentmembraneoscil- spiketrafficofthatneuronandfrequentlythatofitsimmediate lations (e.g., Kamondi et al. 1998) and spike afterpotentials. neighbors as well. If the impedance of the microelectrode is Taken together, LFPs represent slow waveforms, including sufficientlylowanditsexposedtipisabitfartherfromasingle synaptic potentials, afterpotentials of somatodendritic spikes, large pyramidal cell, so that action potentials do not predom- andvoltage-gatedmembraneoscillations,thatreflecttheinput inate the neural signal, then the electrode can monitor the of a given cortical area as well as its local intracortical totality of the potentials in that region. The EFPs recorded processing, including the activity of excitatory and inhibitory undertheseconditionsarerelatedbothtointegrativeprocesses interneurons. (dendritic events) and to spikes generated by several hundred Given the different natures of LFPs and MUA, we felt that neurons. it would be interesting to address the question of whether one Thetwodifferentsignaltypescanbesegregatedbyfrequency- can infer spiking of neurons from the locally measured field bandseparation.Ahigh-passfiltercutoffofabout300–500Hz potentials.Hereinweaddressthisquestioninastraightforward Address for reprint requests and other correspondence: N. K. Logothetis, Thecostsofpublicationofthisarticleweredefrayedinpartbythepayment Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, 72076 ofpagecharges.Thearticlemustthereforebeherebymarked“advertisement” Tu¨bingen,Germany(E-mail:[email protected]). inaccordancewith18U.S.C.Section1734solelytoindicatethisfact. www.jn.org 0022-3077/08$8.00Copyright©2008TheAmericanPhysiologicalSociety 1461 1462 RASCH, GRETTON, MURAYAMA, MAASS, AND LOGOTHETIS manner. We use methods derived from the field of machine cluding drives, holder, and preamplifier, was custom designed to learning in an attempt to infer exact spike timings from the minimizecrosstalkofsignalsbetweenelectrodes(typically(cid:5)1ppm). underlying LFPs. We compare the accuracy of spikes trains Thesignalswereamplifiedandfilteredintoabandof1–8kHz(Alpha predictedbysupervisedlearningalgorithmsonawiderangeof Omega Engineering, Nazareth, Israel) and then digitized at 21 kHz recordingsfromtheprimaryvisualcortex(V1)aswellasfrom with 16-bit resolution (National Instruments, Austin, TX), ensuring the lateral geniculate nucleus (LGN) of anesthetized and non- enoughresolutionforbothlocalfieldandspikingactivities.Binocular visual stimulation was provided through a two-fiber optic system anesthetized macaques and investigate what kinds of features (Avotec, Stuart, FL) after fine alignment to each of the animal’s of the LFP are important for inferring spikes from LFP. foveas by a modified retinoscope coupled with a stimulus projector holder. METHODS Inthecaseoftheanesthetizedanimalswedifferentiatebetweentwo different conditions: spontaneous activity (“spo”) and movie-driven Data acquisition data(“stm”).Intheformertheinputscreenisblankforabout5min; Electrophysiologicaldatarecordedfromnineanesthetizedandtwo inthelattera4-to6-minsegmentofacommerciallyavailablemovie non-anesthetized monkeys (Macaca mulatta) are included in the isshown.Movieframesweresynchronizedwiththerefreshrateofthe present study. All animal experiments were approved by the local monitor(60Hz,twosynchronizationspermovieframe)andcovered authorities (Regierungspraesidium) and are in full compliance with 7–12° of the visual field. Most of the electrodes were confirmed to theguidelinesoftheEuropeanCommunity(EUVD86/609/EEC)for haveareceptivefieldwithinthemoviepresentationarea(seeFig.1B the care and use of laboratory animals. Surgical procedures are foranexample).Multipletrialsofmoviepresentationsandspontane- described elsewhere (Logothetis et al. 2002) together with hardware ous activity are run within one recording session (intermingled with detailsoftherecordingsetup. recordingsofotherstimulinotconsideredhere).Fromthesedatawe Toperformtheneurophysiologicalrecordingsinanesthetizedmon- include1,304recordedtimeseriesinthepresentstudy,whichwecall D o keys, the animals were anesthetized [remifentanil (typically 1 trialsthroughoutthisarticle.Thedatasetconsistedofrecordingsfrom w (cid:2)g(cid:1)kg(cid:2)1(cid:1)min(cid:2)1)], intubated, and ventilated. Muscle relaxation was nineanimalscollectedin12recordingsessions.Fromeachsessionwe nlo achievedwithmivacurium(5mg(cid:1)kg(cid:2)1(cid:1)h(cid:2)1).Bodytemperaturewas takefiverepeatsofmoviepresentationandfiverepeatsofspontaneous a d keptconstant,andlactatedRingersolutionwasgivenatarateof10 activity trials (with the exception of j97nm1, where only one movie e d mmol(cid:1)nkkge(cid:2)y1(cid:1)ahn(cid:2)d1.thDeudreinpgththoef aennetisrteheesxiapewriemreenct,onthtienuvoiutasllysimgnosniotofrtehde. terlieacltirsodaevacihlaabnlnee).lsTpoearvsoeisdsiaonnyasruebijneccltuivdeeds.elTehctiisornesbuialtssailnlm67e0asturiraelds from Drops of 1% ophthalmic solution of anticholinergic cyclopentolate for spontaneous activity and 634 for movie stimulus recorded using jn hydrochloridewereinstilledintoeacheyetoachievecycloplegiaand 134 electrode placements. Movies are identical within a session but .p h mydriasis. Refractive errors were measured and contact lenses [hard maydifferbetweensessions. y s poly(methylmethacrylate)lensesbyWo¨hlkContactlinsen,Karlsruhe, In three sessions (two anesthetized monkeys) no more than four io Germany]withtheappropriatedioptricpowerwereusedtobringthe electrodes were simultaneously placed in the LGN. Thus stimuli lo g animal’seyeintofocusonthestimulusplane.Simultaneousrecording reflected in these data (100 trials) are exactly identical to those in y .o of neural activities were made from the primary visual cortex (V1) correspondingrecordingsoftheV1data.Thisdataisseparatedagain rg using8–16electrodesconfiguredin4(cid:3)4or2(cid:3)8matricesinagrid according to the conditions spontaneous activity (“spo(L)”) and o of1–2mm.Electrodetipsweretypically(butnotalways)positioned movie-driven activity (“stm(L)”). Data from non-anesthetized mon- n A intheupperormiddlecorticallayers.Theimpedanceoftheelectrode keysaremorelimitedandwereincludedinthepresentstudyonlyto p varied from 300 to 800 k(cid:4). In the case of simultaneous LGN corroborate results for the data described earlier. They are recorded ril 1 recordinganadditionalsetofdrives,usuallyconsistingoftwotofour usingonetofourtetrodesfromchronicimplantspenetratingV1(for 4 electrodes, was additionally positioned. The electronic interface, in- a detailed description of surgical methods and recording setup see , 2 0 0 8 A B Receptive fields 600 z] H pike rate [ 240000 °ual field [] −05 s s st. Vi n 0 −10 I 5 −5 0 5 Visual field [°] s] nit u D 0 V1 electrode S P [ LGN electrode F Stimulus region L −5 25 30 35 40 45 Time [sec] FIG.1. A:representativeelectroderecordingfromSessiona98nm5ofananesthetizedmonkey.Topplot:theinstantaneousfiringrateofanexperimentduring moviepresentationfromaprimaryvisualcortex(V1)electrodeinasmalltimeregion.Moviepresentationstartsafterablankperiodat30s.Recordingtime of170-sdurationstarting5saftermovieonsetisusedforpredictionperformanceevaluationandiscalledatrial(seeMETHODS).Bottomplot:thelocalfield potential(LFP)tracecorrespondingtotheV1electrodeabove.B:thearrangementofreceptivefieldsrelativetothemoviepresentationareaforSessiona98nm5, wheresimultaneousV1andlateralgeniculatenucleus(LGN)recordingsareavailable.Othersessionshavesimilarelectrodearrangements. JNeurophysiol•VOL99•MARCH2008•www.jn.org INFERRING SPIKE TRAINS FROM LFPS 1463 Tolias et al. 2007) in a total of 56 trials. Unlike the data from The sampling interval (cid:8) is 5 ms, in accordance with the sampling anesthetized animals the stimulus conditions here are mixed, with frequencyoftheLFPsignal(200Hz). spontaneousactivity(notask)andafixationtaskshowinggratingsof differentorientations.Thislastdataislabeled“awake”inthefollow- LEARNING ALGORITHMS. A support vector machine (SVM) was ing. All data are processed in the same way as outlined in the next used (Vapnik 1999) as our learning algorithm. For a more detailed section. introduction to SVMs see, for example, Bishop (2006), Burges (1998), and Scho¨lkopf and Smola (2002). SVMs perform binary PROCESSING. Thedatapreprocessingstepsareasfollows.Electrode classificationinasupervisedfashion. signals were decimated to 7 kHz. Spiking activity is inferred from Briefly,themodelcanbestatedasfollows(fordetailsseeBishop high frequencies of the resulting signal (see following text). The 2006) recordinghardwareintroducesahigh-passfilterwithacutoff(cid:1)1Hz; 1Hzisthusthelowestfrequencyconsideredhere. h(cid:10)x(cid:11)(cid:4)wT(cid:12)(cid:10)x(cid:11)(cid:5)b (1) The7-kHzsignalislow-passfilteredwithacutofffrequencyof250 Hzandresampledfirstto500Hzforcomputationalconvenience.The whereonelooksforthedecisionboundaryorweightvectorw;bisa resulting signal is low-pass filtered at 90 Hz to derive local field biastermand(cid:12)isaprojectionintoaspaceoffeatures.SVMschoose potentials (LFPs). For low-pass filtering we use a custom finite thehyperplanethathasthewidestmarginbetweenbothclassesrather impulseresponsefilter(FIR):aKaiserwindowFIRfilterwith60-dB thananarbitraryseparatinghyperplane.Thisisachievedbyenforcing attenuationinthestopband,a0.01-dBpassbandripple,andatransi- appropriate constraints in the optimization. For nonseparable prob- tion band of 1 Hz. To eliminate possible phase shifts, signals were lems,suchasourreal-worlddata,oneintroducestheconceptofsoft filteredforwardandbackward(usingtheMatLabfiltfiltfunction).The margins, i.e., in the optimization one now allows for incorrectly signalisthenresampledtoafinalsamplingrateof200Hz. classified examples, where an additional parameter C regulates the GoodpropertiesofFIRfiltersarewonattheexpenseoflargefilter penalty. sizes(afewseconds).However,sincewediscardleadingandtrailing SVMs have the power to do nonlinear separation (seen from the D pcoonrtcieornnshoefr(cid:6)e.15sofeachtrial,filteron-andoffsetartifactsareofno pimerpslpiceicttliyvedeofifniensputhtespfeaacteu)rebymcahpo(cid:12)os.inHgeraeninapnpornolpinreiaatrerkaedrinalelbtahsaist own function(RBF)kernelsareused. lo a SPIKEEXTRACTION. Spiketimesaredetectedbyapplyingathresh- AsasimplealternativetoSVMsweusedstandardlinearregression d e old to the high-pass filtered 7 kHz signal described earlier (fourth- (withaconstantbiasterm)onthelabelvectorandthesamples(see, d orderButterworth,cutofffrequency500Hz).SincethisMUAsignal e.g.,Bishop2006).Briefly,usingalinearmodelh (x)(cid:7)wTx (cid:9)b fro isusuallyasymmetric,thedetectionthresholdisautomaticallyapplied wecalculatedtheoptimalweightvectorw*bymirneigmiizingthemi ean m to that side where spike waveforms exhibit greater deflection. To squarederror(cid:13)[h (x)(cid:2)y]2(cid:14)onthetrainingsamples.Classlabelson jn avoid dependence of the size of spikes the threshold is applied at thetestsetwereroebgtaiinedbiythresholdingwiththesignfunction,i.e., .ph 3.5SD ((cid:3)) of the “noise component” of the MUA signal. (cid:3) is y (cid:7)sign[h* (x)]. ys estimated by calculating the SD of the signal, neglecting the 4.55% j reg j io ((cid:6)2(cid:3))absolutehighestvaluesdividedbythepercentageofvariance EXTRACTIONOFLFPFEATURES. AnLFPfeaturecouldbeanyaspect log y thatiskeptingeneral,whensettingtheprobabilityofabsolutevalues oftheLFPthatonemightdeemhelpfulforinferringwhetherthereis .o (cid:6)2V(cid:3)istuoazleirnos.pection confirms that spikes are well detected. If the aressppeikcet atot tti.i,Initsouproawnearlyastisd,iwffeeruesnetdfrtheqeuLeFncPieast,dainffdertehnet lpahgassweiothf onrg assumption of a Gaussian “noise component” is correct, then the oscillationsatparticularfrequencies(alsoatdifferentlags).Multiple A rNfwareootermteeodtfmhowanutelrtotiwhnpeigltehlryenssdieuneulgttreiloencn-gttseipsdp(essilekpeeeickttrDerosIadSiieCnssUsaSwwSbeIoiOluNdlto)m.1no.B6osettHculaizsukese(eflaoynmryin(cid:3)okcsit(cid:7)lnudrde3eco.o5fsrpdsSipiknDikge)ess. fmieneaaetcotahuntredeasisnmtdaaernunednsiisttoirmSnaDipnol.ifynFgtehcaoestenurtcrseea.sstuealntriaentegedxstairnamctptheldeespsar{imxoirp}tloeisdvnievocirtdmoirnaglxiitz.heeNdsotatomezptehlreaost pril 14, 20 sorting. feaItfugre(tim)raeyprbeesednetfisnthede(ansoTrm(atl)iz:e(cid:7)d)gv(tolta)g,ewahtesraem(cid:6)pl(cid:7)ingk(cid:8)binretpi,raesteimntes 08 k i i(cid:9)k the time lag (we neglect boundaries to simplify the description). Learning to infer spike trains FeaturesT (t)simplyrepresenttheLFPtimecourserelativetosample k i The learning algorithm has to learn to map from LFP waveforms timet. i (or other LFP features) to spikes. Ideally, the learning algorithm Another type of feature, which we denote P (t), is the estimated f,k i should output all predicted real-valued spike timings at once if the powerat(center)frequencyfoftheLFPtimecourseattimet .To i(cid:9)k LFP time course is given as input, although this task requires too obtain an estimate for the power at a given frequency and time, we muchdata.Instead,wesimplifythetaskbyassumingthatspikesare calculatedthespectrogram,usingthemultitaperapproachintroduced independent and that the spike-to-LFP relationship remains constant by Thomson (Jarvis and Mitra 2001; Percival and Walden 2002; over time. With these assumptions one can use a binary classifier, Thomson1982).Becausespikesaresingleeventsonthetimescaleof which yields the prediction of a spike (or no spike) at time t. (cid:8), we chose a high temporal resolution at the expense of frequency Concatenating the prediction for each t results in a predicted spike resolution. We set the moving window to 150 ms and the time– trainforagivenLFP.Notethattheindependenceassumptiondoesnot bandwidthproductto1.6,whichimplicitlysetsthehalf-bandwidthto implythatpredictedspiketrainsarenecessarilyuncorrelatedbecause W (cid:7) 10.67 Hz. Spectral estimation was averaged over two Slepian temporalcorrelationcanbeinducedbytheunderlyingLFPs. tapers. Because this window setting does not allow accurate power In supervised fashion the binary classifier is trained on a set of estimation(cid:1)20Hzweusedlargerwindowsforbands(cid:1)20Hz(500 trainingexamplesandtestedonadistincttestset.Wetrainabinary ms)and(cid:1)6Hz(2s).Thisreducedthehalf-bandwidthto3.2and0.8 classifier on LFP features summarized in the sample vectors x, i (cid:7) Hzforfrequencies(cid:1)20and6Hz,respectively.WealsotriedMorlet i 1,...,L,topredictthelabely (cid:1){1,(cid:2)1}.Theindexiistheithpoint waveletswithvariablebandwidthperfrequency,butthisdidnotalter i of the discrete LFP time series with sampling frequency 1/(cid:8) at predictionperformance. recordingtimet (cid:7)i(cid:8)(cid:9)t .Thusy (cid:7)1statesthatthereoccurs(at To access phase information at particular frequencies of the LFP, i 0 i least)onespikewithintimebint andy (cid:7)(cid:2)1indicatesnospike.In we first band-pass filtered the recorded signal with the FIR Kaiser i i this framework prediction is temporally restricted to the sampling filter(describedearlier)withabandwidthof2(Fig.6)or4Hz(Fig. resolutionoftheLFPs,makingitnecessarytobinthespiketimings. 7), and then used the Hilbert transform to extract an instantaneous JNeurophysiol•VOL99•MARCH2008•www.jn.org 1464 RASCH, GRETTON, MURAYAMA, MAASS, AND LOGOTHETIS pfehaatsueres(cid:7)f((cid:8)tif),k(atit)f:r(cid:7)equceonsc[y(cid:7)ff(.ti(cid:9)Frko)]mhathveinseg saiglnaaglsofwe(cid:6)d(cid:7)efikn(cid:8)ed. Tphheassee syn(cid:10)F1,F2(cid:2)S(cid:11):(cid:7)I(cid:10)S;F1,F2(cid:11)I(cid:10)(cid:11)S;IF(cid:10)S;,FF1(cid:11)(cid:11)(cid:11)I(cid:10)S;F2(cid:11) (4) features have phase information identical to that of the band-passed 1 2 signalsbutaredevoidofanyamplitudemodulation(AM).Addition- This measure ranges from (cid:2)1 in the case of completely redundant ally,weused(cid:8)ˆf,k:(cid:7)sin[(cid:7)f(ti(cid:9)k)]inthefeatureanalysisofFig.6to information to 1 for completely synergic information. The measure help the classifier linearly extract phase locking at phases where the syn(F ,F (cid:2)S)iszeroifbothfeaturesF andF conveyindependent 1 2 1 2 cosinewouldbenearthezerocrossing. informationaboutthespikesS. To analyze prediction accuracy on different timescales, we used PERFORMANCEMEASURES. Thekappameasurewasusedasamea- spectral coherence (Jarvis and Mitra 2001). Spectra were again esti- sureofperformance(Cohen1960).Letp bethefractionofsamples l,r matedviaamultitaperapproachdesignedforpointevents(Jarvisand havingtargetlabell(cid:1){(cid:2)1,1}andpredictedlabelr(cid:1){(cid:2)1,1}and Mitra 2001). Here the time–bandwidth product was set to TW (cid:7) 3 letqlandq˜rbethefractionofsamplesinthetestsetthathavethelabel usingtheaverageofK(cid:7)5tapers,yieldingahalf-bandwidthofW(cid:7) l in the target or the label r in the prediction, respectively. Then the 0.001HzforT(cid:7)17s. chancelevelforclassificationisgivenby(cid:9) (cid:7)q q˜ (cid:9)q q˜ .Ifwe c (cid:2)1 (cid:2)1 1 1 define (cid:9)0 (cid:7) p(cid:2)1,(cid:2)1 (cid:9) p1,1 to be the overall fraction of correctly PERFORMANCE EVALUATION. We evaluated the prediction perfor- classifiedsamples(bothpositiveandnegative),then(cid:10)isgivenby mance for each trial separately, using 10-fold cross-validation. We analyzed a 170-s region, avoiding the on- and offset of the movie (cid:9) (cid:11)(cid:9) (cid:10):(cid:7) 0 c (2) stimulus. Spontaneous activity trials are also restricted to 170-s 1(cid:11)(cid:9) duration. In the case of tetrode recordings, performance is estimated c astheaverageperformanceofthefourwiresofthetetrode. Thismeasureisanormalizedabove-chanceclassificationrate.Itcan HyperparametersfortheSVMalgorithmwereestimatedasfollows. beeasilyseenthat(cid:10)equalszeroifpredictionisatchancelevel(i.e., The RBF kernel width (cid:9)was selected by a heuristic procedure. We Do (cid:9)(cid:9)00(cid:7)(cid:7)(cid:9)1c)). andequalsoneifthepredictedclassificationisperfect(i.e., tdoiostkan(cid:9)cteosbine1th.7e7tr(aoirn3in.5g4s)etti.mFeosrtheeacmhetdriiaalnwdiestcahnoceseotfhaaltlCEu(calniddea(cid:9)n) wnloa AnotherperformancemeasureistheSpearmanrankcorrelationr(cid:3) showingthebestperformance(averagedover10cross-validationruns d e betweensmoothedpredictedspiketrainsandtargetspiketrains.This on a logarithmic grid of 25 values from 0.25 to 400). We visually d might give a more intuitive picture of the prediction quality. If not confirmedthatthisrangeisappropriateforourdata(notshown).We fro stated otherwise, spike trains are smoothed by a Gaussian kernel of used the libSVM library (http://www.csie.ntu.edu.tw/(cid:1)cjlin/libsvm/) m width25ms. forallSVMcalculations. jn Yet another measure for prediction quality is the mutual informa- Since the sample sizes were heavily biased toward the negative .ph tionbetweenclasslabels.Themutualinformation(MI)betweentarget (nonspiking)classwerandomlypickedapproximatelythesamenum- ys spikes S and the prediction outcome of a classifier C(F) using LFP ber of samples of both classes from the training region. This effec- iolo featuresFandlabelsL:(cid:7){(cid:2)1,1}is tivelychangesthelossfunctionfromequallosstohigherimportance g y I(cid:15)S;C(cid:10)F(cid:11)(cid:16)(cid:4) (cid:1)l(cid:1)L r(cid:1)(cid:1)L PI,rlog2qpllq,rr (3) frccoaloatrness)ss.tepasWinkte(ecoslrua(ssatesbhdoerau1ttm,i0oaa0)x0fiaimnvadenufdmeoml1d,pa2ivi0nrai0ccirlaaeslbaalymsleefp,oliduenesnpdtfheotenrhdistisrpnatigoikniobinnneggatahrngeedogmoinodeonancnsowpmfiiiktprhiirnnogga- on Ap.org wherewetaketheprobabilitiesdefinedearlier.ThisestimationofMI m(cid:6)i1s,e00b0etswameepnlepsreodnilcytiomnarqguinaaliltlyy aimndprocovmedputhtaetiroensaulltssp(eneodt sbheocwauns)e. ril 1 is different from nonparametric approaches in that it can access Trainingtheclassifieronallpossiblesampleswasprohibitivedueto 4, 2 dependencethatisonlyinreachoftheclassifier;thusonehastomake the enormous sample size. We tried to use class biasing in the C 0 0 sure that the classifier captures the main aspects of its dependence. parameters(Musicantetal.2003),butthisonlyincreasedcomputation 8 Notethatweusethenaiveestimatorformutualinformation[without timewithlittlegaininpredictionquality. biascorrection(Panzerietal.2007)].SinceallMIvaluecalculations The test set was always a temporally contiguous region to avoid involve an identical number of bins—that is, two, one for each featurecorrelationoftrainedandtestedsamplesthatmightlienearby class—we can nevertheless safely compare results even for classifi- intimeifarandomizedsetofsampleswereused. cationswithdifferentnumbersoffeatures.However,theabsoluteMI FEATURESELECTIONALGORITHM. Wenowdescribehowtodeter- valuesmightbebiased. mine the usefulness of different features for spike prediction. Al- To access redundancy, synergy, and independence of information thoughfeaturesimportantfortheSVMclassifierarehardtointerpret, (Polaetal.2003;Schneidmanetal.2003)conveyedbytwofeatures inthecaseoflinearregression(withsquaredloss)onecanderivean F andF aboutthespikingactivityS,weestimatemutualinformation 1 2 analytical expression for the prediction error of a set of features using two classifiers trained on features F and F individually, 1 2 involving only the STAs and correlation among features. Based on yieldingI(S;F ):(cid:7)I[S;C(F )]andI(S;F ):(cid:7)I[S;C(F )].Thenathird 1 1 2 2 thispredictionerror,wederivedanalgorithmthatforwardlyselectsa classifier is trained on both features jointly, yielding I(S; F1, F2) :(cid:7) small subset of features out of a much larger pool of features. As I[S;C(F1,F2)].Ifbothfeaturescarriedindependentinformationfrom subsequently explained, the selected subset will show minimal pre- theperspectiveoftheclassifier,bothfeaturestogetherwouldconvey diction error compared with other subsets with the same number of identical information as individual features; i.e., I(S; F , F ) (cid:7) I(S; features.Inthatsensethesubsetoffeaturesselectedbyouralgorithm 1 2 F )(cid:9)I(S;F ).Ifbothfeatureswererelatedbyaone-to-onemapping represents the most useful features from a given pool. Because this 1 2 (completely redundant information about the spikes), then all terms feature selection can be efficiently done for huge feature pools, we wouldbeequal:I(S;F )(cid:7)I(S;F )(cid:7)I(S;F ,F ).Ifthetwofeatures restrictedfeatureanalysestolinearclassification,ratherthanusingthe 1 2 1 2 didnotcarryinformationindividually,i.e.,I(S;F )(cid:7)I(S;F )(cid:7)0,but SVM classifier. This is not too restrictive in our case because linear 1 2 carriedinformationtogether,I(S;F ,F )(cid:6)0,theywouldbetermed classification achieves almost 90% of the performance of an SVM 1 2 (completely) synergistic. Thus we define a normalized degree of classifier(seeRESULTS). synergy of information about the spikes (as measured by the classi- In linear regression we look for the weight vector w that has ficationalgorithm)as(Schneidmanetal.2003) minimalerrorinameansquaredsense JNeurophysiol•VOL99•MARCH2008•www.jn.org INFERRING SPIKE TRAINS FROM LFPS 1465 (cid:12)(cid:10)w(cid:11)(cid:4)(cid:13)(cid:10)y (cid:11)wTx(cid:11)2(cid:14) (5) investigatewhichLFPfeaturesareimportantfortheprediction i i task and which aspect of the spikes they code for. wherethebracketsindicateaveragingoverallsamplesi.Minimizing theerrorisstraightforwardandresultsintheoptimalweightvector1 Average spike to LFP-power and -phase relation w*of w*(cid:4)(cid:13)xxT(cid:14)(cid:2)1(cid:13)yx(cid:14) (6) Figure1showsspikingactivityand(normalized)LFPsofa i i i i representative electrode. Relationships between spiking activ- provided that the estimated correlation matrix A: (cid:7) (cid:13)xxT(cid:14) has full ity and underlying LFPs are visualized in Fig. 2. i i rank.Wenotethatwehaveabinaryclassificationandthusyi(cid:1){(cid:2)1, InFig.2Athespike-triggeredaverage(STA)oftheLFPsof 1}.Thus anexampleelectrodeduringmoviestimulusisplotted.Clearly, thereisalinearrelationbetweenspikesandLFPs.Onenotesa c c (cid:13)yx(cid:14)(cid:4) (cid:11) (cid:2)1 m (cid:5) 1 m sharp negativity at spike position at zero time lag and a i i c (cid:5)c (cid:2)1 c (cid:5)c 1 1 (cid:2)1 1 (cid:2)1 prominentupswingforpositivetimelags,i.e.,afterspikinghas wherem andm aretheclassmeansforthenonspikingandspiking occurred. Likewise, in the STA of the spectrogram (Fig. 2B) (cid:2)1 1 class,respectively,andc andc arethenumberofsamplesineach power is enhanced in the high-frequency range (40–90 Hz) (cid:2)1 1 class. With the condition that each feature is normalized to variance during spiking activity. Enhancement of power at high LFP oneandzeromean,Eq.6reducesto frequencies as a response to spikes is common among elec- trodes,stimulusconditions,andmonkeys,aswewillseeinthe w*(cid:4)cA(cid:2)1m (7) 1 next sections. Figure 2C shows the probability of spiking activity at the with oscillation phase of a particular LFP frequency for the same D o 2c example electrode, but averaged (cid:6)30 repeats of the movie pre- w c:(cid:7)c (cid:5)1c sentation((cid:5)120minofrecordingtime).Onenotesthatthephases nlo 1 (cid:2)1 a of all LFP frequencies are at least weakly related to spiking d e Theminimalerroristhengivenby activity(Raleightestofnonuniformangulardistribution).Most d (cid:12)(cid:10)w*(cid:11)(cid:4)1(cid:11)c2mTA(cid:2)1m (8) strikingly,spikesarerelativelytightlylockedtophasesoflow from 1 1 frequencies ((cid:13)10 Hz). The generality of this behavior is jn Inotherwords,theerrorofthelinearregressionisdependentonlyon illustrated in Fig. 2D, where the phase preferred by spikes is .p h the STA m1 and the correlation among features. We note that if we plotted as an average across all data from V1 (anesthetized ys restrictourselvestotheuseofnfeatures(thedimensionsofxi)outof animals). The average preferred phase shifts with frequency io apoolofN(cid:6)nfeatures,theprecedingequationsremainvalidifthe fromtheonsetofapositivehalf-wavetothevalleyoftheLFP log correlation matrix and the mean are also restricted to these features y only(thatis,Ahassizen(cid:3)nwithranknandm1isn-dimensional). ocoscnislilsatteionnt, (rceogmarpdalerseswofithwhFeigth.er2Ea)c.tivTihtyisisbeshpaovniotarneisouvseroyr .org aalrggTomoritashxemle.(cid:2)cWt(maes)setta(cid:2)ro(tftwhneitihmvatphroieartnfaecnaettuforefeatetuharacethshwfaeseattuhusereehtihbgeehiefnosgtlloSnwToArinm,gai.lietiez.,reafd1tiv(cid:7)toe mquoevnicey-ddrievpeenn.deFnocresiosmsleigehltelyctrdoidffeesretnhte(apsreinferthreedexpahmaspel-efroef- on Ap osenaer)c.hAtshjsruomugeh1njaolwlNth(cid:2)atnn(cid:9)(cid:2)11rfeematauirneisngarfeeaatlurereasdyansdelcehcoteods.eTohneenthwaet Ftoi-gs.p2ikCefroerlahtiiognh fsreeeqmusentcoiebse).mFiorrroarefdewate(cid:14)lec(tnrootdeshsotwhen)p.hase- ril 14 minimizes the error (Eq. 8), where the restriction of the correlation The gray-shaded area in Fig. 2D shows the (average) phase , 2 matrix is now enforced on the n features rather than on n (cid:2) 1 (and rangewithinwhich50%ofthespikesfall.Itwouldbezerofor 00 8 analogouslywithm ).Westopthisiterationwhenthedesirednumber perfect phase locking and (cid:14)for no phase–spike relation. One 1 offeaturesm(cid:1)(cid:1)Nisselected. notes that this range is somewhat smaller for low frequencies Thisalgorithmishighlyefficientinfindingagoodsetoffeatures, (0.85(cid:14)), but approaches 0.98(cid:14)for frequencies (cid:6)20 Hz, indi- since we need to calculate the correlation matrix only between the catingthatthephaselockingisfarfromperfectatallfrequen- selectedfeaturesandallotherfeatures(whichcostsmuchlesseffort cies and is especially weak for high frequencies. thancalculatingitforallpairs). In summary, we have seen that there is indeed a consistent relationship between LFPs and spiking activity on average. In RESULTS thenextsectionsweasktowhatdegreeitispossibletoexploit these relations (and maybe other information available in the This section is organized as follows. After showing the LFP) in a systematic way to infer spikes from the LFP. general spike-to-LFP relationships present in our data, we report the population performance for the task of predicting spike trains from LFP, focusing first on data from V1 of Population prediction performance anesthetized monkeys (nine monkeys) collected during the presentationof5minofcommercialmoviestimuliandequally In Fig. 3 a typical example of a predicted spike train is long periods of spontaneous activity. We then compare these depicted together with the used LFP features. Figure 3, A and resultswithamorelimitedsetofsessionsrecordedfromV1of Bshows8sofLFPspectrogramandthetimecourseinthetest awake monkeys (two monkeys) and with results on data from region, respectively. Small vertical lines in Fig. 3B indicate LGN of anesthetized monkeys (two monkeys). Finally, we spike times before binning to 5-ms resolution. Several inter- estingpointscanbenoted.AsexpectedfromtheLFP-to-spike relations discussed in the last section, spikes preferentially 1Note that the optimal weight vector w* has only minimal error for the occurintheupswingandvalleysofverylowandmediumLFP regression,andthattheremaybeabetterweightvectorforclassification.We neglectthishereforthesakeofsimplicity. oscillations, as seen for instance at times 171 and 173.4 s in JNeurophysiol•VOL99•MARCH2008•www.jn.org 1466 RASCH, GRETTON, MURAYAMA, MAASS, AND LOGOTHETIS A B 0.4 LFP−STA 80 N 0.6 o U] Shuffled spikes SD rm 0.3 d LFP [SD 00..24 ency [Hz] 4600 0.2 alized LFP Normalize −0.02 Frequ 20 00.1 power [S D −0.4 −0.1 U ] 0 −1 −0.5 0 0.5 1 −2 0 2 Time lag [sec] Time lag [sec] C D E 2.3 6 6 5 2.2 S 5 d] p d] D P [ra 4 2.1 ike p P [ra 4 own Phase of LF 23 2 robability [% Phase of LF 23 loaded from 1 1.9 ] 1 movie jn spont .p h 0 1.8 0 y s 0 20 40 60 80 100 101 102−1 0 1 io Frequency [Hz] Frequency [Hz] log y aneFsIGth.e2t.izedSpmikoen–kLeFyPs(rDelaantidonEs)h.iAps:sfpoirkoe-nterigegleecrterdodaeveirnagVe1(SoTfAan)LanFePs.thFeotrizseigdnmifiocnaknecyeldeuvreilnsginmteorvspieiksetiimntuelravtiaolsn((IAS–IsC))araendshaucfflroesdsaanlldrtehceoSrdDinogfsthfreormesuVl1tinign .org o STAiscalculated.B:STAoftheLFPspectrogram(seeMETHODS),withpowerseriesnormalizedtozeromeanandunitSD.Powerathighfrequenciesisclearly n modulatedbyspikingactivity,whereaspoweratlowerfrequenciesshowsonlydiffusedependenceonspikes.C:probabilitydistributionofLFPphasesata A p(saarmticeuelalercstproikdiengaspboesiftoioren)..LNFoPtepthhaastetshaerceocloomrmpuatpedshvoiawHsoilnbleyrtatrnaanrsrfoowrmra(n1g-Hezofbpanrodbsa).bHilietireesaallndsptihkaetsvoavleures30abroevpeeaotsrobfelmowovtihee-dlriimveitnsaacretivtrituyncaareteidn.cBluldacekd pril 1 dotsindicatethepreferred(i.e.,mean)phase.Nophaselockingofspikeswouldresultinauniformdistributionat2%perbin.Althoughlockingtolowfrequencies 4 isstrong,lockingtohighfrequenciesisonlyweak(butpresent).D:theaveragepreferredphasesforallelectrodesacrossallanesthetizeddataindividuallyfor , 2 0 movie-drivenactivity(“movie”)andspontaneousactivity(“spont”).BarsindicateSEs.Thephaserangecontaininghalfofthespikesaroundthepreferredphase 0 8 isindicatedbytheshadedarea.E:theinterpretationofphase,showingthatspikesarelockedatverylowfrequenciestotheonsetofapositivehalf-waveand athighfrequenciestothevalley. Fig. 3B. Additionally, the power of multiple frequencies is closelypredicted(172.5s),althoughnoclearmarkintheLFP enhancedwhenaburstofspikingactivityoccurs,assuggested time course or spectrogram can be seen with the naked eye. byFig.2B,butthefrequencyresponsetoburstsisdiffuseand However, their length (176.6 s) and exact position (176.3 s) variable (compare the burst at 173.4 s to that at 174.5 s). The sometimes seem inaccurate. There are also occasions where clustering of spiking activity on a timescale of a few hundred spikes are simply missed (172.9 s) or fabricated (173.9 s). millisecondsinthisexampleisactuallyquitetypicalinourV1 Prediction performance is evaluated in different ways. One data (see following text). For the single spikes in between the measure is the (cid:10) performance, a measure defined on the clusters no feature of the LFP is immediately predictive. samplesinthetestset,whichispositiveforabove-chanceclassi- Figure 3C shows target and predicted spiking activities. fication; it equals one for perfect classification (see METHODS Prediction of spikes is made for individual sample times at a for definitions). In contrast, the correlation measure r (see (cid:3) resolution of 5 ms using a set of LFP time course and fre- METHODS) is defined as a local average in the time domain and quencyfeatures(seefollowingtext).Concatenatingthepredic- is therefore less sensitive to small temporal inaccuracies. The tion over time yields a predicted spike train that is compared performancemeasure(cid:10)ofthepredictedspiketraininFig.3Chas with the target spike train. One notes that the prediction a value of (cid:10)(cid:7) 0.40, which is relatively good (but not the best captures, at least approximately, the overall structure of the possible;seefollowingtext).Rankcorrelationisr (cid:7)0.60. 25ms spiketrain.Theoccurrenceofburstsofspikingactivity,which IntheexampleofFig.3,thepredictedspiketrainresembles are associated with easily seen traces in the LFP time course theoriginaltoacertaindegree.Weaskwhetherthisprediction and spectrogram, is well predicted. Nevertheless, the exact qualitycarriesovertoLFPsrecordedfromdifferentmonkeys, onsetsandoffsetsoftheburstsaresomewhatinaccurateinthe electrodes, and stimulus conditions. For that we estimated prediction. Even some smaller bursts and single spikes are prediction performance using a large data set (see METHODS). JNeurophysiol•VOL99•MARCH2008•www.jn.org INFERRING SPIKE TRAINS FROM LFPS 1467 A Normalized power −2 0 2 4 6 8 10 80 z] H 60 y [ c n e 40 u q e Fr 20 0 B 4 U] LFP and spikes D 2 S e [ 0 g a olt −2 V D −4 o w C n lo 300 Target SDF (σ=0.01s) ad kes/s] 200 Prediction (κ=0.40) ed fro Rate [spi 1000 jn.phym s io 171 172 173 174 175 176 177 log Time [s] y.o FIG.3. ExampleofspikepredictionfromLFP(anesthetizedmonkey,Sessiona98nm5,spontaneousactivity).A:the(normalized)spectrogramofthe8sof org LFPactivity.B:correspondingLFPtimecourseandspikingactivity.SpikesareindicatedbymarksbeforebinningtotheLFPresolution(5ms).C:thebinned n targetspikesandtheirspikedensityfunction(blue)togetherwiththepredictedspikesandtheirspikedensityfunction(red).Thepredictionisrelativelygood A (th(cid:10)a(cid:7)tre0g.i4o0n,sro2f5hmisgh(cid:7)a0ct.i6v0i)tyoanrethwiseltlriparle,dbiuctteodt,hwerhetrrieaalsstshheolwoceavtieonnboeftsteinrgpleersfpoirkmeasniscele(scsoamccpuarraetew.iCthlars2s5ifimcsavtiaolnueissdoofnoethweirthtrtihaelssuinppFoigrt.v4eBc,to“rspmoa”c)h.iOnene(SnVoMtes) pril 1 radialbasisfunction(RBF)classifiertrainedontheregion35–160susingthesamefeaturesasforthepopulationanalyses(Fig.4). 4 , 2 0 InferencesaremadeonthebasisofasetofLFPfeatures,with predictspikingactivity(allconditionshighlysignificant;t-test, 0 8 whichweobservedadependencebetweenspikesandLFPsin P (cid:1) 10(cid:2)6, Wilcoxon signed-rank test for zero median, P (cid:1) the previous section. In the population analysis we include as 10(cid:2)6). Second, prediction quality for the stimulus condition features the time course around each sample position (in a and for spontaneous activity differs only slightly: indeed, one windowof100msbeforeand300msafterspikeposition)and cannot reject the hypothesis that the underlying distributions an estimate of the frequency content of LFPs at zero time lag have identical means (two-sided unpaired t-test, P (cid:7) 0.21; [Pf,0(ti);seeMETHODS],resultinginatotalof116features.This linear, P (cid:7) 0.18). However, if one compares pairwise record- feature set generally produced good performance (with a rea- ings during spontaneous activity and stimulus presentation sonable computational speed) over a wide range of data. For done with identical electrodes, mean and median prediction thepredictionitselfanonlinearsupportvectormachineisused performances on spontaneous activity are significantly better with radial basis functions such as kernel (SVM–RBF) and thanthoseonstimulus-drivenactivity(one-sidedpairedt-test: a linear classification is employed (for details see METHODS). P(cid:1)10(cid:2)4;linear,P(cid:1)10(cid:2)4;forthedistribution-freeWilcoxon In Fig. 4 the prediction performance over all trials is eval- matched-pairssigned-rankstest:P(cid:1)10(cid:2)4;linear,P(cid:1)10(cid:2)3). uated(on10cross-validationruns)andaveraged.Theanesthe- Average prediction performance for spontaneous activity is tizedV1datasetislabeled“spo”forspontaneousactivityand (cid:10) (cid:7) 0.211 (cid:17) 0.006 (linear (cid:10) (cid:7) 0.185 (cid:17) 0.005) and (cid:10) (cid:7) “stm” for movie stimulus-driven activity. We shall focus on 0.201(cid:17)0.005(linear(cid:10)(cid:7)0.175(cid:17)0.005)forstimulus-driven thesedataforthemoment.Theremainingconditionsshownin activity. this plot are discussed in the following text. Third, nonlinear margin classification is consistently better PlotAshowstheaverageperformance(cid:10)fortheSVM–RBF than linear classification (one-sided paired t-test: “stm” P (cid:1) classifierandforlinearclassification.Fromtheresultswedraw 10(cid:2)6, “spo” P (cid:1) 10(cid:2)6; Wilcoxon matched-pairs signed-ranks the following insights. First, since performance measure (cid:10)is test,P(cid:1)10(cid:2)6).Itamountstoanincrease,onaverage,ofabout (cid:6)0 for above-chance prediction it can be said that both 12%inperformance.ThissuggeststhatthemappingfromLFP classifiers can exploit information in the LFP time course to featuresisnonlinear.However,sinceasimplelinearregression JNeurophysiol•VOL99•MARCH2008•www.jn.org 1468 RASCH, GRETTON, MURAYAMA, MAASS, AND LOGOTHETIS A B 0.35 1 SVM RBF a98nm5 0.3 Linear regression ms] ac9988nnmm16 0.8 c98nm2 e 5 d04nm1 nc 0.25 =2 d04nm2 a σ d98at rm 0.2 n [ 0.6 gg0927nnmm11 o o georgios3 κg. perf 00.1.15 correlati 0.4 jlrs99907772nnnnmmmm1111 Av k n 0.2 a 0.05 R 0 0 stm spo awake stm (L) spo (L) stm spo awake stm (L) spo (L) Condition Condition C D [%]mm 80 sstpmo ±± SSEE −−30..32%%mmmm−−11++4443%% 0.25 SCaromses eelleeccttrrooddees ( s(stmtm)) Dow κκce /0 60 mance 0.2 SCSairgomnsiesfi ceealleencct ttprrooeddrfeeos r( ms(spapono)c)e nloaded man 40 erfor 0.15 Not significant from ative perfor 20 κAvg. p 0.00.51 jn.physiolog el y R .o 0 0 rg 0 2 4 6 8 10 LFP: V1 LGN V1 LGN o n Electrode distance [mm] Spikes: V1 V1 LGN LGN A p Electrode configuration ril 1 FIG.4. Population performance for spike prediction from LFP. A: average prediction performance (cid:10)for SVM and the linear regression classifier across 4, 2 conditions[anesthetizedmonkeyV1,moviestimulus(“stm”),anesthetizedmonkeyV1,spontaneousactivity(“spo”),non-anesthetizedmonkeyV1withmixed 0 stimuli (“awake”), and spontaneous activity or movie-stimulus driven activity in anesthetized monkeys from LGN, “spo (L)” and “stm (L),” respectively]. 08 Prediction is above chance level for all conditions (see RESULTS for significance tests). B: prediction accuracy of the nonlinear classifier evaluated by rank correlationbetweentargetandpredictedspikestrainsmoothedbyaGaussiankernelofwidth(cid:3)(cid:7)25ms.Redhorizontallinesindicatetheaverageperformance withineachconditionanditsSE.Smallblacklinesshowthequalityonindividualtrials.Insomecasespredictionyieldsveryaccurateresults,withcorrelations ashighas0.8–0.9.Blackcurvesonthesidesofeachconditionaresmoothedhistogramsovertrials.Symbolsindicatetheaverageperformanceofindividual sessions(i.e.,1dayofrecording).Averagesessionperformanceclustersneartheoverallmean,butvarianceforindividualelectrodesishigh.C:cross-electrode prediction.LFPsaretakenfromoneelectrodeandspikesfromanother.Relativepredictiondecreasesmuchfasterwithincreasingelectrodedistanceinthe“stm” condition than for the “spo” condition. A linear fit of the data points is also shown. Vertical lines indicate SEs. D: average prediction performance for simultaneousLGNandV1recordings(3sessions,2monkeys).Performanceiscomparedforavailablecross-electrodepredictionfromdifferentareasandwith averageperformancewhenusingthesameelectrodeforspikesandforLFPs.X-axislabelsindicatewhereLFPandspikesoriginatefrom. classifier already achieves almost 90% of the accuracy of the Thereisnot muchvariabilityinperformance overtime: the nonlinearclassifier,onecouldstatethattheLFPfeaturespace average SD for the (cid:10)performance of five repeats of 170-ms exploited here seems expressive enough for this task. recordings for the same electrodes is 0.023 (cid:17) 0.002 for We found that for individual trials performance varies stimulus-drivenactivity,0.026(cid:17)0.002forspontaneousactiv- widely. For selected trials prediction performance can reach ity,and0.045(cid:17)0.002forbothtogether.Thisisincontrastto (cid:10)(cid:7)0.65.PlotBofFig.4showstherank-correlationmeasure the variance across electrodes recorded simultaneously. Here r oftheSVM–RBFprediction.Eachthinshortlinerepre- theaverageSD(in(cid:10))is0.113(cid:17)0.004forstimulus-drivenand 25ms sents performance for an individual trial. Whereas the corre- 0.130 (cid:17) 0.005 for spontaneous activity. The roughly 25-fold lation for some trials is as high as 0.8–0.9 on this moderately increase in variance across electrodes compared with within- small timescale (25 ms), it is almost zero in others. There are electrode variance suggests that prediction performance is a some trials where prediction fails completely in each of the matter of which electrode is being observed, rather than stim- conditions.Thefailingtrialsarenotallfromthesamesessions ulusconditionortime.Electrodetipsmightbepositionedina since the session means (markers) tend to cluster around the region where the arrangement of current sources and sinks overall mean. mightdiffer(e.g.,indeeporsuperficiallayers),orwhereactive JNeurophysiol•VOL99•MARCH2008•www.jn.org INFERRING SPIKE TRAINS FROM LFPS 1469 neuronsmightbelesswellcorrelatedwiththebulkactivityof comparespontaneousandstimulus-drivenactivitybecausethe the cortex. Since we cannot distinguish the layers from which electrode placements do not change with the condition. electrodes record, we cannot pursue this further. Because LGN data were collected while other electrodes Up to now we have presented results for recordings only simultaneously recorded from V1, we can investigate whether from V1 of anesthetized monkeys. We also have a limited the LFPs of V1 can be predicted on the basis of spikes from amount of data available from V1 where monkeys were not LGN and vice versa. This is shown in Fig. 4D averaged over anesthetized and free to behave. The stimuli for this data set data from the three sessions recording simultaneous measure- (labeled “awake”) are mixed and include spontaneous activity ments from V1 und LGN (see METHODS). Performance is averaged either across electrode predictions (regardless of andfixationtasks.Anotherpoolofdataconsistsofrecordings distances) or over all predictions using the same electrode for from LGN of anesthetized monkeys (labeled “stm(L)” and bothsignals.Althoughresultsaredifficulttointerpretbecause “spo(L)”; see METHODS). of the limited size of the data set, one notes that using LFPs WeseefromFig.4Athatpredictiondiffersquitedrastically from LGN and spikes from V1 results in performance above for the different data types. Spike prediction for the anesthe- chance, whereas LFPs from V1 seem to hold no information tized monkey data from V1 is more than fivefold better than about spikes in LGN (unpaired Wilcoxon signed-rank test for thatintheLGN,whereperformanceishardlyabovechance:on medianperformancedifferentfromzero,significancelevel0.05). average, (cid:10)(cid:7) 0.035 (cid:17) 0.005 (linear (cid:10)(cid:7) 0.033 (cid:17) 0.005) for movie-driven activity and (cid:10) (cid:7) 0.027 (cid:17) 0.003 (linear (cid:10) (cid:7) 0.022 (cid:17) 0.003) for spontaneous activity. Temporal accuracy of predicted spike trains AsinV1,thereislittledifferencebetweenspontaneousand We found an average (cid:10)value of about 0.2, which is well movie-driven activity in LGN, although there is a reversed above chance but nevertheless far from perfect prediction at D tendency for spike prediction to be easier on movie-driven (cid:10)(cid:7)1.Ontheotherhand,inexampleFig.3Csomefeaturesof ow activity than that on spontaneous activity. This tendency is the target spike trains seem to be well captured by the predic- nlo barely significant (one-sided paired t-test: P (cid:7) 0.02; linear, tion, especially regions of high and low activity, which alter- ad P (cid:7) 0.01; Wilcoxon matched-pairs signed-ranks test: P (cid:7) nate on a timescale of about 0.5 s in this example. Thus one ed 0.05; linear, P (cid:7) 0.08). mightaskatwhattimescalethepredictedspikingactivitymost fro Wefindthataveragepredictionperformanceonawakedata closely resembles the target spiking activity or at what timing m is(cid:10)(cid:7)0.063(cid:17)0.005(linear(cid:10)(cid:7)0.046(cid:17)0.005).Thisismuch accuracy the prediction fails. jn worse than that on anesthetized V1 data (unpaired t-test, P (cid:1) To answer this we evaluated the coherence between target .ph 10(cid:2)5), but still significantly better than that on LGN data (all and predicted spike train (Fig. 5). Coherence is a correlation ys unpaired one-sided t-test, P (cid:1) 0.05). Figure 4B reveals that measure in the frequency domain. Coherence at a particular iolo g individual trials have a correlation of target and prediction frequency makes a statement about the exactness of the pre- y similar to that in anesthetized monkeys. There are trials with diction on a timescale of one over that frequency. We also .org correlationuptor (cid:7)0.6,whereasinthecaseoftheLGN, estimated the temporal accuracy directly in the time domain o 25ms n no trial exceeds 0.3 correlation. (by varying the correlation kernel width) where one arrives at A similar conclusions (not shown). pril 1 Cross-electrode predictions 4 1 , 2 ThevolumeofcortexthatcontributestothegenerationofLFPs 500ms 100ms 25ms 10ms 5ms sstpmo 00 8 isdifferentfromthatproducingourspikingsignal(seeINTRODUCTION). 0.8 awake stm (L) Thus it might be interesting to see how the relationship be- spo (L) tween the two signals changes with distance. Because record- ce 0.6 n ingsweredonesimultaneouslywithmultipleelectrodes(inthe e data set from anesthetized animal), we tried to infer spikes her o 0.4 from LFPs collected with two different electrodes. In Fig. 4C C the average performance is plotted against the (three-dimen- sional)distanceoftheelectrodetips.Tofacilitatecomparison, 0.2 performance is evaluated relative to the average performance achieved using the spiking signal from the electrode from 0 which the LFPs were taken. 1 10 100 One notes that prediction performance drops to about 40% Frequency [Hz] when electrodes are 1 mm apart (the minimal distance in our FIG.5. Coherencelevelsofpredictedandtargetspiketrains.Coherenceis recording setup). Interestingly, for stimulus-driven activity plottedagainstfrequencyandisaveragedoveralltrialsforeachcondition.For performance degrades significantly with distance (rank corre- comparisonweincludedsurrogatedata,inwhichspiketrainsweregenerated withthesameISIdistributionasinthemonkeydata.Graydashedlinesshow lation between distance and relative kappa performance using coherencebetweensurrogatespiketrainanditsjitteredversionwithGaussian allmeasurements:(cid:2)0.20,P(cid:1)10(cid:2)4),whereasforspontaneous noiseofdifferentSDs((cid:3)from5to500ms,aslistedintheplot).Coherence activity no significant correlation with distance can be found dropsforhigherfrequencies,suggestingthat,onaverage,predictionisreason- for distances (cid:13)1 cm (rank correlation 0.015, P (cid:7) 0.2). Note ably good only for slow structure in the spike trains. Note that the chance coherencelevelisatabout0.15here,asshownbythesurrogatedata.Chance that the number of samples becomes relatively small for levels do not tend to zero because coherence is estimated on 10 cross- distances(cid:6)6mmsincerectangularelectrodegridswith1-mm validationregions(each17-sduration)andonlysubsequentlyaveragedover spacing are used for most sessions. However, we can safely alltrials.ColoredareasindicateSEs. JNeurophysiol•VOL99•MARCH2008•www.jn.org