Journal of Computational Physics 231 (2012) 6401–6418

Sequential data assimilation with multiple models

Akil Narayan^a, Youssef Marzouk^b, Dongbin Xiu^a

^a Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
^b Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Article history: Received 18 January 2012; received in revised form 6 May 2012; accepted 3 June 2012; available online 21 June 2012.

Keywords: Uncertainty quantification; Data assimilation; Kalman filter; Model averaging

Abstract: Data assimilation is an essential tool for predicting the behavior of real physical systems given approximate simulation models and limited observations. For many complex systems, there may exist several models, each with different properties and predictive capabilities. It is desirable to incorporate multiple models into the assimilation procedure in order to obtain a more accurate prediction of the physics than any model alone can provide. In this paper, we propose a framework for conducting sequential data assimilation with multiple models and sources of data. The assimilated solution is a linear combination of all model predictions and data. One notable feature is that the combination takes the most general form with matrix weights. By doing so the method can readily utilize different weights in different sections of the solution state vectors, allow the models and data to have different dimensions, and deal with the case of a singular state covariance. We prove that the proposed assimilation method, termed direct assimilation, minimizes a variational functional, a generalized version of the one used in the classical Kalman filter. We also propose an efficient iterative assimilation method that assimilates two models at a time until all models and data are assimilated. The mathematical equivalence of the iterative method and the direct method is established. Numerical examples are presented to demonstrate the effectiveness of the new method.

1. Introduction

Numerical simulations of mathematical models are essential tools for predicting the behavior of physical systems. Myriad numerical techniques and approximations are used to simulate physical phenomena in fluid dynamics, electromagnetics, chemical systems, astrophysics, and more. Since all of these simulations involve approximations, uncertainty and error are inevitably present in their predictions. To complicate matters, a single physical process may be described by multiple mathematical models and numerical approximations.
In addition, one may have access to empirical observations of the system, which are noisy and limited in number, scope, and resolution. A natural question to ask is how to combine the models and the observational data to predict the physical state with greater fidelity than can be obtained with any of the models individually.

Various techniques for model averaging and data assimilation, all of which attempt to implement such a combination of models and data, have received attention in recent years. In the case of a single dynamical model with a stream of noisy observations, the Kalman filter [15,14] is both simple and remarkably effective. The assimilation step of the Kalman filter updates the state by weighing the model prediction and the data in order to minimize a quadratic objective. This operation can be interpreted in many different ways, for instance, as a minimum variance estimator or as a Bayesian update. The original Kalman filter was designed for linear systems, but methods derived from it for filtering nonlinear systems are plentiful, e.g. the extended Kalman filter [9,11] and the ensemble Kalman filter [7,8] with its variants [1,2,5,21,24], and this remains the subject of extensive ongoing work.

The use of multiple models for filtering, on the other hand, has seen less development. With static data sets, Bayesian model averaging (BMA) is a well-established technique for statistical prediction in the presence of model uncertainty [10]. BMA writes the predictive distribution of any quantity of interest as a weighted average of the posterior predictions due to each model; the data-dependent scalar weights are the posterior model probabilities [19]. Alternatively, one might consider non-Bayesian approaches to model averaging with static data; these are generally focused on providing point predictions rather than predictive distributions [10,6]. Dynamic model averaging (DMA) [20] generalizes BMA to the dynamical setting, where data arrive sequentially and one would like to make online predictions about the state of a system. DMA updates the state conditional on a discrete model indicator, but some Markovian dynamics (e.g. a matrix of transition probabilities or a forgetting factor) for this model indicator must be specified. As in BMA, the predictive distribution is a mixture with one component for each model. Other methods for sequential estimation with multiple models include the Interacting Multiple Model filter [4] and the generalized pseudo-Bayes framework [23]. Both of these techniques introduce dynamics for the "switching" behavior between models and update state probabilities based on innovations from a Kalman filter.

Like all of the aforementioned methods, we propose an assimilation technique that relies on a weighted arithmetic mean of the models. A limitation of the previous methods, however, is that the weights used in model averaging are scalar-valued, even when the state is vector-valued. Therefore, the assimilation process cannot employ different model weights in different sections of the state vector (corresponding, for instance, to regions of space where one model might be superior to another).
In this paper we consider a more general assimilation technique: if u_1, ..., u_M are state vectors for M different models and d is the data vector, we attempt to find an assimilated state w of the form

w = \sum_{m=1}^{M} A_m u_m + B d,    (1.1)

where the A_m and B are matrices. Therefore, we allow our assimilation to be the most general linear function of all the models and data. This form also allows the model states and data to have different dimensions. It is also possible to have different, independent sources of data d_1, ..., d_N, measuring different quantities of interest; mathematically they can be concatenated into a single vector d, allowing us to use (1.1) as a general form. (We will show that assimilating data is philosophically the same as simply taking the data to be extra model states in (1.1). In other words, our method treats model predictions and data as identical mathematical objects.)

We derive our assimilation technique by minimizing a variational functional similar to, and a generalization of, the one used in the classical Kalman filter. With only a single model and data, the assimilation is identical to the Kalman filter. When two or more models are present, the direct assimilation in the form (1.1) can be employed, where the explicit forms of the matrices A_m and B are derived. For practical systems with large dimensions, we also propose a more efficient iterative assimilation method, which performs a series of two-model assimilations until all models and data are assimilated. We then establish the mathematical conditions under which the iterative assimilation is equivalent to the direct assimilation and, more importantly, invariant to the permutation of the model states. Our proposed iterative assimilation approach can also handle singular model and data covariance matrices. Like all Kalman filtering methods, the present methodology requires the prescription and propagation of the model covariances. However, the method is a framework and is not tied to any specific algorithm for covariance propagation; it can therefore be readily combined with most variants of the Kalman filter, such as the ensemble Kalman filter.

It is worth pointing out that the explicit inclusion of the data in our assimilation method (1.1) represents a subtle, yet important, difference from BMA. In BMA, the role of the data presents itself implicitly in the form of the averaging weights, which are determined by the posterior model probabilities. As a result, the BMA prediction is an "average" of all models. When all models are consistently biased towards one side of the prediction, the BMA result is, by construction, guaranteed to be no better than the best available model. The form of (1.1) effectively prevents this from happening.

In Section 2 we formalize notation and present the assimilation problem. This section also reviews existing assimilation methods that are relevant to our discussion. Section 3 introduces our algorithm and presents its mathematical justification. In doing so, we compare a simultaneous assimilation procedure with a sequential assimilation procedure, and make the argument that the sequential procedure is more robust. A Bayesian interpretation of both procedures is also provided. Section 4 provides examples of our assimilation method in practice, and Section 5 follows with closing remarks.

2. Problem setup

In this section we present the overall problem of data assimilation with multiple models, formalizing the assumptions and notation to be used in subsequent analysis. We also review the Kalman filter (KF), which is widely used in data assimilation with a single model and is closely related to our proposed method for multiple-model assimilation.
2.1. Data assimilation with multiple models

Following usual practice in data assimilation, we present our models in the form of dynamical systems, emphasizing the temporal nature of the problem. Let u^t ∈ R^{N_t}, N_t ≥ 1, denote the "true state" of a physical phenomenon. Typically u^t is a finite representation of the spatially distributed state of the physical system, after some spatial discretization has been applied. More generally, u^t is a quantity of interest whose evolution we wish to track. This evolution is governed by certain physical laws and processes that are not completely available to us. Nonetheless, we suppose that there exists some operator that captures the exact evolution of the system under consideration. Considering only discrete values of time t_k, we can write

u^t(t_{k+1}) = L(t_k, u^t(t_k)),    (2.1)

where the operator L is unknown to us, but represents the evolution of the true state u^t from one point in time to the next.

Instead of L, we have a set of models, indexed by m = 1, ..., M, that approximate the evolution of u^t in different ways. In particular, these models specify the temporal evolution of distinct state vectors u_m ∈ R^{N_m} via operators g_m:

du_m/dt = g_m(t, u_m),  t ∈ (0, T],
u_m(0) = u_{m,0},    (2.2)

with T > 0. Solutions of the differential equations above are the "forecast" states u_m(t). We define the discrete solution operator for each system,

u_m(t_{k+1}) = G_m(t_k, u_m(t_k)),    (2.3)

where the operator G_m is the discretized version of g_m and may be nonlinear in u_m. Note that a strict first-order Markovian setting has been used throughout this section. This is done for mere notational convenience and does not affect the discussion below.

The forecast states are useful because they carry information about the true state. We will assume that, at any given time t_k, the forecast state variables u_m are linearly related to the true state u^t, with the addition of a stochastic discrepancy. In other words, we have

u_m = H_m u^t + ε_m,  m = 1, ..., M,    (2.4)

where H_m ∈ R^{N_m × N_t}, and the ε_m, satisfying E[ε_m] = 0, are random variables that capture the discrepancy between the transformed true state and u_m. Note that the ε_m can be time-dependent.

In practice the relation between the forecast states and the true state could be nonlinear:

u_m = H_m(u^t) + ε_m,  m = 1, ..., M,    (2.5)

for some nonlinear operators H_m. Our technique uses a filtering procedure to assimilate different models, and in cases when the measurement operators H_m are not linear, nonlinear filtering techniques such as the unscented Kalman filter [13,12] or particle filters [18,22] are appropriate. Here we restrict ourselves to the linear case, with the understanding that one could replace our use of the Kalman filter with any appropriate nonlinear assimilation technique. One could also consider (2.4) to be an approximation or linearization of the nonlinear operator H_m, as is often done in practice.

We use d ∈ R^{N_d}, N_d ≥ 1, to denote a set of measurements

d = H u^t + ε,    (2.6)

where H ∈ R^{N_d × N_t} is the measurement matrix and ε ∈ R^{N_d} is the measurement error, satisfying E[ε] = 0. Note that the relationship between the measurement and the true solution could in general be nonlinear, but we focus here on the linear case.
(Again, the measurement matrix H could be considered as a linearization of a nonlinear measurement operator.) Our use of the discrepancy terms ε and ε_m encompasses many sources of uncertainty including, but not limited to, model discrepancy, temporal/spatial numerical discretization error, numerical round-off error, and measurement error. Also note that (2.6) easily generalizes to the case of multiple measurements that are conditionally independent given u^t. For our purposes it is notationally convenient to collect all these observations into one vector d, but the algorithm we present below can immediately be applied to assimilation problems where different measurements are treated as separate vectors.

Our goal is to construct an "analyzed solution," denoted by w ∈ R^{N_t} and of the form (1.1), using the forecast solutions {u_m}_{m=1}^{M} and the measurement d to provide a more accurate prediction of the true solution u^t.

2.2. Kalman filter

The Kalman filter (KF) is widely used for data assimilation with a single model; here we only list relevant properties, and a more in-depth discussion can be found in [8]. Following the setup in Section 2.1, we consider the case of M = 1 and H_1 = I, where I is the identity matrix. That is, there exists only one forecast state u_1, and it is a direct prediction of the true state u^t. Let U_1 ∈ R^{N_1 × N_1} be the covariance matrix of the forecast solution u_1; we suppress the subscript 1 hereafter, only in this section. The analyzed solution w obtained by the standard KF is a linear combination of the forecast solution u and the measurement d in the following manner:

w = u + K(d − Hu),    (2.7)

where K is the so-called Kalman gain matrix, defined as

K = U H^T (H U H^T + D)^{−1}.    (2.8)

Here the superscript T denotes the matrix transpose, and D ∈ R^{N_d × N_d} is the covariance of the measurement error ε. The covariance matrix of the analyzed state w, W ∈ R^{N_t × N_t}, is then obtained by the update

W = (I − KH) U (I − KH)^T + K D K^T = (I − KH) U.    (2.9)

This assimilation process is repeated at every instance of time when data are available. The analyzed state w is the solution that minimizes the following variational functional:

J[w] = (w − u)^T U^{−1} (w − u) + (Hw − d)^T D^{−1} (Hw − d).    (2.10)

There are many ways to rationalize the desire to minimize the objective (2.10). The functional J[w] is the sum of the Mahalanobis distances from w to the known states u and d, and we want to find a state w lying as close as possible to both of them. Alternatively, we may pose the problem in the Bayesian framework: we suppose that u and its covariance specify a prior Gaussian distribution on state space; assuming likewise that d is obtained as a Gaussian perturbation of linear measurements of the truth, the likelihood of observing the data can be computed. We then take the analyzed state w to be the mode of the posterior; equivalently, w minimizes the negative logarithm of the posterior density, which in this setup is given (up to an additive constant) by J; cf. (3.14) and Section 3.7.
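For concreteness, the update (2.7)–(2.9) amounts to a few lines of linear algebra. The following minimal NumPy sketch mirrors the notation above; the explicit matrix inverse is used only for clarity.

```python
import numpy as np

def kalman_update(u, U, d, D, H):
    """One Kalman analysis step, cf. (2.7)-(2.9).

    u, U : forecast mean and covariance
    d, D : measurement and measurement-error covariance
    H    : measurement matrix
    """
    K = U @ H.T @ np.linalg.inv(H @ U @ H.T + D)   # Kalman gain, (2.8)
    w = u + K @ (d - H @ u)                        # analyzed state, (2.7)
    W = (np.eye(len(u)) - K @ H) @ U               # analyzed covariance, (2.9)
    return w, W
```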
3. Assimilation of multiple models

We present our algorithm for multiple-model assimilation in this section. Section 3.1 begins by standardizing our notation. Our algorithm is implicitly related to the problem of computing harmonic means of covariance matrices, and so Section 3.2 introduces the concept of harmonic means of positive semi-definite matrices. An unavoidable complication that may arise in any assimilation procedure is the existence of model and/or measurement states that have competing values that cannot be clearly resolved. We acknowledge this reality in Section 3.3 and present sufficient conditions under which differing values can be unambiguously assimilated. Section 3.4 presents a version of our algorithm that simultaneously assimilates all models and measurements, and Section 3.5 follows up with our proposed sequential algorithm. We then provide a simple example to demonstrate why sequential assimilation is preferred. Section 3.7 concludes by providing a Bayesian interpretation of the multi-model assimilation scheme.

3.1. Preliminaries

We work on a complete probability space (Ω, F, μ), with Ω the collection of events, F a σ-algebra on sets of Ω, and μ a probability measure on F. All random variables considered here lie in the L² stochastic space: u ∈ L²_μ means ∫_Ω ||u(ω)||² dμ(ω) = ||u||²_{L²_μ} ≡ E||u||² < ∞, where ||u|| is the standard Euclidean norm when u is vector-valued; this is sufficient to ensure the existence of the mean and variance of u. When talking about limits of random variables, equality is in the L² sense: lim_{e→0} u_e = u means ||u_e − u||_{L²_μ} → 0.

Throughout this paper we use lowercase boldface letters (e.g. u) to denote vectors and uppercase boldface letters (e.g. A) to denote matrices. A^T and A^† denote the matrix transpose and the Moore–Penrose pseudoinverse of A, respectively. For random vectors we use the same letter in uppercase to denote their corresponding covariance matrices. For example, let v ∈ R^N be a random vector with zero mean; then V = E[v v^T] ∈ R^{N×N} denotes its covariance matrix. A square matrix A is positive definite if v^T A v > 0 for all non-trivial v, and is positive semi-definite (or "semi-positive") if v^T A v ≥ 0. For any fixed size N, the space of all N×N positive definite matrices is denoted by H, and the space of all positive semi-definite matrices by H_0.

3.2. Matrix harmonic means

The standard way to define a harmonic mean for positive definite matrices A_1, A_2 ∈ H is

F_2 = 2 (A_1^{−1} + A_2^{−1})^{−1},

and it is not difficult to imagine generalizing this to a sequence of matrices A_m ∈ H, m = 1, ..., M:

F_M = M ( \sum_{m=1}^{M} A_m^{−1} )^{−1}.    (3.1)

It is clear that this formula cannot be used for (non-invertible) semi-positive matrices. However, one can proceed by taking a different route.

Definition 1 (Matrix harmonic mean). Let A_1, ..., A_M ∈ H_0, and define F_1 = W_1 = A_1. For m = 2, ..., M, define

W_m = W_{m−1} (W_{m−1} + A_m)^† A_m,    (3.2a)
F_m = m W_m.    (3.2b)

F_M is called the harmonic mean of the matrices A_1, ..., A_M.

This definition coincides with the traditional definition of the harmonic mean of M ≥ 2 positive definite matrices. More importantly, this definition is a generalization to semi-positive matrices.

Theorem 1. For a sequence of matrices A_m, m = 1, ..., M, in H, the harmonic mean of Definition 1 coincides with the formula (3.1). For A_m in H_0, F_M is

1. closed on H_0: F_M ∈ H_0;
2. continuous: if F̃_M is the harmonic mean of Ã_1, ..., Ã_M ∈ H_0, where Ã_m → A_m, then F̃_M → F_M;
3. consistent: A_1 = A_2 = ··· = A_M implies F_M = A_1;
4. symmetric: for m_1, m_2, ..., m_M any permutation of 1, 2, ..., M, the harmonic mean of A_{m_1}, ..., A_{m_M} is the same as that of A_1, A_2, ..., A_M;
5. decreasing: F_M ≤ M A_m for all m;
6. monotone: if A_m ≤ B_m for all m, then F_M ≤ J_M, where F_M is the harmonic mean of the A_m and J_M is the harmonic mean of the B_m.

The above properties justify calling F_M a matrix mean. The Appendix contains the proof, along with mathematical discussions that are not directly related to our present task of multiple-model data assimilation.
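Definition 1 is straightforward to implement with a pseudoinverse. The following NumPy sketch also checks the claimed agreement with (3.1) on positive definite inputs; the specific matrices are arbitrary test values.

```python
import numpy as np

def harmonic_mean_psd(As):
    """Matrix harmonic mean of Definition 1:
    W_1 = A_1,  W_m = W_{m-1} (W_{m-1} + A_m)^+ A_m,  F_M = M * W_M."""
    W = np.array(As[0], dtype=float)
    for A in As[1:]:
        W = W @ np.linalg.pinv(W + A) @ A
    return len(As) * W

# agreement with the classical formula (3.1) for positive definite inputs
A1 = np.array([[2.0, 0.3], [0.3, 1.0]])
A2 = np.array([[1.0, 0.0], [0.0, 4.0]])
F = harmonic_mean_psd([A1, A2])
assert np.allclose(F, 2 * np.linalg.inv(np.linalg.inv(A1) + np.linalg.inv(A2)))
```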
We note here that, using properties of pseudoinverses, one may rewrite the iterative procedure (3.2a) for updating W_m:

W_m = ( I − W_{m−1} (W_{m−1} + A_m)^† ) W_{m−1} =: (I − K) W_{m−1}.

Comparing this with the Kalman filter covariance update from (2.8) and (2.9) when H = I, we see that the iterative procedure for computing harmonic means is almost identical to a standard Kalman filter procedure.

Our algorithm for model assimilation produces an assimilated model state whose covariance is proportional to the harmonic mean of the input covariances. (If the input covariances are the A_m, then the matrix W_M from (3.2) is the assimilated covariance.) We will see later that the algorithm itself can be implemented by iterating a standard Kalman filter, and thus implicitly uses the iterative definition (3.2) of the matrix harmonic mean to compute the assimilated state.

3.3. Consistent random variables

Any assimilation procedure must preclude the situation where there is no logical way to reconcile a quantitative difference between two models. For example, if we have two models of a scalar quantity of interest with u_1(ω) = 1 and u_2(ω) = 2 almost surely, then no assimilation procedure can produce a good synthesis. In our setting, we exclude such situations by insisting that the random variables corresponding to our model state vectors are consistent.

Definition 2. Consider L²(Ω) random vectors u_m ∈ R^{N_m}, m = 1, ..., M, each paired with a measurement matrix H_m ∈ R^{N_m × N_t}, where N_t is the size of the true state.¹ Then the u_m are consistent if the components in each random variable corresponding to zero variance can be assimilated, almost surely without contradiction, from the other variables. More precisely, let S⁰_m be a matrix whose orthogonal columns span the null space of U_m. Then the u_m are consistent if the linear system defined by the M vector-valued equations

(S⁰_m)^T H_m w = (S⁰_m)^T E[u_m],  m = 1, 2, ..., M,    (3.3)

has at least one solution for the vector w.

¹ Recall from (2.4) that H_m is only used to connect model states to the true solution. The source of randomness, whether stemming from ε_m or from randomness in the truth, is here irrelevant.

The condition (3.3) essentially states that the random variables u_m do not have competing zero-variance components. If all the random variables u_m have strictly positive covariance kernels, then they are automatically consistent. In all that follows, we will only consider collections of random variables that are consistent.

3.4. Main result: direct assimilation method

We now prescribe a method for assimilating multiple models along with data. The single-model Kalman filter procedure can be identified as minimizing the variational functional (2.10). This suggests that an assimilation procedure for multiple models with data can be obtained simply by making appropriate changes to that variational functional.

Theorem 2. Consider a set of M forecast states u_m, m = 1, ..., M, satisfying (2.4) and with error covariances U_m = E[ε_m ε_m^T] ∈ H, and a data vector (2.6) with error covariance D = E[ε ε^T] ∈ H. Assume that the row spaces of H and the H_m together span all of R^{N_t}:

span{ran H, ran H_1, ran H_2, ..., ran H_M} = R^{N_t}.    (3.4)

Define

J[w] = \sum_{m=1}^{M} (H_m w − u_m)^T U_m^{−1} (H_m w − u_m) + (Hw − d)^T D^{−1} (Hw − d);    (3.5)

then the minimizer satisfies

w = W_M ( \sum_{m=1}^{M} H_m^T U_m^{−1} u_m + H^T D^{−1} d ),    (3.6)

where W_M is given by

W_M = ( \sum_{m=1}^{M} H_m^T U_m^{−1} H_m + H^T D^{−1} H )^{−1}.

Proof. The proof can be obtained by straightforward calculus, where the condition (3.4) is required to assure strict positivity of J[w]. □
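Assuming strictly positive definite U_m and D, the direct update (3.6) can be assembled as in the following NumPy sketch; it uses explicit inverses and is intended only to make the formula concrete.

```python
import numpy as np

def direct_assimilation(us, Us, Hs, d, D, H):
    """Direct multi-model assimilation, cf. (3.6); all U_m and D must be invertible."""
    P = H.T @ np.linalg.inv(D) @ H        # accumulates sum_m H_m^T U_m^{-1} H_m + H^T D^{-1} H
    b = H.T @ np.linalg.inv(D) @ d        # accumulates sum_m H_m^T U_m^{-1} u_m + H^T D^{-1} d
    for u, U, Hm in zip(us, Us, Hs):
        Ui = np.linalg.inv(U)
        P += Hm.T @ Ui @ Hm
        b += Hm.T @ Ui @ u
    W = np.linalg.inv(P)                  # assimilated covariance W_M
    return W @ b, W                       # assimilated state w of (3.6), and W_M
```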
The analyzed state is w, and the analyzed model states are

v_m = H_m w,  m = 1, ..., M.    (3.7)

The matrices A_m in (1.1) are thus given by

A_m = W_M H_m^T U_m^{−1}.    (3.8)

The technical assumption (3.4) is made² to ensure a unique minimizer for the quadratic form J. While (3.6) will produce the assimilated state we propose, the assumptions U_m, D ∈ H, i.e., strictly positive definite covariances, may be too restrictive in practice. (Note that under such an assumption all random variables in Theorem 2 are automatically consistent.) This restriction can be relaxed by using a more robust iterative assimilation method.

² In the absence of any prior information about the true state, we also consider this condition as required to avoid infinities: if we know nothing about the true state u^t and (3.4) is violated, all the models and data essentially contain insufficient information to say anything about certain components of u^t. Nevertheless we must assign some value to these components, and in order to communicate our lack of knowledge, prescribing infinite variance is the only quantitatively accurate assignment.

3.5. Main result: iterative assimilation method

We now present the iterative assimilation method, which behaves in a more robust manner in the sense that it can readily deal with semi-positive covariance matrices.

Theorem 3. Consider a set of M forecast variables u_m, m = 1, ..., M, satisfying (2.4) and with error covariances U_m = E[ε_m ε_m^T] ∈ H_0, and a data vector (2.6) with error covariance D = E[ε ε^T] ∈ H_0. Assume³ that H_1 = I; set w_1 = u_1 and W_1 = U_1. For m = 2, 3, ..., M, let

K_m = W_{m−1} H_m^T ( H_m W_{m−1} H_m^T + U_m )^†,
w_m = w_{m−1} + K_m (u_m − H_m w_{m−1}) = w_{m−1} + W_{m−1} H_m^T ( H_m W_{m−1} H_m^T + U_m )^† (u_m − H_m w_{m−1}),    (3.9)
W_m = (I − K_m H_m) W_{m−1} = W_{m−1} − W_{m−1} H_m^T ( H_m W_{m−1} H_m^T + U_m )^† H_m W_{m−1}.

Finally, assimilate the data vector:

K = W_M H^T ( H W_M H^T + D )^†,
w_{M+1} = w_M + K (d − H w_M),    (3.10)
W_{M+1} = (I − K H) W_M.

W_m is the covariance of w_m. When U_m, D ∈ H, then w_{M+1} is the same as w from (3.6). If u_1, ..., u_M are consistent random variables, then w_M is independent of the ordering in the iterative procedure (3.9).

³ This assumption is made for simplicity and is related to the concern from footnote 2. Weakening this assumption (H_1 ≠ I) is possible but requires special treatment to avoid infinite variances in the early stages of the assimilation procedure. When H_1 ≠ I one can formulate a well-defined assimilation procedure so long as (3.4) holds.

We give the proof in the Appendix. It is in this proof that we require the definition of consistent model states in order to guarantee that the assimilated state does not depend on the ordering of the model states.

There are many observations to make about the iterative procedure defined in Theorem 3.

- The iterative method allows the covariance matrices to be semi-positive. When the covariance matrices are positive, the iterative method is equivalent to the direct method. Therefore the iterative method has a mathematically wider range of applicability than the direct method.
- The assumption H_1 = I is made only to make the presentation of the assimilation procedure clearer. Generally we are interested in dynamical systems, so there is a model state w from the previous time step that can be used.
- The ordering of the assimilation procedure does not matter if the random variables u_m, m = 1, ..., M, and d are consistent. In fact the result is also independent of the data/model ordering. Therefore, one can treat data as another set of model simulation results.
Consequently, multiple and conditionally independent sources of data can be assimilated by simply performing additional data assimilation steps, independent of the ordering.
- The matrix W_M can be interpreted as a harmonic mean of the matrices U_m when paired with the measurement matrices H_m. If the H_m are all identity matrices, then W_M is exactly 1/M times the harmonic mean of the U_m, where Definition 1 is used when any of the U_m is semi-positive. See also the discussion at the end of Section 3.2.
- The iterative procedure outlined above is essentially a sequential application of a standard Kalman filter update. (Use of the pseudoinverse is the main difference.) Thus, assimilation of each new model may be viewed as a single-model Kalman filter update.
- The iterative scheme allows one to assimilate any subset of models and data at times when they are available. One practical use of this is the ability to assimilate only model states in the absence of data. An example of such a situation is given in Section 4.3.

The iterative assimilation method can be implemented in a straightforward manner (a sketch of the resulting loop is given after this list):

- Initialization. For model u_1 and data d, perform the standard Kalman update to obtain an analyzed model state w_1 with covariance W_1.
- Iteration. For m = 2, ..., M, apply the procedure (3.9). In other words, consider the present assimilated model state w_{m−1} with its covariance W_{m−1} to be the forecast state, and conduct a Kalman update using the new model prediction u_m as "data" (with measurement matrix H_m) to obtain the new analyzed state w = w_{M+1} and the analyzed model states v_m = H_m w.
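The following NumPy sketch implements this loop for generic H_m, assimilating the models first and the data last as in Theorem 3; it is a minimal illustration rather than a production implementation. Because the data enter through the same update as the models, further independent data vectors could simply be appended to the list of steps.

```python
import numpy as np

def iterative_assimilation(us, Us, Hs, d, D, H):
    """Iterative multi-model assimilation, cf. (3.9)-(3.10).

    us, Us, Hs : forecast means, covariances, measurement matrices (H_1 = I assumed)
    d, D, H    : data vector, data-error covariance, data measurement matrix
    """
    w = np.array(us[0], dtype=float)      # w_1 = u_1
    W = np.array(Us[0], dtype=float)      # W_1 = U_1
    steps = list(zip(us[1:], Us[1:], Hs[1:])) + [(d, D, H)]   # models 2..M, then the data
    for u, U, Hm in steps:
        K = W @ Hm.T @ np.linalg.pinv(Hm @ W @ Hm.T + U)      # gain with pseudoinverse
        w = w + K @ (u - Hm @ w)
        W = (np.eye(len(w)) - K @ Hm) @ W
    return w, W
```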
3.6. Some motivating examples

The direct assimilation approach is applicable to cases when the quadratic form J from (3.5) is positive definite; this is satisfied, for example, under the assumptions of Theorem 2. In other cases, for example when some components have zero variance, the iterative procedure is more appropriate. However, the iterative assimilation approach is only robust if the model states are consistent. If the model states are not consistent, then one must abandon the hope of producing an assimilation that is faithful to all the models. We now present examples that showcase these various situations.

Suppose we have two models u_1, u_2 ∈ R² with identity observation matrices H_1 = H_2 = I. The direct assimilation scheme is to form the variable

w = (U_1^{−1} + U_2^{−1})^{−1} [ U_1^{−1} u_1 + U_2^{−1} u_2 ].    (3.11)

If U_1 and U_2 are positive definite, there is no issue. But let us explore the issue of continuity with respect to positive semi-definite matrices. Let us parameterize these random variables with respect to a small positive perturbation e:

U_1^e = diag(e, 1),  U_2^e = diag(1, e),
u_1(ω) = (u_11, u_12)^T,  u_2(ω) = (u_21, u_22)^T.

The values of u_1 and u_2 are not important, but clearly their uncertainties depend on e. Small values of e correspond to the case where u_1 has little uncertainty in its first component, and u_2 has little uncertainty in its second component. The "sensible" way to assimilate such states, then, is to put most of the weight on u_11 and u_22. Indeed the direct assimilation procedure does exactly this; for e > 0 we obtain from (3.11)

w^e = 1/(e+1) · ( u_11 + e u_21,  e u_12 + u_22 )^T.

We can define w = lim_{e→0} w^e and U_j = lim_{e→0} U_j^e. Clearly, w = (u_11, u_22)^T. One can verify that either iterative procedure

w = u_1 + U_1 (U_1 + U_2)^† (u_2 − u_1),    (3.12a)
w = u_2 + U_2 (U_2 + U_1)^† (u_1 − u_2),    (3.12b)

produces this state w regardless of ordering (indeed U_1 + U_2 = I > 0, so the pseudoinverse is not even necessary). One may then be tempted to use the direct approach (3.11), but with all direct inverses ^{−1} naïvely replaced by pseudoinverses ^†. If we do this and then apply (3.11), we obtain the state w = (u_21, u_12)^T. This state is the opposite of what we should obtain: we completely ignore the components about which we have the most information. This shows that the direct assimilation approach cannot be remedied for positive semi-definite covariances.

Let us now show how the iterative scheme treats consistent random variables with semi-positive covariances. Again we take u_1, u_2 ∈ R², and we now prescribe the covariances

U_1^e = diag(e, e),  U_2^e = diag(1, e).

We now expect that the procedure should discard u_21 in favor of u_11, and should equally weigh the values from u_12 and u_22. When e > 0 we can use either the direct or the iterative scheme to obtain

w^e = ( (u_11 + e u_21)/(e+1),  (1/2) u_12 + (1/2) u_22 )^T.

Taking limits, we obtain w^{(1)} := lim_{e→0} w^e = ( u_11, (1/2) u_12 + (1/2) u_22 )^T, as expected. Again define U_1 = lim_{e→0} U_1^e and similarly for U_2. The iterative scheme (3.12a) produces the state w^{(2)} = (u_11, u_12)^T, whereas the second scheme (3.12b) produces w^{(3)} = (u_11, u_22)^T. Both of these are potentially different from w^{(1)}, and from each other. However, if u_1 and u_2 are consistent random variables, then u_12 and u_22 are almost surely equal, and therefore w^{(1)} = w^{(2)} = w^{(3)} almost surely, so that the final choice among the three options is largely irrelevant.

If the random variables are not consistent, then all three w vectors differ with nonzero probability, but in this case there can be no remedy: we have two models with a non-vanishing discrepancy that are both entirely sure of their own accuracy.

We close this section by noting that the ability to assimilate components with vanishing variance opens up attractive possibilities. One may enforce constraints on the analyzed state vector (e.g. conservation of mass) in a post-processing step by treating the constraints as zero-variance data.
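The second example above is easy to check numerically with one step of (3.12) applied to the limiting covariances. The state values in the sketch below are arbitrary, chosen only so that the zero-variance components agree (consistency).

```python
import numpy as np

def merge(u1, U1, u2, U2):
    """One step of (3.12): assimilate u2 into u1 (here H_1 = H_2 = I)."""
    K = U1 @ np.linalg.pinv(U1 + U2)
    return u1 + K @ (u2 - u1)

u1 = np.array([1.0, 3.0])      # consistent states: the zero-variance second
u2 = np.array([2.0, 3.0])      # components agree (u_12 = u_22 = 3)
U1 = np.diag([0.0, 0.0])       # limit e -> 0 of diag(e, e)
U2 = np.diag([1.0, 0.0])       # limit e -> 0 of diag(1, e)

print(merge(u1, U1, u2, U2), merge(u2, U2, u1, U1))   # both [1. 3.]: ordering is irrelevant
```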
3.7. Bayesian interpretation of multi-model assimilation

To place the present scheme in context, it is instructive to consider its Bayesian interpretation. Let w^k := w(t_k) denote the assimilated state at time t_k. From the Bayesian perspective, w^k is a random variable whose distribution captures the current state of knowledge about the true state u^t(t_k). In other words, the distribution of w^k is the posterior distribution of the system state, given the data up to time t_k and the available models.

To be more specific, the present scheme provides the mean and covariance of the posterior probability density p(w^k | d^{1:k}, M_1, ..., M_M), where d^{1:k} := (d(t_k), d(t_{k−1}), ..., d(t_1)) are the data provided up to assimilation time step t_k and the M_m are the individual models. If we had only one model M_1 and one source of data, the posterior density of w^k would be written as

p(w^k | d^{1:k}, M_1) ∝ p(d^k | w^k, M_1) p(w^k | d^{1:k−1}, M_1) = p(d^k | w^k) ∫ p(w^k | w^{k−1}, M_1) p(w^{k−1} | d^{1:k−1}, M_1) dw^{k−1}.    (3.13)

As is typical in Kalman filtering variants, the assimilation step approximates the data likelihood p(d^k | w^k) and the forecast distribution p(w^k | d^{1:k−1}, M_1) as Gaussian, even if propagation of uncertainty through a forecast model results in a non-Gaussian distribution. In this case, if the forecast state has mean u_1^k, covariance U_1^k, and H_1 = I, such that

w^k = u_1^k + ε_1^k,  ε_1^k ~ N(0, U_1^k),

then the posterior mean and covariance of w^k are exactly given by the Kalman update specified in Section 2.2.

With more than one model, it is natural to interpret the models M_m, m ≥ 2, as providing additional terms in the likelihood function, e.g. p(u_m^k | w^k, d^{1:k−1}, M_{1:M}), where u_m^k := u_m(t_k). These terms can be justified as follows. Beginning with the multi-model prior p(w^{k−1} | d^{1:k−1}, M_{1:M}), each model propagates this distribution forward to the next assimilation time and adds its own sources of randomness (e.g. parametric uncertainty or process noise). Each resulting model-specific forecast distribution is then described only by its mean u_m^k and covariance U_m^k. One can write the forecasts as

H_m w^k = u_m^k + ε_m^k,

where ε_m^k ~ N(0, U_m^k). The resulting likelihood term is

p_m(u_m^k | w^k, d^{1:k−1}, M_{1:M}) ∝ exp( −(1/2) (H_m w^k − u_m^k)^T (U_m^k)^{−1} (H_m w^k − u_m^k) ).    (3.14)

Note that the conditioning on all models M_{1:M} reflects the fact that all of these forecasts began with the assimilated multi-model state at time step k−1, but the model-specific subscript in p_m emphasizes that subsequent forecasting was performed only with the m-th model.

Since the model forecasts are conditionally independent given w^k, the likelihood terms can be combined to yield the multi-model posterior density:

p(w^k | d^{1:k}, M_{1:M}) ∝ p(d^k | w^k) ( \prod_{m=2}^{M} p_m(u_m^k | w^k, d^{1:k−1}, M_{1:M}) ) p_1(w^k | d^{1:k−1}, M_{1:M}).    (3.15)

Here the prior predictive p_1(w^k | d^{1:k−1}, M_{1:M}) is chosen to reflect the forecast of any one of the M models, here M_1. Assuming H_1 = I, this distribution is Gaussian with mean u_1^k and covariance U_1^k.

We can motivate the form of the variational functional (3.5) for the direct assimilation method from this Bayesian analysis: maximizing the posterior probability given by (3.14) and (3.15) is equivalent to minimizing its negative logarithm, and the negative log-posterior (modulo constants) is given by the functional J[w] from (3.5).
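This equivalence is easy to confirm numerically: evaluating the functional (3.5) at the direct solution (3.6) and at randomly perturbed states never yields a smaller value at a perturbation. The configuration below is a small made-up example (one full-state model, one model observing the sum of the components, one data vector); it is only a sketch of the check, not an experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def J(w, us, Us, Hs, d, D, H):
    """The functional (3.5), i.e. the negative log-posterior up to a constant."""
    val = (H @ w - d) @ np.linalg.solve(D, H @ w - d)
    for u, U, Hm in zip(us, Us, Hs):
        r = Hm @ w - u
        val += r @ np.linalg.solve(U, r)
    return val

# a 2-dimensional truth: one full-state model, one model observing the component sum
Hs = [np.eye(2), np.array([[1.0, 1.0]])]
us = [np.array([1.0, 2.0]), np.array([3.5])]
Us = [np.diag([0.5, 0.5]), np.array([[0.2]])]
H, d, D = np.eye(2), np.array([0.9, 2.2]), np.diag([0.3, 0.3])

# direct minimizer from (3.6)
P = H.T @ np.linalg.inv(D) @ H + sum(Hm.T @ np.linalg.inv(U) @ Hm for Hm, U in zip(Hs, Us))
b = H.T @ np.linalg.inv(D) @ d + sum(Hm.T @ np.linalg.solve(U, u) for Hm, U, u in zip(Hs, Us, us))
w_star = np.linalg.solve(P, b)

# random perturbations never decrease J, consistent with w_star being the posterior mode
assert all(J(w_star, us, Us, Hs, d, D, H)
           <= J(w_star + 0.1 * rng.standard_normal(2), us, Us, Hs, d, D, H)
           for _ in range(100))
```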
Compared to other model averaging techniques, the present scheme does not need to specify additional dynamics on the model space. Instead, the role of the covariance is paramount in determining the weight assigned to each model prediction and to the data. The covariance-based scheme thus allows matrix-valued weights and is a natural generalization of the Kalman filter.

4. Examples

In this section we provide a few simple examples that illustrate the broad range of applicability of the model assimilation approach previously detailed. Our implemented methods use the iterative procedure outlined in Theorem 3, and we assume that all models represent consistent random variables. For our examples, this is a valid assumption: all involved random variables have strictly positive covariances. The cost of the assimilation procedure does not suffer greatly from use of the pseudoinverse implementation as a safeguard.

Our first example concerns a rudimentary differential equation y' = ay and uses Taylor polynomials as the models. The second example is a related stochastic differential equation (SDE) example that uses the same Taylor polynomial models, but showcases the complex interplay that can occur between the models and the data. Finally, we consider a periodic one-dimensional advection problem with non-constant wavespeed to show how the strengths of different models can be combined by using the multi-model assimilation approach.

4.1. An exponential model

Consider a system whose truth is given by

du^t/dt = a u^t,  (u^t)_0 = u_0,

for some u_0. The observations are scalar and the measurement matrix H is the identity (here, the scalar 1). The measurement noise ε is assumed to be N(0, σ²). For any m ≥ 1, the forecast model propagators G_m from (2.3) are given by

G_m(v) = ( \sum_{k=0}^{m−1} (a Δt)^k / k! ) v,

which is a degree-(m−1) Taylor approximation of the true solution. We first let Δt = 0.05, a = 1, u_0 = 0.1, and σ = 0.05. We use M = 2 models (constant and linear approximations). Having information about the models, we assume that the standard deviation of the propagation error (u^t)_n − G_m((u^t)_{n−1}) is 0.1 Δt^m; i.e., the constant model is assumed to have error with standard deviation 0.1 Δt, while the linear model's error has standard deviation 0.1 Δt². Naturally these are not entirely accurate and we could make far better assumptions about the error, but this will serve our purposes.
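To make the setup concrete, the following scalar NumPy sketch runs the two Taylor models through the iterative assimilation of Theorem 3 against synthetic data generated from the exact solution. The parameter values are those listed above, while the number of steps, the restart-from-analysis forecasting, and the simple variance propagation rule are choices made only for this illustration and need not match the exact experimental protocol of Section 4.1.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)
a, dt, u0, sigma = 1.0, 0.05, 0.1, 0.05       # parameter values given in the text
q = [0.1 * dt, 0.1 * dt ** 2]                 # prescribed model-error standard deviations
n_steps = 40                                  # illustrative horizon

def taylor_coef(m):                           # G_m(v) = taylor_coef(m) * v, cf. (2.3)
    return sum((a * dt) ** k / factorial(k) for k in range(m))

u_true, w, W = u0, u0, 0.0                    # scalar truth, analysis mean and variance
for n in range(n_steps):
    u_true *= np.exp(a * dt)                              # exact evolution of the truth
    d = u_true + sigma * rng.standard_normal()            # noisy scalar measurement (H = 1)
    # each model restarts from the previous analysis; its forecast variance is the
    # propagated analysis variance plus the prescribed model-error variance (an assumption)
    us = [taylor_coef(m) * w for m in (1, 2)]
    Us = [taylor_coef(m) ** 2 * W + q[m - 1] ** 2 for m in (1, 2)]
    # iterative assimilation in the scalar case: model 2, then the data
    w, W = us[0], Us[0]
    for u, U in [(us[1], Us[1]), (d, sigma ** 2)]:
        K = W / (W + U)
        w, W = w + K * (u - w), (1 - K) * W
print(f"truth {u_true:.4f}, assimilated {w:.4f}")
```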
Furthermore,