Statistical Modelling and Regression Structures
Festschrift in Honour of Ludwig Fahrmeir

Thomas Kneib · Gerhard Tutz
Editors

Prof. Dr. Thomas Kneib
Institut für Mathematik
Carl von Ossietzky Universität Oldenburg
26111 Oldenburg
Germany
[email protected]

Prof. Dr. Gerhard Tutz
Institut für Statistik
Ludwig-Maximilians-Universität München
Akademiestraße 1
80799 München
Germany
[email protected]

ISBN 978-3-7908-2412-4
e-ISBN 978-3-7908-2413-1
DOI 10.1007/978-3-7908-2413-1
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2009943264
© Springer-Verlag Berlin Heidelberg 2010

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: WMXDesign GmbH, Heidelberg
Printed on acid-free paper
Physica-Verlag is a brand of Springer-Verlag Berlin Heidelberg
Springer is part of Springer Science+Business Media (www.springer.com)

Foreword

The collected contributions contained within this book have been written by friends and colleagues to acknowledge Ludwig Fahrmeir’s widespread and important impact on Statistics as a science, while celebrating his 65th Birthday.

As a young student, Ludwig started his career as a Mathematician, but he quickly turned into a rising and shining star within the German and international Statistics community.
He soon obtained both his PhD and his Habilitation at the Technical University of Munich. After a short period as a visiting professor at the University of Dortmund, he returned to his homeland Bavaria and was appointed Full Professor of Statistics at the University of Regensburg, at the age of 32.

Some years later, he moved to the capital of Bavaria and became Professor at the Department of Statistics at the University of Munich. His appointment had significant impact on the Department since, soon after his arrival, Ludwig started an initiative to establish a collaborative research center on the “Statistical Analysis of Discrete Structures.” After a successful application for initial funding, further funding was extended several times, until the research center reached the maximum period for funding in 2006. During the complete duration, Ludwig served as a speaker of the research center and – to cite one of the final referees – “managed it in an easy and efficient way and contributed several important results.”

During the last forty years, Ludwig’s work has had tremendous impact on the Statistics community. He was among the first researchers to recognize the importance of generalized linear models and contributed in a series of papers to the theoretical background of that model class. His interest in statistical modelling led to the organization of a workshop on “Statistical Modelling and Generalized Linear Models (GLIM)” in Munich in 1992 and culminated in the highly cited monograph on “Multivariate Statistical Modelling Based on Generalized Linear Models” that saw two printings and remains to be a key reference on applied statistical modelling utilizing generalized linear models. Ludwig also had great influence on the creation of the Statistical Modelling Society, and is currently on the advisory board of the corresponding journal on “Statistical Modelling.” Both the society and journal emerged out of the early GLIM workshops and proceedings.

Of course, Ludwig’s work is definitely not restricted to generalized linear models but – on the contrary – spans a wide range of modern Statistics. He co-authored or co-edited several monographs, e.g. on Multivariate Statistics, Stochastic Processes, Measurement of Credit Risks, as well as popular textbooks on Regression and an Introduction to Statistics. His recent research contributions are mostly concentrated in semiparametric regression and spatial statistics within a Bayesian framework.

When first circulating the idea of a Festschrift for the celebration of Ludwig’s 65th birthday, all potential contributors were extremely positive, many immediately agreeing to contribute. These reactions attest to Ludwig’s high personal and professional appreciation in the statistical community. The far-reaching variety of subjects covered within these contributions also represents Ludwig’s broad interest and impact in many branches of modern Statistics.

Both editors of this Festschrift were lucky enough to work with Ludwig on several occasions and in particular early in their careers as PhD students and PostDocs. His personal and professional mentorship and his strong commitment made him a perfect supervisor, and his patient, confident and encouraging working style will always be remembered by all of his students and colleagues. Ludwig always provided a friendly working environment that made it a pleasure and an honor to be a part of his working group. We are proud to be able to say that Ludwig is much more than a colleague but turned into a friend for both of us.

Oldenburg and Munich, January 2010
Thomas Kneib, Gerhard Tutz

Acknowledgements

The editors would like to express their gratitude to

• all authors of this volume for their agreement to contribute and their easy cooperation at several stages of putting together the final version of the Festschrift.
• Johanna Brandt, Jan Gertheiss, Andreas Groll, Felix Heinzl, Sebastian Petry, Jan Ulbricht and Stephanie Rubenbauer for their invaluable contributions in proofreading and correction of the papers, as well as in solving several LaTeX-related problems.
• the Springer Verlag for agreeing to publish this Festschrift and in particular Nils-Peter Thomas, Alice Blanck and Frank Holzwarth for the smooth collaboration in preparing the manuscript.

Contents

List of Contributors

The Smooth Complex Logarithm and Quasi-Periodic Models
Paul H. C. Eilers
  1 Foreword
  2 Introduction
  3 Data and Models
    3.1 The Basic Model
    3.2 Splines and Penalties
    3.3 Starting Values
    3.4 Simple Trend Correction and Prior Transformation
    3.5 A Complex Signal
    3.6 Non-normal Data and Cascaded Links
    3.7 Adding Harmonics
  4 More to Explore
  5 Discussion
  References

P-spline Varying Coefficient Models for Complex Data
Brian D. Marx
  1 Introduction
  2 “Large Scale” VCM, without Backfitting
  3 Notation and Snapshot of a Smoothing Tool: B-splines
    3.1 General Knot Placement
    3.2 Smoothing the KTB Data
  4 Using B-splines for Varying Coefficient Models
  5 P-spline Snapshot: Equally-Spaced Knots & Penalization
    5.1 P-splines for Additive VCMs
    5.2 Standard Error Bands
  6 Optimally Tuning P-splines
  7 More KTB Results
  8 Extending P-VCM into the Generalized Linear Model
  9 Two-dimensional Varying Coefficient Models
    9.1 Mechanics of 2D-VCM through Example
    9.2 VCMs and Penalties as Arrays
    9.3 Efficient Computation Using Array Regression
  10 Discussion Toward More Complex VCMs
  References

Penalized Splines, Mixed Models and Bayesian Ideas
Göran Kauermann
  1 Introduction
  2 Notation and Penalized Splines as Linear Mixed Models
  3 Classification with Mixed Models
  4 Variable Selection with Simple Priors
    4.1 Marginal Akaike Information Criterion
    4.2 Comparison in Linear Models
    4.3 Simulation
  5 Discussion and Extensions
  References

Bayesian Linear Regression – Different Conjugate Models and Their (In)Sensitivity to Prior-Data Conflict
Gero Walter and Thomas Augustin
  1 Introduction
  2 Prior-data Conflict in the i.i.d. Case
  3 The Standard Approach for Bayesian Linear Regression (SCP)
    3.1 Update of β | σ²
    3.2 Update of σ²
    3.3 Update of β
  4 An Alternative Approach for Conjugate Priors in Bayesian Linear Regression (CCCP)
    4.1 Update of β | σ²
    4.2 Update of σ²
    4.3 Update of β
  5 Discussion and Outlook
  References

An Efficient Model Averaging Procedure for Logistic Regression Models Using a Bayesian Estimator with Laplace Prior
Christian Heumann and Moritz Grenke
  1 Introduction
  2 Model Averaging
    2.1 Orthogonalization
    2.2 Unrestricted Maximum Likelihood Estimation
    2.3 Restricted Approximate Maximum Likelihood Estimation
    2.4 Model Averaging
    2.5 Algorithm
  3 Simulation Study
  4 Conclusion and Outlook
  References

Posterior and Cross-validatory Predictive Checks: A Comparison of MCMC and INLA
Leonhard Held, Birgit Schrödle and Håvard Rue
  1 Introduction
  2 The INLA Approach
    2.1 Parameter Estimation with INLA
    2.2 Posterior Predictive Model Checks with INLA
    2.3 Leave-one-out Cross-validation with INLA
  3 Predictive Model Checks with MCMC
    3.1 Posterior Predictive Model Checks with MCMC
    3.2 Leave-one-out Cross-validation with MCMC
    3.3 Approximate Cross-validation with MCMC
  4 Application
    4.1 A Comparison of Posterior Predictive Model Checks
    4.2 A Comparison of Leave-one-out Cross-validated Predictive Checks
    4.3 A Comparison of Approximate Cross-validation with Posterior and Leave-one-out Predictive Checks using MCMC
  5 Discussion
  References

Data Augmentation and MCMC for Binary and Multinomial Logit Models
Sylvia Frühwirth-Schnatter and Rudolf Frühwirth
  1 Introduction
  2 MCMC Estimation Based on Data Augmentation for Binary Logit Regression Models
    2.1 Writing the Logit Model as a Random Utility Model
    2.2 Data Augmentation Based on the Random Utility Model
    2.3 Two New Samplers Based on the dRUM Representation
    2.4 Finite Mixture Approximations to the Logistic Distribution
  3 MCMC Estimation Based on Data Augmentation for the Multinomial Logit Regression Model
    3.1 Data Augmentation in the RUM
    3.2 Data Augmentation in the dRUM
  4 MCMC Sampling without Data Augmentation
  5 Comparison of the Various MCMC Algorithms
  6 Concluding Remarks
  References

Generalized Semiparametric Regression with Covariates Measured with Error
Thomas Kneib, Andreas Brezger and Ciprian M. Crainiceanu
  1 Introduction
  2 Semiparametric Regression Models with Measurement Error
    2.1 Observation Model
    2.2 Measurement Error Model
    2.3 Prior Distributions
  3 Bayesian Inference
    3.1 Posterior & Full Conditionals
    3.2 Implementational Details & Software
  4 Simulations
    4.1 Simulation Setup
    4.2 Simulation Results
  5 Incident Heart Failure in the ARIC Study
  6 Summary
  References

Determinants of the Socioeconomic and Spatial Pattern of Undernutrition by Sex in India: A Geoadditive Semi-parametric Regression Approach
Christiane Belitz, Judith Hübner, Stephan Klasen and Stefan Lang
  1 Introduction
  2 The Data
  3 Measurement and Determinants of Undernutrition
    3.1 Measurement
    3.2 Determinants of Undernutrition
  4 Variables Included in the Regression Model
  5 Statistical Methodology - Semiparametric Regression Analysis
  6 Results
  7 Conclusion
  References

Boosting for Estimating Spatially Structured Additive Models
Nikolay Robinzonov and Torsten Hothorn
  1 Introduction
  2 Methods
    2.1 Spatio-Temporal Structured Additive Models
    2.2 Tree Based Learners
    2.3 Generalized Additive Model
  3 Results
    3.1 Model Illustrations
    3.2 Model Comparison
  4 Discussion
  References