Semiparametric Regression

Semiparametric regression is concerned with the flexible incorporation of nonlinear functional relationships in regression analyses. Any application area that uses regression analysis can benefit from semiparametric regression. Assuming only a basic familiarity with ordinary parametric regression, this user-friendly book explains the techniques and benefits of semiparametric regression in a concise and modular fashion. The authors make liberal use of graphics and examples plus case studies taken from environmental, financial, and other applications. They include practical advice on implementation and pointers to relevant software.

This book is suitable as a textbook for students with little background in regression as well as a reference book for statistically oriented scientists – such as biostatisticians, econometricians, quantitative social scientists, and epidemiologists – with a good working knowledge of regression and the desire to begin using more flexible semiparametric models. Even experts on semiparametric regression should find something new here.

David Ruppert is the Andrew Schultz, Jr., Professor of Engineering (School of Operations Research and Industrial Engineering) and Professor of Statistical Science at Cornell University. He has served as editor for a number of prestigious series and journals and has published some 80 articles of his own as well as co-authoring two popular books, Transformation and Weighting in Regression and Measurement Error in Nonlinear Models. He is also winner of the Wilcoxon Prize for best practical applications paper in Technometrics and an elected Fellow of the American Statistical Association and the Institute of Mathematical Statistics.

M. P. Wand is Professor of Statistics at the University of New South Wales in Sydney, Australia. He has held faculty appointments at Harvard University, Rice University, and Texas A&M University. Dr. Wand is a Fellow of the American Statistical Association and has served as an associate editor for the Journal of the American Statistical Association and Biometrika. He is winner of the P. A. P. Moran Medal for statistical research.

R. J. Carroll is Distinguished Professor of Statistics, Nutrition and Toxicology at Texas A&M University. Among his many honors are the COPSS Presidents’ Award, the Fisher Lecture, the Snedecor Award, and the Wilcoxon Prize. He is an elected Fellow of the American Statistical Association and the Institute of Mathematical Statistics as well as an elected member of the International Statistical Institute.

CAMBRIDGE SERIES IN STATISTICAL AND PROBABILISTIC MATHEMATICS

Editorial Board
R. Gill (Department of Mathematics, Utrecht University)
B. D. Ripley (Department of Statistics, University of Oxford)
S. Ross (Department of Industrial Engineering, University of California, Berkeley)
M. Stein (Department of Statistics, University of Chicago)
D. Williams (School of Mathematical Sciences, University of Bath)

This series of high-quality upper-division textbooks and expository monographs covers all aspects of stochastic applicable mathematics. The topics range from pure and applied statistics to probability theory, operations research, optimization, and mathematical programming. The books contain clear presentations of new developments in the field and also of the state of the art in classical methods. While emphasizing rigorous treatment of theoretical methods, the books also contain applications and discussions of new techniques made possible by advances in computational practice.

Already published
1. Bootstrap Methods and Their Application, by A. C. Davison and D. V. Hinkley
2. Markov Chains, by J. Norris
3. Asymptotic Statistics, by A. W. van der Vaart
4. Wavelet Methods for Time Series Analysis, by Donald B. Percival and Andrew T. Walden
5. Bayesian Methods, by Thomas Leonard and John S. J. Hsu
6. Empirical Processes in M-Estimation, by Sara van de Geer
7. Numerical Methods of Statistics, by John F. Monahan
8. A User’s Guide to Measure Theoretic Probability, by David Pollard
9. The Estimation and Tracking of Frequency, by B. G. Quinn and E. J. Hannan
10. Data Analysis and Graphics using R, by John Maindonald and John Braun
11. Statistical Models, by A. C. Davison

Semiparametric Regression

DAVID RUPPERT
Cornell University

M. P. WAND
Harvard University

R. J. CARROLL
Texas A&M University

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge, United Kingdom

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521780506

© David Ruppert, M. P. Wand, R. J. Carroll 2003

This book is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2003

ISBN-13 978-0-511-06683-2 eBook (NetLibrary)
ISBN-10 0-511-06683-X eBook (NetLibrary)
ISBN-13 978-0-521-78050-6 hardback
ISBN-10 0-521-78050-0 hardback
ISBN-13 978-0-521-78516-7 paperback
ISBN-10 0-521-78516-2 paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this book, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
To Anne, with love – David
To my wife’s parents, Ayhan and Recep – Matt
To Brett and Jeb – Raymond

Contents

Preface
Guide to Notation

1 Introduction
  1.1 Assessing the Carcinogenicity of Phenolphthalein
  1.2 Salinity and Fishing in North Carolina
  1.3 Management of a Retirement Fund
  1.4 Biomonitoring of Airborne Mercury
  1.5 Term Structure of Interest Rates
  1.6 Air Pollution and Mortality in Milan: The Harvesting Effect

2 Parametric Regression
  2.1 Introduction
  2.2 Linear Regression Models
  2.3 Regression Diagnostics
  2.4 Inference
  2.5 Parametric Additive Models
  2.6 Model Selection
  2.7 Polynomial Regression Models
  2.8 Nonlinear Regression
  2.9 Transformations in Regression
  2.10 Bibliographic Notes
  2.11 Summary of Formulas

3 Scatterplot Smoothing
  3.1 Introduction
  3.2 Preliminary Ideas
  3.3 Practical Implementation
  3.4 Automatic Knot Selection
  3.5 Penalized Spline Regression
  3.6 Quadratic Spline Bases
  3.7 Other Spline Models and Bases
  3.8 Other Penalties
  3.9 General Definition of a Penalized Spline
  3.10 Linear Smoothers
  3.11 Error of a Smoother
  3.12 Rank of a Smoother
  3.13 Degrees of Freedom of a Smoother
  3.14 Residual Degrees of Freedom
  3.15 Other Approaches to Scatterplot Smoothing
  3.16 Choosing a Scatterplot Smoother
  3.17 Bibliographical Notes
  3.18 Summary of Formulas

4 Mixed Models
  4.1 Introduction
  4.2 Mixed Models
  4.3 Prediction
  4.4 The Linear Mixed Model (LMM)
  4.5 Estimation and Prediction in LMM
  4.6 Estimated BLUP (EBLUP)
  4.7 Standard Error Estimation
  4.8 Hypothesis Testing
  4.9 Penalized Splines as BLUPs
  4.10 Bibliographical Notes
  4.11 Summary of Formulas

5 Automatic Scatterplot Smoothing
  5.1 Introduction
  5.2 The Likelihood Approach
  5.3 The Model Selection Approach
  5.4 Caveats of Automatic Parameter Selection
  5.5 Choosing the Knots and Basis Functions
  5.6 Automatic Selection of the Number of Knots
  5.7 Bibliographical Notes
  5.8 Summary of Formulas

6 Inference
  6.1 Introduction
  6.2 Variability Bands
  6.3 Confidence and Prediction Intervals
  6.4 Inference for Penalized Splines
  6.5 Simultaneous Confidence Bands
  6.6 Testing the Adequacy of Parametric Models
  6.7 Testing for No Effect
  6.8 Inference Using First Derivatives
  6.9 Testing for Existence of a Feature
  6.10 Bibliographical Notes
  6.11 Summary of Formulas

7 Simple Semiparametric Models
  7.1 Introduction
  7.2 Beyond Scatterplot Smoothing
  7.3 Semiparametric Binary Offset Model
  7.4 Additivity and Interactions
  7.5 General Parametric Component
  7.6 Inference
  7.7 Bibliographical Notes

8 Additive Models
  8.1 Introduction
  8.2 Fitting an Additive Model
  8.3 Degrees of Freedom
  8.4 Smoothing Parameter Selection
  8.5 Hypothesis Testing
  8.6 Model Selection
  8.7 Bibliographical Notes

9 Semiparametric Mixed Models
  9.1 Introduction
  9.2 Additive Mixed Models
  9.3 Subject-Specific Curves
  9.4 Bibliographical Notes

10 Generalized Parametric Regression
  10.1 Introduction
  10.2 Binary Response Data
  10.3 Logistic Regression
  10.4 Other Generalized Linear Models
  10.5 Iteratively Reweighted Least Squares
  10.6 Hat Matrix, Degrees of Freedom, and Standard Errors
  10.7 Overdispersion and Variance Functions: Pseudolikelihood
  10.8 Generalized Linear Mixed Models
  10.9 Deviance
  10.10 Technical Details
  10.11 Bibliographical Notes

11 Generalized Additive Models
  11.1 Introduction
  11.2 Generalized Scatterplot Smoothing
  11.3 Generalized Additive Mixed Models
  11.4 Degrees-of-Freedom Approximations
  11.5 Automatic Smoothing Parameter Selection
  11.6 Hypothesis Testing
  11.7 Model Selection
  11.8 Density Estimation
  11.9 Bibliographical Notes

12 Interaction Models
  12.1 Introduction
  12.2 Binary-by-Continuous Interaction Models
  12.3 Factor-by-Curve Interactions in Additive Models
  12.4 Varying Coefficient Models
  12.5 Continuous-by-Continuous Interactions
  12.6 Bibliographical Notes

13 Bivariate Smoothing
  13.1 Introduction
  13.2 Choice of Bivariate Basis Functions
  13.3 Kriging
  13.4 General Radial Smoothing
  13.5 Default Automatic Bivariate Smoother
  13.6 Geoadditive Models
  13.7 Additive Plus Interaction Models
  13.8 Generalized Bivariate Smoothing
  13.9 Appendix: Equivalence of BLUP using Z_R and Z_P
  13.10 Bibliographical Notes

14 Variance Function Estimation
  14.1 Introduction
  14.2 Formulation
  14.3 Application to the LIDAR Data
  14.4 Quasilikelihood and Variance Functions
  14.5 Bibliographical Notes

15 Measurement Error
  15.1 Introduction
  15.2 Formulation
  15.3 The Expectation Maximization (EM) Algorithm
  15.4 Simulated Example Revisited
  15.5 Sensitivity Analysis Example
  15.6 Bibliographical Notes

16 Bayesian Semiparametric Regression
  16.1 Introduction
  16.2 General Framework
  16.3 Scatterplot Smoothing
  16.4 Linear Mixed Models
  16.5 Generalized Linear Mixed Models
  16.6 Rao–Blackwellization
  16.7 Bibliographical Notes

17 Spatially Adaptive Smoothing
  17.1 Introduction
  17.2 A Local Penalty Method
  17.3 Completely Automatic Algorithm
  17.4 Bayesian Inference
Description: