Use R!

Advisors: Robert Gentleman, Kurt Hornik, Giovanni Parmigiani

For other titles published in this series, go to http://www.springer.com/series/6991

Paul S.P. Cowpertwait · Andrew V. Metcalfe

Introductory Time Series with R

Paul S.P. Cowpertwait
Inst. Information and Mathematical Sciences
Massey University
Auckland, Albany Campus
New Zealand
[email protected]

Andrew V. Metcalfe
School of Mathematical Sciences
University of Adelaide
Adelaide SA 5005
Australia
[email protected]

Series Editors

Robert Gentleman
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue, N. M2-B876
Seattle, Washington 98109
USA

Kurt Hornik
Department of Statistik and Mathematik
Wirtschaftsuniversität Wien
Augasse 2-6
A-1090 Wien
Austria

Giovanni Parmigiani
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University
550 North Broadway
Baltimore, MD 21205-2011
USA

ISBN 978-0-387-88697-8
e-ISBN 978-0-387-88698-5
DOI 10.1007/978-0-387-88698-5
Springer Dordrecht Heidelberg London New York

Library of Congress Control Number: 2009928496

© Springer Science+Business Media, LLC 2009

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

In memory of Ian Cowpertwait

Preface

R has a command line interface that offers considerable advantages over menu systems in terms of efficiency and speed once the commands are known and the language understood. However, the command line system can be daunting for the first-time user, so there is a need for concise texts to enable the student or analyst to make progress with R in their area of study. This book aims to fulfil that need in the area of time series to enable the non-specialist to progress, at a fairly quick pace, to a level where they can confidently apply a range of time series methods to a variety of data sets. The book assumes the reader has a knowledge typical of a first-year university statistics course and is based around lecture notes from a range of time series courses that we have taught over the last twenty years. Some of this material has been delivered to postgraduate finance students during a concentrated six-week course and was well received, so a selection of the material could be mastered in a concentrated course, although in general it would be more suited to being spread over a complete semester.

The book is based around practical applications and generally follows a similar format for each time series model being studied. First, there is an introductory motivational section that describes practical reasons why the model may be needed. Second, the model is described and defined in mathematical notation. The model is then used to simulate synthetic data using R code that closely reflects the model definition and then fitted to the synthetic data to recover the underlying model parameters. Finally, the model is fitted to an example historical data set and appropriate diagnostic plots given.
By using R, the whole procedure can be reproduced by the reader, and it is recommended that students work through most of the examples.¹ Mathematical derivations are provided in separate frames and starred sections and can be omitted by those wanting to progress quickly to practical applications. At the end of each chapter, a concise summary of the R commands that were used is given followed by exercises. All data sets used in the book, and solutions to the odd numbered exercises, are available on the website http://www.massey.ac.nz/~pscowper/ts.

¹ We used the R package Sweave to ensure that, in general, your code will produce the same output as ours. However, for stylistic reasons we sometimes edited our code; e.g., for the plots there will sometimes be minor differences between those generated by the code in the text and those shown in the actual figures.

We thank John Kimmel of Springer and the anonymous referees for their helpful guidance and suggestions, Brian Webby for careful reading of the text and valuable comments, and John Xie for useful comments on an earlier draft. The Institute of Information and Mathematical Sciences at Massey University and the School of Mathematical Sciences, University of Adelaide, are acknowledged for support and funding that made our collaboration possible.

Paul thanks his wife, Sarah, for her continual encouragement and support during the writing of this book, and our son, Daniel, and daughters, Lydia and Louise, for the joy they bring to our lives. Andrew thanks Natalie for providing inspiration and her enthusiasm for the project.

Paul Cowpertwait and Andrew Metcalfe
Massey University, Auckland, New Zealand
University of Adelaide, Australia
December 2008

Contents

Preface
1 Time Series Data
  1.1 Purpose
  1.2 Time series
  1.3 R language
  1.4 Plots, trends, and seasonal variation
    1.4.1 A flying start: Air passenger bookings
    1.4.2 Unemployment: Maine
    1.4.3 Multiple time series: Electricity, beer and chocolate data
    1.4.4 Quarterly exchange rate: GBP to NZ dollar
    1.4.5 Global temperature series
  1.5 Decomposition of series
    1.5.1 Notation
    1.5.2 Models
    1.5.3 Estimating trends and seasonal effects
    1.5.4 Smoothing
    1.5.5 Decomposition in R
  1.6 Summary of commands used in examples
  1.7 Exercises
2 Correlation
  2.1 Purpose
  2.2 Expectation and the ensemble
    2.2.1 Expected value
    2.2.2 The ensemble and stationarity
    2.2.3 Ergodic series*
    2.2.4 Variance function
    2.2.5 Autocorrelation
  2.3 The correlogram
    2.3.1 General discussion
    2.3.2 Example based on air passenger series
    2.3.3 Example based on the Font Reservoir series
  2.4 Covariance of sums of random variables
  2.5 Summary of commands used in examples
  2.6 Exercises
3 Forecasting Strategies
  3.1 Purpose
  3.2 Leading variables and associated variables
    3.2.1 Marine coatings
    3.2.2 Building approvals publication
    3.2.3 Gas supply
  3.3 Bass model
    3.3.1 Background
    3.3.2 Model definition
    3.3.3 Interpretation of the Bass model*
    3.3.4 Example
  3.4 Exponential smoothing and the Holt-Winters method
    3.4.1 Exponential smoothing
    3.4.2 Holt-Winters method
    3.4.3 Four-year-ahead forecasts for the air passenger data
  3.5 Summary of commands used in examples
  3.6 Exercises
4 Basic Stochastic Models
  4.1 Purpose
  4.2 White noise
    4.2.1 Introduction
    4.2.2 Definition
    4.2.3 Simulation in R
    4.2.4 Second-order properties and the correlogram
    4.2.5 Fitting a white noise model
  4.3 Random walks
    4.3.1 Introduction
    4.3.2 Definition
    4.3.3 The backward shift operator
    4.3.4 Random walk: Second-order properties
    4.3.5 Derivation of second-order properties*
    4.3.6 The difference operator
    4.3.7 Simulation
  4.4 Fitted models and diagnostic plots
    4.4.1 Simulated random walk series
    4.4.2 Exchange rate series
    4.4.3 Random walk with drift
  4.5 Autoregressive models
    4.5.1 Definition
    4.5.2 Stationary and non-stationary AR processes
    4.5.3 Second-order properties of an AR(1) model
    4.5.4 Derivation of second-order properties for an AR(1) process*
    4.5.5 Correlogram of an AR(1) process
    4.5.6 Partial autocorrelation
    4.5.7 Simulation
  4.6 Fitted models
    4.6.1 Model fitted to simulated series
    4.6.2 Exchange rate series: Fitted AR model
    4.6.3 Global temperature series: Fitted AR model
  4.7 Summary of R commands
  4.8 Exercises
5 Regression
  5.1 Purpose
  5.2 Linear models
    5.2.1 Definition
    5.2.2 Stationarity
    5.2.3 Simulation
  5.3 Fitted models
    5.3.1 Model fitted to simulated data
    5.3.2 Model fitted to the temperature series (1970–2005)
    5.3.3 Autocorrelation and the estimation of sample statistics*
  5.4 Generalised least squares
    5.4.1 GLS fit to simulated series
    5.4.2 Confidence interval for the trend in the temperature series
  5.5 Linear models with seasonal variables
    5.5.1 Introduction
    5.5.2 Additive seasonal indicator variables
    5.5.3 Example: Seasonal model for the temperature series
  5.6 Harmonic seasonal models
    5.6.1 Simulation
    5.6.2 Fit to simulated series
    5.6.3 Harmonic model fitted to temperature series (1970–2005)
  5.7 Logarithmic transformations
    5.7.1 Introduction
    5.7.2 Example using the air passenger series
  5.8 Non-linear models
    5.8.1 Introduction
    5.8.2 Example of a simulated and fitted non-linear series