
Robust nonparametric statistical methods PDF

532 pages · 2011 · 5.088 MB · xvii, 535 p. : ill. ; 27 cm

Preview Robust nonparametric statistical methods

Robust Nonparametric Statistical Methods
Second Edition

MONOGRAPHS ON STATISTICS AND APPLIED PROBABILITY

General Editors
F. Bunea, V. Isham, N. Keiding, T. Louis, R. L. Smith, and H. Tong

1 Stochastic Population Models in Ecology and Epidemiology M.S. Bartlett (1960)
2 Queues D.R. Cox and W.L. Smith (1961)
3 Monte Carlo Methods J.M. Hammersley and D.C. Handscomb (1964)
4 The Statistical Analysis of Series of Events D.R. Cox and P.A.W. Lewis (1966)
5 Population Genetics W.J. Ewens (1969)
6 Probability, Statistics and Time M.S. Bartlett (1975)
7 Statistical Inference S.D. Silvey (1975)
8 The Analysis of Contingency Tables B.S. Everitt (1977)
9 Multivariate Analysis in Behavioural Research A.E. Maxwell (1977)
10 Stochastic Abundance Models S. Engen (1978)
11 Some Basic Theory for Statistical Inference E.J.G. Pitman (1979)
12 Point Processes D.R. Cox and V. Isham (1980)
13 Identification of Outliers D.M. Hawkins (1980)
14 Optimal Design S.D. Silvey (1980)
15 Finite Mixture Distributions B.S. Everitt and D.J. Hand (1981)
16 Classification A.D. Gordon (1981)
17 Distribution-Free Statistical Methods, 2nd edition J.S. Maritz (1995)
18 Residuals and Influence in Regression R.D. Cook and S. Weisberg (1982)
19 Applications of Queueing Theory, 2nd edition G.F. Newell (1982)
20 Risk Theory, 3rd edition R.E. Beard, T. Pentikäinen and E. Pesonen (1984)
21 Analysis of Survival Data D.R. Cox and D. Oakes (1984)
22 An Introduction to Latent Variable Models B.S. Everitt (1984)
23 Bandit Problems D.A. Berry and B. Fristedt (1985)
24 Stochastic Modelling and Control M.H.A. Davis and R. Vinter (1985)
25 The Statistical Analysis of Composition Data J. Aitchison (1986)
26 Density Estimation for Statistics and Data Analysis B.W. Silverman (1986)
27 Regression Analysis with Applications G.B. Wetherill (1986)
28 Sequential Methods in Statistics, 3rd edition G.B. Wetherill and K.D. Glazebrook (1986)
29 Tensor Methods in Statistics P. McCullagh (1987)
30 Transformation and Weighting in Regression R.J. Carroll and D. Ruppert (1988)
31 Asymptotic Techniques for Use in Statistics O.E. Barndorff-Nielsen and D.R. Cox (1989)
32 Analysis of Binary Data, 2nd edition D.R. Cox and E.J. Snell (1989)
33 Analysis of Infectious Disease Data N.G. Becker (1989)
34 Design and Analysis of Cross-Over Trials B. Jones and M.G. Kenward (1989)
35 Empirical Bayes Methods, 2nd edition J.S. Maritz and T. Lwin (1989)
36 Symmetric Multivariate and Related Distributions K.T. Fang, S. Kotz and K.W. Ng (1990)
37 Generalized Linear Models, 2nd edition P. McCullagh and J.A. Nelder (1989)
38 Cyclic and Computer Generated Designs, 2nd edition J.A. John and E.R. Williams (1995)
39 Analog Estimation Methods in Econometrics C.F. Manski (1988)
40 Subset Selection in Regression A.J. Miller (1990)
41 Analysis of Repeated Measures M.J. Crowder and D.J. Hand (1990)
42 Statistical Reasoning with Imprecise Probabilities P. Walley (1991)
43 Generalized Additive Models T.J. Hastie and R.J. Tibshirani (1990)
44 Inspection Errors for Attributes in Quality Control N.L. Johnson, S. Kotz and X. Wu (1991)
45 The Analysis of Contingency Tables, 2nd edition B.S. Everitt (1992)
46 The Analysis of Quantal Response Data B.J.T. Morgan (1992)
47 Longitudinal Data with Serial Correlation—A State-Space Approach R.H. Jones (1993)
48 Differential Geometry and Statistics M.K. Murray and J.W. Rice (1993)
49 Markov Models and Optimization M.H.A. Davis (1993)
50 Networks and Chaos—Statistical and Probabilistic Aspects O.E. Barndorff-Nielsen, J.L. Jensen and W.S. Kendall (1993)
51 Number-Theoretic Methods in Statistics K.-T. Fang and Y. Wang (1994)
52 Inference and Asymptotics O.E. Barndorff-Nielsen and D.R. Cox (1994)
53 Practical Risk Theory for Actuaries C.D. Daykin, T. Pentikäinen and M. Pesonen (1994)
54 Biplots J.C. Gower and D.J. Hand (1996)
55 Predictive Inference—An Introduction S. Geisser (1993)
56 Model-Free Curve Estimation M.E. Tarter and M.D. Lock (1993)
57 An Introduction to the Bootstrap B. Efron and R.J. Tibshirani (1993)
58 Nonparametric Regression and Generalized Linear Models P.J. Green and B.W. Silverman (1994)
59 Multidimensional Scaling T.F. Cox and M.A.A. Cox (1994)
60 Kernel Smoothing M.P. Wand and M.C. Jones (1995)
61 Statistics for Long Memory Processes J. Beran (1995)
62 Nonlinear Models for Repeated Measurement Data M. Davidian and D.M. Giltinan (1995)
63 Measurement Error in Nonlinear Models R.J. Carroll, D. Ruppert and L.A. Stefanski (1995)
64 Analyzing and Modeling Rank Data J.J. Marden (1995)
65 Time Series Models—In Econometrics, Finance and Other Fields D.R. Cox, D.V. Hinkley and O.E. Barndorff-Nielsen (1996)
66 Local Polynomial Modeling and its Applications J. Fan and I. Gijbels (1996)
67 Multivariate Dependencies—Models, Analysis and Interpretation D.R. Cox and N. Wermuth (1996)
68 Statistical Inference—Based on the Likelihood A. Azzalini (1996)
69 Bayes and Empirical Bayes Methods for Data Analysis B.P. Carlin and T.A. Louis (1996)
70 Hidden Markov and Other Models for Discrete-Valued Time Series I.L. MacDonald and W. Zucchini (1997)
71 Statistical Evidence—A Likelihood Paradigm R. Royall (1997)
72 Analysis of Incomplete Multivariate Data J.L. Schafer (1997)
73 Multivariate Models and Dependence Concepts H. Joe (1997)
74 Theory of Sample Surveys M.E. Thompson (1997)
75 Retrial Queues G. Falin and J.G.C. Templeton (1997)
76 Theory of Dispersion Models B. Jørgensen (1997)
77 Mixed Poisson Processes J. Grandell (1997)
78 Variance Components Estimation—Mixed Models, Methodologies and Applications P.S.R.S. Rao (1997)
79 Bayesian Methods for Finite Population Sampling G. Meeden and M. Ghosh (1997)
80 Stochastic Geometry—Likelihood and computation O.E. Barndorff-Nielsen, W.S. Kendall and M.N.M. van Lieshout (1998)
81 Computer-Assisted Analysis of Mixtures and Applications—Meta-analysis, Disease Mapping and Others D. Böhning (1999)
82 Classification, 2nd edition A.D. Gordon (1999)
83 Semimartingales and their Statistical Inference B.L.S. Prakasa Rao (1999)
84 Statistical Aspects of BSE and vCJD—Models for Epidemics C.A. Donnelly and N.M. Ferguson (1999)
85 Set-Indexed Martingales G. Ivanoff and E. Merzbach (2000)
86 The Theory of the Design of Experiments D.R. Cox and N. Reid (2000)
87 Complex Stochastic Systems O.E. Barndorff-Nielsen, D.R. Cox and C. Klüppelberg (2001)
88 Multidimensional Scaling, 2nd edition T.F. Cox and M.A.A. Cox (2001)
89 Algebraic Statistics—Computational Commutative Algebra in Statistics G. Pistone, E. Riccomagno and H.P. Wynn (2001)
90 Analysis of Time Series Structure—SSA and Related Techniques N. Golyandina, V. Nekrutkin and A.A. Zhigljavsky (2001)
91 Subjective Probability Models for Lifetimes Fabio Spizzichino (2001)
92 Empirical Likelihood Art B. Owen (2001)
93 Statistics in the 21st Century Adrian E. Raftery, Martin A. Tanner, and Martin T. Wells (2001)
94 Accelerated Life Models: Modeling and Statistical Analysis Vilijandas Bagdonavicius and Mikhail Nikulin (2001)
95 Subset Selection in Regression, Second Edition Alan Miller (2002)
96 Topics in Modelling of Clustered Data Marc Aerts, Helena Geys, Geert Molenberghs, and Louise M. Ryan (2002)
97 Components of Variance D.R. Cox and P.J. Solomon (2002)
98 Design and Analysis of Cross-Over Trials, 2nd Edition Byron Jones and Michael G. Kenward (2003)
99 Extreme Values in Finance, Telecommunications, and the Environment Bärbel Finkenstädt and Holger Rootzén (2003)
100 Statistical Inference and Simulation for Spatial Point Processes Jesper Møller and Rasmus Plenge Waagepetersen (2004)
101 Hierarchical Modeling and Analysis for Spatial Data Sudipto Banerjee, Bradley P. Carlin, and Alan E. Gelfand (2004)
102 Diagnostic Checks in Time Series Wai Keung Li (2004)
103 Stereology for Statisticians Adrian Baddeley and Eva B. Vedel Jensen (2004)
104 Gaussian Markov Random Fields: Theory and Applications Håvard Rue and Leonhard Held (2005)
105 Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition Raymond J. Carroll, David Ruppert, Leonard A. Stefanski, and Ciprian M. Crainiceanu (2006)
106 Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood Youngjo Lee, John A. Nelder, and Yudi Pawitan (2006)
107 Statistical Methods for Spatio-Temporal Systems Bärbel Finkenstädt, Leonhard Held, and Valerie Isham (2007)
108 Nonlinear Time Series: Semiparametric and Nonparametric Methods Jiti Gao (2007)
109 Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis Michael J. Daniels and Joseph W. Hogan (2008)
110 Hidden Markov Models for Time Series: An Introduction Using R Walter Zucchini and Iain L. MacDonald (2009)
111 ROC Curves for Continuous Data Wojtek J. Krzanowski and David J. Hand (2009)
112 Antedependence Models for Longitudinal Data Dale L. Zimmerman and Vicente A. Núñez-Antón (2009)
113 Mixed Effects Models for Complex Data Lang Wu (2010)
114 Introduction to Time Series Modeling Genshiro Kitagawa (2010)
115 Expansions and Asymptotics for Statistics Christopher G. Small (2010)
116 Statistical Inference: An Integrated Bayesian/Likelihood Approach Murray Aitkin (2010)
117 Circular and Linear Regression: Fitting Circles and Lines by Least Squares Nikolai Chernov (2010)
118 Simultaneous Inference in Regression Wei Liu (2010)
119 Robust Nonparametric Statistical Methods, Second Edition Thomas P. Hettmansperger and Joseph W. McKean (2011)

Monographs on Statistics and Applied Probability 119

Robust Nonparametric Statistical Methods
Second Edition

Thomas P. Hettmansperger
Penn State University
University Park, Pennsylvania, USA

Joseph W. McKean
Western Michigan University
Kalamazoo, Michigan, USA

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2011 by Taylor and Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper

International Standard Book Number: 978-1-4398-0908-2 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Hettmansperger, Thomas P., 1939-
Robust nonparametric statistical methods / Thomas P. Hettmansperger, Joseph W. McKean. -- 2nd ed.
p. cm. -- (Monographs on statistics and applied probability ; 119)
Summary: “Often referred to as distribution-free methods, nonparametric methods do not rely on assumptions that the data are drawn from a given probability distribution. With an emphasis on Wilcoxon rank methods that enable a unified approach to data analysis, this book presents a unique overview of robust nonparametric statistical methods. Drawing on examples from various disciplines, the relevant R code for these examples, as well as numerous exercises for self-study, the text covers location models, regression models, designed experiments, and multivariate methods. This edition features a new chapter on cluster correlated data”-- Provided by publisher.
Includes bibliographical references and index.
ISBN 978-1-4398-0908-2 (hardback)
1. Nonparametric statistics. 2. Robust statistics. I. McKean, Joseph W., 1944- II. Title. III. Series.
QA278.8.H47 2010
519.5--dc22 2010044858

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Dedication: To Ann and to Marge

Contents

Preface

1 One-Sample Problems
  1.1 Introduction
  1.2 Location Model
  1.3 Geometry and Inference in the Location Model
    1.3.1 Computation
  1.4 Examples
  1.5 Properties of Norm-Based Inference
    1.5.1 Basic Properties of the Power Function $\gamma_S(\theta)$
    1.5.2 Asymptotic Linearity and Pitman Regularity
    1.5.3 Asymptotic Theory and Efficiency Results for $\hat{\theta}$
    1.5.4 Asymptotic Power and Efficiency Results for the Test Based on $S(\theta)$
    1.5.5 Efficiency Results for Confidence Intervals Based on $S(\theta)$
  1.6 Robustness Properties of Norm-Based Inference
    1.6.1 Robustness Properties of $\hat{\theta}$
    1.6.2 Breakdown Properties of Tests
  1.7 Inference and the Wilcoxon Signed-Rank Norm
    1.7.1 Null Distribution Theory of $T(0)$
    1.7.2 Statistical Properties
    1.7.3 Robustness Properties
  1.8 Inference Based on General Signed-Rank Norms
    1.8.1 Null Properties of the Test
    1.8.2 Efficiency and Robustness Properties
  1.9 Ranked Set Sampling
  1.10 $L_1$ Interpolated Confidence Intervals
  1.11 Two-Sample Analysis
  1.12 Exercises

2 Two-Sample Problems
  2.1 Introduction
  2.2 Geometric Motivation
    2.2.1 Least Squares (LS) Analysis
    2.2.2 Mann-Whitney-Wilcoxon (MWW) Analysis
    2.2.3 Computation
  2.3 Examples
  2.4 Inference Based on the Mann-Whitney-Wilcoxon
    2.4.1 Testing
    2.4.2 Confidence Intervals
    2.4.3 Statistical Properties of the Inference Based on the MWW
    2.4.4 Estimation of $\Delta$
    2.4.5 Efficiency Results Based on Confidence Intervals
  2.5 General Rank Scores
    2.5.1 Statistical Methods
    2.5.2 Efficiency Results
    2.5.3 Connection between One- and Two-Sample Scores
  2.6 $L_1$ Analyses
    2.6.1 Analysis Based on the $L_1$ Pseudo-Norm
    2.6.2 Analysis Based on the $L_1$ Norm
  2.7 Robustness Properties
    2.7.1 Breakdown Properties
    2.7.2 Influence Functions
  2.8 Proportional Hazards
    2.8.1 The Log Exponential and the Savage Statistic
    2.8.2 Efficiency Properties
  2.9 Two-Sample Rank Set Sampling (RSS)
  2.10 Two-Sample Scale Problem
    2.10.1 Appropriate Score Functions
    2.10.2 Efficacy of the Traditional F-Test
  2.11 Behrens-Fisher Problem
    2.11.1 Behavior of the Usual MWW Test
    2.11.2 General Rank Tests
    2.11.3 Modified Mathisen’s Test
    2.11.4 Modified MWW Test
    2.11.5 Efficiencies and Discussion
  2.12 Paired Designs
    2.12.1 Behavior under Alternatives
  2.13 Exercises

3 Linear Models
  3.1 Introduction
  3.2 Geometry of Estimation and Tests
    3.2.1 The Geometry of Estimation
    3.2.2 The Geometry of Testing
  3.3 Examples
  3.4 Assumptions for Asymptotic Theory
  3.5 Theory of Rank-Based Estimates
    3.5.1 R Estimators of the Regression Coefficients
    3.5.2 R Estimates of the Intercept
  3.6 Theory of Rank-Based Tests
    3.6.1 Null Theory of Rank-Based Tests
    3.6.2 Theory of Rank-Based Tests under Alternatives
    3.6.3 Further Remarks on the Dispersion Function
  3.7 Implementation of the R Analysis
    3.7.1 Estimates of the Scale Parameter $\tau_\varphi$
    3.7.2 Algorithms for Computing the R Analysis
    3.7.3 An Algorithm for a Linear Search
  3.8 $L_1$ Analysis
  3.9 Diagnostics
    3.9.1 Properties of R Residuals and Model Misspecification
    3.9.2 Standardization of R Residuals
    3.9.3 Measures of Influential Cases
  3.10 Survival Analysis
  3.11 Correlation Model
    3.11.1 Huber’s Condition for the Correlation Model
    3.11.2 Traditional Measure of Association and Its Estimate
    3.11.3 Robust Measure of Association and Its Estimate
    3.11.4 Properties of R Coefficients of Multiple Determination
    3.11.5 Coefficients of Determination for Regression
  3.12 High Breakdown (HBR) Estimates
    3.12.1 Geometry of the HBR Estimates
    3.12.2 Weights
    3.12.3 Asymptotic Normality of $\hat{\beta}_{HBR}$
    3.12.4 Robustness Properties of the HBR Estimates
    3.12.5 Discussion
    3.12.6 Implementation and Examples
    3.12.7 Studentized Residuals
    3.12.8 Example on Curvature Detection
  3.13 Diagnostics for Differentiating between Fits
  3.14 Rank-Based Procedures for Nonlinear Models
    3.14.1 Implementation
