Medical Statistics
Fourth Edition
A Textbook for the Health Sciences

David Machin
Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre, Singapore, Medical Statistics Group, School of Health and Related Research, University of Sheffield, UK, Children’s Cancer and Leukaemia Group, University of Leicester, UK

Michael J Campbell
Medical Statistics Group, School of Health and Related Research, University of Sheffield, UK

Stephen J Walters
Medical Statistics Group, School of Health and Related Research, University of Sheffield, UK

Copyright © 2007 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England
Telephone (+44) 1243 779777
Email (for orders and customer service enquiries): [email protected]
Visit our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher.
Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770620.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The Publisher is not associated with any product or vendor mentioned in this book.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Other Wiley Editorial Offices
John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA
Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA
Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany
John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons Canada Ltd, 6045 Freemont Blvd, Mississauga, Ontario, L5R 4J3, Canada

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Anniversary Logo Design: Richard J. Pacifico

Library of Congress Cataloging-in-Publication Data
Campbell, Michael J., PhD.
Medical statistics : a textbook for the health sciences / Michael J. Campbell, David Machin, Stephen J. Walters. – 4th ed.
p. ; cm.
Includes bibliographical references.
ISBN 978-0-470-02519-2 (cloth : alk. paper)
1. Medical statistics. 2. Medicine–Research–Statistical methods. I. Machin, David, 1939– II. Walters, Stephen John. III.
Title.
[DNLM: 1. Biometry–methods. 2. Research Design. 3. Statistics. WA 950 C189m 2007]
R853.S7C36 2007
610.72′7–dc22

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 978-0-470-02519-2

Typeset in 10.5/12.5 Times by SNP Best-set Typesetter Ltd., Hong Kong
Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wilts
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.

Contents

Preface to the Fourth Edition

1 Uses and abuses of medical statistics
  1.1 Introduction
  1.2 Why use statistics?
  1.3 Statistics is about common sense and good design
  1.4 Types of data
  1.5 How a statistician can help
  1.6 Further reading
  1.7 Exercises

2 Describing and displaying categorical data
  2.1 Summarising categorical data
  2.2 Displaying categorical data
  2.3 Points when reading the literature
  2.4 Exercises

3 Describing and displaying quantitative data
  3.1 Summarising continuous data
  3.2 Displaying continuous data
  3.3 Within-subject variability
  3.4 Presentation
  3.5 Points when reading the literature
  3.6 Exercises

4 Probability and decision making
  4.1 Types of probability
  4.2 Diagnostic tests
  4.3 Bayes’ Theorem
  4.4 Relative (receiver)–operating characteristic (ROC) curve
  4.5 Points when reading the literature
  4.6 Exercises

5 Distributions
  5.1 Introduction
  5.2 The Binomial distribution
  5.3 The Poisson distribution
  5.4 Probability for continuous outcomes
  5.5 The Normal distribution
  5.6 Reference ranges
  5.7 Points when reading the literature
  5.8 Technical details
  5.9 Exercises

6 Populations, samples, standard errors and confidence intervals
  6.1 Populations
  6.2 Samples
  6.3 The standard error
  6.4 The Central Limit Theorem
  6.5 Standard errors for proportions and rates
  6.6 Standard errors of differences
  6.7 Confidence intervals for an estimate
  6.8 Confidence intervals for differences
  6.9 Points when reading the literature
  6.10 Technical details
  6.11 Exercises

7 p-values and statistical inference
  7.1 Introduction
  7.2 The null hypothesis
  7.3 The p-value
  7.4 Statistical inference
  7.5 Statistical power
  7.6 Confidence intervals rather than p-values
  7.7 One-sided and two-sided tests
  7.8 Points when reading the literature
  7.9 Technical details
  7.10 Exercises

8 Tests for comparing two groups of categorical or continuous data
  8.1 Introduction
  8.2 Comparison of two groups of paired observations – continuous outcomes
  8.3 Comparison of two independent groups – continuous outcomes
  8.4 Comparison of two independent groups – categorical outcomes
  8.5 Comparison of two groups of paired observations – categorical outcomes
  8.6 Non-Normal distributions
  8.7 Degrees of freedom
  8.8 Points when reading the literature
  8.9 Technical details
  8.10 Exercises

9 Correlation and linear regression
  9.1 Introduction
  9.2 Correlation
  9.3 Linear regression
  9.4 Comparison of assumptions between correlation and regression
  9.5 Multiple regression
  9.6 Logistic regression
  9.7 Correlation is not causation
  9.8 Points when reading the literature
  9.9 Technical details
  9.10 Exercises

10 Survival analysis
  10.1 Time to event data
  10.2 Kaplan–Meier survival curve
  10.3 The logrank test
  10.4 The hazard ratio
  10.5 Modelling time to event data
  10.6 Points when reading the literature
  10.7 Exercises

11 Reliability and method comparison studies
  11.1 Introduction
  11.2 Repeatability
  11.3 Agreement
  11.4 Validity
  11.5 Method comparison studies
  11.6 Points when reading the literature
  11.7 Technical details
  11.8 Exercises

12 Observational studies
  12.1 Introduction
  12.2 Risk and rates
  12.3 Taking a random sample
  12.4 Questionnaire and form design
  12.5 Cross-sectional surveys
  12.6 Non-randomised studies
  12.7 Cohort studies
  12.8 Case–control studies
  12.9 Association and causality
  12.10 Points when reading the literature
  12.11 Technical details
  12.12 Exercises

13 The randomised controlled trial
  13.1 Introduction
  13.2 Why randomise?
  13.3 Methods of randomisation
  13.4 Design features
  13.5 Design options
  13.6 Meta-analysis
  13.7 The protocol
  13.8 Checklists for design, analysis and reporting
  13.9 Number needed to treat (NNT)
  13.10 Points when reading the literature
  13.11 Exercises

14 Sample size issues
  14.1 Introduction
  14.2 Study size
  14.3 Continuous data
  14.4 Binary data
  14.5 Prevalence
  14.6 Subject withdrawals
  14.7 Internal pilot studies
  14.8 Points when reading the literature
  14.9 Technical details
  14.10 Exercises