
Foundations of Applied Statistical Methods PDF

168 Pages·2014·4.275 MB·English

Preview Foundations of Applied Statistical Methods

Foundations of Applied Statistical Methods

Hang Lee
Department of Biostatistics
Massachusetts General Hospital
Boston, MA, USA

ISBN 978-3-319-02401-1
ISBN 978-3-319-02402-8 (eBook)
DOI 10.1007/978-3-319-02402-8
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013951231
© Springer International Publishing Switzerland 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Researchers who design and conduct experiments or sample surveys, perform statistical inference, and write scientific reports need adequate knowledge of applied statistics. To build adequate and sturdy knowledge of applied statistical methods, a firm foundation is essential. I have come across many researchers who had studied statistics in the past but are still far from ready to apply the learned knowledge to their problem solving, and others who have forgotten what they had learned. This could be partly because the mathematical technicality of the study material was above their mathematical proficiency, or because the worked examples they studied often failed to address the essential fundamentals of the applied methods.

This book is written to fill the gap between the traditional textbooks, which involve an ample amount of technically challenging mathematical expressions, and the worked-example-oriented data analysis guidebooks, which often underemphasize fundamentals. The chapters of this book are dedicated to spelling out and demonstrating, not merely explaining, the necessary foundational ideas so that motivated readers can learn to fully appreciate the fundamentals of the commonly applied methods and revivify forgotten knowledge of the methods without having to deal with complex mathematical derivations or attempt to generalize oversimplified worked examples of plug-and-play techniques. Detailed mathematical expressions are exhibited only if they are definitional or intuitively comprehensible. Data-oriented examples are illustrated only to aid the demonstration of fundamental ideas.

This book can be used as a self-review guidebook for applied researchers or as an introductory statistical methods course textbook for students not majoring in statistics.

Boston, MA, USA
Hang Lee

Contents

1 Warming Up: Descriptive Statistics and Essential Probability Models
  1.1 Types of Data
  1.2 Description of Data Pattern
    1.2.1 Distribution
    1.2.2 Description of Categorical Data Distribution
    1.2.3 Description of Continuous Data Distribution
    1.2.4 Stem-and-Leaf
  1.3 Descriptive Statistics
    1.3.1 Statistic
    1.3.2 Central Tendency Descriptive Statistics for Quantitative Outcomes
    1.3.3 Dispersion Descriptive Statistics for Quantitative Outcomes
    1.3.4 Variance
    1.3.5 Standard Deviation
    1.3.6 Property of Standard Deviation After Data Transformations
    1.3.7 Other Descriptive Statistics for Dispersion
    1.3.8 Dispersions Among Multiple Data Sets
    1.3.9 Caution to CV Interpretation
    1.3.10 Box and Whisker Plot
  1.4 Descriptive Statistics for Describing Relationships Between Two Outcomes
    1.4.1 Linear Correlation Between Two Continuous Outcomes
    1.4.2 Contingency Table to Describe an Association Between Two Categorical Outcomes
    1.4.3 Odds Ratio
  1.5 Two Useful Probability Distributions
    1.5.1 Gaussian Distribution
    1.5.2 Density Function of Gaussian Distribution
    1.5.3 Application of Gaussian Distribution
    1.5.4 Standard Normal Distribution
    1.5.5 Binomial Distribution
  1.6 Study Questions
  Bibliography

2 Statistical Inference Focusing on a Single Mean
  2.1 Population and Sample
    2.1.1 Sampling and Non-sampling Errors
    2.1.2 Sample- and Sampling Distributions
    2.1.3 Standard Error
  2.2 Statistical Inference
    2.2.1 Data Reduction and Related Nomenclature
    2.2.2 Central Limit Theorem
    2.2.3 The t-Distribution
    2.2.4 Testing Hypotheses
    2.2.5 Accuracy and Precision
    2.2.6 Interval Estimation and Confidence Interval
    2.2.7 Bayesian Inference
    2.2.8 Study Design and Its Impact to Accuracy and Precision
  2.3 Study Questions
  Bibliography

3 t-Tests for Two Means Comparisons
  3.1 Independent Samples t-Test for Comparing Two Independent Means
    3.1.1 Independent Samples t-Test When Variances Are Unequal
    3.1.2 Denominator Formulae of the Test Statistic for Independent Samples t-Test
    3.1.3 Connection to the Confidence Interval
  3.2 Paired Sample t-Test for Comparing Paired Means
  3.3 Use of Excel for t-Tests
  3.4 Study Questions
  Bibliography

4 Inference Using Analysis of Variance for Comparing Multiple Means
  4.1 Sums of Squares and Variances
  4.2 F-Test
  4.3 Multiple Comparisons and Increased Type-1 Error
  4.4 Beyond Single-Factor ANOVA
    4.4.1 Multi-factor ANOVA
    4.4.2 Interaction
    4.4.3 Repeated Measures ANOVA
    4.4.4 Use of Excel for ANOVA
  4.5 Study Questions
  Bibliography

5 Linear Correlation and Regression
  5.1 Inference of a Single Pearson's Correlation Coefficient
    5.1.1 Q & A Discussion
  5.2 Linear Regression Model with One Independent Variable: Simple Regression Model
  5.3 Simple Linear Regression Analysis
  5.4 Linear Regression Models with Multiple Independent Variables
  5.5 Logistic Regression Model with One Independent Variable: Simple Logistic Regression Model
  5.6 Consolidation of Regression Models
    5.6.1 General and Generalized Linear Models
    5.6.2 Multivariate Analyses and Multivariate Model
  5.7 Application of Linear Models with Multiple Independent Variables
  5.8 Worked Examples of General and Generalized Linear Models
    5.8.1 Worked Example of a General Linear Model
    5.8.2 Worked Example of a Generalized Linear Model (Logistic Model) Where All Multiple Independent Variables Are Dummy Variables
  5.9 Study Questions
  Bibliography

6 Normal Distribution Assumption-Free Nonparametric Inference
  6.1 Comparing Two Proportions Using 2×2 Contingency Table
    6.1.1 Chi-Square Test for Comparing Two Independent Proportions
    6.1.2 Fisher's Exact Test
    6.1.3 Comparing Two Proportions in Paired Samples
  6.2 Normal Distribution Assumption-Free Rank-Based Methods for Comparing Distributions of Continuous Outcomes
    6.2.1 Permutation Test
    6.2.2 Wilcoxon's Rank Sum Test
    6.2.3 Kruskal–Wallis Test
    6.2.4 Wilcoxon's Signed Rank Test
  6.3 Linear Correlation Based on Ranks
  6.4 About Nonparametric Methods
  6.5 Study Questions
  Bibliography
