
Statistics for Allied Health Professionals - MEDLABSTATS.com PDF

120 Pages·2013·4.06 MB·English


STATISTICS FOR ALLIED HEALTH PROFESSIONALS
Dr Tom Hartley, Version : 20-12-2013

CONTENTS

ABOUT THE AUTHOR
    Author’s Contact Details
PURPOSE OF THIS BOOK
A FREQUENTLY ASKED QUESTIONS GUIDE TO USING THIS MONOGRAPH
1: INTRODUCTION
2: ELEMENTARY PROBABILITY THEORY
3: WHAT IS MEANT BY A ‘SIGNIFICANT DIFFERENCE’
4: CHI SQUARE TEST
    4.1: ONLINE CHI SQUARE CALCULATOR
5: WILCOXON RANK SUM TEST ON INDEPENDENT SAMPLES
6: THE MANN-WHITNEY U-TEST
7: KRUSKAL-WALLIS TEST
8: HISTOGRAMS AND MEDIANS
9: QUARTILES AND BOX AND WHISKER PLOTS
10: BUMP PLOTS
11: THE NORMAL DISTRIBUTION OF FREQUENCIES, ITS MEAN AND ITS STANDARD DEVIATION
    11.1 CHARACTERISTICS OF THE NORMAL DISTRIBUTION
    11.2 KURTOSIS
    11.3 SKEW
12: Z-SCORE
13: THE F TEST
14: THE Z TEST FOR COMPARING TWO SAMPLE MEANS
15: THE STUDENT’S t TEST
16: GOODNESS OF FIT OF A HISTOGRAM OF YOUR DATA TO A NORMAL ERROR CURVE
17: LINE OF BEST FIT BY LEAST SQUARES LINEAR AND NON-LINEAR REGRESSION
18: THE CORRELATION COEFFICIENT
19: ERRORS ON THE ESTIMATES OF THE SLOPE AND INTERCEPT OF A LINEAR REGRESSION
20: GEOMETRIC MEAN REGRESSION
21: MULTIPLE LINEAR REGRESSION
22: RANK ORDER CORRELATION
23: LOGISTIC REGRESSION
24: ANOVA (ANALYSIS OF VARIANCE)
25: THE DESIGN OF QUESTIONNAIRES
    25.1: ASSESSING THE “QUALITY” OF YOUR QUESTIONNAIRE – CRONBACH’S ALPHA
    25.2: EXTRACTING THE ‘MEANING’ FROM QUESTIONNAIRES
26: PROBABILITY (DECISION) TREES
    26.1 BAYES THEOREM
    26.2 DATA MINING BASED UPON BAYES THEOREM
27: THE STATISTICAL DESIGN OF RESEARCH PROJECTS
    27.1: SAMPLE SIZE REQUIRED FOR A STUDY TO REACH A SPECIFIED POWER
    27.2: POWER OF A STUDY
    27.3: RESOLVING THE PROBLEM OF APPROPRIATE PLACEBO/CONTROL GROUP DESIGN
28: APPLIED STATISTICS
    28.1: THE SHEWHART OR LEVEY JENNINGS QC PLOT
    28.2 TRIGG TRACKING SIGNAL
    28.3 YOUDEN PLOTS
    28.3 THE BLAND AND ALTMAN PLOT FOR COMPARING TWO METHODS OF MAKING THE SAME MEASUREMENT
    28.4 CURVE FITTING USING CUBIC SPLINES
APPENDIX ONE : CRITICAL VALUES OF CHI SQUARES
APPENDIX TWO : WILCOXON RANK SUMS
APPENDIX THREE : TABLE FOR THE MANN-WHITNEY U TEST
APPENDIX FOUR : ORDINATES (y values or Frequencies) OF THE STANDARD NORMAL CURVE
APPENDIX FIVE : STANDARD NORMAL (z) DISTRIBUTION : AREA UNDER THE CURVE FROM THE MEAN TO YOUR SPECIFIED VALUE OF Z
APPENDIX SIX : THE F DISTRIBUTION : 95th PERCENTILE VALUES FOR THE F DISTRIBUTION
APPENDIX SEVEN : STUDENT’S T TEST TABLE OF t
APPENDIX EIGHT : TABLE OF CRITICAL VALUES OF THE LEAST SQUARES REGRESSION CORRELATION COEFFICIENT
APPENDIX NINE : J-WALK ENHANCED DATA ENTRY FORM FOR EXCEL : HOW TO USE IT
APPENDIX TEN : VBA CODE FOR DATA ENTRY FORMS AND ARTICLE BY ….. ON HOW TO BUILD YOUR OWN DATA ENTRY FORM IN EXCEL

ABOUT THE AUTHOR

Dr Hartley is a Clinical Biochemist with a longstanding interest in statistics.
He graduated with a B.Sc.(Hons) degree in Chemistry from the University of Manchester Institute of Science and Technology in 1969 and then took up an MRC-funded research student position within the Medical Physics Department of the University of Leeds and the MRC Mineral Metabolism Unit. He was awarded the higher degree of PhD by the University of Leeds in 1974. The title of his thesis was ‘The Metabolism of Copper, Zinc and the Four Major Cations in Man’. Between 1972 and 1974, he carried out two years of post-doctoral research at St Mary’s Hospital in Portsmouth, UK, on a clinical trial of a new Total Parenteral Nutrition amino acid solution.

In 1974 he emigrated to Adelaide in South Australia, where he took up a post in the Clinical Chemistry Dept at the Institute of Medical and Veterinary Science. While there he extended his clinical nutrition, metabolic bone disease and statistical quality control work before moving to take up the post of Scientist Second in Charge of the Clinical Chemistry Dept. at the Royal Hobart Hospital in 1988. He continues as a Senior Biochemist in the Department of Pathology at the Royal Hobart Hospital and has also been its Quality Manager since 2000. In June 2004 he took up a 0.25 position in the School of Human Life Sciences, University of Tasmania, as a Senior Research Fellow.

He is the author of the book ‘Computerised Quality Control – 2nd Ed’, published by Ellis Horwood in 1990, as well as a large number of papers in scientific journals. His research interests include nutritional biochemistry, trace elements, antioxidants, instrumental methods of analysis, laboratory computing and statistics. Currently his three foci of interest are laboratory quality systems, biomedical statistics and the HPLC analysis of clinical samples for capsaicins. The latter involves the assay of capsaicins, the ‘hot’ components of chillis, in human plasma samples using HPLC.
Author’s Contact Details

Dr Tom Hartley, Quality Manager and Senior Biochemist, Pathology Services, Royal Hobart Hospital, Hobart, Tasmania 7000, Australia
Dr Tom Hartley, Senior Research Fellow, School of Human Life Sciences, University of Tasmania, Launceston, Tasmania 7250, Australia.
Email [email protected]
Phone +61 3 6222 8780
Website : www.medlabstats.com
_____________________________________________________________________
PURPOSE OF THIS BOOK

Several years’ experience of teaching statistical concepts to a variety of Allied Health undergraduates has highlighted the need for a concise and approachable book on statistics. When these students graduate the aim is that they enter their professions with a basic knowledge of project design and statistics, and with experience of the practical analysis of simple datasets using the readily available tools in Microsoft Excel.

This book is designed so that the statistical concepts are described in the first half and the second half is devoted to illustrating how these statistical tests are accessed and used in Microsoft Excel. Where Excel does not provide the required tool the reader is directed to www resources, either on the author’s own website or on others that have proved to be up to the job.
_____________________________________________________________________
A FREQUENTLY ASKED QUESTIONS GUIDE TO USING THIS MONOGRAPH

Q : Why do I need to know about probability theory to understand statistics ?
Answer : Read Section 2.

Q : Researchers are always talking and writing about how their research was statistically significant. What do they mean by this ?
Answer : Read Section 3.

Q : I don’t usually measure anything on my patients but I do notice that they tend to fall into categories according to a few basic characteristics, e.g. women between 45 and 55 who are overweight usually present with or end up with Type 2 Diabetes.
Is there a statistical method for confirming these kinds of ‘hunches’ ?
Answer : Read Sections 4, 5, 6 and 7.

Q : When would I find it useful to plot a histogram of my data ?
Answer : Read Sections 8, 9 and 15.

Q : What kind of plots are useful in illustrating differences in patient outcomes or responses ‘before’ and ‘after’ treatments ?
Answer : Read Sections 8, 9 and 10.

Q : Researchers are always talking about their data being Normally Distributed. What do they mean by this ?
Answer : Read all of Section 11.

Q : If my data are Normally Distributed what statistical tests can I do on them ?
Answer : Read Sections 12, 13, 14 and 15.

Q : Researchers talk about being able to prove a cause and effect from their research. What do they really mean by this and how do they prove it ?
Answer : Read Sections 17, 18, 19, 20, 21, 22, 23 and 24.

Q : In my work I suspect that the clinical effects I am observing are due to a combination of different causes. How would I go about proving this ?
Answer : Read Sections 21 and 24.

Q : In my work I find that my patients tend to fall into ‘Low’, ‘Moderate’ and ‘High’ risk categories. Is there a statistical method to deal more objectively with this kind of classification ?
Answer : Read Sections 22 and 23.

Q : In my area of work we gather a lot of data from our patients using questionnaires. Is there a way of ensuring that our questionnaires are ‘good’ and ‘unbiased’ ?
Answer : Read Section 25.

Q : I have collected a lot of data on my patients. Is there a way of exploring that data for relationships that I probably have not realised are there ?
Answer : Read Section 26 first and see if that gives you some insights. Then you can also read Sections 21, 22, 23 and 24.

Q : I would like to apply for a research grant or a project grant but I am put off by the requirements to write up the statistical section.
Answer : Read Sections 27, 28.1 (if you are going to be making lab measurements) and 28.3 (if you are going to be making lab measurements of the same parameter using different instruments or methods). Once you have decided what your hypotheses are and what types of data you are going to collect for analysis, you should read those sections that best fit your data types, i.e. choose the appropriate mix of parametric and non-parametric tests.

Q : I have time series data. What are the best statistical tests to use for analysing that kind of data ?
Answer : Read Sections 28.2 and 28.4.
_____________________________________________________________________
1: INTRODUCTION

Objectives in medical research are usually to intervene for a beneficial effect or to intervene to determine the cause of a clinical symptom. The intervention is usually an experimental drug, an experimental therapy, or a new diagnostic test or investigation.

The question always arises: how can we objectively decide that the intervention has been a success ? First we have to define what constitutes a success, a failure and an equivocal outcome. Then a fairly intuitive next step would be to measure the frequencies of successes, failures and equivocal outcomes (neither successes nor failures). Then we use statistics to analyze the numbers of successes, failures and equivocal outcomes against the hypothesis that these numbers are not related in any way to the intervention. This is the null hypothesis.
_____________________________________________________________________
2: ELEMENTARY PROBABILITY THEORY

The simplest analogy is the tossing of a coin: it can land heads up or tails up. There is a 1/2 chance that there will be a head and a 1/2 chance that there will be a tail. 1/2 equals 0.50.
It is usual to use the letter p to signify probability and to use the phrases:

the probability of a head is p = 0.50
the probability of a tail is p = 0.50

From these two phrases we can derive one of the fundamental rules of probability: if there are n mutually exclusive events that between them cover every possible outcome, then the sum of the individual probabilities must be ONE.

p_HEAD + p_TAIL = 0.50 + 0.50 = 1

The next simplest analogy is a six-sided die: it can land with a 1, 2, 3, 4, 5 or 6 face up. This means that there is a 1/6 chance of landing with 1 facing up. It follows that the value of p for throwing a ‘1’ is 0.1667, and likewise p is 0.1667 for throwing a ‘2’, a ‘3’, a ‘4’, a ‘5’ or a ‘6’. And the p for throwing any number is 6 * 1/6 = 1 (6 * 0.1667 = 1.000 to three decimal places). So we know we have got our probability predictions right.
_____________________________________________________________________
3: WHAT IS MEANT BY A ‘SIGNIFICANT DIFFERENCE’

In clinical research we are usually involved in activities in which we want to cause a change in a parameter; these fall under the general heading of intervention studies. In this scenario the rule is that we look at our statistical results with the objective of finding those parameters that have changed with a statistical p value equal to or less than 0.05. A p value this small means that a change at least as large as the one observed would be expected no more than 5% of the time if the intervention really had no effect, and by convention we then say that we have detected a significant difference.

Many statistical tests report p values for a ‘one tailed’ and a ‘two tailed’ test. You can use the ‘one tailed’ test only in scenarios where it is clear that the parameter you have tested statistically could only go one way because of the study design, e.g. body weight will fall in a ‘hunger strike’ study! But in other studies you cannot be so dogmatic and you should then be more ‘open minded’ and opt for the two tailed p value.
For example you may be studying a blood pressure lowering drug. The company assures you that, based on their studies of ‘monotherapy’, it lowers blood pressure, but you want to use it in a more realistic scenario: in combination with other drugs usually co-prescribed to a hypertensive, e.g. a diuretic. Under that scenario you cannot be dogmatic; it may well cause no change or an increase in blood pressure in some of your test subjects. So if in doubt always use the p value from a ‘two tailed’ test.

Other researchers undertake observational studies, i.e. non-intervention studies. Usually they have a question and they then go out and collect data that they reasonably expect to provide an answer to their question, e.g. does socioeconomic status affect blood pressure ? They would usually start from the premise that socioeconomic status does not affect blood pressure; they take the ‘null’ hypothesis route. Under this scenario they are usually hoping that all their ‘two tailed’ statistical tests will return p values greater than 0.05. When this is observed they can state that they have found no statistically significant evidence that socioeconomic status affects blood pressure.

But they may find, for example, within some of the parameters they have measured that there is a significant difference in, say, the number of cigarettes smoked between two socioeconomic status groups. Then if they look at the blood pressures in these two subgroups they may find that there are significant differences, with p values equal to or less than 0.05. Regardless of the direction of the difference, researchers involved in non-intervention studies must always apply ‘two tailed’ tests.
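The practical consequence of choosing between the two kinds of p value can be seen with a short calculation. The sketch below is not taken from the monograph: it assumes a test statistic that follows the standard Normal (z) distribution, and the value z = 1.8 is purely hypothetical. Under that assumption the ‘two tailed’ p value is exactly twice the ‘one tailed’ p value.

```python
import math

def one_tailed_p(z):
    """Upper-tail probability of a standard Normal test statistic z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_tailed_p(z):
    """Probability of a result at least as extreme as z in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))

z = 1.8  # hypothetical test statistic, chosen only for illustration
p_one = one_tailed_p(z)
p_two = two_tailed_p(z)

print(f"one tailed p = {p_one:.4f}")
print(f"two tailed p = {p_two:.4f}")
print("significant at 0.05 (one tailed):", p_one <= 0.05)
print("significant at 0.05 (two tailed):", p_two <= 0.05)
```

With this hypothetical statistic the ‘one tailed’ p value falls below 0.05 while the ‘two tailed’ p value does not, which is why the choice has to be justified by the study design rather than made after seeing the result.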
_____________________________________________________________________
4: CHI SQUARE TEST

This is the test to use when we want to check whether the frequencies that we observed in a clinical trial or a new clinical investigation support the hypothesis that the intervention has improved the probabilities of clinical improvement or diagnosis. We tabulate our data in the following format :

TABLE 1

                                          TREATMENTS
OUTCOMES counted as     Physiotherapy   Physio plus    Antibiotics   ROW
number of Patients      only            Antibiotics    only          TOTALS
Improved
No Change
Deteriorated
COLUMN TOTALS                                                        Grand Total

The formula for the Chi Square Test is

\chi^2 = \sum \frac{(O - E)^2}{E}

This formula is saying ‘for every observed frequency of an event (O) subtract the expected frequency of that event (E), square it and divide it by the expected frequency of that event’. You then sum all those calculations to get Chi Squared.

How do you know what the expected frequencies of the events are in each box of the table ? The rule is to multiply the row total by the column total and divide by the grand total. This is best left to the computer program you use for calculating Chi Squares.

What you do need to realise is that if the observed frequencies in each square equalled the expected frequencies in each square then the value of Chi Square would be ZERO, a result that would mean that the outcomes did not differ between your treatments AT ALL. In Medical Research we are usually hoping for large values of Chi Square, signifying that our intervention has been significantly effective!

Statisticians have modelled this type of experiment with all possible outcomes and drawn up Statistical Tables which show with what probability a particular value of Chi Square can be expected to occur. See the Table in APPENDIX ONE.

The only missing item that you need to calculate is the number of degrees of freedom for your study. This is easily calculated from the formula:
Degrees of Freedom for Chi Square Test = (Number of Data Columns in the Table minus One) multiplied by (Number of Data Rows in the Table minus One)

In this example we have three treatments and three possible outcomes, which means we have FOUR degrees of freedom: (3-1) * (3-1) = 2 * 2 = 4

Reading across the row corresponding to FOUR degrees of freedom in APPENDIX ONE we have three values, 7.78, 9.49 and 13.28, which correspond to p values of 0.10, 0.05 and 0.01 respectively. Imagine that we have a Chi Square result of 8.2. This lies between 7.78 and 9.49, to the left of the value 9.49, so we can state that the p value for this Chi Square Test is >0.05 and <0.10.

Any statistical test that returns a ‘p’ value greater than 0.05 is a sign that there is NO statistically significant finding using the test. Any statistical test that returns a ‘p’ value of 0.05 or less is a sign that there IS a statistically significant finding using the test.

So in conclusion we can say that there was no statistically significant evidence that the three treatments had different outcomes.

4.1: ONLINE CHI SQUARE CALCULATOR

There is an online calculator for doing Chi Square testing at: www.physics.csbsju.edu/stats/contingency.html or you can download one based upon an Excel Spreadsheet from the author’s website : www.medlabstats.com/alliedhealth
_____________________________________________________________________
5: WILCOXON RANK SUM TEST ON INDEPENDENT SAMPLES

In the Introduction I made the comment “First we have to define what constitutes a success, a failure and an equivocal outcome. Then a fairly intuitive next step would be to measure the frequencies of successes, failures and equivocal outcomes (neither successes nor failures)”. I am now going to add a second fairly intuitive next step, and that is just to look at the data in terms of the relative sizes of the responses and see if these correspond to your classification or hypothesis.
Suppose you had a fitness intervention programme, which had the aim of getting the participants to lose a bit of weight, and it was being run at two gyms. Let’s call the two groups East Gym and North Gym and look at their weight losses six months into their participation, see Table 2.

TABLE 2

EAST GYM          NORTH GYM
Andrew   7.5      Fred    7.6
Ben      7.7      Geoff   8.2
Carl     8.3      Harry   8.5
Dennis   8.6      Ian     8.8
Eddie    8.9      John    9.1

We would agree that if we sorted all the weight losses in ascending order AND if there was no difference between the gym groups then the results should alternate East, North, East, North …… So if we then went on to add up the ranks for the two gyms we would expect roughly equal results. When we do this on small samples like those shown in Table 2 we get Rank Sums of 25 and 30 for the East and North gyms respectively (the ranks 1 to 10 total 55); not exactly equal, but this illustrates the ‘small sample size’ effect.

This kind of ‘phenomenon’ was studied by a famous statistician, Frank Wilcoxon, who devised the simple and elegant Wilcoxon Rank Sum Test. He then went on to devise the Table in APPENDIX TWO that covers a range of possible situations of groups of various sizes. This is read as follows: read across to 5 in the category ‘Number of Subjects in Group with Fewest Members’ and then down to row 5, which equals the ‘Number of Subjects in the Group with the Most Members’. In that box you can see two numbers, 17 and 38. These are the two limits that you interpret your calculated rank sum against. Because 25 and 30 are both INSIDE these limits you can state that there was no significant difference between the weight losses of the two gym groups in this study at the p<=0.05 level of significance.
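The ranking and summing steps described above can be sketched in a few lines of Python. This is only an illustration, not the author’s spreadsheet method: the function name rank_sums is hypothetical, tied values are given the average of the ranks they occupy (a common convention that the text does not discuss), and the limits 17 and 38 are simply the values quoted above for two groups of five.

```python
# Weight losses from Table 2 (units as given in the table)
east = {"Andrew": 7.5, "Ben": 7.7, "Carl": 8.3, "Dennis": 8.6, "Eddie": 8.9}
north = {"Fred": 7.6, "Geoff": 8.2, "Harry": 8.5, "Ian": 8.8, "John": 9.1}

def rank_sums(group_a, group_b):
    """Rank the pooled values in ascending order (1 = smallest) and
    return the sum of ranks for each group. Ties share an average rank."""
    a, b = list(group_a), list(group_b)
    pooled = sorted(a + b)
    rank_of = {}
    for value in set(pooled):
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        rank_of[value] = sum(positions) / len(positions)
    return sum(rank_of[v] for v in a), sum(rank_of[v] for v in b)

east_sum, north_sum = rank_sums(east.values(), north.values())

# Limits quoted in the text for two groups of 5 subjects
lower, upper = 17, 38
inside = lower < east_sum < upper and lower < north_sum < upper

print("East rank sum :", east_sum)
print("North rank sum:", north_sum)
print("Both rank sums inside the limits:", inside)
```

A useful check on any hand calculation is that the two rank sums must add up to N * (N + 1) / 2, where N is the total number of subjects; for the ten people in Table 2 that total is 55.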
