
Measures of Association for Cross Classifications PDF

155 Pages·1979·6.638 MB·English

Preview Measures of Association for Cross Classifications

Springer Series in Statistics
Advisors: D. Brillinger, S. Fienberg, J. Gani, J. Hartigan, J. Kiefer, K. Krickeberg

Leo A. Goodman
William H. Kruskal

Measures of Association for Cross Classifications

Springer-Verlag New York Heidelberg Berlin

Leo A. Goodman, Department of Statistics, University of Chicago, Chicago, Illinois 60637, USA
William H. Kruskal, Department of Statistics, University of Chicago, Chicago, Illinois 60637, USA

Library of Congress Cataloging in Publication Data
Goodman, Leo A
Measures of association for cross classifications.
(Springer series in statistics; v. 1)
Includes bibliographies.
1. Sociology-Methodology. 2. Sociology-Statistical methods. I. Kruskal, William, 1919- joint author. II. Title. III. Title: Cross classifications. IV. Series.
HM24.G627 301'.01'82 79-19570

All rights reserved. No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag.
© 1979 by Springer-Verlag New York Inc.
Softcover reprint of the hardcover 1st edition 1979
9 8 7 6 5 4 3 2 1
ISBN-13: 978-1-4612-9997-4
e-ISBN-13: 978-1-4612-9995-0
DOI: 10.1007/978-1-4612-9995-0

Foreword

In 1954, prior to the era of modern high speed computers, Leo A. Goodman and William H. Kruskal published the first of a series of four landmark papers on measures of association for cross classifications. By describing each of several cross classifications using one or more interpretable measures, they aimed to guide other investigators in the use of sensible data summaries. Because of their clarity of exposition, and their thoughtful statistical approach to such a complex problem, the guidance in this paper is as useful and important today as it was on its publication 25 years ago.

Summarizing association in a cross-classification by a single number inevitably loses information. Only by the thoughtful choice of a measure of association can one hope to lose only the less important information and thus arrive at a satisfactory data summary.
The series of four papers reprinted here serve as an outstanding guide to the choice of such measures and their use. Many users view measures of association as they do correlations, applicable to essentially all data sets. To their credit, Goodman and Kruskal argue that ideally each research problem should have one or possibly several measures of association, with operational meaning, developed for its unique needs. Because the Goodman-Kruskal papers provide what amounts to a comprehensive catalogue of existing measures (several of which they themselves created), analysts may begin by examining and attempting to choose wisely from those measures currently available. If none are satisfactory, and new ones are created, the Goodman-Kruskal papers will be helpful as models and guides.

This series of papers evolved over a twenty year period. The first and core paper appeared in 1954. It suggests criteria for judging measures of association and introduces several new measures with specific contextual meanings. Examples and illustrations abound. The 1959 paper serves as a supplement to the initial one and provides additional historical and bibliographic material. The 1963 paper derives large-sample standard errors for the sample analogues of population measures of association and presents some numerical results about the adequacy of large-sample normal approximations. The 1972 paper presents a new look at the asymptotics, and provides a more unified way to derive large-sample variances for those measures of association that can be expressed as ratios of functions of the cell probabilities. Thus the techniques can be used for tried and true measures, and also for ones not yet invented.

Only by rereading these papers many times can one appreciate the perspicacity that the authors have brought to this perplexing problem.
As a colleague of Leo and Bill at The University of Chicago, I was privileged to witness the care and scholarly attention they gave to the last of the measures of association papers. It was truly a labor of love. Thus I am delighted both personally and as a member of the Editorial Advisory Board for the Springer Statistical Series that Springer-Verlag has been able to bring together these four papers in a single volume, so that they can be shared with a new generation of statisticians and scientists.

August, 1979
STEPHEN E. FIENBERG

Preface*

In the early 1950s, as young faculty members at the University of Chicago, we had separate conversations with senior colleagues there about statistical treatment of data that were naturally arranged as cross classifications of counts. One of us talked to Bernard Berelson (then Dean of the Graduate Library School and later the President of the Population Council), who was at that time dealing with extensive cross classifications related to voting behavior. For example, he might have a number of cross classifications of intended vote against educational level for different sections of a city. The other conversations were with the late Louis Thurstone (a major figure in the field of psychometrics, and in particular in the development of factor analysis), who also was dealing with multiple cross classifications in the context of the relationships between various personal characteristics (e.g., leadership ability) and results from various psychological tests.

In both cases the investigator had substantial numbers of cross classifications and needed a sensible way to reduce the data to try to make it coherent. One promising approach was felt to be replacement of each cross classification by a number (or numbers) that measured in a reasonable way the degree of association between the characteristics corresponding to the rows and columns of the tabulated cross classification.
Thus, the two of us were independently thinking about the same question. We discovered our mutual interest during a conversation at a party--we think that it was a New Year's Eve party at the Quadrangle (Faculty) Club--and the paper grew out of that interaction.

We knew something of the existing literature on measures of association for cross classifications, and as we studied it further we recognized that most suggested measures of association were formal and arbitrary, without relevant interpretations--or without interpretations at all. Our contribution was to suggest a number of association measures that have interesting interpretations and to provide a simple taxonomy for cross classifications. As an example of the latter, we emphasized the importance of knowing whether or not the categories of a classification have a natural ordering.

*Most of this preface appeared in "This Week's Citation Classic", Current Contents, Social and Behavioral Sciences, No. 26, 25 June 1979, page 14.

Since cross classifications occur throughout science, since our emphasis on interpretation was perhaps novel, and since our work was quickly incorporated into textbook expositions, citations to the paper became numerous.

We continued work on the topic, digging more deeply into its history and fields of application, and treating at length the relevant approximate sampling theory in an effort to contribute some new approaches and to effect some changes in statistical thinking and practice.
One of us also developed an interest in ordinal measures of association beyond cross classifications as such.[1] The other was led to extensive research in the analysis of multi-way cross classifications, leading to what have come to be known as log-linear model theory and methodology.[2] Another outgrowth, we dare to hope, of our paper has been fresh general concern with descriptive statistics from the viewpoint of finding usefully interpretable characteristics of populations and samples.

In this reprinting, notes appear in the margin at a few points to indicate errors that were corrected in later papers of the sequence. One additional trivial error has been directly corrected. Otherwise the papers appear just as they originally appeared.

We end this preface with a statement of thanks to W. Allan Wallis, first Chairman of the Department of Statistics at the University of Chicago. There are many reasons for us to thank him, but the relevant one now is that he introduced us to Berelson and to Thurstone, and from those introductions our thinking on measures of association arose. Wallis, in fact, did far more than perform introduction: he discussed our nascent work with us, and suggested an important approach with which his name is associated in our first paper.

Chicago, Illinois
September, 1979
Leo A. Goodman
William H. Kruskal

[1] Kruskal, W. H. Ordinal measures of association. J. Amer. Statist. Assoc. 53:814-61, 1958.
[2] Goodman, L. A. The multivariate analysis of qualitative data: interactions among multiple classifications. J. Amer. Statist. Assoc. 65:226-56, 1970.

Contents

Measures of Association for Cross Classifications
1. Introduction
2. Four Preliminary Considerations
3. Conventions
4. Traditional Measures
5. Measures Based on Optimal Prediction
6. Measures Based upon Optimal Prediction of Order
7. The Generation of Measures by the Introduction of Loss Functions
8. Reliability Models
9. Proportional Prediction
10. Association with a Particular Category
11. Partial Association
12. Multiple Association
13. Sampling Problems
14. Concluding Remarks
15. References

Measures of Association for Cross Classifications. II: Further Discussion and References
1. Introduction and Summary
2. Supplementary Discussion to Prior Paper
3. Work on Measures of Association in the Late Nineteenth and Early Twentieth Centuries
4. More Recent Publications
5. References

Measures of Association for Cross Classifications. III: Approximate Sampling Theory
1. Introduction and Summary
2. Notation and Preliminaries
3. Multinomial Sampling over the Whole Double Polytomy
4. Multinomial Sampling within Each Row (Column) of the Double Polytomy
5. Further Remarks
6. References
Appendix

Measures of Association for Cross Classifications. IV: Simplification of Asymptotic Variances
1. Introduction and Summary
2. Multinomial Sampling over the Entire Two-Way Cross Classification
3. Independent Multinomial Sampling in the Rows
4. Use of the Results in Practice
5. When Does σ = 0?
6. Cautionary Note about Asymptotic Variances
References

Measures of Association for Cross Classifications
