
ERIC ED471461: A Scalable Set of ESL Reading Comprehension Items


DOCUMENT RESUME

ED 471 461          TM 034 627

AUTHOR        Perkins, Kyle
TITLE         A Scalable Set of ESL Reading Comprehension Items.
PUB DATE      2002-00-00
NOTE          35p.
PUB TYPE      Reports - Research (143)
EDRS PRICE    EDRS Price MF01/PC02 Plus Postage.
DESCRIPTORS   *Adults; Classification; *English (Second Language); Limited English Speaking; *Reading Comprehension; *Scaling; *Test Items
IDENTIFIERS   Test of English as a Foreign Language; *Unidimensionality (Tests)

ABSTRACT
Guttman implicational scaling techniques were used to identify a unidimensional set of English as a Second Language reading comprehension items. Data were analyzed from 202 students who sat for an institutional administration of the Test of English as a Foreign Language (TOEFL). The examinees who contributed to the scalable set had significantly higher TOEFL scores than those who did not contribute to the scalable set. The distribution of native languages represented in the scalable pool was significantly different from the native language distribution of the entire sample. The scalable items had significantly fewer syllables in their question stems than the nonscalable items. The scalable item question taxonomy distribution deviated significantly from the question taxonomy distribution for all the items. The results are discussed in relation to the Linguistic Threshold Hypothesis, language transfer, capacity constrained comprehension, psycholinguistic processing approaches, universal grammar, restructuring, and the Competition Model. (Contains 1 figure, 2 tables, and 41 references.) (Author/SLD)

A Scalable Set of ESL Reading Comprehension Items

Kyle Perkins
Southern Illinois University Carbondale

Abstract

Guttman implicational scaling techniques were used to identify a unidimensional set of ESL reading comprehension items. The examinees who contributed to the scalable set had significantly higher TOEFL scores than those persons who did not contribute to the scalable set. The distribution of native languages represented in the scalable pool was significantly different from the native language distribution of the entire sample. The scalable items had significantly fewer syllables in their question stems than the nonscalable items. The scalable item question taxonomy distribution deviated significantly from the question taxonomy distribution for all the items. The results are discussed in relation to the Linguistic Threshold Hypothesis, language transfer, capacity constrained comprehension, psycholinguistic processing approaches, universal grammar, restructuring, and the Competition Model.
Introduction

The purposes of this study are (1) to identify a set of items in an ESL reading comprehension test that are truly scalable and unidimensional, (2) to describe the test passages using the Lexile Framework for Reading (Stenner, 1996), (3) to describe the question stems and question options using objective measurement of reading comprehension, (4) to describe the subjects in terms of their total TOEFL scores and native languages, and (5) to discuss the findings in relation to the Linguistic Threshold Hypothesis (Bernhardt and Kamil, 1995), language transfer (Grabe and Stoller, 2002), capacity constrained comprehension (Just and Carpenter, 1992), psycholinguistic processing (Snow, 1998), universal grammar (Gass and Selinker, 2001), restructuring (McLaughlin, 1990), and the Competition Model (MacWhinney, 1987; Bates and MacWhinney, 1981).

The quest to identify a set of reading comprehension items that are truly scalable and unidimensional (i.e., that measure a single construct) is motivated by the wide variety of knowledge, skills, abilities, and strategies discussed in texts and articles on researching and teaching reading comprehension. Three randomly chosen examples should be enough to make the point.

Johnston (1983) wrote that one must consider the following factors in describing reading comprehension assessment tasks: "production requirements; memory and retrieval requirements; reasoning requirements; motivation; purpose; social setting and interaction; expectation and perceived task demands; and test-wiseness" (p. 34).

Omaggio (1986) listed ten factors involved in reading: "recognizing the script of a language; deducing the meaning and use of unfamiliar vocabulary; understanding information that is stated explicitly; understanding implications not explicitly stated; understanding relationships within sentences; understanding relationships between the parts of a text through cohesive devices, both grammatical and lexical; identifying the main point or the most important information; distinguishing the main idea from the supporting detail; extracting the main points in order to summarize; and understanding the communicative value and function of the text" (p. 151).

Grabe and Stoller (2002) mention lower- and higher-level processes that are engaged when we read. The lower-level processes include lexical access, syntactic parsing, semantic proposition formation, and working memory activation. The higher-level processes include the text model of comprehension, the situation model of reader interpretation, background knowledge and inferencing, and executive control processes (p. 27).

A review of factor analytic and multiple regression studies of reading comprehension shows an even greater variety of findings. The following list indicates the number of "factors" identified in reading measures and by whom: Davis (1944), two; Derrick (1953), three; Davis (1968, 1972), five.

Subjects

Data were analyzed from 202 students who sat for an institutional administration of the TOEFL.
The subject pool had an average score of 456.99 (SD = 59.51) on the overall TOEFL, and the distribution of native languages was as follows:

Chinese       50
Japanese      35
Korean        28
Spanish       27
Arabic        13
Thai          11
Cantonese      4
Malay          4
Greek          4
Hindi          3
Portuguese     3
Turkish        3
Urdu           3
Russian        2
Unknown        2
Swedish        1
Somali         1
German         1
Kurundi        1
Malinke        1
French         1
Indonesian     1
Romanian       1
Amharic        1
Mandarin       1

Guttman Implicational Scaling

Guttman implicational scaling was utilized to identify a scalable set of items. Guttman scaling analyzes the underlying characteristics of three or more items to determine whether the interrelationships between the items meet the properties that define a Guttman scale. Two of those properties are unidimensionality and cumulativeness. Unidimensionality implies that the items must all measure movement toward or away from the same underlying construct. Cumulativeness implies that the items can be ordered by difficulty and that subjects who "pass" a difficult item will also "pass" the easier items, and vice versa (Torgerson, 1958).

Operationally, one looks for the extent to which scores of 1 for a given item are associated with scores of 1 for all items that have been determined to be less difficult, and for the extent to which scores of 0 for a given item are associated with scores of 0 for all items that have been determined to be more difficult. In conducting a Guttman scaling procedure, one seeks the degree to which the data fit the model. Deviations from the expected pattern are counted as errors, the errors are aggregated, and coefficients are produced that enable the researcher to ascertain whether the items are scalable, unidimensional, and cumulative.

Four statistics are associated with Guttman implicational scaling. "The coefficient of reproducibility is a measure of the extent to which a respondent's scale score is a predictor of one's response pattern. It varies from 0 to 1. A general guideline to the interpretation of this measure is that a coefficient of reproducibility higher than .9 is considered to indicate a valid scale. The minimum marginal reproducibility constitutes the minimum coefficient of reproducibility that could have occurred for the scale given the cutting points used and the proportion of respondents passing and failing each of the items. The difference between the coefficient of reproducibility and the minimum marginal reproducibility indicates the extent to which the former is due to response patterns rather than the inherent cumulative interrelationship of the variables used. This difference is called the percent improvement and is actually the difference in two percents rather than a ratio itself. The final measure is obtained by dividing the percent improvement by the difference between 1 and the minimum marginal reproducibility. The denominator represents the largest value that the percent improvement may attain, and the resulting ratio is called the coefficient of scalability. The coefficient of scalability also varies from 0 to 1, and should be well above .6 if the scale is truly unidimensional and cumulative" (Nie, Hull, Jenkins, Steinbrenner, and Bent, 1975, pp. 532-33).
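To make the four statistics in the quotation above concrete, the following minimal Python sketch computes them for a subjects-by-items matrix of 0/1 responses. It only illustrates the definitions quoted from Nie et al. (1975); it is not the routine used in the study, and the function name and the error-counting convention (each departure from the pattern predicted by a respondent's scale score counts as one error) are assumptions of this sketch.

import numpy as np

def guttman_scale_stats(responses):
    """Return the four Guttman scalogram statistics for a 0/1 matrix.

    responses: subjects-by-items array, 1 = item answered correctly.
    Error counting (assumed here): a respondent with a scale score of s is
    predicted to pass the s easiest items, and every cell that departs from
    that predicted pattern counts as one error.
    """
    r = np.asarray(responses)
    n_subjects, n_items = r.shape

    # Proportion passing each item; order items from easiest to hardest.
    pass_rates = r.mean(axis=0)
    r = r[:, np.argsort(-pass_rates)]

    # Predicted response pattern from each respondent's scale score.
    scores = r.sum(axis=1)
    predicted = (np.arange(n_items) < scores[:, None]).astype(int)
    errors = int(np.sum(r != predicted))

    # Coefficient of reproducibility: 1 minus the error rate over all responses.
    rep = 1.0 - errors / (n_subjects * n_items)
    # Minimum marginal reproducibility: reproducibility obtainable from the item margins alone.
    mmr = float(np.maximum(pass_rates, 1.0 - pass_rates).mean())
    # Percent improvement and coefficient of scalability, as defined in the quotation.
    improvement = rep - mmr
    scalability = improvement / (1.0 - mmr)
    return rep, mmr, improvement, scalability

By the guidelines quoted above, a set of items would be retained as a Guttman scale only if the coefficient of reproducibility exceeded .9 and the coefficient of scalability was well above .6.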
The comprehensibility or 7 8 difficulty of a message is governed largely by the familiarity of the semantic units and by the complexity of the syntactic structures used in constructing the message. "As far as the semantic component is concerned, it is clear that most operationalizations are proxies for the probability that a person will encounter a word in context and thus infer its meaning (Bormuth, 1966). Klare (1963) builds the case for the semantic component varying along a familiarity-to-rarity continuum. Knowing the frequency of words as they are used in written and oral communication provides the best means of inferring the likelihood that a word will be encountered and thus become a part of the individual's receptive vocabulary. "Variables such as the average number of letters or syllables per word are actually proxies for word frequency. They capitalize on the high negative correlation between the length of words and the frequency of word usage. Polysyllabic words are used less frequently than monosyllabic words, making word length a good proxy for the likelihood of an individual being exposed to them. "Sentence length is a powerful proxy for the syntactic complexity of a passage. One important caveat is that sentence length is not the underlying causal influence (Chall, 1988). Researchers sometimes incorrectly assume that manipulation of sentence length will have a predictable effect on passage difficulty. Davidson and Kantor (1982), for example, illustrate rather clearly that sentence length can be reduced and difficulty increased and vice versa. 8 9 "Klare (1963) provides a possible interpretation for how sentence length works in predicting passage difficulty. He speculates that the syntactic component varies in the load placed on short-term memory. This explanation also is supported by Crain and Shankweiler (1988), Shankweiler and Crain (1986), and Liberman, Maim, Shankweiler, and Werfelman (1982), whose work has provided evidence that sentence length is a good proxy for the demands that structural complexity places upon verbal short-term memory" (Stenner, 1996, pp. 9-10). Meta Metrics (Meta Metrics, 1995) computer program was used to analyze the test reading passages. The program includes sentence length and word frequency in its analysis and reports the difficulty in Lexiles. The Lexiles are anchored at the low end (200) on text from seven basal primers and at the high end (1200) on text from the Electronic Encyclopedia (Grolier, 1986). Question Stem and Question Options Measures For each question stem, the number of syllables and the average word frequency for the words appearing in the question stem were recorded. The word frequency measure used was the raw count of how often a given word appeared in a corpus of 5,088,721 words sampled from a broad range of school materials (Carroll, Davies, and Richman, 1971). The number of syllables in the question options were also recorded in addition to the average word frequency for the words appearing in the question options. Syllables are considered in this research as a proxy for syntactic complexity, and word 9 10
