
ERIC EJ889148: Tyler Heights Is Not Alone: Score Inflation Is Common in Education--and Other Fields PDF

2010·0.46 MB·English
by  ERIC


AMERICAN EDUCATOR | SUMMER 2010

They were told that science was a stepping stone to all sorts of learning and how much students loved it. But I saw very little science in the third grade at Tyler Heights. The kits in Johnson’s room would be opened to roll marbles one time early in the year, and later to make goo and sculpt a landform and to compare seeds and pebbles in a petri dish. These were only a tiny fraction of the experiments inside, and at any rate, they were presented in class severely abridged—no hypotheses, no data. Mostly students read from the textbook and did worksheets. The only full pass through the scientific method was made after the MSA, in the days spent preparing for the science fair.

“I’m a realist,” McKnight had told the teachers. “What gets taught is what gets tested.” The rest—even if it is part of the state standards—gets left behind. When it came to the accountability movement, McKnight epitomized the ambivalence of most educators I’ve met: she was supportive of standards and testing in theory, but painfully aware of the unintended consequences. She was passionate about the subject she used to teach—social studies, and particularly geography—but when it came down to it, social studies fared no better than science.

Tyler Heights’ third-graders got only the most cursory introduction to economics and Native Americans, and much of the curriculum was skipped altogether. The students were geographically ignorant. Approaching the Naval Academy after a three-mile bus ride, several shouted, “Look, it’s New York!” The third-graders had heard Africa mentioned a lot but were not sure if it was a city, country, or state. (They never suggested “continent.”) At the end of the year, the children in Johnson’s class were asked to name all the states they could. Cyrus knew the most: three. He couldn’t name any countries, though, and when asked about cities, he thrust his finger in the air triumphantly. “Howard County!”

McKnight had asked teachers to give students passages on social studies and science topics for supplemental reading lessons in preparation for the MSA. But the passages the third-graders read touched on random knowledge—Billie Holiday’s alcoholism, female Arctic explorers—and breezed by quickly. They were hard to understand on the fly when the children had had such little exposure, at school and at home, to history, culture, and the natural world.†

* * *

At a conference on assessment, a reading specialist from the Maryland Department of Education told teachers and principals desperate to unlock the secrets of the MSA that BCRs are not tests of writing skills at all, but of reading. “I’m not saying kids shouldn’t write well-developed paragraphs,” she told the standing-room-only crowd. “But that’s not what we’re worried about on this test.”

“You could bullet it, list key phrases, and you could get the same number of points as someone who wrote a well-crafted answer,” McKnight said. The formula is a helpful scaffold, she said, but “if the only thing you’re teaching is BCRs, your kids are not learning to write.”

The third-graders at Tyler Heights, then, did not learn to write. They learned, thanks to a timer broadcast on the overhead projector, to fill in the box of eight lines in seven to nine minutes. They learned to “proof and polish” with a special purple pen, and whisper their paragraphs to themselves through C-shaped sections of PVC pipe held to their ears—what they called “whisper phoning,” a strategy for detecting if your answer makes sense. They learned to adhere to the BATS formula in BCRs like the one Johnson led her students through one day:

Damon and Pythias is a play because it has the elements of a play. Some elements of a play are that plays have stage directions. Also, there is a narrator. This play also has a lot of characters. So I know this play has all the features it needs.

The BCRs tended to repeat themselves, so the children worked on a limited range of questions teachers knew would be on the county benchmark tests and suspected would be on the MSA. The third-graders answered again and again what traits described the main character of a story. They wrote the “I know this is a play because” BCR about 10 times but never got to act out a play. They wrote “I know this is a fairy tale because” and “I know this is a fable because” but never tried their hand at creating either. About a fake brochure they wrote, “The text features that make this easy for a third-grader to understand are italics, numbering, and underline.” But they never made their own brochures with their own text features; the only things they underlined were hundred-dollar words. They wrote “I know this is a poem because it has rhyme, rhythm, and stanzas” about 50 times, Johnson estimated, but they only wrote three poems.

The Tyler Heights teachers knew that the BCR focus was a problem but were either unwilling or unable to veer from the program—they felt they were not allowed. One day in the teachers’ lounge, two former teachers who were now an aide and a mentor reminisced about the days when third-graders read novels and did chemistry experiments and worked in groups to design versions of the 13 colonies and did writing, real writing. A resource teacher who was an active part of the school’s laser-sharp focus over the last few years began to question her own role. She listened to the veterans and added her two cents. “While our scores were really good last year, can I tell you our kids are any smarter? I don’t know.”

* * *

Tyler Heights was not explicitly ordered to de-emphasize topics that are not tested; then again, nobody from the school district, and nobody who lauded the school for its scores, bothered to make sure the whole curriculum was taught. On the last day of MSA testing, McKnight said to me, “MSA, that’s just the bottom of what kids should know. It’s not like we were calling them brilliant. We’re still shooting for the basement. We celebrate the bottom right now. I pray we don’t have to keep celebrating the bottom.” ☐

†For the relationship between background knowledge and reading skills, see E. D. Hirsch, Jr., The Knowledge Deficit: Closing the Shocking Education Gap for American Children (New York: Houghton Mifflin, 2006).

Tyler Heights Is Not Alone
Score Inflation Is Common in Education—and Other Fields

BY DANIEL KORETZ

Every year, newspaper articles and news releases from education departments around the nation tell us that test scores are up again, often dramatically. Usually, there are some grades or districts that have not made substantial gains, and the gaps in performance between poor and rich, and majority and minority, often fail to budge. Nevertheless, the main story line is usually positive: performance is getting better, and rapidly.

Unfortunately, this good news is often more apparent than real. Scores on the tests used for accountability have become inflated, badly overstating real gains in student performance. Some of the reported gains are entirely illusory, and others are real but grossly exaggerated. The seriousness of this problem is hard to overstate. When scores are inflated, many of the most important conclusions people base on them will be wrong, and students—and sometimes teachers—will suffer as a result.

This is the dirty secret of high-stakes testing. You may see occasional references to this problem in newspapers, but for the most part, news reports and announcements of scores by states and school districts accept increases in scores at face value.

When I and others who work on this issue point it out, the reactions often range from disbelief to anger. So perhaps it is best to start on less controversial ground. We see something akin to score inflation in many other fields as well. It is so common, in fact, that it has the name Campbell’s law in social sciences: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”¹ One can find examples of Campbell’s law in the media from time to time that provide a hint of how score inflation arises.

The most disturbing example of Campbell’s law that I have encountered was reported by the New York Times in 2005. The School of Medicine and Dentistry at the University of Rochester had surveyed cardiologists around the state. As the Times reported, “An overwhelming majority of cardiologists in New York say that, in certain instances, they do not operate on patients who might benefit from heart surgery, because they are worried about hurting their rankings on physician scorecards issued by the state.”² Fully 83 percent of respondents said that the reporting of mortality rates had this effect, and 79 percent admitted that “the knowledge that mortality statistics would be made public” had affected their own decisions about whether to perform surgery.*

So it should not be surprising that when the heat is turned up, educators—and students—will sometimes behave in ways that inflate test scores. Actually, it would be quite remarkable, given how pervasive the problem is in other fields, if none of them did.

Advocates of current test-based accountability systems often counter by arguing, “So what if the gains are distorted? What matters is that students learn more, and if we get that, we can live with some distortion.” Hypothetically, yes, we could live with it if we knew that students were in fact learning more, and if the distortions were small enough that they did not seriously mislead people and cause them to make incorrect decisions. But in fact, we usually cannot distinguish between real and bogus gains. Because so many people assume that if scores are increasing we can trust that kids are learning more, there is a disturbing lack of good evaluations of these systems, even after more than three decades of high-stakes testing. What we do know is that score inflation can be enormous, more than large enough to seriously mislead people.

As a result, we need to be more realistic about using tests as a part of educational accountability systems. Systems that simply pressure teachers to raise scores on one test (or one set of tests in a few subjects) are not likely to work as advertised, particularly if the increases demanded are large and inexorable. They are likely instead to produce substantial inflation of scores and a variety of undesirable changes in instruction, such as an excessive focus on old tests, an inappropriate narrowing of instruction, and a reliance on teaching test-taking tricks.

I strongly support the goal of improved accountability in public education. I saw the need for it when I was an elementary school and junior high school teacher many years ago. I certainly saw it as the parent of two children in school. Nothing in more than a quarter century of education research has led me to change my mind on this point. And it seems clear that student achievement must be one of the most important things for which educators and school systems should be accountable. However, we need an effective system of accountability, one that maximizes real gains, and minimizes bogus gains and other negative side effects.

In all, educational testing is much like a powerful medication. If used carefully, it can be immensely informative, and it can be a very powerful tool for changing education for the better. Used indiscriminately, it poses a risk of various and severe side effects. ☐

Daniel Koretz is the Henry Lee Shattuck Professor of Education at Harvard University’s Graduate School of Education and a member of the National Academy of Education. This sidebar is excerpted with permission from Measuring Up: What Educational Testing Really Tells Us, Harvard University Press, copyright © 2008 by the President and Fellows of Harvard College.

*These numbers may be off by a modest amount, but not by enough to make the results less appalling. Only 65 percent of the sampled surgeons responded to the survey, which is a marginally acceptable response rate. The risk is that surgeons who did not respond would have given different answers than those who did. But even if all 35 percent who did not respond would have replied to these questions in the negative—an extremely unlikely case—that would still leave more than half saying that publication of mortality measures led to surgeons’ declining to do procedures that could have benefited patients.

Endnotes

1. Donald T. Campbell, “Assessing the Impact of Planned Social Change,” in Social Research and Public Policies: The Dartmouth/OECD Conference, ed. Gene M. Lyons (Hanover, NH: Public Affairs Center, Dartmouth College, 1975), 35.
2. Marc Santora, “Cardiologists Say Rankings Sway Choices on Surgery,” New York Times, January 11, 2005.
