The More Tests Change…

I am still sorting those boxes full of old letters, records and newspaper clippings! It is hard not to keep stopping and examining the past more carefully. In an odd way it makes me feel better to realize that “I’ve heard that song before.” The education headlines are indeed the old familiar score (see below). Of course, it could also be discouraging. But it reinforces my determination to sustain the work based on the data that matters most: the actual life histories of the human beings schools reach. “You can’t take that away from me,” I remind myself. In the end we each have to make some judgments about what “counts” most to us.

Meanwhile, we keep “counting” in ways that defy quite ordinary common sense. Examples:

[Headline] City Cheats on Reading Test: “The mayor has turned the Chancellor’s smashing two-year increase in the citywide test into ‘the single most important achievement’ of his administration.” From the Village Voice, by Wayne Barrett.

And furthermore,

“It was not surprising that the city’s scores had risen dramatically… the test the city uses is designed to do that… There is some concern that the children learn the art of passing tests, according to Ida Echavarria, director of testing,” reports the NY Times.

Both the above from June 1981.

Just days before the 1981 scandal broke, even astute Albert Shanker’s column in the NY Times was blasting testing critics and praising NYC’s high scores, noting proudly that Washington D.C. students had made similarly big gains. Yes, it requires, he said, “special efforts to overcome” poverty, but “as the recent scores in NYC and D.C. show… the greatest gains were made by minorities and the poor in some of our very toughest neighborhood schools.” No further comment after the exposé.

I arrived in NYC in 1967 and had been an unwitting supporter of testing as a parent, teacher and local school board member. I was even part of a cabal (led by Ann Cook and Herb Mack) to “expose” Chicago’s secret test scores a few years earlier. I was, like Diane Ravitch, a believer. It took experiences that involved both my own children and those I taught in central Harlem to wake me up. The kids and their scores did not match what I knew about them, and NYC’s wild fluctuations led me to become an amateur expert on standardized testing. (Go to for a list of my writings on standardized testing.)

For example, between 1974 and 1975 scores took an amazing turn, going from 33.8% reading on or above grade level to 43.3%. A year later the headline in the NY Times noted “A Slight Decline in Reading in New York Schools,” although the Times noted that the decline was from 1975, which had shown “surprisingly high achievement by pupils compared with earlier years.” What changed? The test publisher. So, the next year the Board of Education contracted with still another test publisher. Guess what? Next year: we all did better.

In 1979 the NY Times front page noted that “City Pupils Remain Behind in Reading.” But there was improvement, although a different test was used that year, so comparisons were hard to make, said reporter Ed Fiske.

In 1984 Gene Maeroff noted that more than 50% were now reading above grade! Victory? An improvement in less than 10 years from below 40% to over 50% reading on grade level. None of my high school teaching friends saw any sign of change in their students who had so miraculously scored better during their elementary years.

A year later Joyce Purnick reported “Reading Scores Fall in City for the First Time in 5 Years.” The Chancellor said “that reading experts had told him the version of the test given this year was more difficult… but suggested that the teacher shortage may also have contributed to the dip in scores.” The Chancellor said “he would meet with a committee to determine… whether to use a different test entirely in the future.”

And so it has gone for the 43 years I have been a NYC school test watcher. I was hardly surprised then to read the headlines a few weeks ago that informed us that the latest test scores that the Mayor touted during his reelection campaign were… inaccurate. In fact, the latest data shows that we are more or less back where we started when Bloomberg became Mayor 8 years ago. The only difference this time is that the dips usually coincide with the appointment of a new Chancellor, and Chancellor Klein is still with us. But in the old days NYC controlled its own tests!

Dizzy from trying to follow these ups and downs?

Remember, these publicized scores went along with a lot of “deep” editorial analysis, plus hours of precious time spent in every school and district carefully dissecting each up and down by class, grade, teacher and kid. Teachers and schools were inundated with sure-fire commercial test prep programs—for doing better next year. And if you are a school teacher now, this should sound familiar.

Given that the tests used were all produced by equally reputable test makers, who promised that their tests were “normed” with expensive and extensive pre-testing, guaranteeing a high degree of reliability and reported measurement error, and built to measure exactly the same thing, how is this bizarre history possible?

When the switch was made from “norm-referenced” tests to so-called “criterion-referenced” tests, I jokingly noted that this was another word for “politically” normed tests, with benchmarks set to meet a particular political agenda. But, since I suspected the old tests were also influenced by politics, criterion-referenced seemed a step forward. However, they came with another decision: to report scores simply as a 1, 2, 3 or 4. Period. A high 3 and a low 3 are indistinguishable, and thus a move from a 3 to a 4 might indicate almost no change, except in headlines.

The climax of this story? Last fall, 2009—before the Mayoral election—we witnessed the claim that another rise had taken place in the 8-year upward curve of test scores under the Mayor’s reign. But—another report this summer has uncovered a new truth—actually test scores this year were back where they were before Bloomberg became Mayor 8 years ago.

I hope this explains why my expertise has convinced me not to believe data collected by any city or state or Federal DOE (domestic or international) regarding attendance, drop-outs or so-called achievement. I know what goes on behind the scenes: at what hour one takes attendance matters, and what constitutes a drop-out depends on how you record it. Like “achievement,” they are equally subject to Campbell’s Law: the data decline in value the more high stakes are attached to them.

I am not anti-data—but I want the real stuff. More on that next time.


5 Responses

  1. Required reading – "The Emperor's New Clothes". 'Nuff said.

  2. Ah, Deborah, with age comes wisdom. Some acquire it sooner than others. ~Diane Ravitch

  3. NYC Parent frustration with City's testing manipulation/obsession is growing:

  4. Yes Debbie… Having worked in the Bureau of Educational Research for the Board of Education, which administered the mandated citywide testing program, let me add a few more historical facts:

     See Wayne Barrett's other front-page Village Voice piece (June 1, 1982), The Politics of Flunking. I was Wayne's primary source on this story and the one you allude to the year before.

     Memory shouldn't be lost about how the BOE boosted the 1981 election year test results (Ed Koch, an earlier "education mayor," was seeking re-election) by using the CAT [California Achievement Test] for 5 straight years leading up to 1981. The test had only two forms. In 1980 and 1981 the same form was used back to back. Talk about test score inflation!

     In 1975 I blew the whistle on favoritism in awarding the city's testing contract to the same company, Harcourt Brace and World, whose Metropolitan Achievement Test had been selected each year without competition. The director of the Bureau of Educational Research was an editor of the test and received royalties from its sale. The Comptroller [Goldin] intervened and issued a scathing report on the cozy arrangement. Next year competitive bidding was instituted and Harcourt put in a bid that was $170K below what it had charged for essentially the same testing program the year before. Not good enough. McGraw-Hill underbid Harcourt by $240K. The Comptroller's Office estimated that NYC had given away $2.3 million to Harcourt over the course of its 12-year testing reign. (That's $2.3 million unadjusted for inflation.) As a reward for my efforts, I was harassed and received a U-rating, back in the good old pre-whistle blower protection days.

     The ironic part of this is that CTB/McGraw-Hill has been in the New York State testing saddle since 1999, given a virtual monopoly over the ELA and math tests that have been given to 1.2 million students each year. And don't forget about the cost of other devices they sell us ad nauseam to help teachers prepare students (it's called interim assessments, etc.) to do well on each year's discredited tests.

     There are numerous other ugly twists and turns that have taken place in the story of the testing program, and that continue to this day in the name of "trust us" and "reform." The only way out is for parents and teachers to just say No to Testing until a real independent investigation takes place into all aspects of testing. The belief we should move forward on promises is belied by the testing program's history. Accountability for the statewide tests and phony results over the last 6 years would be the first step on the road to recovery.

     No testing is better than what we have now. Until something legitimate is available, which could take a few years to develop, let teachers teach and students learn, unfettered by a crooked testing system. ~Fred Smith

  5. This article brings to light one of the most important and often ignored aspects of "testing" and achievement. Who actually knows what is on the tests? Parents don't know, teachers know a little, administrators don't know. We can look at the standards and see the areas covered, but nobody sees the tests from year to year. You can't evaluate progress without looking at the test given. When tests are given in schools they are graded and returned. Everyone can see all the questions and answers. Not so with the achievement tests or other tests given on computer to students with no access to a written transcript of the questions. How can you say something is mastered or not mastered without evaluating the questions asked to reach that decision? And what about grade-level-appropriate questions and the constant ramping up of the standards at the elementary level? I see questions on the open-ended section of our state's test that are unlike anything students have been asked to do before. We covered the skills, but the presentation on the test is so complicated and the tasks required are so foreign that students may not be able to get the answer right. Who evaluates these types of things? Only a handful of people actually see the tests, and everyone seems to think that's fine. I guess everyone just trusts those making and selecting the test questions and assumes they are fair and appropriate. I wish someone would start asking more questions and demanding answers in this area.
