Scorers of new SAT get ready for essays

With just two months to go before the much-heralded new SAT is given, a team of English professors and psychometricians is poring over sample essays to determine what kind of writing should be rewarded and what penalized.
Source: The Washington Post

Much of the scoring proceeds swiftly. Many of the essays, written by students in 25 minutes as part of a trial run for the real test, fall into obvious categories: excellent, dreadful or — most common of all — mediocre. But there is some writing that defies easy pigeonholing — such as the first-person story from a star high school actress.

Scores for the essay are all over the map — from 4 for competent to 6 for outstanding. To illustrate her theme, the drama student provides only one example — her own acting experiences — rather than the traditional three. She makes some grammatical errors but has an engaging voice and an argument that is sustained from beginning to end. After a lengthy discussion, the panel reaches agreement.

"We're going to give it a 6," announces Daisy Vickers, director of design and development for Pearson Educational Measurement, which is devising the writing test on behalf of the College Board, which owns the SAT. "Is everybody okay with that?" There is a murmur of assent around the hotel conference table.

For the first time in the 67-year history of the SAT, the March 12 test will include a written essay along with revised reading and math exams. For the millions of high school juniors and seniors who will ultimately take the test, as well as the thousands of U.S. colleges that will use the results in determining whether to admit those students, the stakes in deciding how to score the test could hardly be higher.

Critics of standardized tests have depicted the essays as lightning-fast, formulaic exercises that are unlikely to reveal much about a student's true writing abilities. "This test forces you to write very quickly with little time for reflection," says Adam Robinson, author of best-selling test preparation books. "There's no time for rewriting, which is the essence of good writing."

Proponents concede that the new SAT will not produce award-winning writing. But they defend the test as a useful exercise in establishing whether college applicants can write under pressure. The test will also provide college admissions officers with an authenticated sample of a student's writing to compare against the meticulously prepared essay that is part of most college applications.

No single formula for success
In an attempt to demystify the process, the College Board allowed a reporter to sit in on a pilot scoring session last week, similar to those that will be used to train thousands of test scorers around the country. The reporter agreed not to divulge questions that could be used in future tests but was otherwise free to describe what took place at the meeting.

The behind-the-scenes look at the making of the new SAT suggests that there is no single formula for achieving a high score on the writing portion of the test, and that formulaic writing can result in a lower score. At the same time, it is legitimate to wonder whether the eccentric spark of genius will continue to be rewarded when thousands of test-graders across the country try to implement the guidelines established by the experts.

Unlike traditional multiple-choice questions, some of which will also be on the writing portion of the SAT and are scored by computer, the new essay portion represents a logistical challenge akin to a military operation.

Each essay will be scanned into computers and read by at least two scorers. A force of 3,000 scorers, mainly moonlighting teachers, is being deployed at 15 regional centers. Scorers must read an average of 220 essays in eight to 10 hours. Some will read many more. Others will drop out, exhausted, their brains befuddled by incomprehensible sentences and impossible-to-decipher handwriting.

The preparatory range-finding session is a cross between a debate among art critics on public television and the judging of an ice-skating competition. Bursts of passionate discussion are followed by the grading of the essays, with scores from 0, for a blank sheet of paper or an essay that has nothing to do with the topic, to 6. If two scorers differ by more than one point, a supervisor is summoned to adjudicate. The cumulative score can fall anywhere between 0 and 12.

To guide scorers, the team has already approved a sample set of answers to a question about the benefits and drawbacks of secrecy. The "prompt," as an essay question is called in education parlance, consists of two quotations, one justifying secrecy as an indispensable part of human life, the other attacking it. Students are then asked to develop a point of view on secrecy, with examples to support their argument.

An essay that does little more than restate the question gets a 1. An essay that compares humans to squirrels — if a squirrel told other squirrels about its food store, it would die, therefore secrecy is necessary for survival — merits a 5. Brian A. Bremen, an English professor at the University of Texas at Austin, notes that the writer provides only one real example. Nevertheless, he says, the writer displays "a clear chain of thought" and should be rewarded, "despite his Republican tendencies."

The panel overlooks a few grammatical errors and misspelled words. "F. Scott Fitzgerald once handed in a manuscript with seven consecutive misspelled words," Bremen says. "If you can write like F. Scott Fitzgerald, you will be okay."

"We rewarded [the squirrel paper] because it was unique, and the student came up with it in 25 minutes," says Noreen Duncan, who teaches African American literature at Mercer County College in New Jersey. Some mistakes are permissible, she says, because anything that can be written in that time is, by definition, "a first draft."

The team uses a technique known as "holistic scoring," a euphemism for reading an essay very quickly (a minute or so per paper) and making a snap judgment. This is not like grading a school essay, in which points may be deducted for uncapitalized letters or an insufficient number of paragraphs. The scoring technique puts a premium on a student's ability to develop a logical chain of reasoning over the mechanics of writing.

Reading quickly boosts the productivity of the scorers, who are paid as much as $22 an hour. But that is not the main point, Vickers says. "If they labor over the essay too long, they become too analytical," she says. In the world of holistic scoring, "analytical" is bad. An overall first "impression" is good.

Most scorers end up within a point of each other on most essays. The discussions at the range-finding sessions are designed to establish guidelines for dealing with the difficult-to-categorize essays, many of which will probably be kicked up to a supervisor.

The big question is whether scorers trained to speed-read hundreds of essays will recognize exceptional writing when it comes across their computer screens. A mock scoring session conducted by the Princeton Review, a company that prepares students for the SAT and other high-stakes admissions tests, suggested that outstanding writers such as William Shakespeare would do poorly on the test because they would refuse to write according to the formula.

"Shakespeare would have done just fine on one of these tests," counters Vickers, without promising a 6 for the Bard.

Back at the sample-scoring session, meanwhile, an Illinois high school teacher named Bernard Phelan is keeping everyone entertained with his pointed comments. "This essay has the ring of empty assertion," he says of one effort, which ends up with a 2. "The student is telling us, 'I don't have an example, and I'm not about to provide one any time soon.' "

He awards another essay a 5. "Some kids write but don't think," he explains. "This kid thinks as she writes. There is some awkwardness here, but she moves fluently from topic to topic."

By 5 p.m., after eight hours of scoring essays, the eyes of the 13 panelists have begun to assume a glazed look. "The lighting in here is terrible," Vickers suddenly notices. "We need to get some proper lights in here tomorrow."

The panelists decide to tackle one final essay, which has received scores ranging from 2 (seriously limited) to 5 (reasonably consistent mastery). The essay is virtually illegible — no marks are deducted for bad handwriting — but it is two pages long and is sprinkled with academic-sounding words such as "commodity" and "value."

Ed Hardin, an expert with the College Board, makes a stab at reading the essay out loud. He had awarded it a 5 on the basis of his first impression and the sophisticated vocabulary but changes his mind as he tries to make sense of the stilted prose.

"Somebody is going to have to buy me a drink," he groans halfway through the reading.

It is getting late, and everybody is tired. The panelists agree on a 3 and call it a day.