When It Comes to Test Scores, How Good Is Good Enough?

Classroom teachers, principals, college professors and other educators have been meeting in Denver for one of the biggest events in the PARCC consortium’s history: reviewing test items to begin the process of setting “threshold” or “cut” scores. The cut scores are what give meaning to the scores students earn. Does a score of 721 show that a student is doing well? Or that he or she has more work to do to get back on track for the next grade level, or for college or a career?
The PARCC states invited national and state media to the first performance level setting week, which focused on the high school assessments, to learn more about this relatively unknown process, which establishes valid and reliable scores that parents and teachers can use to help students succeed in school and beyond. Reporters spoke with some of the educators working long hours to make the judgment calls, and painted a clear picture of the very human, yet structured, process of setting the scores that separate the performance levels.
NPR’s Cory Turner asked the educators how what they were doing affects students. The challenge, they told him, is setting score expectations, based on the standards, that reflect what kids need to know to be successful. “The cut score is the manifestation of how good is good enough,” says Mary Ann Snider, chief academic officer for the Rhode Island Department of Education. She noted that the balance lies between being realistic about each individual student’s achievement and providing the information to help them on their paths to success.
“We have to get enough of that right so that we're not giving kids a false sense of accomplishment.”
In a Politico story by Kim Hefling, college professor and administrator Loretta Holloway painted a disappointing picture of the many students who think they are ready for college but are not. Like others, she is looking to the assessments as a way of helping students know much earlier whether they are on track, and of giving teachers and parents information they can use to help get them back on track.
Loretta Holloway, a panelist from Framingham State University who teaches freshman English classes, said one of the reasons she’s motivated to participate is because she’s had far too many college freshmen in her office in tears because they come in thinking they are ready for college but in reality they are not. “They show up to college believing that they are ready and fail, and their parents have taken out loans and they’ve taken out loans. They are working jobs and they are struggling because they aren’t ready,” Holloway said.
Chalkbeat Colorado’s Todd Engdahl also tapped into the tension educators face as they attempt to raise the bar while being realistic about where kids are and where they need to be.
Panelists were asked repeatedly about that gap between how students actually perform and how they should perform. They all came down on the side of setting high expectations. “We’ve got to raise the standard if we want to do better … The only way to do that is to keep raising the bar,” said Robin Helms, a math teacher at Wray High School on Colorado’s eastern plains. She served on one of the panels. “Students only give you what you ask them, so you have to push,” said Katherine Horodowich, an English teacher at Hot Springs High School in Truth or Consequences, N.M. “We have to set the bar higher.” [Marti] Shirley said even though the standards and the scoring may look hard, there’s wide agreement among educators “that these standards are attainable. Are they attainable tomorrow? That’s not the case. … Trust us. Give us the benefit of the doubt that we know what we’re doing.”
Adam Clark’s article in The Star-Ledger captured the intense atmosphere of collaboration among educators from multiple states, as well as the significance of the cut scores they were helping to set. He spoke at length with David Knecht, an English teacher at Lenape Regional High School in Burlington County, New Jersey.
Ideally, Knecht said, every student will be in Level 5. But he expects results to be scattered throughout the performance levels. "We are setting the levels to determine where they are now," Knecht said. "So that ultimately teachers all around the country can help to bring students up to that ideal."
Clark observed the diversity of experience and geography among the more than 100 educators involved in the work.
Knecht's committee included an administrator from Maryland, a high school English teacher from New Mexico, a representative from an Illinois community college and a faculty member from a four-year university in Massachusetts, among others. Shirley, who teaches in southern Illinois, worked alongside a math teacher from Colorado, a math coach from the District of Columbia, and a curriculum administrator from a college in Maryland, she said. The educators represented schools from both wealthy and poor areas. Coming to a consensus wasn't easy, they said. “It's not like we all sit down and make one judgment and that's the end of it,” said Robin Helms, a high school math teacher on the eastern plains of Colorado. “It's not just a wham, bam, we're done.”
In a story in the Hechinger Report, Emmanuel Felton delved deeper into the technical side of the performance level setting process:
This difficult balancing act is in part due to the nature of these tests. The new Common Core aligned tests are different than say the SAT or ACT. SAT scores are based on where a student falls on the distribution of all students; no matter how tough the test is, some students will get a perfect score. With annual state tests like PARCC, there is no such guarantee. These tests are only concerned with how well the students have learned the standards.
He also observed that the results of the test might provide a “reality check” for students, teachers, and families in PARCC states.
“How my students are going to do isn’t important to me in this process,” said Marti Shirley, a panelist and a high school math teacher in Mattoon, Illinois. “It might be a tough test, but it’s going to be a measure of what they should be able to do under the standards.”
With the conclusion of performance level setting last month, the recommendations of these 200 or so educators went to the PARCC state education commissioners and superintendents, who approved final threshold scores within the range the educators recommended. Student scores will range from 650 to 850, with 700 representing the threshold of Level 2, 725 the threshold of Level 3, and 750 the threshold of Level 4. The threshold score for Level 5 will vary slightly by test and will be approximately 800. Each state makes its own decisions about possible additional uses of the score results, and each state will release results on its own timeline.
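For readers who want to see how the thresholds carve up the score range, here is a minimal sketch in Python. It is an illustration only, not PARCC's scoring software. The function name `performance_level` is hypothetical, the Level 5 threshold of 800 is a placeholder because the actual value varies slightly by test, and the sketch assumes a score exactly at a threshold falls into the higher level.

```python
from bisect import bisect_right

# Score range and the Level 2-4 thresholds as stated in the article.
MIN_SCORE, MAX_SCORE = 650, 850
LEVEL_2, LEVEL_3, LEVEL_4 = 700, 725, 750

# Assumption: the article says the Level 5 threshold is "approximately 800"
# and varies by test, so 800 is only a placeholder default.
ASSUMED_LEVEL_5 = 800


def performance_level(score, level_5_threshold=ASSUMED_LEVEL_5):
    """Return the performance level (1-5) for a scale score.

    Assumes a score equal to a threshold belongs to the higher level.
    """
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"score must be between {MIN_SCORE} and {MAX_SCORE}")
    thresholds = [LEVEL_2, LEVEL_3, LEVEL_4, level_5_threshold]
    # bisect_right counts how many thresholds the score has reached.
    return 1 + bisect_right(thresholds, score)


if __name__ == "__main__":
    for s in (699, 721, 750, 810):
        print(s, "-> Level", performance_level(s))
```

Under these assumptions, the score of 721 mentioned at the top of this post lands in Level 2, four points short of the Level 3 threshold.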
David Connerty-Marin is the communications director for PARCC. An earlier version of this post appeared on PARCC's website as “Setting Cut Scores.”