Sometimes justice is delivered through life’s little ironies. According to this article from yesterday’s New York Times, we’re experiencing a shortage of psychometricians, those who practice that esoteric “science,” a big part of which is mapping raw scores on standardized tests into percentile rankings. This, apparently, is an indirect consequence of No Child Left Behind and its reliance on such testing for measuring learning (or non-learning, as the case may be). It seems we’re testing so much that we don’t have enough of these measurement folks to validate the tests. First, there was the scandal with the SAT scoring. Now this. If I didn’t know better, I’d think God was trying to send a message to President Bush.
Here, it’s Finals Week. Many of our high-enrollment courses (and some smaller classes too) use those scantron sheets for scoring final exams. Apart from individual student complaints about erasures not being treated correctly, I’ve never heard of us having scoring problems with the scantrons. But as for what those final exam scores say about student learning, I am more doubtful now than I’ve ever been about what the exams actually measure and about why the entire culture is immersed in this type of high-stakes testing.
There are some narrow instances where I understand what the tests do and indicate. After I had entered the doctoral program in Economics at Northwestern, I was told that the GRE in Economics was meaningless, but the basic GRE in Math was a strong predictor of success in the Economics program. That made sense to me. More recently, I learned something similar about the Chemistry sequence here. The Math part of the SAT is a very good predictor of performance there. This is not surprising: math aptitude is necessary for doing well, both in Chemistry and in Economics. The SAT and GRE provide reasonably good measures of a simple type of math problem solving skill – exactly what I mean by math aptitude. But I’m guessing this is the exception rather than the rule.
After getting killed in my course evaluations by students with comments like “test us on what we know, not on what we don’t know,” I stopped writing exams in my intermediate economics course that required the student to show some cleverness in setting up the problem and analyzing it in order to get the right answer. Coming from graduate school, I had thought it was that cleverness that the instructor wanted to encourage and hence that it should be a big part of any exam. The students, in contrast, wanted something else. They had studied and put time into the course “learning” the material. They wanted some way to show that on the test. After about 10 years of teaching (I stopped teaching intermediate microeconomics after my second year except for an honors section and only returned to it 7 or 8 years later) I figured out that if students had seen essentially the same type of problem in a homework assignment or a practice exam, then they would feel the exam was fair because they’d be prepared for it. They had their bag of tricks and my test questions could be found in it. Even with that, my exams were perceived as hard by the students. But before that change, they were viewed as impossible.
Impossible tests, of course, don’t measure very much, unless there are a handful who can actually do them. When that is the case, it is really not very hard to identify these individuals. But for the rest, it is not at all clear that gradations in exam performance measure much of anything. That has always been my issue with relying on tests. So in the Honors course that is just ending, I didn’t use exams at all. I had homework and projects – that’s it. I learned what the students knew from their writing and their discussion in class. That seems to me far more informative, and clearly preferable if their performance doesn’t have to be placed on a bell curve and I can rely on my qualitative judgments in giving them their end-of-semester grade.
Then there is the issue of the stress that exams can cause in students. We know there is good stress and bad. The good kind encourages elevated performance, as in running a race. The keyed-up individual “competes” whether the grading is on a curve or not, and taking the test is a way of proving oneself worthy (though of exactly what I’m not sure). I was a good test taker when I was a student. I know the feeling. I’m guessing that most of my Honors students know the feeling as well.
But what about those students who don’t have reasonable expectations that they will perform well on the tests? How do they feel under this stress? Don’t many underprepare because they hate the feeling of coming up short and thus have a ready excuse in their lack of prior effort? And for the diligent types who do put in the hours beforehand but feel that their performance might not show the benefits of their work, where is their reward? Do they benefit from the pressure induced by the exams?
My own irritation (the instructor can get testy too) is that we’re so enveloped in this method of evaluation without much, if any, critical thinking about why. We do it because that’s the way it’s been done. If we don’t do it, then somehow we’re cheating, getting away with something.
Maybe, just maybe, when God is speaking to Mr. Bush, he’s speaking to us too.