Friday, July 15, 2005

On Data, Schlock Social Science, and Big Brother

Yesterday, I spent a little time reading some of Casey Green’s online columns at Campus Technology. I liked the piece “Digitizing Maslow,” but the one that really got me thinking was “Digital Tweed: Bring Data.” Apart from giving a hard elbow to the ribcage of Margaret Spellings, the new Secretary of Education in the Bush Administration, the column is a reminder to all of us in higher education that the “Bring Data” mantra is for real and may rise up and bite us in an unpleasant spot if we’re not ready for it.

This past spring at one of the Campus Ed Tech Board meetings here in Urbana, we discussed the monitoring capabilities of WebCT Vista. I was interested and surprised to learn that even faculty who I thought were active users of that environment were nonetheless entirely ignorant of the Vista reporting capabilities available to instructors. So I showed some of these off with screen shots. (I’ll review those below.) We had a representative from the Provost’s office at this meeting, and in discussing the NCA accreditation visit a few years hence, he said they’ve moved away from across-the-board evaluations and now do something much more targeted, so perhaps Casey Green has been making the “bring data” argument in too melodramatic a way. But he also said that perhaps the accrediting associations in the various disciplines might want broader evidence of student engagement and student learning. That said, I’m fairly confident that most college-level or departmental administrators are even less informed than the Ed Tech Board faculty about the data reports that Vista can generate.

Vista does indeed offer various types of usage reports, and someone with the appropriate privileges can get those at the class level, aggregated or disaggregated, as well as at the individual student level. To see what these look like, I’ve run a report on my Campus Honors class from spring 2004. This is a full semester’s worth of data for a class with 15 students. There may be some information in the relative tool usage. (My use of assessments was comparatively high because I used content surveys.) This same set of data can be viewed disaggregated by student (here I changed the names for privacy reasons), and one can view an individual student’s data disaggregated by session.
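To make concrete what these reports contain, here is a small sketch in Python (pandas) of how one log of activity records could be rolled up at the class level, disaggregated by student, or viewed per session. The column names and numbers are made up for illustration; this is not Vista’s actual export format.

```python
import pandas as pd

# Made-up activity records; Vista's actual export format surely differs.
log = pd.DataFrame({
    "student": ["s01", "s01", "s02", "s02", "s03"],
    "session": [1, 1, 1, 2, 1],
    "tool":    ["Discussions", "Assessments", "Discussions",
                "Assessments", "Content"],
    "clicks":  [12, 3, 7, 5, 9],
    "minutes": [34.0, 15.5, 22.0, 18.0, 40.5],
})

# Class-level aggregate: relative tool usage across all students.
by_tool = log.groupby("tool")[["clicks", "minutes"]].sum()

# The same data disaggregated by student.
by_student = log.groupby(["student", "tool"])[["clicks", "minutes"]].sum()

# One student's data disaggregated by session.
s01_by_session = (log[log["student"] == "s01"]
                  .groupby("session")[["clicks", "minutes"]].sum())

print(by_tool, by_student, s01_by_session, sep="\n\n")
```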

While looking at these reports may already arouse in you thoughts about Big Brother, let’s suspend those for a bit and ask a straightforward question: what is being measured? My sense is that mostly it is measuring the students’ “clicks” inside the course management system. So, for example, in the Discussion boards the student can open a thread as a scroll with one click. There might be 10 or 15 messages in that thread. I’m thinking that is counted as one message read (but I’d like to be enlightened on that point). Certainly, the student could have other applications open and toggle between the CMS and, say, an Instant Messaging session. The durations reported must run either until logout (for the application as a whole) or until the work has been done, as with an assessment, where completing the work means the assessment has been submitted. In an absolute sense, I would put very little stock in the duration numbers. They may be useful for comparisons across students.
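I don’t know exactly how Vista computes the durations, but the guess above, time credited until the next click or until logout, can be sketched as follows. The 30-minute idle cap and all the column names are my own assumptions, not Vista’s rules.

```python
import pandas as pd

# A made-up click stream for one student session.  The 30-minute idle cap
# below is my own guess at how a system might credit time, not Vista's rule.
clicks = pd.DataFrame({
    "event": ["login", "open_thread", "open_quiz", "submit_quiz", "logout"],
    "time": pd.to_datetime([
        "2004-03-01 10:00", "2004-03-01 10:02", "2004-03-01 10:20",
        "2004-03-01 10:45", "2004-03-01 11:30",
    ]),
})

# Time credited to each click = gap until the next click.  If the student
# toggled over to an IM session during that gap, the log cannot tell.
gap_minutes = clicks["time"].diff().shift(-1).dt.total_seconds() / 60

# Cap long gaps so an idle browser doesn't count as an hour of "work".
clicks["credited_minutes"] = gap_minutes.clip(upper=30)

# Note: opening a thread is one click, whether it scrolls 1 message or 15.
print(clicks)
```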

As an instructor I would use this type of data in two different ways (but in a regular class, not the honors class). First, for a kid who complains that the virtual dog ate the bytes (of his homework), or for a kid who seems like a complete screw-off but claims to have put a lot of effort into the class, such reports are a ready way either to support the students and give them credit for the work or to contradict the students and call their bluff. I’ve used Mallard in the past this way. It does help, and it cuts out some shenanigans that any large-class instructor I know would rather do without. Second, I would look at the outlier students, both the very good and the very bad, and see if they look different on the measures the report provides. Who is online more? Who posts more to the discussion board? Etc. This information might help me consider how to modify the course the next time through. (If the better students are online more, I’d urge the other students to do likewise. If it is the poorer students who are spending a lot of time with the quizzes, perhaps some of those need to be rewritten, but coming to that conclusion would require more investigation.)
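Here is a minimal sketch of that second use, assuming a hypothetical table that already merges the Vista activity totals with final scores (the column names and numbers are invented for illustration):

```python
import pandas as pd

# Hypothetical merge of a Vista activity report with final course scores.
df = pd.DataFrame({
    "student": ["s01", "s02", "s03", "s04", "s05", "s06"],
    "score":   [96, 91, 84, 77, 65, 58],
    "minutes": [820, 760, 510, 430, 600, 210],
    "posts":   [34, 28, 15, 11, 9, 3],
})

# Do the strongest and weakest students look different on the activity
# measures the report provides?
top = df.nlargest(2, "score")[["minutes", "posts"]].mean()
bottom = df.nsmallest(2, "score")[["minutes", "posts"]].mean()

print("top two students:\n", top)
print("\nbottom two students:\n", bottom)
```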

Beyond this, I’m not sure about the value of the information content. We can run aggregates of these reports, and at one point I thought it might be useful to provide instructors with benchmarks for comparison (but when there are many students over a long time period, the report can take quite a while to generate, and doing this is server-intensive). Consequently, until there is some demand for these reports from elsewhere on campus, we’re not likely to do this on our own.

Do legitimate students have reason to be concerned that these types of data are gathered in such reports? And should others on campus be concerned on behalf of these students, in the same way that there is concern about providers of Web applications who use their applications to monitor individual online behavior? My answer (and it is not the official one) is that as long as instructors don’t post these reports with the student identities to a public Web site, there is no harm. Aggregate reports, such as the first one I listed, can be displayed without hesitation. Again, this is only my opinion, not the campus endorsement.

Let’s take a leap of imagination and suppose that others wanted these reports. Further, let’s suppose they want to use the reports to consider how the technology affects student learning. Imagine that! This, more than the student privacy issue, is where the real trouble begins.

Whoever has initiated the request, although probably not a social scientist, likely has an implicit idea of a model such as the following:

Learning = f(Student, Instructor, Technology)
Now the real fun starts. It’s easy enough to list the variables, but do we have a clue about how to measure any of them? Student is probably the easiest, because the campus is in the business of measuring this already. Between SAT scores and prior-semester GPA, there are two continuous variables that should work reasonably well. I’ve said in earlier posts that I’m no great fan of the SAT, but really, this is the least of the data problems, so let’s just push on.

How in tarnation does one measure Instructor? If it is a categorical variable used in a small study, so that each instructor can be attributed their own individual effect, fine. Then one doesn’t have to get into the causality from instructor to learning, but can simply view each instructor as a separate regime. I’ve been told by folks who do education evaluation for a business that in such small studies, if one does control for both Student and Teacher, there is usually insufficient variation left over to worry about any other variable (let alone to worry about whether the effect is linear or interactive). In a larger study there might be such variation, but then we’re back to square one on the definition of Instructor. Should salary be used? What about prior course evaluation ratings? How about teaching experience? And is there a need to account for the computer aptitude of the instructor? Say, let’s be clever and build an index based on several of these variables. But what justifies that?
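For concreteness, here is one way the implicit model might be written down, with Instructor entering as a categorical (fixed-effect) variable rather than some contrived index. This is only my sketch of the specification, assuming statsmodels and a hypothetical merged dataset; it is not anything the campus actually runs.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical merged dataset; every column name here is an assumption.
# Expected columns: learning (some outcome score), sat, prior_gpa,
# instructor (an id), tech_minutes (time online in the CMS).
df = pd.read_csv("merged_course_data.csv")

# Student enters through SAT and prior GPA; Instructor enters as a
# categorical variable, so each instructor is a separate regime with
# their own intercept; Technology enters as one continuous measure.
model = smf.ols(
    "learning ~ sat + prior_gpa + C(instructor) + tech_minutes",
    data=df,
).fit()

print(model.summary())
```

Even then, the categorical instructor term only sidesteps the “what is Instructor” question within that one sample; it doesn’t answer it.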

There are similar issues with the variable Technology (categorical or continuous?), but let me turn to the last variable, Learning. Basically, no way, José. You can talk pretests and post-tests all you want, but in the main we don’t do it, and in the few cases when we do, there is a lot of concern about sandbagging on the pretest (which doesn’t count in the final grade). So then you decide to go for the post-test only. But the real problem is not that it doesn’t measure value added. The real problem is the “teaching to the test” phenomenon. We know about students who do well on the final but don’t know a lick about the subject. Parsing out real understanding from doing well on the test is, in essence, intractable in a within-course study.

So you throw your hands up and say, OK, Learning is too hard to measure. Let’s do something easier, like measuring student engagement. And let’s use time spent online, something that is measured readily, as a proxy for engagement. Surely we can do that, can’t we? Well, yes we can, but as I said earlier, in some classes it will be the best students who are online the most, while in others it will be the worst students, and when you pool those ….
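The pooling problem in that last sentence is easy to see with made-up numbers: two classes where time online runs in opposite directions relative to grades can pool to show essentially no relationship at all.

```python
import pandas as pd

# Made-up numbers: in class A the best students are online the most; in
# class B the weakest students (redoing quizzes, say) log the most minutes.
class_a = pd.DataFrame({"course": "A",
                        "minutes": [100, 200, 300, 400, 500],
                        "score":   [60, 70, 80, 90, 100]})
class_b = pd.DataFrame({"course": "B",
                        "minutes": [100, 200, 300, 400, 500],
                        "score":   [100, 90, 80, 70, 60]})
pooled = pd.concat([class_a, class_b], ignore_index=True)

print("within class A:", class_a["minutes"].corr(class_a["score"]))  # +1.0
print("within class B:", class_b["minutes"].corr(class_b["score"]))  # -1.0
print("pooled:        ", pooled["minutes"].corr(pooled["score"]))    #  0.0
```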

So let’s give up. But it sure is tempting to try.

1 comment:

  1. just a test to see if this comment appears quickly. i'm now wondering whether comments appear when the blog is republished. hmm..
