Sunday, April 23, 2017

The Economic Value In Freely Available Online Content

Some actions once done can't be undone.  The costs entailed in taking such an action are referred to as sunk costs, which are costs that can't be recovered.  Economists teach that sunk costs don't matter, in the sense that they don't enter into what is termed "producer surplus" and therefore shouldn't impact decisions that aim to make producer surplus as big as possible.  I know this and I used to teach this in intermediate microeconomics when explaining the theory of the firm in the short run. Yet I'm finding that with my own content creation activities I often seem to care whether potential users access the content and also care when they do access the content how they react to it.  Is this narcissism on my part only?  Or is there some way the sunk cost metaphor is not appropriate here and my concern about user access has productive value?  These are some of the questions I want to get at in this essay.

A different set of questions comes from considering how to assess value of a public good when there are no market transactions to observe.  Is it possible to impute value to the public good?  If so, how would one go about doing that?

I want to also briefly try to tie this discussion to the issue of whether college should be free for those who attend.  This gets at who should pay for the public good (and why).  It's one of those things that needs discussion at a first principles level.  We tend to consider actual policy without knowing what first principles to appeal to when evaluating the policy.

Let me begin with a little personal history, which will explain the technology considered here.  In spring 2011 I taught for the first time since I retired the previous summer.  While I taught a regular section of intermediate microeconomics, there was a possibility I might teach a blended section in the future, so I made a lot of online micro-lecture content with that in mind.  I will get back to that in a bit.

I had previously taught intermediate microeconomics ten years earlier.  Many of the lessons I had learned the hard way from teaching it were apparently forgotten.  I made many of the same mistakes I made when teaching intermediate micro back in the early 1980s, making the course much too difficult for students and discouraging them further in the process.  Many of the students in intermediate micro are Business majors.  For them intermediate micro plays a similar role to what organic chemistry plays for pre-med students.  In these courses the students tend to be quite mercenary about their grades and not care much about learning the fundamentals of the subject, because they don't see the relevance.  Further, because the Econ major itself is seen largely as a proxy for Business by those students who didn't have the standardized test scores to get into the College of Business, the attitude about the class by most Econ majors mirrors the views of the Business students.  The course is generally liked by a handful of students, 10% or so, who are either in Engineering or Business students quite proficient in math. These students don't find the course overly challenging on the technical front and then can make some sense of what is being taught.  The rest of the students can't see the forest for the trees.

The micro-lectures I made were screen capture movies of Excelets with my voice over.  At the time I was using Jing (now extinct) to make the captures.  Now I would use Snagit for this purpose.  Jing had a 5 minute maximum on the length of these movies.  So each video was short and to the point.  As I briefly described in my previous post, I captioned every one of these videos.  At the time I was on a campus committee for media accessibility, which partly gave the motivation to do the captioning.  But I also reasoned that for students to get familiar with some of the jargon in the course, it would help to see the words in print while hearing them spoken.  The captioning would facilitate that.  Students were expected to watch the videos before the live class session and then we'd review them in class.  There would be several of these short videos associated with a single live-class session.

The videos themselves were posted to YouTube.  I developed a profarvan channel for that purpose.  I posted the Excel files to Google Docs (now Google Drive).  I was using the public version of Google Docs, but the campus had recently gone to Google for student email and with that the students had access to Google Apps for Education.  I don't know if this is generally true or if it is only true because of how the campus managed authentication to Google Apps, but it turned out that if students were logged into their campus Google, then that blocked access to the public Google for them.  This took a while to figure out and proved somewhat unworkable,  both because students wouldn't remember to log out of their campus account and because at that time instructors did not have access to Google Apps.  So ultimately I had to find a different solution for this.

As I've indicated, my students were quite instrumental about this online content.  They evaluated it by how well they were prepared for the exams.  They found those exams quite tough (the means were below 60%).  Given these results, the micro-lectures weren't regarded very highly.  I found this quite discouraging, on the one hand, but then I was not bothered when I later learned that I would not be teaching the blended learning version of the course, on the other hand.  Indeed, I haven't taught intermediate microeconomics since then.  But I had a pleasant surprise that made me reconsider the value of the micro-lectures, as learning artifacts that might be considered entirely separate from the course I taught.

The Analytics section in YouTube provides a map of where people are when they access the videos.  My map showed global access even though all of my students were in Illinois.  This was clear evidence that people outside the class were watching the videos.  But I couldn't tell how significant that outside-the-class use was.  There were about 8,000 views in total that semester.  I was unable to accurately separate usage between my students and people elsewhere.  The following year, however, convinced me that the external use was substantial.  There were about 33,000 views during the next year and then I wasn't teaching the course.  Further, I would occasionally get comments from some of these users.  I surmised that most of the viewers were students taking the course elsewhere.  Some of those comments expressed appreciation for the videos.  In many others, where I hadn't put the link in to the Excel, they wanted access to that.  Apparently, both the videos and the Excel files were useful to them.

The experience opened my eyes to an audience that previously I didn't know existed.  In my subsequent content generation, I've been keenly aware of that audience, while making content that is also intended to be useful for the class that I am now teaching - The Economics of Organizations.  Let me describe that a little before moving on to the analysis.  The Excelets I mentioned above are reasonably good for considering the geometry of the economic model.  I also have developed homework in Excel that can be auto-graded and has some capability to test students on the algebra entailed in the models.  But Excel is not the right tool to explicate that algebra.  I have subsequently learned how to use PowerPoint for this purpose.

In the old days when I'd do a two-hour lecture on a chalkboard while teaching graduate microeconomics, the bulk of the class session would be derivations of mathematical results.   I had a reputation then for being thorough, fully working through the thinking needed to reach the conclusion.  My belief then, which I still cling to now, is that you can only understand the result if you can reproduce the derivation, not by having memorized it but by thinking it through.  At the undergraduate level I opted for a similar approach, but modified so the arguments are mainly intuitive, which is why geometry plays such a prominent role.  The math used in the undergraduate class can't go beyond analytic geometry and algebra.  However, undergraduate students in economics are no longer used to mathematical derivations.  (Maybe they weren't used to them 30 years ago, but then I deluded myself that it was the necessary way to teach intermediate micro.)  If you do them in the live class session these days, after only a couple of minutes the students' eyes glaze over.  As long as your head is facing the blackboard and writing the equation, you can just be into that and ignore whether you are getting through.  But once you turn around to see if the students are paying attention and all you get is that glazed look, it is very discouraging.

So the micro-lectures that are online allow me to satisfy my long standing belief about how the subject should be taught while not ramming it down the students' throats.  It is opt in on their part.  Some students (actually the better ones) have told me they can complete the homework without watching the micro-lectures.  If their goal is to get through the homework and the exams, that's probably right.  I can't test properly on this stuff because the grades would be too grim if I did.  But if the student actually wants to understand what's going on, the micro-lectures are there to provide the derivations.

Let's close this section with a little more on the technology.  I ultimately put my non-video content (PowerPoint, PDF versions of those, and Excel files) in my campus account at Box.com and then made it publicly available.  Box is reasonably functional, giving the user a preview of the file before deciding whether to download or not.  For a PowerPoint or PDF file, the preview may be sufficient.  The Excel files I use for homework have to be downloaded.  Box gives reasonably good access stats, if you care to look.  It also sends me an email every time a file is downloaded.  Since I do check email fairly often, this gives me some sense about how much the content is utilized.  I still use YouTube for the videos, which gives its own access stats.  Years ago I experimented with using Archive.org as a different possible host.  I put up a variety of multimedia there, both video and audio only (podcast content).  (Here is an example of a video of mine at archive.org.)  In some sense it did a very nice job of rendering the content and allowing for different file formats for download, to accommodate different users.  But it generated essentially none of the serendipitous use that I've described for the videos at YouTube.  If users are rather important in determining the value of the online content, and indeed they are, then the creator needs to put the content in a place where the users will find it.

* * * * *  

In much of what I talk about in this section I am going to consider one particular video and the steps needed to assess its value.  The video is called the Shapiro Stiglitz Model.  It derives many of the equations found in the paper Equilibrium Unemployment as a Worker Discipline Device.   As that paper is widely available to students, one should first ask - what incremental value might the video provide?  This is quite similar to the question - if there is a good textbook on the subject, what value do lectures provide?  Can't students teach themselves the subject by reading the textbook?

There are several possible answers to these questions that give lectures value unto themselves.  First, lectures may economize on student time.  If students can follow the lecture well, they can penetrate the subject faster than by slogging through the readings as the path to understanding.  Indeed, nowadays in a lecture class most students opt for the lecture as the gateway to the subject, with the textbook used as a reference thereby playing a supporting role only, probably for this very reason.  Second, students may not have the wherewithal to penetrate the subject matter on their own.  The paper by Shapiro and Stiglitz that I linked to above was published in the American Economic Review and intended for academic economists.  To read a paper like this requires having pencil and paper on the side and deriving all the equations manually.  Students may be ill equipped to do that.  Third, while covering the same subject matter, the lecture may take a different approach than the reading materials to get at that content.  In my video I derive asset equations.  In the paper, the authors write flow versions of these same equations.  I show how to go from the asset equations to their flow version.  This helps students understand the paper.  Fourth, the lecture approach serves as a model of the type of behavior we'd like to see in the students as they work through the paper.  If that modeling works well, perhaps the students learn to read through other papers on their own, without needing a lecture.  Fifth, the lecture can provide context both for the assumptions in the paper and for related research that is in a similar vein.  And there may be still other reasons why the lecture creates value.

For any of these reasons, however, the lecture only has value if the students access it and make good sense of what is in it.  Like the tree that falls in the forest, there needs to be somebody present for the falling tree to make a noise.  There is no value to the online content as a thing in itself.  The value is in its use.

When making a micro-lecture such as this one, I am uncertain about how potential viewers will use the content.  I design it as best as I can to fit my conception of what would be useful to learners.  However, I don't know if it hit the mark or not.  Users may communicate that to me indirectly via their use.  (This is a revealed preference argument.)  So I learn about the effectiveness of the content by observing the patterns of use.  Comments are a more direct form of communicating this sort of information.  Typically, however, there are only a few comments.  To learn about effectiveness more broadly, one has to consider the usage patterns.

I don't need to be paid for making content like this.  I do get paid for teaching the course and I have ample retirement income.  The making of such content and ensuring it is publicly available is not a requirement of teaching.  It is something I do voluntarily as a complement to teaching.  Knowing that the content is effective encourages me to make more content in a similar style.  Conversely, if the content seems ineffective or if the content isn't accessed at all even when it could be found by potential users, that would discourage me in making additional content.  Why bother in that case?

There is also uncertainty for the potential user who finds the content.  Will the content be helpful or not in increasing their understanding?  Ahead of time, the user can't know the answer to that question.  The answer is revealed only through use.  It is this insight that explains how we might go from use data to imputing value for the online content.

The methodology is similar to the method I described in the paper The Economics of ALN: Some Issues.  In that paper, student time is identified as the primary input in instruction.  The paper then worked through an exercise to impute the student time value.  Here, the video is freely available to users, so nobody is excluded from it, and it satisfies the requirement of non-rivalry, meaning that one person's use in no way impedes the use of others.  It thus fits the definition of a public good.  It is nonetheless true that the user can't consume the video without watching it.  Watching takes time.  There is thus an implicit time cost in watching.  But further, watching is a voluntary act.  So the act of continued watching must mean the user perceives the value from doing so to be at least as great as the opportunity cost of time.  (If it is strictly greater, the difference represents a surplus that accrues to the user from further watching.)  Once the perceived value of continued watching drops below the opportunity cost of time, the user will stop viewing and do something else.

With this in mind, let's consider the usage data I have.  It's far from perfect, but it is enough to make some intelligent guesses about what is going on.  The two sources of data are Box and YouTube.  I put the data into an Excel Workbook, with the first spreadsheet showing the data from Box and the second showing the data from YouTube.  The Box data show there have been 154 previews and 76 downloads of the PowerPoint file associated with the video.  In order to download the user must preview first.  So the download rate is about 50%.  As there have been 2,745 views of the video, the preview rate is about 5.6%.  In other words, the bulk of the views have no preview associated with them.  There is also location data for the last 50 "events," where an event is either a preview or a download.  I thought that information interesting.  It shows quite a lot of geographic diversity in access, with a distinct international flavor.  It's also a bit ego deflating in that there is only one entry for Urbana, Illinois, meaning only one of my students from the class last fall cared to preview the file.
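The two rates quoted above are easy to check.  What follows is only a minimal sketch in Python, using the counts reported in the preceding paragraph; the variable names and the small helper function are my own inventions for illustration and are not part of the Excel Workbook.

```python
# Counts reported in the text for the Shapiro Stiglitz Model materials.
previews = 154       # Box previews of the PowerPoint file
downloads = 76       # Box downloads of the PowerPoint file
video_views = 2745   # YouTube views of the video


def rate(part, whole):
    """Return part / whole expressed as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


# A download requires a preview first, so previews are the right denominator.
download_rate = rate(downloads, previews)

# Previews are compared with video views to see how many viewers
# also looked at the accompanying PowerPoint file.
preview_rate = rate(previews, video_views)

print(f"Download rate: {download_rate:.1f}% of previews")
print(f"Preview rate:  {preview_rate:.1f}% of video views")
```

Both figures round to the values given in the text, a download rate of roughly 50% and a preview rate of roughly 5.6%.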

The data from YouTube on the second spreadsheet show the top 10 videos for the year 2016, measured by minutes viewed, from my profarvan channel.  The Shapiro Stiglitz Model came in second in this ranking.  There are also data on number of views.  I put in the average duration per view, the duration of the full video and the percentage duration per view.  For none of these videos does the percentage duration per view exceed 50% and in some cases it is quite low.  (These results are again somewhat ego deflating.  As the creator of the content, I'd like to see these numbers much closer to 100%.  It is enlightening, however, to see what the numbers actually look like.)  The percentage duration per view is lowest for the Shapiro Stiglitz Model, at around 15%.  Note that this video is also the longest, more than a half hour, and regarding duration it is an outlier.  The other videos are substantially shorter.

Why would one watch a half hour lecture for only a minute or two?  One reason is to make a quick determination on whether to watch the rest of the thing and then decide that it's not worth it.  One might think of fishing as an apt analogy.  The first fish that is caught is very small, so is thrown back in.  But then maybe the same thing happens for the second and third fish caught, at which point a determination is made that this is not a good spot to fish.  The fisherman then packs up his gear and leaves.  A different reason involves users who make repeat visits.  A return visit is an indicator that the site does provide value for the user.  But the sort of access for a user on a return visit may be different from the first time through.  The second or third time the user may go to a particular spot in the video, a place of importance or one where it was difficult to understand the first time around.  One needs much more granular data about usage than what YouTube provides creators to sort this out well.  Nonetheless, it is helpful to keep these different alternatives in mind.

In absence of those granular data I conjecture a bi-modal distribution of users.  The first group are the serious users, who watch the video in full and may come back for repeat visits.  They account for the bulk of the minutes watched, but are comparatively small when considered from the perspective of views.  The second group are the quick hitters, who watch for a minute or two only and then don't come back.  Their brief viewing is experimental consumption, where the experiment didn't pan out.

The quick hitters will not preview the PowerPoint file, so that one does preview it is an indicator of being a serious user.  There may be some serious users who don't preview the PowerPoint file, as the video is sufficient for them and they prefer to do their work with pencil and paper at their side rather than on the computer.  Let's now guess our way through some plausible numbers.  If the preview rate is around 50% among serious users, then there are around 300 of them.  Their average viewing time might be something like 35 minutes, including one full viewing and some repeat viewing.  That gives a total of 10,500 minutes or 175 hours of viewing by serious users.  The total number of minutes viewing is 12,866, so there are about 2,400 minutes of viewing done by the quick hitters, which works out to about 1 minute per view.  Of course there is a lot of guessing here in coming up with these numbers, but something like this has to be what is going on to explain use.
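For readers who want to vary the guesses, the decomposition above can be sketched in a few lines of Python.  The 50% preview rate and the 35 minutes per serious user are the conjectures from the paragraph above, not measured values, and the variable names are hypothetical.  One further simplification is my own for this sketch: each serious user is counted as a single view when working out how many views belong to the quick hitters.

```python
# Observed data reported in the text.
previews = 154          # Box previews of the PowerPoint file
total_views = 2745      # YouTube views of the video
total_minutes = 12866   # YouTube minutes watched

# Conjectured parameters (guesses from the text, not measurements).
assumed_preview_rate = 0.5             # share of serious users who preview
assumed_minutes_per_serious_user = 35  # one full viewing plus some repeats

# 154 previews at a 50% preview rate implies about 308 serious users,
# which the text rounds to "around 300".
implied_serious_users = previews / assumed_preview_rate
serious_users = int(round(implied_serious_users, -2))

serious_minutes = serious_users * assumed_minutes_per_serious_user
serious_hours = serious_minutes / 60

# Whatever is left over is attributed to the quick hitters.
quick_minutes = total_minutes - serious_minutes

# Simplifying assumption for this sketch: one view per serious user.
quick_views = total_views - serious_users
quick_minutes_per_view = quick_minutes / quick_views

print(f"Serious users:         {serious_users}")
print(f"Serious viewing:       {serious_minutes} minutes ({serious_hours:.0f} hours)")
print(f"Quick-hitter viewing:  {quick_minutes} minutes")
print(f"Quick-hitter average:  {quick_minutes_per_view:.2f} minutes per view")
```

Changing either conjectured parameter shifts the split between the two groups, which is a reminder of how much guessing is embedded in the totals.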

The last part of the calculation converts the use by the serious users into value.  For this we need a conversion rate, something that measures the opportunity cost of time for the students.  In that JALN paper of mine from 20 years ago, where I first did this sort of imputation, for illustration I used a wage rate of $6/hour, which was then more than the minimum wage but not by a lot.  It was what I paid my undergraduate TAs at the time. There's been some inflation since, but to offset that the labor market is softer now and some of the serious users are from third world countries (though they are surely elite students within those countries).  The Federal minimum wage now is $7.25/hour.  With that as background, I will use $8/hour as the opportunity cost of time for the serious users.  Then 175 hours X $8/hour = $1,400.  This gives a lower bound of the benefit that the serious users get in aggregate from watching the video.  This imputation does not measure the surplus that the serious users accrue from the video.  There is no way to get at that surplus value from these usage data.  In a social welfare calculation that surplus value would need to be accounted for, as should the time value of the quick hitters, which should be subtracted off as a cost with no associated benefit.  While I have no basis for making this claim, I'm going to ignore both of them, as if the two just offset.
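The imputation itself is a single multiplication, but it is worth seeing how sensitive the result is to the wage chosen.  The sketch below, again in Python with hypothetical names, applies the three hourly rates mentioned above to the 175 hours of serious viewing estimated earlier.  It computes only the lower bound on benefit and, as in the text, ignores both user surplus and the time cost of the quick hitters.

```python
def imputed_value(hours_watched, hourly_opportunity_cost):
    """Lower bound on aggregate benefit: time spent valued at its opportunity cost."""
    return hours_watched * hourly_opportunity_cost


serious_hours = 175  # estimated hours of viewing by serious users

# Hourly rates mentioned in the text, in dollars.
wage_scenarios = {
    "JALN paper illustration": 6.00,
    "Federal minimum wage": 7.25,
    "Rate used in this essay": 8.00,
}

for label, wage in wage_scenarios.items():
    value = imputed_value(serious_hours, wage)
    print(f"{label}: ${value:,.2f}")
```

At $8/hour this reproduces the $1,400 figure; the lower rates give somewhat smaller bounds of the same order of magnitude.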

Since the video will remain online, additional serious use will contribute to further benefit generated.  I no longer can remember how long it took me to make this video and the PowerPoint file, but I'm quite confident that the benefit I calculated in the previous paragraph well exceeds my cost from making the video.  Most other creators will not go to these lengths to make this sort of imputation.  They will quickly eyeball the usage data to make their determination.  But then they likely will respond the same way I have.  They will be happy with their own efforts if those generate substantial serious usage and otherwise feel their efforts have been wasted.

* * * * *

I now want to connect this analysis with the discussion about free college education, but confine myself to a rather abstract view only.

First, let's note that the type of social welfare calculation we went through in the previous section can be done for any good-citizen voluntary act.  It doesn't require online content.  All that is required is for there to be providers and beneficiaries.  Then a cost-benefit calculation can be performed.  Indeed, some might argue that the public interest in college education is to encourage such good-citizen acts, and that the education itself should train students to be good citizens, for this very reason.  Last year on campus there was a talk by Harry Boyte that made just this point.

Second, let's note that an outsider to the system could inject funds into the system to change the fundamentals and thereby impact the social welfare calculation.  For example, the Econ department could fund a student assistant to help me in creating other online content.  If the social value of the additional content exceeds the cost of compensating the student assistant, then that enhances social welfare overall.  So it would be a good thing to do.

However, the Econ department might not care about social welfare overall.  It might care only about whether it can internalize enough of the benefit to cover its costs.  This would happen, for example, if enough of the serious users were students on campus, meaning they were current or future students in my class.  The data I showed are pretty discouraging on this point, since the bulk of the serious use is geographically dispersed.  The Econ department can't internalize this benefit.

This is not an argument that I shouldn't have a student assistant.  It is an argument that the funds for such an assistant need to come from elsewhere, possibly from a grant program that the State Department administers or a grant program that some U.N. agency administers.

Finally, the social benefit might be increased not by me having a student assistant but rather by generating a larger serious user population by increasing enrollments in the appropriate programs at institutions around the globe.  If there are limited funds to invest overall then one should ask which way of funding the activity will increase social welfare the most.  In other words, we should try to separate in our discussions of free college education, and other social policy as well, the underlying objectives from the means of achieving those objectives.  We tend to garble the two and make a muddle of things as a result.

In the case of free college possible objectives include: (1) getting students who otherwise wouldn't go to college at all to attend, (2) getting students who would have gone to a commuter school to attend a residential college instead so the educational experience is enhanced, (3) keeping the debt burden for students and their families manageable, (4) increasing voter participation among students and their families who might receive such benefits, and (5) rewarding voter participation among students and their families who might receive such benefits.   One might design quite different programs depending on how these objectives are prioritized.  Let's first consider the potential benefit from each of these.  Our discussion would be richer if we did that.

* * * * *

Let's wrap this up.  Everywhere I turn I'm bombarded with spiels about the virtues of big data and using analytics to make sense of things.  I'm old school enough to believe that data is not enough.  You need some framework for understanding what the data might tell you.  I've tried to provide that sort of framework for online instructional content and then bring the data I had available to consider things in that setting.  As we continue to debate social policy like free college education, we need frameworks to make sense of the issues and then look at the numbers as applied to those frameworks.  This takes some skill to do and it might take effort by readers to wade through the arguments.  We need to do that and not let our impatience for answers short circuit the thinking about the policies that we should want to see put in place.
