Thursday, 7 August 2014

Do universities award honest grades?


National scholastic exam results should be honest, firstly because honesty is the best policy in a moral sense; secondly because honest results transmit the most information; thirdly because honest results lead to the best candidates getting the best jobs, which is the best for society; and fourthly because honest results allow scholastic institutions to be evaluated and improved.

None of this is the case for the most important scholastic results: university grades. Each university teaches their own version of each discipline and examines it in their own way. Sure, there are external examiners, and they can sometimes guide an errant institution towards best practice, but it is an uphill struggle.

So, universities have to make up the final degree results. Getting a ranking of the students is the easy bit, particularly because with sufficient observation one can probably work out which are the strongest students in the first term. Then the tricky bit heaves into view. Do we want to be honest about what the students actually know, or do we find it expedient to make them, and the university, look good? Difficult question, isn’t it? The answer takes about 5 seconds. The university does not want to admit that they have let in a bunch of dullards, that the teachers are incompetent, the courses misconceived, the exams too easy and the whole institution a refuge for inebriated idiots. However true, it is best not to disclose this to students, parents and grant-giving bodies, let alone to the locals who are wearily familiar with the institution’s many shortcomings. Therefore each department decides upon the level of mendacity required to make them look good as teachers, and to keep their students happy. They set an average grade which makes most students appear good enough, and a substantial minority appear to be very good indeed. No-one does very badly. All must have prizes.

Students are now consumers. They have consumed industrial quantities of stupefying substances, have avoided most forms of academic enlightenment and effort, yet are entrusted with providing the data on which university teachers will be rewarded and promoted, by giving their hazy recollections about which teachers made them laugh earlier in the term.

For hard-pressed university teachers, exam marking is simplified by this artful procedure. In advance they decide on the average grade which will fulfil the above-mentioned market requirements. Some better students will be chosen to get marks somewhat above that Platonic average, and a few will get marks a little below it, in a simulacrum of academic judgment. Standard deviations narrow, distributions are skewed toward respectably higher marks, and nasty scenes and confrontations are avoided.

Into this maelstrom of deceit swim Butcher, McEwan and Weerapana, economists from Wellesley College in Massachusetts, to report on what happens when the sloppy inflationist running dog departments at that institution (Spanish, Women’s Studies, Italian, Chinese, Anthropology, Africana Studies, English) are brought into line with the cool and restrained marking of their more honest colleagues (Astronomy, Physics, Mathematics, Geology, Economics, Quantitative Reasoning, Biological Sciences, Chemistry).

The Effects of an Anti-Grade-Inflation Policy at Wellesley College. Journal of Economic Perspectives—Volume 28, Number 3—Summer 2014—Pages 189–204

This paper evaluates an anti-grade-inflation policy that capped most course averages at a B+. The cap was binding for high-grading departments (in the humanities and social sciences) and was not binding for low-grading departments (in economics and sciences), facilitating a difference-in-differences analysis. Professors complied with the policy by reducing compression at the top of the grade distribution. It had little effect on receipt of top honors, but affected receipt of magna cum laude. In departments affected by the cap, the policy expanded racial gaps in grades, reduced enrollments and majors, and lowered student ratings of professors.

The estimated drop in grades in treated departments is smaller for Latina students but much larger than average for black students (including African-Americans and foreign students who self-identify as black), those with low SAT verbal scores, and those with low Quantitative Reasoning scores.

In brief, the accuracy and honesty of the grades improve, though in the long run these departments drift towards their old habits of debasing the currency. Some racial differences increase, but overall the grades become more honest and more informative.
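For readers unfamiliar with the difference-in-differences idea mentioned in the abstract, here is a minimal sketch of the arithmetic. The four group means are invented purely for illustration and are not figures from the paper; the function name is likewise hypothetical. The point is only that the change in the untreated (low-grading) departments is subtracted from the change in the treated (high-grading) departments, so that any drift common to both is removed.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Difference-in-differences estimate from four group means."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    # Whatever moved both groups alike cancels out in the subtraction.
    return treated_change - control_change


# Hypothetical mean grades on a 4.0 scale, NOT taken from the paper.
estimate = diff_in_diff(
    treated_before=3.6,   # high-grading departments before the cap
    treated_after=3.3,    # the same departments after the cap
    control_before=3.2,   # low-grading departments before the cap
    control_after=3.2,    # the same departments after the cap
)
print(round(estimate, 2))
```

With these made-up numbers the estimated effect of the cap is a fall of about three tenths of a grade point in the treated departments. The published analysis is a regression with many more controls; this is only the core subtraction.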

The downside is that Wellesley College students now look bad compared to others from more sloppy institutions. A Wellesley student who gets the controlled and restrained grade average score of 3.3 is inconvenienced in a market place where higher fake scores are the norm. Recruiting departments in desirable companies cannot hope to keep up with precise calculations as to how each university marks their exams. For ease of selection it would make sense for them to rank candidates by grade point average, and then glance at the awarding institutions afterwards.

Of course, if there were an agreed ranking of institutions (based on the SAT grade score averages of the entrants) it would be possible, assuming employers have the interest and the ability to apply the corrections, to create a new national ranking system. Possible, but difficult and time consuming. The authors make a final lament:

Any institution that attempts to deal with grade inflation on its own must consider the possibility of adverse consequences of this unilateral disarmament. At Wellesley College, for example, prospective students, current students, and recent alums all worry that systematically lower grades may disadvantage them relative to students at other institutions when they present their grades to those outside the college. They point to examples of web-based job application systems that will not let them proceed if their GPA is below a 3.5. The economist’s answer that firms relying on poor information to hire are likely to fare poorly and to be poor employers in the long run proves remarkably uncomforting to undergraduates. These concerns lead to pressure to reverse the grade policy. If grade inflation is a systemic problem leading to inefficient allocation of resources, then colleges and universities may wish to consider acting together in response.

It is the tragedy of the commons all over again. Debased institutions confer mendacious advantage to their students and garner resources whilst honest institutions hamper their students in the market place of life. You and I know what the procedure should be: standardise the scores on a national basis, taking into consideration student grade point totals on key subjects prior to university entrance as a way of grading the institutions and correcting for institution heterogeneity. The current system is measuring students with a rubber ruler, not with a platinum meter in a vault (or more precisely the path travelled by light in vacuum during a time interval of 1/299 792 458 of a second). Someone from on high has to force change and bring the degree counterfeiters into line. Some modern day Newton, as Master of the Mint of Graduates.
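One possible reading of that standardisation procedure, offered as a rough sketch and not a worked-out scheme: rank each student within their own institution as a z-score, then re-express that z-score on the scale of the cohort's pre-university entry scores. The function name, the choice of z-scores, and all the numbers below are my assumptions for illustration; no real institution's data is used.

```python
from statistics import mean, pstdev


def standardise(gpas, entry_scores):
    """Re-express a cohort's GPAs on the scale of its own entry scores.

    gpas         -- GPAs awarded by one institution to one cohort
    entry_scores -- the same cohort's pre-university scores on a common
                    national scale (assumed to exist)
    """
    gpa_mean, gpa_sd = mean(gpas), pstdev(gpas)
    entry_mean, entry_sd = mean(entry_scores), pstdev(entry_scores)
    if gpa_sd == 0:
        # Everyone got the same grade, so the GPA carries no ranking information.
        return [entry_mean for _ in gpas]
    return [entry_mean + entry_sd * (g - gpa_mean) / gpa_sd for g in gpas]


# Hypothetical cohorts: a lenient marker with weaker entrants,
# and a strict marker with stronger entrants.
lenient = standardise([3.7, 3.8, 3.9], [50, 55, 60])
strict = standardise([3.0, 3.3, 3.6], [60, 65, 70])

print([round(x, 1) for x in lenient])
print([round(x, 1) for x in strict])
```

In this invented example the top student at the lenient institution (GPA 3.9) ends up level with the bottom student at the strict one (GPA 3.0), which is the kind of correction the rubber ruler currently hides. A real scheme would have to cope with what students learn at university, not just what they arrived with, so treat this as the crudest possible version.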

Until then, university grades are primarily a moral issue.


  1. I'm sure it's gotten worse since I was in college 20+ years ago, just as it was worse for me than for my father 40+ years ago. One annoyance I noticed in graduate school was that expensive private schools seemed to award much higher grades than even excellent state schools. Makes one wonder how much of the supposed better outcomes from an elite education spring from simple grading dishonesty.

    I had good grades anyway, but every minor difficulty (dropping a class in the 3rd week of the quarter instead of the 2nd, for example) showed up on my transcript while my classmates from Stanford (to cite a specific example) could drop a class the day before the final and not have it appear anywhere on their transcript.

  2. Wouldn't a cheap and simple measure be to print IQ scores on the diploma next to the GPA?

  3. Cheap, simple, and very probably wrong. A scholarly degree, properly examined, is a measure of application and persistence as well as ability. It also indicates that, apart from native wit and effort, you actually know something which may be of use to others. I am all in favour of measuring intelligence, but I wouldn't want a single figure summary of my abilities to have such an impact on my life. Mind you, my school leaving program gives my actual scholastic scores in percentages, but no-one has ever paid any attention to them.

  4. Well, Professor Thompson, you certainly brightened my day with this optimistic piece! Whatever the case, your enlightening commentary comes at a good time, and I thank you for your work.

    One way in which some composition professors combat grade inflation in our courses is to generate and carefully stick to "grading rubrics." All professors know and follow the rubric; all students know and follow the rubric. Moreover, we have "norming sessions" for midterms and finals, where we make sure our minds are synced and focused on the task at hand as we prepare to anonymously grade papers in a sort of Fifth Circle of Hell.

    Sure, our system is not perfect but it helps! I believe that by setting specific learning goals [I hate the term "outcomes," which sounds too robotic for my taste] and generating specific formulas for measuring how those goals are met, we can function just as effectively, efficiently, and objectively in Alabama as we could in Berkeley.

  5. Thank you. I agree one should set standards for marking, and have in mind what would constitute a good answer to a question. However, the one joy of examining is finding the brilliant student who breaks all those rules and shows another, better way, to answer the question. Alabama or Berkeley..... so long as there is some sunshine, and lots of bright students.

