Thursday, 19 June 2014

Detecting bias, and understanding American culture


If you, dear reader, are seeking to make your way in academia, be of good cheer: many people are doing fine work, and you will prosper if you follow their example, and will triumph if you take the best of their practices and extend them further.

On the basis that any moral tale must have exemplars of good and bad practice, I commend April Bleske-Rechek and Kingsley Browne;

caution you against Casey Miller and Keivan Stassun who wrote the silly Nature piece;

and commend this paper by Dwight Davis, Kevin Dorsey, Ronald Franks, Paul Sackett, Cynthia Searcy, and Xiaohui Zhao: Do Racial and Ethnic Group Differences in Performance on the MCAT Exam Reflect Test Bias? Acad Med. 2013;88:593–602. First published online March 8, 2013. doi: 10.1097/ACM.0b013e318286803a

Here is a brief summary of the underlying issues to be discussed. A test is not flawed because some people do better than others. Tests reveal those differences. A test is not flawed because some groups do better than others. Tests reveal those differences. However, a test result can be questioned if it has lower predictive power for any group or, more precisely, if it over- or underestimates that group’s later achievements on some agreed benchmark. Errors of this sort strongly suggest the test is too harsh or too lenient in assessing the talents of that group. So, in any instance in which test bias is suspected, a couple of further analyses of the data are required.

Here is the abstract: The Medical College Admission Test (MCAT) is a standardized examination that assesses fundamental knowledge of scientific concepts, critical reasoning ability, and written communication skills. Medical school admission officers use MCAT scores, along with other measures of academic preparation and personal attributes, to select the applicants they consider the most likely to succeed in medical school. In 2008–2011, the committee charged with conducting a comprehensive review of the MCAT exam examined four issues: (1) whether racial and ethnic groups differ in mean MCAT scores, (2) whether any score differences are due to test bias, (3) how group differences may be explained, and (4) whether the MCAT exam is a barrier to medical school admission for black or Latino applicants.
This analysis showed that black and Latino examinees’ mean MCAT scores are lower than white examinees’, mirroring differences on other standardized admission tests and in the average undergraduate grades of medical school applicants. However, there was no evidence that the MCAT exam is biased against black and Latino applicants as determined by their subsequent performance on selected medical school performance indicators. Among other factors which could contribute to mean differences in MCAT performance, whites, blacks, and Latinos interested in medicine differ with respect to parents’ education and income. Admission data indicate that admission committees accept majority and minority applicants at similar rates, which suggests that medical students are selected on the basis of a combination of attributes and competencies rather than on MCAT scores alone.

First, they show that black applicants’ scores are one standard deviation below the mean for white applicants, in line with other measures of intellectual ability such as grade point averages, and also in line with the Graduate Record Examination and the entrance exams for other professions.

Test                                         White-black mean difference (SD)
Medical College Admission Test (MCAT)        1.0
Graduate Record Examination (GRE)            1.3
Graduate Management Admission Test (GMAT)    1.0
Law School Admission Test (LSAT)             1.1

Ever helpful, they then spell out the misconception about test bias into which the Nature article plunged: Although some individuals may conclude that any test showing average differences in performance between majority and URM examinees is biased, such differences, in the absence of corroborating evidence, are insufficient to conclude that there is bias. Instead, professional testing standards compel testing experts to gather logical and empirical evidence about potential sources of these group differences using well-established (and generally agreed-on) procedures to determine whether test bias exists. Test bias arises “when deficiencies in a test itself or the manner in which it is used result in different meanings for scores earned by members of different identifiable subgroups.”

Two processes are used to remove presumed test bias. The first is to fling out virtually any item which could conceivably be influenced by the presumed effects of being in a sub-culture. This may remove items which show real differences, but that is an inevitable casualty of the “if my scores are low then the test is biased” complaint. Incidentally, the same exclusion procedure is applied to items which show sex differences. I hope to return to that particular issue sometime.

The second procedure to detect and remove test bias is to look at predictive power, in the sense of trying to detect differential prediction for the different racial groups. The authors are to be commended for their simple explanations of the procedures (which you might like to forward to researchers who do not grasp the concept):

If the MCAT exam predicts success in medical school in a comparable fashion for different racial and ethnic groups, medical students with the same MCAT score will, on average, achieve the same outcomes regardless of racial or ethnic background. On the other hand, if their outcomes differ significantly, test bias in the form of differential prediction exists because the prediction will be more accurate for some groups than for others.

If the observed graduation rates were 90% for white students but 95% for black students with the same MCAT score, this would be evidence of predictive bias against black students because their four-year graduation rate was higher than predicted and also higher than that of white students earning the same score. This is an example of under-prediction.
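
The differential-prediction check described in the quoted passage can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the group labels, scores, and outcomes are made-up numbers, the helper names (`fit_line`, `mean_residuals`, `label`) are hypothetical, and a single straight line fitted to everyone stands in for whatever regression models the paper actually used. A group whose observed outcomes sit above the common prediction is under-predicted (bias against it); a group whose outcomes sit below it is over-predicted.

```python
from statistics import mean

# Invented illustration data: (group, test score, later outcome measure).
# These are NOT figures from the paper.
ROWS = [
    ("A", 20, 80), ("A", 25, 85), ("A", 30, 90),
    ("B", 20, 70), ("B", 25, 75), ("B", 30, 80),
]

def fit_line(xs, ys):
    """Ordinary least squares for outcome = a + b * score (one predictor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mean_residuals(rows):
    """Mean of (observed - predicted) per group, using one line fitted to all rows."""
    a, b = fit_line([r[1] for r in rows], [r[2] for r in rows])
    by_group = {}
    for group, score, outcome in rows:
        by_group.setdefault(group, []).append(outcome - (a + b * score))
    return {g: mean(v) for g, v in by_group.items()}

def label(residual, tol=1e-9):
    """Name the direction of the prediction error for one group."""
    if residual > tol:
        return "under-predicted"   # did better than the test predicted
    if residual < -tol:
        return "over-predicted"    # did worse than the test predicted
    return "no differential prediction"

for group, resid in mean_residuals(ROWS).items():
    print(group, round(resid, 2), label(resid))
```

With these invented numbers, group A comes out under-predicted and group B over-predicted. A real analysis would also test whether such residuals are statistically significant.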



As you can see, in the case of black medical students the test over-predicts the eventual outcome, particularly as to whether they graduate within the usual 4 years. The test also over-predicts the 4-year outcome for Latino students. The test is not biased against them. On the contrary, it has a tendency to flatter them.

The authors then go on to list a whole lot of variables which may make life difficult for black and Latino applicants. They do not analyse these, or distinguish between them in terms of whether they are independent of intelligence, nor do they attempt to “control” for them in statistical terms. They are possible reasons for different outcomes and not the main part of the paper.

Finally, they look at the relationship between scores and admittance to medical school by race.



Figure 2 is an interesting picture of the United States of America today. This is what the authors say about the results:

Figure 2 compares white, black, and Latino individuals’ academic qualifications, as measured by MCAT scores and reported in applications for admission to the 2010 matriculating class, with the percentages of applicants in each group who were ultimately offered acceptance by one or more medical schools. As the figure illustrates, the percentage of white applicants with MCAT scores ≥ 25 is much greater than the percentage of black or Latino applicants reporting similar scores (84% for whites versus 37% for blacks and 56% for Latinos). This profile of MCAT scores stands in sharp contrast to the overall acceptance rates shown in Figure 2 (47% for whites versus 40% for blacks and 49% for Latinos), reflecting differences of 7 percentage points for white versus black applicants and –2 percentage points for white versus Latino applicants.
The similar overall acceptance rates for the three groups suggest that admission committees do not limit themselves to the consideration of MCAT scores in their efforts to identify the applicants who are the most likely to succeed in medical school. If they did, differences in acceptance rates across racial and ethnic groups would more closely parallel differences in their mean MCAT scores. That is, greater emphasis on the MCAT exam would decrease the percentages of minority applicants selected for entry into medical school. In sum, although group differences on the MCAT exam have the potential to reduce the percentage of URM students selected into medical school, these results show that this is not occurring in practice.

The authors do not say: “the failure of medical schools to use the best test predictors is grossly unfair to applicants, most of whom are well qualified whites”. Figure 2 shows the blatant manipulation of the entry process, in which black and Latino applicants are given preference on the basis of their genetics.

For example, if entry to medical school was based on ability, 84% of white applicants would get an offer and only 37% of blacks. In fact, only 47% of whites gain entry and 40% of blacks. Very confusingly, the authors describe this as “reflecting differences of 7 percentage points”. In fact, the difference is 84-47 = 37 percentage points. Over a third of qualified white candidates are rejected. No black candidate is rejected. 7% of Latinos are rejected.
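
The two different "differences" in play here can be laid side by side. The sketch below uses only the percentages quoted from the paper. The dictionary layout and function names are illustrative, and treating an MCAT score of 25 or more as the qualifying threshold is this post's framing, not a rule stated by the authors.

```python
# Percentages quoted from the paper's description of Figure 2
# (applicants to the 2010 matriculating class).
FIGURES = {
    "white":  {"mcat_25_plus": 84, "accepted": 47},
    "black":  {"mcat_25_plus": 37, "accepted": 40},
    "Latino": {"mcat_25_plus": 56, "accepted": 49},
}

def acceptance_gap(group, reference="white"):
    """Reference group's acceptance rate minus this group's (the authors' comparison)."""
    return FIGURES[reference]["accepted"] - FIGURES[group]["accepted"]

def score_vs_acceptance(group):
    """Share scoring 25+ minus share accepted, within one group (this post's comparison)."""
    return FIGURES[group]["mcat_25_plus"] - FIGURES[group]["accepted"]

for g in FIGURES:
    print(g, acceptance_gap(g), score_vs_acceptance(g))
# white 0 37
# black 7 -3
# Latino -2 7
```

The authors' 7 and –2 percentage points come from the first comparison (acceptance rate against acceptance rate); the 37 and 7 points in the paragraph above come from the second (share scoring 25 or more against share accepted).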

I do not live in the US, but the phrase “admission committees do not limit themselves to the consideration of MCAT scores” strikes me as an understatement of Orwellian proportions. Qualified white candidates are rejected at a very high rate, Latinos rejected at a far lower rate, black applicants all accepted. This is clear evidence that admission is not based on merit, but that a substantial rent is paid to race. If a culture countenances such malfeasance in the allocation of resources its fundamental values become corrupted. The “other” criteria used for admission result in fewer candidates graduating even after 5 years, so those criteria, whatever they are, do not lead to success in medicine. I presume that we can reject the notion that it is a rational policy to place one’s life in the hands of an incompetent physician just for the exotic fun of racial variety.

Can anyone explain what is going on in America?


  1. Nobel prizewinning economist Gary Becker wrote, back in 1957, that when a group was discriminated against in admissions - those who despite this broke through would perform at a *higher* level than average.

    This is obviously correct when you think about it. It applied when Jews were discriminated against at US universities - those who managed to get into elite colleges performed above average.

    Even a sportsman can see this - Jim Bouton in his famous baseball book Ball Four commented that although baseball was integrated in the sixties, he believed that blacks were still discriminated against on the basis that he could observe that all the black players were better than average (i.e. batted with an average above .300).

    Nowadays, the groups that are *supposedly* discriminated against perform worse than average - confirming that in reality they are being discriminated in-favour-of.

    Which is of course no secret, but up front, explicit, and enforced by government agencies - who nonetheless lie and pretend they are not doing what they explicitly and proudly do.

  2. This is just about admission for the first year.

    If they are really "incompetent physicians", well, they will not go any further.
    The only bad thing is to refuse some competent people.

  3. No, they are less competent than those who could have been admitted. They will be slower to learn the vast corpus of what is known, take longer to benefit from experience, have a higher error rate, and be less likely to innovate throughout their careers. If you do not recruit the best you do not get the best outcome.

    1. I don't know exactly how the American system works, but these students will have to pass through several tests and the only time they are helped is for the first year admission.

      I agree with you when you say that you have to recruit the best, but I feel like it would be more interesting to compare the number of these "helped" students in the first year with their number at the end of their curriculum to show people this is not a solution.

      Affirmative action is a bad idea, and this is coming from a young black male, but in this case, I think the "only" prejudice is for other students who performed better.

      Sorry for the bad english.

    2. Agree with you we could look at all the "helped" students in more detail, but I think the outcome will be as predicted by intelligence, and race per se will not be relevant. At one stage Jensen spoke about studying "pseudo races", i.e. groups made up of people one standard deviation below the white population mean regardless of what their race was. His speculation was that they would perform according to intelligence, and nothing else.

  4. Anyone remember the advert for private medical insurance in UK where the woman says something like "now I can get to see Doctor McTavish rather than just any doctor". Eh yes, we know what you mean!


  5. Back in 2009 I wrote up the scores by race on the five most important American grad school / professional school tests:

    1. Dear Steve, As usual, you cracked the code long ago. I was particularly pleased to see that you put in the percentile ranks measures. I considered putting them into my post, and then thought it would be sufficient for readers to infer them from the results tables, but yours is the better exposition by far.