Wednesday 12 October 2016

More markers, more differentiation, and people know what race they are anyway


Cultural lag is the polite term for habits and hypotheses that never die. They become immune to refutation by virtue of constant repetition. One such meme, due to Lewontin (1972), asserts that there is more genetic variation within genetic groups than between them, and therefore that … er … there is no difference between the groups / there is no genetic difference between genetic groups / any differences between groups cannot be due to genetic reasons / asserting that genetic group differences are discriminable by genetics would be arbitrary and wrong / genetic groups do not exist.

I had never been convinced by these arguments, on the simple basis that genetic groups are clearly visible, sustain themselves by genetic means, and are usually halved by admixture. Also, it was only a vague thought, but it seemed to me that a t-test could still be significant with relatively small mean differences if the sample size were large enough. Probably not relevant in genetics, I mused.

In fact, the ease with which you can separate two genetic groups depends, like all discriminations and all clustering, on the number of markers available for the discrimination and clustering techniques being used. With only a few markers, discrimination is difficult and error-prone. As you increase the number, allocation to different groups becomes progressively easier.
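
The point about marker numbers can be illustrated with a toy simulation. What follows is only a minimal sketch under stated assumptions, not the cluster analysis used in any of the papers cited here: two hypothetical groups, biallelic markers whose allele frequencies differ by a small invented amount (`delta`), group frequencies assumed known, and each simulated person assigned to whichever group gives the higher likelihood. All function names and parameter values are made up for illustration.

```python
import math
import random


def make_marker_frequencies(n_markers, delta, rng):
    """Return hypothetical allele frequencies (group 0, group 1) per marker.

    Each marker differs by only `delta` between groups, in a random direction.
    """
    freqs = []
    for _ in range(n_markers):
        base = rng.uniform(0.2, 0.8)
        sign = rng.choice((-1, 1))
        freqs.append((base + sign * delta / 2, base - sign * delta / 2))
    return freqs


def sample_genotype(freqs, group, rng):
    """Draw one diploid individual: 0, 1 or 2 copies of the allele per marker."""
    return [(rng.random() < p[group]) + (rng.random() < p[group]) for p in freqs]


def log_likelihood(genotype, freqs, group):
    """Log-likelihood of a genotype under one group's allele frequencies.

    The binomial coefficient is omitted because it is the same for both groups.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        q = p[group]
        total += copies * math.log(q) + (2 - copies) * math.log(1 - q)
    return total


def classification_accuracy(n_markers, n_per_group=200, delta=0.1, seed=0):
    """Fraction of simulated individuals assigned to their true group."""
    rng = random.Random(seed)
    freqs = make_marker_frequencies(n_markers, delta, rng)
    correct = 0
    for group in (0, 1):
        for _ in range(n_per_group):
            g = sample_genotype(freqs, group, rng)
            guess = 0 if log_likelihood(g, freqs, 0) >= log_likelihood(g, freqs, 1) else 1
            correct += guess == group
    return correct / (2 * n_per_group)


if __name__ == "__main__":
    # Accuracy as a function of the number of markers used.
    for n in (1, 10, 100, 1000):
        print(n, classification_accuracy(n))
```

Under these assumptions a single marker should do little better than a coin toss, while a thousand markers, each only slightly different between the groups, should allocate almost everyone correctly. Real data are messier (microsatellites have many alleles, and group frequencies have to be estimated rather than assumed), so treat this as an illustration of the principle only.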

So, to counter the endless echo of the original hypothesis, I am trying to put together a list of papers which explain and test the issue.

Tim Bates explains that Lewontin based his claims on blood type markers: about as advanced as it was possible to be in 1972, but hopeless for identifying genetic clustering, and therefore doomed to render a false negative. The 2005 paper by Hua Tang, Neil Risch and colleagues (now cited 400 times) shows how inadequate that procedure was by showing that one can now predict race near perfectly with a few hundred microsatellite markers.

Hua Tang, Tom Quertermous, Beatriz Rodriguez, Sharon L. R. Kardia, Xiaofeng Zhu, Andrew Brown, James S. Pankow, Michael A. Province, Steven C. Hunt, Eric Boerwinkle, Nicholas J. Schork, and Neil J. Risch. (2005) Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies.  Am J Hum Genet. 2005 Feb; 76(2): 268–275.

The authors say in their abstract:

We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multi-ethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity—as opposed to current residence—is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.


In their discussion they say:

Attention has recently focused on genetic structure in the human population. Some have argued that the amount of genetic variation within populations dwarfs the variation between populations, suggesting that discrete genetic categories are not useful (Lewontin 1972; Cooper et al. 2003; Haga and Venter 2003). On the other hand, several studies have shown that individuals tend to cluster genetically with others of the same ancestral geographic origins (Mountain and Cavalli-Sforza 1997; Stephens et al. 2001; Bamshad et al. 2003). Prior studies have generally been performed on a relatively small number of individuals and/or markers. A recent study (Rosenberg et al. 2002) examined 377 autosomal microsatellite markers in 1,056 individuals from a global sample of 52 populations and found significant evidence of genetic clustering, largely along geographic (continental) lines. Consistent with prior studies, the major genetic clusters consisted of Europeans/West Asians (whites), sub-Saharan Africans, East Asians, Pacific Islanders, and Native Americans. It is clear that the ability to define distinct genetic clusters depends on the number and type of markers used (Risch et al. 2002). Reports that document inability to define distinct clusters generally used only a modest number of markers and, hence, had little power to detect clusters (Romualdi et al. 2002). Studies with larger numbers of markers appear to show strong evidence of clustering (Stephens et al. 2001; Rosenberg et al. 2002).

Another major point of discussion has been the correspondence between genetic clusters and commonly used racial/ethnic labels. Some have argued for poor correspondence between these two entities (Lewontin 1972; Wilson et al. 2001), whereas others have suggested a strong correlation (Risch et al. 2002; Burchard et al. 2003). We have shown a nearly perfect correspondence between genetic cluster and SIRE for major ethnic groups living in the United States, with a discrepancy rate of only 0.14%.

In sum, you get a near perfect correspondence between genetic measures and the common racial labels, with a misclassification rate of a mere 14 per 10,000. Some of this is due to the admixed “other” category, and perhaps some existential confusion in the others, but 9,986 in 10,000 subjects can master the art of looking in a mirror and noting which race they most resemble, a task beyond the wit of some academics.
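
For anyone who wants to check the arithmetic, here is a small sketch using the two figures from the abstract (5 mismatches among 3,636 subjects). The function name is made up for illustration.

```python
def mismatch_rate(mismatches, subjects):
    """Proportion of subjects whose genetic cluster differs from their self-identified label."""
    return mismatches / subjects


rate = mismatch_rate(5, 3636)
print(f"{rate:.2%}")                        # rate as a percentage
print(round(rate * 10_000), "per 10,000")   # same rate per 10,000 subjects
```

Five in 3,636 rounds to 0.14%, or about 14 per 10,000, which leaves 9,986 in 10,000 correctly matched.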


  1. "... a task beyond the wit of some academics."

    I'm sure it's "unprofessional" to say, but since I'm not a professional psychologist I'll say it - many are just in denial. It's a "what elephant." Their whole world view would collapse if they admit reality.

    Others are just political propagandists.

    another fred

  2. The journal Sociological Theory had a target article + discussion about the topic in 2014 - see here:

    Reading the whole exchange might make your head hurt (I did, two years ago - and I'm a sociologist), but I'm guessing the target article has useful references for your list of papers.

    1. I was recommended it a few days ago, and have yet to properly dive into it, hence this piece with a reference from Tim Bates, who knows the score.

  3. Santo piece und luv 13 October 2016 at 11:51

    ''They become immune to refutation by virtue of constant repetition''

    Ideas mutate too, often in a bad way ;)

    You can have two individuals, a Nigerian and a Norwegian, with the same height (genes), the same propensity to obesity (genes), the same propensity to baldness (genes), the same lack of body hair (genes)... and another Norwegian with a different height (genes), a different propensity to obesity (genes) and to baldness (genes), and abundant body hair (genes)... compared with the two guys above.

    The first Norwegian and the Nigerian are more similar to each other, phenotypically speaking, than either is to the second Norwegian.

    But I thought race is not just phenotypic variation but genotypic too.

    ''race is phenotype'' = just skin color*

    Even though blood markers are far from being a great method for differentiating human populations and races, they still show large differences. For example, the Rhesus-negative blood type tends to be far more common among Europeans than among non-Europeans, and within Europe we know that certain groups, like the Basques, have a much higher frequency of Rhesus negative than most other European peoples.

    We know that certain recent blood types like AB and B are more common among MENA and East Asian populations than among Europeans or Africans. We know that B and AB are more common among eastern Europeans than western Europeans, that A is more common among Nordics than among Mediterraneans, especially Iberians, and so on...