Sunday 22 May 2016

A mountain to climb this Sunday


Every now and then I buy a Sunday newspaper, supposedly for the treat of catching up with the week’s news in a more detailed manner than would have been possible on a weekday.

The Sun on Sunday and The Mail on Sunday manage 1,400,000 readers each, the more refined The Sunday Times a decent 765,000, as befits a supposed newspaper of repute. The Sunday Times aims to influence the influencers, and to be the standard bearer of accepted wisdom and genteel debate, plus cookery tips. Readers will help determine how society works by making choices and shaping opinions, not least in schools and universities. In the Culture section Siddhartha Mukherjee’s The Gene is reviewed by journalist Bryan Appleyard, who inter alia says (column four):

Kinship studies around the world have repeatedly come up with percentages indicating the “heritability” of certain human traits, notably intelligence. Identical twins reared together have a much more than 50% likelihood of having similar IQs. But with non-identical siblings, the correlation plummets. And with families living apart the number drops even further. The point is that “heritability” (a genetic cause) isn’t the same as “inheritability”, the transfer of traits down the generations. Intelligence is clearly based on a vast complex of genes and their interactions that are highly unlikely to be passed on intact from parent to children. Along with now proven flaws in the whole idea of IQ, these new discoveries utterly discredit all attempts to make crude connections between race and intelligence. With the development of the theory of epigenetics, which argues that genes can be responsive to environmental factors, we now know that events in life can change the genetic destiny of future generations.

This is an interesting example of contemporary received wisdom, which will be read and even possibly believed by the chattering classes, and even some of the whispering classes, to which minority category I probably belong.

So, here in a whisper is my question: how will we ever climb this mountain of misunderstanding?

The comment is confused on several levels: it muddles the genetics, yet shows a touching faith in the epigenetics argument. This journalist studied English at university, but has no hesitation in reviewing a book about genetics, mangling the arguments, and making full use of the megaphone placed in his hands by credulous editors.

I have at best 1200 readers per day, so I cannot compete on numbers alone, but in the interests of establishing facts I suggest some minor correctives:

a link to the latest work on the genetics of scholastic achievement by Davies et al.

a link to the paper on the genetics of completed school years which James Lee presented at Albuquerque last year

and the full version of that paper now published in Nature

Okbay 74 hits


and a link to Davide Piffer’s work on racial group differences, updated again so as to take in the latest Okbay findings above.

In case you know anyone who really thinks that “there are now proven flaws in the idea of IQ”, a book written for clever sillies:

If you pass this post on to another colleague you will have helped put the record straight, though you may have to do so in a whisper.


  1. i especially like the author's "we now know..." for things we DON'T know, & his "proven flaws" for things that are 99% sound. & there's his "new discoveries utterly discredit all attempts to make crude connections between race & intelligence" -- i'd like to know what those "new discoveries" are - the discovery that one can ignore reality when engaging in politically correct rhetoric, perhaps? :)

  2. I'm of the opinion that "epigenetics" in the newspapers just means wishful thinking.

    "a vast complex of genes and their interactions that are highly unlikely to be passed on intact from parent to children": are there no limits to this fool's ignorance?

  3. elijahlarmstrong, 22 May 2016 at 20:43

    Being the conciliatory fellow that I am, I will point out that:

    a) the GWAS studies do not really disprove this guy's point: what they show is that, in a particular environment, a few percent of the variance in IQ or a proxy phenotype is due to additive genetic variance. That this is so is not much of a concession for people of his school of thought.
    (GCTA gives much higher heritabilities for IQ, but it yields low heritabilities for personality traits so far as I see, roughly consistent with the modest personality heritabilities estimated from adoption studies; this suggests that IQ's narrow heritability is substantial but personality's narrow heritability is quite low. But, as Ali G says, I digest.)

    b) yes, epigenetics is not a panacea, but it is simply the case that genes work very much differently than Lush or Holzinger could have known. And there are implications for the interpretation of behavior genetic models. See Charney (2012).

    1. The human body is the product of both genes and environment. Both are important for the final product. Without environmental factors like food and nutrients, shelter and clothing, genetic potential might go nowhere but death. Even with a pile of the correct environmental factors, an organism with the genes of a fish would never become a human.

      Some environmental factors have a lifelong organic impact on a living being. Example: myopia due to spending most of childhood indoors.

  4. " will we ever climb this mountain of misunderstanding?"

    Misunderstanding has nothing to do with it. First thing to do, don't purchase the paper.

  5. "Davide Piffer’s work on racial group differences, updated again so as to take in the latest Okbay findings above."

    The Okbay results don't readily support a polygenic selection/hereditarian racial model. I hope I'm not the only one that sees this.

    1. elijahlarmstrong, 24 May 2016 at 08:26

      I don't have access to the google doc Thompson linked, so could you explain what you mean?

    2. You don't have a gmail account? Based on Okbay's 74 alleles, for example, African Americans had a higher polygenic score than White Americans. When Davide included more alleles, bringing the total to around 160, Asians>Europeans>Africans, but the frequency differences were seemingly minor e.g., .515, .500, .475. I asked Emil to generate inter-individual standard deviations, to put the matter in perspective, and I am still waiting on those. But from other scores I have seen, those magnitudes of population differences are minor. Also, the Okbay results did not conform to a polygenic selection model on a number of accounts. There was no apparent positive manifold across populations (thus the average factor loading of alleles was ~ zero) and there was virtually no correlation between SNP effect sizes and factor loadings. To salvage results, Davide created a "meta-factor" of reliability, based on, among other things, national IQ and earlier allele results which fit his model. He then used this to, essentially, reweight Okbay's results. Unsurprisingly, the expected associations were found -- but of course this procedure begs the question -- and so provides no support. Davide's assumption was that a pg selection model was correct and that there was a lot of noise, obscuring a clear signal; so he was trying to filter that out -- as an exploratory, post hoc, "well maybe this was the problem" analysis, this is OK. In sum, the polygenic selection model was not clearly supported given his theory/predictions and this more or less invalidates factor analysis scores, leaving polygenic ones, which didn’t show substantial population differences (we will have to wait for individual SD to better evaluate).

      Of course, I am more than open to being convinced that my reading is incorrect.

    3. Your reading is totally incorrect. You do not provide an explanation of what "minor" means to you. Even a difference of 3% can mean a lot. And honestly, what do you expect? We all know that intelligence is due to the action of thousands of genes. Why do you think GWAS struggle to find associations? So 3% spread over thousands of genes can cause deep differences. And your critique of my "meta-factor" is also unfair. I didn't do that to "salvage" results. I didn't use the meta-factor of reliability to reweight Okbay's results. It just happened that the factor analysis of subsets of Okbay's hits produced a factor that perfectly correlated with an independent factor of reliability. So you transformed an independent validation into a post-hoc analysis, totally misrepresenting my work. You are also focusing on the bits of the puzzle that don't fit and ignoring the bigger picture. You are ignoring the results from the two previous GWAS, which correlate so strongly with the new ones. You are ignoring the replication I carried out using ALFRED. You are ignoring the stronger hits that replicated across publications confirming the pattern. And guess what? The replicated hits exhibit the "positive manifold" that you cannot see in the other hits. It's a long paper that is not written for lazy readers who are too lazy to read all of it, and so lazy or incompetent that they've got to defer reproducing the analysis to someone else.

    4. Hi Davide,
      I noticed Andrew Kern's review of your "Evidence of polygenic selection on human stature" paper. Wasn't that basically what I said years ago i.e., (a) your method is unvetted and (b) you should cross validate using Berg and Coop's method? Maybe try that.

      Regarding this paper, I noted the obvious way of interpreting the significance of between group differences in pgs. If we look at Okbay's independently validated 74 alleles (e.g., Beauchamp, 2016), the question is: how much are e.g., African Americans ahead of White Americans in pgs standard deviations? The fact that the results reverse after adding other alleles (hidden away in a supplementary file) suggests, to me, that the differences are not robust. They surely are not like what you originally found e.g., .355 versus .16 (Piffer, 2013). But maybe, as you suggest, a minor score difference spread across 1000s of alleles which each have minute effects adds up to large differences.

      As for support for the pg selection model, I'm just going by your own predictions e.g.,

      Piffer (2013): "hypothesis of random evolution, the expected correlation between trait increasing (A, B) or trait decreasing (a, b) alleles is 0. Instead, if frequency of allele A (trait increasing) is positively correlated to frequency of allele B (trait-increasing) and negatively correlated to frequency of allele b, then this suggests a mechanism other than neutral evolution (Kimura, 1984) such as natural selection (Piffer, 2013)... This “positive manifold” of trait-enhancing alleles can be operationalized by their factor score."

      An average factor loading of zero would suggest no clear "positive manifold", no? But you say, "You are ignoring the stronger hits that replicated across publications confirming the pattern." Sure, the distribution of alleles found by Davies+Rietveld was in line with your model. So when you limit Okbay's alleles to ones also found by Davies+Rietveld, it's not surprising that you get what you were looking for. But ya, maybe most of those other Okbay alleles were noise, as you conjectured. Maybe.

    5. An attenuation in frequency differences with more and more GWAS is exactly what we had predicted. Obviously the first GWAS will find the most potent variants, and the next GWAS will find the variants with smaller effect. Of course frequency differences will be higher for stronger variants, and since stronger variants are those discovered first (e.g. Rietveld), it's inevitable that we see a reduction. I also need to amend another misleading statement of yours. You wrote that "there was virtually no correlation between SNP effects sizes and factor loadings". Sure, this correlation was small (r=0.08) although in the right direction. But you failed to mention that the correlation between SNP p value and IQ, factor loadings was higher (-0.25). For some reason, even in previous analyses p value was more sensitive to Beta in MCV. Perhaps because there is a lot of noise? That's what I think. Also, in the present paper I vetted the factor analytic method by showing that lower p value hits as well as replicated hits have higher factor loadings than random hits. Yesterday I added ALFRED populations and a derived vs ancestral frequency control. After applying all sorts of controls the results still stand. Yes, there is a lot of noise in there, but the general trend conforms to expectations!

  6. "I noted the obvious way of interpreting the significance of between group differences"

    Alternatively, one might compute the education loci specific Fst, for even a subset of alleles, and then do a Fst to quantitative genetic variance estimate. This would moot concern about the absolute number of alleles involved, as the number of alleles underlying a trait is independent of the magnitude of the trait variance between groups. This is all to reiterate the problem of a lack of information on inter-individual/inter-population variability in the scores.

    “With the aim of validating the meta-accuracy measure, a meta-factor was created by factor analyzing the scores of the 7 factors. The factor loadings (“meta-loadings”) were in turn correlated to the meta-accuracy vector, thus producing a “meta-Jensen coefficient” (table 10). The correlation between the two meta-vectors was r= 0.969”.

    Fair enough, I mischaracterized this. But the basic problem is the same. The question is whether the creation of a meta-factor makes sense given the otherwise weak support, based on 1000 Genomes, for what you term a "positive manifold". Maybe there is a lot of noise and the meta-factor cuts through this or maybe the 2 fold factor analysis spuriously filters out alleles that go in the unpredicted direction. (Admittedly, I don’t yet have a model on how this might happen.) You seem to recognize this concern. And so you seek to validate the meta-factor using a “meta-reliability” index, one partially based on associations you are trying to replicate (Davies+Rietveld) and predict (NIQs). I’m, in a way, calling foul on the meta-reliability index since its correlation with meta-factor scores only suggests that the meta-factor is a good index of pgs selection -- actually cuts through the noise -- if you assume the results you wish to establish.

  7. The best way to proceed would be to try to replicate your findings using Berg and Coop's rather complex method, which has the benefit of not suffering from the case/variable FA/PCA problem, which, as more alleles come in, you won't be able to evade. And also to try to put the population score differences in relation to individual ones.

    If you can firmly establish no/trivial polygenic selection (true negative), you would accomplish a remarkable task -- and, in the process, win yourself accolades among your future peers. I'm not sure how a true positive would play out, but I would think a false positive wouldn't look so good.

  8. Interesting article and even better the comments. I'm enjoying the debate