Thursday 15 September 2016

Polygenic scores


Polygenic scores on Dunedin by Belsky


Stuart Ritchie (as in Intelligence: All that Matters) has done a guest post on the British Psychological Society Research Digest. This has a wide readership among psychologists, so it is very good news that they will be getting an update on contemporary research from an active researcher. I hope that they will consider the inheritance of characteristics in all their research.

This is intended to be a very brief post, just directing you to Stuart’s article, and adding a few links.

A few points to add: Stuart mentions the marvellous Dunedin study, so here is a link to those researchers, and the questions they set the ISIR conference in 2014:

Here is a post about recent work done on polygenic scores and human behaviours:

Here is a link to the Belsky paper Stuart mentioned, from which the above graph was drawn:

As Stuart says, only 1 or 2% of the variance in these behaviours is explained by the polygenic score. This sounds little, and is, but the miracle is that any link can be shown between gene sequences and complex human outcomes.

The next paper by Selzam boosts the variance-accounted-for to 9.1%. Stuart says: The polygenic scores are already pretty good predictors: in Selzam’s study, they have just about half of the predictive value of asking about the parent’s socio-economic status, or testing the child’s IQ at age 7 (and the scores are based on DNA variants that are unchanged since birth and can be measured with a simple saliva or blood test).
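To put these variance figures on a more familiar footing, here is a minimal sketch that converts "variance explained" into the size of the corresponding correlation. It assumes a single predictor (the polygenic score) in a simple linear model, where the correlation is the square root of the variance explained. The function name is my own illustration and is not taken from any of the papers.

```python
import math


def variance_to_correlation(r_squared):
    """Return |r| for a single predictor, given the proportion of variance explained."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    return math.sqrt(r_squared)


# Figures quoted in the post: 1-2% for the earlier scores, 9.1% for Selzam.
figures = {"1%": 0.01, "2%": 0.02, "9.1% (Selzam)": 0.091}

for label, r2 in figures.items():
    print(f"{label} of variance -> correlation of about {variance_to_correlation(r2):.2f}")
```

On these assumptions, 1 to 2% of the variance corresponds to a correlation of roughly 0.10 to 0.14, and 9.1% to a correlation of about 0.30.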

Of course, parents’ socio-economic status is not random. Higher status is achieved by brighter persons. IQ at age 7 is usually a better predictor of adult success than class of origin, though the two are confounded, and quite properly so.

Stuart adds: Using an even newer polygenic education estimate from a more recent gene-finding study (published in Nature this year), Saskia Selzam and colleagues found that their polygenic score explained a remarkable 9.1 per cent of the variance in age-16 GCSE results in a sample of 4,300 British teenagers

It is worth noticing that the most easily available and most often used educational achievement measure is very crude: years of schooling. Once proper scholastic and intellectual assessment measures are used on much larger genetic samples, the power of the predictive polygenic scores can very probably be considerably refined.

Here is the full paper:

It requires detailed reading, and links to other recent studies on educational attainment.

In summary, we now have an incredible advance. We can now understand a bit more about how DNA, the ultimate cause of how we are built, contributes causally to an important aspect of our behaviour.


  1. "an update on contemporary research": oh, Dr T, have the Americans won their war against the meaning of "contemporary"?

  2. This area does seem to be moving along. What happens if prospective adopters say they'd like the DNA results on the candidate baby, please?

    P.S. "life success" is a phrase that could be argued with, but something shorter than a sentence is clearly required. Maybe "life progress" would be better, but probably it's not worth fussing.

    When will enough data exist to make it possible to make a useful stab at predicting lifespan?

    1. Stuart Ritchie is working on a lifespan paper.

  3. Leonard Cohen was asked what it was like living in the Sixties. He replied: "We did not think it was The Sixties. We thought it was the current time."

  4. I thought young Ritchie wrote that rather well, until "ignorance and denial are no longer an option". As he probably knows perfectly well, for many people ignorance and denial are no longer an option only in the sense that they are always embraced.

    1. Lol,

      the arrogance of hbd is astonishing...

      need an honest mirror to see the same super-confident stupidity about their own cognitive bias

      and grow...

  5. whenever one of these things makes it to the popular press it makes me sad. the whole thing is meaningless.

    for instance, since when is 9% R^2 “pretty good”? polygenic scores can be slightly interesting from the point of view of animal/plant breeding, where they might actually be usable/measurable. but human GWAS is too much of a hot mess for these things to be even slightly interpretable. what conceivable use does a measure have when it maybe sometimes can explain 1% of the variance? recall that polygenic scores are, basically, linear combinations of dozens of features–when there is no plausible reason to believe that these effects are statistically independent–so you’ve basically overfit the crap out of what little signal there is, and gee whiz, turns out the prediction is crap also.

    if this is as good as anyone can do, maybe it’s time to think about something other than the genetic basis of these traits, and more importantly something that can actually be fixed with known large effects (nutrition, poverty, teacher quality etc.). the worst part is that we’ve known this for >40 years:

    until we solve the basic problem that heritability measures are near-hopelessly unstable, this kind of study is just more noise harvesting; genetics of behavior is a much worse reproducibility fiasco than other issues in e.g. psychology, because in genomics there might as well not be a statistical significance filter. if you test a gazillion hypotheses, a bunch will pass whatever threshold you might happen to set. some might even be real signal! but if your odds ratio is 1.01, there is no perceptible reason to care.

    1. There have been replies to Lewontin 1974, and some further work since then.

  6. There is a definitional problem here. When Ritchie says 'prediction' he doesn't actually mean prediction in a dictionary sense of the word (whether he does this knowingly or unknowingly is another question). At best, these methods 'explain' variance, they do not predict variance. Remember regression methods are simply correlational.

    For more reading:

    1. The use of sibling pairs is a good way to establish causality with GWAS studies, since it helps refute the challenge of population stratification. I don't know about the issue of linkage.

    2. Hear, hear! Using statistical jargon as if the words carried their everyday meanings is thoroughly bad practice in my view.

    3. Agree that testing predictions is a better approach than generating very general and theoretical explanations.

  7. Intelligence/qualitative constancy of good judgments is not uni-linearly quantitative-reducible... not, in the near-purrfeckt way.

    most of you 'hBBs' believe in the (QUASI-PERFECT) replicability or transference of the iq results in the real life.

    i know most of you really believe in this quasi-perfection...

    the earth is flat = intelligence is iq near perfect mirror

    in the ''world of work'' seems pretty correlative with iq, but the world is not only what we do in your professions, far to be, is not*