Readers of this blog will be familiar with my views
on behavioural scientists’ use of very small samples, from which they draw very
large conclusions, in sharply opposing directions, very frequently. This makes
for good headlines and weak science. So, it gives me great pleasure to read an
article in Science about educational attainment which has a sample size of
101,069 persons, and then promptly checks its findings on a further sample of
25,490 other people. With one bound they propel themselves into the
stratosphere of behavioural science: a massive sample of “discovery” and a very
large “replication” sample. Psychologists, with some notable exceptions, generally
limit themselves to a small “discovery” sample which they treat as if it were
the entire universe, and leave replication to others.
GWAS of 126,559 Individuals Identifies Genetic
Variants Associated with Educational Attainment. Science Xpress http://www.sciencemag.org.libproxy.ucl.ac.uk/content/early/2013/05/29/science.1235488.full.pdf?sid=03403730-92e7-416f-8c43-ef69ae2a5cc5
You may think it churlish of me not to give the
authors’ names, but as befits a collaborative study, the author list is the
length of a short letter to Nature, and the 175 references are a longer paper in
themselves.
The cautious naming of a treasured sample as being
one of “discovery” is very wise. We use samples like a net cast into the sea to
try to discover what fish are like in all oceans. Our conclusions as we pick
through our fish will contain many elements which are characteristic of that
particular catch, and not of all other catches. We have to dip the net into
another sea at another time in order to better understand what creatures live
in the oceans.
As you might suspect, the scientists in this paper
are gene hunters. They know that the only way to try to make sense of the genetic
code is to hit the problem with massive samples, thus squeezing out random
characteristics and homing in on the real causes of variance. They found that
three SNPs had genome-wide significance as regards educational attainment, and
also found that a score derived from all the SNPs regardless of significance,
each having a tiny but additive effect, accounted for 2% of the variance in
educational attainment and 2.5% of the variance in cognitive function.
Can we now trumpet “The genes for IQ have been found”?
The authors make no such error. With commendable caution they say that these
areas of the genome are associated with health, cognitive and central nervous
system expression, so they are worth following up, and that their study
provides a benchmark for power analyses in social science genetics.
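To give a feel for what a power benchmark involves, here is a rough sketch rather than the authors' method. It uses a normal approximation to a two-sided test on a single SNP. The threshold of 5e-8 is the conventional genome-wide cut-off, which I am assuming here, and the per-SNP variance explained (0.02%) is a hypothetical figure chosen only to be of the tiny order discussed in this post. The function name `approx_power` is my own.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def approx_power(n, r2, alpha=5e-8):
    """Approximate power of a two-sided z-test for a single SNP.

    n     : sample size
    r2    : fraction of trait variance explained by the SNP (hypothetical input)
    alpha : significance threshold (5e-8 assumed as the genome-wide convention)
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = sqrt(n * r2)  # expected size of the test statistic if the effect is real
    # Probability that the statistic lands beyond either critical value.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    # 101,069 and 126,559 are the sample sizes quoted in this post;
    # 10,000 is an illustrative "small" genetic sample.
    for n in (10_000, 101_069, 126_559):
        print(n, round(approx_power(n, r2=0.0002), 3))
```

Under these assumptions power climbs steeply with sample size, which is the whole argument for very large discovery samples.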
This is a very important study, and sets a high
standard for others to follow. What does this mean for the genetics of
intelligence?
Criterion heterogeneity is the technical term for
the “rubber ruler” effect. Suppose we try to study intelligence by asking every
school to name their most able student. Some schools in unfavoured catchment
areas will nominate a student who would not be rated outstanding in a slightly
better school with brighter students. Suppose we try to do better by measuring
the number of years that the student spends getting educated. (We discount
those rare students who are too bright to spend much time at school and leave
early to found their own companies). As a rule of thumb, the brighter you are,
the longer you spend in education, because you go to college, and may then
continue to even higher degrees. The authors, no slouches, were wise to this
problem, and used the International Standard Classification of Education Scale
(1997) to calculate years of education, and whether or not the person went to
college. Frankly, this is not all that much use when compared to a standard
scale like a national exam with a grade point total, but they had to recruit
over several countries and this was the best way to get things on a common
metric, however crude.
The subjects were all Caucasian, i.e. white, and most of them
were about 30 years of age, by which time they should have
completed college. 23.1% had a college degree. The authors do not mention this,
but since white IQ is 100 just about anywhere in the world, it suggests that
those with IQ 111 and above were getting into college (with a standard
deviation of 15, 23.1% of the white population have IQs of about 111 and
above). Depending on your attitudes to further
education you may see it as a great thing that persons at that level of
intellect are in college, or a waste of money and a dreadful lowering of
standards. To me it suggests that “college” covered a wide range of courses. The
more demanding colleges recruit from those with IQs of 115 and above (top 16%
of the population) and elite colleges require IQ 130 (top 2.2%). This trade-off
between intellect and educational quality is depicted in a previous post “Social
class and university entrance”. http://drjamesthompson.blogspot.co.uk/2012/11/social-class-and-university-entrance_28.html
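The percentile arithmetic in the paragraph above can be checked with a few lines of Python. This is a minimal sketch that assumes IQ is normally distributed with mean 100 and standard deviation 15. The standard deviation is my assumption, and the function names are my own, so the results shift a little if a different scaling is used.

```python
from statistics import NormalDist

# Assumed scaling: mean 100, SD 15 (the SD is an assumption, not stated in the post).
IQ = NormalDist(mu=100, sigma=15)


def cutoff_for_top_fraction(fraction):
    """IQ score above which the given fraction of the population lies."""
    return IQ.inv_cdf(1 - fraction)


def fraction_at_or_above(score):
    """Fraction of the population with IQ at or above the given score."""
    return 1 - IQ.cdf(score)


if __name__ == "__main__":
    print("cutoff for top 23.1%:", round(cutoff_for_top_fraction(0.231), 1))
    for score in (115, 130):
        print(score, "and above:", f"{fraction_at_or_above(score):.1%}")
```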
However, not all is lost. The diligent authors found
that the peace-loving Swedes had given all their military service conscripts a
proper IQ test, and the very same genetic markers did a better job of
predicting IQ for this subset, accounting for a princely 2.5% of the variance. So,
the continuous measure of intelligence was slightly easier to predict than the
lumpy and not so informative educational measures. By way of comparison with
other personal characteristics, the same genetic analysis predicted 10% of the
variance for height. By way of historical comparison, until the last two
years the amount of variance of intelligence which could be explained by
genetic analysis was zero.
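Percentages of variance can look smaller than they are, so it may help to convert them to correlations, which are simply the square root of the proportion of variance explained. The sketch below does that for the three figures quoted in this post. The function name is my own, and the conversion assumes a simple linear relationship between score and outcome.

```python
import math


def correlation_from_variance_explained(r_squared):
    """Size of the correlation implied by a proportion of variance explained."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("variance explained must lie between 0 and 1")
    return math.sqrt(r_squared)


if __name__ == "__main__":
    # Proportions of variance quoted in the post.
    quoted = {
        "educational attainment": 0.02,
        "IQ (Swedish conscripts)": 0.025,
        "height": 0.10,
    }
    for trait, r2 in quoted.items():
        r = correlation_from_variance_explained(r2)
        print(f"{trait}: R^2 = {r2:.3f}, r = {r:.2f}")
```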
Even larger samples with IQ measures and further analysis
of the genetic code may well increase the intelligence variance accounted for. In
all probability there are very many genes which contribute to what we call
intelligence, all with slight but useful effects.
The hunt continues.
"10% of the variance for height" seems to me to be reasonably successful. Or is it the case that "Height is just a social construct" and therefore they must be wrong? It's just so hard to keep up with the counterfactuals to which one is expected to make obeisance these days.
Hi James. It's not the case that "a score derived from [three SNPs with genome wide significance] accounted for 2% of educational attainment and 2.5% of cognitive function".
All SNPs regardless of significance accounted for 2%. The three significant SNPs accounted for a 100th of that: just 0.02%.
"Education" is a multidimensional aggregation. That aggregation likely reduces power dramatically, as Sophie van der Sluis has shown.
van der Sluis, S., Verhage, M., Posthuma, D., & Dolan, C. V. (2010). Phenotypic complexity, measurement bias, and poor phenotypic resolution contribute to the missing heritability problem in genetic association studies. PLoS One, 5(11), e13929.
Tim, many thanks. Sloppy writing on my part, so thanks for correcting. van der Sluis very interesting. Aggregation is a problem, but the crudity of the educational measure makes it worse. As to measurement invariance, who actually achieves it? I think this is the fearsome Dolan once again, demanding a purity not yet attained by many psychometricians.