Sunday, 26 May 2013

A response to two critical commentaries on Woodley, te Nijenhuis & Murphy (2013)

Michael A. Woodley, Jan te Nijenhuis, & Raegan Murphy

Our study on the lowering of intelligence has drawn massive attention from the media, with headlines from Brazil to Vietnam. Thousands of reactions were also posted on blogs, including two highly relevant critical comments on the blogs of Scott Alexander and HBD Chick, to which we respond in this post. We are also pleased that our paper in Intelligence is starting a scientific discussion on the lowering of intelligence.

Alexander (2013) advances the argument that Galton’s sample is unrepresentative of the population of Victorian London, and may be heavily skewed towards those with high IQ and faster reaction times (RTs), owing in part to the fact that Galton charged a small fee to those wishing to participate in his data collection exercise. Hence, it is argued, these data should not be used as the basis for comparison with more modern studies, which are claimed to be, in many cases, far more representative of the populations from which they are drawn. We show here that this argument is wrong.

HBD Chick (2013) has advanced a second argument to the effect that Galton’s sample, and other contemporaneous studies from the 19th and early 20th centuries (e.g. Ladd & Woodworth, 1911; Thompson, 1903), represent ethnically homogeneous samples in comparison with more modern ones, which are obviously less homogeneous. Given the existence of ethnic-group differences in reaction time (RT) means (e.g. Jensen, 1998), this is proposed as a cause of the substantially depressed means in current-era studies, thereby undercutting our conclusion that RT has become slower for the general population (HBD Chick, 2013). We show here that this second argument is wrong inasmuch as changing population composition cannot account for the preponderance of the observed secular trend.

In addressing the first argument, the seminal paper of Johnson et al. (1985), which constitutes the source of Galton’s simple visual RT data employed in both our study and that of Silverman (2010), contains excellent data on the socio-economic and occupational diversity of the relevant subset of Galton’s exceptionally large sample (N around 17,000 individuals, 4838 [or 30%] of whom were included in Johnson et al.’s study). The paper states that “… a sizable portion of Galton’s sample consists of professionals, semi-professionals, and students. However … all socioeconomic strata were represented” (p. 876).

As can be seen in Tables 10 and 11 (pp. 890-891), the male cohort could be split into seven socioeconomic groups (Professional, Semi-professional, Merchant/Tradesman, Clerical/Semiskilled, Unskilled, Gentlemen [aristocracy] and Student or Scholar). For females, there were six socioeconomic groups represented in the data (Professional, Semi-professional, Clerical/Semiskilled, Unskilled, Lady [aristocracy] and Student or Scholar). In both the male and female samples the modal group appears to be the Student or Scholar category; in both cases these groups exhibit the largest Ns – 1657 in the case of 14-25 year old males, and 297 in the case of equivalently aged females. The second- and third-largest groups amongst the males of equivalent age were Clerical/Semiskilled (N=425) and Semi-professional (N=414). This is basically true of the female sample also, with Semi-professional being the next largest group after Student or Scholar (N=104) and Clerical/Semiskilled comprising the third largest group (N=47). Whilst it is obviously true that the sample is skewed towards Students or Scholars in both cases, individuals from these lower-middle/upper-working class occupations combined (see p. 888 in Johnson et al., 1985, for a full description of how these occupational categorizations correspond to employment type) make up a respectable proportion of the 14-25 year old samples also (>30% in the case of the males, and >30% in the case of the females).

It is important to note that, according to Johnson et al. (1985), many of the students would have been pupils at schools accompanied by teachers on day-trips to Galton’s laboratory at the Kensington Museum. However, a fundamental point is that Silverman’s (2010) study uses only data for those aged 18-30 (see Table 1, p. 41 in Silverman [2010] for full details of this subsample), and hence is quite unlikely to have been nearly as skewed towards school-aged students as the sample as a whole, which included a much larger range of ages.

A careful reading of Silverman (2010) will reveal that he was cognizant of precisely how much socioeconomic diversity was present in Galton’s dataset. Accordingly he was very careful to include only samples that would broadly match one or more of the categories in Galton’s dataset (see Silverman, 2010, Table 2, pp. 42-43 for full disclosure of the sample background characteristics). One advantage of Silverman’s care and meticulous attention to detail is that it permits us to make like-for-like comparisons with specific socioeconomic and occupational groups in Galton’s data, and thus to test the claims of Alexander (2013) directly. Concerning the post-Galton studies, Silverman included five student samples, two of which date from the 1940s (Seashore et al., 1941), and the remaining three of which date from the 1970s to the 2000s (mean testing year = 1993; Brice & Smith, 2002; Lefcourt & Siegel, 1970; Reed et al., 2004). These can be compared with the combined Galton and Thompson 19th-century student data in a three-way comparison as follows:

Comparison involving male students                 Difference in N-weighted RT means
19th-century students vs. 1940s-era students       +16.8 ms (183.2 to 200 ms)
19th-century students vs. ‘modern’ students        +74.2 ms (183.2 to 257.4 ms)
1940s-era students vs. ‘modern’ students           +57.4 ms (200 to 257.4 ms)
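
The three differences follow directly from the three N-weighted means reported in the comparison. As a minimal sketch for readers who wish to check the arithmetic (the dictionary keys and the function name are our own labels for this illustration, not part of any published analysis):

```python
# N-weighted mean simple RTs (ms) for male students, as reported above.
rt_ms = {
    "19th-century": 183.2,
    "1940s": 200.0,
    "modern": 257.4,
}


def slowing(earlier, later):
    """Increase in mean RT (ms) from the earlier cohort to the later one."""
    # Rounded to one decimal place, matching the precision of the reported means.
    return round(rt_ms[later] - rt_ms[earlier], 1)


before_1940s = slowing("19th-century", "1940s")
after_1940s = slowing("1940s", "modern")
total = slowing("19th-century", "modern")

print(before_1940s, after_1940s, total)  # 16.8 57.4 74.2
```

The first two differences sum to the third, as they must for three cohorts placed on a single scale.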

The difference between the 19th-century and the ‘modern’ male students is very similar to the meta-regression-weighted increase in RT latency between 1889 and 2004, estimated on the basis of all samples included in the meta-analysis (81.41 ms). Silverman also included data from other socioeconomic groups. For example, the study of Anger et al. (1993) included a combined male + female sample of 220 postal, hospital and insurance workers from three different US cities. These occupations clearly fall into the Clerical/Semiskilled and Semi-professional groups identified in Galton’s study. For both males and females in Galton’s data, the N-weighted RT mean for these two groups is 185.7 ms, whereas the N-weighted average amongst the participants in the study of Anger et al. (1993) was 275.9 ms. This equates to a difference of 90.2 ms between the 19th century and 1993. Again, this is not dissimilar to our meta-regression-weighted estimate of the cross-study increase in RT latency (81.41 ms).
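
For readers unfamiliar with the term, an N-weighted mean pools subgroup means in proportion to their sample sizes. The sketch below shows that pooling and then sets the two matched comparisons against the meta-regression estimate. The function name and the inputs to the pooling example are hypothetical and chosen only for illustration; the millisecond constants are the figures quoted in the text.

```python
def n_weighted_mean(means, ns):
    """Pool subgroup means, weighting each by its sample size N."""
    if len(means) != len(ns) or not ns:
        raise ValueError("means and ns must be non-empty and of equal length")
    return sum(m * n for m, n in zip(means, ns)) / sum(ns)


# Hypothetical inputs, for illustration only (not values from Galton's data).
example = n_weighted_mean([180.0, 190.0], [100, 300])  # 187.5

# Figures quoted in the text (ms).
META_REGRESSION_INCREASE = 81.41  # estimated slowing, 1889 to 2004
GALTON_MATCHED_GROUPS = 185.7     # Clerical/Semiskilled + Semi-professional
ANGER_1993 = 275.9                # Anger et al. (1993) workers
STUDENT_DIFFERENCE = 74.2         # 19th-century vs. 'modern' male students

worker_difference = round(ANGER_1993 - GALTON_MATCHED_GROUPS, 1)

# Deviation of each matched comparison from the meta-regression estimate.
worker_gap = round(worker_difference - META_REGRESSION_INCREASE, 2)
student_gap = round(STUDENT_DIFFERENCE - META_REGRESSION_INCREASE, 2)

print(worker_difference, worker_gap, student_gap)  # 90.2 8.79 -7.21
```

On these figures the two matched comparisons fall on either side of the meta-regression estimate, each within about 9 ms of it.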

The results of these broadly socioeconomically- and occupationally-matched study comparisons therefore imply an additional degree of robustness to the findings of our more statistically involved analysis of the overall secular trend. Furthermore, this supports Silverman’s contention that, as an aggregate, the ‘modern’ studies have broadly equivalent representativeness to the subset of Galton’s data employed in his and our own analyses. Alternatively, we could state that neither Galton’s nor Silverman’s data are truly fully representative of any population; however, they are both ‘biased’ in their sampling towards broadly similar groups.

We continue with the second concern, i.e. the lack of a strict ethnic matching criterion, hypothesized to lead to substantially depressed RT means in current-era studies. Ethnic-group differences in performance on various elementary cognitive tasks have been documented and are to be expected (e.g. Jensen, 1998). Substantial changes in the ethnic composition of test-takers would however be needed in order for the magnitude of change to be solely or even substantially a consequence of this process. This is assuming of course that within- and between-ethnic-group comparisons in terms of RT produce proportional results.

RT is related to g via mutation load (as measured using fluctuating asymmetry; Thoma et al., 2006). Mutation load is therefore likely to be a general source of individual differences in cognitive functioning within populations (Miller, 2000), but not between them (e.g. Rindermann, Woodley & Stratford, 2012); hence there is no good reason to expect ethnic-group differences in RT means to be meaningfully comparable to within-group differences in terms of proportionality (consistent with this is the observation that on simple RT these differences, whilst present, are actually quite small; Jensen, 1993; Lynn & Vanhanen, 2002, pp. 66-67). So, ethnically heterogeneous samples will indeed exhibit slightly slower or even faster reaction times (depending on the populations and proportions involved); however, the current proportions of groups exhibiting slower simple RT means than Whites in Western countries are simply too small, and the group differences too slight, to have had a substantial effect.
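
The compositional argument can be put as simple mixture arithmetic: if a fraction p of a sample comes from a subgroup whose mean RT is slower by some gap, the sample mean rises by p times that gap. The sketch below solves for the p that composition alone would require. The gap values are purely hypothetical placeholders, not estimates from any study; only the 81.41 ms shift is taken from the text.

```python
def required_fraction(observed_shift_ms, group_gap_ms):
    """Fraction p of a sample that must come from a subgroup whose mean RT is
    group_gap_ms slower, for composition alone to raise the sample mean by
    observed_shift_ms.

    Mixture mean: (1 - p) * mu + p * (mu + gap) = mu + p * gap,
    so p = shift / gap.
    """
    if group_gap_ms <= 0:
        raise ValueError("group_gap_ms must be positive")
    return observed_shift_ms / group_gap_ms


OBSERVED_SHIFT_MS = 81.41                   # meta-regression estimate from the text
HYPOTHETICAL_GAPS_MS = (10.0, 20.0, 40.0)   # illustrative only, not from any study

for gap in HYPOTHETICAL_GAPS_MS:
    p = required_fraction(OBSERVED_SHIFT_MS, gap)
    # p > 1 means even a complete change of sample composition would fall short.
    verdict = "impossible (p > 1)" if p > 1.0 else "possible"
    print(f"gap {gap:5.1f} ms -> required fraction {p:.2f}: {verdict}")
```

Under this simple model, any group gap smaller than the observed shift yields a required fraction above 1, so composition alone could not produce the trend.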

It is also worth noting that the weighted mean of our modern (post-1970) aggregated estimate (264.1 ms) is actually less than Jensen’s (1993) finding of a 347.4 ms mean of simple visual RT amongst a sample of 582 White US pupils described as being of European descent, and also less than Chan and Lynn’s (1989) finding of a 371 ms simple RT mean for over 1000 White British school children in Hong Kong. It must be noted however that these studies were conducted on young children. Simple RT shortens until the late 20s, when full neurological maturation is achieved (e.g. Der & Deary, 2006); hence the estimates of Jensen and of Chan and Lynn are likely to overestimate the adult simple RT latencies of these Whites, which may in actuality be somewhat closer to our sample mean of ‘modern’ (mostly White) populations.

We would like to thank Scott Alexander and HBD Chick for their interest in our study, and for their commentaries. However, the counter-arguments, whilst thought-provoking, do not appear to withstand scrutiny. We must therefore conclude that the secular slowing of simple reaction time between the closing decades of the 19th century and the opening decade of the 21st has had little to do with sampling issues.


Alexander, S. S. (2013). The wisdom of the ancients. Slate Star Codex. URL: [retrieved on 24/05/13]

Anger, W. K., Cassitto, M. G., Liang, Y.-X., Amador, R., Hooisma, J., Chrislip, D. W., et al. (1993). Comparison of performance from three continents on the WHO-recommended Neurobehavioral Core Test Battery (NCTB). Environmental Research, 62, 125–147.

Brice, C. F., & Smith, A. P. (2002). Effects of caffeine on mood and performance: A study of realistic consumption. Psychopharmacology, 164, 188–192.

Chan, J., & Lynn, R. (1989). The intelligence of six year-olds in Hong Kong. Journal of Biosocial Science, 21, 461-464.

Der, G., & Deary, I. J. (2006). Age and sex differences in reaction time in adulthood: Results from the United Kingdom Health and Lifestyle Survey. Psychology and Aging, 21, 62–73.

HBD Chick. (2013). We’re dumber than the Victorians. HBD Chick. URL: [retrieved on 24/05/13]

Jensen, A. R. (1993). Spearman’s hypothesis tested with chronometric information-processing tasks. Intelligence, 17, 47-77.

Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT:

Johnson, R. C., McClearn, G., Yuen, S., Nagoshi, C. T., Ahern, F. M., & Cole, R. E. (1985). Galton's data a century later. American Psychologist, 40, 875–892.

Ladd, G. T., & Woodworth, R. S. (1911). Physiological psychology. New York, NY: Scribner.

Lynn, R., & Vanhanen, T. (2002). IQ and the Wealth of Nations. Westport, CT: Praeger.

Miller, G. F. (2000). Mental traits as fitness indicators: Expanding evolutionary psychology’s adaptationism. Annals of the New York Academy of Sciences, 907, 62–74. 

Reed, T. E., Vernon, P. A., & Johnson, A. M. (2004). Sex difference in brain nerve conduction velocity in normal humans. Neuropsychologia, 42, 1709–1714.

Rindermann, H., Woodley, M. A., & Stratford, J. (2012). Haplogroups as evolutionary markers of cognitive ability. Intelligence, 40, 362-375.

Seashore, R. H., Starmann, R., Kendall, W. E., & Helmick, J. S. (1941). Group factors in simple and discrimination reaction times. Journal of Experimental Psychology, 29, 346–394.

Silverman, I. W. (2010). Simple reaction time: It is not what it used to be. The American Journal of Psychology, 123, 39–50.

Thoma, R. J., Yeo, R. A., Gangestad, S., Halgren, E., Davis, J., Paulson, K. M., & Lewine, J. D. (2006). Developmental instability and the neural dynamics of the speed-intelligence relationship. Neuroimage, 32, 1456-1464.

Thompson, H. B. (1903). The mental traits of sex. An experimental investigation of the normal mind in men and women. Chicago, IL: The University of Chicago Press.

Woodley, M. A., te Nijenhuis, J., & Murphy, R. (2013). Were the Victorians cleverer than us? The decline in general intelligence estimated from a meta-analysis of the slowing of simple reaction time. Intelligence. doi:10.1016/j.intell.2013.04.006 


  1. On a technical point, "Gentlemen [aristocracy]" is wrong for Britain. Gentlemen probably just meant someone who didn't need to work i.e. someone who lived off his wealth. Aristocracy was, in Britain, a far smaller set. Not in Poland perhaps, but in Britain that explanation won't do. The modern equivalent of Gentlemen may perhaps be Trustafarian.

  2. "Comparison involving male students Difference in mean N-weighted RT means
    19th-century students vs. 1940s-era students +16.8 ms (183.2-200 ms)
    19th-century students vs. ‘modern’ students +74.2 ms (183.2-257.4 ms)
    1940s-era students vs. ‘modern’ students +57.4 ms (200-257.4 ms)"

    Does this mean that the great bulk of the slowing in reaction time appears to have occurred post-1940s?
