Thursday 2 April 2015

Income, brain, race: more questions

I have been looking again at the Supplementary tables of the Nature Neuroscience paper “Family income, parental education and brain structure in children and adolescents”. The authors have used data collected as part of the multi-site Pediatric Imaging, Neurocognition and Genetics (PING) study.

The great problem is that this is not an epidemiological sample, so it is almost impossible to draw any solid conclusions from it. By recruiting “a coalition of the willing” you miss out the unwilling who, unless approached with care and some incentives, tend to avoid surveys, and are disproportionately likely to have problems and get into trouble. The Dunedin birth sample has made all this very clear. Tracking a birth sample is a strong method; roping in bystanders is a poor one. In particular, when recruiting minority groups you may end up with “the brightest and the best” who are better integrated, rather than the neediest, most isolated and least able. All of this is such a pity, because brain scans are expensive, and would have been a great addition to a properly constructed sample.

The education variable in this paper is a somewhat cruder composite than I had realised. As I had already said, this is not a fatal error, but it certainly means that a fine detail analysis of the lower and higher bounds of any education effect should probably be taken with a pinch of salt. Indeed, the coding system might under-represent educational effects at both ends of the spectrum. For example, knowing the actual years of schooling rather than “fewer than 7 years in school” might be particularly relevant for some recent immigrants. Equally, a little more power might have been achieved for all those at the higher educational level by trying to enquire how many of the qualifications were in the hard sciences, or from hard-to-enter colleges. None of this is a big issue of itself, but in my view the problem does not disappear in the later statistical treatment: it influences the results in ways which might somewhat underplay the effects of education.




The income classification is also too lumpy at the lower level. This is even more important to the thesis being examined by the authors. At low incomes every extra dollar counts. As far as I can see the authors simply had to take over these classifications, but the coarse bands underplay the possible effects of very low incomes. As a general rule, income effects show strongly diminishing returns, and it would be good to know the precise shape of the function at lower incomes, if finer-grained detail were available. Again, not a fatal error, but it makes life harder for what the authors are trying to show.
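The effect of lumpy income bands can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the band edges, the log-normal income distribution and the noise level are all hypothetical and are not taken from the PING data. It generates an outcome that depends on log income (diminishing returns), then compares how well the exact incomes and the coarse band codes track that outcome.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation, written out so that only the stdlib is needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL band edges in dollars; everyone under 25,000 lands in one band.
BAND_EDGES = [25_000, 50_000, 100_000]


def band_code(income):
    """Ordinal code 0..3: the number of band edges at or below the income."""
    return sum(income >= edge for edge in BAND_EDGES)


rng = random.Random(2015)  # fixed seed so the run is repeatable

# ASSUMED income distribution: log-normal with a median of 60,000.
incomes = [rng.lognormvariate(math.log(60_000), 0.8) for _ in range(5000)]

# ASSUMED outcome: rises with log income (diminishing returns) plus noise.
outcome = [math.log(x) + rng.gauss(0, 0.5) for x in incomes]

r_exact = pearson([math.log(x) for x in incomes], outcome)
r_banded = pearson([band_code(x) for x in incomes], outcome)

print(f"correlation using exact log income: {r_exact:.3f}")
print(f"correlation using band codes:       {r_banded:.3f}")
```

Under these assumptions the banded version recovers a weaker association, because all the variation inside a band, including everything below the lowest edge, is thrown away.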


They continue: we investigated associations between socioeconomic factors (parent education, family income) and surface area, adjusting for age, scanner site, sex and genetic ancestry factor (GAF; Table 1). In all of the analyses, we took care to examine the unique and overlapping variance in brain structure attributable to distinct socioeconomic factors.

That is to say, genetic group differences are adjusted out, so an actual source of difference in brain size is removed from the regression equations, though it still exists in reality. The authors use genetic analyses to get rid of racial variance so as to obtain what they see as being a purer measure of socio-economic status.

With an average income of $97,640 these families are very wealthy in global terms. The authors do not mention it, but U.S. real (inflation adjusted) median household income was $51,939 in 2013. Income distributions are skewed, so an average and a median are not directly comparable, but it seems this sample was richer than the US average, which often happens with volunteer samples (they can afford to contribute to something they value). They are at the 80th percentile in the US. Rich, in anyone’s language. So, if there are effects due to absolute poverty per se, they would be shrinking the brains of most of the global population.
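To see why an average and a median cannot be compared directly in a skewed distribution, here is a rough sketch that treats household income as log-normal. The log-normal shape and the spread parameter `SIGMA` are assumptions chosen purely for illustration; only the two dollar figures come from the text above.

```python
import math
from statistics import NormalDist

MEDIAN_INCOME = 51_939   # US median household income, 2013 (from the text)
SAMPLE_AVERAGE = 97_640  # average family income in the sample (from the text)
SIGMA = 0.8              # ASSUMED log-scale spread, not taken from any dataset

# For a log-normal distribution: median = exp(mu), mean = exp(mu + sigma^2 / 2),
# so the mean always sits above the median when sigma > 0.
mu = math.log(MEDIAN_INCOME)
implied_mean = math.exp(mu + SIGMA ** 2 / 2)

# Percentile rank of the sample average within the assumed distribution.
z = (math.log(SAMPLE_AVERAGE) - mu) / SIGMA
percentile = NormalDist().cdf(z)

print(f"median {MEDIAN_INCOME:,}, implied mean {implied_mean:,.0f}")
print(f"sample average sits near percentile {100 * percentile:.0f} (under these assumptions)")
```

The percentile it prints depends entirely on the assumed spread, so it should be read as a rough plausibility check and not as a measurement.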

Table 2 reveals a highly significant finding for brain surface and the African group:

GAF African: Beta = −0.213, t = −7.731, p < 0.001

That holds true for Model 1 (depicted) and Models 2 and 3. No other racial group stands out to this extent in their set of analyses, as far as I can see.

After trying various adjustments, the authors say:

Specifically, when adjusting for age, age², scanner, sex and the 20 PCs, parental education was significantly associated with surface area (β = 0.152, P = 0.021, F(37, 1060) = 20.34, P < 0.001, adjusted R² = 0.395; Supplementary Table 1). Similarly, when adjusting for age, age², scanner, sex and the 20 PCs, family income was also significantly associated with total surface area (β = 0.183, P = 0.005, F(37, 1060) = 20.94, P < 0.001, adjusted R² = 0.402; Supplementary Table 2).

Comment: If you see parental education as a rough proxy for parents’ intelligence, and parental income as another, probably weaker, proxy, then this makes sense. Brighter parents have brighter children, and this study shows us that brighter children have bigger brains.
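A note on reading the quoted statistics: the adjusted R² values of about 0.40 belong to the whole model (age, age², scanner, sex, the 20 PCs and the socioeconomic variable together), not to education or income alone. The sketch below shows the distinction with simulated data. All coefficients are hypothetical, there is a single covariate standing in for the full adjustment set, and the two-predictor fit is done by residualising on the covariate first (the Frisch-Waugh approach) so that no external library is needed.

```python
import random


def residuals(y, x):
    """Residuals from a least-squares regression of y on x, with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]


def sum_sq(values):
    return sum(v * v for v in values)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 1100

age = [rng.uniform(3, 20) for _ in range(n)]
ses = [rng.gauss(0, 1) for _ in range(n)]

# HYPOTHETICAL outcome: dominated by age, with a small SES contribution.
area = [0.12 * a + 0.15 * s + rng.gauss(0, 0.75) for a, s in zip(age, ses)]

mean_area = sum(area) / n
total_ss = sum((v - mean_area) ** 2 for v in area)

# Model A: covariate only.
area_given_age = residuals(area, age)
r2_age_only = 1 - sum_sq(area_given_age) / total_ss

# Model B: covariate plus SES, fitted by regressing residuals on residuals.
ses_given_age = residuals(ses, age)
full_resid = residuals(area_given_age, ses_given_age)
r2_full = 1 - sum_sq(full_resid) / total_ss

incremental = r2_full - r2_age_only

print(f"R^2, covariate only:        {r2_age_only:.3f}")
print(f"R^2, covariate plus SES:    {r2_full:.3f}")
print(f"increment attributable to SES: {incremental:.3f}")
```

In this made-up setup the full model explains a large share of the variance while the increment from SES is small, which is the distinction to keep in mind when reading the paper's figures.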

As you will be aware, I am blessed with a sharp-minded readership. One reader, Franklin D Madoff, spotted far more material than I did in all the supplementary spreadsheets, and another reader, Emil OW Kirkegaard, has very quickly crunched that data and will work on it further before posting it up himself early next week.

Emil says: The sample appears to be heavily selected for cognitive ability; participants may have been selected from university neighbourhoods, which would explain why performance scores and incomes are high; the actual results may reveal a general factor of brain size, and there are some interesting other findings on the sub-groups.

I would demand more of him, but Emil tells me that despite the obvious pleasures of data analysis he has a social engagement tonight, and I would not dream of standing in the way of Danish conviviality, which I experienced first hand a few days ago in Copenhagen. Happiest country in the world, according to global surveys. It is a pity this important finding was not made known to Emil’s distant ancestor, Soren, who cheerfully observed:

Since boredom advances and boredom is the root of all evil, no wonder, then, that the world goes backwards, that evil spreads. This can be traced back to the very beginning of the world. The gods were bored; therefore they created human beings.



  1. The R^2 values they report are surprisingly large, I'd say. Their measures of socioeconomic status "explain" about 40 percent of differences in brain surface area. In contrast, SES "explains" perhaps 10-15 percent of IQ differences, while IQ explains at most 10 percent of brain size differences.

    I wonder if their high R^2 values are due to the fact that they bin their SES data, thus removing most variance.

    1. The R^2 value is for their entire model, not the SES variable.

  2. "Binning" the crucial education and income data may have caused problems. Emil has re-done all the regressions, and is now doing a factor analysis of the brain measures. I ignored the attempts at brain localisation because of my doubts about the representativeness of the sample.

  3. One does not, of course, wish to stir up derision, but how many of the data are self-reported? I mean, does anyone check the purported salaries, PhDs and whatnot? The only time I ever bothered to check the claimed educational qualifications of a job applicant, I found that he'd lied. By way of corroboration, I cite the fact that he later became an MEP.

    My wife and I have known two people who claimed to have two PhDs. One really had; the other hadn't and was eventually sacked for her lies, from a Chair in an ancient university. Mind you, she also claimed to be a concert-grade pianist, which really should have given the game away earlier.

  4. Self report often contains self promotion, except where recording number of meals and snacks is concerned. Deceit and boasting often happen even in compensation cases, where there is a big incentive for claimants to exaggerate, and a commensurate incentive for the defense to ask for documentary proofs of attainments. One psychiatrist I knew was destroyed in the witness box because he averred that the claimant was being truthful about being traumatized, but the claimant was then revealed to have lied about his O level qualifications (which, if correct, would have boosted his compensation considerably).

  5. Both the income and education variables are sub-optimal. When you have something like "less/more than x years" or "less/more than xxx income" you're likely to have a problem of censored data. I have talked about it in this article. It's not to say that the result can't be trusted. With censored data, the intercept and regression coefficient from an analysis uncorrected for censoring differ from those of an analysis that corrects for it. The stronger the censoring effect, the more biased your analysis will be. To know this, you have to look at the data distribution (using a histogram). If you see that the data tend to form a cluster at either the lower tail or the upper tail, you have a censored data distribution. Here's an example of a censored distribution. Is it the case here?


    One of the largest studies of fetal head circumference cited in the “Handbook of Anthropometry” (Edited by Victor R. Preedy) concluded that pre-natal head growth in Nigerian infants is genetic but post natal head growth is environmental. It’s hard to argue with their findings when one sees the average height of Nigerian men is 5’4″. There’s no information given on social status of infants, but I would imagine that poorer Nigerians have more kids. Take a look; it’s very detailed. Quote from conclusion:
    “We also conclude that modern studies, which implicate HC in intelligence, suggest that the well known decline in HC in the African vis-à-vis other racial groups occurs after birth and not before it. We are equally aware that early maturation does not necessarily indicate better intelligence. However, we suggest that post natal decline in velocity of HC in the African is probably due to low nutrition, and that with improved diet, this seeming difference will be corrected.”


  7. Thanks, will look. However, it is not likely that US circumstances will be as found in Nigeria.
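On the censoring point raised in comment 5, a short simulation can show what top- and bottom-coding a predictor does to an uncorrected regression. This is a sketch with hypothetical limits and coefficients, not a re-analysis of the paper's data: the predictor is clipped at made-up floor and ceiling values, and the ordinary least-squares fit on the clipped predictor is compared with the fit on the true one.

```python
import random


def ols(x, y):
    """Intercept and slope of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


rng = random.Random(42)  # fixed seed so the run is repeatable

# HYPOTHETICAL data: a standardised predictor and an outcome with true slope 1.
x_true = [rng.gauss(0, 1) for _ in range(5000)]
y = [2.0 + 1.0 * x + rng.gauss(0, 1) for x in x_true]

# HYPOTHETICAL censoring limits, like "less than x" and "more than x" categories.
LOW, HIGH = -1.0, 1.0
x_censored = [min(max(x, LOW), HIGH) for x in x_true]

# Share of cases piled up at the limits: the cluster a histogram would reveal.
share_at_limits = sum(x in (LOW, HIGH) for x in x_censored) / len(x_censored)

intercept_true, slope_true = ols(x_true, y)
intercept_cens, slope_cens = ols(x_censored, y)

print(f"share of cases at the limits: {share_at_limits:.2f}")
print(f"true predictor:     intercept {intercept_true:.2f}, slope {slope_true:.2f}")
print(f"censored predictor: intercept {intercept_cens:.2f}, slope {slope_cens:.2f}")
```

In this setup the slope fitted on the censored predictor comes out steeper than the true one, because the outcome keeps varying beyond the limits while the recorded predictor does not. How large the distortion is in any real dataset depends on how many cases sit at the limits, which is why looking at the histogram comes first.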