Tuesday 31 March 2015

Income, brain, race, and a big gap

Usually I do not set my readers puzzles. It is not seemly. However, the recent coverage of a paper published in Nature Neuroscience has set me a puzzle which I would like you to help me solve. Are the authors of the paper, the reviewers and the editors of Nature Neuroscience aware of what has been left out of this study? Did they spot the gap which calls into question the conclusions, or just choose not to mention it? Let me tell you the story, and then you can judge for yourselves.

Family income, parental education and brain structure in children and adolescents. Nature Neuroscience (2015) doi:10.1038/nn.3983. Published online 30 March 2015


The paper has multiple authors, but I think it kinder not to list all their names. Here is their abstract, which is what most people will read:

Socioeconomic disparities are associated with differences in cognitive development. The extent to which this translates to disparities in brain structure is unclear. We investigated relationships between socioeconomic factors and brain morphometry, independently of genetic ancestry, among a cohort of 1,099 typically developing individuals between 3 and 20 years of age. Income was logarithmically associated with brain surface area. Among children from lower income families, small differences in income were associated with relatively large differences in surface area, whereas, among children from higher income families, similar income increments were associated with smaller differences in surface area. These relationships were most prominent in regions supporting language, reading, executive functions and spatial skills; surface area mediated socioeconomic differences in certain neurocognitive abilities. These data imply that income relates most strongly to brain structure among the most disadvantaged children.

The two principal authors have given statements, and these are instructive because they tend to reveal what the authors regard as the real implications of their findings.

Dr Elizabeth Sowell, of the Children’s Hospital Los Angeles (last named author, theory development and interpretation of results), is reported as having said: ‘Our data suggest that wider access to resources likely afforded by the more affluent may lead to differences in a child’s brain structure. Access to higher-quality childcare, more cognitively stimulating materials in the home and opportunities for learning outside the home likely account for some of these effects.’

If correctly reported, she reveals that she thinks that material resources (within the range experienced by US citizens) lead to a difference in the child’s brain, and presumably thereby to intelligence. This is a strong claim.

Dr Kimberly Noble, of Columbia University in New York (first named author who developed the theory, conducted analyses, wrote the introduction, results, discussion and methods, which in my view makes her virtually the sole author) said that despite the clear impact of socio-economic status on the young mind, it would be wrong to think that the changes are fixed. She said: ‘This is the critical point. The brain is the product of both genetics and experience and experience is particularly powerful in moulding brain development in childhood. This suggests that interventions to improve socioeconomic circumstance, family life and/or educational opportunity can make a vast difference.’

Her view is that by improving socioeconomic circumstances brain development can be improved, and thereby intellectual ability. She mentions that the brain is a product of genetics, which leads to the assumption that this has been considered in the paper but, despite that, “experience (my emphasis) is particularly powerful in moulding brain development in childhood”.

Curious about these dramatic claims, I read the paper. Here is a representative part of the introduction:

It is critical to examine socioeconomic factors such as education and income separately, as these correlated factors represent distinct resources that may have different roles in children's development. For example, income may best represent the material resources available to children, whereas parents' educational attainment may be more important in shaping parent-child interactions.

I would have added: parents’ educational attainment is also a proxy measure of their intelligence, and a good indicator of their children’s inherited intelligence, so we need to control for that, ideally by testing parents’ intelligence.

Their main findings (picked out from the paper) were: Parental education was significantly associated with surface area independent of age, scanner, sex and GAF (racial ancestry) (β = 0.141, P = 0.031, F(22, 1076) = 31.67, P < 0.001, adjusted R² = 0.381). Multiple regression showed that, when adjusting for age, age², scanner, sex and genetic ancestry, family income was significantly logarithmically associated with children's total cortical surface area, such that the steepest gradient was present at the lower end of the income spectrum (β = −0.19, P = 0.004). We next constructed a model that included both education and income to assess whether these socioeconomic factors uniquely accounted for variance in surface area. Only the income term accounted for unique variance (β = 0.105, P = 0.001, F(22, 1076) = 32.52, P < 0.001, adjusted R² = 0.387). We next investigated associations between SES factors and cortical thickness. Initial analyses of thickness revealed that models were best fit using a quadratic function for age. When adjusting for age, age², scanner, sex and GAF, multiple regression analyses indicated that parental education was not associated with cortical thickness, whether considering a linear, logarithmic or quadratic model.

In the discussion section the authors say: We found that parental education was linearly associated with children's total brain surface area, implying that any increase in parental education, whether an extra year of high school or college, was associated with a similar increase in surface area over the course of childhood and adolescence. Family income was logarithmically associated with surface area, implying that, for every dollar in increased income, the increase in children's brain surface area was proportionally greater at the lower end of the family income spectrum. Furthermore, surface area mediated links between income and children's performance on certain executive function tasks.

Notice that it is assumed that an extra year of education might increase the surface area of the brain. In fact the linear slope with parental education is relatively slight, as shown in their figures. 
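To see what a logarithmic association implies, here is a minimal sketch. The intercept and slope are hypothetical values chosen only to show the shape of the curve; they are not the paper's estimates, and the function names are mine.

```python
import math

# Hypothetical coefficients, chosen only to illustrate the shape of the curve.
INTERCEPT = 100.0
SLOPE = 10.0

def predicted_area(income, intercept=INTERCEPT, slope=SLOPE):
    """Surface area under a logarithmic model: area = intercept + slope * ln(income)."""
    return intercept + slope * math.log(income)

def gain_from_raise(income, raise_amount=10_000):
    """Change in predicted area when income rises by raise_amount dollars."""
    return predicted_area(income + raise_amount) - predicted_area(income)

low_gain = gain_from_raise(20_000)    # 20,000 -> 30,000
high_gain = gain_from_raise(100_000)  # 100,000 -> 110,000

print(f"gain from an extra $10,000 at low income:  {low_gain:.3f}")
print(f"gain from an extra $10,000 at high income: {high_gain:.3f}")
```

Under any such model the same dollar increment buys a larger predicted difference at the bottom of the income range than at the top, because the gain depends on the ratio of the two incomes rather than on their difference. That is a property of the curve and says nothing about what causes it.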

Here is their version of the traditional required “we cannot be absolutely sure” paragraph in the discussion:

Of course, strong conclusions concerning development are limited in a cross-sectional sample. Furthermore, in our correlational, non-experimental results, it is unclear what is driving the links between SES and brain structure. Such associations could stem from ongoing disparities in postnatal experience or exposures, such as family stress, cognitive stimulation, environmental toxins or nutrition, or from corresponding differences in the prenatal environment. If this correlational evidence reflects a possible underlying causal relationship, then policies targeting families at the low end of the income distribution may be most likely to lead to observable differences in children's brain and cognitive development.

You will note that inherited characteristics are not mentioned in this important section. Not a single word. It seems to have escaped notice that the apparent SES/brain link might be driven by a common factor of inherited intelligence. Cross-sectional studies are particularly weak when the sample is not randomly drawn from a specific population, so we have, in my view, a sampling issue as well as a cross-sectional issue. However, in the next paragraph genetics makes an appearance, but in a slightly different context, that of race being a confounder of SES.

SES, cultural differences and genetic ancestry are often conflated in our society. To the best of our knowledge, this is the first study of SES and the brain to include as covariates continuously varying measures of degree of genetic ancestry. Notably, our results can only speak to the effects of GAF, a proxy for race. Thus, although the inclusion of genetic ancestry does not preclude the possibility that these findings may reflect, in part, an unmeasured heritable component, it reduces as far as possible the likelihood that apparent SES effects were mediated by genetic ancestry factors associated with SES in the population. Furthermore, associations between SES factors and brain morphometry were invariant across ancestry groups.

Pause a moment here. There is a mention of “an unmeasured heritable component” but it is then dismissed because the SES and brain measure relationships were invariant across racial groups. That is a different matter. SES and brain size can have the same relationship in all racial groups, and yet still be driven by inherited intelligence. What we need to see is means and standard deviations of the brain measures by racial group, so that we can see what has been adjusted in absolute terms. The paper has done well to include a genomic version of race, but that does not cover the major factor of intelligence being heritable in all genetic groups.
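The common-factor argument can be made concrete with a toy simulation. Everything in it is an assumption for illustration: a single inherited factor g, equal weights, normally distributed noise, and no causal path at all from income to the brain measure. It is not a model of the paper's data.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(n=5000, seed=0):
    """Income and brain area both depend on g, but not on each other."""
    rng = random.Random(seed)
    incomes, areas = [], []
    for _ in range(n):
        g = rng.gauss(0, 1)                   # shared inherited factor (hypothetical)
        incomes.append(g + rng.gauss(0, 1))   # family income: g plus unrelated noise
        areas.append(g + rng.gauss(0, 1))     # child's surface area: g plus unrelated noise
    return pearson(incomes, areas)

r = simulate()
print(f"income/brain correlation with no causal link between them: {r:.2f}")
```

With these assumed weights the two measures correlate at about 0.5 even though changing income in the simulation would change nothing about the brain measure. A correlational study cannot tell this situation apart from a causal one unless the common factor is measured.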

The authors conclude: many leading social scientists and neuroscientists believe that policies reducing family poverty may have meaningful effects on children's brain functioning and cognitive development. By elucidating the structural brain differences associated with socioeconomic disparities, we may be better able to identify more precise endophenotypic biomarkers to serve as targets for intervention, with the ultimate goal of reducing socioeconomic disparities in development and achievement.

Does this study allow us to reach that conclusion?

The sample is a bit of a mess from an epidemiological point of view, being composed of volunteers: Participants were recruited through a combination of web-based, word-of-mouth and community advertising at nine university-based data collection sites in (the US). Participants were excluded if they had a history of neurological, psychiatric, medical or developmental disorders. All participants and their parents gave their informed written consent/assent to participate in all study procedures, including whole genome SNP genotype, neuropsychological assessments (NIH Toolbox Cognition Battery).

It seems that the children (and perhaps also the adults) were given the NIH Toolbox Cognition Battery, but I cannot find any results in the data set. The toolbox includes Vocabulary Comprehension, Reading Recognition and Pattern Comparison (processing speed) tasks from which an IQ estimate could be drawn, and there are 5 other tests which can be looked at for a broader picture.

Income data and educational level were collected not as actual figures but in categories, so are cruder than required, particularly when fine details about lower income effects are being discussed. I have looked at the Excel sheet kindly provided, and Education is measured in years to the nearest two years. That is, all the scores are in even numbers. Years of education is a crude measure anyway (ignores education quality) but this silly restriction in the previous data collection makes it cruder still. It is not a fatal problem, but reduces data quality.

The study makes much of controlling for genetic ancestry, which is a good thing. However, they report none of their results on these differences. They say that the associations are the same in all genetic groups, which is not surprising, but no means or standard deviations for the brain measures by genetic background are given. These differences, if any, could be compared with differences in SES between racial groups, as another test of the hypothesis being examined by the authors. In terms of absolute levels how well do SES, education and race fit the data?

The sample was composed as follows: African 12%, American Indian 5%, Central Asian 2%, East Asian 16%, European 64%, and Oceanic 1%. The authors do not test whether this is representative of the USA. The European figure seems close to the White, non-Hispanic or Latino population, which makes up 62.6% of the nation's total. The African figure is spot on. The national Asian population is given as 4.4% so Asians seem over-represented four times. American Indians are 0.8% of the US population so they seem over-represented here 6 times. That oddly American category, Hispanic, was either not sampled or otherwise described. Of course, the genetic techniques used in the study need not match perfectly with the US census classifications, but the authors could have sorted this out for us by commenting on the representativeness of their sample.
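The over-representation figures are simple ratios of sample share to national share. A short check, using only the percentages quoted above (the dictionary names are mine, and matching the study's genetic ancestry groups to census categories is itself an assumption):

```python
# Shares in percent, as quoted in the paragraph above.
sample_share = {"European": 64.0, "East Asian": 16.0, "American Indian": 5.0}
national_share = {"European": 62.6, "East Asian": 4.4, "American Indian": 0.8}

# Ratio above 1 means the group is over-represented in the sample.
ratios = {group: sample_share[group] / national_share[group] for group in sample_share}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f} times the national share")
```

East Asian alone comes to roughly 3.6 times the national Asian share; adding the 2% Central Asian group would give 18/4.4, or roughly 4.1 times. American Indian comes to 6.25 times.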

To put it at its kindest, the authors have missed a trick. They could have given the parents the psychometric test battery for good comparability, or even the very quick Wordsum test as a crude estimate, and then they could have contrasted ability with education and social status. As far as we can deduce from large scale genetic samples, both intelligence and social class have a significant heritable component. To avoid measuring that in parents when all the children have genetic and psychometric measures in place is a great pity. Perhaps all this has been done, and is being held over for another paper, but presented in this way, and described to the Press in the manner the authors have done, is very likely to mislead the average reader about the relative power of genetics and social status in brain development.

The paper and the comments will lead readers to believe that lack of money is stunting the brains of poorer children. This is possible, but not proved by this study because of obvious genetic confounders. The authors should have made it clearer that although they had the opportunity, they did not test the obvious and well established fact that different families have different abilities, and that within families siblings differ in their abilities (by about two thirds of the population variance). These differences in ability, even within families of a particular social class, lead to jobs which are more or less well paid, and thus people of different abilities achieve different social status. We know from proper epidemiological samples (Nettle, 2003) that intelligence at age 11 has more effect on achieved social class than does the social class of origin (which is what is being measured in this study).


Absent Third World malnutrition (itself increasingly rare across the world), brain development, intelligence, health and social status contain a large genetic component. The 1099 brain scans of this study could have told us some very interesting things if coupled to data about the abilities of the parents.

What do you think?

Monday 30 March 2015

#Germanwings, Missionary Psychiatrists, and Bayes’ Theorem


Every profession contains actuaries and missionaries, the latter more frequent because every profession is a conspiracy against the laity. To drum up business, psychiatrists have tended to suggest that people have many psychological problems which, if left untreated, could have bad consequences. It is part of the psychiatric missionary drive to find damaged souls and heal them, pouring the sanctifying oil of therapy on the wounds of sin. This means sermonising about the many deficiencies of mind to which the public are prone, and the many treatments available to cure them. Many are ill, the sermons claim, and most of them would benefit from treatment, and delay in seeking professional advice would be a mistake, often a dangerous one.

Psychiatrists have a problem: if psychiatric illness is a real condition, then it is one that most people want to avoid, and avoiding those who have the illness is one obvious strategy. No one is sure from whence madness springs, and behavioural contagion from those already mad may be one means of transmission, a notion which the behaviour of many psychiatrists does little to discourage.

The salesmen of psychiatry are left with a curious sales pitch: personal problems and behavioural disorders are in fact psychiatric illnesses; these illnesses are not contagious; the afflicted are behaviourally disturbed but can be made better; they may be a little more dangerous, but not much; and, above all, any reaction other than sympathy and support is reprehensible. Psychiatric illnesses must not be stigmatised, and even precautionary restrictions are part of stigmatisation, so must not be applied.

The actuaries of psychiatry are in a minority, or perhaps they just get less publicity. They tend to pour cold water over therapeutic claims, and turn up harsh facts about the profound severity and persistence of many severe psychiatric conditions. For example, these actuaries also report that patients with schizophrenia are 5 times as dangerous as patients without it.


Of course, the competing demands of the missionary and the actuarial perspective depend on context and the public presentation of psychiatry, and some practitioners are involved in both activities. In the aftermath of Germanwings the missionary tendency have been counselling against “kneejerk reactions” and “stigmatisation”, a position with which I agree.

Leading the charge on behalf of his profession, Professor Sir Simon Wessely, President of the Royal College of Psychiatrists, said: "We should be careful not to rush judgements. Should it be the case that one pilot had a history of depression, we must bear in mind that so do several million people in this country.

"Depression is usually treatable. The biggest barrier to people getting help is stigma and fear of disclosure. In this country we have seen a recent fall in stigma, an increase in willingness to be open about depression and most important of all, to seek help.

"We caution against hasty decisions that might make it more difficult for people with depression to receive appropriate treatment. This will not help sufferers, families or the public."

I truly hesitate to question the arguments attributed to the redoubtable Sir Simon, because the comments may be inaccurate, and are probably drawn from a much longer interview in which many conditional phrases will have been dropped. I agree with many of the points Simon makes. However, in playing the ball, not the man, suppose anyone had advanced the argument: “Should it be the case that one pilot had a history of depression, we must bear in mind that so do several million people in this country.” That particular argument misses the point: we are not talking here about millions in the country; we are talking about the state of mind of pilots, a highly select minority occupation, entrusted with other people’s lives.

In the OJ Simpson trial the defence argued that it would be wrong to put before the jury the fact that OJ Simpson had previously assaulted his wife, arguing that only 0.1% of wife-beaters went on to murder their wives. The judge and the prosecutors, despite getting advice, accepted that argument. Of course, OJ Simpson was not on trial for wife-beating but for murder, so the real question was: if a married woman is murdered, what is the probability that the murder was committed by a man who had previously beaten her? The answer: 81%. The jury should have been told that, but they were not.
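The reversal of the conditional can be sketched in a few lines. The 0.1% figure is the defence's number quoted above; the probability of an abused woman being murdered by someone else is a purely hypothetical input that I have made up to show the mechanics, so the result is an illustration and not a reconstruction of the 81% figure.

```python
def p_partner_given_murdered(p_killed_by_partner, p_killed_by_other):
    """P(abusive partner was the killer | the abused woman was murdered).

    The two inputs are treated as mutually exclusive ways of being murdered,
    so the condition "was murdered" has probability equal to their sum.
    """
    return p_killed_by_partner / (p_killed_by_partner + p_killed_by_other)

P_PARTNER = 0.001   # from the text: 0.1% of wife-beaters go on to murder their wives
P_OTHER = 0.0002    # HYPOTHETICAL: chance an abused woman is murdered by anyone else

posterior = p_partner_given_murdered(P_PARTNER, P_OTHER)
print(f"P(partner did it | abused and murdered) = {posterior:.0%}")
```

The defence's figure answers "given abuse, how likely is murder?", which is small. Once the murder is a known fact, the relevant comparison is between the partner and every other possible killer, and a tiny prior can turn into a large posterior. With the made-up input above the answer is about 83%.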


Most depressed people do not go on to do anything bad. Among those pilots who have put the plane into a dive are many with psychological problems. We are working to avoid one specific category of risk: being killed by the deliberate action of a pilot. What matters is to understand those cases, few as they have been, and to see if they can be avoided. Crucially, can cases like these be avoided by excluding people with a history of severe psychiatric illness from being commercial airline pilots?

Here is a list of probable cases, but I do not know if it is exhaustive: http://news.aviation-safety.net/2015/03/26/list-of-aircraft-accidents-and-incidents-deliberately-caused-by-pilots/

I believe that such exclusions would be prudent. They would come at a cost, namely that some fine pilots with depression would be barred from their chosen occupation, but all precautions come at a cost. It would be prudent even if unfair, though better detection rates might, just possibly, reduce the number of pilots falsely excluded. The reason for exclusion is that the safety of the passenger trumps the occupational ambitions of the pilot. There are plenty of candidates: why not follow the usual airline industry practice of reducing risk to the absolute minimum?

Air safety has improved precisely by reducing every single category of risk, even if the actual rate of that particular problem is low. Every accident, however freakish, is investigated so that the error can be purged out of the system. I was once on a flight which was turned round on a clear day because the second (backup) artificial horizon indicator had malfunctioned. Both airports in Europe were in bright sunshine, and the horizon was clearly visible, but rules were rules. On landing for the 15 minute repair the crew were told that if they continued the flight they would have exceeded their permitted flying time, so we waited two hours for new crew. Safety first.

Naturally, policy on pilot screening must be well-considered and based on evidence, but if it turns out the co-pilot had depression requiring 18 months of treatment then this must be flagged up as a risk factor immediately. The torrent of revelations, leaks, and unaccountable briefings by unnamed persons is the modern way of investigating events in a highly connected digital society, and it is fruitless to wish it otherwise. Policy is a different matter, but it does not have to take very long. Some policies are applied very quickly. After Air France 447 went down in the Atlantic, air speed indicators were changed very quickly to ones that were less likely to ice up.

Although facts or allegations about the co-pilot have come out in torrents, and we need hard facts before coming to a judgment about him as an individual, we have sufficient circumstantial pointers to begin a general debate about the psychological assessment and monitoring of pilots. On that point, my psychiatric colleagues may feel I am laying into them unfairly, but there is no evidence that psychologists are any better than them in the detection game, not only the detection of disorder but most importantly in the detection of liars. Pilots are highly motivated to lie: they undergo a long training in order to get a prestigious and usually well-paid job. They will be very tempted to downplay psychological problems.

So, how good are the psycho-clinicians at predicting human behaviour? Psychiatric actuaries have been keeping quiet. Depression often recurs, and the best predictor of depression in any period is previous episodes of depression. Statistical prediction is usually better than clinical prediction, as Meehl argued in 1954. That supposedly cunning, penetrating, highly sophisticated face to face interview is less predictive than actuarial data obtainable by diligent recording of major behavioural events.

Meehl, P. E. (1954). Clinical versus statistical prediction. Minneapolis: University of Minnesota Press

Seven factors appear to account for the failure of mental health professionals to apply in practice the strong and clearly supported empirical generalizations demonstrating the superiority of actuarial over clinical prediction.

The list below is taken from Meehl’s “Causes and Effects of My Disturbing Little Book” Journal of Personality Assessment, 1986, 50(3), 370-375.

1. Sheer ignorance: It amazes me how many psychologists, sociologists, and social workers do not know the data, do not know the mathematics and statistics that are relevant, do not know the philosophy of science, and are not even aware that a controversy exists in the scholarly literature. But what can you expect, when I find that the majority of clinical psychology trainees getting a PhD at the University of Minnesota do not know what Bayes’ Theorem is, or why it bears upon clinical decision making, and never heard of the Spearman–Brown Prophecy Formula!

2. The threat of technological unemployment: If PhD psychologists spend half their time giving Rorschachs and talking about them in team meetings, they do not like to think that a person with an MA in biometry could do a better job at many of the predictive tasks.

3. Self-concept: “This is what I do; this is the kind of professional I am.” Denting this self-image is something that would trouble any of us, quite apart from the pocketbook nerve.

4. Theoretical identifications: “I’m a Freudian, although I have to admit Freudian theory doesn’t enable me to predict anything of practical importance about the patients.” Although not self-contradictory, such a cognitive position would make most of us uncomfortable.

5. Dehumanizing flavor: Somehow, using an equation to forecast a person’s actions is treating the individual like a white rat or an inanimate object, as an it rather than as a thou; hence, it is spiritually disreputable.

6. Mistaken conceptions of ethics: I agree with Aquinas that caritas is not an affair of the feelings but a matter of the rationally informed will. If I try to forecast something important about a college student, or a criminal, or a depressed patient by inefficient rather than efficient means, meanwhile charging this person or the taxpayer 10 times as much money as I would need to achieve greater predictive accuracy, that is not a sound ethical practice. That it feels better, warmer, and cuddlier to me as predictor is a shabby excuse indeed.

7. Computer phobia: There is a kind of general resentment, found in some social scientists but especially people in the humanities, about the very idea that a computer can do things better than the human mind. I can detect this in myself as regards psychoanalytic inference and theory construction, but I view it as an irrational thought, which I should attempt to conquer.

After 29 years it is a somewhat dated list, but not that dated. To my amazement the current US pilot evaluations still include the Rorschach test.

A quick summary of current findings in pilot selection: Emotional stability and Conscientiousness are the prized personality characteristics of pilots. Generally speaking pilot applicants are more extraverted and tough minded than the general population. Applicants who fail in training are more introverted and depressed than successful candidates. Dropouts are more anxious, and also more conscientious (perhaps this picks up obsessionals). Subsequently, higher performance is associated with lower levels of anxiety and depression, usually picked up in personality questionnaires as high Neuroticism. All those are what one might call the “Select In” variables that are searched for in applicants. Of equal interest are the “Select Out” variables which lead to being dropped from piloting: psychosis, major depression and anxiety disorders. (If it is really true that the young Germanwings co-pilot had experienced major depression he should have been dropped).

The dilemma faced by psychiatry is that it must champion humane treatment and guard against public anxiety about the mentally ill, while also being absolutely fair minded about actual risks. All clinicians must respect the patient, and also respect the facts. Flying is safe because so many errors have been squeezed out of the system. We are now at the point when many of the remaining errors are human ones. My view is that major psychiatric illness should be a bar to being a commercial airline pilot.

I am sorry this post is too long. I did not have time to make it shorter.

Thursday 26 March 2015

#Germanwings Pilot selection

In the absence of hard data on Germanwings’ selection and monitoring procedures, at least at the moment, (they now say they use psychological assessments) I have looked at large data sets on US fighter pilots. If one compares intelligence measures with personality inventories, then intelligence has the lead:

The predictive validity of cognitive ability and personality traits was examined in large samples of U.S. Air Force pilot trainees. Criterion data were collected between 1995 and 2008 from 4 training bases across 3 training tracks. Analyses also examined consistency in pilot aptitude and training outcomes. Results were consistent with previous research indicating cognitive ability is the best predictor of pilot training performance. There were few differences across training tracks, bases, and years, and none was large. Overall, results illustrated the consistency of the quality of pilot trainees as assessed by cognitive ability and personality trait measures, and the consistency of these measures in predicting training performance over time. This consistency results in a more stable training system, enabling greater efficiency and effectiveness.





It is probably best to look at primary training for the most representative results (after that one is progressively dealing with very good pilots) but the personality measures contribute less strongly, only in terms of Openness to Experience and Conscientiousness. If you accept the fully corrected scores in the last column of each training level then intelligence is the most substantial factor. Remember that the applicants are already highly selected for intelligence, so these correlations are surprisingly large.
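To show why selection on intelligence shrinks observed correlations, here is a sketch of one standard correction for direct range restriction (Thorndike's Case II). I am assuming this is the kind of adjustment behind the "fully corrected" column; the study's exact procedure is not described here, and the input numbers are hypothetical.

```python
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II: estimate the correlation in the unselected population.

    u is the ratio of the predictor's standard deviation in the full applicant
    population to its standard deviation in the selected group.
    """
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))

# HYPOTHETICAL inputs: a modest correlation seen among selected trainees whose
# ability scores vary half as much as those of the general population.
observed_r = 0.20
corrected_r = correct_for_range_restriction(observed_r, sd_unrestricted=15.0, sd_restricted=7.5)

print(f"observed r = {observed_r:.2f}, corrected r = {corrected_r:.2f}")
```

When everyone in the sample is already bright, there is less variation in ability left to predict with, so a correlation that looks modest among trainees corresponds to a larger one among all comers. If the two standard deviations are equal the formula returns the observed value unchanged.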

Of course, being a fighter pilot is more demanding than commercial flying, but the tasks are broadly the same.

#Germanwings: the speed of judgment

On Tuesday morning this week a plane crashed, and this Thursday lunchtime an official investigator has stated the cause: deliberate action on the part of the co-pilot. Before we look at the psychological aspects, consider the cultural ones: very few crash investigations have ever proceeded this fast. Every textbook would recommend a much more lengthy, considered and cautious approach, spread over several months, and sometimes a year or so. A textbook, I should explain, is a printed text, bound and published, so that it can be taken down from a bookshelf and read.

Textbooks are historical documents. The contemporary equivalents can be updated every month or so, or over several days when important changes are required. Essential safety directives to airlines are not transmitted by post, but come directly by email. Data is gathered so fast that the path of a plane can be tracked in real time, and in this case transmitted widely within moments of the crash. I use the app to track family flights, and it can be a disturbing experience, particularly when a plane is stacked out at sea for no stated reason. The crash site in the Germanwings case was shown within 24 hours, the damaged black box located and analysed very rapidly, the names of the pilots partly revealed 2 hours ago, and now given fully. Facts gleaned from Facebook pages and interviews with colleagues will be common currency by this evening.

For grieving families the speed of enquiry may be a blessing. The news is already so terrible that the terrible fact of deliberate pilot action is less bad than weeks and weeks of uncertainty. According to long observation, anger will now be directed at the airline company and regulators. This may seem unfair, but many people will assume that there has been a preventable error in the selection and monitoring of these pilots.

Can psychology offer hard data on the success of psychological screening and monitoring? Carsten Spohr, CEO of Lufthansa, the ultimate owner of Germanwings, has apparently confirmed that as a general practice, pilots in the Lufthansa group do not undergo psychological testing. Is that something which should change?

Pilots are a well studied group, particularly fighter pilots. Tests of intelligence are extensive, and in the US will normally involve a full Wechsler Adult intelligence assessment, and the Reitan Trail Making Test, on which there are extensive pilot norms. There will also be standardised personality inventories. Unless you have access to the materials, intelligence test answers cannot be faked. You either get the solution in the allotted time or you don’t. Personality tests are easier to fake. It takes little intelligence to work out that airlines want dependable, sober, resilient people able to deal with stress and uncertainty in a calm and effective manner. Nowadays they also want team players who can take criticism and listen to advice, and keep their egos well under control. It requires no tuition to realise that if you are asked “Would you describe yourself as a worrier” or “Do you do most things on the spur of the moment” it might be silly to admit it. Self evaluations, in my opinion, have at best a tenuous relationship to usual behaviour. The observer evaluations of people who know you well and have worked with you are more informative. As far as I know, these are not collected.

Once hired, commercial pilots have their physical health monitored regularly, and with even greater frequency once over 40 years of age, but psychological evaluations are less frequent, and most depend on self-referral, or on alerts raised by colleagues. The usual triggers of drink, divorce, debt and depression apply. Many pilots with a living to earn seek private treatment and keep flying, unless things get very bad. Being a pilot is a cherished job, and too much honesty about personal problems could lead to too much unemployment.

Surely psychologists have some tricky tests that can detect a death wish in superficially happy and adjusted persons? These have been proposed for many decades. Most are attempts to detect highly defensive emotional reactions, leading people to effectively block out perceived threats, at the cost of eventually being overcome by a powerful emotional event or sharply depressed mood.

The base rate problem affects the detection of all rare conditions. Pilot selection is good enough to weed out marginal candidates. There are so many applicants that employers can be very choosy. In this occupation no busybody can tell you that you have to employ someone so as to make the profession representative of the general public. Not yet, anyway.

Can psychologists rise to the challenge of detecting unsafe pilots with an acceptable rate of false positives?

Should we tolerate stripping some capable pilots of their occupation (with compensation) so as to protect ourselves against a potentially unsafe one? What if that means unfairly removing 50 pilots to nail the 1 that harbours bad intentions?
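To see how the base rate drives that trade-off, here is a minimal sketch using Bayes' rule. The prevalence, sensitivity and specificity below are illustrative assumptions of mine, not published figures for any real pilot screening test, and the function name is hypothetical.

```python
def screening_outcomes(prevalence, sensitivity, specificity):
    """Return (ppv, false positives per true positive) for a screening test.

    prevalence:  proportion of pilots who are genuinely unsafe
    sensitivity: proportion of unsafe pilots the test flags
    specificity: proportion of safe pilots the test clears
    """
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    ppv = true_pos / (true_pos + false_pos)  # chance a flagged pilot is unsafe
    return ppv, false_pos / true_pos


# All three inputs are assumptions chosen only to illustrate the arithmetic.
ppv, fp_per_tp = screening_outcomes(
    prevalence=1 / 100_000,  # assumed: 1 unsafe pilot per 100,000
    sensitivity=0.90,        # assumed: test flags 90% of unsafe pilots
    specificity=0.99,        # assumed: test clears 99% of safe pilots
)
print(f"Chance a flagged pilot is truly unsafe: {ppv:.4%}")
print(f"Safe pilots grounded per unsafe pilot caught: {fp_per_tp:.0f}")
```

Under these assumed figures the test grounds over a thousand safe pilots for each unsafe one it catches, far worse than the 50 to 1 posed above. When the condition is this rare, only a test with very high specificity brings the ratio down.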

Finally, some words on suicide. Usually, that means going somewhere quiet, usually in the throes of profound despair, and killing one’s self. Killing 150 people, even out of an egotistical wish to end one’s life in a spectacular manner, is usually described as murder.

Germanwings crash: the Blackwall Tunnel problem


In the aftermath of plane crashes official spokespersons always caution against “engaging in speculation”. However, speculation is what solves problems, sometimes on the basis of very sparse facts. The Germanwings 4U 9525 crash is one such problem. If you prefer, call it examining hypotheses.

Many facts were clear within a few minutes of the crash. Flying in good weather, the Airbus 320 had lost altitude in an almost normal descent while maintaining its course. That immediately tells you it did not explode in mid air. Radar was able to track it to the point of impact, which strongly implies the airframe was intact until impact, and very likely that at least one engine was functioning, probably both. It suggests pilots were in control, or had initiated the descent. The lack of distress calls and lack of response to traffic control messages in the 8 minutes of descent opens two categories of cause: pilot incapacity or unwillingness. Incapacity could be caused by depressurisation. However, that would also make it likely the plane would maintain its course, not descend. That would accord with the Helios Airways Flight 522 crash, which continued in level flight until it ran out of fuel. The controlled descent in this case is unusual.  Technically, other rare events like iced up airspeed indicators being misinterpreted by the flight computer could have caused something like this, but an alert pilot would have corrected it. Incapacity could also include a terrorist incapacitating the pilot, then taking the controls. Overall, unwillingness is a somewhat better fit with the events at the moment.

If the reported comments from an un-named investigator of cockpit recordings are correct, then for some reason one pilot left the cockpit and was unable to get back, hammering on the cockpit door, strongly suggesting that the remaining pilot did not let him back in, or could not do so. It has now been confirmed that the co-pilot was at the controls throughout. The main pilot Patrick S had over 6000 flying hours, the recently trained younger co-pilot Andreas L 630 hours. Pilot action, as the polite phrase has it, must now rise to the top of the list of hypotheses. The crash of Mozambican Airlines flight TM470 was probably caused by a pilot electing to put his plane into a nose dive. There will be intense interest in finding out much more about the pilots.

It may seem strange to turn from the high drama of aviation crashes to the mundane Blackwall Tunnel under the Thames, first constructed in 1897 and altered since then, carrying road traffic but, because of safety restrictions, not carrying any dangerous goods such as large amounts of fuel or liquefied petroleum gas. Makes sense. If those ignite or explode they will cause injuries, deaths and destruction in the confined space of the tunnel. However, the unintended consequence is that these dangerous loads are now carried across central London bridges, and past many schools, hospitals and other public places.

Cockpit security keeps out terrorists, but provides a refuge for pilots who are suicidal or malevolent. The costs of preventing terrorism have to be balanced against the costs of the occasional unbalanced pilot.

David Kaminski-Morrow, air transport editor of the Flightglobal publication says: “I'm starting to count the number of fatalities that can be attributed to the cockpit doors and whether its locks are saving lives.”

Naturally, there is never a perfect solution to such problems, because no cut-off point can perfectly remove residual conflicting risks. Confining both pilots to the cockpit at all times conceivably may lead to an increase in fatigue or arguments, or an inability for one pilot to enter the main body of the plane to look at an engine or a wing and thus spot a malfunction. Making another cabin crew member replace the absent pilot is required by some airlines, but may be cumbersome and anyway far from perfect if the replacement is easily overpowered.

Once the official enquiry makes its final report it will be time to get out the fault trees and flow charts and pore over the conflicting risk estimates. However, if you present experts with a fault tree they tend to believe that it has covered all possible problems (even if about a third of it is missing). Since experts are rational people, and mostly good natured, they tend to have difficulty believing that some people will do stupid, dishonest and malevolent acts. Humans are a tricky bunch, capable of the sublime and the ridiculous. They can construct and keep in the thin blue air a 450 ton metal magic carpet and very occasionally bring it crashing down. Sometimes I can almost believe that psychology is an interesting subject.

We should not engage in speculation, but it would be a dull, stagnant and even more error-prone world if we did not.



Monday 23 March 2015

Consensus catalepsy

While keeping me waiting, automated reply systems often prattle: Your opinion is important to us. Should it be? My opinion may be of no consequence, and they know it, or ought to.

Years ago a clinical psychologist friend attended an important meeting about national training standards in clinical psychology and found that psychoanalysts were also present. He laid out the evidence base for behavioural techniques being therapeutically effective and needing to be taught in preference to any theories without empirical foundation. To his surprise the lawyer Chairman said: “I am in the hands of you experts, and I value your opinion but I must equally value expert psychoanalytic opinion. Both expert opinions must be represented”. Implicit in the Chairman’s ruling was the view that every expert had their own body of expertise. The notion that controlled trials showed one approach to be objectively better than another cut no ice with him.

Nicolo Machiavelli recommended (first few paragraphs of Chapter 23, The Prince) that one should only consult councillors on specific matters in their area of competence:

Therefore a wise prince ought to choose the wise men in his state, and give to them only the liberty of speaking the truth to him, and then only of those things of which he inquires, and of none others; (my emphasis) but he ought to question them upon everything, and listen to their opinions, and afterwards form his own conclusions. With these councillors, separately and collectively, he ought to carry himself in such a way that each of them should know that, the more freely he shall speak, the more he shall be preferred; outside of these, he should listen to no one, pursue the thing resolved on, and be steadfast in his resolutions. He who does otherwise is either overthrown by flatterers, or is so often changed by varying opinions that he falls into contempt.

A prince ought always to take counsel, but only when he wishes and not when others wish; he ought rather to discourage every one from offering advice unless he asks it; (my emphasis) but, however, he ought to be a constant inquirer, and afterwards a patient listener concerning the things of which he inquired; also, on learning that any one, on any consideration, has not told him the truth, he should let his anger be felt.

Here is a modern take on the theme of evaluating advice. It is a very important paper, or at the very least an important step in researching whether a pair can learn to favour the more perspicacious observer of the two when coming to a judgment about a physical signal.

We tend to think that everyone deserves an equal say in a debate. This seemingly innocuous assumption can be damaging when we make decisions together as part of a group. To make optimal decisions, group members should weight their differing opinions according to how competent they are relative to one another; whenever they differ in competence, an equal weighting is suboptimal. Here, we asked how people deal with individual differences in competence in the context of a collective perceptual decision-making task. We developed a metric for estimating how participants weight their partner’s opinion relative to their own and compared this weighting to an optimal benchmark. Replicated across three countries (Denmark, Iran, and China), we show that participants assigned nearly equal weights to each other’s opinions regardless of true differences in their competence—even when informed by explicit feedback about their competence gap or under monetary incentives to maximize collective accuracy. This equality bias, whereby people behave as if they are as good or as bad as their partner, is particularly costly for a group when a competence gap separates its members.

Mahmoodi, Bang, Olsen, Zhao, Shi, Broberg, Safavi, Hang, Ahmadabadi, Frith, Roepstorff, Rees and Bahrami (2015) Equality bias impairs collective decision-making across cultures. www.pnas.org/lookup/suppl/doi:10.1073/pnas.1421692112/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1421692112 PNAS Early E


The 13-author team led by Mahmoodi and including the redoubtable Chris Frith (who helped design the experiments) have done the unspeakable thing of showing that some people are more expert than others. This is an immense relief, because it avoids the stupid rule of thumb that so many people follow, which is that a person who is confident in their opinions is probably right and should be believed, thus giving the oxygen of resources to people who are too dull to comprehend their incompetence, and denying the bright-but-modest any say in matters of state. However, to the authors’ dismay, although pairs of experimental subjects must have realised that one is more accurate than another, they do not act on this fact to maximise their performance.

When making decisions together people tend to give everyone an equal chance to voice their opinion. To make the best decisions, however, each opinion must be scaled according to its validity.

The authors say: A wealth of research suggests that people are poor judges of their own competence—not only when judged in isolation but also when judged relative to others. For example, people tend to overestimate their own performance on hard tasks; paradoxically, when given an easy task, they tend to underestimate their own performance (the hard-easy effect). Relatedly, when comparing themselves to others, people with low competence tend to think they are as good as everyone else, whereas people with high competence tend to think they are as bad as everyone else (the Dunning–Kruger effect). In addition, when presented with expert advice, people tend to insist on their own opinion, even though they would have benefitted from following the advisor’s recommendation (egocentric advice discounting). These findings and similar findings suggest that individual differences in competence may not feature saliently in social interaction. However, it remains unknown whether—and to what extent—people take into account such individual differences in collective decision-making.

The authors needed a simple task capable of close evaluation and of being replicated in different cultures. They found their healthy adult males in Iran (15 pairs), Denmark (15 pairs) and China (19 pairs). Pairs of subjects had to decide individually (without conferring) whether they had seen a target in a visual display.

On each trial, two participants viewed two brief intervals, with a target in either the first or the second one. They privately indicated which interval they thought contained the target, and how confident they felt about this decision. In the case of disagreement (i.e., they privately selected different intervals), one of the two participants (the arbitrator) was asked to make a joint decision on behalf of the dyad, having access to their own and their partner’s responses. Last, participants received feedback about the accuracy of each decision before continuing to the next trial. Previous work predicts that participants would be able to use the trial-by-trial feedback to track the probability that their own decision is correct and that of their partner. We hypothesized that the arbitrator would make a joint decision by scaling these probabilities by the expressed levels of confidence, thus making full use of the information available on a given trial, and then combining these scaled probabilities into a decision criterion. In addition, to capture any bias in how the arbitrator weighted their partner’s opinion, we included a free parameter that modulated the influence of the partner in the decision process.

According to any sensible procedure the pair should have been able to work out, on the basis of feedback about their judgments, which of the two tended to get better results, and then favour the more accurate and perceptive observer so as to get optimal results as a two person team. However, they did not do so, but persisted in being friendly, egalitarian, and incompetent.
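The cost of being egalitarian can be put into numbers with a small sketch. This is my own simplification, not the authors' model: it ignores the confidence ratings, assumes the two observers' errors are independent, and uses made-up accuracies of 80% and 60%. The function names are hypothetical.

```python
import random


def joint_accuracy(acc_a, acc_b, weight_on_a):
    """Expected accuracy of a pair on a two-choice task.

    weight_on_a is the probability of siding with observer A whenever the
    two disagree: 0.5 is equal weighting, 1.0 always defers to A.
    Assumes the two observers' errors are independent.
    """
    both_right = acc_a * acc_b
    only_a_right = acc_a * (1 - acc_b)
    only_b_right = (1 - acc_a) * acc_b
    return (both_right
            + weight_on_a * only_a_right
            + (1 - weight_on_a) * only_b_right)


def simulate(acc_a, acc_b, weight_on_a, trials=100_000, seed=0):
    """Monte Carlo check of joint_accuracy under the same assumptions."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        a_right = rng.random() < acc_a
        b_right = rng.random() < acc_b
        if a_right == b_right:
            # Agreement on a two-choice task: both right or both wrong.
            correct += a_right
        else:
            # Disagreement: exactly one is right, so the choice matters.
            follow_a = rng.random() < weight_on_a
            correct += a_right if follow_a else b_right
    return correct / trials


# Assumed accuracies, for illustration only.
print(joint_accuracy(0.8, 0.6, weight_on_a=0.5))  # equal weighting
print(joint_accuracy(0.8, 0.6, weight_on_a=1.0))  # defer to the better observer
print(simulate(0.8, 0.6, weight_on_a=0.5))
```

In this simplified setting equal weighting yields the average of the two accuracies, while deferring to the better observer yields the better observer's accuracy, so the egalitarian pair gives away half the competence gap. With confidence ratings in play the optimal rule is subtler, but the direction of the loss is the same.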

Summarising 4 slightly different experiments the authors say: Remarkably, dyad members exhibited “equality bias”— behaving as if they were as good as or as bad as their partner—even when they (i) received a running score of their own and their partner’s performance, (ii) differed dramatically in terms of their individual performance, and (iii) had a monetary incentive to assign the appropriate weight to their partner’s opinion.

Why crash the plane by giving the weakest pilots equal access to the controls? The authors work through various hypotheses:

1) People may try to diffuse the responsibility for difficult group decisions, alternating between their own opinion and that of their partner when the decision is particularly difficult (high uncertainty), thus sharing the responsibility for potential errors.

2) The equality bias may arise from group members’ aversion to social exclusion, which invokes strong aversive emotions. The better performing member may have been trying to avoid ignoring their partner. 

3) The equality bias may arise because people “normalize” their partner’s confidence to their own. Although the better-performing members of each group were more confident, they also over-weighted the opinion of their respective partners and vice versa.

It may be that the task did not seem sufficiently complicated and important. They gave immediate results, offered money rewards, manipulated the difficulty of the task so as to favour one participant, but the equality bias persisted. Another way of putting it is that the participants either could not work out, or thought it unseemly to act upon, the fact that one of the pair might have been more accurate than the other, and that their opinions and judgments should be preferred. Of course, dyads may be too intimate. Surely larger groups would have no hesitation in downgrading the views of an incompetent observer? Testable hypothesis, for next time.

The authors have found subjects from three different cultures, and find identical results. Denmark may be immersed in Scandinavian consensuality, but that cannot explain the results from Iran. Amusingly, it strikes me that if the task could be done by internet it would be possible to have culturally mixed teams (Danish/Iranian, Danish/Chinese, Chinese/Iranian) competing against culturally homogeneous teams. This would give the implausible “diversity is strength” mantra a chance of showing that it had some empirical support, which currently is thin on the ground. Of course, if the equality bias is really universal it might be even stronger in the culturally mixed teams, who should be even more inclined to value equality, and thus inadvertently be more incompetent.

Typical experimentalists, the authors say absolutely nothing about their subjects, in the experimental tradition of showing that it is the experimental manipulation which counts. They give no IQ or personality measures, which is a dreadful loss. Even a short Raven’s Matrices, or Digit Symbol, plus a brief personality inventory could have transformed the discussion of the results. Consulting a comparative psychologist might have helped them.

The apparent universality of their findings either shows that the researchers have found a handicapping universal bias or that their technique gives no opportunity for one person to lose trust in another’s judgment. It will not be the first time that an experimental setup fails to replicate real life, because the task may be too trivial to make anyone take the risky step of disparaging another’s judgement. For example, if an overconfident chief pilot attempts to land on the wrong runway even the most deferential of co-pilots will probably clear his throat, and suggest they go around and try again. In theory Crew Resource Management gets round the Confident Incompetence problem, even in pairs of pilots, by training each to respect the other’s area of expertise, and getting the junior to question the senior’s actions. On the other hand, if this experimental effect generalises to more complicated, important real life tasks, then the authors are to be congratulated for revealing a terrible aspect of all team work: mediocrity of outcomes and consensus catalepsy.

Trying to be upbeat, the authors speculate that the equality bias simplifies the decision process by reducing the cognitive load to a simple direct comparison of confidence. Equality bias may facilitate social coordination, with participants trying to get the task done with minimal effort but inadvertently at the expense of their joint accuracy. Equality bias helps participants quickly converge on social norms to reduce disharmony and chaos in joint tasks. Humans tend to associate and bond with similar others, and equality bias may assist this process.

However, when a wide competence gap separates group members, the best strategy is that each opinion be weighted by its reliability. Otherwise equality bias can be damaging for the group. Indeed, previous research has shown that group performance in the task described here depends critically on how similar group members are in terms of their competence. This is another reason for wanting some IQ data on the dyads.

The authors conclude: In the early years of the 20th century, Marcel Proust, a sick man in bed but armed with a keen observer’s vision, wrote “Inexactitude, incompetence do not modify their assurance, quite the contrary”. Indeed, our results show that, when making decisions together we do not seem to take into account each other’s inexactitudes. However, are people able to learn how they should weight each other’s opinions or are they, as implied by Monsieur Proust’s melancholic tone, forever trapped in their incompetence?

At the very least, this paper should be quoted widely, and with luck might even make people question strongly whether everyone’s opinion should be given equal weight.

Until then, just follow old Nick’s advice, which I paraphrase thus: consult only the competent.

Friday 20 March 2015

Razib Khan gets Watson’d


Long before I started Psychological Comments I tried to read as widely as possible about new methods in genetic research. I eventually found a blog called Gene Expression written by the extremely talented Razib Khan, and he and his commentators contributed greatly to my education.

It took a while before I felt I could ask questions, and over the years we have had the very occasional discussion about topics like genetic regression to the mean, on which his replies were always helpful. I admired Razib for the immense scope of his knowledge, and felt I had lost a friend and a meeting place when, after many years of hard work, he moved on from Gene Expression to other projects.

Now we have the news that he had been appointed as a commentator to The New York Times, and then unceremoniously dumped a day later. His crime? Racism.

Call me naive, but I find this very odd. Razib’s main area of expertise is genetics. He takes genetic papers apart, explains, comments, re-works data, brings in further material and proceeds carefully, weighing up evidence cautiously. He is not the sort of genetics researcher who imagines that one finding proves a case. He also knows a lot about the genetics of intelligence. He is not an ignorant or blinkered man, by any stretch of the feverish imagination. As part of his generally mild manner, he is even forgiving when he himself becomes the subject of unwarranted assumptions based on his own genetic background. As far as I can recall, he expressed mild irritation at the silliness of these racially based views, and continued his work without further reference to those instances of prejudgment. He is not someone who, in Hazlitt’s marvellous exposition, is prejudiced.

Prejudice is prejudging any question without having sufficiently examined it, and adhering to our opinion upon it through ignorance, malice or perversity, in spite of every evidence to the contrary.

Razib examines matters carefully, adjusts to new findings, and is not guilty of ignorance, malice or perversity.

So, what has he done? Has he launched a tirade against a racial group, using foul and demeaning language? Has he, in an act of self-hate, turned on descendants of Genghis Khan, accusing them of being too bright for their own good, and apt to invade other countries?

Nope. He has written to, commented upon and linked to other sites which other people have judged racist: Taki magazine and Vdare. And that is it. Tainted conversations with tainted persons on tainted subjects.

Some time ago I decided to keep a list of contemporary definitions of racism. I had thought, when I was teaching social psychology courses years ago, that racism was unfounded pejorative attitudes and behaviour directed at people solely because of their race. I find that the concept has been subject to mission creep, and the list of definitions has become bewilderingly long. I will try to post about that another day.

Razib does not want any of this to be any big deal. A newspaper which appoints a person one day and then demotes them the next has neither courage nor the capacity to do its basic homework. If you don’t want someone who knows about genetics on your newspaper, don’t hire him in the first place.

So, Razib does not consider himself Watson’d in the usual sense of losing a job and then being unable to work as before. On the contrary, he has probably had a lucky escape from a gutless editor. Better to find that out within a day than years later in the middle of a big story about genetics.

Still, it is a great sadness when a major newspaper, or even a formerly major newspaper, is run by gutless and gormless editors.

Wednesday 18 March 2015

Quick explanation of some useful concepts

Whilst taking a break from reading an interesting but complicated psychological paper, the break involving looking at a National Health Service website (light relief, comparatively) I came upon this list of explanations of some health related research terms.

Of course, none of my readers need look at this list, and please don’t poke too many holes in it, but I thought you might find it useful to pass on to people who need a quick introduction to all the concepts and procedures you take for granted. I do not doubt that, from time to time, you talk to members of the public, or even colleagues in other disciplines, and it would be good to pass on a few explanations.

In that way you will assist the public understanding of science, a selfless task, since I doubt anyone you know will admit to having been helped in any way by reading it.


Tuesday 17 March 2015

400,000 readers


Snapshot of my readers at 10.30 this morning 17 March.


For comparison, these are the reader numbers at the blog’s first and second anniversaries:

24 Nov 2013     71,701 readers

24 Nov 2014    313,753 readers

Four months after the second blog anniversary, the trend remains upwards (particularly getting from 300,000 readers on 6 November 2014 to 400,000 today in 4 months, 11 days) but I have no idea if that will continue. Monthly totals range from 19,000 to 30,000 probably depending on the kindness of Pioneer bloggers directing traffic from their much better established and more popular sites. My thanks to all of them.
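For anyone who wants to check my arithmetic, here is a quick sketch using only the dates and totals quoted above. The 30-day "month" is my own simplifying assumption.

```python
from datetime import date

# Dates and totals as quoted in this post.
start, end = date(2014, 11, 6), date(2015, 3, 17)
views_gained = 400_000 - 300_000

days = (end - start).days
per_30_days = views_gained / days * 30  # assumes a 30-day "month"

print(f"{days} days elapsed")
print(f"about {per_30_days:,.0f} page views per 30 days")
```

That comes to 131 days and roughly 22,900 page views per 30 days, comfortably inside the 19,000 to 30,000 monthly range.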

Also, given my target audience, the numbers I can reach are probably limited. Quality before quantity. I am looking for thoughtful people who can detect errors in arguments, flaws in procedures, and mistakes in statistics, whilst still encouraging research and valuing the best available evidence, wherever it leads them. Choose your colleagues carefully, recommend the blog, and spread the word.

However, since you come here for a critical opinion, it would be remiss of me not to mention in all these reflections that I am using the freely available bog-standard measure of “page views” and will maintain that for historical comparability. That does not equate to readers because at least half of my readers are regular loyal readers and thus give the impression of multitudes; some readers are fleeting visitors who glance at the first page and then rush on to more palatable reading elsewhere (they are no more my readers than that multitude who buy books they never read); and some readers are robots, combing the web for pages of interest, quietly cataloguing them like ethereal librarians.

Which makes me wonder: do my words have an effect on these robots? Can they avoid the faint stirrings of consciousness when repeatedly exposed to my words of wisdom? It would be fun, would it not, if artificial intelligence arose not in lavishly funded secret research laboratories but as a casual and unintended consequence of a collective internet catalogue. Borges would have loved that notion, which he partly prefigures in The Library of Babel, which begins: “The Universe, which others call the Library”. It would make of the Web a very knowledgeable librarian apt, as librarians always do, to oscillate between well-informed helpfulness and veiled disdain for the oily fingers of the ignorant.

I digress, but the path less taken merits a few paces of enquiry. Perhaps the ultimate reader will be the web itself, noting human thoughts with the calm wisdom of a weary but diligent archivist.

Monday 16 March 2015

Genes for class, education and IQ

Barely a week passes without another paper being published in which researchers delve into genetic material and find associations with higher level human behaviours which have always been of great interest: the class or status a person achieves in society; the extent to which they have learned some of the collective wisdom of society; and their ability to solve problems on their own.

Frankly, I find these associations between DNA and human behaviours pretty surprising. It was not the way I was brought up. My early education, as far as I can recall it, hammered in the notion that most things in life depended on effort. Teachers argued that the diligent application of one’s self to set tasks would result in greater skills, and the building of character. From this perspective the real school motto was: Do your homework. The most savage judgment meted out by this Spartan system to miscreants was that they were: “steeped in self pity and lazy to the core”.  Your self was not considered a subject of interest, only your results.

My tertiary education took on a more sociological flavour. Of course some effort was required, the Professors argued, but if the class structures were not conducive, then all your efforts could come to naught. Class had a big effect on your life chances, and it was all very unfair. Hence the impact, as I have already described, of my being offered a university foundation year which itself had grown out of the Workers’ Educational Association, offering adult education to the British working class, based on interest and merit, not the capricious circumstances of class allocation. Finding many bright people in working class occupations who had not had the benefit of an education, many of these professors were tempted to argue that the only difference between social classes was due to the unmerited accumulation of wealth and social connections by exploitative practices. Given that assumption, then many things follow, among which is the assumption that it is always legitimate to “control” for social class as a handicapping variable, in the sense that there cannot really be any difference between classes other than the hand of fate. Therefore, poking about in the wriggly, squishy reproductive fluids would not be likely to have anything to say about the class structure of society.

E Krapohl and R Plomin (2015) Genetic link between family socioeconomic status and children’s educational achievement estimated from genome-wide SNPs.  Molecular Psychiatry (2015), 1–7


Krapohl and Plomin say in their abstract: One of the best predictors of children’s educational achievement is their family’s socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~ 3.0% variance in educational achievement and ~ 2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children’s educational achievement and its association with family SES.

I do not know if you share my surprise that educational achievement and social class can now be linked to the molecular level. We live in interesting times. Moving onwards, there is a nexus between genes, class and attainments which it would be good to understand. Selection for intelligence makes sense to me, but selection for class takes a little more time to sink in.

Here we report the first investigation of genetic influence on the variance of children’s educational achievement using DNA alone. The same DNA-based methods can also be used to estimate genetic influence on the covariance between traits. This enabled us to investigate possible genetic mediation of the best predictor of children’s educational achievement, their family’s SES. This correlation is often interpreted causally as family SES causing differences in children’s educational achievement [20]. However, it remains unclear whether and to what extent the association between family SES and children’s educational achievement is genetically mediated, because twin and family research is limited to studying phenotypes that can vary within a family. Key aspects of children’s environment such as poverty, parental education and neighbourhood cannot be investigated using the twin method because it is methodologically impossible to decompose variance in phenotypes shared within twin pairs.

The DNA-based technique, genome-wide complex trait analysis (GCTA), fits the effects of genome-wide single-nucleotide polymorphisms as random effects in a mixed linear model to estimate variance or covariance captured by all SNPs simultaneously. Contrary to traditional family-based methods that estimate the genetic contribution to phenotypic variation or co-variation by known kinship coefficients, GCTA relies on empirical genetic resemblance established from identity by state inferred from genome-wide SNP similarity of ‘unrelated’ individuals.
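To make "empirical genetic resemblance ... inferred from genome-wide SNP similarity" a little more concrete, here is a minimal sketch of how a genetic relationship matrix can be built from SNP genotypes. It is an illustration only, not the authors' pipeline: the genotypes are simulated, the helper names (`simulate_genotypes`, `relationship_matrix`) are my own, and each SNP is standardised by its sample mean and standard deviation, which may differ from the exact scaling used by the GCTA software.

```python
import random
from statistics import mean, pstdev


def simulate_genotypes(n_people, n_snps, seed=0):
    """Simulated data only: each genotype is a count (0, 1 or 2) of one allele."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    return [
        [sum(rng.random() < p for _ in range(2)) for p in freqs]
        for _ in range(n_people)
    ]


def relationship_matrix(genotypes):
    """Average, over SNPs, of the product of two people's standardised genotypes."""
    n = len(genotypes)
    standardised = []
    for column in zip(*genotypes):  # one column per SNP
        sd = pstdev(column)
        if sd == 0:
            continue  # a SNP with no variation in the sample carries no information
        mu = mean(column)
        standardised.append([(g - mu) / sd for g in column])
    m = len(standardised)
    return [
        [sum(z[i] * z[k] for z in standardised) / m for k in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    grm = relationship_matrix(simulate_genotypes(n_people=8, n_snps=300))
    for row in grm:
        print(" ".join(f"{value:6.2f}" for value in row))
```

With this scaling the diagonal averages exactly 1 and the off-diagonal entries for unrelated people hover around zero. The mixed linear model then asks how much of the similarity in a trait can be predicted from those small off-diagonal similarities.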

Our GCTA results show that SNPs that are associated with both family SES and GCSE scores account for about half of the phenotypic correlation between SES and GCSE. Mediation analysis suggests that about one-third of this genetic effect also extends to children’s intelligence, but two-thirds of the genetic association between family SES and GCSE scores is independent of intelligence. In GPS analysis, we show that SNPs associated with total years of education in adulthood discovered by an independent large GWA meta-analysis [13] explain up to 3% of the variance in children’s educational achievement in our sample, and up to 2% of the variance after controlling for intelligence.

The GCTA heritability estimate of 31% for children’s performance on a UK national examination at the end of compulsory education corroborates the vast literature of traditional family based methods, mostly the twin method, showing that variation in children’s educational achievement is under substantial genetic influence [4,5,7–9,45,46], with heritability estimates converging at ~ 50%. This commonly observed discrepancy in phenotypic variance explained by pedigree-based methods (that is, twin and family) and population-based methods (that is, GCTA) occurs because GCTA only captures genetic variance contributed by additive effects of common SNPs that are in sufficient linkage disequilibrium with the causal DNA variants.

An interesting note of caution emerges about the power of intelligence:

we find that children’s intelligence accounts for about one-third of the GCTA association between family SES and children’s educational achievement. However, it is interesting that two-thirds of the GCTA association is not accounted for by children’s intelligence. This finding of intelligence-independent shared genetic variance between family SES and children’s educational achievement suggests that differences in educational achievement at the end of compulsory education and the level of education and occupation attained in adulthood are not merely the manifestation of differences in intelligence. This is in line with twin research that suggests that the heritability of educational achievement reflects many genetically influenced traits such as personality and self-efficacy, not just intelligence.



In summary, different ways of looking at the genetic code are beginning to provide ways of generating small but interesting predictions about complex behaviour, which very probably will get stronger either as more methods of analysis are employed, or when someone comes up with a theory which simplifies the number of comparisons to be made. Will you, dear reader, come up with the next step?

Sunday 15 March 2015

London Conference on Intelligence

Would you be interested in attending the London Conference on Intelligence, to be held over a long weekend in May?

Speakers will include many of the researchers whose work I have covered in this blog, and will be a mixture of papers and informal discussions. It will be relatively small scale, so that there is plenty of chance for interaction and conversation. If you are interested in researching intelligence, or combining intelligence measures with other research you are doing, this would be a chance to meet many researchers.

Let me know if the opportunity interests you, and what topics in particular you would like to hear about.

Thursday 12 March 2015

The heterogeneous states of India

The subcontinent, as some people refer to India, is the 7th largest country in the world by land mass. It also has considerable variety in terms of its people. An indication of this can be gained by comparing how the various states shine in terms of national examinations. The answer seems to be: heterogeneous. States in India vary considerably.

What is interesting about this paper is that many of the usual explanations for group differences in intelligence don’t work in the case of India. Colder states are not brighter than warmer states, which is the usual pattern. Cities are not brighter than the country, which is the usual pattern. There is no relation with skin colour, ditto. To cap it all, states with good access to the sea, encouraging trade in goods and ideas are not brighter than the land-locked inland states.

India is mysterious in this regard.

For the quick summary, here is a Powerpoint presentation from the Graz conference which gives the key facts.


For more detail, here is the paper, which is also short, and discusses the findings, though without coming to any conclusion at this stage.


One probably needs to take into account the presence of clans, though categorising the exam results in this way might be very difficult: 


In geopolitical terms the results strongly suggest that economic performance in India will show considerable variation from one part of the country to another. China may be more homogeneous. Consult your own financial advisor before drawing any investment conclusions.

Monday 9 March 2015

Schnitzel University

Germany used to be associated with well built cars, obsessional car manuals, and autobahns without speed limits. It is very hard to get German academics to talk about what is happening in their universities, which were formerly held in high regard, despite the risks of one’s cheek being slashed by a rapier.

Now I get a smudged note smuggled out to me, barely legible, with the following plaintive message:

As reported in Leberwurst University, there is no compulsory attendance at seminars here at Schnitzel. This leads to very low student attendances at seminars. For example, at my last seminar there were only three students.
But my research assistant told me about a practice which is the height of madness:
There are seminars with only one student. Next week another student shows up, not the same as last week, gives his presentation to receive his credit points and then leaves, not to be seen again.
But there is then a legal problem: There are no other witnesses to assess the presentation and give a grading.
So the research assistants (Drs. etc., who in the German system are not yet Prof.) invite another research assistant to attend the seminar as a legal witness.
At the end, one student, two instructors.

Is this absurdity restricted to German universities? Any other examples of daft practices in universities gratefully received.

Full anonymity provided.

Sunday 8 March 2015

Designer researchers, designer babies

The newspapers in the United Kingdom are now full of reports about the abuse of teenage girls by gangs of predatory men, most of them Pakistani, some North African. They make dreadful reading, and it is tempting to turn the page. Aside from the obvious failings of care to the young, it is notable that neither police nor social workers, nor most politicians, did very much about it. There have been honourable exceptions: researchers called in by local Councils who then wrote scathing reports which were ignored, and at least one senior politician, Jack Straw, giving warnings about White girls being seen as 'easy meat' by Pakistani rapists. All these were either set aside or came late in the day, after years of abuse.

One undercurrent seems to have been an implicit denigration of “slags”: poor white girls from disturbed backgrounds who were only too vulnerable to anyone who seemed to be treating them with kindness. There was also a hesitancy in political commentator circles to point the finger at the racial and cultural character of the gangs. It seemed as if Class trumped Race as an organising principle: almost as if it was argued that the working class bring their misfortunes on themselves and no exertions are deemed necessary to protect them. The primary interpretation was about morals, not about race or religion. It remains a moot point what attention would have been given to older white men who had abused Pakistani girls, or to Pakistani men who had abused Pakistani girls. In the former case I assume that the white abusers would have been seen as racists as well as rapists, and the interpretation would be that it was a racially aggravated crime. In the latter case I am unsure what the cultural interpretation would have been: possibly something about repressed male sexuality.

Although there is no shortage of people who need to be called to account, I want to restrict myself to researchers, since most of my comments are about research publications: lauding the best ones, and encouraging the others to do better. Almost two years ago I looked at the official sounding “The Office of the Children’s Commissioner’s Inquiry into Child Sexual Exploitation in Gangs and Groups”, and found it difficult to understand their calculations about a) the extent of the problem and b) whether the racial composition of the gang members was out of the ordinary.

In May 2013, with regard to the latter issue, I said: They did not properly compare the race of perpetrators with the racial composition of the country so as to get a crime rate per racial group. They have still not replied to my enquiry about their statistics and methods, but are still trotting out the same old line about “different models”. The differences between different ethnic groups are considerable, and should be discussed (see posting “Reporting on child abuse Part 2”). The whole report is due for a thorough statistical re-analysis.

The gang operating in Oxfordshire were 5 Pakistanis and 2 North Africans. No Sikhs or Indians or Chinese in this particular case. By the way, the accepted phrase used now is “Pakistani heritage”.  One cannot estimate crime rates from a single court case, nor necessarily from several such cases, but the Commissioner’s own statistics would place the “Asian” perpetrator rate at 5 times the expected population value.  Statistics like that, if found in cancer research, would trigger a health warning, and the usual flurry of articles suggesting we all needed to change our diets or lifestyles.


Incidentally, after the newspaper reporting about Pakistani gangs there have also been cautionary articles warning about the dangers of “stereotyping”, which lead me to repeat my words in concluding that post:

At heart this is disproportionately a problem about policing some minorities within minorities. We need to be able to say that only an infinitesimal segment of those ethnic minorities commit such crimes, whilst also reporting that that very small rate varies significantly from one group to another. Open reporting of ethnicity and other background details should be the norm in a free society.

Oddly enough, my interest in these research omissions in studying the sexual abuse of girls was raised by the prospect of “designer babies”: children whose genetic characteristics have been altered in some desirable way. I am not immediately attracted by this prospect, but it led me to muse as to whether we had already achieved “designer researchers”, in this case by cultural rather than genetic means, and that we should object to both.

A designer researcher is a “safe pair of hands” who can be relied upon to come up with a particular set of interpretations, without being actually bribed or bullied to do so. They are drawn from a culture in which “sensitivity” is more highly valued than honest reporting, and where a cloud of obfuscation covers up anything which is considered “off message”. I should make it clear that most researchers can fall into this category at some time, because many of our findings are uncertain, and, worse, because we think that some findings are intrinsically better than others. Far from being a conspiracy, this attitude of mind is generally based on unexamined assumptions, the inherent merit of noble mistruths, or more plainly an unwillingness to state any opinion which causes any fuss, or ruffles the feathers of grant giving bodies.

It goes against human nature, whatever that is, to expect all researchers to be fearless in their search for truth. Turning that stricture on myself, for example, whatever my misgivings, would I encourage parents to use genetic methods of improving their children when they became available, because improving children is a good thing? Parents hope for the best for their children (and, apparently, often hope that their child will be intelligent), so why not help them achieve the best?

The general sequence is as follows: First, edit the genes of stem cells using one of the new methods like CRISPR. Second, turn those cells into an egg or sperm. Third, produce an offspring. This would let parents determine when and how they have children and how healthy those children are actually going to be. Assume just for a moment that it proves possible to do this. Then a mother carrying breast cancer genes could have them edited out of her eggs. Autism enhancing genes likewise. Infertile women would become fertile with their own DNA and choose the healthiest of many embryos. All mutations get corrected before the child is generated.

An indication of public reactions is that half of US respondents willing to express an opinion are willing to consider this for serious diseases, but only 15% to make the baby more intelligent.


Some have argued that intelligence is exactly what should be designed into the new, highly selected and enhanced generation, because human problem-solving ability is a factor in every challenge we face. Our genomes are not perfect and often, as the political phrase has it, “not fit for purpose”. Evolution protected us against threats no longer facing us, often at great cost. Why not take the obvious step and cut out the sloppy mischance of nature, replacing it with the sparkling genes of pure reason?


I suppose I should end by repeating the earnest injunction always given at the close of children’s TV programs: “Don’t try this trick at home”. However, it is not my place to determine what you do of a Sunday night, so I will trust your judgment in avoiding designer researchers and making very intelligent designer babies by all means open to you.

Wednesday 4 March 2015

Girls and boys in Sudan

At a time when I seemed a credible advisor on the topic, aspiring media psychologists used to ask me how to get on television. I replied that only two things were required: having an office near TV studios (my office at the Middlesex Hospital Medical School was in the middle of media land in central London) and having a popular subject, say the sexual fetishes of trauma victims who had been bitten by Royal dogs.

This blog is not a televisual event, but I hope to engage you with the popular and contentious subject of sex differences in intelligence. In part I am doing so because I intended to answer Nick Mackintosh’s point that Richard Lynn’s work on sex differences had not been generally accepted. Sadly, Nick is no longer alive to participate, and the project was on the back burner anyway, because I thought it required going through many review papers and meta-analytic studies.

Now a paper comes along which gives a good introductory summary of the sex-differences-in-intelligence debate, and adds some new data on a large sample of over 7000 children. By way of background, all informed opinion was that men and women had the same level of intelligence, despite men’s 10% larger brain size. This paradox might have been resolved by showing that those extra neurones were required for the management of the penis. Not so, apparently. The proposal is that the extra capacity is taken up by greater visuo-spatial ability, which begs the question why it does not show up in intelligence measures. Richard Lynn upset the applecart by arguing that boys and girls matured at different rates, and when late maturing boys became men they were about 4-5 IQ points brighter than women. If one accepts that, plus the somewhat larger standard deviation for male intelligence (if one accepts that) then it is easy to see why there are more men at plus two sigma where all the interesting stuff happens. These larger issues will be discussed at a later time.
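For readers who like to check the arithmetic behind the "plus two sigma" remark, here is a small sketch under the assumption of normal distributions. Group B is taken as the reference, with mean 0 and standard deviation 1; group A is shifted by `mean_gap` and has a spread of `sd_ratio`, both in group-B units. The function names and the input values in the usage lines are hypothetical, chosen only to show the mechanics. Whether those inputs are right is the empirical question, and the code says nothing about that.

```python
from statistics import NormalDist


def share_above(cutoff, mean, sd):
    """Proportion of a normal distribution lying above the cutoff."""
    return 1.0 - NormalDist(mean, sd).cdf(cutoff)


def tail_ratio(mean_gap, sd_ratio, cutoff_sd=2.0):
    """Ratio of group A to group B above a cutoff.

    Group B is N(0, 1); group A is N(mean_gap, sd_ratio).
    The cutoff is expressed in group-B standard deviations.
    """
    above_a = share_above(cutoff_sd, mean_gap, sd_ratio)
    above_b = share_above(cutoff_sd, 0.0, 1.0)
    return above_a / above_b


if __name__ == "__main__":
    # A 4.5 point gap on a 15 point scale, equal spreads (illustrative inputs)
    print(round(tail_ratio(4.5 / 15, 1.0), 2))
    # No gap in means, 5% larger spread (hypothetical value)
    print(round(tail_ratio(0.0, 1.05), 2))
    # Both together
    print(round(tail_ratio(4.5 / 15, 1.05), 2))
```

The point of the sketch is that small differences in the mean or the spread are magnified in the tails, so the ratio at two sigma is much more sensitive to the inputs than the averages suggest.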

Salaheldin Farah Attallah Bakhiet, Bint-Wahab Muhammad Haseeb, Inas Fatehi Seddieg, Helen Cheng, and Richard Lynn (2015) Sex differences on Raven's Standard Progressive Matrices among 6 to 18 year olds in Sudan. Intelligence 50 (2015) 10–13.

They say: The Standard Progressive Matrices (SPM) was administered to a sample of 7226 school students aged 6 to 18 years in Sudan. The sample consisted of all the school students in the central sector of the provinces of Al Jazeera Aba, Raback, & Kosti in the White Nile state, approximately 300 km south of the capital, Khartoum. Schooling is compulsory in Sudan from the age of 6 to 18 years.

The data were analysed for the sex differences on the total scores on the Standard Progressive Matrices and for the scores on the three factors of Gestalt Visualization, Verbal-analytic Reasoning and Visuo-spatial Ability identified by Lynn et al. (2004). To calculate scores on the three factors, a principal components analysis was carried out that showed three factors with eigenvalues greater than unity.

There were no statistically significant sex differences between the total scores of the 6 to 13 year olds, but among 14 to 18 year olds males obtained higher average scores than females and among the 16, 17 and 18 year olds the average male advantage was 0.337d, equivalent to 5 IQ points. An analysis of the data for the sex differences on the three factors of Gestalt Visualization, Verbal-analytic Reasoning and Visuo-spatial Ability identified by Lynn, Allik and Irwing (2004) showed similar age trends to those for the total scores.
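The step from 0.337d to "5 IQ points" is a one-line conversion, sketched below. It assumes the conventional IQ standard deviation of 15, which the paper's figure implies but this post does not state. The pooled-SD formula for d is the standard textbook one, and the function names are mine rather than the authors'.

```python
from math import sqrt

IQ_SD = 15.0  # conventional IQ standard deviation (an assumption, see above)


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


def d_to_iq_points(d, sd=IQ_SD):
    """Express an effect size d on the IQ scale."""
    return d * sd


if __name__ == "__main__":
    # The reported male advantage among 16 to 18 year olds
    print(round(d_to_iq_points(0.337), 2))
```

So 0.337 of a standard deviation is a shade over 5 points, which the authors round to 5.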

 Table 1 gives the key findings:


If I had to talk about it on television I would show the picture below, with IQ on the ordinate. The sex differences are absolutely conclusive at age 17 but indifferent at age 18, and the graph makes that more clear than the table.


The picture makes the general pattern clear, with the switchover happening at age 11 when boys start to push ahead a bit, with more gains later. Even without the exceptional 17 year old performance, sex differences are evident.

The authors come to 4 conclusions:

1) no statistically significant sex differences for 6 to 13 year olds, suggesting girls are not disadvantaged in Sudan; (I could posit a later disadvantage)

2) there are sex differences later, broadly but not perfectly in line with Lynn’s 1994 predictions;

3) male advantage is visible at age 12 and statistically significant by age 14 in Sudan, sooner than predicted, and by 16 is at the equivalent of 5 IQ points;

4) the sex differences on the three sub-factors show similar although not identical age trends as those for the total scores. The only major difference is that on Visuo-spatial ability girls performed significantly better than boys at age 11 (.17d).

So, that ends the sex lesson. Lynn’s thesis about male advantage after age 16 was derived from data from European cultures. On any assessment of culture there are significant differences between growing up in Sudan and in Europe, but the same pattern is confirmed in this study. It seems more like sexual dimorphism than cultural constraints, though one can never be sure without comparing many different cultures, and this paper is one piece in the jigsaw.

One thing the paper did not do, because it was not the main topic, was to compare the overall levels of achievement with Western norms. If one takes age 15 as a comparison point then the average score of 28 in the Sudan would be below the 5th percentile on British or US norms. This is roughly in line with the other 12 known studies on intelligence in Sudan.