Friday, 11 April 2014

Agent Pistorius


I used to be in favour of televising trials, on the basis that justice needed to be seen to be done, and that the legal process should be known to all citizens as part of their education, in case they gave evidence or had evidence given against them, or wondered why someone had been set free or convicted, or failed to appreciate the power and majesty of common law.

Now I am not so sure. Why should I be watching a trial in another country, concerning people I do not know, in a case which is unrepresentative of South Africa, a country with very high rates of murder and rape perpetrated against the anonymous multitudes? Celebrity and beauty seem to be the answer. We can be enthralled, entertained and, perhaps, educated by someone famous and someone beautiful. More than that, we can be beguiled by the redemptive story of a man who, in the harsh parlance of the past, would have been considered a cripple, rising on magic blades to take all the prizes including the prize girl. The populace like nothing more than to see someone rise to the stars and then fall to the ground. Even better, they get to see the couple’s private text messages. Awesome.

Milady Judge Masipa, who looks like a bright cookie, was understandably severe with the hoi polloi who chattered at the dramatic bits, telling them acidly that this was not a public entertainment, but that’s what it is. It is both a very serious trial and an entertainment. I am in the audience, complicit in the drama, as are you, dear reader, if you know anything about this case.

Everyone has a cover story. The State wants to show that pretty, high status White Folk don’t get special treatment. The Judiciary want to parade their independence, and don’t mind their 15 minutes of fame, which will buttress their profiles and do their pensions no harm. The media and their lackeys, including psychologists it must be admitted, opine. The Police have been revealed to the world as flat-footed bunglers and watch thieves, which must bring some comfort to the long suffering public, though some embarrassment to the State. Best of all, South Africa puts on a riveting detective story which might even lead to a True Confession, and which might conceivably lead people to forget that the average victim of South African crime gets far less attention and the perpetrators far less flamboyant and expensive defence barristers.

What of the psychology? First things first. Mr Pistorius’s defence is that he made an honest mistake, understandable in the context of high crime South Africa, and even more understandable in terms of White fear of violent black men in the dark, but that doesn’t need spelling out, and is best left unsaid anyway. The State has a mammoth task, because they have to prove a malicious intention, yet all the material facts from which intention might be deduced have been admitted. So, the case hinges on some very psychological but intangible factors.

Rightly, the two barristers Roux and Nel have become stars in their own right. I think I have had about 8 cross examinations as an expert witness, and probably about 20 case conferences with barristers. In my experience they are bright, quick witted, adept at dissecting language and drawing out subtle implications from statements. They try to be several steps ahead, and usually succeed. They also have a large bag full of tricks. They love side alleys into which the unwary can be drawn, for no other purpose than to reveal them to be idiots or liars. They will pounce on an obscure and irrelevant distinction and worry it to death until the poor witness agrees to anything out of pure frustration and boredom, only to have this concession turned against them on contrived grounds.  They revel in double negatives, elaborate dependent clauses, arch suggestions and tendentious interpretations. Best trick of all is Nasty Surprise. The victim, known to them as Baby Seal, is fed a comforting diet of banal questions, each answer being met by flattering agreement “Absolutely right. So very helpful. I am most grateful to you for saying that” and then a new document is produced showing that the person you are championing has some fatal flaw they have not disclosed to you, or that you have ignored vital contrary evidence in a major publication. Too late, you look back at your former replies with painful regret. Another trick is to take a specific issue and to discuss it to death so that its importance rises to dominate the case. For example, a barrister of my acquaintance, later a notable Judge, was defending a driver who was so drunk that Police decided to skip the “walk a straight line” test, fearing he would fall on the station floor and injure himself. Roydon, for that was his name, spent a long time convincing the jury that the one single test which mattered, on which the entire case must hinge, was the straight line test, and that had not been carried out. Case dismissed. 
The prime aim of all barristers is to secure for their client a miscarriage of justice. For the barristers the Pistorius trial is business as usual, but this time with a global audience. In some South African township hovel a young child is muttering: That’s what I want to be when I grow up.

What do we make of Oscar Pistorius? As I said in another place (see below) psychologists should not comment on people they have not interviewed. A televised trial does not give you first hand observation of a person’s face, though one certainly picks up a lot from the tone of voice. Most important, the contingencies are that Pistorius will gain considerably if his account of what happened is accepted, so his statements and behaviour cannot be taken at face value. Perhaps viewers are learning first hand some of the expensive merits of adversarial justice.

The prosecution case has centred on a theme which is a staple of criminal justice: responsibility. Perpetrators tend to minimize their actions and wish to show them as reactions. The retired prison doctor Theodore Dalrymple described a young man’s account of stabbing someone thus: “Then the knife went in”. The knife did it. The guy just happened to be there, admittedly with a knife in his hand.

Agency has a dual meaning. It can signify that you understand that you are able to operate on the world, or it can mean that you are merely an agent, carrying out a role as required. In that sense perpetrators cast themselves as victims of circumstance, agents of some higher cause, in this case the respectable home owner defending himself and his guest from intruders. 

Prison psychologists once approached me to offer trauma services to perpetrators who were traumatised by what they had done. I questioned whether this was a wise course of action, since vivid regret at their actions might possibly guide these malefactors into questioning their behaviour for ever, to the benefit of society. I said I would concentrate my scarce resources on victims.

There is an interesting issue here: how does one tell that a person is lying? The answer is simple: you need a stopwatch. Lies take longer than truth, because liars are fabricators not reporters. Fiction takes time. Ask any novelist.

Live update: Now I have just seen the prosecutor Mr Nel ask the defendant the obvious question: if Pistorius believed he was shouting for his girlfriend to call the Police while pointing a gun at an intruder hiding in a toilet, why did not the innocent girlfriend answer from behind the toilet door with a simple “It’s me, Oscar”? An implication which hangs there, with Milady Judge making a note, before the Court is adjourned till Monday.

What’s my cover story? CNN asked me to comment on the trial. Otherwise I would have continued leading the search for MH370. Honestly.

Thursday, 10 April 2014

IQ, Neuroticism, booze, and those damn vegetables again


A long suffering toiler in the groves of academe writes in to say that, rather than just bemoaning the lack of intelligence and personality measures in epidemiology, I should pay some attention to a study which has done precisely that, and help boost the visibility of such work in the health literature. In fact, I have a vague recollection of the paper, but now is the time to make amends for my forgetfulness.

Bryan Pesta apologizes for the “ghastly” link below, but it is free, folks, so just copy and paste into the search bar.

By way of background, to the external world the USA is a monolith with some minor regional variations. To its citizens it is a union of sovereign states, and all the better for it when that union is not too close. Pesta, Bertsch, McDaniel, Mahoney and Poznanski (Intelligence 40 (2012) 107–114) have gathered data on all 50 American states, have found a link between IQ and neuroticism measures and health variables, and have tried to tease out possible causal links.

They found that at the State level, drinking alcohol correlates positively with exercising and eating fruits and vegetables; and it correlates negatively with rates of smoking and many chronic diseases. These data are consistent with a growing but mixed literature showing that alcohol consumption correlates inversely with chronic disease rates. This may be nothing to do with ethanol as such, but we should follow the tradition of results first, explanations later.

The authors work through the key data using multiple regression.

At Step 1, the linear combination of IQ and N alone explained 57% and 61% of the variance in Chronic Disease and Metabolic Syndrome, respectively. Both IQ and N remained significant (but attenuated) predictors of disease, after entering Health Behaviors at Step 2. Not surprisingly, Health Behaviors itself explained large amounts of variance (over IQ and N) in both Chronic Disease and Metabolic Syndrome. Note that the variance explained at Step 2 is unusually large for social science research. Fully 80% of the variance in Chronic Disease (77% in Metabolic Syndrome) was explained by the combination of IQ, N and Health Behaviors.
The size of the effects here, though, could exemplify the "high resolution" that aggregate-level data offer, relative to studies that use individuals. At Step 3 they found that IQ (Beta=−.18), N (Beta=.35) and Health Behaviors (Beta=−.53) all remained significant as predictors of chronic disease, even after controlling for state income (Beta=−.12, ns).

Here are the correlations between IQ and:

Health Behaviours .45
Chronic disease −.51
Metabolic syndrome −.53
Crime −.76
Education .41
Religiosity −.55
Income .57

So, here we have a nice clean study, admittedly at State level (aggregated data) which shows the importance of IQ and Neuroticism in influencing health outcomes. Why has this engaging study only been cited once? It may be that the intelligence literature is not read by epidemiologists. Another problem may be the title: “Differential epidemiology: IQ, neuroticism, and chronic disease by the 50 U.S. states”. It is accurate but dull, and hardly worth tweeting about in its current form. I think that the Pesta gang need to get with the spirit of the age, and re-issue it with a snappier, media friendly title:

Dull worriers die sooner: Avoid West Virginia.


Disclaimer: I am sure that the denizens of West Virginia are bright and stable people. It was just a suggestion.

Tuesday, 8 April 2014

Nature, the journal, and the Nature of Samples


As you will know by now, I lead a quiet life, avoiding trouble wherever possible. The natural calm of my afternoon was shattered by a Nature News piece with the startling and quite specific headline: “Stress alters children's genomes”. As if the drama of the finding was not enough, Jyoti Madhusoodanan’s article added further details: “Poverty and unstable family environments shorten chromosome-protecting telomeres in nine-year-olds”.

They shorten the genome. Not, they are associated with shorter genomes. They take a healthy long genome and shorten it. Imagine what the shock of the headline must have done to my genome. On your behalf, I read on, undaunted. Nature had posted up a nice picture of telomeres, which is usually enough to win over the unconvinced. Ever the sceptic, I read on, and came across these prize paragraphs:

When researchers examined the DNA of 40 boys from major US cities at age 9, they found that the telomeres of children from harsh home environments were 19% shorter than those of children from advantaged backgrounds. The length of telomeres is often considered to be a biomarker of chronic stress.

The telomeres of boys whose mothers had a high-school diploma were 32% longer compared with those of boys whose mothers had not finished high school. Children who came from stable families had telomeres that were 40% longer than those of children who had experienced many changes in family structure, such as a parent with multiple partners.

At this point I wondered who would be silly enough to imagine that n=40 was a suitable basis for concluding anything, and whether anyone would be even more silly to imagine that if there were observed differences between bright and dull mothers that it followed that those differences were caused by independently existing stressful environments, rather than being due to prior differences in the genetics and the behaviour of brighter mothers.

To spell it out: dull mothers are at risk of all sorts of things, from their genes upwards and outwards; brighter mothers might be spared those risks for reasons which range from their genes upwards and outwards.

However, I knew that I must have got it wrong, because such foolish errors would never be tolerated by Nature. So I had a look at the paper.

Here is the method statement:

Initially, we identified 40 families based on a three-step process. In the first step, the sample was constrained to boys meeting the following conditions: (i) boys provided saliva at age 9 (wave 5) in-home interview, (ii) whose mother self-identified her race as black or African American, (iii) for whom no data were missing on the criterion variables described below, and (iv) who were male.

Next, we arrayed the subsample on an index of advantage–disadvantage from birth to age 9 based on an equally weighted combination of (i) family economic conditions, (ii) parenting practices, and (iii) family structure/stability.

Finally, we took children with the 20 highest scores in the disadvantaged index whose mothers had experienced at least one depressive episode and the children with the 20 lowest scores on the index whose mothers had never experienced depression. Thus, boys who scored highest on this index (n = 20) lived in homes with high levels of poverty, high levels of family instability, harsh parenting, and maternal depression. Boys who scored lowest (n = 20) lived in affluent, stable families and were not exposed to either harsh parenting or maternal depression. We then assayed the children for Telomere Length.

So, from one perspective we can say they chose a bad genetic group and compared it with a good genetic group. From another perspective we can say they chose a bad luck group and compared it with a good luck group.

What steps did the authors take to compare these two perspectives? What steps did they take to distinguish, for example, between bad luck on the one hand and bad decisions on the other; between the slings and arrows of blind fate on the one hand and the natural consequences of damn fool decisions on the other?

As far as I can tell, none. They assume that all this bad stuff rains down on one group and that the other group is spared, but that in genetic terms both groups are identical, or close enough to warrant a comparison of the effects of these extraneous life events.

I looked up the telomere lengths and did a t test. For the harsh environment 9.6 (4.2) and for the nurturing environment 10.3 (2.5). 20 boys in each sample. Non-significant. Nowhere close. So, I presume it is only significant if you put together a model of variables to be controlled for, but otherwise not.
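For anyone who wants to repeat the sum, here is a minimal sketch in Python. It assumes that the figures in brackets are standard deviations and that the two groups are independent. It uses Welch's form of the t test, which does not assume equal variances; that choice is an assumption for illustration and not necessarily the variant used above, and the function name is made up.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    var_term1 = sd1 ** 2 / n1
    var_term2 = sd2 ** 2 / n2
    standard_error = math.sqrt(var_term1 + var_term2)
    t = (mean1 - mean2) / standard_error
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (var_term1 + var_term2) ** 2 / (
        var_term1 ** 2 / (n1 - 1) + var_term2 ** 2 / (n2 - 1)
    )
    return t, df

# Assumption: the bracketed values quoted above are standard deviations.
# Nurturing environment: 10.3 (2.5); harsh environment: 9.6 (4.2); 20 boys each.
t, df = welch_t_from_summary(10.3, 2.5, 20, 9.6, 4.2, 20)
print(f"t = {t:.2f}, df = {df:.1f}")
```

On these inputs t comes out at about 0.64 on roughly 31 degrees of freedom, far short of the value of about 2 needed for two-tailed significance at the 5% level.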

To my primitive eyes the sample size seems far too small to conclude anything much, and far too small for comparisons of individual gene effects. In addition, there is a highly questionable assumption that all that these boys inherited from their parents was an “environment”. Incidentally, so many fathers were untraceable in the “unlucky” group that fathers had to be left out of the whole study. A look at the genetics of the mothers would be a start. Measuring the telomere length of the mothers’ DNA might also be worthwhile.

Have I got it entirely wrong, and is there some innocent explanation I have missed?

I would like some assistance from anyone who can help me understand why this paper, which I think flawed, is considered suitable for publication by the editors of Nature.

Is it healthier to eat 7 vegetables or 7 scientists?


I wish no harm to the authors of “Fruit and vegetable consumption and all-cause, cancer and CVD mortality: analysis of Health Survey for England data”

J Epidemiol Community Health doi:10.1136/jech-2013-203500

whose recent publication has received approving coverage in the media. We are colleagues at the same godless institution, so they cannot be all bad. But (and you expect nothing less from me) I am not bowled over by their arguments about the benefits of vegetables. You will have caught the general tenor of my criticisms of this sort of work in “Diet is an IQ test”

Following Prof David Colquhoun, I joined him in quoting with approval a paper BMJ 2013;347:f6698 doi: 10.1136/bmj.f6698 (Published 14 November 2013) by John Ioannidis in whose train I ranted thus: “Samples of about 70,000 followed until death (with a proper link to death registers) will be required to identify even a few general patterns in diet which might account for a 5-10% increase in risk. If the studies are to mean anything, IQ, personality, sociological and occupational variables will have to enter the mix, and participants will probably have to be paid to stick to the course, and put up with random visits of inspectors looking in the fridge and the medicine cabinet.”

So imagine my pleasure, or alarm, when this paper turns out to have followed 65,226 persons drawn from a nationally representative sample for 7.7 years and visited them at home to find out what they had eaten yesterday (thus remarkably improving accuracy of their recall) and then linking the respondents to death registers. Rather disarming, isn’t it? The authors seem to have got good data without paying participants or raiding their refrigerators. The authors admit that the main limitation is that measurement of fruit and vegetable intake occurred at only one point in time and relies on self-report. There may be social desirability bias and random error (forgetting) in the recall of fruit and vegetable consumption. However, while short of perfect monitoring, this is a big step forwards. All this is very good, and shows epidemiology at its best.

Undaunted, I moved to the second half of my diatribe “IQ, personality, sociological and occupational variables will have to enter the mix”. Here I have found some things to complain about. Although they included sociological and occupational variables, they did not measure IQ or personality. Frankly, I don’t expect that of epidemiologists, because those measures are often neglected by psychologists anyway.

That aside, the authors carry on doing good things by offering us some old fashioned means and standard deviations, with participants categorized by the number of portions of fruit and vegetables they consume. These are the sorts of simple statistics I can understand. For example, the English eat slightly over 2 portions of fruit, and 1.5 portions of vegetables a day. The propaganda about vegetables has left them relatively unmoved.

Table 1 shows that those who do eat vegetables tend to have non-manual occupations, and the more vegetables they eat, the more likely they are to be in middle class occupations. Do vegetables make you rise in social class? Do bananas telephone? Do efficient compasses misdirect? (Can you spot the origin of the last two questions?)

Similarly, 7-vegetables-a-day types are much more likely to have a university degree than vegetable refusers, who tend to be less educated folk. Also, they are less likely to smoke and are more likely to be physically active. On the other hand, they are just as fat and almost as boozy as everyone else. Those who consumed more fruit and vegetables were generally older, less likely to smoke and more likely to be women, in a non-manual household, with degree level education. Veggie Mummies, yah?

Finally, when it comes to deaths during the study period, here’s the crunch: overall, 6.7% of the sample died during the study period of 7.7 years. The sad fact is that if you are 57 years old you have a 6.7% chance of being dead by the time you are 65 years old. (Or would have been 65, for pedants). Those who eat no vegetables have an 8.2% chance of death, the One to Three vegs a day 7.9%, the Three to Five vegs a day 6.4%, the Five to Seven 5.3% and the Seven Plus vegetables only 4.1%. So, although your chance of dying is relatively low, you can make it even lower by feasting on vegetables.

At first glance, the avid vegetable eaters have half the death rate of the no vegetable eaters. It suggests that vegetables are the cause of the difference. However, it could be that vegetables have nothing to do with it.
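To make the "half the death rate" arithmetic explicit, here is a minimal sketch using only the crude percentages quoted above. The group labels and the function name are illustrative choices, and nothing here is adjusted for age, sex or anything else, so it describes the raw association only.

```python
# Crude death rates (%) over the follow-up period, as quoted in the text above.
death_rate_pct = {
    "none": 8.2,
    "1 to 3": 7.9,
    "3 to 5": 6.4,
    "5 to 7": 5.3,
    "7 plus": 4.1,
}

def crude_comparison(rates, reference="none"):
    """Map each group to (risk ratio, absolute difference in percentage points)
    relative to the reference group."""
    ref = rates[reference]
    return {
        group: (round(rate / ref, 2), round(rate - ref, 1))
        for group, rate in rates.items()
    }

for group, (ratio, diff) in crude_comparison(death_rate_pct).items():
    print(f"{group:>7}: risk ratio {ratio:.2f}, difference {diff:+.1f} points")
```

The seven-plus group has a crude risk ratio of 0.5 against the no-vegetable group, which is the "half" in question, but the absolute gap is only 4.1 percentage points over the whole follow-up period.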

In table 2 they offer a “fully adjusted” Model 1: Adjusted for sex, age-group, cigarette smoking and social class; and the even more adjusted Model 2: Adjusted for sex, age-group, cigarette smoking, social class, BMI, education, physical activity and alcohol intake. Of course, as sharp eyed readers you will note that they do not offer a Model 0: adjusting for sex and age, the only things which are truly not controllable by individuals. That is a pity.

In table 2 they use hazard ratios, where eating no vegetables (the highest apparent risk category) is set to 1 and, among the other conditions, eating lots of vegetables rates as 0.69. This certainly shows the differences with increasing consumption of vegetables, but no longer reveals absolute risk. I prefer table 1. In fact, I would have liked to have seen a correlation matrix. I can read those. I concede that such a matrix would not reveal covariance, but it would allow me to begin to think about the associations between the variables. One or two plots of data would also have helped. In my usual ferreting mode I had a look at the supplementary data.


At about 120 months the fruit effect dies out for some, probably artefactual, reason.

In both these adjusted models and in other variations the effect of vegetable consumption continues to be significant. They go into further detail about vegetables (good) and fruit (slightly less efficacious in keeping you alive) and note that canned fruit seems to slightly increase mortality, probably because of the sugary syrup in which they float.

The authors have bundled together factors that none of us can control like our age and sex, with factors we can control like how long we stay in education and the sort of work we do; with factors we can and probably ought to control like how much we eat and drink. All those different categories are “controlled for”. Some mistake, surely? I can understand the “control” for age. Older people are more likely to die in any time period than younger people. However, if I chose to become a university teacher, why “control” for that choice? I took up that occupation precisely because I thought it would be agreeable, if not well paid, and that I would be highly unlikely to suffer industrial accidents. My choice, plus my ability to get such undemanding light labour against, frankly, rather sparse competition, reveals something about me. It may explain my willingness to follow health advice, or it may simply be that I am a cautious man, minimising my risks in my personal and occupational life. A simple fearfulness of character could explain all the associations.

Consider the adjustments. These are based on the assumption that the cigarette smoking, social class, BMI, education, physical activity and alcohol intake are not related to something which itself has an influence on health. They are seen as imposed external factors which can influence health, rather than a series of behaviours related to an intrinsic factor: system integrity. System integrity is a hypothesized intrinsic characteristic which gives you a good body and a good mind, such that you are healthy and intelligent. This may be related to your genetics and/or a favourable beginning in utero. The one give-away sign of system integrity is fast reaction times to simple stimuli. See the Edinburgh group under Ian Deary for all these findings.

Seen this way, the intelligent live longer and healthier lives not because they are wise, but because they are lucky. They eat vegetables because it seems to be the clever thing to do from a health point of view, and perhaps because they can work out that the need for protein from meat is relatively small, so vegetables are more cost-effective. They may even like the taste of them. They also wear seatbelts, use condoms, brush their teeth, don’t smoke, go for walks, don’t eat or drink too much, study hard, strive to get good jobs and always save money.

The conclusion of this study is that we should eat our vegetables, and 7 portions rather than only 5. Perhaps so. It is still possible simply that bright people live longer, even when they are slightly plump and somewhat boozy. No, my gripe is about the way they have interpreted the findings, and the assumptions which underlie their calculations of hazard ratios. The authors make it clear that “This study has found a strong association, but not necessarily a causal relationship. There are additional unmeasured confounders not included in the analyses, including other aspects of diet.” However, they go on to mention other dietary factors, not the psychological ones.

Vegetables may be good for you. But I have been assured that scientists make a most delicious, nourishing, and wholesome food, whether stewed, roasted, baked, or boiled; and I make no doubt that it will equally serve in a fricassee or a ragout.

Monday, 31 March 2014

The mind’s construction in a face

Of course, this is only a bit of fun, but presumably you can tell how bright a person is just by looking at them? Such confident judgments are anathema to proper clinical psychologists, who would rather spend an hour giving a Wechsler intelligence test than stoop to such populist nonsense.

Now Karel Kleisner, Veronika Chvátalová, and Jaroslav Flegr have decided to put this silly stereotype to the test in a PLOS One paper “Perceived Intelligence Is Associated with Measured Intelligence in Men but Not Women” and find it not so silly, at least as far as men’s faces are concerned. (As my readers already know, a stereotype is an insight waiting to be proved.) Perhaps the girls are so exclusively judged on prettiness that their intellectual countenances are ignored, whilst boys’ faces can be judged for both intellectual and sexual purposes.

Let us get the criticisms in quickly. The sample size is small (n=160), and more importantly the faces are from university students and the raters are also university students (mean IQ 125 sd 17). This could be a case of bright people recognising other bright people. There is a restriction of range problem, and the authors should try a representative sample of faces and raters, and are likely to get better results.

I presume no-one was photographed with glasses on, though they “avoided cosmetics, jewellery and other decorations”. On the plus side, they have published their entire data set.

In general, I like the way the authors have presented this paper. They admit that their method of analysing the composition of faces shows no relation to measured IQ, yet that there must be something about the pictures of the men’s faces which allows the positive predictions of their intelligence to be generated. The authors say that this must be due to a cultural stereotype. Weak argument. Where on earth would such a “stereotype” come from? If cultural stereotypes mean anything they would be random, and have low predictive power. This reliance on the notion of “cultural stereotype” is a crucial misunderstanding on their part, because it does not explain how a correct stereotype comes about, other than by someone noticing something which is true.

Their line of best quadratic fit I found something of a disappointment. Above IQ 140 the strength of the prediction falls considerably, and these paragons of intellect are seen as pretty stupid. In statistical terms these outliers are freaks, so in evolutionary terms it might not be worth detecting them. Or they carry so much mutation load that they look awful.

In both sexes, a narrower face with a thinner chin and a larger prolonged nose characterizes the predicted stereotype of high-intelligence, while a rather oval and broader face with a massive chin and a smallish nose characterizes the prediction of low-intelligence.

Do you have a gracile face? For once in my life my larger nose seems to be a benefit in generating a positive stereotype about me.  Do you look like the clever person you actually are, deep inside? If you wish to comment, please append a photograph. If you are over IQ 140 you may omit the photo.

Rock and Roll


Lest it be thought I lead too quiet a life dwelling on the minutiae  of psychometry, I spent most of yesterday partying with rockers in a secret London location as the guest of Richard Thompson OBE (no relation) whose gig at the Half Moon on 4 April has sold out in 10 minutes. Since Fairport Convention he has achieved fame as a solo artist and has an extensive and extremely loyal fan base. His songs have been covered by everyone in the business, and I have witnessed adoring fans standing in the pouring rain at Fairport’s Cropredy Convention in Oxfordshire, hanging on every word and chord.

In a respectful obeisance to rock tradition, he wore blue suede shoes (“Crepe, actually” the great man said).


[Photograph: 2014-03-27 16.09.13]


Also there was Jo “Nashville Rock with an English Accent” Burt who played with Black Sabbath, and whose wife and backing singer Antonia talked about their new album, suggesting “The Mess” as the track which would be of most psychological interest to my readers. The picture shows Antonia, with Richard Thompson in the background and Jo Burt, looking away to his left.


[Photograph: 2014-03-27 15.18.39]

Third up, a new duo launching their latest album next month, but once again for some reason my photograph shows the female part of the combination with somewhat greater emphasis than her truncated male partner.


[Photograph: 2014-03-27 14.40.00]

Following rock tradition, the rest of the party becomes a bit hazy, so the names have become somewhat jumbled, and I have run out of links. I am told that all of us danced to the classic tracks. There was a whole lot of shaking going on. You will have to help me recall the rest of it. Meet on the ledge.

Friday, 28 March 2014

#MH370 Reincarnation and sea junk

It would appear that, despite collecting data for several decades, we do not have baseline estimates for sea junk per area of ocean. Our watery world is crisscrossed by a conveyor belt of ships carrying container loads of materials, a portion of which fall into the water, joining the rubbish deliberately thrown overboard from ships and the rubbish which makes its way into the oceans from the stuff we throw into rivers and leave on beaches.

Baseline measures aren’t sexy. One unintended consequence of the search for flight MH370 is that we will have learnt that even the far reaches of the southern Indian Ocean, deep in the roaring forties, have generous scatterings of man-made rubbish. Perhaps we will even be able to quantify this in terms of number of discriminable man-made objects per thousand square miles.

Note that, if the number is very low (and the more appropriate measure turns out to be objects per ten thousand square miles, or even a hundred thousand square miles, which is a little over the size of the United Kingdom) this would strengthen the significance of finding any object floating in the ocean. Signal detection would become a little easier. We could argue, as the Malaysian government officials have done (they are not having a good time, are they?) that floating junk means floating plane junk. Find some junk and the plane will be on a sea bed somewhere upstream from the sea currents, if those can be calculated with any degree of precision.

On the other hand, if the number of floating objects is high, then the task becomes even more difficult, and pushes us towards the next problem: can crashed plane junk be discriminated by satellite or observer plane from all the other junk or do we have to rely on retrieval by ship of every likely floating object?
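A rough sketch of why the baseline matters, in Python. Every number here is invented for illustration (the count of floating plane fragments, the junk densities, the size of the search box), and the function name is mine; the only assumption is that every floating object is equally likely to be spotted.

```python
def share_from_plane(plane_pieces, junk_per_1000_sq_miles, area_sq_miles):
    """Chance that one spotted object came from the plane, assuming every
    floating object in the area is equally likely to be spotted."""
    background = junk_per_1000_sq_miles * area_sq_miles / 1000.0
    return plane_pieces / (plane_pieces + background)


# Hypothetical inputs: 50 floating plane fragments in a 100,000 sq mile box.
clean_sea = share_from_plane(50, 0.1, 100_000)   # 10 background objects
dirty_sea = share_from_plane(50, 10.0, 100_000)  # 1,000 background objects
print(f"clean sea: {clean_sea:.2f}, dirty sea: {dirty_sea:.2f}")
```

With these made-up figures a spotted object is very probably plane debris in the clean sea and very probably not in the dirty one, which is the whole argument for wanting the baseline.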

This question came into my mind a few minutes ago, when the revered BBC website displayed a picture taken by a journalist from a New Zealand plane showing a white floating rectangle. “I am no expert”, as people say in the Twitter-sphere (before launching into an elaborate speculation), but it is not immediately apparent to me how this object potentially relates to a crashed airliner. It is very probably nothing to do with the skin of the plane, nor does it look like any inner section, or any type of cargo. However, it is man-made, and floating.

So, how are our probability estimates looking at the moment? It seems that the range of the aircraft is fuzzier than previously disclosed. The plane was travelling faster than previously envisaged, thus burning more fuel, and therefore travelling less deeply into the southern wastes. If one plots out the error arc of the Inmarsat calculations, and the error range of the speed and fuel calculations, quite a chunk of ocean remains in contention. (I do not know how much, and wonder if anyone else does.)

So, which way would you gamble, using Bayesian techniques? Three main components to be factored in are as follows: 1) area to be searched (ranging from the highly probable to the less probable) 2) the time left before the black box pinger battery gives up, and 3) search efficiency.

My rough calculations would be that: the search area remains too large; the pinger will fade to almost nothing in another 15 days (though cold water might extend that time); and search efficiency is extremely low. This last point was well studied by the mathematician Bernard Osgood Koopman, who wrote the first proper handbook for searches at sea in the 1940s. Looking at the sea is boring; you cannot scan the whole area, so it is best to look a little down from the horizon; you should change places every 15 minutes to lift your alertness; you should start with probable places and move to slightly less probable places but ignore possible but improbable places (success is unlikely when the probability of a target is low, and visual search is inefficient); and no plane can fly for ever.

My bet would be that, absent any more refinement in the calculation of impact location and subsequent drift, the searcher must gamble, and should maximise the area that can be searched. Concentrating on the area closer to Perth maximises the proportion of the target area that can be searched. Look where the light is brightest, particularly when the light is about to go out.
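For those who like to see the gamble written down, here is a minimal Python sketch of two textbook search-theory ideas: the exponential detection law for a random search, usually credited to Koopman, and the Bayesian update of location probabilities after a search finds nothing. The cell probabilities and the detection probability are invented for illustration, and the function names are mine.

```python
import math


def random_search_detection(sweep_width, track_length, area):
    """Probability of detection for a random search: 1 - exp(-W * L / A)."""
    return 1.0 - math.exp(-sweep_width * track_length / area)


def update_after_miss(priors, searched, p_detect):
    """Bayes update of cell probabilities after searching one cell and
    finding nothing. p_detect is the chance of spotting the target when
    it really is in the searched cell."""
    p_miss = 1.0 - priors[searched] * p_detect
    return [
        (p * (1.0 - p_detect) if i == searched else p) / p_miss
        for i, p in enumerate(priors)
    ]


# Hypothetical: three search cells, the first thought most probable.
priors = [0.5, 0.3, 0.2]
posteriors = update_after_miss(priors, searched=0, p_detect=0.4)
print(posteriors)
```

An empty-handed search of the favoured cell lowers its probability and raises the others, but with an inefficient search (low `p_detect`) it lowers it only a little, which is why poor search efficiency is so costly when the pinger clock is running.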

And finally, a word about reincarnation. About forty years ago I read somewhere, possibly in the Pali canon of sayings of the Buddha or a commentary, a line about the chance of someone being born without having been reincarnated. The chance was rated as being “as low as the chance that a turtle that rises to the surface once in a thousand years will put its head through a life belt cast upon the Indian Ocean”.

Can someone look it up for me? I am busy searching for a missing plane.

Tuesday, 25 March 2014

#MH370 : public and private understanding


Sitting in TV studio waiting rooms is a good way to meet and listen to experts with technical knowledge. Some days ago I had referred to the missing Malaysian airliner as posing us the ultimate IQ test. It now seems the test was solved in a few days, at least as regards probable location, if not probable motivation.

The Inmarsat story is a very interesting one, and is slowly being disclosed. As of this early morning the account was that there were 7 hourly “pings” to serve as the data points. Now it turns out that there was an 8th incomplete ping following shortly after the 7th, about 10 minutes later, and not at the usual hourly interval. This may have been an “exception report” coinciding with the moment of impact.

The early story was that, given these 7 hourly pings, Inmarsat was able to work out very quickly that they were consistent with transmissions coming from somewhere on an arc running North to South roughly from the point of the last radar contact with the plane. The presumption was that the satellite in geostationary orbit could calculate a possible arc from which the transmissions might have been sent, but no more than that. The increasing delays of transmission from plane to satellite might have been due to the plane travelling north or equally, south. How to resolve this directional issue?

Here is the Malaysian government’s explanation as to how Inmarsat did this:

In recent days Inmarsat developed an innovative technique which considers the velocity of the aircraft relative to the satellite. Depending on this relative movement, the frequency received and transmitted will differ from its normal value, in much the same way that the sound of a passing car changes as it approaches and passes by. This is called the Doppler effect. The Inmarsat technique analyses the difference between the frequency that the ground station expects to receive and that actually measured. This difference is the result of the Doppler effect and is known as the Burst Frequency Offset.

The Burst Frequency Offset changes depending on the location of the aircraft on an arc of possible positions, its direction of travel, and its speed. In order to establish confidence in its theory, Inmarsat checked its predictions using information obtained from six other B777 aircraft flying on the same day in various directions. There was good agreement.

While on the ground at Kuala Lumpur airport, and during the early stage of the flight, MH370 transmitted several messages. At this stage the location of the aircraft and the satellite were known, so it was possible to calculate system characteristics for the aircraft, satellite, and ground station.

During the flight the ground station logged the transmitted and received pulse frequencies at each handshake. Knowing the system characteristics and position of the satellite it was possible, considering aircraft performance, to determine where on each arc the calculated burst frequency offset fit best.

The analysis showed poor correlation with the Northern corridor, but good correlation with the Southern corridor, and depending on the ground speed of the aircraft it was then possible to estimate positions at 0011 UTC, at which the last complete handshake took place.

Burst frequency analysis is apparently well known, but to make offset calculations on Doppler-like effects so as to infer location is an innovation. It would seem that when tested on real plane data these Doppler effect calculations matched the Southern arc better than the Northern arc. I still need to go through some further steps of understanding, but it seems very neat work, done very quickly. Once the most likely corridor of the flight path had been worked out, then calculations could be made on the fuel range of the aircraft so as to plot a likely impact location where both calculated ranges intersected. Dropping sonar buoys in that area might pick up the last feeble transmissions from the black box. Finding debris will be another matter, and finding the wreck itself with the much prized black box even more problematical, and possibly not feasible.
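To get a feel for the size of the effect, here is a back-of-envelope Python sketch of the ordinary (non-relativistic) Doppler shift. It is emphatically not Inmarsat's method, which also has to account for satellite motion and the system characteristics mentioned above. The carrier frequency of about 1.6 GHz and the radial speed of 250 m/s are my assumptions, chosen only to show the order of magnitude.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def doppler_shift(carrier_hz, radial_speed_ms):
    """Approximate frequency shift seen by a receiver.
    radial_speed_ms is positive when transmitter and receiver are closing,
    negative when they are moving apart."""
    return carrier_hz * radial_speed_ms / SPEED_OF_LIGHT


# Assumed values, for illustration only.
carrier = 1.6e9        # roughly L-band, in hertz
closing_speed = 250.0  # metres per second along the line to the satellite

print(f"approaching: {doppler_shift(carrier, closing_speed):+.0f} Hz")
print(f"receding:    {doppler_shift(carrier, -closing_speed):+.0f} Hz")
```

On these assumptions the shift is a little over a kilohertz either way, with the sign depending on whether the plane is closing on the satellite or receding from it. That sign and size, compared handshake by handshake, is the kind of clue that can favour one corridor over the other.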

Real time reporting of black box type data from aircraft is likely to be made mandatory, and eventually the black box will be as redundant as a library.

What these calculations show is that a minority can deal with probabilistic hypotheses based on statistical and scientific concepts, and usually those calculations are relatively private; and a majority would like to see publicly testable, absolutely tangible proofs, in the form of bodies and wreckage.

The private discussion has to be impersonal, detached, open to improvement and criticism, and rigorous when searching for errors. Reportedly, Inmarsat researchers eventually realised that a satellite in geostationary orbit is not in fact totally stationary, and by correcting for some drift were able to refine their location estimates. According to some accounts it was this correction which revealed the southern arc as the stronger hypothesis.

The grieving relatives want certainty, and want it in public. Scientists can only offer probability estimates, with the detailed results in private, or in that strange space, academic publication, where you have to know quite a bit about the subject in order to evaluate the paper, and can only discuss it with a few other people.

To cap it all, also not disclosed, quite properly, are the ways in which each airline deals with hijackers/terrorists. For example, it would not be good security to have it generally known how airlines intend to deal with hijackers, what distress codes they have put in place, how the cabin crew communicate with the pilots, and whether or not they can open the locked door to the cockpit, let alone why jets were not scrambled to shoot down the missing plane. Knowing this would aid public understanding. Keeping it private would assist security.

The Malaysian authorities released the wrong message. They announced certainty without even a shred of a wet paper napkin with the Malaysia Airlines logo on it. Private calculation understood by the few trumping the public bewilderment of the many.

They should have said something like: “The search continues for the plane and the passengers. All indications are that it came down in the far south of the Indian Ocean. It is very unlikely that there are any survivors. We are still searching for wreckage of the aircraft”.

Although it is unlikely now, one hopes that one day the relatives may rest in peace, free of the anxious torment of sweet dreams cruelly dashed.


Sunday, 23 March 2014

#MH370 and Goodness of Fit

Goodness of fit is an alluring concept. First, it is a good name for a statistical procedure, suggesting that one is on the side of the angels, and above all the devilish tricks of statisticians. Second, it describes how well a set of observations fits a theoretical model. The smaller the discrepancy between the observed values and the expected, model values, the better the fit.

Question is, fit with what? Chi square simply tells you the extent to which a particular frequency of observed values fits what would be expected from, usually, a chance model. It depends on a model, which in turn depends on a set of assumptions. In simple cases the chance model is fine. In more complicated cases the expected frequencies are somewhat harder to calculate.
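For the record, the chi square statistic itself is nothing more than a sum of squared discrepancies, each scaled by what was expected. A minimal Python sketch, using invented counts from sixty throws of a die against the chance model of ten per face:

```python
def chi_square(observed, expected):
    """Pearson's chi square statistic: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 throws of a die, chance model expects 10 per face.
observed = [8, 12, 10, 9, 11, 10]
expected = [10] * 6
print(chi_square(observed, expected))
```

The smaller the number, the better the fit to the model. Turning it into a probability requires the degrees of freedom (here five) and a chi square table or a statistics library, which this sketch leaves out.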

However, that is by no means the main problem. Non-statisticians use a different heuristic, and count the number of points of concordance between a narrative and a set of observations. It is goodness of fit only in the sense of a comfortable and convincing similarity. Chi square it ain’t.

For example, in buying a car one might make a list of desirable characteristics, and then measure the extent to which each car “ticks the right boxes”. Car manufacturers know this, so they construct the list for you, and then reveal the perfect fit: “Our car has doors, and you wanted doors, so 1 point to us” and so on. Fitting the facts to a narrative often follows a similar, self-serving, confirmation bias. People tend to count the points of concordance without looking at individual probabilities.

In trying to make sense of the mysterious loss of Malaysia Airlines MH370 many people want to start with the narrative. In the well known example of a simple explanation: if there had been a fire on board, and if that fire damaged communication systems, and if the pilots had set a course to the nearest safe airport, then they might have been overcome by fumes and carried on flying in the same automatic Westerly direction until either the fire consumed the plane, or the fuel ran out. Most convincingly, the map of the safe airport showed it had the characteristics the narrative had required: an approach over water to a particularly long airstrip on which you would wish to land if you were on fire. Spooky.

Of course, it is better to assemble the facts first, and display them with their error terms. Some of those basic facts, and the error terms, have been difficult to track down. Not all of them. Inmarsat plotted two possible trajectories, with associated error bands, which appear to give objective guidance, even though they are based on very new types of inference. More confusing were the timelines of events, and now even the cockpit to control tower conversations, some of which have reportedly been challenged because of translation errors. The Malaysian Government read out an urgent note about apparent debris spotted by a Chinese satellite, but gave very large dimensions which turned out to be wrong.

The Bayesian approach would be to look at each of the assumptions in terms of probabilities, and then establish confidence limits for those probability estimates. The chain of assumptions contains some that can vary semi-independently, and others which appear to cancel each other out. If a plane has crashed somewhere, it is not likely to keep “pinging” in a way that can be received by an antenna on a satellite, since the capacity to “ping” depends on having an intact system with a power supply.

Equally, if a camera on a satellite shows something floating on the sea (or just under the surface) then the significance of that firstly depends upon the error rate of interpretation (that the signal relates to something solid, and not a pattern of waves) and more importantly, a probabilistic judgment about whether the something is likely to be part of a plane or a container, pallet, or glutinous conglomeration of pallets, plastic bags, bottles, trainers and yellow plastic ducks (all of which have polluted the oceans and improved our detailed knowledge of sea currents).

Doctors face these sorts of dilemmas every time they cannot reach a diagnosis. They generally ask for more tests, which may illuminate or delay the decision. Privately, they often use a frequency table, on the basis that frequent things occur frequently, and that is the best and most defensible guide to action.

Consider the following data from Flight Global showing why planes have been lost during level flight, which is usually the safest part of a plane journey. Sabotage 13, Loss of Control 8, Airframe 8, Explosion or Fire 4, Collision 4, Hijack 2, Ditching 1, Power Loss 1, Shot down 1, Unknown 4 (includes MH370).

In terms of prior probability you would go for sabotage as the primary suspect, followed by loss of control or airframe. Given that no debris was found at the point of last transmission, or nearby, that means that there was no bomb, no loss of control, no airframe disintegration, no explosion, probably no fire, no collision (pretty sure of that), no ditching, power loss or shot down (unless there is one hell of a cover up).

Looks like hijack, in the sense of hijack by pilot for reasons unknown. However, out of 45 planes falling down from level flight before this one, hijack accounted for 2 and unknown for 3. So, looks like unknown, possibly hijack. Time to send some satellites to look at the debris to the West of the Maldives, just as a control case.
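The frequency-table reasoning can be set out in a few lines of Python. The counts are the Flight Global figures quoted above with MH370 itself removed from the unknowns; treating the absence of debris as ruling out every cause except hijack and unknown is the argument made above, not an established fact.

```python
# Losses during level flight before MH370 (Unknown is 4 minus MH370 itself).
losses = {
    "Sabotage": 13, "Loss of control": 8, "Airframe": 8,
    "Explosion or fire": 4, "Collision": 4, "Hijack": 2,
    "Ditching": 1, "Power loss": 1, "Shot down": 1, "Unknown": 3,
}


def normalise(counts):
    """Turn raw counts into proportions that sum to one."""
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


priors = normalise(losses)

# Causes not ruled out by the lack of debris near the last transmission.
surviving = {cause: losses[cause] for cause in ("Hijack", "Unknown")}
posteriors = normalise(surviving)

print(f"prior for sabotage: {priors['Sabotage']:.2f}")
print(f"after elimination:  {posteriors}")
```

Once the debris-producing causes are struck out, the remaining weight splits three to two in favour of unknown over hijack, on a base of just five cases, which is a reminder of how thin the evidence is.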

Finally, at what level of cost will governments begin to lose interest? I predict by the 35th day after the disappearance, when the black box pinger stops, everything will be scaled back, and the searchers will return to the statistics lab, until a bit of the tailplane shows up years later in a fishing net.

Wednesday, 19 March 2014

MH370 Intelligence to the rescue?


In the exercise of their intelligence, a large number of citizens have decided to try to find the missing Malaysia Airlines MH370 flight. According to the Wisdom of Crowds, if a sufficiently large number of people guess the longitude and latitude of this vehicle and its unfortunate occupants, then we should be able to refine our search of the vast immensities of the world that conceivably could have been covered by this fully fuelled jet.

The wisdom of crowds, as you may remember, posits that, because the pooled estimates generated by individuals guessing the number of beans in a glass jar are often close to the real number of beans, there is a mysterious force guiding us to the correct solution, sadly hidden from individuals. Of course, readers of this journal may suspect that averaging estimates on tasks with low intellectual content can sometimes reduce error terms, particularly if you omit outliers, but this is no time to get nasty with authors of popular books.
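The unmysterious version can be shown with a trimmed mean, sketched here in Python. The guesses are invented, and dropping one value from each end is an arbitrary choice made for illustration.

```python
from statistics import mean


def trimmed_mean(guesses, drop_each_end):
    """Average the guesses after discarding the lowest and highest few."""
    ordered = sorted(guesses)
    kept = ordered[drop_each_end:len(ordered) - drop_each_end]
    if not kept:
        raise ValueError("nothing left to average after trimming")
    return mean(kept)


# Hypothetical guesses at a jar that holds 500 beans.
guesses = [410, 450, 480, 500, 520, 560, 2000]
print(f"plain mean:   {mean(guesses):.0f}")
print(f"trimmed mean: {trimmed_mean(guesses, 1):.0f}")
```

One wild guess drags the plain average far from the truth; dropping the extremes brings it back. No mysterious force is required, only errors that partly cancel.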

Typically, humans being humans, these disparate citizens have eschewed the control condition of just guessing the location, as the jar of beans example requires. They have taken to poring over maps and searching satellite photos and referring to Google Earth. We are all searching, each in our own very special way.

I am in favour of speculation, and would not wish to interfere with anyone’s hobby, even if it involves flight simulators. There has been little need of heavy labour since the domestication of wheat, so people have to find things to do, myself included.

Better, this understandable wish to help solve a mystery has brought in some experienced pilots, with much to add in explaining the Pilot Point of View. Naturally pilots tend to stick up for pilots, but plane manufacturers stick up for plane manufacturers, and governments….. you get the drift. Chris Goodfellow, pilot, has put forward an interesting speculation, which is that a fire in the aircraft forced the pilots to change course to a nearby airport. The pilots, unknown to them, had already lost communication links, but set the autopilot toward the nearest and most suitable airport. Overcome with fumes and smoke the crew collapsed, but the plane continued in a straight course on autopilot until it ran out of fuel, somewhere just West of the Maldives, which is the area which should be searched, in his opinion.

There is much to like in the account. First, it is written by a pilot. They get to wear impressive uniforms. Second, the hypothesis is testable, in that it gives a location (roughly). Third, it is testable in that it suggests that, should the plane ever be found, fire damage will be found in some of the systems. Fourth, testable in that every body will probably be still in its seat, asphyxiated. There will be no piles of bodies relating to crowds trying to break down the cockpit door. Fifthly, the black box will show one change of course, and nothing else. Sixthly, it teaches us something about pilots, and they are worth learning about.

Let me describe this in a little more detail. As an 11 year old I spent a portion of every airliner flight in the cockpit. I did at least one landing in the company of my younger brother standing next to pilots as they landed at Carrasco airport, Montevideo. I recall that, in a late burst of health and safety awareness, in the final stages of the approach one pilot muttered as he coaxed the bucketing DC3 downward “Grab on to something”. One learnt a lot in those days.

In later years I always talked to the pilot. On Concorde the pilots concluded their 10 minute explanations about the controls, instrumentation, flying characteristics, thermal properties of aluminium alloys, and the intricacies of altering the centre of gravity to vary the angle of attack by ending on a studied, laconic note, describing what was one of the world’s fastest-ever aircraft thus: “It’s a good bit of kit”.

More ordinary planes had pilots who sometimes lamented their reduced condition. On a long flight to Tokyo one said to me “I am not a pilot, I am a systems engineer. Systems monitor, in fact”. He showed me how he was pumping fuel into the wingtip tanks to reduce stresses on the wings. He also showed me how he was monitoring “every airport that can take us”. This had not been a Latin American concern decades before. Sure enough, every suitable airport was on the moving map, with coordinates and runway characteristics available should we need to land quickly. A very worthwhile precaution.

By the way, most of these cockpit visits, though they linger in my mind, were very short. Pilots kept working, and some of the time I just stood there in silence, watching. The later flights were all in the hijacking era, but not in the hijack and suicide era. Different times.

However, there is a problem with all these attempted explanations. We are all relying on the belief that we know the actual sequence of events in the first hour or so of the flight. I suppose it is conceivable that a fire somewhere in the airframe should have eaten its way through one automatic reporting system while leaving the radio intact. The timing of the loss of systems becomes absolutely critical in this analysis, as indeed in all of the analyses. Is there a sequence of events which is agreed, and trustworthy? Not quite.

Last contact was said to be at 1.30 am local time, and a nearby plane, contacting MH370 just after that to remind it to call into Vietnamese air control only heard static and some mumbling. However, as of today, the revised timeline is apparently as follows.

1:07 - ACARS ping (from automated system)
1:19 - “All right” (said by one of the pilots)
1:22 - Transponder quits (quits, which doesn’t mean it was switched off deliberately)
1:37 - no ACARS ping (just that, no ping, make of that what you will).

So, some of the times have changed, and the sequence looks different. However, until the precise sequence is confirmed speculation will only lead to infinite confusion. If a transponder quits, then that is all one can say, unless one has a means of distinguishing between deliberate and accidental causes of a transponder quitting.

It looks as if we have a communications problem somewhere in Malaysia. Fronted by the government minister for Defence and for Transport Hishammuddin Hussein, by background a lawyer from a political family, the authorities seem to be juggling their own national data (from radar); with the data reported or not reported from the radar systems in other countries; with the insights from Boeing; with the insights from American investigators, including the FBI and intelligence agencies; with insights from French investigators (see Air France 447); with data from Inmarsat; with the understandable wish to look good in the eyes of their electorate and the wider world. This cautious lawyerly approach need not be sinister, and can of itself avoid fanning rumour, but in conjunction with technical matters which need explanation it seems to have led to some avoidable confusion. Also, all the above groups sometimes give their opinions “off the record” to trusted journalists. Confusion squared.

The arrival of bereaved and angry relatives was to be expected. They are grieving without bodies, always very hard, and coping with a confusing story which keeps their hopes alive and makes their rage at fate turn into rage at these halting communicators of a dreadful mystery, which requires politicians to evaluate and communicate some very technical scientific details, of which probability, inference, and error terms are an important part.

Meanwhile, in the light relief section of the Press, The Blonde has surfaced in Australia, saying that she had no idea that spending the entire flight with another blonde friend in the cockpit chatting with the male pilots constituted any sort of procedural risk.

It may be time for everyone to get out of the way and read Arthur Conan Doyle’s short story “The Lost Special” about a train that disappears. Here is the link:

In that little story is the line which has become very well known: “It is one of the elementary principles of practical reasoning that when the impossible has been eliminated the residuum, HOWEVER IMPROBABLE, must contain the truth.”

Let’s hope those poring over the raw data will be able to eliminate the impossible, and then explore the residuum.