Psychological comments: June 2014

Friday, 27 June 2014

Measurement errors

One popular criticism of intelligence testing is that scores could be affected by motivation and levels of practice. By implication, those who are not motivated to take the test will do badly and will be unfairly judged, to the detriment of any society which uses intelligence test results as a ticket of admission to education or employment. By further implication, such lack of motivation may apply most strongly to those who are poorer and most dispirited for other reasons.

Test administrators know all that, and make sure that subjects understand the test, take and pass the practice items, and are encouraged before, during and after each test (guided by protocols as to what help and encouragement is permissible) and that at least 6 months elapse between face to face testing sessions, and that alternate forms are used if testing has to be conducted sooner. Hence psychometric reports talk about the person’s level of engagement, the amount of effort they show, and the specific problems they may have encountered. If there are significant problems the results are either set aside or labelled as under-estimates, and further testing carried out later usually resolves the issue. Monitoring is easier in face to face testing, but item analysis gives some insight into lack of effort in group tests. Group tests often have more practice items and care is taken to provide good quality test settings. By following all these procedures practice effects and motivational differences are reduced, but not eliminated entirely. It is still possible that some low results may be due to low motivation, and also that some high results might be due to lucky guessing. How big could these effects be?

Assume for a moment that motivational and practice effects have an influence, and that to the true low scores of less able people must be added the false low scores of those who found the test boring, pointless, and not worth bothering about. People like me, for example. I would rather watch clothes dry on a cloudy day than take most intelligence tests.

If that were true, IQs would under-predict real life successes in things which were intrinsically interesting: getting good qualifications so as to get on in life, making money, and becoming famous.

If motivation were a major confounder, then correlations between IQ scores and real life scores would be low. However, IQ and real life are strongly correlated. For example, the largest recent study (Deary et al., 2007) of over 70,000 English children found correlations of r=0.81 between general intelligence measured at 11 years of age and GCSE scores at age 16. This is extremely high predictive power (accounting for about 66% of the variance). The colossal sample size gives us exceptional confidence in the robustness of the results. By way of comparison, most educational psychology publications have sample sizes of a few hundred, and are far less robust. As further proof of the common sense view that intelligence is involved in academic achievement, we can be even more precise about the impact of intelligence on different subjects. IQ scores on their own accounted for 58.6% of the variance in Mathematics results, 48% in English and down to 18.1% in Art and Design, that subject being the least intellectually demanding (Deary et al., 2007).

I. J. Deary, S. Strand, P. Smith and C. Fernandes (2007) Intelligence and educational achievement. Intelligence 35, 1, pp13-21. (For private study, email the author at the University of Edinburgh and ask for a copy).
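The link between a correlation and "variance accounted for" is simply the square of the correlation. The sketch below is an illustration rather than anything taken from the paper: it squares the reported r = 0.81, and takes square roots of the subject percentages quoted above to show the correlations they would imply, on the assumption that each percentage is the r-squared of a single-predictor linear model. The function names are made up for this example.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables correlated at r."""
    return r ** 2


def implied_correlation(share: float) -> float:
    """Correlation implied by a proportion of variance explained.

    Assumes a single-predictor linear model, so that share == r squared.
    """
    return math.sqrt(share)


# Overall figure quoted in the post.
print(f"r = 0.81 -> {variance_explained(0.81):.1%} of variance")

# Subject-level shares quoted in the post (proportions of variance).
subject_shares = {"Mathematics": 0.586, "English": 0.48, "Art and Design": 0.181}
for subject, share in subject_shares.items():
    print(f"{subject}: {share:.1%} of variance -> r of about {implied_correlation(share):.2f}")
```

Read this way, even the weakest subject relationship corresponds to a correlation that would count as substantial in most of psychology.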

Problems of motivation and practice also apply to scholastic examinations and to any procedures followed in job interviews. Varying motivation applies not just to IQ tests but to all measures: intelligence tests, scholastic tests, and work assessments. Nobody gets round measurement error, not even the Spanish Inquisition.

In summary: Assume some people’s IQ scores are reduced by lack of motivation. That will reduce the correlation between IQ and other real life measures. IQ at 11 correlates 0.81 with scholastic attainment at 16. If motivation is a problem, the correlation is really higher.

If you prefer that as a Tweet:

If IQ scores are reduced by lack of motivation, but IQ at 11 correlates 0.81 with GCSEs at 16, then the real correlation is much higher.
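The argument is the classical correction for attenuation: random measurement error in either variable shrinks the observed correlation, so the error-free correlation is the observed one divided by the square root of the product of the two reliabilities. The sketch below applies that formula to the 0.81 figure. The reliability values are purely hypothetical (the post gives none), and the formula treats motivational lapses as random error, which is itself an assumption.

```python
import math


def disattenuate(r_observed: float, reliability_x: float, reliability_y: float = 1.0) -> float:
    """Spearman's correction for attenuation.

    r_true = r_observed / sqrt(reliability_x * reliability_y)
    Reliabilities must lie in (0, 1]; a value of 1.0 means no measurement error.
    """
    for reliability in (reliability_x, reliability_y):
        if not 0.0 < reliability <= 1.0:
            raise ValueError("reliability must be in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


R_OBSERVED = 0.81  # IQ at 11 versus GCSE results at 16, as quoted in the post

# Hypothetical reliabilities for the IQ test once motivational noise is
# counted as measurement error; these numbers are illustrative only.
for hypothetical_reliability in (1.0, 0.95, 0.90, 0.85):
    corrected = disattenuate(R_OBSERVED, hypothetical_reliability)
    print(f"IQ reliability {hypothetical_reliability:.2f} -> corrected r = {corrected:.3f}")
```

The lower the assumed reliability, the larger the corrected correlation, which is the point of the summary: any noise from poor motivation means 0.81 is a floor, not a ceiling.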

Wednesday, 25 June 2014

Suarez and the tooth of God

With heavy heart I must turn to the topic of biting. One should not bite anyone, not even a sporting opponent, not even a little, and not even if you find them very irritating. It is not sporting, not refined and, worst of all, grossly impolite. However, it is very human. That is an observation, not an excuse, because cultivated manners should be far better than the wild sort.

Biting is human, and suppressing biting in conflict requires effort. Biting is a weapon humans use, and comes naturally. Bite marks show up after fights, and forensic dentistry has developed techniques to match the marks to putative perpetrators and to determine when they happened. Some of this work is done by getting forensic experts to bite pigs’ ears and then study the results, porcine skin being a good match with the human sort. Showing teeth in an angry grimace is a standard and well recognised threat signal. Biting is not only part of uncoordinated fighting, it is also a feature of abuse, showing animal dominance and disregard for the abused person, often a woman or child.

Of course, in football the standard weapon of conflict is the foot, used against the opponent under the pretext of trying to kick the ball. Apart from “late” tackles there is the innocent treading on the opponent’s foot by mistake, the innocent elbow in the face, the unplanned crash of bodies, the inadvertent head butt and, depending on the heat of the moment, the trading of punches in pure provoked self-defence. For once the availability of the instant-replay from several angles allows us to analyse these subterfuges in detail and, sometimes, to punish them.

It would be understandable if many spectators, having witnessed these lapses from proper sporting behaviour, wanted to turn from football to other sports. Yet, somehow, both far more brutal sports like boxing, and far less brutal ones like basketball do not attract such large global audiences. Perhaps the current regulated game has the “right” level of physical conflict for current tastes. Moral considerations aside, in the search for an easily understandable winning goal spectators tolerate and even condone judicious violence exercised in the service of their cause. Maradona famously credited “the hand of God” for his game-winning foul, and Suarez’s tooth, while not directly related to anything, was followed two minutes later by a real goal, to which it might have been a contributor, leaving in its wake an enraged and discomfited defence.

One mildly comforting notion, despite this dispiriting lapse from sportsmanship, is that international football competitions have become a surrogate for war: national anthems, national flags, nationalism unifying disparate factions, but very little in the way of casualties. The better angels of our nature have taken us from the horror of the trenches a mere century ago into a mostly peaceful carnival of football, although it includes the lamentable lapses of some un-sportsmen: biting, diving, Diva-ing, pretending to be at death’s door, protesting fake innocence, and generally being muscular Machiavellis on our national behalf.

It is not all that important, in the context of the march of history, but the one goal of the match was made by Diego Godin. He has good teeth, which he kept in the right place.

Tuesday, 24 June 2014

Iraq latest: the cousins sort out

The English have a penchant for maps, and a benefit of having an Empire is that they were able to draw straight lines, and make things look tidy for students of geography Back Home. No strangers to the notion of good breeding, the English also had an eye for bright tribes, and in creating Iraq in 1920 they got the Sunni minority to administer the kingdom.

Picking the brighter minority makes sense, though from time to time they get slaughtered by the resentful majority. This still leaves the intellectual problem encountered in all genocide: how do you know who to slaughter? Since human beings are more alike than different, this presents a problem. One approach to this dilemma is to classify friend and foe by consanguinity, and to side with one’s close cousins against one’s more distant cousins. In Iraq the 50% cousin marriage rate is one of the highest rates in the world, comparable to places like Saudi Arabia and Pakistan (and certain neighborhoods in Bradford). If you have been indulging in cousin marriage for generations (in various complicated variations of the same thing) you end up being close, very close. Your obligations are to them and, since they are very like you, that obligation makes sense. You should find work for them, not for some mystical Best Person for the Job who is not your cousin. Cousins matter, nations less so, (unless they are playing football). Democracy sits uncomfortably with cousin obligation. Why should the opinions and interests of distant cousins trump the needs of your close blood brothers?

[Map: Iraq, tribes]

Here is a map provided by HBDchick in her post below.

http://hbdchick.wordpress.com/2014/06/22/fbd-cousin-marriage-and-clans-and-tribes-in-iraq/

This helps me understand who is fighting who. I was not trained in nation building, but it looks as if there are three different groups with slightly different views as to what constitutes the national interest. I would not design a new Iraq without looking at the cousins in the nations next door, so perhaps the solution needs to be wider, much wider. I leave these practical matters to others. However, I cannot resist suggesting the name of the new caliphate: Sumeria.

If there is a prize going for the suggestion, I promise to share it with my cousins.

Sunday, 22 June 2014

Jumping to conclusions: flight MH370

On 8 March Malaysia Airlines flight MH370 went missing.

My first posting was on 18 March (in academia commenting on any issue within 10 days is considered impetuosity of the rashest sort).

http://drjamesthompson.blogspot.co.uk/2014/03/regular-guys-pilots-on-flight-mh370.html

I abstract the relevant sections:

Speculation is what we are told we should not do. We should wait for facts. However, speculating is part of being intelligent. Indeed, it is one of its core features. A predator passes behind an obstruction and we speculate which side of the rock they will come out again. Getting the prediction right helps keep us alive. Speculating is what leads us to find out how things work. Puzzles intrigue us because we are curious. We see it as our business to seek for an answer, and begin to distrust the answers we are given. Good. The Enlightenment continues.

Is having a home simulator prima facie evidence of mental disorder? James Reason, of Human Error, might argue that simulators assist the corruption of reality.

A home simulator might encourage fantasies of being a fighter pilot and of flying fast over hilly terrain, avoiding enemy radars. It might allow the rehearsal of landings in far away airports, off the Malaysia Airlines beaten track. It would add a layer of extra skill to even a skilled pilot, who could attempt manoeuvres never allowed in civilian flight. Did the more experienced and older pilot play so many combat fighter games that, at a personal moment of anger or despair, he wanted to try them out for real? Or will it turn out to be no more than a hobby? I see it as a bit more than a mild obsession.

On 23 March I wrote the following:

Consider the following data from Flight Global showing why planes have been lost during level flight, which is usually the safest part of a plane journey. Sabotage 13, Loss of Control 8, Airframe 8, Explosion or Fire 4, Collision 4, Hijack 2, Ditching 1, Power Loss 1, Shot down 1, Unknown 4 (includes MH370).

In terms of prior probability you would go for sabotage as the primary suspect, followed by loss of control or airframe. Given that no debris was found at the point of last transmission, or nearby, that means that there was no bomb, no loss of control, no airframe disintegration, no explosion, probably no fire, no collision (pretty sure of that), no ditching, power loss or shot down (unless there is one hell of a cover up).

Looks like hijack, in the sense of hijack by pilot for reasons unknown. However, out of 45 planes falling down from level flight before this one, hijack accounted for 2 and unknown for 3. So, looks like unknown, possibly hijack.
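The reasoning in that excerpt can be written out as a small elimination exercise: take the counts as base rates, strike out every cause the absence of debris rules out, and renormalise what is left. The sketch below does this with the Flight Global counts quoted above. Treating the eliminations as certain, and the remaining causes as equally compatible with the evidence, are simplifying assumptions made here for illustration.

```python
from fractions import Fraction

# Losses during level flight, as quoted in the post. "Unknown" includes MH370.
counts = {
    "Sabotage": 13, "Loss of Control": 8, "Airframe": 8, "Explosion or Fire": 4,
    "Collision": 4, "Hijack": 2, "Ditching": 1, "Power Loss": 1,
    "Shot down": 1, "Unknown": 4,
}

# Remove MH370 itself so that the priors describe earlier losses only.
prior_counts = dict(counts)
prior_counts["Unknown"] -= 1
total = sum(prior_counts.values())
priors = {cause: Fraction(n, total) for cause, n in prior_counts.items()}

# Causes the post rules out because no debris was found near the last transmission.
ruled_out = {
    "Sabotage", "Loss of Control", "Airframe", "Explosion or Fire",
    "Collision", "Ditching", "Power Loss", "Shot down",
}

# Renormalise over the causes that survive elimination.
remaining = {cause: n for cause, n in prior_counts.items() if cause not in ruled_out}
remaining_total = sum(remaining.values())
posterior = {cause: Fraction(n, remaining_total) for cause, n in remaining.items()}

print(f"Earlier losses considered: {total}")
for cause, p in sorted(priors.items(), key=lambda item: item[1], reverse=True):
    print(f"prior     {cause:<18} {float(p):.2f}")
for cause, p in posterior.items():
    print(f"posterior {cause:<18} {float(p):.2f}")
```

On these assumptions the two surviving categories split three to two in favour of "unknown", which matches the verdict of "unknown, possibly hijack".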

On 11 May I wrote the following about the missing flight MH370:

So, the Malaysian government faces the worst possible outcome: a Malaysian pilot or pilots took a Malaysia Airlines plane and deliberately flew it into the southern Indian Ocean, where it is very hard to find. A Malaysian problem. They have to review their personnel selection and health check procedures, and pilot/cabin staff security arrangements, which are best kept secret. They may have put in a third pilot since then, as a stop-gap procedure.

Today, 22 June, papers are reporting on the results of the enquiry so far: they cannot find anything suspicious about anyone else on the plane; the older pilot had made absolutely no forward engagements beyond 8 March; and had plotted a route to a small South Sea island on his flight simulator, and then deleted it. Investigators have recovered the flight plan and now apparently intend to search in a new location in the southern seas.

So, we cannot jump to the conclusion that the older pilot hijacked the plane. However, everything points that way, and within a day or so of the plane disappearing off the radar screens it was already highly likely that the pilot or pilots had hijacked it.

Moral: do not jump to conclusions, unless not coming to a conclusion would lead to further loss of life. In the latter case take risks with your reputation and make a judgement call on the basis of probabilities as quickly as you can. Sometimes one has to jump to a conclusion.

Thursday, 19 June 2014

Detecting bias, and understanding American culture

If you, dear reader, are seeking to make your way in academia, be of good cheer: many people are doing fine work, and you will prosper if you follow their example, and will triumph if you take the best of their practices and extend them further.

On the basis that any moral tale must have exemplars of good and bad practice, I commend April Bleske-Rechek and Kingsley Browne;

http://drjamesthompson.blogspot.co.uk/2014/06/does-college-entry-depend-on.html

caution you against Casey Miller and Keivan Stassun who wrote the silly Nature piece;

http://drjamesthompson.blogspot.co.uk/2014/06/nature-stumbles-again.html

and commend this paper by Dwight Davis, Kevin Dorsey, Ronald Franks, Paul Sackett, Cynthia Searcy, and Xiaohui Zhao: Do Racial and Ethnic Group Differences in Performance on the MCAT Exam Reflect Test Bias? Acad Med. 2013;88:593–602. First published online March 8, 2013 doi: 10.1097/ACM.0b013e318286803a

https://drive.google.com/file/d/0B3c4TxciNeJZTjdmbjZPMk1PRDQ/edit?usp=sharing

Here is a brief summary of the underlying issues to be discussed. A test is not flawed because some people do better than others. Tests reveal those differences. A test is not flawed because some groups do better than others. Tests reveal those differences. However, a test result can be questioned if it has lower predictive power for any group or, more precisely, if it over- or underestimates that group’s later achievements on some agreed benchmark. Errors of this sort strongly suggest the test is too harsh or too lenient in assessing the talents of that group. So, in any instance in which test bias is suspected, a couple of further analyses of the data are required.

Here is the abstract: The Medical College Admission Test (MCAT) is a standardized examination that assesses fundamental knowledge of scientific concepts, critical reasoning ability, and written communication skills. Medical school admission officers use MCAT scores, along with other measures of academic preparation and personal attributes, to select the applicants they consider the most likely to succeed in medical school. In 2008–2011, the committee charged with conducting a comprehensive review of the MCAT exam examined four issues: (1) whether racial and ethnic groups differ in mean MCAT scores, (2) whether any score differences are due to test bias, (3) how group differences may be explained, and (4) whether the MCAT exam is a barrier to medical school admission for black or Latino applicants.
This analysis showed that black and Latino examinees’ mean MCAT scores are lower than white examinees’, mirroring differences on other standardized admission tests and in the average undergraduate grades of medical school applicants. However, there was no evidence that the MCAT exam is biased against black and Latino applicants as determined by their subsequent performance on selected medical school performance indicators. Among other factors which could contribute to mean differences in MCAT performance, whites, blacks, and Latinos interested in medicine differ with respect to parents’ education and income. Admission data indicate that admission committees accept majority and minority applicants at similar rates, which suggests that medical students are selected on the basis of a combination of attributes and competencies rather than on MCAT scores alone.

First, they show that black applicants’ scores are one standard deviation below the mean for white applicants, in line with other measures of intellectual ability such as grade point averages, and also in line with the Graduate Record Examination and the entrance exams for other professions.

Test                                         White-black difference in mean score (SD units)
Medical College Admission Test (MCAT)        1.0
Graduate Record Examination (GRE)            1.3
Graduate Management Admission Test (GMAT)    1.0
Law School Admission Test (LSAT)             1.1

Ever helpful, they then spell out the misconception about test bias into which the Nature article plunged: Although some individuals may conclude that any test showing average differences in performance between majority and URM examinees is biased, such differences, in the absence of corroborating evidence, are insufficient to conclude that there is bias. Instead, professional testing standards compel testing experts to gather logical and empirical evidence about potential sources of these group differences using well-established (and generally agreed-on) procedures to determine whether test bias exists. Test bias arises “when deficiencies in a test itself or the manner in which it is used result in different meanings for scores earned by members of different identifiable subgroups.”

Two processes are used to remove presumed test bias. The first is to fling out virtually any item which could conceivably be influenced by the presumed effects of being in a sub-culture. Incidentally, this may remove items which show real differences, but that is an inevitable casualty of the “if my scores are low then the test is biased” complaint. The same exclusion procedure is also applied to items which show sex differences. I hope to return to that particular issue sometime.

The second procedure to detect and remove test bias is to look at predictive power, in the sense of trying to detect differential prediction for the different racial groups. The authors are to be commended for their simple explanations of the procedures (which you might like to forward to researchers who do not grasp the concept):

If the MCAT exam predicts success in medical school in a comparable fashion for different racial and ethnic groups, medical students with the same MCAT score will, on average, achieve the same outcomes regardless of racial or ethnic background. On the other hand, if their outcomes differ significantly, test bias in the form of differential prediction exists because the prediction will be more accurate for some groups than for others.

If the observed graduation rates were 90% for white students but 95% for black students with the same MCAT score, this would be evidence of predictive bias against black students because their four-year graduation rate was higher than predicted and also higher than that of white students earning the same score. This is an example of under-prediction.
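The definition in that passage boils down to comparing what a common prediction says with what each group actually achieves at the same score. The sketch below is a deliberately minimal illustration using the hypothetical 90% and 95% figures from the quotation, and it assumes for the sake of the example that the common prediction at that score is 90%. A real analysis would fit a regression of outcome on score and test whether group membership adds anything; the function name and tolerance parameter here are invented.

```python
def prediction_direction(observed_pct: float, predicted_pct: float, tolerance: float = 0.0) -> str:
    """Classify how a common prediction relates to a group's observed outcome.

    under-prediction: the group does better than the test predicted
                      (the test would be biased against that group).
    over-prediction:  the group does worse than the test predicted
                      (the test flatters that group).
    """
    difference = observed_pct - predicted_pct
    if difference > tolerance:
        return "under-prediction"
    if difference < -tolerance:
        return "over-prediction"
    return "no differential prediction"


# Hypothetical figures from the quoted example: same MCAT score, and an
# assumed common predicted four-year graduation rate of 90%.
PREDICTED_PCT = 90
observed_by_group = {"white": 90, "black": 95}

for group, observed in observed_by_group.items():
    print(f"{group}: observed {observed}% vs predicted {PREDICTED_PCT}% -> "
          f"{prediction_direction(observed, PREDICTED_PCT)}")
```

The quoted example is the under-prediction case; a group whose observed rate falls below the common prediction would be classified the other way.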

As you can see, in the case of black medical students the test over-predicts the eventual outcome, particularly as to whether they graduate within the usual 4 years. The test also over-predicts the 4 year outcome for Latino students. The test is not biased against them. On the contrary, it has a tendency to flatter them.

The authors then go on to list a whole lot of variables which may make life difficult for black and Latino applicants. They do not analyse these, or distinguish between them in terms of whether they are independent of intelligence, nor do they attempt to “control” for them in statistical terms. They are possible reasons for different outcomes and not the main part of the paper.

Finally, they look at the relationship between scores and admittance to medical school by race.

Figure 2 is an interesting picture of the United States of America today. This is what the authors say about the results:

Figure 2 compares white, black, and Latino individuals’ academic qualifications, as measured by MCAT scores and reported in applications for admission to the 2010 matriculating class, with the percentages of applicants in each group who were ultimately offered acceptance by one or more medical schools. As the figure illustrates, the percentage of white applicants with MCAT scores ≥ 25 is much greater than the percentage of black or Latino applicants reporting similar scores (84% for whites versus 37% for blacks and 56% for Latinos). This profile of MCAT scores stands in sharp contrast to the overall acceptance rates shown in Figure 2 (47% for whites versus 40% for blacks and 49% for Latinos), reflecting differences of 7 percentage points for white versus black applicants and –2 percentage points for white versus Latino applicants.

The similar overall acceptance rates for the three groups suggest that admission committees do not limit themselves to the consideration of MCAT scores in their efforts to identify the applicants who are the most likely to succeed in medical school. If they did, differences in acceptance rates across racial and ethnic groups would more closely parallel differences in their mean MCAT scores. That is, greater emphasis on the MCAT exam would decrease the percentages of minority applicants selected for entry into medical school. In sum, although group differences on the MCAT exam have the potential to reduce the percentage of URM students selected into medical school, these results show that this is not occurring in practice.

The authors do not say: “the failure of medical schools to use the best test predictors is grossly unfair to applicants, most of whom are well qualified whites”. Figure 2 shows the blatant manipulation of the entry process, in which black and Latino applicants are given preference on the basis of their genetics.

For example, if entry to medical school were based solely on an MCAT score of 25 or above, 84% of white applicants would get an offer and only 37% of black applicants. In fact, only 47% of white applicants gain entry, against 40% of black applicants. Very confusingly, the authors describe this as “reflecting differences of 7 percentage points”. In fact, for white applicants the difference between those reaching the score and those gaining entry is 84-47 = 37 percentage points. Over a third of qualified white candidates are rejected. On the same reckoning no qualified black candidate is rejected, and the shortfall for Latino applicants is 7 percentage points.
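Two different subtractions are in play in this paragraph, and it helps to see them side by side. The sketch below uses only the percentages quoted from the paper's Figure 2. The first calculation is the paper's comparison of acceptance rates between groups; the second is the comparison made in this post, within each group, between the share scoring 25 or above and the share accepted. Treating a score of 25 or above as the sole qualifying criterion is the assumption of the argument here and is not something the paper states; the aggregate figures also cannot show which individual applicants were accepted.

```python
# Percentages quoted from Figure 2 of the paper (2010 matriculating class).
share_scoring_25_plus = {"white": 84, "black": 37, "Latino": 56}
acceptance_rate = {"white": 47, "black": 40, "Latino": 49}

# The paper's comparison: white acceptance rate minus each other group's rate.
between_group_gap = {
    group: acceptance_rate["white"] - acceptance_rate[group]
    for group in ("black", "Latino")
}

# The post's comparison: within each group, share reaching the score
# minus share accepted (percentage points of that group's applicants).
within_group_gap = {
    group: share_scoring_25_plus[group] - acceptance_rate[group]
    for group in share_scoring_25_plus
}

print("Between-group acceptance gaps (white minus group):", between_group_gap)
print("Within-group gaps (scoring 25+ minus accepted):   ", within_group_gap)
```

The paper's 7 and -2 come from the first dictionary, and the 37, -3 and 7 discussed above come from the second, so the two sets of numbers answer different questions.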

I do not live in the US, but the phrase “admission committees do not limit themselves to the consideration of MCAT scores” strikes me as an understatement of Orwellian proportions. Qualified white candidates are rejected at a very high rate, qualified Latinos at a far lower rate, and qualified black applicants not at all. This is clear evidence that admission is not based on merit, but that a substantial rent is paid to race. If a culture countenances such malfeasance in the allocation of resources its fundamental values become corrupted. The “other” criteria used for admission result in fewer candidates graduating even after 5 years, so those criteria, whatever they are, do not lead to success in medicine. I presume that we can reject the notion that it is a rational policy to place one’s life in the hands of an incompetent physician just for the exotic fun of racial variety.

Can anyone explain what is going on in America?

Monday, 16 June 2014

Nature stumbles again

Something has gone wrong at Nature, the former science publication. THE science publication, as was. Perhaps they just don’t like the topic of intelligence, and are on the search for knocking copy, publishing anything critical of tests and examinations.

Only a few days ago I had posted some proper work on the GRE, showing that although this was the best predictor, minorities (and to a lesser extent women) are being admitted to US colleges despite having lower scores.

http://drjamesthompson.blogspot.co.uk/2014/06/does-college-entry-depend-on.html

An eagle-eyed reader forwards this gem from Nature, in which it is claimed that the Graduate Record Examination is no good, and should be replaced by an interview. The authors, Casey Miller and Keivan Stassun, entitle their piece “A test that fails: A standard test for admission to graduate school misses potential winners”. After a provocative title like that one expects a reasoned argument as to why the test fails, and a simple exposition of which tests or procedures succeed. As a rule of thumb, given that we have GRE data going back to 1982, if not earlier, to live up to the title one expects a proper set of alternative test results going back five or ten years. Multiple intelligence tests or emotional intelligence tests or gastro-intestinal intelligence tests. Procedures of which Robert Sternberg approves. Things like that. Anything.

We might also expect some data on over- and under-prediction. Yes, both of those. All tests miss some potential winners and pass some duffers. See R.L. Thorndike. The concepts of over- and underachievement. Columbia University, 1963.

Here is an example of the quality of their argument: According to data from Educational Testing Service (ETS), women score 80 points lower on average in the physical sciences than do men, and African Americans score 200 points below white people. In simple terms, the GRE is a better indicator of sex and skin colour than of ability and ultimate success.

At this stage you might wish to turn to other matters but, to be charitable, the authors might conceivably go on to show data confirming that the GRE is a poorer predictor for African Americans than for White Americans. As Jensen pointed out in 1980, tests are not bad because they show lower scores for some groups, but they are bad if they lead to poorer predictions for those groups. (That is my short summary of his Bias in Mental Testing). Instead, when these authors talk about correlations, they mean that lower scores are associated with some groups of test takers. They present no data on poorer predictions. I suppose one might say that they perform a public service by showing the results for different genetic groups, clearly showing that Asians are ahead, but they cast it as a case of bias, without evidence. To be consistent they should say that the test is biased in favour of Asians.

Perhaps (this is a somewhat psychodynamic hypothesis, but I am writing this in a sunny cafe in Totnes, which has a slightly hippy feel to it) the authors are clever sillies, wanting to look good in public while at the same time holding up the examination results for everyone to see. Innocents of some kind. Like those who are opposed to pornography, but who among their many protests and calls for censorship keep providing detailed links to the websites they find most worthy of condemnation.

Have a look at the paper, just in case I have missed something.

https://drive.google.com/file/d/0B3c4TxciNeJZYTZ5X0s3dmVsXzQ/edit?usp=sharing

Why does Nature publish stuff like this?

Friday, 13 June 2014

Are Scottish reaction times slowing up?

We have spoken about the Scots many times, a Northern tribe whose exploits I have championed. Now grim news comes in from Woodley, Madison and Charlton suggesting that, for women at least, Scottish reaction times over the last 40 years have been slowing up, equivalent to a decline in g of 7.2 IQ points, or 1.8 points per decade. This cannot be due to any problems with Victorian pendulums since the measures were taken with modern instruments. The rate of decline is higher than the 1.2 points per decade derived from both sexes since Victorian times. Men, with slower maturation, do not show such an effect.

Michael A. Woodley, Guy Madison, Bruce G. Charlton. Possible Dysgenic Trends in Simple Visual Reaction Time Performance in the Scottish Twenty-07 Cohort: A Reanalysis of Deary & Der (2005). Mankind Quarterly. In press.

In a 2005 publication, Deary and Der presented data on both longitudinal and cross-sectional aging effects for a variety of reaction time measures among a large sample of the Scottish population. These data are reanalyzed in order to look for secular trends in mean simple reaction time performance. By extrapolating longitudinal aging effects from within each cohort across the entire age span via curve fitting, it is possible to predict the reaction time performance at the start age of the next oldest cohort. The difference between the observed performance and the predicted one tells us whether older cohorts are slower than younger ones when age matched, or vice versa. Our analyses indicate a significant decline of 36 ms over a 40-year period amongst the female cohort. No trends of any sort were detected amongst the male cohort, possibly due to the well-known male neuro-maturation lag, which will be especially pronounced in the younger cohorts. These findings are tentatively supportive of the existence of secular declines in simple reaction time performance, perhaps consistent with a dysgenic effect. On the basis of validity generalization involving the female reaction time decline, the g equivalent decline was estimated at -7.2 IQ points, or -1.8 points per decade.
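The per-decade figure follows directly from the totals in the abstract. As a quick check, the sketch below converts the 40-year totals into per-decade rates and also shows the conversion factor between milliseconds and IQ points that the two totals jointly imply. That last ratio is a back-of-envelope inference made here, not a number stated by the authors.

```python
def per_decade(total_change: float, years: float) -> float:
    """Convert a total change over a period of years into a rate per decade."""
    return total_change / (years / 10)


YEARS = 40
SLOWING_MS = 36.0          # slowing of simple reaction time in the female cohort
IQ_DECLINE_POINTS = -7.2   # g-equivalent decline estimated by the authors

ms_per_decade = per_decade(SLOWING_MS, YEARS)
iq_per_decade = per_decade(IQ_DECLINE_POINTS, YEARS)

# Implied exchange rate between the two totals (an inference, not a quoted figure).
iq_points_per_ms = abs(IQ_DECLINE_POINTS) / SLOWING_MS

print(f"Slowing: {ms_per_decade:.1f} ms per decade")
print(f"g-equivalent change: {iq_per_decade:.1f} IQ points per decade")
print(f"Implied conversion: {iq_points_per_ms:.2f} IQ points per ms")
```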

Get the whole thing here:

https://docs.google.com/document/d/1RPAja_2SsVWUu1H5ysww13ospA-WdiLgcjFnyDVTmVU/edit?usp=sharing

Thursday, 12 June 2014

Woodley launches his Victorian defence

You may recall that I promised in November last year that young Woodley would counter-attack his tormentors, who have now been caught off guard, without their gum shields. Vicious uppercuts are battering the proffered chins of his adversaries: the redoubtable Mighty Champion Jim Flynn, a grizzled New Zealand pugilist, mentored by the great Jensen himself; the Australian Ted Nettelbeck, sun-bleached and hardened by the toil of 20 years of inspection time; the experienced Silverman, a quieter fighter, with a good solid punch, the original collector of historical reaction times; and the formidable Russian Dodonov and Dodonova duo, an unusual husband and wife combination of legendary aggression.

After a long wait, in a burst of pent up energy and a flurry of punches, Woodley, te Nijenhuis and Murphy break free, and in a phrase made current by D Day commemorations “all hell breaks loose”.

The Woodley gang argue that, once they have done a complete re-analysis to respond to the points raised against their original “Victorians” paper, their new results “reveal a seemingly robust secular trend towards slowing reaction time in these two countries, which translates into a potential dysgenics rate of −1.21 IQ points per decade, or −13.9 points in total between 1889 and 2004. We conclude by arguing that the best way forward is to test novel predictions stemming from our finding relating to molecular genetics, neurophysiology and alternative cognitive indicators, thus shifting the research focus away from the purely methodological level towards the broader nomological level. We thank our critics for helping us to arrive at a much more precise estimate of the decline in general intelligence.”

Michael A. Woodley, Jan te Nijenhuis, Raegan Murphy. Is there a dysgenic secular trend towards slowing simple reaction time? Responding to a quartet of critical commentaries. Intelligence

I highlight Table 1, which shows how the Wechsler subtests relate to simple reaction times and inspection times. What interests me is that the most substantial of the rather low correlations is with Information at 0.3, which does not immediately make sense, unless of course you see reaction time as a measure of crystallized verbal intelligence, which the other correlations with Arithmetic and Vocabulary would tend to confirm. Inspection Time, on the other hand, relates most strongly to Object Assembly (0.393), Coding (0.351) and Block Design (0.306), which are all Performance subtests. Of course, this table is particularly interesting because of the g loading and heritability data.

https://drive.google.com/file/d/0B3c4TxciNeJZN29lSlpmdmlOcVE/edit?usp=sharing

I am heartily glad to see this paper published, because I have been sitting on it for many months, awaiting the final permission from the authors, who in turn were waiting for the conclusion of the peer review process, which is intended to achieve a quality standard, which it often achieves. It is not intended to delay the timely publication of academic work, a malign outcome it certainly achieves. A great pity.

Wednesday, 11 June 2014

Markus Jokela with new IQ/mobility results

Many thanks for the blog post, James. I completely agree with you that here intelligence is the engine and adult SES is the conveyor belt that takes things forward, as you aptly put it. I included SES as a covariate to assess how much of the migration patterns associated with IQ were due to the fact that people with higher IQ pursue and achieve higher education, which is linked to moving to more urban areas.

I used the word "attenuate" only in a statistical sense—the B coefficients get smaller—and I was only interested in mediation effects of SES, not any “confounding” effects of SES. At least for social scientists reading the paper, I thought it would be necessary to show how SES comes into play in the association, even though IQ was assessed in young adulthood. So adjusting for SES does not take away any explanatory power from IQ that was measured many years before SES.

But I do see that the adjusted models may not come across the way I intended. Some readers might erroneously conclude that only SES matters. I hope most of the readers don’t interpret the results this way! The unadjusted models are the more interesting ones; the adjusted models only elaborate the social processes by which IQ becomes associated with migration patterns.

It’s true that the analysis only looked at average IQ levels of migrants and non-migrants (instead of the migration probabilities of individuals with different IQ levels). I thought presenting the average IQ of these groups would be the most intuitive way to present the data, especially when you consider the analysis from a demographic point of view of population dynamics. The data could have been presented the other way around, but this would have complicated the analysis of the interaction effects between origin and destination. I kind of had a similar reaction to the differences as Steve Sailer in his comment—not as large as you might have expected. But the population-level effects might still be non-trivial.

I had a quick look at the probabilities of moving by IQ deciles by first categorizing IQ into 9 groups in the total population (1= below 20 percentile, 9=above 90 percentile) and using this grouping as a categorical variable in a multinomial logistic regression predicting future residence. I included only participants who lived in a rural area at baseline. The probabilities of moving to a central city for these rural residents were:

6%, 7%, 9%, 9%, 10%, 12%, 11%, 14%, and 16%

in increasing order of IQ decile groups (i.e., 6% for those below 20 percentile, 16% for those above 90 percentile, allowing non-linear associations).

The same analysis showed that the probabilities of staying in a rural area were:

54%, 53%, 52%, 46%, 45%, 44%, 42%, 36%, and 35%

in increasing order of IQ decile groups. So about one-third of the people scoring in the top 10% of IQ distribution did not leave rural areas if they were living there as adolescents/young adults.

But migration is a complex outcome to study, because there are so many ways to operationalize it, and people move around back and forth. Not to mention differences between different subgroups. As always, more studies are needed!
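The two rows of probabilities in the reply above are easier to compare side by side. This is a minimal sketch, not Jokela's analysis: the "elsewhere" column is my own inference, assuming that everyone who neither stayed rural nor moved to a central city ended up somewhere else (the suburbs, presumably), and the variable names are hypothetical.

```python
# Probabilities reported in the reply for rural residents at baseline,
# ordered from the lowest to the highest of the nine IQ groups (percent).
to_central_city = [6, 7, 9, 9, 10, 12, 11, 14, 16]
stay_rural = [54, 53, 52, 46, 45, 44, 42, 36, 35]

# Assumption: whatever is left over went to some other destination;
# the reply does not report this figure directly.
elsewhere = [100 - c - r for c, r in zip(to_central_city, stay_rural)]

rows = zip(to_central_city, stay_rural, elsewhere)
for group, (c, r, e) in enumerate(rows, start=1):
    print(f"group {group}: city {c}%  rural {r}%  elsewhere {e}%")

# Compare the top group with the bottom group.
city_ratio = to_central_city[-1] / to_central_city[0]   # 16 / 6
rural_gap = stay_rural[0] - stay_rural[-1]              # 54 - 35
print(f"city ratio {city_ratio:.1f}, rural gap {rural_gap} points")
```

On these figures the brightest group is between two and three times as likely as the lowest group to head for a central city, while the share staying put drops by 19 percentage points.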

Sunday, 8 June 2014

Where the bright kids go

Years after he had left his childhood town of Hannibal, Missouri, Mark Twain returned, a successful writer, but not yet someone who would be recognised on sight. He asked a stranger after a number of his childhood friends, and finally innocently included his own name, Samuel Clemens, in the list and asked what had become of him. His interlocutor replied: “Same as all the ones who did well, he went to St Louis and did well”.

In fact Mark Twain had gone, in succession, from Hannibal to New York City, Philadelphia, St. Louis, and Cincinnati, but from the parochial viewpoint of Hannibal, St Louis was where the bright kids went. Now the general observation that bright kids leave for the bright lights has been tested by Markus Jokela.

Flow of cognitive capital across rural and urban United States. Intelligence Volume 46, September–October 2014, Pages 47–53

DOI: 10.1016/j.intell.2014.05.003

Jokela has mined the 16-year longitudinal data from the U.S. National Longitudinal Survey of Youth 1979 (NLSY79) to examine whether cognitive ability assessed at age 15–23 predicted subsequent urban/rural migration between ages 15 and 39 (n = 11,481). Higher cognitive ability was associated with selective rural-to-urban migration (12 percentile points higher ability among those moving from rural areas to central cities compared to those staying in rural areas) but also with higher probability of moving away from central cities to suburban and rural areas (4 percentile points higher ability among those moving from central cities to suburban areas compared to those staying in central cities).

The sample of the present study included all participants for whom data on cognitive ability were available (assessed in 1980) and who had participated in at least one follow-up study after that between 1980 and 1996 (n = 11,481 unique individuals with up to 129,424 person-year observations over the follow-up period).

Socioeconomic status was measured with two separate variables of the highest completed education and household income assessed at each follow-up phase. The participants reported their highest completed grade on a 20-point scale (range from 1 = no education to 20 = 8th year of college or more) and household income as the total net household income from all income sources of the participant and her/his spouse in the past calendar year.

First of all, this is a great sample. With numbers like these, and the nationally representative sampling, we can be reasonably sure of the results. Second, the fact that intelligence was tested early in life allows us to consider it as a putative cause of later achievements, including educational achievement and the income gained in employment. Third, this study looked at changes of address and required proof of residence, not just later recall, which strengthens the observed results. Although this is a study of the effects of intelligence on mobility, personality factors (not measured here) could also be contributors.

Studies of personality and residential mobility have demonstrated that psychological characteristics may contribute to selective migration, including rural-to-urban migration (Camperio Ciani et al., 2013, Duncan et al., 2012 and Jokela, 2013). For example, neuroticism and openness to experience have been associated with higher migration probability (Camperio-Ciani and Capiluppi, 2011, Jokela, 2013 and Silventoinen et al., 2008), extraversion has been associated with rural-to-urban migration (Camperio-Ciani and Capiluppi, 2011 and Jokela et al., 2008), and conscientiousness and agreeableness have been associated with lower likelihood of leaving current residence (Jokela, 2009 and Jokela, 2013).

The first analysis examined the average cognitive ability level by the interaction effect between categorically coded 1980 baseline location and subsequent location after 1980, adjusted for sex, race/ethnicity, birth year, subsample, and time-varying age.

Here we go again. That adjustment would make any sex and racial intelligence differences disappear. This loses an item of interest: is there an IQ premium for women and minorities? Simple means and standard deviations would have helped orient the reader before doing the adjustments.

The author explains his procedure: These two analyses were thus conducted with cognitive ability as the outcome variable, and with the residential categories and their interaction effects as the independent variables, adjusted for covariates (my emphasis).

So, IQ has a big effect, but when you take away the socioeconomic status which is a consequence of your intelligence, the effect of intelligence is apparently diminished. Well, it would be, wouldn’t it? Socioeconomic status was not handed out by some perverse deity in the sky: employers evaluated the applicants on offer, and selected those most likely to help them achieve their objectives.

However, Jokela says: Adjusting for socioeconomic status in adulthood substantially attenuated the associations between cognitive ability and selective migration. This suggests that underlying cognitive-ability differences get differently distributed across rural and urban areas largely because educational attainment and household income are related to selective urban/rural residential mobility (i.e., mediation effect).

I don’t understand what is meant here. Clearly, you cannot move to a more agreeable place unless you have the money to do so. You will not get the money unless you flourish in a job for which you are qualified. You cannot get qualified unless you have the ability to complete your studies. You cannot graduate unless you pass the exams, which will be due to ability and diligence. The measures of ability were taken early in life, so they have not been caused by later socio-economic status or later educational qualifications. Education is the conveyor belt. Intelligence is the engine. Socioeconomic status has not attenuated the effects of intelligence. It is a consequence of intelligence.

Bright kids move from the farm to the city, and from thence to the leafy suburbs. As in Pompeii and the villas of Herculaneum.

Although the author does not mention it, most minorities drift towards cities. For example, in UK survey data on sex, if the classification of homosexuality nationally is about 4% (I think that was defined as a same sex relationship in the last 5 years) then the figure for London is 8%. If partners are hard to find, best to go to a central meeting place. Same for bright people, who are 2% nationally, and an unknown but probably higher percentage in London. Some drift to provincial university towns, few stay on the farm.

The mobility patterns associated with cognitive ability were largely but not completely mediated by adult educational attainment and income. The findings suggest that selective migration contributes to differential flow of cognitive ability levels across urban and rural areas in the United States.

There is a lot to like about this paper. Excellent, large and representative sample. Good measures of intelligence and long follow ups. However, I am not one to let a noble researcher leave uninjured: why “attenuate” for income without discussing the underlying assumption behind such a move? There is a case for controlling for parental income, but the bright person’s income is their own, earned by them. Equally, they gained educational qualifications because they had the ability and completed the course. Attenuate if you must, but only with care for the underlying logic.

I still remember the sunlight in the room at home in the countryside where I read Twain’s collected works. My favourite was his great American novel The Adventures of Huckleberry Finn. I am still floating on the raft of his genius.

Thursday, 5 June 2014

Richard Lynn, Kazakhstan, cold winters and the world

How bright are human beings? A first step to answer this question would be to make a list of intelligence test results in all the countries of the world. Perhaps out of fear at finding the answers, no official body has ever done this, even though they have the resources to do so. The OECD make sure that PISA (Programme for International Student Assessment) is very well funded, and they test children and sometimes adults. Measuring scholastic attainment is deemed acceptable, intelligence not. The word “intelligence” never passes their lips. Too sensitive. Of course, one can derive intelligence scores from scholastic attainments, but why not measure it directly?

Instead of that, one retired professor working from home has done the job himself. Bereft of government patronage, he uses the same simple technique followed by Darwin: he makes professional relationships with other scholars across the world, encourages them in their work, and collates the results. The results have been published in numerous books. Richard has been working to flesh out the gaps in the register of national assessment results, and to improve the findings from smaller and poorer countries in particular. All he needs is more co-workers across the planet. If you would like to help collecting data, please let me know.

Also, if you can help put together a proper database of all the world’s national IQ results, please have a look at the current list of intelligence studies listed in his most recent book. These studies need gathering into one place, and need to be assessed in terms of sample sizes, representativeness, measuring instrument characteristics and, wherever possible, demographic and genetic characteristics. There is a very large meta-analytic paper to be written by one (or several) of you.

https://docs.google.com/document/d/11QtbEeyWfBZ_NEZEshrX4srzPR0QxFltFcSn87bOQYg/edit

Anyway, no sooner is Prof Richard Lynn back from Lapland than he sets off to Kazakhstan in the company of Grigoriev to measure Kazakhstan’s intelligence. They used a new version of Raven Matrices, and give their results in “British IQ” or, as we used to call it, Greenwich Mean IQ, that is to say standardized on a UK population.

Andrei Grigoriev and Richard Lynn. A study of the intelligence of Kazakhs, Russians and Uzbeks in Kazakhstan. Intelligence Volume 46, September–October 2014, Pages 40–46.

http://dx.doi.org/10.1016/j.intell.2014.05.004

An IQ for Kazakhstan can be calculated from these results as follows:

The Russians have a mean British IQ of 103.2 and comprise 23.6% of the population; the Kazakhs have a mean British IQ of 82.2 and comprise 63.1% of the population; the Uzbeks have a mean British IQ of 86.0 and comprise 2.8% of the population. Weighting the IQs of these three groups by their percentages of the population gives an IQ of 87.9 for Kazakhstan. These three groups comprise 89.5% of the population. The remaining 10.5% consists of Chuvash, Tartars, Uyghurs and other south Asian peoples. Early studies of intelligence in the former Soviet Union found that these peoples had lower IQs than ethnic Russians (Grigoriev & Lynn, 2009). Their IQ is likely about the same as that of Kazakhs (82.2). On this assumption, adding this fourth group and weighting the IQs of the four groups by their percentages of the population gives an IQ of 87.3 for Kazakhstan.
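The weighting described above is easy to reproduce. Here is a minimal sketch using only the means and population shares quoted from the paper; the function name and the "Others" label are mine, and the fourth group's IQ of 82.2 is the paper's stated assumption, not a measurement.

```python
# Group means (British IQ) and population shares (percent), as quoted above.
groups = {
    "Russians": (103.2, 23.6),
    "Kazakhs": (82.2, 63.1),
    "Uzbeks": (86.0, 2.8),
}

def weighted_mean(data):
    """Mean IQ weighted by population share; shares need not sum to 100."""
    total_share = sum(share for _, share in data.values())
    return sum(iq * share for iq, share in data.values()) / total_share

three_group_iq = weighted_mean(groups)

# Assumption taken from the paper: the remaining 10.5% score like the Kazakhs.
four_groups = dict(groups, Others=(82.2, 10.5))
four_group_iq = weighted_mean(four_groups)

print(f"three groups: {three_group_iq:.1f}, four groups: {four_group_iq:.1f}")
```

Both published estimates, 87.9 for three groups and 87.3 once the fourth is added, come out of the same calculation. The whole result is driven by the Kazakh mean, which carries nearly two-thirds of the weight.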

This figure compares quite closely with the British IQ of 84.7 for Kazakhstan calculated by Lynn and Vanhanen (2012, p.24) from the PISA 2009 study of the achievement of school students in grades 4 and 8 in mathematics and science and with the British IQ of 85.6 for Kazakhstan calculated from the PISA 2012 study of the achievement of 15 year old school students. The closeness of the estimates from the PISA studies and the present IQ study is a further confirmation that the PISA results give a good measure of the intelligence of nations. However, these results are not consistent with the 2007 TIMSS of the mathematical and science abilities of grade 4 and grade 8 school students from which an IQ 101 for Kazakhstan was calculated by Lynn and Mikk (2007). This suggests errors in the 2007 TIMSS data for Kazakhstan.

The low IQ of the Kazakhs and Uzbeks raises a problem for the explanation of the evolution of racial differences in intelligence. The leading theory for this is the cold winters theory proposed by Lynn, 1991 and Lynn, 2006 that higher intelligence evolved in environments with colder winters as adaptations to the greater cognitive demands of survival through these. This theory has been accepted by Rushton (2000), Kanazawa (2008) and Templer and Arikawa (2006) who have presented data for lowest winter temperatures and national IQs for 129 countries and reported a correlation of − .66, i.e. there is a tendency for the populations of higher IQ countries to have lower winter temperatures. More recently, this association has been confirmed by Meisenberg and Woodley (2013) who have reported a correlation of − .746 between lowest winter temperatures and national IQs for 143 countries.

These negative correlations support the cold winters theory, but Kazakhstan and Uzbekistan are anomalies because they have very low winter temperatures but not high IQs. Templer and Arikawa (2006) give data for average winter temperatures for 129 countries including −15 °C for Kazakhstan and −6 °C for Uzbekistan, compared with around zero for northern and central Europe (e.g. −3 °C for Germany, −1 °C for Belgium, 2 °C for France and Britain), and −3 °C for China and Japan.

In addition to the cold winters theory, it has been proposed by Miller, 2005 and Miller, 2014 and Lynn (2006) that it is necessary to posit the appearance of new alleles for enhanced intelligence that appeared as genetic mutations in some populations but failed to appear in others or, if they did appear, failed to spread throughout the populations. It has been shown by Cochran and Harpending (2009) that a number of new alleles appeared in different populations during the last ten thousand years. The present results showing the low IQs of Kazakhs and Uzbeks despite the very cold winters in Kazakhstan and Uzbekistan are a further anomaly for the cold winters theory of the evolution of racial differences in intelligence and a further strengthening of the hypothesis of the appearance of new alleles for enhanced intelligence that appeared as genetic mutations in some populations but failed to appear or failed to spread in others including central Asia.

It would seem that cold winters often, but not always, boost intelligence. I would say “Necessary but not sufficient” but I am not even sure how necessary they are, because genetic mutations should show up even in balmy climates, and should confer advantage. Nonetheless, the grand aim of science is to cover as many observations with as few axioms as possible, and cold winter theory does a good job on that account. Lynn, the great protagonist of cold winter theory, is doing what all empiricists should do: reporting anomalies wherever he finds them. New science grows out of the anomalous margins. Perhaps the theory should be amended to: cold winters favour intelligent survivors, thus boosting intelligence wherever favourable mutations occur.

Wednesday, 4 June 2014

International Society for Intelligence Research

http://www.isironline.org/

Clicking on the link above takes you to the new, improved website, and gives you details of the next conference in Austria in December.

I am a member, but not an agent for this society, so I can give you the low-down. Conferences last for about three days, and begin at 8 am in the morning, and go on to about 6 pm at night. Barbaric, I agree.

At these events you will find the world’s leading researchers on intelligence. It is engaging to watch researchers of very different backgrounds and opinions arguing their cases and having coffee and planning work together. They are not of one mind. Some researchers agree with other researchers on some topics but not on others. Strong hereditarians meet strong environmentalists with relatively little fuss, though sometimes the wine intake increases a bit. Invited guests cover the whole spectrum of views and subject areas within intelligence research.

There is a flourishing new generation rapidly replacing what had sometimes (a decade or so ago) looked like becoming a retired professors club. There are prizes for the best student paper. At the other end of the life cycle, there are lifetime achievement awards, but sadly no lifetime non-achievement awards, a lapse I back as a matter of principle.

Don’t know much about Austria. Yodelling, I think.

Tuesday, 3 June 2014

Lapps, Finns, cold winters and intelligence

[Image: Renée Zellweger]

Cold Winter theory is very simple: warm blooded, warm climate adapted humans drifted North in search of game, and perished unless they could hunt, cope with the climate, and plan wisely so as to live from one winter to the next. Hence, survivors had more forethought, more behavioural restraint regarding immediate gratification, and a whole lot of other changes to help them adapt to hunting and later farming in cold climates.

If any of this is true, people living in the far North should be very bright. All the short-term-ist, happy go lucky, non-planners should have died off long ago, leaving crafty, calculating, clever survivors.

Elijah L. Armstrong, Michael A. Woodley, Richard Lynn. Cognitive abilities amongst the Sámi population. Intelligence Volume 46, September–October 2014, Pages 35–39

http://dx.doi.org/10.1016/j.intell.2014.03.009

Into these wintry wastes trek the young Woodley, the even younger Armstrong and the older Lynn to sniff out whether the far Northern tribe of the Lapps (or Sámi) differs from the nearby Finns, a mere reindeer ride to the South. This isolated group is similar to the Siberian Chukchi. Cranial volumes are slightly smaller than the European mean. Sámi appear to be different from other Arctic peoples, and were historically reindeer herders or fishermen, although today they are largely urbanized while maintaining traditional folkways.

Not much different, it seems. The Lapps are not that different from the Finns in terms of intelligence. Lapps have an IQ around 101, as do the comparison group of rural Finns, and are tilted towards visuo-spatial ability and away from verbal ability. These data suggest that the Sámi have the same profile that most people of the world have, i.e., they perform better on spatial than on verbal tests relative to the Caucasoid norm. Both groups are stronger on non-verbal than verbal skills, which might be expected of hunters searching for game in the landscape.

However, these three intelligence explorers are careful to show how many other factors must be considered. They looked at data from four studies on the Skolt Lapps, who may have turned out to be the brighter ones. They do some careful work to estimate the underlying g factor, and correct for the reliability of the tests, for restriction of range and for the Flynn effect.

Overall, the results are at best only somewhat consistent with Cold Winters theory (Lynn, 1991 and Lynn, 2006). As Hart (2007, p. 417), noted, a higher IQ for the Sámi is expected from this theory. Nevertheless, the Sámi may live in a lower quality environment, thus their genotypic IQ might actually be higher still. The Sámi might have evolved their distinct cognitive profile in response to recurrent features of an Arctic ecology over the last 2,000 years (before which they may have more closely approximated the other Caucasoids in terms of the structure of their mental abilities). This would support Kura's (2013) and Woodley and Figueredo's (2013) conjecture of very recent accelerated evolution in response to cold temperatures.

The IQ of the Laplanders is higher than that of the Inuit peoples, whose IQ is around 90.5, and the Aleut, whose IQ is around 92 (Lynn, 2006). This may be related to the fact that Mongoloid Arctic peoples are genetically close to the North Amerindians, whose IQ is about 86 (Lynn, 2006), whereas the Sámi are genetically about equidistant from the Amerindians and Europeans (Jensen, 1998 and Lynn, 2007 estimates the IQ of the Mongols using a similar strategy). Comparing the IQ of the Sámi to that of three other Arctic groups, the Ainu, Tungus and Altai, the Sámi exhibit similar IQs to the Ainu (IQ 97; Kura et al., 2014) but possess substantially higher IQs than the Tungus and the Altai (Tungus IQ 70–80, Altai IQ 67–75; Lynn & Shibaev, under review). The latter study listed two samples of Tungus, who attained IQs of 70 and 80. The latter sample was extremely poor and isolated (information about the living conditions of the first sample was not given), which may account largely for their low IQ. The study also cited one study of Altai IQ where Altai (who were largely illiterate) received IQs of 67 or 75, depending on the test.

The authors continue: Two final problems with the present study are, firstly, that all of our samples are Skolt Sámi; there are no other Laplanders included in this sample. Skolts are not the only subgroup of Sámi, and they were somewhat more isolated than other Sámi when the studies were conducted (Forseius, 1973), so the samples may therefore be unrepresentative. Secondly, our estimate for the Finnish national IQ (101) is somewhat conservative. A higher Finnish IQ is indicated by reaction time studies (Woodley, te Nijenhuis, & Murphy, under review), which are not included in Lynn and Vanhanen's (2012) review. Therefore, the Finnish IQ, and by extension the Sámi IQ (which in the first analysis was estimated relative to the Finnish IQ), may be somewhat higher than that estimated here.

In summary, cold winters probably had as much effect on Lapps as on Finns, but they do not of themselves account for the lower intelligence of other Arctic peoples who were drawn from other, less able, populations. So, I would interpret this result as one of those “Consistent with” rather than “Strongly suggestive of” conclusions.

Actress Renée Zellweger, pictured above, is of possible Sami extraction, on her mother’s side. Sibelius, he was Finnish.

Sunday, 1 June 2014

Does college entry depend on intelligence or ballot rigging?

Entrance to tertiary education is presumed to be based on the capacity to think. If the brightest get the opportunity to study longer, some good may result for society. If the less bright take up scarce places, less good will result.

April Bleske-Rechek and Kingsley Browne have looked into this in: Trends in GRE scores and graduate enrollments by gender and ethnicity

http://dx.doi.org/10.1016/j.intell.2014.05.005

Does college entry depend on the capacity to think? The Graduate Record Examination (GRE) is a cognitive abilities test that predicts success in graduate training (Kuncel and Hezlett, 2007, Kuncel et al., 2001 and Kuncel et al., 2010). Because of its reliability, validity, and predictive utility, it is used by many graduate schools to inform admissions decisions. However, some critics describe the GRE as a gatekeeper that limits equitable access across groups to higher education (Dutka, 1999, Pruitt, 1998 and Toyama, 1999). We explored how scores on the GRE have fared over time as a function of test-taker gender and ethnicity, and we investigated whether enrolment patterns over time implicate the GRE as obstructing efforts toward increasing parity in higher education. First, we found that the gap between men's and women's GRE quantitative reasoning scores has changed little since the 1980s, although female representation in science, technology, engineering, and math (STEM) graduate programs has increased substantially. Second, ethnic gaps on the GRE persist, especially in quantitative reasoning, although representation of historically disadvantaged ethnic groups in graduate programs has increased. Enrolment gaps have narrowed despite ethnic and gender GRE gaps persisting, so it appears that continued use of the GRE for admissions decisions has not blocked efforts toward equalizing representation in higher education.

In a nutshell, the authors find that although sex and race differences in intelligence have not changed, and intelligence is the best predictor of college success, admission to university has increased for racial minorities. So, something other than intelligence is determining entry for these racial groups. With some understatement the authors observe: Such a finding would raise a subsidiary issue, however, which is whether and to what extent efforts to achieve diversity by de-emphasizing GRE scores have impaired the first goal of selecting those applicants who are most likely to benefit from advanced training.

The GRE functions as an intelligence test, and not a test of interests, preferences or motivation. It correlates with the Scholastic Aptitude Test and with other intelligence tests. It has been in use since the 1950s and has a good track record of predicting educational outcomes.

In 1982, men outscored women by 79 points, and in 2007 by 78 points, so no change over a generation.

First, here is a graph of their findings for men and women:

Men are ahead on verbal intelligence, and even more ahead on quantitative intelligence.

Despite this, more and more women have entered courses which require quantitative skills, though the effect is less pronounced for mathematics and computing. This suggests that entry criteria have been lowered for women entrants or equivalently, wider participation of women generally has resulted in more women entering courses with lower ability scores.

Next, here are the scores by race:

They show the familiar pattern, though with no Asian advantage on Verbal intelligence, but confirming the usual hierarchy in quantitative skills. The latter are very substantially different. Asians consistently earned higher GRE-Quantitative scores than did test-takers from any other ethnic group. In 1982, Asian test-takers scored 49 points higher on average than did White test-takers, who scored 171 points higher than Black test-takers; 25 years later, Asian test-takers scored 55 points higher than White test-takers, who scored 143 points higher than Black test-takers. In other words, the Asian-Black gap in quantitative reasoning was 220 points and 25 years later was 198, a narrowing, but still a very substantial difference.
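The Asian-Black figures are simply the two adjacent gaps added together. A short sketch, using only the point differences quoted above; the dictionary layout and key names are mine, and I take "25 years later" to mean 2007.

```python
# GRE-Quantitative gaps in points, as quoted above.
# Assumption: "25 years later" is read as 2007.
gaps = {
    1982: {"asian_white": 49, "white_black": 171},
    2007: {"asian_white": 55, "white_black": 143},
}

# The Asian-Black gap is the sum of the two adjacent gaps.
asian_black = {
    year: g["asian_white"] + g["white_black"] for year, g in gaps.items()
}
narrowing = asian_black[1982] - asian_black[2007]

print(asian_black, "narrowed by", narrowing, "points")
```

The narrowing comes entirely from the White-Black gap closing by 28 points, partly offset by the Asian-White gap widening by 6.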

Women are given modest preferences, minorities substantial preferences. Bluntly, some candidates either play the genetic card or the colleges play it for them. Ballot rigging at the entry stage can then lead to unjustified complaints at the exit stage: weak entrants can claim that later weak performance at work or in academia is due to unfair assessments later in life.

One might expect that if the GRE is a valid predictor of graduate school success, and if universities accept applicants with lower GRE scores for demographic reasons, then those students may not have the same success as students with stronger records even if they are admitted in equal proportions. This is especially a concern in STEM disciplines, because gender and ethnic gaps in quantitative ability increase in magnitude farther along the right tail (Hedges & Nowell, 1999; Lakin, 2013; Wai et al., 2010), and exceptional levels of achievement in STEM are closely linked to exceptional levels of quantitative reasoning ability (Park, Lubinski, & Benbow, 2008). If women in STEM careers are highly able but still discrepant, on average, from their male counterparts, one would expect average gender differences in rates of tenure, publication, citation counts, and funding (Ceci & Williams, 2011). The same logic applies to ethnic differences, although perhaps to a larger degree given the magnitude of the test-score discrepancies. If not understood, these discrepancies could lead to unwarranted perceptions of discrimination. As noted by Ceci and Williams (2011), gender differences in rates of tenure, publication, citation counts, and grant funding in STEM disciplines are tied to women's abilities and preferences; and interventions that focus on discrimination as the primary cause are unlikely to be successful.

Ability distribution differences are also key for understanding concerns about attrition of minorities from the academic pipeline (Griffith, 2010). The foundation for this concern is buttressed by Garrison's (2013) finding that racial disparities in STEM fields, including in graduate education, are more a product of differences in graduation rate than matriculation rate. Suggesting that attrition rates are influenced by qualifications, Baker (1998) found that substantial race differences in Ph.D. completion rates disappear once measures of “ability” (GRE scores, GPA, and NSF Graduate Fellowship panel evaluations) are controlled for. Degree completion is not the only outcome measure that varies by ethnic group. Price and Price (2006) found, for example, that minority graduate students in humanities and social sciences are less likely than non-minorities to publish as graduate students or within three years of finishing graduate school.

Although white scores have gone up from 535 to 562 over the 25 years, white entry to college has fallen from 84% to 75%. Apart from some places lost to brighter Asians, that fall is mostly to make way for students with lower scores.

These are sobering findings. If you tell lies to all students about their abilities at entry then you are open to accusations of bias when disappointment sets in later. You cannot defend the apparent later disparity in outcomes because you have suppressed the original differences in ability.

There is one thing the authors could have done, which is to show what the sexual and racial composition of tertiary education would have been like if it had been based on merit. For sex differences, if we set the entry point at the 2007 male mean of 599 points on the quantitative measure, then that accepts 50% of the men but only 29% of the women.

The quick calculation for racial differences for 2007, the most recent year for which we have data, is as follows. I have taken white students as the reference group, because they are the largest group by far, and for simplicity have assumed that those who are even 1 point below the white mean on the quantitative measure should not be admitted to university, on the grounds that they will probably not do well. By definition, this will lead to only 50% of white applicants being allowed to enter university. 66% of Asians will get in, 32% of American Indians, 28% of Mexicans, 25% of Puerto Ricans, and 14% of Black Americans.
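For readers who want to see what these percentages imply, here is a rough sketch that inverts them. It assumes, and this is my assumption rather than anything stated above, that scores within each group are roughly normally distributed with similar spread, so that the share scoring above a cutoff can be converted into the distance between the cutoff and that group's mean in standard deviation units. The names `admitted` and `implied_gap_sd` are illustrative only.

```python
from statistics import NormalDist

# Share of each group at or above the white mean, as quoted in the text.
admitted = {
    "White": 0.50,
    "Asian": 0.66,
    "American Indian": 0.32,
    "Mexican": 0.28,
    "Puerto Rican": 0.25,
    "Black": 0.14,
}


def implied_gap_sd(share_above_cutoff):
    """How far the cutoff sits above a group's mean, in SD units.

    Assumes a normal distribution within the group (an assumption of
    this sketch). Negative values mean the group's mean is above the cutoff.
    """
    return NormalDist().inv_cdf(1.0 - share_above_cutoff)


for group, share in admitted.items():
    print(f"{group:16s} {share:4.0%}  cutoff is {implied_gap_sd(share):+.2f} SD from group mean")

# The same conversion for the sex difference quoted earlier:
# 29% of women at or above the male mean.
print(f"Women            29%  cutoff is {implied_gap_sd(0.29):+.2f} SD from group mean")
```

Under those assumptions a 25% admission rate corresponds to a group mean about two thirds of a standard deviation below the cutoff, and a 14% rate to a mean a little over one standard deviation below it. If the real distributions are skewed, or the groups differ in spread, these figures would shift.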

So, are entrants to university in the USA being selected by ability or by representativeness in genetic terms? The latter, it would seem.