Thursday, 21 April 2016

Estonia, the abstract

For those of us with refined, and possibly even nerdish, tastes, Estonia is in the news. Not one but now two papers have been NIT-picking through the fine detail of the items used in the national intelligence test, and coming to conclusions about the contribution to the rise in intelligence test scores made by the capacity, or strategy, of considering higher levels of abstraction. One might even begin to posit a dimension: as children develop and cultures evolve, they move from specific instances to general principles. Perhaps the same is true of nations. They arise out of sheer necessity, a wall defensive against the outside world, where protection is offered at the cost of homage and tribute; then flourish as a sceptr’d isle which breeds a happy band of citizens who develop refined abstract thoughts; and then collapse when challenged by other tribes driven by sheer necessity. The rise and fall of civilisations may not be due to climate, illness, weapons, harvests, religion or politics, but simply to the inevitable drift from concentrating on pressing instances to formulating general abstractions.

Olev Must, Aasa Must, Jaan Mikk

Predicting the Flynn Effect through word abstractness: Results from the National Intelligence Tests support Flynn's explanation. Intelligence 57 (2016) 7–14

The current study investigates the Flynn Effect (FE) and its relation to abstract thinking ability. We compare two cohorts of Estonian students (1933/36, n=888; 2006, n=912) using the Concepts (Logical Selection) subtest of the Estonian adaptation of the National Intelligence Tests (NIT). The item presentation order of the subtest correlates with the abstractness of the words used in the items (r = .609) of the subtest. The different test results (right, wrong and missing answers) were analysed in order to make an estimate of the FE magnitude. The FE for abstract thinking ability of those samples was 1.06 Hedges' g (adjusted for guessing). The magnitude of the FE is dependent upon the degree of difficulty of the items (an item's difficulty is estimated by determining its abstractness and its familiarity to students). The more difficult part of the subtest (the second half) showed a FE=1.80 whereas the easier part (the first half) of the subtest showed a FE=.72. Word abstractness was a strong predictor of all the testing results in both cohorts (Beta=.700). The familiarity of words used in the test items has no correlation with the test results if word abstractness is controlled in both cohorts. Our findings support Flynn's explanation that the FE is primarily an indicator of the rise in abstract thinking ability.
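For readers who have not met Hedges' g, the effect size quoted in the abstract, here is a minimal sketch of the textbook calculation in Python. It is not the authors' procedure (their 1.06 is adjusted for guessing and built from item responses). The cohort sizes are the ones given in the abstract, but the means and standard deviations are made-up placeholders, used only to show the mechanics.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference (b minus a) with the usual small-sample correction."""
    df = n_a + n_b - 2
    # Pooled standard deviation across the two groups
    pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    cohens_d = (mean_b - mean_a) / pooled_sd
    # Common approximation to the small-sample correction factor
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# n values are from the abstract; means and SDs are HYPOTHETICAL placeholders.
g = hedges_g(mean_a=10.0, sd_a=4.0, n_a=888, mean_b=14.0, sd_b=4.0, n_b=912)
print(round(g, 3))
```

With samples of this size the correction factor is so close to 1 that Hedges' g and Cohen's d are practically the same number.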

The essence of this paper is the close analysis of item responses, which is as close to the real data as it is possible to get. (It would be good to have concurrent fMRI brain scans.) Jim Flynn proposed that a large part of the Flynn effect was an increase in abstract thinking. In olden times, a candidate asked to say what was similar about a man and a dog would reply “A man hunts with a dog” and might possibly from that say that both were hunters. In recent times candidates would be more likely to understand that both were animals.

The older sample (1933/36; N = 888) consists of students from grades 4 to 6, whose mean age was 13.3 (SD = 1.24) years. The more recent sample (2006, N = 912) consists of students from grades 6 to 8 with a mean age of 13.5 (SD = .93) years.

Looking at the test words, they measured word familiarity, word abstractness, and the responses, including the missing responses to items. I am rushing ahead, but since this paper deals with the same data as described in the previous post, I am going straight to the results.

Abstractness and familiarity


The data set has been very well studied. Incidentally, it supports Chris Brand’s hypothesis that part of the Flynn effect is that modern kids are willing to guess, whereas students in the past were more cautious, well behaved and even deferential. Modern kids are willing to chance their arm, and thus pick up some extra points (unless the marking system specifically penalises wrong answers). In a nutshell, modern students will make good psychologists but bad engineers. Never let children guess things that might fall down and crush them.

Over 7 decades the scores on each item of a test of abstractness have risen from 0.48 to 0.6, which is a sizeable improvement.

The other finding is that intelligence testing is a game of two halves.

The first half (12 items) makes use of words that have low levels of abstractness and are simple and familiar to students. The second half of the subtest (also 12 items) uses less familiar and more abstract words. In the NIT data the highest magnitudes of the FE are evident in the more difficult part of the A3, although the probability of guessing is also higher. If we were to use the previous algorithm to calculate the FE, then in the more difficult section it would be FE = 2 * .85 + .61 − .51 = 1.80. The data does not support the hypothesis that the younger cohort guessed more often than the older one. In the easier part of the A3, the FE = 2 * .16 + .09 + .31 = .72. It can be concluded that the FE may be more than 2 times greater (1.80) for items where the words are abstract and less familiar, than it is for items that are less abstract and easier to understand (.72).
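As a quick check on the arithmetic quoted above, the two sums can be reproduced in a few lines of Python. The terms are copied straight from the quoted calculation. What each term stands for is defined in the paper rather than here, so the variable names below are my own labels and nothing more.

```python
# Terms copied verbatim from the quoted calculation; names are illustrative labels only.
harder_half_fe = 2 * 0.85 + 0.61 - 0.51  # second half: abstract, less familiar words
easier_half_fe = 2 * 0.16 + 0.09 + 0.31  # first half: concrete, familiar words

ratio = harder_half_fe / easier_half_fe

print(f"harder half FE = {harder_half_fe:.2f}")
print(f"easier half FE = {easier_half_fe:.2f}")
print(f"ratio          = {ratio:.2f}")
```

The sums come out at 1.80 and .72 as stated, and the ratio is 2.5, which squares with "more than 2 times greater".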

Interesting that this comparison of the first half against the second half makes the point about the rise being due to increasing abstractness so succinctly.

The abstractness of the items strongly predicts the answer patterns of the A3 (the abstractness test). Regression analysis shown in Table 2 indicates a clear result pattern.
If the presentation order of the items in the subtest is not taken into account, then it is word abstractness that becomes the main predictor of test scores for the A3. Right, wrong and missing answers can be predicted through word abstractness. The Beta of word abstractness ranged from −.682 (predicting 2 right answers in the 2006 cohort) to .775 (predicting 1 wrong answer in the 2006 cohort). It is easier to receive more points from easier items, and subsequently the test-takers more frequently received most of their points from the less-abstract items. This is the reason for the minus sign. This negative relationship in predicting 2 right answers was clear in the 1933/36 data as well as in the 2006 data (Beta accordingly −.638 and −.682). The situation changes however, when attempting to predict the one point score.

Altogether: the 2006 cohort was clearly more able to solve more subtest items (according to the 2-point criteria), while at the same time they took more risks with the highly abstract items. This strategy gave them an advantage over the older cohort.

Our results are in accordance with Flynn's explanation that the Flynn Effect is a demonstration of the rise in abstract thinking ability over time. The results also accord with Armstrong et al.'s (2016) empirical description of the abstractness vector of the NIT. The rise of abstract thinking also corresponds with Flynn's explanation that the teaching process at school develops this ability. It should be kept in mind that in the present study the younger cohort had been educated for 2 years more than the older. Thus it can be concluded that the A3 was relatively easy for the younger cohort, which is why the estimation for guessing in test taking behaviour is relatively low.

Our findings also support Vygotsky's (1934/1987) theory of cultural–historical development wherein Vygotsky made a distinction between everyday (spontaneous) concepts that emerge from everyday experience, and scientific concepts that are taught in school. The acquisition of scientific concepts corresponds to the rise in higher-level thinking and facilitates the development of abstract thinking abilities.
According to Vygotsky's theory, child development is the acquisition of scientific concepts and the concomitant adoption of an abstract decontextualized systematic way of thinking. In this sense, the two additional years of schooling could have offered a significant advantage in abstract thinking ability for the 2006 cohort in comparison with the 1933/36 cohort.

Please read this paper together with the Armstrong paper I commented on in my previous post.

Final point: never throw away old data. Make sure the results are stored in a dry basement and carefully dusted at least once a decade.


  1. elijahlarmstrong, 22 April 2016 at 05:15

    I think it is now a pretty well-replicated finding that "abstractness" correlates with Flynn effect magnitudes. An interesting question: why doesn't abstractness show up as a within-cohort factor if it's such a salient factor between cohorts? As we point out in our paper, the closest thing is Gf as distinct from g, and even that's not always replicated. This is quite closely analogous to the "heritability paradox," why environment can do so much between cohorts and so little within them (if one accepts the usual estimate of c^2~0).

  2. Dear Elijah, Thanks for your intriguing question. It seems a switch has been flicked somewhere, on the basis of group membership, like one group being given a code word that the other lacked. The wisdom of (defined) crowds. Will ponder it further.

  3. Lithuanians and Letts do it .....

    Did Cole Porter have a prejudice against Estonians?

  4. Abstractness increased atheism/agnosticism*

    Older people who are predominantly concrete thinkers tend to be more ''selfish'' by certain perspectives, more pragmatically practical phenotypical mind.

    ''God help me''

  5. I suspect the Flynn Effect is related to Moore's Law.

    "The Flynn Effect is a side effect of the developers of the IQ test being on “the right side of history.” We’re used to hearing progressives denounce IQ tests as obsolete pseudoscience on the wrong side of history, but, in reality, IQ testing in the United States has some amusing organic ties to the triumph of Silicon Valley. Louis Terman’s son Fred Terman (1900-1982), a professor of electrical engineering at Stanford, was perhaps the single most important figure in the rise of Silicon Valley. The mentor of Hewlett and Packard, he largely invented the model of Stanford grad students like Larry Page and Sergey Brin starting up high tech firms like Google.

    "You are supposed to believe that the Termans were all wrong, but it sure looks like we’re living in the world the Terman family anticipated."

    1. In terms of the history of ideas, it could be that both the specialist skills a particular era requires, and the items on tests of ability move in unison.