What makes problems difficult? Indeed, what ever made events turn into problems? Usually, I assume, specific instances had to be confronted and dealt with: an approaching predator, an escaping prey, an edible nut that had to be opened. In all these instances a solution is required to a pressing problem. In time, some general principles may be discerned: perhaps those were discussed in campfire stories, or formed the preparation rituals of hunter-gatherers. Perhaps, more likely, people made them up as they went along.
Intelligence test items are very specific instances of problems. They are chosen to be unfamiliar, so that the test is always a real test, and not the exercise of specifically trained skills. Tests have to be kept secret, and defended from cheats. Tests only test problem-solving when the correct solutions are not known to the test taker. Tests must not only be unfamiliar, but preferably easily explained by using familiar concepts, ones known to almost everybody in that culture, or in any culture. Things can get bigger and smaller, go in front of or behind other objects, increase or reduce in number: that sort of thing. The mental habits of our species, as depicted on pottery, funerary objects, sculpture, buildings, jewellery, ornaments and dress. The all of it, as they say in Oxfordshire.
So, when a specific problem arises one examines the instance, and then attempts a solution. Is it helpful to be able to abstract general principles? Does abstraction assist, or is it better to concentrate on the individual task?
Into this hall of mirrors steps an international gang to bring us some jewels from Estonia, a Finnic country with high income and living standards, which has the additional benefit of having done proper intelligence testing in 1933, and has the results item by item. These can then be compared with the data for 2006, item by item. And what a gang: they come from the US, Korea, Brazil, Germany, Belgium and of course Estonia.
Elijah L. Armstrong, Jan te Nijenhuis, Michael A. Woodley of Menie, Heitor B. F. Fernandes, Olev Must, Aasa Must. A NIT-picking analysis: Abstractness dependence of subtests correlated to their Flynn effect magnitudes. Intelligence 57 (2016). http://dx.doi.org/10.1016/j.intell.2016.02.009
We examine the association between the strength of the Flynn effect in Estonia and highly convergent panel ratings of the ‘abstractness’ of nine subtests on the National Intelligence Test, in order to test the theory that the Flynn effect results in part from an increase in the use of abstract reference frames in solving cognitive problems. The vectors of abstractness ratings and Flynn effect gains (controlled for guessing) exhibit a near zero correlation (r = −.02); however, abstractness correlates positively with (and is therefore confounded by) g-loadings (r = .61). A General Linear Model is used to determine the degree to which the abstractness vector predicts the Flynn effect vector, independently of subtest g-loadings and the portion of the secular IQ gain due to guessing (the Brand effect). Consistent with the abstract reasoning model of the Flynn effect, abstractness positively predicts Flynn effect magnitudes, once controlled for confounds (sr = .44), which indicates an increasing tendency to utilize factors external to the items in order to abstract their solutions.
Flynn effects were derived from the difference in scores between 1933/36 and 2006 administrations of the National Intelligence Test to samples of Estonian schoolchildren (N = 890 for the older sample, 913 for the more recent sample). The Method of Correlated Vectors (MCV) was utilized to determine the effect of abstractness on the Flynn effect independent of both subtest g loadings and the Brand effect — or the portion of the secular gain in IQ that is due purely to the results of guessing.
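At its core, the Method of Correlated Vectors is a simple calculation: across the subtests, correlate each subtest's g loading with its score gain. A minimal sketch in Python, using made-up numbers for nine hypothetical subtests rather than the paper's actual data:

```python
import numpy as np

# Hypothetical g-loadings and guessing-corrected score gains for nine
# subtests -- illustrative values only, not the Estonian NIT results.
g_loadings = np.array([0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.50, 0.45, 0.40])
flynn_gains = np.array([2.0, 3.5, 3.0, 5.0, 4.5, 6.0, 5.5, 7.0, 6.5])

# MCV: the Pearson correlation between the two vectors of subtest statistics.
r = np.corrcoef(g_loadings, flynn_gains)[0, 1]
print(round(r, 2))
```

With only nine subtests the resulting correlation is inherently unstable, which is one reason the authors lean on effect sizes and directions rather than significance tests.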
Their method has been to get raters to assess items for abstractness.
The 28 raters used to obtain the abstract thinking dependencies for each subtest were classified into the following categories: non-professionals (without degrees in psychology), graduate students, or professionals (N = 10 non-professionals, 5 graduate students, 13 professionals). Each rater rated the abstract thinking dependency of each subtest on a scale from 0–100, using a text vignette defining abstract thinking (Supplement 1) as a rating criterion. The text gave examples of three hypothetical test items heavily dependent on abstract thinking; one was drawn from Luria (1976), one from Flynn (2009), and one from Flynn (2012) in a discussion of Fox and Mitchum (2013). The raters used Form 2 of the British National Intelligence Test to rate the abstract thinking dependence of each subtest.
Here are the main results in one table:
To nit-pick, they should have listed these by level of abstractness for ease of reading, but Analogies (76) are almost twice as abstract as all the other tests. Synonym-Antonym are next (38) and then Vocabulary (35). The most concrete test is Comparisons (22). This provides a useful metric with which to consider what makes items difficult.
The correlation between the size of the Flynn effect on a subtest (corrected for guessing) and its level of rated abstractness is −.02, or virtually zero. A large negative correlation exists between the guessing-corrected Flynn effect and g loadings (−.55), and a large positive correlation exists between g loadings and abstractness (.61). The Brand effect and g loadings correlate strongly and positively (.8). Modest magnitude correlations exist between abstractness and the Brand effect gains (.32), and between the Brand effect and corrected Flynn effect gains (−.42). None of the effect sizes are significant; however, null hypothesis significance testing is not appropriate for evaluating the substantiveness of these results, as the N is extremely small (9 subtests). More attention should be paid both to the magnitude of the effects, which range from small to large (Cohen, 1988), and to the degree to which the directionality of the effects is consistent with explicit theoretical expectations.
It can be seen that abstractness now becomes a strong positive predictor of Flynn effect magnitude (r = .44), once controlled for the Brand effect and subtest g-loadings. Thus, our analysis supports the contention that abstract thinking may causally contribute to the Flynn effect. g loadings do not change as a predictor of the Flynn effect once controlled for abstractness and the Brand effect. The Brand effect residual becomes a mildly positive predictor of the Flynn effect (−.42 to .19). At the suggestion of a reviewer, we reran the analysis excluding subtest B4 as an outlier in terms of abstractness. The recalculated effects, included in Table 2, indicate that abstractness was greatly attenuated in effect size as a predictor of the Flynn effect, but the direction of correlations did not change.
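The "controlled for confounds" step amounts to a semipartial correlation: residualize abstractness on the g-loading and Brand effect vectors, then correlate that residual with the Flynn effect vector. A sketch of the calculation with invented nine-element vectors (not the published ones):

```python
import numpy as np

def semipartial_r(y, x, controls):
    """Correlate y with the part of x not explained by the controls."""
    X = np.column_stack([np.ones(len(x))] + list(controls))
    beta, *_ = np.linalg.lstsq(X, x, rcond=None)
    residual = x - X @ beta  # x with the controls partialled out
    return np.corrcoef(y, residual)[0, 1]

# Illustrative nine-subtest vectors -- made-up values, not the paper's data.
abstractness = np.array([76., 38., 35., 30., 28., 26., 25., 23., 22.])
g_loadings   = np.array([0.80, 0.60, 0.55, 0.50, 0.65, 0.45, 0.40, 0.70, 0.35])
brand_effect = np.array([0.20, 0.10, 0.15, 0.05, 0.25, 0.10, 0.05, 0.20, 0.10])
flynn_gains  = np.array([7.0, 4.0, 3.5, 3.0, 3.2, 2.8, 2.5, 2.6, 2.0])

sr = semipartial_r(flynn_gains, abstractness, [g_loadings, brand_effect])
print(round(sr, 2))
```

The point of residualizing only the predictor (rather than both variables, as a full partial correlation would) is that the result still answers the practical question: how much of the raw Flynn effect variance does abstractness uniquely explain?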
The authors run through a list of possible issues regarding this work, but to my mind their main thesis stands. Jim Flynn was probably right that level of abstraction is part of the cause of the secular rise in intelligence test scores, without there being any notable commensurate rise in actual intelligence. It would appear that schoolchildren have learned an intellectual trick which helps them leapfrog from instances to general rules.
This is an important paper, which brings us closer to understanding the Flynn effect, and the nature of intelligence test items.