Whether intelligence rose between the WAIS-R and the WAIS-III is not known, because the scores are not comparable. Between the WAIS-III and the WAIS-IV, intelligence probably did rise.
EXAMINING THE FLYNN EFFECT IN THE WECHSLER ADULT INTELLIGENCE SCALE
Nicholas F. Benson (1), A. Alexander Beaujean (1), and Gordon E. Taub (2)
(1) Baylor University, Nicholas_Benson@baylor.edu. (2) University of Central Florida.
The Flynn effect (FE: i.e., increase in mean IQ scores over time) is commonly viewed as reflecting population shifts in intelligence, despite the fact that most FE studies have not investigated the assumption of score comparability. Consequently, the extent to which these mean differences in IQ scores reflect population shifts in cognitive abilities versus changes in the instruments used to measure these abilities is unclear.
This study used participants from the Wechsler Adult Intelligence Scale’s revised (WAIS-R; n = 1,800), third (WAIS-III; n = 2,450), and fourth (WAIS-IV; n = 2,200) editions’ standardization samples. First, WAIS subtest scores were equated using data obtained from participants who were administered two editions of the WAIS, either the WAIS-R and WAIS-III (n = 192) or the WAIS-III and WAIS-IV (n = 284). Each WAIS-R and WAIS-IV subtest was equated to the corresponding subtest raw score on the WAIS-III. Score equating made it possible to pool scores from all three instruments within each of 13 age groups before converting raw scores into z scores.
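The equating and standardization steps described above can be illustrated with a minimal sketch. The abstract does not say which equating method was used, so the linear (mean and standard deviation matching) approach below is an assumption made purely for illustration. The function names and the raw scores are hypothetical and are not data from the study.

```python
from statistics import mean, pstdev


def linear_equating(old_scores, ref_scores):
    """Build a function mapping old-edition raw scores onto the reference scale.

    Assumption: linear equating, which matches the mean and standard
    deviation of the two score distributions. Both lists come from the
    same examinees, who took both editions of a subtest.
    """
    m_old, s_old = mean(old_scores), pstdev(old_scores)
    m_ref, s_ref = mean(ref_scores), pstdev(ref_scores)
    if s_old == 0:
        raise ValueError("old-edition scores have zero variance")
    slope = s_ref / s_old
    return lambda x: m_ref + slope * (x - m_old)


def z_scores(scores):
    """Standardize scores within one group (for example, one age group)."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        raise ValueError("scores have zero variance")
    return [(x - m) / s for x in scores]


# Hypothetical raw scores for five examinees who took both editions.
waisr_raw = [10, 12, 14, 16, 18]
wais3_raw = [8, 11, 14, 17, 20]

to_wais3 = linear_equating(waisr_raw, wais3_raw)
equated = [to_wais3(x) for x in waisr_raw]   # WAIS-R scores on the WAIS-III metric
standardized = z_scores(equated)             # z scores within this (single) group
```

Once scores from different editions share one metric, they can be pooled within an age group and standardized together, which is what makes the later cross-sample comparisons meaningful.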
The second part of the study involved examining invariance of the WAIS structure across standardization samples via multi-group latent variable models. Some factor loadings and all of the subtest intercepts were found to be non-invariant when comparing the WAIS-R and WAIS-III samples on the equated scores. Thus, score changes reflect, at least in part, a recalibration of the instrument’s metric or scale. Conversely, when comparing the WAIS-III and WAIS-IV samples on the equated scores, strict invariance was tenable. Thus, score changes between the WAIS-III and WAIS-IV most likely reflect changes in psychometric g rather than changes to the instrument.
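Why non-invariant intercepts and loadings matter can be seen from the standard multi-group common factor model. The notation below is the conventional textbook form and is an assumption here, not taken from the abstract: for group (standardization sample) g, x is the vector of subtest scores, \tau_g the intercepts, \Lambda_g the factor loadings, \eta the latent abilities with mean \kappa_g, and \varepsilon the residuals.

```latex
% Measurement model for examinee i in group g
x_{ig} = \tau_g + \Lambda_g \eta_{ig} + \varepsilon_{ig},
\qquad
\mathrm{E}[x_g] = \tau_g + \Lambda_g \kappa_g .

% Observed mean difference between two samples
\mathrm{E}[x_2] - \mathrm{E}[x_1]
  = (\tau_2 - \tau_1) + (\Lambda_2 \kappa_2 - \Lambda_1 \kappa_1).

% If loadings and intercepts are invariant
% (\Lambda_1 = \Lambda_2 = \Lambda, \ \tau_1 = \tau_2):
\mathrm{E}[x_2] - \mathrm{E}[x_1] = \Lambda (\kappa_2 - \kappa_1).
```

Only in the last case is an observed mean difference attributable entirely to a difference in latent means. Strict invariance additionally constrains the residual variances to be equal across groups.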
These results suggest that there is some evidence for an increase in intelligence, but they also call into question many published FE findings, which presume that the instruments’ scores are invariant even when this assumption is not warranted. Even though score comparability across instruments depends on a minimum level of invariance, FE studies do not typically examine this assumption. Thus, any difference reported in manifest scores from these instruments (e.g., FSIQ) could just as easily be due to changes in the instrument as to changes in the examinees. Our use of score equating and assessment of invariance in the present study allows for a more direct test of whether the FE arises from genuine secular changes in intelligence or simply reflects changes to the test.
It appears that there is evidence for a rise in the level of psychometric g between the norming of the WAIS-III and that of the WAIS-IV, but little is known about the period between the WAIS-R and the WAIS-III, as the scores are not comparable.